Regex chat message detection

Keywords: detection, message, chat, Regex | Updated: 2023-09-27 18:22:05

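I'm currently developing a program to properly display WhatsApp messages that were exported as a .txt file (sent by email), and I'm trying to write a parser for them. I've been experimenting with Regex for the past 3 hours without finding a solution, since I've barely used Regex before.

The messages look like this: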

16.08.2015, 18:30 - Person 1: Some multiline text here
still in the message
16.08.2015, 18:31 - Person 2: some other message which could be multiline
16.08.2015, 18:33 - Person 1: once again

I'm trying to split them correctly by matching with a Regex, like this:

List<string> messages = new List<string>();
messages = Regex.Matches(parseable, @"REGEXHERE").Cast<Match>().Select(m => m.Value).ToList();

so that they end up like this:

messages[0]="16.08.2015, 18:30 - Person 1: Some multiline text here\nstill in the message";
messages[1]="16.08.2015, 18:31 - Person 2: some other message which could be multiline";
messages[2]="16.08.2015, 18:33 - Person 1: once again";

I've been trying with a very messy regular expression that looked like \d\d\\.\d\d\\. [...]

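I wouldn't use a single RegEx for this. Instead, I would just use a StreamReader: check whether the line currently being processed is a "chat start" line (using a RegEx); if it is, check whether the following lines are not "chat start" lines, and keep track of whether you should append to the current message or start a new one. I wrote a quick extension method to demonstrate this: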

using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class ChatReader
{
    // Matches the header of a message, e.g. "16.08.2015, 18:30 - Person 1:".
    // Anchored with ^ so a date inside the message body is not mistaken for a header.
    private static readonly Regex ChatStart =
        new Regex(@"^\d\d\.\d\d\.\d\d\d\d, \d\d:\d\d - .*?:");

    public static IEnumerable<string> ReadChatMessages(this TextReader reader)
    {
        string message = null; // the message accumulated so far
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (message != null && ChatStart.IsMatch(line))
            {
                // A new message starts here, so the previous one is complete.
                yield return message;
                message = line;
            }
            else
            {
                // First line of the input, or a continuation of the current message.
                message = message == null ? line : message + "\n" + line;
            }
        }
        // Emit the last message; nothing is emitted for empty input.
        if (message != null)
            yield return message;
    }
}

It can be used like this:

List<string> chatMessages = reader.ReadChatMessages().ToList();
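
For comparison, if the whole file is already in memory as a string (as in the question), a single-regex split is also possible. The following is only a sketch, not part of the method above: `ChatSplitter` and `SplitMessages` are hypothetical names, and it assumes every message header has the exact shape "dd.MM.yyyy, HH:mm - " shown in the sample. It splits on each line break that is directly followed by such a header, using a lookahead so the header itself is not consumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class ChatSplitter
{
    // Matches a line break only when the next line starts with a header
    // such as "16.08.2015, 18:30 - ". The (?=...) lookahead checks for the
    // header without consuming it, so it stays at the start of the next message.
    private static readonly Regex Boundary =
        new Regex(@"\r?\n(?=\d\d\.\d\d\.\d{4}, \d\d:\d\d - )");

    public static List<string> SplitMessages(string text)
    {
        // Drop trailing line breaks so the last message does not end with "\n".
        string trimmed = text.TrimEnd('\r', '\n');

        return Boundary.Split(trimmed)
                       .Where(m => m.Length > 0) // empty input yields no messages
                       .ToList();
    }
}
```

Calling `ChatSplitter.SplitMessages(parseable)` should then give one entry per message, with continuation lines kept inside their message. The streaming version above is probably still the better fit for large exports, since it never needs the whole file in one string.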