Extracting XML from a string using a regular expression

Keywords: extract, xml, string, regular expression | Updated: 2023-09-27 18:25:34

I have the following log file from a server, and I want to extract the XML documents from the text below.

2:00:11 PM >>Response: <?xml version="1.0" encoding="UTF-8"?>
<HotelML xmlns="http://www.xpegs.com/v2001Q3/HotelML"><Head><Route Destination="TR" Source="00"><Operation Action="Create" App="UltraDirect-d1c1_" AppVer="V1_1" DataPath="/HotelML" StartTime="2013-07-31T08:33:13.223+00:00" Success="true" TotalProcessTime="711"/></Route>............
</HotelML>

3:00:11 PM >>Response: <?xml version="1.0" encoding="UTF-8"?>
<HotelML xmlns="http://www.xpegs.com/v2001Q3/HotelML"><Head><Route Destination="TR" Source="00"><Operation Action="Create" App="UltraDirect-d1c1_" AppVer="V1_1" DataPath="/HotelML" StartTime="2013-07-31T08:33:13.223+00:00" Success="true" TotalProcessTime="711"/></Route>............
</HotelML>
5:00:11 PM >>Response: <?xml version="1.0" encoding="UTF-8"?>
<HotelML xmlns="http://www.xpegs.com/v2001Q3/HotelML"><Head><Route Destination="TR" Source="00"><Operation Action="Create" App="UltraDirect-d1c1_" AppVer="V1_1" DataPath="/HotelML" StartTime="2013-07-31T08:33:13.223+00:00" Success="true" TotalProcessTime="711"/></Route>............
</HotelML>

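I wrote the following regular expression for this, but it only matches the first entry in the string. I want all of the XML strings returned as a collection.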

(?<= Response:).*>.*</.*?>

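Why not match from <HotelML to </HotelML>?

Something like:

<HotelML .*?</HotelML>

The quantifier has to be lazy (.*?); a greedy .* would run from the first <HotelML to the last </HotelML>. Because each document spans several lines, the pattern also needs RegexOptions.Singleline so that . matches newlines.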
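As a rough sketch of the <HotelML ... </HotelML> idea in C#: the class name HotelMlExtractor and the method ExtractAll are hypothetical names chosen for illustration, and the sketch assumes the whole log fits in memory as a single string. Note that this variant returns only the <HotelML> element, so the leading <?xml ...?> declaration is not part of the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class HotelMlExtractor
{
    // .*? is lazy, so each match ends at the nearest </HotelML>.
    // Singleline makes '.' match newlines, since each document spans lines.
    private static readonly Regex DocumentPattern =
        new Regex(@"<HotelML\b.*?</HotelML>", RegexOptions.Singleline);

    // Returns every <HotelML>...</HotelML> block found in the log text.
    public static List<string> ExtractAll(string log)
    {
        return DocumentPattern.Matches(log)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }
}
```

It could be called as HotelMlExtractor.ExtractAll(File.ReadAllText("text.txt")), which needs using System.IO; each returned string can then be passed to an XML parser.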

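Alternatively, just go through the file line by line, and whenever you find a line that matches

^.* PM >>Response:.*$

read the following lines as XML until the next line that matches...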
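The line-by-line idea might look roughly like the sketch below. ResponseLogReader and ReadDocuments are hypothetical names, and two assumptions go beyond the answer: the marker also accepts AM (the sample only shows PM), and the text after ">>Response: " on the marker line is kept, because in the sample log the <?xml ...?> declaration sits on that same line.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original answer.
public static class ResponseLogReader
{
    // Start of an entry, e.g. "2:00:11 PM >>Response: ".
    private static readonly Regex Marker =
        new Regex(@"^\d{1,2}:\d{2}:\d{2} [AP]M >>Response: ");

    // Groups the lines into one XML string per log entry.
    public static List<string> ReadDocuments(IEnumerable<string> lines)
    {
        var documents = new List<string>();
        StringBuilder current = null;

        foreach (var line in lines)
        {
            var marker = Marker.Match(line);
            if (marker.Success)
            {
                // A new entry begins: store the previous one, if any.
                if (current != null) documents.Add(current.ToString().Trim());
                // Keep what follows the marker (the XML declaration).
                current = new StringBuilder(line.Substring(marker.Length));
                current.AppendLine();
            }
            else if (current != null)
            {
                // Continuation of the current document.
                current.AppendLine(line);
            }
        }

        // Store the last entry, which has no following marker.
        if (current != null) documents.Add(current.ToString().Trim());
        return documents;
    }
}
```

With using System.IO, ResponseLogReader.ReadDocuments(File.ReadLines("text.txt")) streams the file instead of loading it all at once, which may matter for large logs.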

Here is another approach, which should leave you with a List<XDocument>:

using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        var input = File.ReadAllText("text.txt");

        // Each entry starts with a marker such as "2:00:11 PM >>Response: ".
        var xmlDocuments = Regex
            .Matches(input, @"([0-9AMP: ]*>>Response: )")
            .Cast<Match>()
            .Select(match =>
                {
                    // The XML starts right after the marker...
                    var currentPosition = match.Index + match.Length;
                    var nextMatch = match.NextMatch();

                    // ...and runs up to the next marker, or to the end of the input.
                    return nextMatch.Success
                        ? input.Substring(currentPosition,
                            nextMatch.Index - currentPosition)
                        : input.Substring(currentPosition);
                })
            .Select(s => XDocument.Parse(s))
            .ToList();
    }
}