Different behaviour between "new XmlTextReader" an

Updated: 2023-09-27 18:26:35

Given the following source code:

 using System;
 using System.IO;
 using System.Xml;
 using System.Xml.Schema;
 namespace TheXMLGames
 {
   class Program
   {
     static void Main(string[] args)
     {
       XmlReaderSettings settings = new XmlReaderSettings {
         Async = false,
         ConformanceLevel = ConformanceLevel.Fragment,
         DtdProcessing = DtdProcessing.Ignore,
         ValidationFlags = XmlSchemaValidationFlags.None,
         ValidationType = ValidationType.None,
         XmlResolver = null,
       };
       string head = File.ReadAllText("sample.xml");
       Stream stringStream = GenerateStreamFromString(head);
       // Variant 1
       //XmlReader reader = XmlReader.Create(stringStream);
       // Variant 2
       //XmlReader reader = XmlReader.Create(stringStream, settings);
       // Variant 3
       XmlTextReader reader = new XmlTextReader(stringStream);
       while (reader.Read())
         if (reader.NodeType != XmlNodeType.Whitespace)
           Console.WriteLine(reader.Name + ": " + reader.Value);
       // No Variant gets here without an exception,
       // but that's not the point!
       Console.ReadKey();
     }
     public static Stream GenerateStreamFromString(string s)
     {
       MemoryStream stream = new MemoryStream();
       StreamWriter writer = new StreamWriter(stream);
       writer.Write(s);
       writer.Flush();
       stream.Position = 0;
       return stream;
     }
   }
 }

sample.xml

 <?xml version="1.0" encoding="utf-8"?>
 <!DOCTYPE TestingFacility >
 <TestingFacility id="MACHINE_2015-11-11T11_11_11" version="2015-11-11">
 <Program>
   <Title>title</Title>
   <Steps>16</Steps>
 </Program>
 <Calibration>
   <Current offset="0" gain="111.11" />
   <Voltage offset="0" gain="111.11" />
 </Calibration>
 <Info type="Facilityname" value="MACHINE" />
 <Info type="Hardwareversion" value="HW11" />
 <Info type="Account" value="DJohn" />
 <Info type="Teststart" value="2015-11-11T11:11:11" />
 <Info type="Description" value="desc" />
 <Info type="Profiler" value="prof" />
 <Info type="Target" value="trgt" />

The behaviour is as follows:

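Variant 1
XmlReader.Create(stream)
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll. Additional information: For security reasons DTD is prohibited in this XML document. To enable DTD processing set the DtdProcessing property on XmlReaderSettings to Parse and pass the settings into XmlReader.Create method.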

Variant 2
XmlReader.Create(stream, settings)
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll. Additional information: Unexpected DTD declaration. Line 2, position 3.

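Variant 3
new XmlTextReader(stringStream)
An unhandled exception of type 'System.Xml.XmlException' occurred in System.Xml.dll. Additional information: Unexpected end of file has occurred. The following elements are not closed: TestingFacility. Line 19, position 36.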

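Variants 1 and 2 throw right after the first line.
Variant 3 outputs the whole file as expected and, when it reaches the end, complains (correctly!).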

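The software obviously works as it is with Variant 3, but the (now) recommended approach is to create the reader through the XmlReader.Create factory method.

If I tamper with the settings, it gets even weirder.

How can I update the code to use XmlReader.Create?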

The full project can be found here: https://drive.google.com/file/d/0B55cC50M31_8T0lub25oS2QxQ00/view


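I generally do not recommend parsing XML files with non-XML methods. But sometimes, when the XML is invalid, other methods are the better choice. Since you have a huge XML file and are only trying to get one line of data, the code below may be the best option.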

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Data;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string xml =
                "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
                "<!DOCTYPE SomeDocTypeIdidntPutThere>\n" +
                "<TestingFacility id=\"MACHINE2_1970-01-01T11_22_33\" version=\"1970-01-01\">\n" +
                "<Program>\n" +
                  "<Title>Fancy Title</Title>\n" +
                  "<Steps>136</Steps>\n" +
                "</Program>\n" +
                "<Info type=\"Start\" value=\"2070-01-01T11:22:33\" />\n" +
                "<Info type=\"LotMoreOfThem\" value=\"42\" />\n";
            MemoryStream stream = new MemoryStream(Encoding.UTF8.GetBytes(xml));
            StreamReader reader = new StreamReader(stream);
            string inputLine = "";
            string timeStr = "";
            while ((inputLine = reader.ReadLine()) != null)
            {
                inputLine = inputLine.Trim();
                if(inputLine.StartsWith("<Info type=\"Start\""))
                {
                    string pattern = "value=\"(?'time'[^\"]+)";
                    timeStr = Regex.Match(inputLine, pattern).Groups["time"].Value;
                    break;
                }
            }
            DateTime time;
            if (timeStr.Length > 0)
            {
                time = DateTime.Parse(timeStr);
            }
        }
    }
}

Your XML is invalid. You are missing the end tag, and the DOCTYPE must match the root tag.

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE TestingFacility>
<TestingFacility id="MACHINE2_1970-01-01T11_22_33" version="1970-01-01">
  <Program>
    <Title>Fancy Title</Title>
    <Steps>136</Steps>
  </Program>
  <Info type="Start" value="2070-01-01T11:22:33" />
  <Info type="LotMoreOfThem" value="42" />
</TestingFacility>
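
As for moving to XmlReader.Create, here is a minimal sketch. It assumes that the "Unexpected DTD declaration" error in Variant 2 comes from combining ConformanceLevel.Fragment with a document that carries a DOCTYPE. Leaving ConformanceLevel at its default and keeping DtdProcessing.Ignore should let the factory-created reader skip the DOCTYPE. A truncated file will still end in an XmlException, just as it does with XmlTextReader. The class name XmlReaderCreateSketch and the helper ReadNodes are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlReaderCreateSketch
{
    // Hypothetical helper: collects "Name: Value" lines the same way
    // the loop in the question prints them.
    public static List<string> ReadNodes(string xml)
    {
        var settings = new XmlReaderSettings
        {
            // ConformanceLevel is left at its default (Document) on purpose:
            // the assumption here is that a DOCTYPE is rejected in Fragment mode.
            DtdProcessing = DtdProcessing.Ignore, // skip the DOCTYPE, no DTD processing
            XmlResolver = null,                   // never resolve external resources
            IgnoreWhitespace = true               // replaces the NodeType check in the question
        };

        var lines = new List<string>();
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            try
            {
                while (reader.Read())
                    lines.Add(reader.Name + ": " + reader.Value);
            }
            catch (XmlException ex)
            {
                // A file without its closing root tag still fails here,
                // but only after every node before the cut has been read.
                lines.Add("XmlException: " + ex.Message);
            }
        }
        return lines;
    }

    public static void Main()
    {
        string xml =
            "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
            "<!DOCTYPE TestingFacility>\n" +
            "<TestingFacility id=\"MACHINE2_1970-01-01T11_22_33\" version=\"1970-01-01\">\n" +
            "  <Info type=\"Start\" value=\"2070-01-01T11:22:33\" />\n" +
            "</TestingFacility>";

        foreach (string line in ReadNodes(xml))
            Console.WriteLine(line);
    }
}
```

XmlTextReader defaults to DtdProcessing.Parse, while XmlReaderSettings defaults to DtdProcessing.Prohibit, which would account for the difference between Variants 1 and 3.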