RestSharp has problems deserializing XML that includes a byte order mark

Last updated: 2023-09-27 18:37:24

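There is a public web service that I want to use in a short C# application: http://ws.parlament.ch/

The XML returned by this web service has a "BOM" (byte order mark) at the beginning, which causes RestSharp to fail when deserializing the XML, with the following error message: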

Error retrieving response.  Check inner details for more info. ---> System.Xml.XmlException: Data at the root level is invalid. Line 1, position 1.
   at System.Xml.XmlTextReaderImpl.Throw(Exception e)
   at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
   at System.Xml.XmlTextReaderImpl.ParseRootLevelWhitespace()
   at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
   at System.Xml.XmlTextReaderImpl.Read()
   at System.Xml.Linq.XDocument.Load(XmlReader reader, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text, LoadOptions options)
   at System.Xml.Linq.XDocument.Parse(String text)
   at RestSharp.Deserializers.XmlDeserializer.Deserialize[T](IRestResponse response)
   at RestSharp.RestClient.Deserialize[T](IRestRequest request, IRestResponse raw)
   --- End of inner exception stack trace ---

Here is a simple example that uses http://ws.parlament.ch/sessions?format=xml to get a list of "Sessions":

public class Session
{
    public int Id { get; set; }
    public DateTime? Updated { get; set; }
    public int? Code { get; set; }
    public DateTime? From { get; set; }
    public string Name { get; set; }
    public DateTime? To { get; set; }
}

static void Main(string[] args)
{
    var request = new RestRequest();
    request.RequestFormat = DataFormat.Xml;
    request.Resource = "sessions";
    request.AddParameter("format", "xml");
    var client = new RestClient("http://ws.parlament.ch/");
    var response = client.Execute<List<Session>>(request);
    if (response.ErrorException != null)
    {
        const string message = "Error retrieving response.  Check inner details for more info.";
        var ex = new ApplicationException(message, response.ErrorException);
        Console.WriteLine(ex);
    }
    List<Session> test = response.Data;
    Console.Read();
}

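When I first use Fiddler to manipulate the returned XML and remove the first 3 bytes (the "BOM"), the code above works! Can someone help me handle this directly in RestSharp? What am I doing wrong? Thank you in advance!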


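I found the solution. Thanks @arootbeer for the hint!

Instead of wrapping the XmlDeserializer, you can also use the "RestRequest.OnBeforeDeserialization" event from RestSharp. So you just need to insert something like this after the new RestRequest() (see my initial code example), and then it works perfectly!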

request.OnBeforeDeserialization = resp =>
{
    //remove the first ByteOrderMark
    //see: http://stackoverflow.com/questions/19663100/restsharp-has-problems-deserializing-xml-including-byte-order-mark
    string byteOrderMarkUtf8 = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
    if (resp.Content.StartsWith(byteOrderMarkUtf8))
        resp.Content = resp.Content.Remove(0, byteOrderMarkUtf8.Length);
};
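
One thing to watch in the snippet above: `string.StartsWith(string)` without a comparison argument is culture-sensitive, and culture-sensitive comparisons treat U+FEFF as an ignorable character, so the check can succeed even when no BOM is present, in which case the first real character gets removed. Below is a minimal sketch of an ordinal variant. `BomStripper` and `StripLeadingBom` are hypothetical names for illustration, not part of RestSharp.

```csharp
using System;

public static class BomStripper
{
    // U+FEFF is what a UTF-8 BOM (EF BB BF) becomes after decoding to a string.
    private const char Bom = '\uFEFF';

    // Returns the content without a leading U+FEFF, using a plain char
    // comparison so that no culture rules are involved.
    public static string StripLeadingBom(string content)
    {
        if (string.IsNullOrEmpty(content))
        {
            return content;
        }
        return content[0] == Bom ? content.Substring(1) : content;
    }
}
```

Inside the handler this would be used as `resp.Content = BomStripper.StripLeadingBom(resp.Content);`, assuming `Content` is settable as it is in the code above.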

I had the same problem, though not specifically with RestSharp. Use this:

var responseXml = new UTF8Encoding(false).GetString(bytes);
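
Note that `GetString` decodes whatever bytes it is given. As far as I know, the `false` constructor argument only controls whether `GetPreamble()` returns a BOM, so a BOM already present in `bytes` may still come out as U+FEFF at the start of the string. Here is a sketch of an alternative that relies on `StreamReader`, which skips a preamble matching its encoding. `Utf8Decoder` and `DecodeUtf8` are hypothetical names.

```csharp
using System.IO;
using System.Text;

public static class Utf8Decoder
{
    // Decodes UTF-8 bytes to a string. StreamReader consumes a leading
    // UTF-8 BOM (EF BB BF) instead of passing it through as U+FEFF.
    public static string DecodeUtf8(byte[] bytes)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With RestSharp, the input would presumably be the response's `RawBytes`.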

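Original discussion: XmlReader breaks on UTF-8 BOM

Relevant quote from the answer:

The XML string must not (!) contain the BOM; the BOM is only allowed in byte data (e.g. streams) that is encoded with UTF-8. This is because the string representation is not encoded, but is already a sequence of Unicode characters.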

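Edit: Looking through their docs, it looks like the most straightforward way to handle this (aside from filing a GitHub issue) would be to call the non-generic Execute() method and deserialize the response from that string. You could also create an IDeserializer that wraps the default XML deserializer.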
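
A minimal sketch of the first option, parsing the response string yourself. It uses `XDocument` rather than RestSharp's own deserializer, and `ManualXml` and `ParseIgnoringBom` are hypothetical names introduced only for this example.

```csharp
using System.Xml.Linq;

public static class ManualXml
{
    // Parses XML text after dropping any leading U+FEFF characters,
    // which XDocument.Parse would otherwise reject at line 1, position 1.
    public static XDocument ParseIgnoringBom(string content)
    {
        return XDocument.Parse(content.TrimStart('\uFEFF'));
    }
}
```

Assuming the non-generic call is `client.Execute(request)` and the body is exposed as `response.Content`, the usage would be `var doc = ManualXml.ParseIgnoringBom(response.Content);`, after which the elements still have to be mapped to `Session` objects by hand.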

The solution posted by @dataCore doesn't quite work, but this one should.

request.OnBeforeDeserialization = resp => {
    if (resp.RawBytes.Length >= 3 && resp.RawBytes[0] == 0xEF && resp.RawBytes[1] == 0xBB && resp.RawBytes[2] == 0xBF)
    {
        // Copy the data but with the UTF-8 BOM removed.
        var newData = new byte[resp.RawBytes.Length - 3];
        Buffer.BlockCopy(resp.RawBytes, 3, newData, 0, newData.Length);
        resp.RawBytes = newData;
        // Force re-conversion to string on next access
        resp.Content = null;
    }
};

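Setting resp.Content to null is there as a safety guard, because RawBytes is only converted to a string if Content isn't already set to a value.

To make it work with RestSharp, you can parse the response content manually and remove all the "funny" characters that come before the "<".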

// Remove any 'funny' characters (such as a BOM) that come before the first '<'.
// IndexOf avoids an exception when the content is empty or contains no '<'.
int start = responseContent.IndexOf('<');
if (start > 0)
{
    responseContent = responseContent.Substring(start);
}
var reader = XmlReader.Create(new StringReader(responseContent));