How to create a Uri without the dontEscape parameter

Updated: 2023-09-27 18:32:33

I ran into a problem when downloading a page with HttpWebRequest:

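Before creating the new Uri, I escape the URL and pass it to the Uri constructor. But when I download the page with HttpWebRequest, the escaped quote character (%27) is converted back to '. Strange.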

Original:

https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs

Escaped, as passed to the Uri constructor:

https://fr.wikipedia.org/wiki/Roi_Julian_!_L%27%C3%89lu_des_l%C3%A9murs

What HttpWebRequest sends to the server:

https://fr.wikipedia.org/wiki/Roi_Julian_!_L'%C3%89lu_des_l%C3%A9murs
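To narrow down where the %27 gets lost, it may help to compare the string handed to the constructor with what the Uri object itself reports, before any request is made. The following is only a minimal sketch: the class name UriEscapeProbe and the helper Build are hypothetical, and no output is shown because the result of escaping and normalization differs between .NET versions.

```csharp
using System;

public static class UriEscapeProbe
{
    // Hypothetical helper: builds the Uri the same way the question does,
    // by escaping the title and appending it to the wiki base URL.
    public static Uri Build(string title)
    {
        return new Uri("https://fr.wikipedia.org/wiki/" + Uri.EscapeDataString(title));
    }

    public static void Main()
    {
        Uri uri = Build("Roi_Julian_!_L'Élu_des_lémurs");

        // OriginalString is the exact string that was passed to the constructor.
        Console.WriteLine("OriginalString: " + uri.OriginalString);
        // AbsoluteUri and AbsolutePath are the forms Uri produces after parsing.
        Console.WriteLine("AbsoluteUri:    " + uri.AbsoluteUri);
        Console.WriteLine("AbsolutePath:   " + uri.AbsolutePath);
    }
}
```

If OriginalString still contains %27 while AbsoluteUri does not, the conversion happens inside Uri itself rather than in HttpWebRequest.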

Here is my test code:

// Requires: using System; using System.IO; using System.Net; using System.Text;
private static void Test()
{
    var title = "Roi_Julian_!_L'Élu_des_lémurs";
    // Escape the title before building the Uri.
    var url = "https://fr.wikipedia.org/wiki/" + Uri.EscapeDataString(title);
    var uri = new Uri(url);
    HttpWebDownload(uri);
}

private static void HttpWebDownload(Uri uri)
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = "GET";
    request.AllowAutoRedirect = false;
    // Dispose the response and the reader once the body has been read.
    using (WebResponse response = request.GetResponse())
    using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
    {
        string pageResponse = reader.ReadToEnd();
        Console.WriteLine(pageResponse);
    }
}

This is the System.Net trace log:

System.Net Verbose: 0 : [10640] WebRequest::Create(https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs)
System.Net Verbose: 0 : [10640] HttpWebRequest#33111870::HttpWebRequest(https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs#-554901600)
System.Net Information: 0 : [10640] Current OS installation type is 'Server'.
System.Net Information: 0 : [10640] RAS supported: True
System.Net Verbose: 0 : [10640] Exiting HttpWebRequest#33111870::HttpWebRequest() 
System.Net Verbose: 0 : [10640] Exiting WebRequest::Create()    -> HttpWebRequest#33111870
System.Net Verbose: 0 : [10640] HttpWebRequest#33111870::GetResponse()
System.Net Error: 0 : [10640] Can't retrieve proxy settings for Uri 'https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs'. Error code: 12180.
System.Net Verbose: 0 : [10640] ServicePoint#66337667::ServicePoint(fr.wikipedia.org:443)
System.Net Information: 0 : [10640] Associating HttpWebRequest#33111870 with ServicePoint#66337667
System.Net Information: 0 : [10640] Associating Connection#35489797 with HttpWebRequest#33111870
System.Net Information: 0 : [10640] Connection#35489797 - Created connection from 10.168.184.78:55975 to 198.35.26.96:443.
System.Net Information: 0 : [10640] TlsStream#45795543::.ctor(host=fr.wikipedia.org, #certs=0)
System.Net Information: 0 : [10640] Associating HttpWebRequest#33111870 with ConnectStream#65677972
System.Net Information: 0 : [10640] HttpWebRequest#33111870 - Request: GET /wiki/Roi_Julian_!_L'%C3%89lu_des_l%C3%A9murs HTTP/1.1
System.Net Information: 0 : [10640] ConnectStream#65677972 - Sending headers
{
Host: fr.wikipedia.org
Connection: Keep-Alive
}.

I think this is caused by the deprecation of the dontEscape parameter, so I added a new function to work around it, but it did not work.

// Requires: using System; using System.Reflection;
private const ulong UserEscape = 0x00080000;

public static void EnableUserEscape(Uri uri)
{
    // "m_Flags" is a private field of Uri; its name is an implementation detail.
    FieldInfo fieldInfo = uri.GetType().GetField("m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);
    if (fieldInfo == null)
    {
        throw new MissingFieldException("'m_Flags' field not found");
    }
    // Set the UserEscape bit and write the flags back.
    var uriFlags = (ulong)fieldInfo.GetValue(uri);
    uriFlags = uriFlags | UserEscape;
    fieldInfo.SetValue(uri, uriFlags);
}

I call this function to enable UserEscape before passing the Uri to HttpWebDownload(), but HttpWebRequest then sends this URL to the server: https://fr.wikipedia.org/wiki/Roi_Julian_!_L'?lu_des_l?murs

Can anyone suggest a solution?

Thanks

Answer:

In your test method, define the URL as follows. I hope this helps.

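A string written as @"somestring+SpecialCharacter" is called a verbatim string literal. It basically means "do not interpret special characters (escape sequences such as \n) inside the string until the next quote character is reached". This affects only how the C# compiler reads the literal, not how Uri escapes the URL.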

var url = @"https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
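The effect of the @ prefix can be checked directly. The sketch below (the class and method names are hypothetical, chosen only for illustration) compares a regular and a verbatim literal for the URL above, and then for a string that does contain a backslash escape.

```csharp
using System;

public static class VerbatimStringDemo
{
    // The URL contains no backslashes, so both literals yield the same string.
    public static bool UrlLiteralsAreEqual()
    {
        string regular = "https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
        string verbatim = @"https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
        return regular == verbatim;
    }

    // "a\nb" is three characters (a, newline, b);
    // @"a\nb" is four characters (a, backslash, n, b).
    public static bool BackslashIsTreatedDifferently()
    {
        return "a\nb".Length == 3 && @"a\nb".Length == 4;
    }

    public static void Main()
    {
        Console.WriteLine(UrlLiteralsAreEqual());
        Console.WriteLine(BackslashIsTreatedDifferently());
    }
}
```

Both methods return true: the verbatim prefix changes the result only when the literal contains backslash escapes, so for this particular URL the Uri constructor receives the same string either way.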