How to create a Uri without the dontEscape parameter
Updated: 2023-09-27 18:32:33
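I ran into a problem when downloading a page with HttpWebRequest:
I escape the URL before creating a new Uri and pass the escaped string to the Uri constructor. But when HttpWebRequest downloads the page, it turns the escaped quote (%27) back into a literal '. Strange.
Original: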
https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs
Escaped, and passed to the Uri constructor:
https://fr.wikipedia.org/wiki/Roi_Julian_!_L%27%C3%89lu_des_l%C3%A9murs
What HttpWebRequest sends to the server:
https://fr.wikipedia.org/wiki/Roi_Julian_!_L'%C3%89lu_des_l%C3%A9murs
Here is my test code:
private static void Test()
{
    var title = "Roi_Julian_!_L'Élu_des_lémurs";
    var url = "https://fr.wikipedia.org/wiki/" + Uri.EscapeDataString(title);
    var uri = new Uri(url);
    HttpWebDownload(uri);
}

private static void HttpWebDownload(Uri uri)
{
    WebResponse response = null;
    StreamReader reader = null;
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    request.Method = "GET";
    request.AllowAutoRedirect = false;
    response = request.GetResponse();
    reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
    string pageResponse = reader.ReadToEnd();
    Console.WriteLine(pageResponse);
}
This is the System.Net trace log:
System.Net Verbose: 0 : [10640] WebRequest::Create(https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs)
System.Net Verbose: 0 : [10640] HttpWebRequest#33111870::HttpWebRequest(https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs#-554901600)
System.Net Information: 0 : [10640] Current OS installation type is 'Server'.
System.Net Information: 0 : [10640] RAS supported: True
System.Net Verbose: 0 : [10640] Exiting HttpWebRequest#33111870::HttpWebRequest()
System.Net Verbose: 0 : [10640] Exiting WebRequest::Create() -> HttpWebRequest#33111870
System.Net Verbose: 0 : [10640] HttpWebRequest#33111870::GetResponse()
System.Net Error: 0 : [10640] Can't retrieve proxy settings for Uri 'https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs'. Error code: 12180.
System.Net Verbose: 0 : [10640] ServicePoint#66337667::ServicePoint(fr.wikipedia.org:443)
System.Net Information: 0 : [10640] Associating HttpWebRequest#33111870 with ServicePoint#66337667
System.Net Information: 0 : [10640] Associating Connection#35489797 with HttpWebRequest#33111870
System.Net Information: 0 : [10640] Connection#35489797 - Created connection from 10.168.184.78:55975 to 198.35.26.96:443.
System.Net Information: 0 : [10640] TlsStream#45795543::.ctor(host=fr.wikipedia.org, #certs=0)
System.Net Information: 0 : [10640] Associating HttpWebRequest#33111870 with ConnectStream#65677972
System.Net Information: 0 : [10640] HttpWebRequest#33111870 - Request: GET /wiki/Roi_Julian_!_L'%C3%89lu_des_l%C3%A9murs HTTP/1.1
System.Net Information: 0 : [10640] ConnectStream#65677972 - Sending headers
{
Host: fr.wikipedia.org
Connection: Keep-Alive
}.
I assumed this was caused by the deprecation of the dontEscape parameter, so I added a new function to work around it, but that failed.
private const ulong UserEscape = 0x00080000;

public static void EnableUserEscape(Uri uri)
{
    FieldInfo fieldInfo = uri.GetType().GetField("m_Flags", BindingFlags.Instance | BindingFlags.NonPublic);
    if (fieldInfo == null)
    {
        throw new MissingFieldException("'m_Flags' field not found");
    }
    var uriFlags = (ulong)fieldInfo.GetValue(uri);
    uriFlags = uriFlags | UserEscape;
    fieldInfo.SetValue(uri, uriFlags);
}
I call this function to enable UserEscape before passing the Uri to HttpWebDownload(), but HttpWebRequest then sends this URL to the server: https://fr.wikipedia.org/wiki/Roi_Julian_!_L'?lu_des_l?murs
Can anyone suggest a solution?
Thanks
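In the test method, define the URL as shown below. Hope it helps.
A string written as @"somestring+SpecialCharacter" is called a verbatim string. It basically means "do not apply any interpretation to special characters inside the string until the next quote character is reached".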
var url = @"https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
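To make the two ideas above easier to check, here is a minimal sketch (not part of the original answer) that can be run without any network access. It first shows what the @ prefix changes in a C# literal, then prints how a Uri built from the escaped string stores and renders it. UriProbe, Dump and VerbatimMakesNoDifference are hypothetical names introduced only for this illustration. The Uri output is not shown because it can differ between .NET versions, and which form HttpWebRequest actually puts on the wire is not confirmed here; compare the printed values with the trace log.

```csharp
using System;

public static class UriProbe
{
    // Hypothetical helper: prints the different views a Uri offers of one URL.
    public static void Dump(string url)
    {
        var uri = new Uri(url);
        Console.WriteLine("OriginalString: " + uri.OriginalString); // string as passed to the constructor
        Console.WriteLine("AbsoluteUri:    " + uri.AbsoluteUri);    // canonical absolute form
        Console.WriteLine("AbsolutePath:   " + uri.AbsolutePath);   // path portion only
        Console.WriteLine("ToString():     " + uri);                // display form, may decode some escapes
    }

    // Compares a regular and a verbatim literal of the URL from the question.
    public static bool VerbatimMakesNoDifference()
    {
        string regular = "https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
        string verbatim = @"https://fr.wikipedia.org/wiki/Roi_Julian_!_L'Élu_des_lémurs";
        return regular == verbatim;
    }

    public static void Main()
    {
        // The @ prefix only disables backslash escape sequences.
        Console.WriteLine(@"C:\temp" == "C:\\temp");      // True
        // This URL has no backslashes, so both literals are the same string.
        Console.WriteLine(VerbatimMakesNoDifference());   // True

        // Escaped URL copied from the question; inspect what Uri does with it.
        Dump("https://fr.wikipedia.org/wiki/Roi_Julian_!_L%27%C3%89lu_des_l%C3%A9murs");
    }
}
```

Since the verbatim and regular literals compare equal for this URL, the practical difference in the suggested line is that the raw title goes to the Uri constructor without a prior call to Uri.EscapeDataString.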