NullReferenceException in HtmlAgilityPack
Keywords: HtmlAgilityPack in NullReferenceException | Updated: 2023-09-27 18:31:10
I am trying to extract a link using XPath from the URL below:
string url = "http://www.album-cover-art.org/search.php?q=Ruin+-+Live+Album+Version+Lamb+of+God";
My code:
HtmlAgilityPack.HtmlWeb web = new HtmlAgilityPack.HtmlWeb();
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc = web.Load(url); //Exception generated here Line 23
if (htmlDoc.DocumentNode != null)
{
    HtmlNode linkNode = htmlDoc.DocumentNode.SelectSingleNode(".//*[@id='related_search_row']/img/@src");
    if (linkNode != null)
        Console.WriteLine(linkNode.InnerText);
}
The code above compiles fine, but when I run it, it throws this exception:
Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.
Full stack trace:
System.NullReferenceException: Object reference not set to an instance of an object.
at HtmlAgilityPack.HtmlDocument.ReadDocumentEncoding(HtmlNode node) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1916
at HtmlAgilityPack.HtmlDocument.PushNodeEnd(Int32 index, Boolean close) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1805
at HtmlAgilityPack.HtmlDocument.Parse() in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 1468
at HtmlAgilityPack.HtmlDocument.Load(TextReader reader) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlDocument.cs:line 769
at HtmlAgilityPack.HtmlWeb.Get(Uri uri, String method, String path, HtmlDocument doc, IWebProxy proxy, ICredentials creds) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1515
at HtmlAgilityPack.HtmlWeb.LoadUrl(Uri uri, String method, WebProxy proxy, NetworkCredential creds) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1563
at HtmlAgilityPack.HtmlWeb.Load(String url, String method) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1149
at HtmlAgilityPack.HtmlWeb.Load(String url) in C:\Source\htmlagilitypack\Trunk\HtmlAgilityPack\HtmlWeb.cs:line 1107
at ScreenScrapping.Program.Main(String[] args) in c:\Users\ranveer\csharp\ScreenScrapping\ScreenScrapping\Program.cs:line 23
So my question is: why am I getting this exception?
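This is a bug in HtmlAgilityPack. The document you are trying to parse contains <meta http-equiv="Content-Type" content="text/html; charset=iso-utf-8">, and AgilityPack cannot resolve the charset value (iso-utf-8) to a valid encoding name. As Simon Mourier said, this is a bug introduced in 1.4.0.0.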
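To see why that charset value is a problem, it can help to check it against the framework's own encoding lookup. The sketch below is only an illustration: `CharsetCheck` and `IsKnownEncodingName` are hypothetical names, not part of HtmlAgilityPack, and it shows only that .NET's `Encoding.GetEncoding` rejects the name, not how the library handles that failure internally.

```csharp
using System;
using System.Text;

static class CharsetCheck
{
    // Hypothetical helper: returns true if .NET can map the name to an Encoding.
    public static bool IsKnownEncodingName(string name)
    {
        try
        {
            Encoding.GetEncoding(name);
            return true;
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws ArgumentException for unrecognized names.
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsKnownEncodingName("utf-8"));      // True
        Console.WriteLine(IsKnownEncodingName("iso-8859-1")); // True
        Console.WriteLine(IsKnownEncodingName("iso-utf-8"));  // False: the value from the page's <meta> tag
    }
}
```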
To work around it, load the document from a stream yourself and set the encoding explicitly, like this:
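// Requires: using System.Net; using System.Text; using HtmlAgilityPack;
var htmlDoc = new HtmlDocument();
// Do not let the parser take the encoding from the page's <meta> tag.
htmlDoc.OptionReadEncoding = false;
var request = (HttpWebRequest)WebRequest.Create(url);
request.Method = "GET";
using (var response = (HttpWebResponse)request.GetResponse())
{
    using (var stream = response.GetResponseStream())
    {
        // Decode the response explicitly as UTF-8.
        htmlDoc.Load(stream, Encoding.UTF8);
    }
}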
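Hard-coding UTF-8 works for this page, but other pages may declare a different, valid charset. One possible variation, sketched here under assumptions, is to try the declared name first and fall back to a default only when it is not recognized. `EncodingFallback` and `ResolveOrDefault` are hypothetical names introduced for illustration and are not part of HtmlAgilityPack.

```csharp
using System;
using System.Text;

static class EncodingFallback
{
    // Hypothetical helper: resolve a charset name, or return the fallback
    // when the name is missing or not recognized by .NET.
    public static Encoding ResolveOrDefault(string charset, Encoding fallback)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return fallback;
        try
        {
            return Encoding.GetEncoding(charset.Trim());
        }
        catch (ArgumentException)
        {
            // Unrecognized name such as "iso-utf-8".
            return fallback;
        }
    }

    static void Main()
    {
        Console.WriteLine(ResolveOrDefault("iso-utf-8", Encoding.UTF8).WebName);  // utf-8
        Console.WriteLine(ResolveOrDefault("iso-8859-1", Encoding.UTF8).WebName); // iso-8859-1
        Console.WriteLine(ResolveOrDefault(null, Encoding.UTF8).WebName);         // utf-8
    }
}
```

In the workaround above, the result of `ResolveOrDefault(response.CharacterSet, Encoding.UTF8)` could be passed to `htmlDoc.Load` in place of `Encoding.UTF8`, assuming the server's Content-Type header is more trustworthy than the page's meta tag.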