Html Agility Pack returns null values from a table
Keywords: null values, Html | Updated: 2023-09-27 18:20:21
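I'm trying to learn some basic scraping, and thanks to this site I've learned a lot of new things, but now I've run into this problem. Here is the code I'm using: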
var web = new HtmlWeb();
var doc = web.Load("url");
var nodes = doc.DocumentNode.SelectNodes("//*[@id='hotellist_inner']/div");
StreamWriter output = new StreamWriter("out.txt");
if (nodes != null)
{
    foreach (HtmlNode item in nodes)
    {
        if (item != null && item.Attributes["data-recommended"] != null)
        {
            string line = "";
            var nome = item.SelectSingleNode(".//h3/a").InnerText;
            var rating = item.SelectSingleNode(".//span[@class='rating']").InnerText;
            var price = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/strong[1]");
            var discount = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/div[1]");
            line = line + nome + "," + rating + "," + price + "," + discount;
            Console.WriteLine(line);
            output.WriteLine(line);
        }
    }
}
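The first two items (name and rating) work fine, but for price and discount I get empty results. I analyzed the page with Chrome Scraper (here is the link), and the XPaths I'm using return results there without any trouble. I can't see what I'm doing wrong. Any help would be appreciated! :D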
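After a quick look at the page you are trying to scrape, not every item has price and discount information. You need to handle that case properly to avoid exceptions, for example by checking for null before reading InnerText. With this small change to your code, you can get the price and discount information: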
if (item != null && item.Attributes["data-recommended"] != null)
{
    string line = "";
    var nome = item.SelectSingleNode(".//h3/a").InnerText;
    var rating = item.SelectSingleNode(".//span[@class='rating']").InnerText;
    var price = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/strong[1]");
    var discount = item.SelectSingleNode("./div[2]/div[3]/div[2]/table/tbody/tr/td[4]/div/div[1]");
    //set priceString to empty string if price is null, else set it to price.InnerText
    var priceString = price == null ? "" : price.InnerText;
    //do similar step for discountString
    var discountString = discount == null ? "" : discount.InnerText;
    line = line + nome + "," + rating + "," + priceString + "," + discountString;
    Console.WriteLine(line);
    output.WriteLine(line);
}
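The same null check can be folded into a small helper so that every field is read the same way. The following is only a sketch under stated assumptions: the helper names (HotelRows, TextOrEmpty, ToCsvLine), the inline sample HTML, and the class names 'price' and 'discount' are made up for illustration and do not come from the real page. It needs the HtmlAgilityPack NuGet package and a C# 6 or later compiler for the null-conditional operator.

```csharp
using System;
using HtmlAgilityPack;

static class HotelRows
{
    // Hypothetical helper: returns the trimmed InnerText of the first node
    // matching xpath, or "" when SelectSingleNode finds nothing (returns null).
    public static string TextOrEmpty(HtmlNode parent, string xpath)
    {
        var node = parent.SelectSingleNode(xpath);
        return node?.InnerText.Trim() ?? "";
    }

    // Builds one comma-separated line; missing fields become empty strings.
    // The 'price' and 'discount' class names are assumptions for this sample only.
    public static string ToCsvLine(HtmlNode item)
    {
        return string.Join(",",
            TextOrEmpty(item, ".//h3/a"),
            TextOrEmpty(item, ".//span[@class='rating']"),
            TextOrEmpty(item, ".//strong[@class='price']"),
            TextOrEmpty(item, ".//div[@class='discount']"));
    }
}

static class Program
{
    static void Main()
    {
        // Made-up markup: the second hotel has no price and no discount.
        var doc = new HtmlDocument();
        doc.LoadHtml(@"<div id='hotellist_inner'>
  <div data-recommended='1'><h3><a>Hotel A</a></h3><span class='rating'>8.1</span><strong class='price'>100</strong></div>
  <div data-recommended='1'><h3><a>Hotel B</a></h3><span class='rating'>7.5</span></div>
</div>");

        var nodes = doc.DocumentNode.SelectNodes("//*[@id='hotellist_inner']/div");
        if (nodes == null) return; // SelectNodes returns null when nothing matches

        foreach (var item in nodes)
        {
            Console.WriteLine(HotelRows.ToCsvLine(item));
        }
    }
}
```

With this sample markup, the row for Hotel B is written with empty price and discount fields instead of throwing a NullReferenceException. Unlike the snippet above, the helper also guards the name and rating lookups, which would otherwise throw if an item lacked them.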