HtmlAgilityPack issue with HTML
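Keywords: HTML with issue HtmlAgilityPack | Updated: 2023-09-27 18:20:38
I want to get the title, the image src, and some other details from the following HTML, but I'm running into a problem.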
<div class="thumb-container">
<a class="featured" title="Spectacularly " href="http://www.site.com"></a>
<div rel="0" id="property_image_1181140" class="thumb">
<a title="*Want this title*" href="*http://www.wanttogetthislink.com*">
<img style="width: 190px; height: 127px; left: -11px; top: 0px;" alt="Spectacularly upgraded 5 bed Family Villa For Sale" src="http://c1369013.r13.cf3.rackcdn.com/1181140-1-mini.jpg">
</a>
</div>
<div class="description-listing">
<div class="heading">
<div class="type">
<label>*5,900* sq.ft.,</label>
<span>*Villa*</span>
<p class="bedroom"><em>*5*</em></p>
<p class="bathroom"><em>*6*</em></p>
</div>
<p class="amount">
<label>AED</label>
<strong>*5,120,000*</strong>
</p>
</div>
Here is my code:
var allCarResults = rootNode.SelectNodes("//div[normalize-space(@class)='general-listing']");
foreach (var carResult in allCarResults)
{
var dataNode = carResult.SelectSingleNode(".//div[@class='thumb']");
var carNameNode = dataNode.SelectSingleNode(".//a");
}
In the HTML above, I want to get everything marked between asterisks (*...*).
I don't know how to do this.
The principle is basically the same for every item: write one XPath per item and select it relative to a common anchor node:
// Common anchor: every query below is relative to this container div.
HtmlNode thumbContainer = doc.DocumentNode.SelectSingleNode("//div[@class='thumb-container']");
// The link inside the thumb div carries both the title and the href.
HtmlNode link = thumbContainer.SelectSingleNode("./div[@class='thumb']/a");
string linkTitle = link.Attributes["title"].Value;
string linkHref = link.Attributes["href"].Value;
// The area label sits several levels down inside the description block.
HtmlNode label = thumbContainer.SelectSingleNode("./div[@class='description-listing']/div[@class='heading']/div[@class='type']/label");
string labelText = label.InnerText;
// ... Similar for other items
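For the remaining items, the same pattern could look roughly like the sketch below. It is only an illustration under some assumptions: the HTML string is a shortened stand-in for the markup in the question (with placeholder URLs and without the asterisks), `TextAt` is a hypothetical helper that is not part of HtmlAgilityPack, and null checks are added because `SelectSingleNode` returns null when nothing matches.

```csharp
using System;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class Program
{
    // Hypothetical helper: trimmed text of the node matched by a relative XPath,
    // or null when the XPath matches nothing.
    static string TextAt(HtmlNode anchor, string xpath)
    {
        HtmlNode node = anchor.SelectSingleNode(xpath);
        return node == null ? null : HtmlEntity.DeEntitize(node.InnerText).Trim();
    }

    static void Main()
    {
        // Shortened stand-in for the markup in the question (placeholder values).
        string html = @"<div class=""thumb-container"">
  <div class=""thumb""><a title=""Some title"" href=""http://www.example.com/listing"">
    <img alt=""Some alt text"" src=""http://www.example.com/1-mini.jpg""></a></div>
  <div class=""description-listing""><div class=""heading"">
    <div class=""type""><label>5,900 sq.ft.,</label> <span>Villa</span>
      <p class=""bedroom""><em>5</em></p> <p class=""bathroom""><em>6</em></p></div>
    <p class=""amount""><label>AED</label> <strong>5,120,000</strong></p>
  </div></div>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Common anchor for all relative queries.
        HtmlNode container = doc.DocumentNode.SelectSingleNode("//div[@class='thumb-container']");
        if (container == null)
        {
            Console.WriteLine("thumb-container not found");
            return;
        }

        // Attributes: GetAttributeValue returns the fallback instead of throwing
        // when the node has no such attribute.
        HtmlNode img = container.SelectSingleNode(".//div[@class='thumb']/a/img");
        string imgSrc = img == null ? null : img.GetAttributeValue("src", "");

        // Text items, each selected relative to the same anchor.
        string type = TextAt(container, ".//div[@class='type']/span");
        string bedrooms = TextAt(container, ".//p[@class='bedroom']");
        string bathrooms = TextAt(container, ".//p[@class='bathroom']");
        string amount = TextAt(container, ".//p[@class='amount']/strong");

        Console.WriteLine("src: " + imgSrc);
        Console.WriteLine("type: " + type);
        Console.WriteLine("bedrooms: " + bedrooms);
        Console.WriteLine("bathrooms: " + bathrooms);
        Console.WriteLine("amount: " + amount);
    }
}
```

If the page contains several listings, the same queries can be run inside a loop over `SelectNodes("//div[@class='thumb-container']")`, keeping in mind that `SelectNodes` also returns null when there are no matches.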
Alternatively, you can walk through every HtmlNode and its children, and match each node against a list of the items you are looking for.
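A rough sketch of that traversal approach is shown below. The lookup table keyed on "tag.class" is an assumption made purely for illustration (neither the question nor HtmlAgilityPack prescribes it), and the HTML is again a shortened stand-in for the real page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

class Program
{
    static void Main()
    {
        // Shortened stand-in for the markup in the question (placeholder values).
        string html = @"<div class=""thumb-container"">
  <div class=""type""><span>Villa</span>
    <p class=""bedroom""><em>5</em></p> <p class=""bathroom""><em>6</em></p></div>
  <p class=""amount""><label>AED</label> <strong>5,120,000</strong></p>
</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Hypothetical lookup table: "tag.class" of an element -> name of the item.
        var wanted = new Dictionary<string, string>
        {
            { "p.bedroom", "Bedrooms" },
            { "p.bathroom", "Bathrooms" },
            { "p.amount", "Price" },
        };

        // Descendants() enumerates every node below the root, text nodes included,
        // so anything that is not an element is skipped.
        foreach (HtmlNode node in doc.DocumentNode.Descendants())
        {
            if (node.NodeType != HtmlNodeType.Element)
            {
                continue;
            }

            string key = node.Name + "." + node.GetAttributeValue("class", "");
            string itemName;
            if (wanted.TryGetValue(key, out itemName))
            {
                string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
                Console.WriteLine(itemName + ": " + text);
            }
        }
    }
}
```

This visits the whole tree once, which is convenient when many items are wanted, but the exact-match key breaks as soon as an element carries more than one class, so the XPath approach is usually the more robust of the two.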