Parsing HTML content with a C# parser
Keywords: HTML, content, using | Last updated: 2023-09-27 17:56:32
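I have the following HTML file. For each H2 ("Standard (Flexible Rate)..." and "Executive (Flexible Rate)...") I want to get the "Only Bed" and "Breakfast Included" entries.
Then I want to group them, so that Standard holds its two prices ("Only Bed" and "Breakfast Included"), and the same for Executive.
I tried Fizzler with HtmlAgilityPack, but I could not get the correct result. Could you suggest an approach or a good parser for this case? Thank you.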
<div id="accordionResizer" style="padding:5px; height:300px; border-radius:6px;" class="ui-widget-content regestancias">
<div id="accordion" class="dias">
<h2>
<a href="#">
Standard (Flexible Rate) from 139 €
</a>
</h2>
<div class="estancias_precios estancias_precios_new">
<table style="width: 285px;">
<tr class="" title="">
<cont>
<td style="width: 25px;">
<input type="radio" name="estancias" id="tarifa602385" elem="tarifa" idelem="602" idreg="385" precio="139" reg="Only%20Bed" nombre="Standard%20%28Flexible%20Rate%29" />
</td>
<td style="width: 155px;">
<label class="descrip" for="tarifa602385" precio="139.00" reg="Only%20Bed" nombre="Standard%20%28Flexible%20Rate%29">
Only Bed
</label>
</td>
<td style="width: 55px;"></td>
<td style="width: 55px;">
<strong class="precios_mos">139.00 €</strong>
</td>
</cont>
</tr>
<tr class="" title="">
<cont>
<td style="width: 25px;">
<input type="radio" name="estancias" id="tarifa602386" elem="tarifa" idelem="602" idreg="386" precio="156.9" reg="Breakfast%20Included" nombre="Standard%20%28Flexible%20Rate%29" />
</td>
<td style="width: 155px;">
<label class="descrip" for="tarifa602386" precio="156.90" reg="Breakfast%20Included" nombre="Standard%20%28Flexible%20Rate%29">
Breakfast Included
</label>
</td>
<td style="width: 55px;"></td>
<td style="width: 55px;">
<strong class="precios_mos">156.90 €</strong>
</td>
</cont>
</tr>
</table>
</div>
<h2>
<a href="#">
Executive (Flexible Rate) from 169 €
</a>
</h2>
<div class="estancias_precios estancias_precios_new">
<table style="width: 285px;">
<tr class="" title="">
<cont>
<td style="width: 25px;">
<input type="radio" name="estancias" id="tarifa666385" elem="tarifa" idelem="666" idreg="385" precio="169" reg="Only%20Bed" nombre="Executive%20%28Flexible%20Rate%29" />
</td>
<td style="width: 155px;">
<label class="descrip" for="tarifa666385" precio="169.00" reg="Only%20Bed" nombre="Executive%20%28Flexible%20Rate%29">
Only Bed
</label>
</td>
<td style="width: 55px;"></td>
<td style="width: 55px;">
<strong class="precios_mos">169.00 €</strong>
</td>
</cont>
</tr>
<tr class="" title="">
<cont>
<td style="width: 25px;">
<input type="radio" name="estancias" id="tarifa666386" elem="tarifa" idelem="666" idreg="386" precio="186.9" reg="Breakfast%20Included" nombre="Executive%20%28Flexible%20Rate%29" />
</td>
<td style="width: 155px;">
<label class="descrip" for="tarifa666386" precio="186.90" reg="Breakfast%20Included" nombre="Executive%20%28Flexible%20Rate%29">
Breakfast Included
</label>
</td>
<td style="width: 55px;"></td>
<td style="width: 55px;">
<strong class="precios_mos">186.90 €</strong>
</td>
</cont>
</tr>
</table>
</div>
</div>
</div>
Here you go, a quick and dirty approach:
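using System;
using System.Collections.Generic;
using System.Globalization;
using HtmlAgilityPack;

class RoomInfo
{
    public String Name { get; set; }
    public Dictionary<String, Double> Prices { get; set; }
}

private static List<RoomInfo> HtmlFile()
{
    List<RoomInfo> rooms = new List<RoomInfo>();
    HtmlDocument document = new HtmlDocument();
    document.Load("file.txt");
    // SelectNodes returns null (not an empty list) when nothing matches.
    var h2Nodes = document.DocumentNode.SelectNodes("//h2");
    if (h2Nodes == null) return rooms;
    foreach (var h2Node in h2Nodes)
    {
        RoomInfo roomInfo = new RoomInfo
        {
            Name = h2Node.InnerText.Trim(),
            Prices = new Dictionary<string, double>()
        };
        // The prices live in the first <div> after the heading. Using the
        // following-sibling axis avoids depending on whitespace text nodes.
        var priceDiv = h2Node.SelectSingleNode("following-sibling::div[1]");
        var labels = priceDiv?.SelectNodes(".//label");
        if (labels != null)
        {
            foreach (var label in labels)
            {
                // "precio" uses a dot as the decimal separator, so parse it culture-independently.
                roomInfo.Prices.Add(label.InnerText.Trim(), Convert.ToDouble(label.Attributes["precio"].Value, CultureInfo.InvariantCulture));
            }
        }
        rooms.Add(roomInfo);
    }
    return rooms;
}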
The rest is up to you! ;-)
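As an alternative sketch (not part of the answer above): every radio `input` in the posted markup already carries the room name (`nombre`), the board type (`reg`) and the price (`precio`) as attributes, so the grouping can be done without walking from each `h2` to its sibling `div`. The `RateParser` class, its `Parse` method and the `SampleHtml` fixture are hypothetical names made up for illustration, and the fixture is a trimmed, single-quoted version of the question's HTML. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using HtmlAgilityPack;

public static class RateParser
{
    // Hypothetical fixture: a trimmed-down copy of the inputs from the question.
    public const string SampleHtml = @"
<table>
  <tr><td><input type='radio' elem='tarifa' precio='139' reg='Only%20Bed' nombre='Standard%20%28Flexible%20Rate%29' /></td></tr>
  <tr><td><input type='radio' elem='tarifa' precio='156.9' reg='Breakfast%20Included' nombre='Standard%20%28Flexible%20Rate%29' /></td></tr>
  <tr><td><input type='radio' elem='tarifa' precio='169' reg='Only%20Bed' nombre='Executive%20%28Flexible%20Rate%29' /></td></tr>
  <tr><td><input type='radio' elem='tarifa' precio='186.9' reg='Breakfast%20Included' nombre='Executive%20%28Flexible%20Rate%29' /></td></tr>
</table>";

    // Returns room name -> (board type -> price).
    public static Dictionary<string, Dictionary<string, double>> Parse(string html)
    {
        var result = new Dictionary<string, Dictionary<string, double>>();
        var document = new HtmlDocument();
        document.LoadHtml(html);

        // SelectNodes returns null when there is no match.
        var inputs = document.DocumentNode.SelectNodes("//input[@elem='tarifa']");
        if (inputs == null) return result;

        foreach (var input in inputs)
        {
            // The attribute values are percent-encoded ("Only%20Bed"), so decode them.
            string room = Uri.UnescapeDataString(input.GetAttributeValue("nombre", ""));
            string board = Uri.UnescapeDataString(input.GetAttributeValue("reg", ""));
            double price = double.Parse(input.GetAttributeValue("precio", "0"), CultureInfo.InvariantCulture);

            if (!result.TryGetValue(room, out var prices))
            {
                prices = new Dictionary<string, double>();
                result[room] = prices;
            }
            prices[board] = price;
        }
        return result;
    }
}
```

Calling `RateParser.Parse(RateParser.SampleHtml)` should give two entries, "Standard (Flexible Rate)" and "Executive (Flexible Rate)", each holding an "Only Bed" and a "Breakfast Included" price. The trade-off is that this relies on the site continuing to emit those custom attributes, whereas the `h2`-based approach relies on the visible layout.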