Problem parsing an HTML table in ASP.NET

Updated: 2023-09-27 18:04:27

I am using HtmlAgilityPack to parse a table from an HTML page in my ASP.NET application, but I get an exception every time.

ASP.NET page:

    <%@ Page Language="C#" AutoEventWireup="true" CodeBehind="Default.aspx.cs" Inherits="httprequest_web.Default" %>
    <!DOCTYPE html>
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head runat="server">
        <title></title>
    </head>
    <body>
        <form id="form1" runat="server">
            <div>
                <asp:Label ID="Label1" runat="server" Text="Label"></asp:Label>
                <hr />
                <br />
                <asp:Label ID="Label2" runat="server" Text="Label"></asp:Label>
                &nbsp;<asp:TextBox ID="TextBox1" runat="server" MaxLength="6" TextMode="Number"></asp:TextBox>
                <br />
                <br />
                <asp:Button ID="Button1" runat="server" Text="Button" OnClick="Button1_Click" />
                &nbsp;<asp:Label ID="Label3" runat="server" Text="Label" Visible="False"></asp:Label>
                <hr />
                <br />
                <asp:Label ID="Label4" runat="server" Text="Label" Visible="False"></asp:Label>
                <br />
            </div>
        </form>
    </body>
    </html>

C# code:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    using System.Web.UI;
    using System.Web.UI.WebControls;
    using System.Net;
    using System.IO;
    using Newtonsoft.Json;
    using HtmlAgilityPack;

    namespace httprequest_web
    {
        public partial class Default : System.Web.UI.Page
        {
            protected void Page_Load(object sender, EventArgs e)
            {
                if (!IsPostBack)
                {
                    Label1.Text = "Train Running Status/ JSON Output";
                    Label2.Text = "Please enter Train No.:";
                }
            }

            protected void Button1_Click(object sender, EventArgs e)
            {
                // Generation of HTTP request from the train number specified in text box.
                try
                {
                    Label3.Visible = false;
                    // get request
                    string base_url = "http://railenquiry.in/runningstatus/";
                    string url_request = string.Concat(base_url, TextBox1.Text);
                    // using HtmlAgilityPack
                    WebClient rq = new WebClient();
                    string getdata = rq.DownloadString(url_request);
                    HtmlAgilityPack.HtmlDocument rec_data = new HtmlAgilityPack.HtmlDocument();
                    rec_data.LoadHtml(getdata);
                    List<List<string>> table = rec_data.DocumentNode
                        .SelectSingleNode("//table[@class='table table-striped table-vcenter table-bordered table-responsive']")
                        .Descendants("tr")
                        .Where(tr => tr.Elements("td").Count() >= 1)
                        .Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
                        .ToList();
                }
                catch
                {
                    Label3.Visible = true;
                    Label3.Text = "Something went wrong. Please try again!";
                }
            }
        }
    }

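HTML page source: click here

I am trying to parse the running status information so that I can display it on my web page, but every time I get a null reference exception. Please tell me what I am doing wrong. Thanks in advance!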


    // Requires "using System.Net.Http;" and must run inside an async method because of await.
    HttpClient client = new HttpClient();
    var html = await client.GetStringAsync("http://railenquiry.in/runningstatus/12736");
    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);
    // Same as in your question
    var table = doc.DocumentNode.SelectSingleNode("//table[@class='table table-striped table-vcenter table-bordered table-responsive']");
    var result = table.Descendants("tr")
        .Select(tr => tr.Descendants("td").Select(td => td.InnerText).ToList())
        .Skip(1)                    // Skip headers
        .Where(x => x.Count == 11)  // Skip the footer row
        .ToList();

result will be the List<List<string>> that you expected in your question.
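
One thing worth ruling out when a null reference exception shows up in code like this: HtmlAgilityPack's SelectSingleNode returns null when the XPath matches nothing, so chaining .Descendants(...) directly onto it will throw. The sketch below wraps the lookup in a null check. It is only an illustration, not the fix confirmed for this particular page; the TableParser class and ParseRows method are hypothetical names introduced here, and the XPath is whatever the caller passes in.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class TableParser
{
    // Returns one list of trimmed cell texts per <tr> that has <td> children.
    // Returns an empty list when the XPath does not match any node.
    public static List<List<string>> ParseRows(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when nothing matches the XPath.
        HtmlNode table = doc.DocumentNode.SelectSingleNode(xpath);
        if (table == null)
        {
            return new List<List<string>>();
        }

        return table.Descendants("tr")
            .Select(tr => tr.Elements("td")
                .Select(td => HtmlEntity.DeEntitize(td.InnerText).Trim())
                .ToList())
            .Where(cells => cells.Count > 0) // drops header rows that only contain <th>
            .ToList();
    }
}
```

With a helper like this, an empty result tells you the table was not found in the downloaded HTML, which is easier to diagnose than an exception swallowed by a bare catch block.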