Select all 'a' Node with HtmlAgilityPack

Keywords: with HtmlAgilityPack Node Select all | Last updated: 2023-09-27 18:06:12

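I'm using HtmlAgilityPack in WinRT and trying to replace all <a href="..."> nodes with what I want.

I've noticed that HtmlAgilityPack changed the way it browses nodes in WinRT, so SelectNodes, for example, is not available.

I wrote the following code, but with no luck.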

// Want to find a-nodes in all HTML tags. SelectNodes("//a[@href]") doesn't work.
foreach (HtmlNode node in doc.DocumentNode.FirstChild.Element("body").ChildNodes.Where(n => n.Name.Equals("a")))
{
    HtmlAttribute att = node.Attributes.FirstOrDefault(l => l.Name.Equals("href"));
    if (att != null)
    {
        node.Attributes.Add("onClick", String.Format("gotoLink('{0}');", att.Value));
        att.Value = "#";
    }
}

Do I have to write a recursive navigation method over the document hierarchy?

You need to use HtmlAgilityPack's LINQ API in WinRT, for example:

// The LINQ selector below stands in for the XPath query //a[@href]
// (it also skips anchors whose href is empty).
foreach (HtmlNode node in doc.DocumentNode.Descendants("a").Where(o => "" != o.GetAttributeValue("href", "")))
{
    HtmlAttribute att = node.Attributes["href"];
    if (att != null)
    {
        node.Attributes.Add("onClick", String.Format("gotoLink('{0}');", att.Value));
        att.Value = "#";
    }
}
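
To answer the recursion question directly: Descendants("a") already visits every nesting level, so no hand-written tree walk should be needed. Below is a minimal, self-contained sketch of the same idea. It assumes the HtmlAgilityPack build you reference exposes GetAttributeValue, SetAttributeValue and OuterHtml; LinkRewriter and RewriteLinks are hypothetical names used only for illustration, while gotoLink is the script function from the question.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper for illustration; not part of HtmlAgilityPack.
public static class LinkRewriter
{
    // Parses the markup, rewrites every <a> with a non-empty href so that
    // clicking it calls gotoLink(...), and returns the modified markup.
    public static string RewriteLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Descendants("a") enumerates matching nodes at any depth.
        // ToList() snapshots the matches before any node is modified.
        var links = doc.DocumentNode
            .Descendants("a")
            .Where(a => a.GetAttributeValue("href", "") != "")
            .ToList();

        foreach (HtmlNode link in links)
        {
            string href = link.GetAttributeValue("href", "");
            // SetAttributeValue adds the attribute if missing, or overwrites it.
            link.SetAttributeValue("onClick", String.Format("gotoLink('{0}');", href));
            link.SetAttributeValue("href", "#");
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

Calling LinkRewriter.RewriteLinks on a string such as "<div><p><a href=\"page.html\">x</a></p></div>" should rewrite the nested link as well as any top-level ones. Note that an href containing a single quote would break the generated gotoLink('...') call, so such values would need escaping first.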

You can use LINQ for this, as I did here:

// Required namespaces: System, System.Collections.Generic, System.IO,
// System.Linq, System.Net, System.Text, HtmlAgilityPack
protected void ClickMeButton_Click(object sender, EventArgs e)
{
    using (WebClient client = new WebClient())
    {
        string pixarHtml = client.DownloadString("http://localhost:51450/20001.html");
        HtmlDocument document = new HtmlDocument();
        document.LoadHtml(pixarHtml);

        // GetAttributeValue returns the default ("") when the attribute is missing,
        // whereas d.Attributes["id"].Value throws a NullReferenceException.
        HtmlNode pixarTable1 = (from d in document.DocumentNode.Descendants("div")
                                where d.GetAttributeValue("id", "") == "data-review"
                                select d).First();
        IEnumerable<HtmlNode> pixarRows = from d in pixarTable1.Descendants("dl")
                                          where d.GetAttributeValue("class", "") == "clearfix"
                                          select d;

        // Build a one-row HTML table: each <dt> becomes a header cell,
        // each <dd> the matching data cell.
        String sth = "<table><thead><tr>";
        String std = "<tbody><tr>";
        foreach (HtmlNode row in pixarRows)
        {
            HtmlNode term = row.ChildNodes["dt"];
            HtmlNode definition = row.ChildNodes["dd"];
            if (term != null)
            {
                sth += "<th>" + term.InnerText.Trim() + "</th>";
                // Emit an empty cell when the row has no <dd>.
                std += "<td>" + (definition != null ? definition.InnerText.Trim() : "") + "</td>";
            }
        }
        std += "</tr></tbody>";
        sth += "</tr></thead>" + std + "</table>";
        var arr = Encoding.ASCII.GetBytes(sth);
        File.WriteAllBytes(@"D:\Demo14.xls", arr);
    }
}