Linq to XML querying an XDoc
Last updated: 2023-09-27 18:00:22
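I want to read the text inside a particular span class while still being able to get the information that sits outside that span. For example, here is a sample item from my XML document: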
<item><title>Operations Applications - MDI Diagnostics</title><link>http://confidential-link.com</link>
<description><![CDATA[<div style="margin-top:5px"><link rel="stylesheet" type="text/css" href="confidential" />
<span class="srch-Icon"><a href="confidential" title="Operations Applications - MDI Diagnostics">
<img src="confidential" alt="Web Page" border="0" /></a></span>
<span class="psrch-Description"> THE INFORMATION I WANT</span>
<p class="srch-Metadata"><span class="srch-URL">
<a href="confidential" title="Operations Applications - MDI Diagnostics">confidential link</a>
-
66KB
</span></p></div>]]></description>
<author>Bob Smith
</author><pubDate>Mon, 10 Mar 2014 18:53:49 GMT</pubDate><search:dotfileextension>.ASPX</search:dotfileextension><search:size>68076</search:size>
<search:hithighlightedsummary> SIMILAR TO THE INFORMATION I WANT, COULD BE OPTION 2 </search:hithighlightedsummary>
</item>
This is what I have right now:
var feeds = from feed in xdoc.Descendants("item")
            select new RSSS
            {
                Site = "TOPS",
                URL = url,
                Title = feed.Element("title").Value,
                Link = feed.Element("link").Value,
                Description = feed.Element("description").Value
            };
As expected, Description comes back as:
<div style="margin-top:5px"><link rel="stylesheet" type="text/css" href="confidential" />
<span class="srch-Icon"><a href="confidential" title="Operations Applications - MDI Diagnostics">
<img src="confidential" alt="Web Page" border="0" /></a></span>
<span class="psrch-Description"> THE INFORMATION I WANT</span>
<p class="srch-Metadata"><span class="srch-URL">
<a href="confidential" title="Operations Applications - MDI Diagnostics">confidential link</a>
-
66KB
</span></p></div>
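So how do I get at the text inside <span class="psrch-Description"> while still being able to access things like the link and the title?
Note that I am not looking for something like this: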
var feeds = from feed in xDoc.Descendants("Show")
            where (string)feed.Attribute("Code") == "456"
            select new
            {
                EventDate = feed.Attribute("Date").Value
            };
That does not let me get the other information.
If <description> contains HTML that is also well-formed XML, you can load its value into a second XDocument variable:
XDocument description = XDocument.Parse(feed.Element("description").Value);
You can then run another LINQ to XML query against it to pull out whichever part of the <description> value you are interested in.
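Putting the two steps together, the nested parse can live inside the original query through a `let` clause, so the title and link remain available next to the span text. The following is only a minimal sketch, assuming C# 9 top-level statements; the `rss` string is a trimmed-down stand-in for the real feed, and the `Summary` property name and the anonymous type (used here instead of `RSSS`) are illustrative choices, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Assumption: a trimmed-down stand-in for the feed shown in the question.
const string rss = @"<rss><channel><item>
<title>Operations Applications - MDI Diagnostics</title>
<link>http://confidential-link.com</link>
<description><![CDATA[<div style=""margin-top:5px"">
<span class=""srch-Icon""><a href=""confidential"">icon</a></span>
<span class=""psrch-Description""> THE INFORMATION I WANT</span>
</div>]]></description>
</item></channel></rss>";

XDocument xdoc = XDocument.Parse(rss);

var feeds = (from feed in xdoc.Descendants("item")
             // Parse the CDATA payload as its own XML document.
             // This throws XmlException if the HTML is not well-formed XML.
             let description = XDocument.Parse(feed.Element("description").Value)
             select new
             {
                 Title = (string)feed.Element("title"),
                 Link = (string)feed.Element("link"),
                 // Text of the first span whose class is psrch-Description, or null.
                 Summary = description.Descendants("span")
                     .Where(s => (string)s.Attribute("class") == "psrch-Description")
                     .Select(s => s.Value.Trim())
                     .FirstOrDefault()
             }).ToList();

foreach (var f in feeds)
    Console.WriteLine($"{f.Title} | {f.Link} | {f.Summary}");
```

Casting elements and attributes to `string` rather than reading `.Value` means a missing `class` attribute gives null instead of a NullReferenceException, which matters because not every span in the fragment is guaranteed to carry one.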
Otherwise, you will need a different type that can cope with real-world HTML, such as HtmlDocument from HtmlAgilityPack.
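For the fallback case, a rough sketch with HtmlAgilityPack might look like the following. It assumes the HtmlAgilityPack NuGet package is referenced and C# 9 top-level statements are available; `ExtractSummary` is a hypothetical helper name introduced only for this illustration, and the sample fragment is made up to include an unclosed tag.

```csharp
using System;
using HtmlAgilityPack; // third-party NuGet package, not part of .NET itself

// Hypothetical helper: returns the trimmed text of the first
// <span class="psrch-Description">, or an empty string if there is none.
static string ExtractSummary(string html)
{
    var doc = new HtmlDocument();
    doc.LoadHtml(html); // tolerates markup that XDocument.Parse would reject
    var span = doc.DocumentNode.SelectSingleNode("//span[@class='psrch-Description']");
    return span == null ? "" : span.InnerText.Trim();
}

// The <br> here is left unclosed, so this fragment is not well-formed XML.
string sample = "<div><br><span class=\"psrch-Description\"> THE INFORMATION I WANT</span></div>";
Console.WriteLine(ExtractSummary(sample));
```

Inside the original query, such a helper could be called on `feed.Element("description").Value`, which keeps the title and link lookups unchanged.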