HtmlAgilityPack cannot get all the HTML code/text from a web page

Updated: 2023-09-27 18:12:09

First of all, thank you all!

I was able to extract a block of markup from a web page that looks like the following.

<div id="playerStats">
  <div id="hp"><span class="title">HP:</span></div>
  <div id="mp"><span class="title">MP:</span></div>
  <div id="magicResist"><span class="title">Magic Resist</span></div>
  <div id="physicalDefend"><span class="title">Physical Defence</span></div>
  <div id="phyCriticalReduceRate"><span class="title">Strike Resist</span></div>
  <div id="phyCriticalDamageReduce"><span class="title">Strike fortitude</span></div>
  <div id="physicalRight"><span class="title">Main Hand Attack</span></div>
  <div id="accuracyRight"><span class="title">Main Hand Accuracy</span></div>
  <div id="criticalRight"><span class="title">Main Hand Critical</span></div>
  <div id="physicalLeft"><span class="title">Off Hand Attack</span></div>
  <div id="accuracyLeft"><span class="title">Off Hand Accuracy</span></div>
  <div id="criticalLeft"><span class="title">Off Hand Critical</span></div>
  <div id="attackSpeed"><span class="title">Attack Speed</span></div>
  <div id="magicalBoost"><span class="title">Magic Boost</span></div>
  <div id="magicalAccuracy"><span class="title">Magic Accuracy</span></div>
  <div id="magicalCriticalRight"><span class="title">Crit Spell</span></div>
  <div id="castingTimeRatio"><span class="title">Casting Speed</span></div>
  <div id="block"><span class="title">Block</span></div>
  <div id="dodge"><span class="title">Evasion</span></div>
</div>

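It comes from the URI of this video game's character stats page. (You should be able to see the stats table clearly in the middle of the page.) If you inspect the HTML with a browser feature such as Google Chrome's F12 developer tools, you will notice values between the </span> and </div> tags, like this: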

<div id="playerStats">
  <div id="hp"><span class="title">HP:</span>"12213"</div>
  <div id="mp"><span class="title">MP:</span>"4000"</div>
  <div id="magicResist"><span class="title">Magic Resist</span>"4618"</div>
  <div id="physicalDefend"><span class="title">Physical Defence</span>"1725"</div>
  <div id="phyCriticalReduceRate"><span class="title">Strike Resist</span>"1518"</div>
  <div id="phyCriticalDamageReduce"><span class="title">Strike fortitude</span>"392"</div>
  <div id="physicalRight"><span class="title">Main Hand Attack</span>"201"</div>
  <div id="accuracyRight"><span class="title">Main Hand Accuracy</span>"201"</div>
  <div id="criticalRight"><span class="title">Main Hand Critical</span>"201"</div>
  <div id="physicalLeft"><span class="title">Off Hand Attack</span>"201"</div>
  <div id="accuracyLeft"><span class="title">Off Hand Accuracy</span>"201"</div>
  <div id="criticalLeft"><span class="title">Off Hand Critical</span>"201"</div>
  <div id="attackSpeed"><span class="title">Attack Speed</span>"201"</div>
  <div id="magicalBoost"><span class="title">Magic Boost</span>"201"</div>
  <div id="magicalAccuracy"><span class="title">Magic Accuracy</span>"201"</div>
  <div id="magicalCriticalRight"><span class="title">Crit Spell</span>"201"</div>
  <div id="castingTimeRatio"><span class="title">Casting Speed</span>"201"</div>
  <div id="block"><span class="title">Block</span>"201"</div>
  <div id="dodge"><span class="title">Evasion</span>"201"</div>
</div>

Next, I use the following code to retrieve the first HTML block shown above.

HtmlDocument doc = new HtmlDocument();
doc.Load(MyTestFile);
foreach(var node in doc.DocumentNode.SelectNodes("//div[@id='playerStats']/div/span"))
{
    Console.WriteLine(node.InnerText + " " + (node.NextSibling != null ?  node.NextSibling.InnerText : null));
}

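I have tried WebRequest, WebClient, WebBrowser, and HtmlAgilityPack's HtmlWeb class to pull the HTML document down from the web. However, the most important part I want to extract, the values associated with HP, MP, and so on, is not in the downloaded document. The expected values are shown in the second HTML block above.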

How can I get my code to pull these simple text values into the document so that I can parse them?

HtmlAgilityPack cannot get all the HTML code/text from a web page

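The player information is loaded dynamically by calling http://psykopats.net/loadAion.php with the POST method and several parameters, one of which is player and identifies the player. HtmlAgilityPack only parses the HTML it is given and does not execute JavaScript, so those values never appear in the document you download. In your case, the parameters are: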

server:66
type:1
player:299345
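To make the request body concrete, here is a minimal sketch of how those three parameters look once they are form-encoded. It assumes the page sends an ordinary application/x-www-form-urlencoded body, which is not confirmed above; the class name FormBodyDemo is made up for illustration. Nothing is sent over the network.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;

class FormBodyDemo
{
    static void Main()
    {
        // The three parameters listed above, in the same order.
        var fields = new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("server", "66"),
            new KeyValuePair<string, string>("type", "1"),
            new KeyValuePair<string, string>("player", "299345"),
        };

        // FormUrlEncodedContent only builds the body; no request is made here.
        // Assumption: the endpoint expects a standard form-encoded POST body.
        using (var content = new FormUrlEncodedContent(fields))
        {
            string body = content.ReadAsStringAsync().GetAwaiter().GetResult();
            Console.WriteLine(body); // server=66&type=1&player=299345
        }
    }
}
```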

You can take a look at this question to see how to use POST with WebClient.

The response is a JSON string that contains what you are looking for:

stat: {baseCriticalResist:0, magicCriticalResist:0, physicalDefend:1402, baseMagicalSpeed:1,…}
accuracyLeft: 2617
accuracyRight: 2617
agi: 110
airResist: 0
attackSpeed: 1.1
baseAccuracyLeft: 1705
baseAccuracyRight: 1705
baseAgi: 110
baseAirResist: 0
baseAttackSpeed: 1.1
baseBlock: 837
baseCastingTimeRatio: "1.0"
baseCriticalDefend: 0
baseCriticalLeft: 53
baseCriticalResist: 0
baseCriticalRight: 103
baseDex: 110
baseDodge: 1839
baseDp: 4000
baseEarthResist: 0
baseFireResist: 0
baseHealBoost: 0
baseHealSkillBoost: 0
baseHp: 6688
baseKno: 90
baseMagCriticalDamageReduce: 0
baseMagCriticalReduceRate: 0
baseMagicCriticalDefend: 0
baseMagicCriticalResist: 0
baseMagicResist: 1384
baseMagicalAccuracy: ""
baseMagicalAttack: 0
baseMagicalBoost: 0
baseMagicalCriticalLeft: 50
baseMagicalCriticalRight: 50
baseMagicalSpeed: 1
baseMoveSpeed: 6
baseMp: 4318
baseParry: 1847
basePhyCriticalDamageReduce: 0
basePhyCriticalReduceRate: 190
basePhysicalDefend: 1162
basePhysicalLeft: 255
basePhysicalRight: 234
baseStr: 110
baseVit: 100
baseWaterResist: 0
baseWill: ""
block: 837
castingTimeRatio: 0.98
criticalDefend: 0
criticalLeft: 602
criticalResist: 0
criticalRight: ""
dex: 110
dodge: 2272
dp: 4000
earthResist: ""
fireResist: 0
healBoost: 0
healSkillBoost: 0
hp: 11210
kno: 90
magCriticalDamageReduce: 0
magCriticalReduceRate: 38
magicCriticalDefend: 0
magicCriticalResist: 0
magicResist: 1725
magicalAccuracy: 1201
magicalAttack: 0
magicalBoost: 0
magicalCriticalLeft: 50
magicalCriticalRight: 50
magicalSpeed: "1.0"
moveSpeed: 7.56
mp: 4618
parry: ""
phyCriticalDamageReduce: 201
phyCriticalReduceRate: 392
physicalDefend: 1402
physicalLeft: 658
physicalRight: 658
str: 110
vit: 100
waterResist: 0
will: 0

Sample code:

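// UploadValues sends the name/value pairs as a form-encoded POST request by default.
System.Net.WebClient wc = new System.Net.WebClient();
byte[] data = wc.UploadValues(
    "http://psykopats.net/loadAion.php",
    new System.Collections.Specialized.NameValueCollection(){
        {"server", "66"},
        {"type", "1"},
        {"player", "299345"}});
// Decode the raw bytes into the JSON text (UTF-8 assumed; ASCII would mangle any non-ASCII characters).
string json = System.Text.Encoding.UTF8.GetString(data);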
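Once you have the JSON string, the values can be read from its stat object, whose keys (hp, mp, magicResist, and so on) match the div ids on the page. The following is only a sketch: it assumes System.Text.Json is available (on older .NET Framework projects a library such as Json.NET would fill the same role), and the json string is a hypothetical, trimmed-down stand-in for the real response, built from a few of the values listed above.

```csharp
using System;
using System.Text.Json;

class StatDemo
{
    static void Main()
    {
        // Hypothetical, shortened response; the real payload has many more fields.
        string json = "{\"stat\":{\"hp\":11210,\"mp\":4618,\"magicalSpeed\":\"1.0\",\"parry\":\"\"}}";

        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            // Assumption: the stats live under a top-level "stat" property.
            JsonElement stat = doc.RootElement.GetProperty("stat");

            foreach (JsonProperty p in stat.EnumerateObject())
            {
                // Values arrive as numbers or as strings (sometimes empty),
                // so take the text form in either case.
                string text = p.Value.ValueKind == JsonValueKind.String
                    ? p.Value.GetString()
                    : p.Value.GetRawText();
                Console.WriteLine(p.Name + ": " + text);
            }
        }
    }
}
```

Because some fields come back as strings such as "1.0" or "", it is probably safer to treat every value as text first and convert to a number only where you need one.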