Selenium.NoSuchElementException with dynamic tables

Updated: 2023-09-27 18:32:54

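Please help me solve this problem!

I am currently scraping a website in C# with the Selenium Firefox driver. The tables that cover future dates, however, are populated dynamically.

Although the table structure is identical for future and past dates, the tables that update during my Selenium calls throw a "NoSuchElementException" for IWebElements that are clearly present.

These are the relevant XPaths copied from the tables. One comes from a past date and works fine; the other comes from a future date and throws the exception. As you can see, they are identical.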

XPath 18052015

/html/body/div[1]/div/div[2]/div[5]/div[1]/div/div[1]/div[2]/div[1]/div[7]/div[1]/table/tbody/tr[1]/td[1]/div/a[2]

XPath 05022016

/html/body/div[1]/div/div[2]/div[5]/div[1]/div/div[1]/div[2]/div[1]/div[7]/div[1]/table/tbody/tr[1]/td[1]/div/a[2]

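Using the FindElements(By.XPath(...)) function, I run two foreach loops over the tr and td elements of the XPath to get some text from the a[2] element. In both cases the DOM in Firefox Firebug appears to be identical. The only difference I have observed between the two tables is that the table for the future date updates its values every few seconds (the table is also reset when viewed through Firebug). Below is the relevant piece of code, with the important comments.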

            foreach (var tr in table.FindElements(By.XPath("div/table/tbody/tr")))
            {
                foreach (var td in tr.FindElements(By.XPath("td")))
                {
                    if(td.GetAttribute("innerHTML").Contains("some stuff"))
                    {
                        // This part is always reached, so condition is satisfied. > x is the relevant value, it is assigned the proper value when the error is thrown, but it still throws an exception.
                        x = td.FindElement(By.XPath("div/a[2]")).GetAttribute("href").Split('/')[4];
                        bmID = getBookmakerID(bmName);
                    }
                    if(td.GetAttribute("class").Contains("some other stuff"))
                    {
                    }
                }
            }

Has any of you run into similar problems before, and were you able to solve them?

Selenium.NoSuchElementException with dynamic tables

Can you add a wait at every step where you call FindElement? See the example below:

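// Requires: using System; using System.Collections.ObjectModel;
//           using OpenQA.Selenium; using OpenQA.Selenium.Support.UI;
IWait<IWebElement> wait = new DefaultWait<IWebElement>(table);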
wait.Timeout = TimeSpan.FromSeconds(5);
wait.PollingInterval = TimeSpan.FromMilliseconds(300);
By locator = By.XPath("div/table/tbody/tr");
ReadOnlyCollection<IWebElement> rows;
wait.Until(e => e.FindElements(locator).Count > 0);
rows = table.FindElements(locator);

foreach (var tr in rows)
{
    wait = new DefaultWait<IWebElement>(tr);
    wait.Timeout = TimeSpan.FromSeconds(5);
    wait.PollingInterval = TimeSpan.FromMilliseconds(300);
    locator = By.XPath("td");
    ReadOnlyCollection<IWebElement> cells;
    wait.Until(e => e.FindElements(locator).Count > 0);
    cells = tr.FindElements(locator);
    foreach (var td in cells)
    {
        if (td.GetAttribute("innerHTML").Contains("some stuff"))
        {
            // This part is always reached, so condition is satisfied. > x is the relevant value, it is assigned the proper value when the error is thrown, but it still throws an exception.
            wait = new DefaultWait<IWebElement>(td);
            wait.Timeout = TimeSpan.FromSeconds(5);
            wait.PollingInterval = TimeSpan.FromMilliseconds(300);
            locator = By.XPath("div/a[2]");
            IWebElement link2;
            wait.Until(e => e.FindElements(locator).Count > 0);
            try
            {
                link2 = td.FindElement(locator);
            }
            catch (NoSuchElementException ex)
            {
                throw new NoSuchElementException("Unable to find element, locator: \"" + locator.ToString() + "\".", ex);
            }
            x = link2.GetAttribute("href").Split('/')[4];
            bmID = getBookmakerID(bmName);
        }
        if (td.GetAttribute("class").Contains("some other stuff"))
        {
        }
    }
}

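If it still fails, you can easily debug the test in Visual Studio.

Thank you very much for your help. @Buaban - I have added the waits, but I'm afraid not much has changed. It does let the algorithm get further, but it eventually crashes.

In the end, we solved it by combining Selenium WebDriver with HTMLAgilityPack. Since the code is too specific to actually post (and I don't have it at hand right now), I will share the main philosophy with you, which is short: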

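Use Selenium WebDriver to open and navigate the browser, e.g. to perform actions such as:

Go to the right URL
Open dropdown menus/tables
Log in to / click through the website
Define the table/field you want to rip data from as a web element (WE)
See the tutorial here: http://toolsqa.com/selenium-webdriver-tutorials-in-c-selenium-tutorial-in-c/

Use HTMLAgilityPack to navigate and rip the defined web element (WE):

Load the InnerHTML attribute of the WE so it can be processed in HTMLAgilityPack
From there you can navigate between its different divs/trs/tds
This is a lot, and I mean a lot, faster than ripping with Selenium WebDriver, because it parses the HTML as a string
See a good tutorial here: http://www.mikesdotnetting.com/article/273/using-the-htmlagilitypack-to-parse-html-in-asp-net, and look especially at the "Where<>" functionality!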
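
Since the original code was not posted, the following is only a minimal sketch of the second step, not the actual solution. It assumes the markup from the question (cells containing "some stuff" with a second link under div/a[2]); the names TableSnapshot, ExtractIds, marker and baseUri are hypothetical. The idea, presumably, is that one InnerHTML call gives a fixed snapshot of the table, so a refresh in the browser can no longer invalidate elements halfway through the loops.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

// Hypothetical helper, not taken from the original answer.
public static class TableSnapshot
{
    // Parses a snapshot of the table container's InnerHTML and returns, for every
    // matching cell, the fifth '/'-separated part of the second link's URL
    // (the same value the question obtains with Split('/')[4]).
    public static List<string> ExtractIds(string innerHtml, string marker, Uri baseUri)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(innerHtml);

        var ids = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection cells = doc.DocumentNode.SelectNodes("//tr/td");
        if (cells == null) return ids;

        foreach (HtmlNode td in cells)
        {
            if (!td.InnerHtml.Contains(marker)) continue;

            HtmlNode link = td.SelectSingleNode("div/a[2]");
            if (link == null) continue;

            // The raw href in the markup may be relative, so resolve it first.
            string href = link.GetAttributeValue("href", "");
            Uri absolute;
            if (!Uri.TryCreate(baseUri, href, out absolute)) continue;

            string[] parts = absolute.AbsoluteUri.Split('/');
            if (parts.Length > 4) ids.Add(parts[4]);
        }
        return ids;
    }
}
```

With Selenium, the snapshot would come from a single call such as table.GetAttribute("innerHTML"), and new Uri(driver.Url) could serve as the base URI. Selenium's GetAttribute("href") returns the resolved URL, whereas the href in the raw markup may be relative, which is why the sketch resolves it before splitting.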

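Finally, this approach to handling self-refreshing pages has proven very stable (it has not failed once so far), very fast (because the HTML is parsed as a string), and flexible (because it uses dedicated packages for navigating the browser and for ripping the data).

Happy coding!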