IMPORTXML and XPath query in Google Sheets formula works, but not when using same XPath with UrlFetchApp.fetch(url) in Apps Script?

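I'm trying to move several IMPORTXML formulas into an Apps Script function attached to the sheet, so that I can schedule it to run once a day and the many IMPORTXML formulas no longer make the sheet hang or take a long time to load. The XPath query works fine against the target URL when used in a spreadsheet formula, but when I try to use the same XPath query from a script I get an error, and I'm not sure how to fix it.

This works: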

=IMPORTXML(CONCATENATE("https://www.marketwatch.com/investing/stock/",A2,"/company-profile"), "/html/body/div[3]/div[6]/div[2]/div[2]/div[1]/table/tbody/tr[6]/td[2]")

...but this doesn't:

function fetchAndWriteData() {
  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("mysheet");
  var tickers = sheet.getRange("A2:A").getValues(); 
  var valuesColHeader = "p/cf"; 
  var xpathQuery = "/html/body/div[3]/div[6]/div[2]/div[2]/div[1]/table/tbody/tr[6]/td[2]";

  for (var i = 0; i < tickers.length; i++) {
    var ticker = tickers[i][0]; // Get the value from column A (1-indexed)
    if (ticker) {
      // Construct the URL
      var url = "https://www.marketwatch.com/investing/stock/"+ticker+"/company-profile";
      console.log(url)

      // Fetch the HTML content from the URL
      var response = UrlFetchApp.fetch(url);
      var content = response.getContentText();
      console.log(content)

      // Extract data using XPath
      var doc = XmlService.parse(content);
      var rootNode = doc.getRootElement();
      var elements = XmlService.getNamespace().getChild(rootNode, xpathQuery);

      if (elements.length > 0) {
        var extractedValue = elements[0].getValue();
        // Write the extracted value to the specified column
        var headerRow = sheet.getRange(1, 1, 1, sheet.getLastColumn()).getValues();
        var valuesColIndex = headerRow[0].indexOf(valuesColHeader) + 1; // 1-indexed
        sheet.getRange(i + 1, valuesColIndex).setValue(extractedValue);
      }
    }
  }
}

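It throws this error:

Exception: Error on line 268: Attribute name "crossorigin" associated with an element type "link" must be followed by the ' = ' character.

...when executing the line: var doc = XmlService.parse(content);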

A sample sheet: [image]

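What can be done here? How do I fix this?

Also, is there any workaround (if one is needed) that would let me use a value selector query/expression that can simply be copied from the browser inspector? I have many other URLs with XPaths that I'm trying to script: [image]

Is there such a script function, or something that already exists?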

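Answer

I believe that, at the current stage, XPath unfortunately cannot be used directly with the XmlService object. I'm also afraid that not all HTML data can be parsed by XmlService.parse, and I think this is the reason for your error message: XmlService.parse expects well-formed XML, and a valueless attribute such as crossorigin on the page's link tag is valid HTML but not valid XML. So, in your situation, how about using a combination of a regex and XmlService? When this is reflected in your script, it becomes as follows.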

Modified script 1:

function fetchAndWriteData2() {
  var valuesColHeader = "p/cf";

  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("mysheet");
  var [headers, ...tickers] = sheet.getDataRange().getDisplayValues();
  var values = tickers.map(([ticker]) => {
    if (ticker) {
      var url = "https://www.marketwatch.com/investing/stock/" + ticker + "/company-profile";
      var response = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
      if (response.getResponseCode() == 200) {
        var content = response.getContentText();
        try {
          var table = content.match(/<table.*?VALUATION data table.*?>[\w\s\S]*?<\/table>/);
          if (table) {
            var xmlObj = XmlService.parse(table[0]);
            var root = xmlObj.getRootElement();
            var extractedValue = root.getChild("tbody", root.getNamespace()).getChildren()[5].getChildren()[1].getValue();
            return [extractedValue];
          }
        } catch (e) {
          console.log(e.message);
        }
      }
    }
    return [null];
  });
  var column = headers.indexOf(valuesColHeader) + 1;
  sheet.getRange(2, column, values.length).setValues(values);
}
  • By the way, when getValues() and setValues() are called inside a loop, the process cost becomes high. Ref (author: me). So I moved setValues outside of the loop.
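
The regex step in Modified script 1 can be looked at in isolation. The following is only a minimal sketch in plain JavaScript, not part of the original answer: `sampleHtml` and the helper name `extractTableHtml` are hypothetical, and the pattern uses `[^>]*` instead of `.*?` so that the label has to appear inside the opening table tag. It shows that the extracted fragment no longer contains the head section whose valueless crossorigin attribute made XmlService.parse fail.

```javascript
// Hypothetical sample page: the <link> tag carries a valueless "crossorigin"
// attribute, which is valid HTML but not well-formed XML.
const sampleHtml = [
  '<html><head><link rel="preconnect" href="https://example.com" crossorigin></head>',
  '<body><table class="table" aria-label="VALUATION data table">',
  '<tbody><tr><td>P/E Current</td><td>28.50</td></tr></tbody>',
  '</table></body></html>'
].join("\n");

// Hypothetical helper: return the first <table>...</table> whose opening tag
// contains `label`, or null when no such table exists.
function extractTableHtml(html, label) {
  // Escape regex metacharacters so the label is matched literally.
  const escaped = label.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // [\s\S]*? matches lazily across line breaks up to the first closing tag.
  const pattern = new RegExp("<table[^>]*" + escaped + "[^>]*>[\\s\\S]*?</table>");
  const match = html.match(pattern);
  return match ? match[0] : null;
}

// Only this fragment would be handed to XmlService.parse.
const tableHtml = extractTableHtml(sampleHtml, "VALUATION data table");
```

The fragment itself still has to be well-formed XML for XmlService.parse to accept it, and the lazy match stops at the first closing table tag, so this approach is unlikely to suit pages with nested tables.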

Modified script 2:

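As another approach, the HTML can be parsed with the cheeriogs library (Cheerio for Google Apps Script) and queried with a CSS selector copied from the browser.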

1. Install library

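Please install the Cheerio library for Google Apps Script. Ref: GitHub repository of cheeriogs. Reference: the official documentation, "Add a library to your script project".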

2. Retrieve selector value

Please retrieve the "selector" value in the browser, as in the image you showed. In this case, the following value is obtained.

#maincontent > div.region.region--primary > div.column.column--primary > div.group.left > div:nth-child(1) > table > tbody > tr:nth-child(6) > td.table__cell.w25

3. Modified script:

function fetchAndWriteData3() {
  var valuesColHeader = "p/cf";
  var selector = "#maincontent > div.region.region--primary > div.column.column--primary > div.group.left > div:nth-child(1) > table > tbody > tr:nth-child(6) > td.table__cell.w25";

  var sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("mysheet");
  var [headers, ...tickers] = sheet.getDataRange().getDisplayValues();
  var values = tickers.map(([ticker]) => {
    if (ticker) {
      var url = "https://www.marketwatch.com/investing/stock/" + ticker + "/company-profile";
      var response = UrlFetchApp.fetch(url, { muteHttpExceptions: true });
      if (response.getResponseCode() == 200) {
        var content = response.getContentText();
        try {
          const $ = Cheerio.load(content);
          return [$($(selector)["0"]).text()];
        } catch (e) {
          console.log(e.message);
        }
      }
    }
    return [null];
  });
  var column = headers.indexOf(valuesColHeader) + 1;
  sheet.getRange(2, column, values.length).setValues(values);
}
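
Since the question mentions many other URLs with XPaths copied from the browser inspector, one possible way to reuse them with the script above is to convert each simple absolute XPath into a CSS selector. This is a hedged sketch, not part of the original answer: `xpathToCssSelector` is a hypothetical helper, it handles only element names with optional numeric positions, and it assumes the selector engine in cheeriogs supports `:nth-of-type`, which I have not verified.

```javascript
// Hypothetical helper: convert a simple absolute XPath (element names with
// optional numeric positions only) into an equivalent CSS selector.
function xpathToCssSelector(xpath) {
  // Reject relative paths and the descendant axis (//), which this sketch
  // does not translate.
  if (xpath.charAt(0) !== "/" || xpath.indexOf("//") !== -1) {
    throw new Error("Only simple absolute XPaths are supported: " + xpath);
  }
  const steps = xpath.split("/").filter(function (step) {
    return step !== "";
  });
  return steps.map(function (step) {
    const m = step.match(/^([A-Za-z][\w-]*)(?:\[(\d+)\])?$/);
    if (!m) {
      throw new Error("Unsupported XPath step: " + step);
    }
    // XPath div[3] is the third <div> among its siblings,
    // which CSS expresses as div:nth-of-type(3).
    return m[2] ? m[1] + ":nth-of-type(" + m[2] + ")" : m[1];
  }).join(" > ");
}

const cssSelector = xpathToCssSelector(
  "/html/body/div[3]/div[6]/div[2]/div[2]/div[1]/table/tbody/tr[6]/td[2]"
);
```

The resulting string could be assigned to `selector` in fetchAndWriteData3. Note that an XPath copied from the inspector describes the rendered DOM, which may differ from the raw HTML returned by UrlFetchApp.fetch (for example when the page is modified by JavaScript), so the converted selector should be checked against the fetched content.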

Note:

  • Unfortunately, I do not know the values of your tickers, and I'm not sure whether I have correctly understood your expected values. So if this modified script doesn't work, can you provide sample values? I would like to use them to confirm it.
