How can I screen-scrape the HTML result of a non-trivial user scenario?

I want to be able to get the HTML for a page which, if I were doing it interactively in a browser, would involve multiple actions and page loads:

1. Go to the homepage.
2. Enter text into a login form and submit the form (POST).
3. The POST will go through various redirections and frameset usage.

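Cookies are set and updated throughout this process.

In the browser, after submitting the form, I just end up on the page.

But doing this with curl (in PHP or whatever), wget, or A.N. Other low-level technology turns managing the cookies, redirections and framesets into quite a chore, and it binds my script very tightly to the website (making it very susceptible to even small changes in the site I am scraping).

Can anyone suggest a way to do this?

I have already looked at Crowbar, PhantomJS and Lynx (with the cmd_log/cmd_script options), but chaining everything together to mimic exactly what I would do in Firefox or Chrome is difficult.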

(As an aside, it may be useful, or even necessary, for the target website to think this script is Firefox or a "real" browser.)

Best answer

One way to do this is to use Selenium RC. Although it is normally used for testing, at its core it is just a browser remote-control service.

http://seleniumhq.org/projects/remote-control/
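A minimal sketch of the browser remote-control idea, under some assumptions: it uses the Selenium WebDriver Python bindings (the successor to Selenium RC, not the RC API the answer links to), and the form field names `username` and `password`, as well as the helper names, are hypothetical placeholders for whatever the target site actually uses. Because a real browser does the work, cookies and redirects are handled by the browser rather than by the script.

```python
def scrape_after_login(driver, home_url, username, password):
    """Drive a browser through the login flow and return the resulting HTML.

    `driver` is expected to be a Selenium WebDriver instance.
    """
    # Step 1: go to the homepage.
    driver.get(home_url)
    # Step 2: fill in and submit the login form.
    # "name" is the locator strategy (the value of selenium's By.NAME);
    # "username" and "password" are hypothetical field names.
    driver.find_element("name", "username").send_keys(username)
    password_box = driver.find_element("name", "password")
    password_box.send_keys(password)
    password_box.submit()
    # Step 3: the browser follows the redirects and keeps the cookies itself.
    return driver.page_source


def run_with_firefox(home_url, username, password):
    """Run the scenario in a real Firefox window and always close it afterwards."""
    # Imported here so the helper above can be exercised without Selenium installed.
    from selenium import webdriver  # third-party: pip install selenium

    driver = webdriver.Firefox()
    try:
        return scrape_after_login(driver, home_url, username, password)
    finally:
        driver.quit()
```

On a real site this would probably need two refinements: an explicit wait (for example Selenium's `WebDriverWait`) so that `page_source` is read only after the redirects have finished, and a `driver.switch_to.frame(...)` call if the content of interest sits inside one frame of a frameset.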

Other answers

Use a tool such as Firebug to inspect the headers sent to the website when you log in, then replicate them in your code.
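As a rough illustration of replicating the login request, here is a sketch using only the Python standard library. The User-Agent string, the form field names passed in `form_fields`, and the helper names are assumptions; the real values would come from what the browser's network inspector shows for the site in question.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Hypothetical header value; copy the real one from the browser's network panel.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"
)


def make_session(user_agent=BROWSER_USER_AGENT):
    """Build an opener that stores cookies and follows redirects automatically."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Replaces urllib's default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener, jar


def login_and_fetch(opener, home_url, login_url, form_fields):
    """Visit the homepage, post the login form, and return the final HTML."""
    # Load the homepage first so any initial cookies land in the jar.
    with opener.open(home_url) as response:
        response.read()
    # Passing data makes this a POST. urllib follows the resulting redirects
    # itself and sends the stored cookies on each hop.
    body = urllib.parse.urlencode(form_fields).encode("ascii")
    with opener.open(login_url, data=body) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

A call might look like `login_and_fetch(opener, "http://example.com/", "http://example.com/login", {"user": "me", "password": "secret"})`, with placeholder URLs and field names. Framesets are not handled here: each frame's `src` would have to be fetched separately with the same opener.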

Alternatively, just log in with your browser and then reuse its cookies in your code.
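One way the cookie-reuse idea could look in Python, assuming the browser's cookies have been exported to a Netscape-format `cookies.txt` file (the file and the helper names are hypothetical; how the file gets exported depends on the browser):

```python
import http.cookiejar
import urllib.request


def opener_from_browser_cookies(cookie_file):
    """Return an opener that sends cookies exported from a logged-in browser."""
    jar = http.cookiejar.MozillaCookieJar(cookie_file)
    # ignore_discard=True also loads cookies marked as session-only.
    jar.load(ignore_discard=True)
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch(opener, url):
    """Fetch a page with the browser's cookies attached and return its HTML."""
    with opener.open(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

This skips the login step entirely, at the price of repeating the export whenever the session expires.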




