Click a Button in Scrapy

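I'm using Scrapy to crawl a webpage. Some of the information I need only appears when you click a certain button (and, of course, it also shows up in the HTML after the click).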

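I found that Scrapy can handle forms, as shown here. But the problem is that there is no form to fill out, so that is not quite what I need.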

How can I simply click a button, which then shows the information I need?

Do I have to use an external library like mechanize or lxml?

Best answer

Scrapy cannot interpret JavaScript.

If you absolutely must interact with the JavaScript on the page, you want to be using Selenium.

If you stay with Scrapy, the solution depends on what the button is doing.

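If it is just showing content that was previously hidden, you can scrape the data without a problem. It does not matter that it would not be visible in the browser; the HTML is still there.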
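
As a rough illustration of the hidden-content case, here is a minimal sketch using only Python's standard library. The sample markup, the `details` id, and the `extract_hidden_text` helper are all hypothetical; the point is only that a parser reads the markup and ignores CSS visibility. In a Scrapy spider the same text would normally be pulled out with `response.css(...)` or `response.xpath(...)` instead.

```python
from html.parser import HTMLParser

# Hypothetical page: the details are already in the markup, just hidden with CSS.
SAMPLE_HTML = """
<button id="show-details">Show details</button>
<div id="details" style="display:none">
  <span class="price">42.00</span>
</div>
"""


class HiddenTextParser(HTMLParser):
    """Collects text inside the element with a given id, visible or not.

    Sketch only: assumes properly nested markup without unclosed void
    tags (such as a bare <br>) inside the target element.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # greater than 0 while inside the target element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.chunks.append(text)


def extract_hidden_text(html, target_id):
    """Return the text fragments found inside the element with target_id."""
    parser = HiddenTextParser(target_id)
    parser.feed(html)
    return parser.chunks


print(extract_hidden_text(SAMPLE_HTML, "details"))
```

The `display:none` style never enters the logic: the parser only sees tags and text, which is also how Scrapy sees the downloaded page.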

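If it is fetching the content dynamically via AJAX when the button is pressed, the best thing to do is to look at the HTTP request that goes out when you press the button, using a tool like Firebug. You can then request the data directly from that URL.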
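
A minimal sketch of the AJAX case, again with the standard library only. The endpoint URL, the headers, and the JSON response shape are assumptions standing in for whatever the browser's network panel actually shows for your page; none of them come from a real site.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical endpoint, copied from the browser's network panel.
AJAX_URL = "https://www.example.org/ajax/details?id=123"


def build_ajax_request(url):
    """Build a request resembling the one the button's JavaScript sends."""
    return Request(
        url,
        headers={
            # Some servers check this header to recognise AJAX calls (assumption).
            "X-Requested-With": "XMLHttpRequest",
            "Accept": "application/json",
        },
    )


def parse_details(body):
    """Decode the response body, assumed here to be JSON, into Python objects."""
    return json.loads(body)


def fetch_details(url):
    """Request the endpoint directly, without loading the page or clicking."""
    with urlopen(build_ajax_request(url)) as resp:
        return parse_details(resp.read().decode("utf-8"))
```

Inside a Scrapy spider, the equivalent would be to yield a `scrapy.Request` for the same URL and decode `response.text` in the callback.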

Do I have to use an external library like mechanize or lxml?

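If you want to interpret JavaScript, yes, you need to use a different library, although neither of those two fits the bill. Neither of them knows anything about JavaScript. Selenium is the way to go.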

If you can give the URL of the page you're trying to scrape, I can take a look.

Other answers

The Selenium browser provides a very nice solution. Here is an example (pip install -U selenium):

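from scrapy import Request, Spider
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

class northshoreSpider(Spider):
    name = "xxx"
    allowed_domains = ["www.example.org"]
    start_urls = ["https://www.example.org"]

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # A real browser that executes the page's JavaScript.
        self.driver = webdriver.Firefox()

    def parse(self, response):
        self.driver.get("https://www.example.org/abc")

        # Keep clicking "next" until the button can no longer be found or clicked.
        while True:
            try:
                next_button = self.driver.find_element(By.XPATH, '//*[@id="BTN_NEXT"]')
                url = "http://www.example.org/abcd"
                yield Request(url, callback=self.parse2)
                next_button.click()
            except WebDriverException:
                break

        self.driver.close()

    def parse2(self, response):
        print("you are here!")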

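Although this is an old thread, I have found it quite useful to use Helium (built on top of Selenium) for this purpose; it is far easier/simpler than using Selenium directly. It would look something like the following: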

from helium import *

start_firefox("your_url")
s = S("path_to_your_button")
click(s)
...

To use JavaScript properly and fully, you need a full browser engine, and that is only possible with Watir/WatiN/Selenium and the like.




