Downloading a protected file using python urllib
Answers

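The mechanize library (http://wwwsearch.sourceforge.net/mechanize/) is very useful for this kind of situation. It emulates a browser, including filling out forms (such as the login form) and maintaining state such as cookies. You can log in to the website and then browse to the PDF file. You could use code like the following: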

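import mechanize

br = mechanize.Browser()  # browser-like client that keeps cookies between requests
br.open(login_url)  # login_url: address of the site's login page
# code to log in with br
data = br.open(pdf_url).get_data()  # raw bytes of the protected PDF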

Then you would have to parse the data as a PDF file, after which you can do whatever you need with it.
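Before parsing, it can help to confirm that the downloaded bytes really are a PDF and not, say, the login page returned again. The following is a minimal sketch using only the standard library. The helper names `looks_like_pdf` and `save_pdf` are hypothetical, and the check relies only on the fact that PDF files begin with the marker `%PDF-`.

```python
from pathlib import Path


def looks_like_pdf(data: bytes) -> bool:
    """Rough check: PDF files begin with the marker %PDF-."""
    return data.startswith(b"%PDF-")


def save_pdf(data: bytes, path: str) -> Path:
    """Write the downloaded bytes to disk, refusing anything that is not a PDF."""
    if not looks_like_pdf(data):
        # An HTML login page here usually means the session was not authenticated.
        raise ValueError("response does not look like a PDF")
    out = Path(path)
    out.write_bytes(data)
    return out
```

With the `data` variable from the snippet above, `save_pdf(data, "document.pdf")` would store the file for whatever PDF tool you use afterwards.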

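When you use the web application, it creates a "session" for you. The session details are stored in a cookie on your client. Your client sends the cookie contents with every HTTP request. That way, the web application knows that your HTTP requests belong to the same session. Initially, you are just an unknown user in that session. After you log in, the web application knows that requests in that session come from an authorized user.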
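The cookie round trip described here can be illustrated with the standard library alone. This is only a sketch: the function name `cookie_header_from_set_cookie` and the cookie name `sessionid` are made up for illustration, and a real client would also honor attributes such as path, domain, and expiry.

```python
from http.cookies import SimpleCookie


def cookie_header_from_set_cookie(set_cookie_value: str) -> str:
    """Turn a server's Set-Cookie value into the Cookie header a client sends back."""
    parsed = SimpleCookie()
    parsed.load(set_cookie_value)
    # Only name=value pairs go back to the server; attributes like Path stay client-side.
    return "; ".join(f"{name}={morsel.value}" for name, morsel in parsed.items())


# The server answers the first request with something like this header value...
server_value = "sessionid=abc123; Path=/"
# ...and the client attaches the result to every later request as its Cookie header.
client_value = cookie_header_from_set_cookie(server_value)
```

Libraries such as mechanize, or `http.cookiejar` in the standard library, do this bookkeeping automatically.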

You have two options:

  • log in via browser, craft the cookie and fake the browser in subsequent requests using Python
  • do everything in Python (starting from the initial request, logging in, document retrieval)

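Both can be quite a lot of work (especially if you are new to these things), because you have to tailor your code to the specifics of the web application. Libraries like mechanize (as others have already mentioned) can save you some of that work.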
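Both options can be sketched with `urllib` and `http.cookiejar` from the Python 3 standard library. Everything site-specific below is an assumption: the form field names `username` and `password`, the cookie name you pass in, and the function names themselves are hypothetical. A real site may also require hidden form fields or tokens that this sketch does not handle.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def request_with_browser_cookie(url, cookie_name, cookie_value, user_agent="Mozilla/5.0"):
    """Option 1: reuse a session cookie copied from a logged-in browser."""
    req = urllib.request.Request(url)
    req.add_header("Cookie", f"{cookie_name}={cookie_value}")
    req.add_header("User-Agent", user_agent)  # present ourselves like a browser
    return req


def build_login_request(login_url, username, password):
    """Option 2, step 1: a POST request carrying the (assumed) login form fields."""
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(login_url, data=form.encode("ascii"))


def download_after_login(login_url, pdf_url, username, password):
    """Option 2, complete: log in, keep the session cookie, then fetch the file."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    opener.open(build_login_request(login_url, username, password)).close()
    with opener.open(pdf_url) as response:  # the jar re-sends the session cookie
        return response.read()
```

For option 1, pass the returned request to `urllib.request.urlopen`. The cookie value has to be copied from the browser by hand and stops working once the session expires, which is why option 2 is usually easier to repeat.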




