English 中文(简体)
Get webpage contents with Python?
原标题:

I m using Python 3.1, if that helps.

Anyways, I m trying to get the contents of this webpage. I Googled for a little bit and tried different things, but they didn t work. I m guessing that this should be an easy task, but...I can t get it. :/.

Results of urllib, urllib2:

>>> import urllib2
Traceback (most recent call last):
  File "<pyshell#0>", line 1, in <module>
    import urllib2
ImportError: No module named urllib2
>>> import urllib
>>> urllib.urlopen("http://www.python.org")
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    urllib.urlopen("http://www.python.org")
AttributeError:  module  object has no attribute  urlopen 
>>> 

Python 3 solution

Thank you, Jason. :D.

import urllib.request
page = urllib.request.urlopen( http://services.runescape.com/m=hiscore/ranking?table=0&category_type=0&time_filter=0&date=1519066080774&user=zezima )
print(page.read())
最佳回答

Because you re using Python 3.1, you need to use the new Python 3.1 APIs.

Try:

urllib.request.urlopen( http://www.python.org/ )

Alternately, it looks like you re working from Python 2 examples. Write it in Python 2, then use the 2to3 tool to convert it. On Windows, 2to3.py is in python31 oolsscripts. Can someone else point out where to find 2to3.py on other platforms?

Edit

These days, I write Python 2 and 3 compatible code by using six.

from six.moves import urllib
urllib.request.urlopen( http://www.python.org )

Assuming you have six installed, that runs on both Python 2 and Python 3.

问题回答

If you re writing a project which installs packages from PyPI, then the best and most common library to do this is requests. It provides lots of convenient but powerful features. Use it like this:

import requests
response = requests.get( http://hiscore.runescape.com/index_lite.ws?player=zezima )
print (response.status_code)
print (response.content)

But if your project does not install its own dependencies, i.e. is limited to things built-in to the standard library, then you should consult one of the other answers.

If you ask me. try this one

import urllib2
resp = urllib2.urlopen( http://hiscore.runescape.com/index_lite.ws?player=zezima )

and read the normal way ie

page = resp.read()

Good luck though

Mechanize is a great package for "acting like a browser", if you want to handle cookie state, etc.

http://wwwsearch.sourceforge.net/mechanize/

You can use urlib2 and parse the HTML yourself.

Or try Beautiful Soup to do some of the parsing for you.

Also you can use faster_than_requests package. That s very fast and simple:

import faster_than_requests as r
content = r.get2str("http://test.com/")

Look at this comparison:

enter image description here

A solution with works with Python 2.X and Python 3.X:

try:
    # For Python 3.0 and later
    from urllib.request import urlopen
except ImportError:
    # Fall back to Python 2 s urllib2
    from urllib2 import urlopen

url =  http://hiscore.runescape.com/index_lite.ws?player=zezima 
response = urlopen(url)
data = str(response.read())

Suppose you want to GET a webpage s content. The following code does it:

# -*- coding: utf-8 -*-
# python

# example of getting a web page

from urllib import urlopen
print urlopen("http://xahlee.info/python/python_index.html").read()




相关问题
Can Django models use MySQL functions?

Is there a way to force Django models to pass a field to a MySQL function every time the model data is read or loaded? To clarify what I mean in SQL, I want the Django model to produce something like ...

An enterprise scheduler for python (like quartz)

I am looking for an enterprise tasks scheduler for python, like quartz is for Java. Requirements: Persistent: if the process restarts or the machine restarts, then all the jobs must stay there and ...

How to remove unique, then duplicate dictionaries in a list?

Given the following list that contains some duplicate and some unique dictionaries, what is the best method to remove unique dictionaries first, then reduce the duplicate dictionaries to single ...

What is suggested seed value to use with random.seed()?

Simple enough question: I m using python random module to generate random integers. I want to know what is the suggested value to use with the random.seed() function? Currently I am letting this ...

How can I make the PyDev editor selectively ignore errors?

I m using PyDev under Eclipse to write some Jython code. I ve got numerous instances where I need to do something like this: import com.work.project.component.client.Interface.ISubInterface as ...

How do I profile `paster serve` s startup time?

Python s paster serve app.ini is taking longer than I would like to be ready for the first request. I know how to profile requests with middleware, but how do I profile the initialization time? I ...

Pragmatically adding give-aways/freebies to an online store

Our business currently has an online store and recently we ve been offering free specials to our customers. Right now, we simply display the special and give the buyer a notice stating we will add the ...

Converting Dictionary to List? [duplicate]

I m trying to convert a Python dictionary into a Python list, in order to perform some calculations. #My dictionary dict = {} dict[ Capital ]="London" dict[ Food ]="Fish&Chips" dict[ 2012 ]="...

热门标签