Python: GET Webpage Content

By Xah Lee.

Suppose you want to GET a webpage's content. The following code does it:

# -*- coding: utf-8 -*-
# python 2

# get a website page

from urllib import urlopen
print urlopen("").read()

# prints the html source code

# python 3
# get a website page

import urllib.request

print( urllib.request.urlopen("").read())

# prints the html source code
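
In Python 3, read() returns bytes, not str, so printing it shows something like b'...'. Here is a minimal sketch that decodes the body to text. The function name fetch_text, the 10 second timeout, and the UTF-8 fallback are assumptions for illustration, not part of the code above. The data: URL is only a stand-in so the example runs without a network connection.

```python
import urllib.request


def fetch_text(url, timeout=10, default_charset="utf-8"):
    """Fetch url and return the body as str (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # use the charset from the Content-Type header, if the server sent one
        charset = response.headers.get_content_charset() or default_charset
    # errors="replace" is an assumption: bad bytes become U+FFFD instead of raising
    return raw.decode(charset, errors="replace")


# stand-in URL so this runs offline; replace with a real http(s) URL
print(fetch_text("data:text/plain;charset=utf-8,hello%20world"))
```

For a real page, pass its http or https URL instead. A failed request raises urllib.error.URLError, which you may want to catch.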

Encode URL

Some characters in a URL need to be percent-encoded, such as non-ASCII characters. You can use the quote function to encode them.

Use unquote to reverse it.

# -*- coding: utf-8 -*-
# python 2

from urllib import quote

print quote("~joe's home page")

# prints
# %7Ejoe%27s%20home%20page

print '' + quote("ménage à trois")

# python 3
import urllib.parse

print( urllib.parse.quote("~joe's home page") )
# ~joe%27s%20home%20page

print( '' + urllib.parse.quote("ménage à trois") )
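
The examples above only show quote. Here is a short Python 3 sketch of the round trip with unquote, plus the safe parameter of quote. The variable names are made up for illustration.

```python
from urllib.parse import quote, unquote

original = "ménage à trois"

# non-ASCII characters are encoded as UTF-8 bytes, then percent-escaped
encoded = quote(original)
print(encoded)
# m%C3%A9nage%20%C3%A0%20trois

# unquote reverses it
decoded = unquote(encoded)
print(decoded == original)
# True

# by default "/" is left alone; pass safe="" to encode it too
path_encoded = quote("a/b c")
strict_encoded = quote("a/b c", safe="")
print(path_encoded)
# a/b%20c
print(strict_encoded)
# a%2Fb%20c
```

Use safe="" when the string goes into a single path segment or a query value, where a literal slash would change the meaning of the URL.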
