Python: GET Webpage Content

By Xah Lee.

Suppose you want to GET a webpage's content. The following code does it:

# -*- coding: utf-8 -*-
# python 2

# get a website page

from urllib import urlopen
print urlopen("https://www.google.com/").read()

# prints the html source code

# python 3

# get a website page

import urllib.request

print( urllib.request.urlopen("https://www.google.com/").read())

# prints the html source code
# in python 3, read() returns bytes, so the output looks like b'...'
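
In Python 3, read() returns bytes, not text. Here is a minimal sketch of getting the page as a string instead. The function name fetch_text and the utf-8 fallback are assumptions made for illustration, not part of the original example. It takes the charset from the response's Content-Type header when the server sends one.

```python
# python 3
import urllib.request

def fetch_text(url, default_charset="utf-8"):
    """Return the content at url as a str. (hypothetical helper)"""
    with urllib.request.urlopen(url) as response:
        # charset from the Content-Type header, or the fallback if none is given
        charset = response.headers.get_content_charset() or default_charset
        # errors="replace" avoids an exception on bytes that do not match the charset
        return response.read().decode(charset, errors="replace")

# example call (needs network access):
# print(fetch_text("https://www.google.com/"))
```

If the server declares no charset and the page is not utf-8, the fallback will be a wrong guess, so pass a different default_charset in that case.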

Encode URL

Some characters in a URL must be percent-encoded, such as non-ASCII characters and spaces. Use the quote function to encode them.

Use unquote to reverse it.

# -*- coding: utf-8 -*-
# python 2

from urllib import quote

print quote("~joe's home page")

# prints
# %7Ejoe%27s%20home%20page

print 'http://www.google.com/search?q=' + quote("ménage à trois")
# http://www.google.com/search?q=m%C3%A9nage%20%C3%A0%20trois

# python 3

import urllib.parse

print( urllib.parse.quote("~joe's home page") )
# ~joe%27s%20home%20page
# (python 3.7 and later leave ~ unencoded; python 2 encodes it as %7E)

print( 'http://www.google.com/search?q=' + urllib.parse.quote("ménage à trois") )
# http://www.google.com/search?q=m%C3%A9nage%20%C3%A0%20trois
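
The examples above only show quote. Here is a short sketch of the reverse direction with unquote in Python 3. The variable names encoded and decoded are just for illustration.

```python
# python 3
import urllib.parse

# encode, then decode back to the original string
encoded = urllib.parse.quote("ménage à trois")
print(encoded)
# m%C3%A9nage%20%C3%A0%20trois

decoded = urllib.parse.unquote(encoded)
print(decoded)
# ménage à trois

print(decoded == "ménage à trois")
# True
```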
