Python: GET Webpage Content

By Xah Lee. Date: 2005-02-04. Last updated: 2019-03-14.

GET Webpage Content

Suppose you want to GET a webpage's content. The following code does it:

# python 3

# get a website page

import urllib.request

print( urllib.request.urlopen("https://www.google.com/").read())

# prints the html source code, as a bytes object

# -*- coding: utf-8 -*-
# python 2

# get a website page

from urllib import urlopen
print urlopen("https://www.google.com/").read()

# prints the html source code
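In Python 3, read() returns a bytes object, not text. Below is a minimal sketch of decoding the response into a str. The function name fetch_text, the UTF-8 fallback, and the 10 second timeout are assumptions for illustration, not part of the original example.

```python
# python 3
import urllib.request


def fetch_text(url, default_charset="utf-8", timeout=10):
    """Fetch url and return the body decoded to str (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # Use the charset declared in the Content-Type header, if any.
        charset = response.headers.get_content_charset() or default_charset
    # Replace undecodable bytes instead of raising UnicodeDecodeError.
    return raw.decode(charset, errors="replace")
```

For example, fetch_text("https://www.google.com/") would return the page's html source as a str. The fallback charset is only a guess for servers that declare none.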

Encode URL

Some characters in a URL need to be percent-encoded, such as non-ASCII characters. You can use the quote function to encode them.

The unquote function reverses it.

# -*- coding: utf-8 -*-
# python 2

from urllib import quote

print quote("~joe's home page")

# prints
# %7Ejoe%27s%20home%20page

print 'http://www.google.com/search?q=' + quote("ménage à trois")
# http://www.google.com/search?q=m%C3%A9nage%20%C3%A0%20trois

# python 3

import urllib.parse

print( urllib.parse.quote("~joe's home page") )
# ~joe%27s%20home%20page
# (Python 3.7 and later. Earlier versions also encode ~ as %7E)

print( 'http://www.google.com/search?q=' + urllib.parse.quote("ménage à trois") )
# http://www.google.com/search?q=m%C3%A9nage%20%C3%A0%20trois
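A short Python 3 sketch of the reverse direction with unquote, plus the safe parameter of quote. By default quote leaves "/" unencoded, so passing safe="" is one way to encode a value that may itself contain a slash. The variable names here are only illustrative.

```python
# python 3
from urllib.parse import quote, unquote

original = "ménage à trois"

encoded = quote(original)
print(encoded)
# m%C3%A9nage%20%C3%A0%20trois

decoded = unquote(encoded)
print(decoded)
# ménage à trois

# "/" is left alone by default
default_quoted = quote("a/b c")
print(default_quoted)
# a/b%20c

# safe="" means "/" is encoded too
fully_quoted = quote("a/b c", safe="")
print(fully_quoted)
# a%2Fb%20c
```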