Python: GET Webpage Content

By Xah Lee.

Suppose you want to GET a webpage's content. The following code does it:

# -*- coding: utf-8 -*-
# Python 2

# example of getting a web page

from urllib import urlopen

# fetch the page and print its raw HTML
print urlopen("http://xahlee.info/python/python_index.html").read()
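The code above targets Python 2. In Python 3, urlopen lives in urllib.request and read() returns bytes rather than a string, so a rough equivalent might look like the sketch below. The helper name fetch_page, the 10-second timeout, and the UTF-8 fallback are assumptions made for illustration.

```python
from urllib.request import urlopen


def fetch_page(url, timeout=10):
    """Return the body of url as text (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes in Python 3
        # use the charset from the Content-Type header, else assume UTF-8
        charset = response.headers.get_content_charset() or "utf-8"
    return raw.decode(charset)
```

Calling print(fetch_page("http://xahlee.info/python/python_index.html")) should then print the page's HTML, much like the Python 2 version.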

Encoding URL

When working with HTML pages, you sometimes need to create links. In a URL, certain characters need to be percent-encoded. For example, http://example.com/~xah can be written as http://example.com/%7Exah. Basically, any reserved character ! * ' ( ) ; : @ & = + $ , / ? # [ ], when not used for its special purpose such as delimiting CGI parameters, needs to be encoded as a percent sign followed by its two-digit hexadecimal code. For example, ~ has hexadecimal 7e, so it is encoded as %7E. (Strictly speaking, RFC 3986 lists ~ as an unreserved character, so encoding it is optional; Python 2's quote encodes it anyway.)

In Python 2, the quote function in the urllib module does this encoding, and unquote reverses it.

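# -*- coding: utf-8 -*-
# Python 2

from urllib import quote

# percent-encode everything except letters, digits, and _ . - /
print quote("~joe's home page")  # prints %7Ejoe%27s%20home%20page

# non-ASCII text is encoded byte by byte (UTF-8 here, per the coding line)
print 'http://www.google.com/search?q=' + quote("ménage à trois")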
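For Python 3, quote and unquote moved to urllib.parse. The following is a minimal sketch of the same encoding plus the reverse step. The variable names are only illustrative, and the exact handling of ~ depends on the Python version (3.7 and later leave it unencoded).

```python
from urllib.parse import quote, unquote

# quote() percent-encodes everything except letters, digits, and a few
# safe characters ("/" by default; "~" is also left alone on Python 3.7+)
encoded = quote("~joe's home page")
print(encoded)

# non-ASCII text is converted to UTF-8 bytes before encoding
query = quote("ménage à trois")
print("http://www.google.com/search?q=" + query)  # ...q=m%C3%A9nage%20%C3%A0%20trois

# unquote() reverses the encoding
print(unquote(query))  # ménage à trois

# pass safe="" to encode "/" as well, e.g. for a single query value
print(quote("a/b", safe=""))  # a%2Fb
```

Leaving "/" safe by default suits whole paths; for a single query value or path segment, safe="" is usually the more cautious choice.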

20.5. urllib — Open arbitrary resources by URL — Python v2.7.6 documentation

See also: