Python: GET Webpage Content
GET Webpage Content
Suppose you want to GET a webpage's content. The following code does it:
# python 3
# get a website page
import urllib.request
print(urllib.request.urlopen("https://www.google.com/").read())
# prints the html source code, as a bytes object
# -*- coding: utf-8 -*-
# python 2
# get a website page
from urllib import urlopen
print urlopen("https://www.google.com/").read()
# prints the html source code
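In Python 3, read() returns bytes, so you usually want to decode the result before working with it as text. The following is a minimal sketch of one way to do that. The function name fetch_text, the 10-second timeout, and the UTF-8 fallback are assumptions chosen for illustration, not part of the original example.

```python
import urllib.request


def fetch_text(url, timeout=10, default_encoding="utf-8"):
    """Fetch url and return the response body as a str.

    fetch_text is a hypothetical helper name. The timeout and the
    fallback encoding are illustrative defaults.
    """
    # The with-statement closes the connection when the block ends.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        raw = response.read()  # bytes
        # Use the charset from the Content-Type header when one is given.
        charset = response.headers.get_content_charset() or default_encoding
    # errors="replace" avoids an exception on bytes that do not decode.
    return raw.decode(charset, errors="replace")
```

For example, fetch_text("https://www.google.com/") would return the page source as a string instead of a bytes object. A network failure raises urllib.error.URLError, which the caller can catch.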
Encode URL
Some characters in a URL need to be encoded, such as non-ASCII characters.
You can use the quote function to encode them. The unquote function reverses it.
# -*- coding: utf-8 -*-
# python 2
from urllib import quote
print quote("~joe's home page")
# prints
# %7Ejoe%27s%20home%20page
print 'http://www.google.com/search?q=' + quote("ménage à trois")
# http://www.google.com/search?q=m%C3%A9nage%20%C3%A0%20trois
# python 3
import urllib.parse
print(urllib.parse.quote("~joe's home page"))
# ~joe%27s%20home%20page
print('http://www.google.com/search?q=' + urllib.parse.quote("ménage à trois"))
# http://www.google.com/search?q=m%C3%A9nage%20%C3%A0%20trois
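To show unquote reversing quote, here is a short Python 3 sketch of a round trip. The variable names are illustrative. The last part uses the safe parameter of quote, which defaults to "/", to encode slashes as well.

```python
from urllib.parse import quote, unquote

original = "ménage à trois"

encoded = quote(original)   # percent-encode the UTF-8 bytes
decoded = unquote(encoded)  # reverse the encoding

print(encoded)  # m%C3%A9nage%20%C3%A0%20trois
print(decoded)  # ménage à trois

# By default quote leaves "/" alone. Pass safe="" to encode it too,
# for example when the text is a single path segment or query value.
path_part = quote("a/b c", safe="")
print(path_part)  # a%2Fb%20c
```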