Perl: GET Web Page Content

By Xah Lee.

In Perl, the easiest way to get a web page is to use the Perl programs HEAD or GET, usually installed in /usr/bin. For example, in a shell, type GET followed by the URL of the page.

It returns the web page content. To save it to a file, redirect the output: GET > myfile.txt, with the URL placed before the >.

HEAD and GET are two request methods of the HTTP protocol: GET asks for the full page, HEAD asks for the response headers only. The Perl scripts are named after them. [see HTTP Protocol Tutorial]
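The difference between the two methods can also be seen from inside a Perl program. The following is a minimal sketch, assuming the libwww-perl modules are installed. The helper name summarize and the idea of passing the URL on the command line are made up for this illustration.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: describe a response by its status line,
# content type, and the size of its body in bytes.
sub summarize {
    my ($response) = @_;
    my $type = $response->header('Content-Type') // 'unknown';
    return sprintf "%s | %s | %d bytes of body",
        $response->status_line, $type, length($response->content);
}

# The URL is taken from the command line; without one, nothing is sent.
if (my $url = shift @ARGV) {
    my $ua = LWP::UserAgent->new;
    print "HEAD: ", summarize($ua->head($url)), "\n"; # headers only
    print "GET:  ", summarize($ua->get($url)), "\n";  # headers plus body
}
```

Run it with a URL as the only argument. For a normal page, the HEAD line should report an empty body, while the GET line reports the size of the page.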

Here's a HEAD example:

[Screenshot: linux http GET HEAD command line tool]

The Linux commands GET, HEAD, and POST are Perl scripts from the libwww-perl distribution (each is an alias for lwp-request). They are installed on Ubuntu. You can read their documentation with man HEAD.

For more control, use LWP::Simple or LWP::UserAgent. Both need to be installed; they are not part of core Perl.
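LWP::Simple is the shorter of the two. Here is a minimal sketch of it, assuming the module is installed. The sub names fetch_page and save_page and the command-line URL are made up for this illustration; get, getstore, and is_success are the functions LWP::Simple exports.

```perl
use strict;
use warnings;
use LWP::Simple qw(get getstore is_success);

# Fetch a page into a string. get() returns undef on failure.
sub fetch_page {
    my ($url) = @_;
    my $content = get($url);
    die "Could not fetch $url\n" unless defined $content;
    return $content;
}

# Save a page to a file, much like GET with shell redirection.
# getstore() returns the HTTP status code of the response.
sub save_page {
    my ($url, $file) = @_;
    my $status = getstore($url, $file);
    die "Could not save $url (status $status)\n" unless is_success($status);
    return $status;
}

# The URL is taken from the command line; without one, nothing is fetched.
if (my $url = shift @ARGV) {
    print fetch_page($url);
    save_page($url, 'myfile.txt');
}
```

LWP::Simple gives no access to request headers or timeouts. When those matter, LWP::UserAgent is the better fit.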

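# -*- coding: utf-8 -*-
# perl

# get web page content

use strict;
use warnings;
# use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;

my $ua = LWP::UserAgent->new;
$ua->timeout(120); # give up after 120 seconds

my $url = ''; # put the address of the page to fetch here
my $request = HTTP::Request->new('GET', $url);
my $response = $ua->request($request);

# stop with the HTTP status line if the request failed
die $response->status_line, "\n" unless $response->is_success;

my $content = $response->content();
print $content;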

A call such as $ua->timeout(120); uses Perl's object-oriented syntax: it calls the method timeout on the object $ua, here setting the request timeout to 120 seconds.
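The arrow syntax can be tried without any network access. Here is a minimal sketch using a made-up class named Timer, which is not part of LWP and exists only for this illustration. It shows that, for a simple class like this one, a method call with the arrow is an ordinary sub call with the object passed as the first argument.

```perl
use strict;
use warnings;

# A tiny hypothetical class, only to illustrate the arrow syntax.
package Timer;

sub new {
    my ($class) = @_;
    my $self = { timeout => 180 }; # arbitrary default for this sketch
    return bless $self, $class;
}

# With an argument, set the timeout. Always return the current value.
sub timeout {
    my ($self, $seconds) = @_;
    $self->{timeout} = $seconds if defined $seconds;
    return $self->{timeout};
}

package main;

my $t = Timer->new;       # class method call, here like Timer::new('Timer')
$t->timeout(120);         # object method call, here like Timer::timeout($t, 120)
print $t->timeout, "\n";  # prints 120
```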


