Linux: Download Web Page: curl, wget
Here's how to download websites, 1 page or entire site.
Download web page/image by curl
# download a html page curl -O http://example.org/xyz/cat.html
Download Image Sequence
# download all jpg files named cat01.jpg to cat20.jpg curl -O http://example.org/xyz/cat[01-20].jpg
# download all jpg files named cat1.jpg to cat20.jpg curl -O http://example.org/xyz/cat[1-20].jpg
Other useful options are:
--referer http://example.org/
- Set a referer (that is, a link you came from). Some sites refuse to show content without this.
--user-agent "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0"
- Set user agent. Some sites refuse to show content without this. [see Show My User Agent]
Note: curl cannot be used to download entire website recursively. Use wget for that.
Download web page/image by wget
Download 1 Web Page
# download a file wget http://example.org/somedir/largeMovie.mov
Download Entire Website
# download website, 2 levels deep, wait 9 sec per page wget --wait=9 --recursive --level=2 http://example.org/
Some sites check on user agent. (user agent basically means browser). so you might add this option “--user-agent=”.
wget http://example.org/ --user-agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0'
[see Show My User Agent]