Using wget to download an entire website

This article is based on "Downloading an Entire Web Site with wget", published by Linux Journal on 5 September 2008. The purpose is off-line viewing or backup.

Remember that downloading an entire website that is not yours is unethical and may be illegal as well; at the very least, throttle the download so you do not overload the server (see the variant at the end of this article).

# wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains website.org \
--no-parent \
www.website.org/tutorials/html/
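
For quick interactive use, the same command can be written with wget's short options (an equivalent one-liner, not from the original article; --restrict-file-names has no short form):

# wget -r -nc -p -E -k --restrict-file-names=windows -D website.org -np www.website.org/tutorials/html/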

The options are:

  • --recursive: download the entire Web site.

  • --domains website.org: don't follow links outside website.org.

  • --no-parent: don't ascend to the parent directory; the download stays within tutorials/html/.

  • --page-requisites: get all the elements that compose the page (images, CSS and so on).

  • --html-extension: save files with the .html extension (in recent wget releases this option is spelled --adjust-extension).

  • --convert-links: convert links so that they work locally, off-line.

  • --restrict-file-names=windows: modify filenames so that they will work in Windows as well.

  • --no-clobber: don't overwrite any existing files (used in case the download is interrupted and
    resumed).
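
As mentioned above, it is good manners to pace a recursive download so it does not hammer the server. Here is the same command with throttling options added (the two-second wait and the 200 KB/s cap are example values, not from the original article):

# wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--domains website.org \
--no-parent \
--wait=2 \
--random-wait \
--limit-rate=200k \
www.website.org/tutorials/html/

Here --wait pauses two seconds between retrievals, --random-wait varies that pause so the traffic looks less mechanical, and --limit-rate caps the download bandwidth.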