Mirror a web site with wget
To mirror a web site for offline use or preservation, wget is a good fit:
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent <URL to the website>
The options are as follows:
- --mirror: turns on recursion and time-stamping with infinite recursion depth (currently equivalent to -r -N -l inf --no-remove-listing)
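- --convert-links: after the download finishes, rewrites links in the downloaded documents so they work for offline and local viewing (links to downloaded files become relative)
- --adjust-extension: appends a suitable extension (e.g. .html or .css) to saved files whose URL does not end in one matching their content type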
- --page-requisites: makes wget download all necessary files to display the web pages (e.g. style sheets, inlined images, etc.)
- --no-parent: never ascends to the parent directory when recursing, so only pages at or below the starting URL's directory are downloaded
Before mirroring, make sure you have the right to copy the web site. Using too much bandwidth or sending too many requests in a short time might get you blocked.
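One way to reduce the risk of being blocked is to throttle wget. The following is a minimal sketch, not a definitive recipe: mirror_politely is a hypothetical helper name, the WGET override variable is an assumption added so the command can be previewed without running it, and the 2-second wait and 200k rate limit are illustrative values to tune per site.

```shell
#!/bin/sh
# mirror_politely (hypothetical helper): the mirror command from above,
# plus throttling options. The numeric values are illustrative only.
mirror_politely() {
    url=$1
    # WGET is an assumed override: set WGET=echo to print the
    # arguments instead of starting a download.
    ${WGET:-wget} \
        --mirror --convert-links --adjust-extension --page-requisites --no-parent \
        --wait=2 --random-wait --limit-rate=200k \
        "$url"
}
```

Calling mirror_politely with the site's URL starts the download. --wait pauses between retrievals, --random-wait varies that pause, and --limit-rate caps the download speed.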