How to Archive Old Web Pages
Whether you have a personal website, run a business or just want to preserve Web content that may be removed from the World Wide Web, there are options for archiving websites. There are two main methods for Web archiving: software that lets you pull all of the content of a website off the Web into a local directory on your computer, or subscription services that capture and host the data from a website remotely.
Instructions
-
Archiving Web Pages
-
1
Determine what type of Web archiving tool would work best for you. If you have the technical knowledge to maintain Web pages, access to an IT staff and a secure server, you might want standalone software so that you can maintain the archived pages internally. If you don't have these things, you may want to rely on an online service that will grab and store Web pages remotely.
-
2
Identify the Web pages that you expect to archive and determine what types of content need to be captured. Different archiving tools capture different levels of a Web page. If you need multimedia elements archived, that will change which archiving tool you will want to use.
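One way to see what types of content a page contains is to count the elements in its HTML. The following is a minimal sketch using only the Python standard library. The category names and the `inventory` function are hypothetical, chosen for illustration, and the tag list is an assumption that will not cover every kind of embedded content.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical mapping from HTML tags to broad content categories.
TAG_CATEGORIES = {
    "img": "image",
    "video": "video",
    "audio": "audio",
    "script": "script",
    "embed": "embedded object",
    "object": "embedded object",
    "iframe": "embedded frame",
}


class ContentInventory(HTMLParser):
    """Counts the tags on a page that usually point at extra files."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # <link rel="stylesheet"> is the only <link> counted here.
        if tag == "link" and "stylesheet" in (attr_map.get("rel") or "").lower():
            self.counts["stylesheet"] += 1
        elif tag in TAG_CATEGORIES:
            self.counts[TAG_CATEGORIES[tag]] += 1


def inventory(html_text):
    """Return a dict of content category -> number of matching tags."""
    parser = ContentInventory()
    parser.feed(html_text)
    parser.close()
    return dict(parser.counts)
```

Calling `inventory` on the saved source of a page gives a rough picture of what needs capturing. If the result includes "video" or "audio" entries, for example, that suggests looking for a tool that handles multimedia.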
-
3
Choose the Web archiving tool that suits your needs based on the above steps. Several of the options are open source, which means that you are not relying on a commercial vendor to hold your data. Some options available include (links to all of the options are available under Resources):
Archive-It is a subscription service from the Internet Archive, the largest Web archive in the world. It grabs, catalogs, stores and makes available the digital content that a user selects. All of the content is also accessible to the public.
HTTrack is a free standalone offline program that lets the user download Web content from the World Wide Web and store it in a local directory.
Grab-A-Site is a low-cost standalone offline program that grabs Web content and stores it in a local directory.
Heritrix, also created by the Internet Archive, is a free standalone Web crawler that stores Web data in a local directory.
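As an illustration of the standalone approach, HTTrack can be run from the command line, with the `-O` option naming the local directory that receives the mirror. The sketch below wraps that call in Python. It assumes the `httrack` executable is installed and on your PATH, and the function names `build_httrack_command` and `mirror_site` are hypothetical helpers, not part of HTTrack.

```python
import subprocess
from pathlib import Path


def build_httrack_command(url, output_dir):
    """Return the argument list for mirroring `url` into `output_dir`."""
    # "-O" tells HTTrack where to write the mirrored files.
    return ["httrack", url, "-O", str(Path(output_dir))]


def mirror_site(url, output_dir):
    """Run HTTrack; raises CalledProcessError if it reports a failure."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_httrack_command(url, output_dir), check=True)
```

A call such as `mirror_site("https://example.com/", "archive/example")` would start the download. The URL and directory here are placeholders.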
-
4
Contact the publisher of the website and request permission to archive the site. Web content is copyrighted, so be sure that you either own the rights to the content of the site or have permission from whoever does.
-
5
Maintain the Web archiving tool that you select. Check regularly that the tool is capturing all of the content that you want from the Web pages you are archiving.
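For an archive stored in a local directory, one simple check is to scan a saved page for links to files that never made it to disk. The sketch below assumes the archiving tool has rewritten links to relative paths, which may not hold for every tool. The function name `missing_local_files` is hypothetical.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class _RefCollector(HTMLParser):
    """Collects every src and href value found in a page."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                self.refs.append(value)


def missing_local_files(html_path):
    """Return relative references in a saved page that are absent on disk."""
    html_path = Path(html_path)
    collector = _RefCollector()
    collector.feed(html_path.read_text(encoding="utf-8", errors="replace"))
    collector.close()

    missing = []
    for ref in collector.refs:
        parsed = urlparse(ref)
        # Skip absolute URLs (http:, mailto:, data:) and in-page anchors.
        if parsed.scheme or parsed.netloc or not parsed.path:
            continue
        # Root-relative links ("/img/a.png") are not resolved in this sketch.
        if parsed.path.startswith("/"):
            continue
        target = html_path.parent / unquote(parsed.path)
        if not target.exists():
            missing.append(ref)
    return missing
```

An empty result does not prove the capture is complete, since content loaded by scripts will not appear in the HTML, but a non-empty one points to files worth grabbing again.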
-
6
Set up a good access system so that you can find content once you are archiving more than a few Web pages. Most Web archiving programs let you add titles, descriptions, keywords and dates for Web pages.
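If your tool does not provide such a catalog, a small index of your own can serve the same purpose. The following is a minimal sketch of a record with a title, description, keywords and capture date, plus a simple search. The `ArchiveRecord` class, its field names and the `search` function are hypothetical and not taken from any of the tools above.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ArchiveRecord:
    """Descriptive details kept for one archived Web page."""
    title: str
    url: str
    captured_on: date
    description: str = ""
    keywords: list = field(default_factory=list)


def search(records, term):
    """Return records whose title or description contains `term`,
    or that have `term` as an exact keyword (case-insensitive)."""
    term = term.lower()
    return [
        r for r in records
        if term in r.title.lower()
        or term in r.description.lower()
        or any(term == k.lower() for k in r.keywords)
    ]
```

For a larger archive the same fields could be kept in a spreadsheet or a database. The point is to record them consistently at the time of capture.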
-
References
Resources