Web Archiving Procedures
Created by Shannon Smith in Academic Year 2018-2019, with guidance from Jason Clark and Sara Mannheimer
Last updated by Sara Mannheimer on July 5, 2019
The purpose of the web archiving initiative at Montana State University (MSU) Library (henceforth, “the Library”) is to digitally capture, preserve, and provide access to websites documenting the history and culture of MSU and other entities related to the Library’s collecting areas. Please see the MSU Library’s Special Collections and Archival Informatics Collection Development Policy and Archive-It’s Storage and Preservation Policy for general policy guidelines.
Content archived as part of MSU Library collections supports the Library’s strategic plan objective 2.3: Expand, diversify, and adapt our collections and services. The Library’s content scope is outlined in the MSU Library Collection Development Policy (http://www.lib.montana.edu/collections/cdpolicy/).
For content identified outside the MSU domain, the Library will contact the content creators to provide an opportunity for them to opt out of inclusion in the MSU Library web archives. At the request of content creators, the Library can also support self-documentation and storage of web content (using Webrecorder).
In addition to aligning with the MSU Library Collection Development Policy, the Library’s web archiving program focuses especially on at-risk web content, the Yellowstone ecosystem, and the history of MSU, including student life and faculty research.
The Library maintains an organizational subscription to Archive-It, which serves as the primary tool through which the Library captures web content. Webrecorder is used to capture static web content such as digital scholarship projects or web-based theses and dissertations.
Web archiving falls under the purview of the Special Collections and Archival Informatics (SCAI) department. Permissions to access the Archive-It management interface are currently limited to the Data Librarian and the Head of SCAI. In the future, these permissions may be extended to SCAI staff and/or students.
The Data Librarian is responsible for selecting websites for archiving and establishing crawls. In the future, SCAI staff and/or students will participate in the web archiving process.
The Data Librarian manages metadata. In the future, the CATS department may advise on metadata practice.
Current metadata fields used in the MSU web archives are:
- Collector (Montana State University Library)
Access to websites archived using Archive-It is provided through the Archive-It web interface.
Access to websites archived using Webrecorder is provided through MSU ScholarWorks or the Filmmaking Archive of MSU Science and Natural History MFA Program.
For the preservation of websites archived using Archive-It, please see the Archive-It Storage and Preservation Policy.
For the preservation of websites archived using Webrecorder, please see the MSU Library Digital Preservation Policy and Procedures.
Content within official Montana State University websites is predominantly considered public record.
For websites captured outside of the university domain, the Library will provide an opt-out opportunity to organizations or individuals whose websites are selected for archiving. The Library will actively work to ensure compliance with copyright laws.
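The distinction between sites inside and outside the university domain could be checked mechanically when a seed URL is proposed. The following is a minimal sketch only, not part of the Library's actual workflow: it assumes the university domain is montana.edu (inferred from the policy URL cited in this document), and the function name `needs_opt_out_notice` is hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: the "university domain" is montana.edu, inferred from
# the lib.montana.edu policy URL cited in this document.
MSU_DOMAIN = "montana.edu"


def needs_opt_out_notice(url: str, home_domain: str = MSU_DOMAIN) -> bool:
    """Return True when a seed URL lies outside the university domain,
    meaning the site owner should be offered an opt-out opportunity.

    Hypothetical helper for illustration; not an Archive-It or
    Webrecorder function.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        # Covers URLs given without a scheme, e.g. "example.org/page".
        raise ValueError(f"No host found in URL: {url!r}")
    # Match the domain itself or any subdomain, but not look-alikes
    # such as "notmontana.edu".
    inside = host == home_domain or host.endswith("." + home_domain)
    return not inside
```

For example, `needs_opt_out_notice("http://www.lib.montana.edu/collections/cdpolicy/")` returns `False`, while a seed on an unrelated domain returns `True` and would prompt the permission email.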
The Library acknowledges that organizations and individuals, as content creators of websites, have agency over their born-digital content. If you believe the Library may have harvested your web content in error, or that maintaining your content in our archive does not adequately reflect your organization, please contact firstname.lastname@example.org. Content related to your organization can either be removed from our collection entirely or treated as sensitive or nonpublic data per Montana State University Library Digital Preservation Procedures section 4.4.
This web archiving procedure document will be used in new staff training to support web archiving practices in the Library. Current local practices will be reviewed periodically to assess alignment with current best practices in the field.
Hello [Insert Contact Name],
As part of our mission to collect and preserve the history of Montana and the immediate geographical region, Montana State University Library archives a selection of websites. I am reaching out to seek your permission to archive the website [Insert website name and URL] for inclusion in our web archive collections.
Archiving your website will include an initial capture as well as ongoing quarterly or semi-annual captures of the site. Website captures are completed using the Internet Archive’s Heritrix web crawler, and generally last for a few days. Once a crawl is complete, the crawler no longer interacts with your server. We capture websites at a slow rate so as not to interfere with access to your website.
You may request that we stop archiving or take down your archived website at any time.
If you do not wish to include your website in the collections at Montana State University Library, please opt out using this form: https://montana.qualtrics.com/jfe/form/SV_38bahgsxceRIDPv. Thank you in advance for your consideration.
[Name & contact]
To exclude your website / your organization's website from the web archiving collections at Montana State University Library, please enter your name and email address below. If you represent an organization, please enter the name of the organization.
First name (required)
Last name (required)
Email address (required)
Please note: if you wish to add your web presence to the MSU Library collections in the future, please contact email@example.com.
Live form accessible here: https://montana.qualtrics.com/jfe/form/SV_38bahgsxceRIDPv