Web Archiving Collection Development Policies Roundup
The British Library is only one of dozens of national libraries, universities and other organisations around the world that harvest, preserve and give access to web archive resources. We are also a member of the International Internet Preservation Consortium (IIPC), which has 48 members worldwide committed to the long-term preservation of internet resources.
A world of web archiving
Following much recent discussion, member institutions of the International Internet Preservation Consortium (IIPC) recently added their collection development policies for web archiving to the IIPC website. The page also contains links to the policies of some web archiving institutions which are not IIPC members. The list is not exhaustive for either category, but it is still interesting to see the policies brought together and to read them in conjunction.
Different remits
These policies reflect the different scope of web archiving activities among the listed institutions. Some have a national remit (like The British Library); others archive the web selectively, or undertake web archiving as a programme or for a project. Some (more fortunate) national institutions are supported by legislative frameworks such as Legal Deposit and can therefore archive the web at scale - the national libraries of France, the UK, Finland and Austria belong to this category. Although this is not stated explicitly, the lack of such a framework is one of the factors that has contributed to the choice of selective web archiving in some countries.
The policies vary in format, length and detail, but contain some common themes. They explain why institutions undertake web archiving, what material is in scope for collection, and how websites are collected, stored and used. Some policies also cover roles and responsibilities where web archiving is done using a collaborative model. The Bentley Historical Library’s policy is a good example: it includes a section on the responsibilities of the archive, of the provider of the service the Library subscribes to, and of the content owners.
Legal issues
Various legal aspects are covered by these policies, ranging from intellectual property and permissions to sensitive data within the web archives. The Finnish policy, by far the most comprehensive of all the policies, lists and describes the legislation relevant to web archiving, including the library’s interpretation of it. It also deals comprehensively with the topic of sensitive data, dividing it into the following categories:
- Personal Data Illegitimately Published
- Inspection and Correction of Personal Data
- Confidential and Secret Information
- Web Contents that Violate Law
- Web Contents Illegal to Hold
Common strategies
The National Library of Austria provides a short overview of the common strategies for web harvesting, which institutions adopt individually or in some combination. Access to and use of the archive is another common theme - a number of policies state that access to web archives is restricted, especially for those collected under a national legal framework as an exemption to copyright.
Web archiving institutions share the same mission and face similar legal and technical challenges. The IIPC is a great platform for collaborating and for learning transferable lessons and practices. It may be a good idea to develop a common template for policy statements and to make sure the most up-to-date versions are published.
By Helen Hockx-Yu, Head of Web Archiving, The British Library