Web archiving at LIKE39: what, why and how
[A guest post from Marja Kingma, curator of Dutch language collections at the British Library, and one of the leading lights of LIKE, the London Information and Knowledge Exchange.]
With a captivated audience of information professionals before him, Peter Webster (British Library) kicked off the new season of LIKE events at LIKE39. Peter had just moved to the British Library to take up his new role as Web Archiving Engagement & Liaison Officer and LIKE had just moved its meeting place to a new venue: The Castle, sister pub of The Crown Tavern. The upstairs room has state-of-the-art technical facilities and, more importantly, its own bar! And it is even closer to Farringdon Station than the Crown.
Peter is a contemporary historian and has worked with digital information in previous jobs. He now works with the UK Web Archive to raise people’s awareness of it and to encourage them to submit sites to the Archive. LIKE39 provided an excellent platform for this, because attendees know about the general issues around the ‘Digital Black Hole’ and the ephemeral nature of the Web, but they were not all familiar with the UK Web Archive.
Archiving the web, i.e. harvesting websites based in the UK on a regular basis, is just an extension of what the BL, TNA and other participants do with print material. In the past much of what has been printed has been lost and now the same is threatening to happen with electronic material and websites. Websites either disappear completely or are abandoned, leaving no trace of a contact; Peter called the abandoned ones ‘orphans’. An example is the City Information Group, where the idea for LIKE was born just as CIG folded. Its site is still on the web, but the ‘Contact’ page is no longer available. In this case, there is a good chance that a contact can be found, but this is much more problematic in other cases.
Professionals started to see a Digital Black Hole appearing and something had to be done. In 2003 the Legal Deposit Libraries Act was passed, establishing a legal deposit requirement for publishers of electronic material, but this Act still has to be implemented. Fortunately the BL, TNA, and the Wellcome Trust didn’t wait for that to happen and started the UK Web Archive, which is selection- and permission-based. After ten years of setting up and establishing partnerships, it is now ‘business as usual’, just in time for the implementation of the LDLA, which now seems to be going forward in earnest next year. This should make redundant the current practice of asking web owners’ permission to capture their site, although permission would still be necessary to make the archived copy publicly available. It should also speed up ingestion of content into the Archive, by systematic crawling of the UK domain. Alongside this method curators will continue to create thematic collections by actively bringing together websites from within the larger dataset.
It is important that all professionals dealing with websites in one way or another prepare for the preservation of their site(s), as part of the life-cycle for records management. Archiving your website preserves it for future access; researchers as well as the general public will always be able to see what your site looked like in the past. Anyone can nominate sites for inclusion in the UK Web Archive, using the simple online form.
LIKEnews.org.uk is being processed for the UK Web Archive and that is a good thing, because we like to think of LIKE as the first networking group for information professionals founded on and managed by using social media tools. It would be a shame if future historians were not able to track its development from the start!