UK Web Archive blog

28 posts from December 2011

16 December 2011

Advent Calendar: December 16th

Digital Curation Centre

'The Digital Curation Centre has been established to help solve the extensive challenges of digital preservation and to provide research, advice and support services to UK institutions'.

Archived on: 18 December 2006

Archived by: The JISC (Joint Information Systems Committee)

Still available on live web? Yes

Subject classification: Education & Research > Higher Education

Special collection? No

Other instances? Yes: 23 instances in total, archived regularly since March 2006. 

(Edited to correct date archived)

15 December 2011

TechTalk: UKWA web archiving tools on GitHub

We recently made a release to GitHub courtesy of the Open Planets Foundation. This primarily consists of two tools: the 'ArchiveExplorer' and 'ArchiveFS'.

ArchiveExplorer (Windows)

The Library stores the UK Web Archive as a series of WARC files. As explained in an earlier post, we use the Internet Archive's Wayback software to replay these files and provide access to websites. However, there are times when we need to access the contents of only a single WARC file, and we could find no tools, from the IA or anywhere else, for doing so.

ArchiveExplorer serves this requirement and enables the various records in the WARC file to be viewed directly in the tool, or double-clicked to be opened externally.
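
To give a feel for what "viewing the records in a WARC file" involves, here is a minimal Python sketch that walks through the records of an uncompressed WARC stream. It is not the ArchiveExplorer code; the function name `iter_warc_records` is made up for illustration, and the sketch assumes well-formed records with no folded header lines.

```python
def iter_warc_records(stream):
    """Yield (version, headers, body) for each record in a binary WARC stream.

    Illustrative only: assumes an uncompressed stream, a Content-Length
    header on every record, and no header line continuations.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if line.strip() == b"":
            continue  # blank lines separate one record from the next
        version = line.strip().decode("utf-8")
        if not version.startswith("WARC/"):
            raise ValueError(f"expected a WARC version line, got {version!r}")

        # Named header fields run until the first empty line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()

        # The record block is exactly Content-Length bytes long.
        body = stream.read(int(headers["content-length"]))
        yield version, headers, body


def list_records(path):
    """Print the type, target URI and size of every record in a WARC file."""
    with open(path, "rb") as stream:
        for _version, headers, body in iter_warc_records(stream):
            print(headers.get("warc-type"),
                  headers.get("warc-target-uri", "-"),
                  len(body))
```

For files compressed record by record with gzip, which is common for WARCs, opening the file with Python's `gzip.open(path, "rb")` instead of `open` should let the same loop work, since the `gzip` module reads concatenated members as a single stream.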

ArchiveFS (Linux)

Various FUSE (Filesystem in Userspace) tools exist for mounting and viewing the contents of different file types. Using the FUSE libraries, we've created a tool for mounting ARC/WARC files. Once a file is mounted, its contents are available in a directory structure mimicking that of the original site. Although the filesystem is read-only, any required file operations (e.g. MIME identification, virus scanning) can be performed as normal.
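
As a rough illustration of how a directory structure can mimic the original site, the Python sketch below maps an archived URL to a relative file path. It is an assumption about how such a mapping might work, not ArchiveFS's actual scheme: the function name `url_to_path`, the `index.html` default for directory-style URLs, and the decision to ignore query strings are all choices made for this example.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def url_to_path(url, index_name="index.html"):
    """Map an archived URL to a relative path of the form host/dir/file.

    Hypothetical mapping for illustration; query strings and fragments
    are ignored, and directory-style URLs get a default file name.
    """
    parts = urlsplit(url)
    host = parts.hostname or "unknown-host"  # hostname is already lower-cased
    path = unquote(parts.path)

    # Treat "http://host" and "http://host/dir/" as requests for an index page.
    if path == "" or path.endswith("/"):
        path += index_name

    # Normalise against the root so that ".." segments cannot climb
    # above the per-host directory.
    clean = posixpath.normpath("/" + path.lstrip("/")).lstrip("/")
    return posixpath.join(host, clean)


if __name__ == "__main__":
    for example in ("http://www.example.org/",
                    "http://www.example.org/css/site.css"):
        print(example, "->", url_to_path(example))
```

A real filesystem would also need a policy for URLs that differ only in their query string, and for a URL that is both a file and a directory prefix; those cases are left out here.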

Other web archiving institutions are welcome to make use of the tools, available from GitHub.

Roger Coram
Web Archiving Engineer, UK Web Archive 

Advent Calendar: December 15th

Consumers for Ethics in Research: CERES

'An independent charity set up in 1989 to promote informed debate about research and help users of health services to develop and publicise their views on health research and on new treatments.'

Website archived on: 15 December 2006

Still available on live web? No


Archived by: The Wellcome Library

Subject classifications: Medicine & Health > Health Organisations and Services

Special collection? No

Other instances? Yes, 9 in total (archived from 2004 - 2007)

14 December 2011

13 December 2011

12 December 2011

Advent Calendar: December 12th

'Site created to help teachers, parents and professionals understand the workings of a Special School for Children with Severe and Profound Learning Difficulties'

Archived on: 12th December 2005

Still available on live web? No

Archived by: The British Library

Subject Classifications: Society & Culture; Education & Research > Special Needs Education

Special collection? No

Other instances? Yes - 12 Dec 2006 (though as a parked domain)

11 December 2011

10 December 2011

Advent Calendar: December 10th

Electronic Iraq

Website established in 2003 'to provide a humanitarian perspective on the looming conflict in Iraq'.

Archived on: December 10th 2004

Still available on live web? Partially. 

Archived by: The British Library

Subject Classifications: Arts & Humanities > News and Contemporary Events

Special collection? No

Other instances available? Yes - 13 in total, captured between 2004 and 2010.