THE BRITISH LIBRARY

UK Web Archive blog

01 February 2018

A New Playback Tool for the UK Web Archive

We are delighted to announce that the UK Web Archive will be working with Rhizome to build a version of pywb (Python Wayback) that we hope will greatly improve the quality of playback for access to our archived content.

What is playback of a web archive?

When we archive the web, just downloading the content is not enough. Data can be copied from the web into an archive in a variety of ways, but making that archive accessible takes more than opening the downloaded files in a web browser. Pages and scripts coming out of the archive need to be presented in a way that enables them to work just like the originals, even though they are no longer located on their original servers. Today’s web users have come to expect interactive features and dynamic layouts on all types of websites. Faithfully reproducing these behaviors in the archive has become an increasingly complex challenge, requiring web archive playback software that keeps pace with the evolution of the web as a whole.
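To give a rough sense of what this involves, here is a minimal Python sketch of one small part of the problem: rewriting the links in an archived page so that they point at archived copies rather than the live web. It is an illustration only, not how pywb or OpenWayback is implemented. The replay address, the URL layout and the function name are all assumptions made for the example, and real playback tools have to handle far more cases (scripts, stylesheets, redirects and so on).

```python
import re
from urllib.parse import urljoin

# Hypothetical replay address; a real archive chooses its own host and path layout.
ARCHIVE_PREFIX = "https://archive.example/replay"

# Matches href="..." or src="..." with double-quoted values only (a simplification).
_LINK_ATTR = re.compile(r'\b(href|src)="([^"]*)"')


def rewrite_links(html, page_url, timestamp):
    """Point href/src attributes at archived copies instead of the live web."""

    def _rewrite(match):
        attr, target = match.group(1), match.group(2)
        # Resolve relative links against the URL the page was captured from.
        absolute = urljoin(page_url, target)
        return f'{attr}="{ARCHIVE_PREFIX}/{timestamp}/{absolute}"'

    return _LINK_ATTR.sub(_rewrite, html)


page = '<a href="/about">About</a> <img src="http://example.org/logo.png">'
rewritten = rewrite_links(page, "http://example.org/index.html", "20180201000000")
print(rewritten)
```

Both the relative link and the absolute image address end up pointing at the (hypothetical) archive, so a browser loading the page would fetch them from the archive rather than from the original server.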

Why change?

Currently, we use the OpenWayback playback system, originally developed by the Internet Archive. In recent years, however, Rhizome have led the development of a new playback engine called pywb (Python Wayback). This Python toolkit for accessing web archives is part of the Webrecorder project, and provides a modern and powerful alternative that is run as an open source project. It has been adopted rapidly: pywb is already used by the Portuguese Web Archive, perma.cc, the UK National Archives, the UK Parliamentary Archive, and a number of others.

Open development

To meet our needs we will have to modify pywb. As strong believers in open source development, we will do all of this work in the open and, wherever appropriate, fold the improvements back into the core pywb project.

If all goes to plan, we expect to contribute the following back to pywb for others to use:

Other UKWA-specific changes, like theming, implementing our Legal Deposit restrictions, and deployment support, will be maintained separately.

Initially we will work with Rhizome to ensure our staff and curators can access our archived material via both pywb and OpenWayback. If the new playback tool performs as expected, we will move towards using pywb to support public access to all our web archives.