29 May 2015

Beginner’s Guide to Web Archives Part 1

When I arrived at the British Library as an intern, one of the tasks laid out before me was to create and curate a special collection for the UK web archive. To some readers of this blog this activity may seem fairly self-explanatory. However, before arriving at the library I had never even heard of web archiving, let alone considered why we do it and who it could be useful for. In a short series of blog posts I will explore these questions from the novice’s point of view, both my own and that of academic researchers hoping to use the resource. I hope to convey the new user’s perceptions of the challenges and opportunities of the archive, and to provide an introduction for interested beginners.

Spiders spinning furiously

The web is a vast resource. In 2008 Google had found 10^12 (one trillion) URLs online. It has been suggested that the web represents a rapid expansion in human knowledge. Certainly it enables greater access to human knowledge for billions of people. It is also a place where a huge range of opinions is openly expressed. However, the content of the web has a very rapid turnover, with around 40% of websites changing their content within a week. Without web archiving (the practice of collecting and storing websites), many human writings are inevitably - often accidentally - lost.

The UK web archive now collects almost the entire UK web-space. One of the problems facing users of the archive is the astounding amount of data through which to sift. One way of getting around this problem is to create so-called ‘special collections’, groups of websites that fall under a particular theme. This enables the curator to provide the user with a set of data that is easier to sort and search.  

My special collection


As a science PhD student, I felt my special collection should be built with the aim of answering research questions related to a scientific topic. I specialise in oceanography and past climate changes, and I am aware of the almost constant debate that occurs on hundreds of climate-related websites about climate science, the social impacts of climate change and the policies that should be enforced. A special collection on these issues might be useful for answering questions such as: How has the web influenced public opinion on climate change? As new science rolls in, how do viewpoints expressed on the web change? How do different organisations use the web as a platform for promoting their beliefs?

Global warming in perspective

To provide a resource for answering these questions I plan to select webpages from organisations including environmental charities, climate sceptic think-tanks, energy companies and government, along with further pages of blogs, articles and discussion. I hope that this collection will become a useful resource for anyone interested in the climate change issue. But would this resource be something researchers might actually use? And how might they go about using it? Find out in my next post.

Peter Spooner, Science Policy Intern