UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites


19 June 2015

RESAW Conference – showcasing research of the historical web


RESAW, a self-organising initiative aimed at building a pan-European research infrastructure for the study of web archives, has been active for a couple of years now.  A group of active researchers from Europe and North America have gathered around this network. They met last week in Aarhus, Denmark and presented their work at the RESAW Conference, entitled “Web Archives as Scholarly Sources: Issues, Practices and Perspectives”. The diverse approaches and findings I witnessed reflect the increasing awareness and understanding of the characteristics of archived web material, and the development of appropriate research methods to study it.

Packed programme

The conference had a packed programme, with parallel sessions running on all three days. In addition to 3 plenary sessions, it included 10 long papers, 12 short papers, 4 themed panels and 1 workshop. The format was refreshing and worked really well in bringing forward different perspectives: presentations were kept strictly to time, and each paper received structured comments, followed by questions and discussion with the audience. The only downside was having to make hard choices about where to go when several interesting papers were on at the same time.


Meghan Dougherty of Loyola University, one of the very first researchers working with web archives, called for a more inclusive approach to archiving the web in her opening keynote. Instead of preserving the web as a series of linked documents, the focus should be on its rich complexity as new media, including interactions, expectations, and how people live through and experience it. We otherwise risk excluding many features of today’s live web experience which will be valuable for future researchers. Recognising the lack of good methodology for studying the historical web, Meghan observed the relevance of archaeological methods and practices: they can help recover, document and analyse a record of information culture through virtual digging, taking into account the invisible and missing elements in the process. She also asked archivists to reach out to researchers, and researchers to collaborate more, so that specialist knowledge and skills can be joined up.


Aarhus University is also the home of the Danish State and University Library, which has been collaborating with the Royal Danish Library to archive the Danish Web since 2005. The conference coincided with the ten-year anniversary of the National Danish Web Archive, which now contains 600TB of data. Ditte Laursen and Per Møldrup-Dalum presented an overview of the Danish Web and shared the various legal, curatorial, technical and access challenges the Archive had to face and address. A key one is the identification and collection of Danish material hosted on non-.dk domains, a challenge shared by many national web archives. After an initial focus on comprehensive data collection, access and use are now high on the agenda. The Archive launched full-text search on the anniversary and there are exciting plans to actively develop data mining and analytics, and to strengthen collaboration with researchers.


It is no surprise that historians were among the first to use web archives to study contemporary history. There was a strong presence of historians at the Conference, exploring diverse aspects of the historical web. Ian Milligan of the University of Waterloo used the GeoCities Web Archive to explore the nature of virtual communities, highlighting the technical challenges and how critical overcoming these is to the historiography of the early web. Peter Webster studied British creationism in the historical UK Web Archive by analysing the creationist web estate and high-level patterning of host-to-host linkage, concluding that British creationism was not only marginalised but also mostly ignored by academia, the media and the churches. Sophie Gebeil presented a history of North African immigration memories through the French Web archives. A number of the researchers attached to the Big UK Domain Data for the Arts and Humanities project also presented their work, sharing their methodological frustrations and highlighting the challenges large-scale web archives face in supporting qualitative research.


Media scholars, social scientists, computer scientists and music and literature scholars are also using web archives. It is encouraging to see how aspects of the web other than “text” were explored by researchers, including software, programming languages, social networks and the earlier Bulletin Board Systems. Anne Helmond showed how to make use of the social media code snippets embedded in the archived source code to issue API calls to social network platforms and obtain the embedded content (currently not collected by web archives). Anat Ben-David presented an impressive effort in understanding and recovering the former .yu ccTLD, which has now disappeared from the web entirely. In both cases, I think there are things web archives can do to remove reliance on social networks and to surface content related to all expired ccTLDs.


There was so much inspiring research, covering all aspects of the web, in ways we had not envisaged. Those interested in finding out more should follow the Storified tweets, put together by the Institute of Historical Research (IHR), University of London, which was also a co-organiser of the Conference. This wealth of real research is a significant difference from the past, when providers of web archives had to speculate about possible use scenarios. I do not think we are short of use cases now. The RESAW conference has given us much evidence and food for thought. The next step is to collate and synthesise these, extract high-level requirements from them, and use those requirements to guide our development of tools and services.


As a proud co-organiser of the Conference, I was delighted to see work produced by the British Library Web Archiving Team being used by researchers and other web archives. We should however bear in mind that it is too early to settle on fixed methods of using web archives. We must try different approaches and continue to explore and experiment in order to move forward.

The absolute highlight of the Conference was the announcement of the next RESAW Conference in London, to be sponsored and organised by the IHR. Hopefully RESAW will become an ongoing platform for showcasing more research on the historical web, carried out by more researchers, including those from less privileged countries and regions.


Helen Hockx-Yu, Head of Web Archiving

