THE BRITISH LIBRARY

UK Web Archive blog

2 posts from February 2014

19 February 2014

Jorge Luis Borges and Twitter

[A guest post from writer and Museum Studies tutor Rebecca Reynolds]

When I first heard that the British Library was archiving every webpage with a .uk domain name, I immediately thought of Borges's short story 'Funes the Memorious', about a man who can forget nothing. 'I have more memories in myself alone than all men have had since the world was a world', Funes says; 'my memory, Sir, is like a garbage disposal'.

I spoke to Helen Hockx-Yu, Head of Web Archiving at the British Library, about this, focusing on Twitter pages. Will ephemera in such quantities be truly useful to researchers of the future?

Helen commented that this was up to researchers to decide but was clear that as many webpages as possible needed to be kept. 'When you research a person's life, or history, you don't have everything - you piece it together,' she said. 'Hopefully what we're doing would form part of those pieces.' She gave as an example Antony Gormley's 2009 art project 'One and Other', in which members of the public took turns to stand on the fourth plinth in Trafalgar Square and say whatever they wanted. The website recording these people is no longer available but is in the UK Web Archive. For some websites, Helen said, 'being ephemeral is exactly their significance'.

And what about privacy? Would you like researchers of the future poring over one of your ill-considered blog posts or tweets? Webpages can be withdrawn only under certain circumstances such as defamation or breaches of confidentiality. Helen's advice here was simply to be careful what you put in the public domain.

I also spoke to Jonathan Fryer, Liberal Democrat Euro-candidate for London, two of whose Twitter pages have been put in a UK Web Archive collection devoted to blogs and bloggers. He thought archiving Twitter feeds was a good idea: 'Twitter has taken over from letters and other forms of exchange of information and ideas. Forms of communication such as blogs and Twitter need to be kept instead.'

Back to Borges's story. The narrator doubts that Funes can think, despite his prodigious memory: 'To think is to forget a difference, to generalise, to abstract. In the overly replete world of Funes there were nothing but details, almost contiguous details.' Perhaps the Twittersphere is another 'overly replete' world. In any case, here are some 'contiguous details' from Jonathan Fryer's Twitter page in the archive. Which, if any, do you think might be worth keeping?

Just purged 8 American floozies from my followers. How do they get to latch onto one like limpets?

David Cameron is 'very relaxed' about Andy Coulson and allegations of bugging and blagging. He shouldn't be.

Went to see 'Bruno'; a real curate's egg, but two or three brilliant scenes.

Jonathan Fryer's Twitter page will appear in a book I am currently working on, exploring unusual museum objects from around the UK, using interviews with people from inside and outside museums. Other ephemera in the book are a 19th-century leaflet, from Reading University's Centre for Ephemera Studies, advertising a live mermaid, and toilet paper from The Land of Lost Content museum in Shropshire.

Rebecca Reynolds (Twitter: @rebrey)

07 February 2014

New research project: Big UK Domain Data for the Arts and Humanities

We are delighted to have been awarded Arts and Humanities Research Council funding for a new research project, ‘Big UK Domain Data for the Arts and Humanities’. The project, one of 21 to be funded as part of the AHRC’s Big Data Projects call, is led by the Institute of Historical Research (University of London), in collaboration with ourselves at the British Library, the Oxford Internet Institute and Aarhus University.

Here are some details, from the project blog:

"The project aims to transform the way in which researchers in the arts and humanities engage with the archived web, focusing on data derived from the UK web domain crawl for the period 1996-2013. Web archives are an increasingly important resource for arts and humanities researchers, yet we have neither the expertise nor the tools to use them effectively. Both the data itself, totalling approximately 65 terabytes and constituting many billions of words, and the process of collection are poorly understood, and it is possible only to draw the broadest of conclusions from current analysis.

"A key objective of the project will be to develop a theoretical and methodological framework within which to study this data, which will be applicable to the much larger on-going UK domain crawl, as well as in other national contexts. Researchers will work with developers at the British Library to co-produce tools which will support their requirements, testing different methods and approaches. In addition, a major study of the history of UK web space from 1996 to 2013 will be complemented by a series of small research projects from a range of disciplines, for example contemporary history, literature, gender studies and material culture.