UK Web Archive blog


19 October 2021

Clouds and blackberries: how web archives can help us to track the changing meaning of words

By Dr Barbara McGillivray (Turing Fellow), Pierpaolo Basile (Assistant Professor in Computer Science, University of Bari), Dr Marya Bazzi (Turing Fellow), and Dr Jenny Basford and Jason Webber (British Library)

NOTE: This is a re-blog from the Alan Turing Institute, with permission.

The meaning of words changes all the time. Think of the word ‘blackberry’, for example, which has been used for centuries to refer to a fruit. In 1999, a new brand of mobile devices was launched with the name BlackBerry. Suddenly, there was a new way of using this old word. ‘Cloud’ is another example of a well-established word whose association with ‘cloud computing’ only emerged in the past couple of decades. Linguists call this phenomenon ‘semantic change’ and have studied its complex mechanisms for a long time. What has changed in recent years is that we now have access to huge collections of data which can be mined to find these changes automatically. Web archives are a great example of such collections, because they contain a record of the changing content of web pages.

But how can we automatically detect in a huge web archive when a word has changed its meaning? A common strategy is to build geometric representations of words called word embeddings. Word embeddings use lots of data about the context in which words are used so that similar words can be clustered together. We can then do operations on these embeddings, for example to find the words that are closest (and most similar in meaning) to a given word. It’s a useful technique, but building embeddings takes a lot of computing power. Having access to pre-trained embeddings can therefore make a big difference, enabling those in the scientific community without sufficient computational resources to participate in this research.
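To make the idea of "closest words" concrete, here is a minimal sketch in Python. It assumes embeddings are held as a plain dictionary from word to vector; the toy vectors and the function names are invented for illustration and are not taken from any real embedding set.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def nearest_neighbours(word, embeddings, k=3):
    """Return the k words whose vectors are most similar to `word`."""
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical three-dimensional vectors; real embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "blackberry": [0.9, 0.1, 0.0],
    "raspberry": [0.8, 0.2, 0.1],
    "phone": [0.1, 0.9, 0.2],
    "cloud": [0.0, 0.2, 0.9],
}

if __name__ == "__main__":
    for neighbour, score in nearest_neighbours("blackberry", TOY_EMBEDDINGS):
        print(f"{neighbour}: {score:.3f}")
```

With these toy vectors, 'raspberry' comes out as the nearest neighbour of 'blackberry' because the two vectors point in almost the same direction.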

A team of researchers from The Alan Turing Institute and the Universities of Bari, Oxford and Warwick, in collaboration with the UK Web Archive team based at the British Library, has now released DUKweb, a set of large-scale resources that make pre-trained word embeddings freely available. Described in this article, DUKweb was created from the JISC UK Web Domain Dataset (1996-2013), a collection of all .uk websites archived by the Internet Archive between 1996 and 2013. (This dataset is held and maintained by the UK Web Archive, which has been collecting websites since 2005, initially on a selective basis and since 2013 at a whole domain level.) DUKweb contains 1.3 billion word occurrences and two types of word embeddings for each year of the JISC UK Web Domain Dataset. The size of DUKweb is 330GB.

Researchers can use DUKweb to study semantic change in English between 1996 and 2013, looking at, for instance, the effects of the growth of the internet and social media on word meanings. For example, if the word ‘blackberry’ is used mostly to refer to fruits in 1996 and to mobile phones in 2000, the 1996 embedding for this word will be quite different from its 2000 embedding. In this way, we can find words that may have changed meaning in this time period. The figure below (from Tsakalidis et al., 2019) shows four words whose contexts of use have changed in the last couple of decades: ‘blackberry’, ‘cloud’, ‘eta’ and ‘follow’. The bars indicate words most similar to these four words in 2000 (red bars) and in 2013 (blue bars). The scale along the bottom gives a measure of the change.

figure 02 - analysis - clouds, blackberries
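The year-to-year comparison described above can be sketched as follows. This is an illustration only: it assumes each year's embeddings are available as a dictionary from word to vector, and that the two years share a comparable vector space (independently trained embeddings generally need an alignment step before they can be compared). The toy vectors and function names are invented, not taken from DUKweb.

```python
import math


def cosine_distance(u, v):
    """1 minus cosine similarity; values near 0 mean the contexts barely changed."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # no usable vector: report a similarity of zero
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_change(earlier, later):
    """Rank words present in both years by how far their vectors moved."""
    shared = earlier.keys() & later.keys()
    scores = {word: cosine_distance(earlier[word], later[word]) for word in shared}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical two-dimensional vectors for two years.
EMB_2000 = {"blackberry": [0.9, 0.1], "river": [0.2, 0.8]}
EMB_2013 = {"blackberry": [0.3, 0.9], "river": [0.25, 0.8]}

if __name__ == "__main__":
    for word, distance in rank_by_change(EMB_2000, EMB_2013):
        print(f"{word}: {distance:.3f}")
```

Words at the top of the ranking are candidates for semantic change; they still need a human reader to confirm that the shift reflects a new meaning rather than noise in the data.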

The resources that underpin DUKweb are hosted on the British Library’s research repository, and are available for anyone in the world to download, reuse and repurpose for their own projects. This repository is part of the BL’s Shared Research Repository for cultural heritage organisations, which brings together the research outputs produced by participating institutions, and makes them discoverable to anybody with an internet connection. Providing a stable, dedicated location to hold heritage datasets in order to share them with a wider research community has been one of the key drivers in the implementation and development of this repository service. We are grateful to the British Library’s Repository Services team for supporting this collaboration between the UK Web Archive team and the Turing by making the content for DUKweb available.

Read the paper: DUKweb: diachronic word representations from the UK Web Archive corpus

 

11 November 2020

How Remembrance Day has Changed

By Liam Markey, PhD Student, University of Liverpool and the British Library

This blog examines how attitudes to Remembrance (or Armistice) Day have changed and evolved over the course of the 20th century and beyond. Read the previous blog on 'Militarism and its role in the commemoration of British war dead' for background on the wider research project.

100 Years
2020 marks 100 years since the erection of a permanent Cenotaph at Whitehall and the interment of the Unknown Warrior in his tomb at Westminster. Along with the 2-minute silence, which was first observed in 1919, and the adoption of the poppy as the symbol of British commemoration in 1921, these practices have been ever present over the past century; they have become intrinsic components of the British collective identity in what is, arguably, a relatively short period of time.

Alleviating suffering and grief
Initially, this commemoration of the dead of the First World War performed two distinct purposes: firstly, practices served to alleviate the suffering of those who had lost loved ones. The bodies of the fallen were not repatriated, so the erection of monuments extolling the sacrifices of the war dead served as focal points of grief and mourning in local communities. Secondly, Remembrancetide (the time of year in which British rituals of commemoration are enacted) was initially a period in which support for disabled ex-servicemen, and those left widowed or orphaned by the First World War, was to be generated. Through the sale of poppies or direct donations, the British public was able to provide financial support for those in need. Collective mourning, such as at the Cenotaph where the monarchy and politicians gathered, was a demonstration of unity and a national thanksgiving to the war dead.

Attitudes to commemoration are not static
Whilst commemorative practices have remained practically unchanged over the past 100 years (only the day on which they are observed has been altered, and for the duration of the Second World War national services were suspended), the same cannot be said for the historical context in which they have been enacted, nor for the thoughts and ideals of those who enact them.

Newspaper Analysis
Analysing the Daily Mail and Daily Mirror newspapers, I have been able to create a small “pseudo” historiography of British attitudes towards commemoration throughout the 20th Century. The text samples from the two newspapers that I have examined run from the 7th to the 14th of November at ten-year intervals starting in 1928, and each contains at least one mention of the terms “Armistice” or “Remembrance.” I chose this date range and these specific terms to ensure that texts relating to both Armistice Day and Remembrance Sunday were collected and available for analysis. The interval between samples was also a deliberate choice: each text is taken from a year that marks a ten-year anniversary of the Armistice, when, in theory, coverage of the war in the media would be heightened.
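For readers who want to apply similar selection criteria to a digitised newspaper corpus, the rules above can be written down as a short filter. This is only a sketch of the criteria, not the procedure actually used for this research; the function names are invented, and the default end year of 1998 is an assumption based on the last sample discussed below.

```python
import re
from datetime import date

# The two search terms named in the text, matched as whole words in any case.
SEARCH_TERMS = re.compile(r"\b(armistice|remembrance)\b", re.IGNORECASE)


def sample_years(start=1928, end=1998, step=10):
    """Years sampled at ten-year intervals (the end year is an assumption)."""
    return list(range(start, end + 1, step))


def in_sample(published, text, years):
    """True if an article falls in the 7-14 November window of a sampled
    year and mentions at least one of the search terms."""
    in_window = published.month == 11 and 7 <= published.day <= 14
    return (
        published.year in years
        and in_window
        and SEARCH_TERMS.search(text) is not None
    )


if __name__ == "__main__":
    years = sample_years()
    print(years)
    print(in_sample(date(1938, 11, 11), "Armistice Day service at the Cenotaph", years))
```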

1928
The first text sample is taken from 1928, the ten-year anniversary of the signing of the Armistice in 1918, and provides the largest number of texts from any year. This is in most part due to the fact that the First World War was a relatively recent event at this point in time. The main emphasis of these texts is on how the British public can aid those left disabled by their experience of the First World War, either through donations to the British Legion’s poppy appeal or by directly purchasing goods made by ex-servicemen. The issue of ‘lasting peace’ is also brought up several times, with many believing that ten years having passed without another World War proves that the cause so many British soldiers died fighting for was not in vain. At this point in time, when commemoration was in many ways an expression of a commitment to peace, the majority of the British public seemed convinced that it was fulfilling its purpose.

1938
However, by 1938 the mood had shifted considerably. With another conflict looming there is less conviction in proclamations of the First World War having achieved this lasting peace. There is an increase in articles discussing the possibility of another war in the near future and the failings of the last 20 years in maintaining peace. There is a palpable anxiety present in the coverage of both the Mail and Mirror as British society faces the stark realisation that the lasting peace so many died for between 1914 and 1918 is on the verge of dissolution.

1948
By 1948 this anxiety had yet to subside, and despite another recent victory over Germany and her allies there is little celebration or indication that the Second World War had done a better job in achieving peace than the First had done, as too little time had yet passed. This sample provides a much smaller number of texts concerned with commemoration, and I am drawn to Jay Winter’s assertion that societies following the Second World War struggled to make sense of the carnage they had experienced as an explanation as to why this was the case:

The limits of language had been reached; perhaps there was no way adequately to express the hideousness and scale of the cruelties of the 1939-1945 war. (Winter, 1995, p.9)

In the wake of the First World War, commemorative practices were conceived so as to soothe the suffering of the bereaved and to attach value and meaning to the sacrifice of the war dead. The aftermath of the Second World War resulted in a disillusionment with this previous tradition as commemoration hinged on the maintenance of peace. Now it was clear that the ‘peace’ so many died to attain was a fiction, and perhaps the lack of coverage in this text sample is demonstrative of a contextual detachment felt in British society towards the commemoration of war. The overarching theme displayed by this text sample is that of a society disillusioned with the concept of war commemoration, yet perceived slights to tradition, such as “gigglers” at Whitehall, are still harshly condemned. Despite there being no overt celebration of the war dead, or victory in the two World Wars present in either paper, it is clear that the bare minimum of traditional commemorative practices were to still be respected and observed.

1958
The texts from 1958 greatly resemble those of 1928, where it was believed that a sufficient period of time had passed since the ending of the First World War and thus it was acceptable to again assert that lasting peace had been achieved. There are a few texts that discuss this idea of lasting peace, specifically one in the Daily Mail titled What a Difference 27 Years Make, which argues that the contrast between the present and 1931, both being 13 years removed from a World War, proves that society is on the right track to avoiding another global conflict.

Another important focus of texts from this period is the issue of the “200,000,” the last remaining veterans of the First World War, and what is perceived to be a lack of financial support from the government as they enter the later stages of their lives. After 1948, where overt reference to ex-servicemen in the texts was absent, this year’s sample brings them back to the fore, reminiscent once more of 1928’s sample. The difference here, however, is that the texts collected prior to the Second World War focused on ex-servicemen who had been left disabled by their experiences of the First World War. In 1958, media coverage encompasses all ex-servicemen from the First World War because of their age: with 40 years having passed since the Armistice, veterans are now all regarded as vulnerable and in need of assistance from the public, whether or not they were disabled as a result of the war.

1968 and 1978
Both the 1968 and 1978 samples offer an insight into changing attitudes to the First World War in British society. The British mythology of the conflict that is firmly planted in modern popular imagination has its roots in the 1960s and 70s, when a number of influential pieces of media were produced that transformed attitudes to the First World War.

Evident in both text samples is the widening divide between older and younger generations and their attitudes towards the commemoration of war, and wider ideas regarding the relevance of traditional commemorative rituals considering how much time had passed since the Armistice. Both newspapers wrestle with the idea that commemorative practices have become outdated and appeal only to a small minority of the population with personal connections to the First World War, with it being described as “too sentimental” to some. Despite these growing objections, large crowds are still in attendance at remembrance services, many of whom, as the Daily Mirror points out, are young people. These decades depict the future of commemorative tradition as being somewhat in doubt; with the Second World War receding into history, and the First even more so, there is a real feeling in the texts that the commemorative traditions conceived in the wake of the Armistice had started to become outdated.

1988 and 1998
By the late 1980s British interest in commemoration seems to have been reinvigorated, perhaps in no small part due to the Falklands Conflict of 1982, with both the 1988 and 1998 texts bearing a more nationalistic tone than previous samples. With the First World War having all but passed from living memory, emphasis in the texts shifts from the personal stories of those who were directly affected by the conflict towards a more abstract concept of commemoration as an almost celebration of Britishness. Both newspapers in 1988 contain adverts from the British Legion that describe the observance of traditional commemorative practices as a “National Debt,” and especially in the Daily Mail there is a vast increase in articles containing inflammatory and accusatory language directed at those who are not 100% committed to participation. In 1998, meanwhile, the question of whether today’s youth are willing to die for their nation is repeated numerous times throughout Remembrancetide in the Daily Mail.

21st Century
Leading into the 21st Century there is a sense that the initial meaning behind commemoration, which sought to provide support for those mourning the deaths of loved ones, has become outdated now that lived experience of the First World War has passed from the British population. There is a real danger that the language and symbols that vindicated the sacrifice of the war-dead in the wake of the conflict are more likely to inspire militaristic notions in the present day.

Poppies in a field

Summary
While brief, I hope this piece has demonstrated to some degree the fluid nature of British attitudes to commemoration in the 20th Century, and how these attitudes are somewhat representative of wider historical and social change. As my research moves forward it will be most interesting to see the relationship between ‘micro’ discourses and those disseminated by the British media.

Resources such as the UK Web Archive will prove invaluable in exploring these ‘bottom up’ approaches to commemoration, asking how language and symbols popularised in the wake of the First World War, such as the Remembrance Poppy, are reproduced within amateur online remembrance projects and how this usage potentially relates to issues such as nationalism and militarism. Often, mainstream representations of Remembrance focus on the unifying nature of commemoration, and it will be interesting to see whether analysis of materials produced by the average British citizen challenges or confirms this narrative.

UKWA First World War centenary collection - 900+ archived websites (or pages).

04 November 2020

Curating culturally themed collections online: The Russia in the UK Collection, UK Web Archive

By Hannah Connell, Collaborative PhD Student, King’s College London; British Library

Title slide from Hannah's presentation with a London Underground map in Russian

 

I spoke about my position as a curator for the Russia in the UK curated collection as part of the recent Engaging with Web Archives conference (EWA), which was held online on the 21st and 22nd of September 2020. This conference reflected the breadth of the web archiving community, bringing together researchers, librarians, curators and web archiving teams from many different countries.

As always, it was inspiring to participate in such a welcoming event. Even online, the conference retained the collaborative atmosphere which has marked my experience of research in web archiving, allowing new researchers to interact with more experienced practitioners and encouraging questions and conversations between researchers, users and archivists.

The researcher-curated collection, Russia in the UK, is part of the UK Web Archive (UKWA). I was particularly pleased to have had the opportunity to present this curated collection, a resource on the Russian-speaking community in the UK, which was first started in November 2017. Such collections play an important role in making the wide range of material preserved in the UKWA more visible to researchers.

Curators are important to the preservation work of the UKWA. Curated collections are collected manually by curators and researchers with specialist knowledge in their field. The role of a curator in creating a UKWA collection involves identifying relevant websites to be included in a collection, and recording the metadata for these websites, including the translation and transliteration of titles and descriptions in other languages.

This collection is valuable both as a resource for further research, and as a means of questioning research practices. It is not possible to capture everything on the web, and collection curators ensure that a representative sample of websites is selected for each thematic collection. The practice of creating and maintaining a collection such as Russia in the UK ultimately influences the shape of the collection and the online representation of the diasporic community it will come to reflect. As such, it is important for researchers and users to understand the decisions taken by curators in selecting and capturing websites.

My paper for EWA focused on the creation of a curation guide for curators of new curated collections. This draws on the ongoing process of curating the Russia in the UK collection, both documenting the provenance of this special collection and reflecting on this process as a model for future collections.

In documenting the creation of this collection, I hope to enable future researchers to explore and contribute to this record of the online activity of the Russian diaspora in the UK, and to question and develop the curatorial and research practices behind the curation of collections.

You can watch Hannah Connell’s presentation on the EWA YouTube channel.

 

02 November 2020

Digital archaeology in the web of links: reconstructing a late-90s web sphere

By Dr. Peter Webster, Independent Scholar, Historian and Consultant

Fiber cables for the internet

 

The historian of the late 1990s has a problem. The vast bulk of content from the period is no longer on the live web; there are few, if any, indications of what has been lost – no inventory of the 1990s web against which to check. Of the content that was captured by the Internet Archive (more or less the only archive of the Anglophone web of the period), only a superficial layer is exposed to full-text search, and the bulk may only be retrieved by a search for the URL. We do not know what was never archived, and in the archive it is difficult to find what we might want, since there is no means of knowing the URL of a lost resource. Sometimes we need, then, to understand the archived web using only the technical data about itself that it can be made to disclose.

Niels Brügger has defined a web sphere as ‘web material … related to a topic, a theme, an event or a geographic area’.  My paper at the EWA conference presents a method of reconstructing a web sphere, much of which is lost from the live web and exists only in the Internet Archive: the web estate of the many conservative Christian campaign groups in the UK in the 1990s and early 2000s.

This method of web sphere reconstruction is based not on page content but on the relationships between sites, i.e., the web of hyperlinks. The method is iterative, and tracks back and forth between big data and small. Individual archived pages and directories, printed sources, the scholarly record itself, and even traces of previous unsuccessful attempts at web archiving come into play, as does a large dataset held by the British Library. From the more than 2 billion lines in the UK Host Link Graph dataset it is possible to extract the outlines of this particular web sphere.
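One step of this back-and-forth between big data and small can be sketched in Python. The example makes a loud assumption about the file layout: it treats each line of the link graph as `year|source host|target host|link count`, which should be checked against the dataset's own documentation before use. The host names and the function name are invented for illustration.

```python
def linked_hosts(lines, known_hosts, sep="|"):
    """Stream over link-graph lines and collect hosts that link to, or are
    linked from, any host already known to belong to the web sphere.

    Returns a dict mapping each candidate host to its total link count.
    The four-field line layout is an assumption, not a documented format.
    """
    candidates = {}
    for line in lines:
        parts = line.strip().split(sep)
        if len(parts) != 4:
            continue  # skip lines that do not match the assumed layout
        _year, source, target, count = parts
        if source in known_hosts and target not in known_hosts:
            candidates[target] = candidates.get(target, 0) + int(count)
        elif target in known_hosts and source not in known_hosts:
            candidates[source] = candidates.get(source, 0) + int(count)
    return candidates


# Hypothetical sample lines in the assumed layout.
SAMPLE_LINES = [
    "1998|campaign-a.example|campaign-b.example|12",
    "1999|news.example|campaign-a.example|3",
    "1999|unrelated.example|shop.example|40",
    "2000|campaign-a.example|campaign-b.example|5",
]

if __name__ == "__main__":
    seeds = {"campaign-a.example"}
    for host, total in sorted(linked_hosts(SAMPLE_LINES, seeds).items()):
        print(host, total)
```

Because the function reads one line at a time, it can be pointed at an open file rather than a list. Each candidate host would then be inspected by hand in the archive before being added to the set of known hosts for the next pass.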

You can watch Peter Webster’s presentation on his website peterwebster.me

 

Previous studies using a similar method are: 

Webster, Peter. 2019. Lessons from cross-border religion in the Northern Irish web sphere: understanding the limitations of the ccTLD as a proxy for the national web. In: The Historical Web and Digital Humanities: the Case of National Web Domains, eds Niels Brügger & Ditte Laursen, 110-23. London: Routledge. http://dx.doi.org/10.17613/yms5-9v95

Webster, Peter. 2017. Religious discourse in the archived web: Rowan Williams, archbishop of Canterbury, and the sharia law controversy of 2008. In: The Web as History, eds Niels Brügger & Ralph Schroeder, 190-203. London: UCL Press. (Available Open Access at: https://www.uclpress.co.uk/products/84010)

 

19 October 2020

Exploring media events with Shine

By Caio Mello, Doctoral Researcher at the School of Advanced Study, University of London

Computer screen with some HTML code on the screen

This blogpost is a summary of the presentation I delivered with my colleague Daniela Major at the conference Engaging with Web Archives: ‘Opportunities, Challenges and Potentialities’ in September 2020. The presentation was entitled ‘Tracking and analysing media events through web archives’.

My research explores the media coverage of the Olympic Games in a cross-cultural, cross-lingual and temporal perspective. I am especially interested in comparing how the concept of 'Olympic legacy' has been approached by the Brazilian and British media considering different locations, languages and social-political contexts. I have written a bit about this before on the UK Web Archive blog in December 2019 and March 2020.

Because of its controversial nature, the term Olympic legacy is used in a variety of contexts and has multiple meanings. Given its narrative importance in legitimizing the multi-billion investments that cities make to host these events, the main objective of this study is to explore and define the concept of Olympic legacy and how it changes over time.

Here, however, I will focus on my experience of doing a secondment at the British Library with the UK Web Archive team. I have explored the potential of using the platform Shine to track news articles on Olympic legacy.

Why Shine?

Shine is a tool to explore .uk websites archived by the Internet Archive between 1996 and April 2013. While a big part of the content of the UK Web Archive can only be accessed from inside the British Library, Shine is open access and provides us with search results and URL data that can be easier to manage.

We have developed a pipeline based on 5 steps: searching, extraction, cleaning, filtering and visualisation. To extract information, we scraped the data using Python notebooks, looking at specific newspapers (like The Guardian) and broadcast websites (like the BBC) using the keyword “Olympic legacy”. Having searched for URLs in Shine and extracted the results, the main challenge is cleaning. Cleaning consists of removing all the information around the main text, such as images, adverts, menus and links. After extracting just the body text of the articles, we saw that many of them did not mention Olympic legacy, because Shine usually returns results where the search terms appear in peripheral parts of the webpage. With the documents we needed in hand, we had to verify whether their content was relevant to our analysis. Sometimes the term Olympic legacy appears but is not necessarily related to the Rio and London Olympics, or is not the main topic of the article. The process of filtering demanded a huge effort of close reading to identify contexts. Finally, we produced some charts to visualise word trends and topics that pop up around legacy. Although the Shine search results are limited in terms of time (it only searches up to 2013), it has been very useful as an exploratory tool for conducting preliminary analysis on a small scale, and for building web archive and web scraping methods before applying them to huge amounts of text elsewhere.
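The cleaning step and the first, automatic part of filtering could look something like the sketch below. It is not the code from our notebooks: it uses only Python's standard library, the list of tags treated as peripheral is an illustrative assumption, and the class and function names are invented for this example.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as peripheral (an illustrative choice).
SKIPPED_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class BodyTextExtractor(HTMLParser):
    """Collect page text while ignoring anything inside peripheral tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def clean_html(html):
    """Return the text outside menus, scripts and similar page furniture."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def mentions_keyword(html, keyword="olympic legacy"):
    """True only if the keyword survives cleaning, i.e. appears in the main text."""
    return keyword.lower() in clean_html(html).lower()


# Two hypothetical pages: the keyword is in a menu link in the first, in the body in the second.
PAGE_MENU_ONLY = (
    '<html><body><nav><a href="/x">Olympic legacy</a></nav>'
    "<p>Transfer news from the weekend.</p></body></html>"
)
PAGE_BODY = "<html><body><p>The Olympic legacy of London 2012 is debated.</p></body></html>"

if __name__ == "__main__":
    print(mentions_keyword(PAGE_MENU_ONLY), mentions_keyword(PAGE_BODY))
```

A check like this only removes the obvious false positives. Deciding whether an article is really about the legacy of the Rio or London Games still takes close reading.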

You can watch Caio de Castro Mello Santos & Daniela Cotta de Azevedo Major’s presentation on the EWA YouTube Channel.

*This project has received funding from the European Union’s Horizon 2020 research and innovation programme. For more information: cleopatra-project.eu.

 

14 October 2020

Engaging with Web Archives - Conference Report

By Jason Webber, Web Archive Engagement Manager, The British Library

 

Engaging with Web Archives conference banner

 

Is it possible to have a successful conference when you can no longer meet in person? Going exclusively online doesn’t seem to have stopped the ‘Engaging with Web Archives’ (EWA) Conference from being a superb experience. Co-Chairs of the event are Sharon Healy and Michael Kurzmeier, PhD students at Maynooth University.

Originally planned as a more traditional, in-person conference in April 2020, the event was re-planned by the EWA team as a completely online event on 21 and 22 September 2020. It is notable that this was the first web archiving conference in Ireland. Most talks were pre-recorded, which meant that questions could be posed in the chat box and were often answered live by the presenter during the talk. This can be a significant advantage of pre-recorded talks.

The programme was packed with high quality presentations from many areas of web archiving but here I’ll highlight a few that were UK Web Archive (UKWA) projects or used UKWA data. 

 

Highlights

 

A keynote talk was delivered by Professor Jane Winters (School of Advanced Study, University of London): Web archives as sites of collaboration. Jane has worked with the UK Web Archive extensively over many years and is one of only a few Professors in the UK training and promoting web archives to students. Jane's talk (link to YouTube).

 

Sara Day Thomson (University of Edinburgh): Developing a Web Archiving Strategy for the Covid-19 Collecting Initiative at the University of Edinburgh. Sara formerly worked for the Digital Preservation Coalition (DPC), led a ‘Web Archiving Task Force’ and more recently has been building important collections on Covid-19 with the University of Edinburgh in partnership with UKWA. Sara's talk (link to YouTube).

 

Dr. Brendan Power (The Library of Trinity College Dublin): Leveraging the UK Web Archive in an Irish context: Challenges and Opportunities. With Trinity College Dublin being a UK Legal Deposit Library we try and work together as much as possible and this talk highlights what is possible with specific mention of the Easter Rising collection. Brendan's talk (link to YouTube).

 

Robert McNicol (Kenneth Ritchie Wimbledon Library): The UK Web Archive and Wimbledon: A Winning Combination. We try to represent as many aspects of UK life as possible including sport. This also highlights our cooperation with other libraries and archives. See the Tennis collection. Robert's talk (link to YouTube).

 

Dr. Peter Webster (Independent Scholar, Historian and Consultant): Digital archaeology in the web of links: reconstructing a late-90s web sphere. Peter has conducted several pieces of research utilising the UKWA secondary datasets. These are free and available for download. Peter's talk (link to YouTube).

 

Helena Byrne (Curator of Web Archiving, British Library): From the sidelines to the archived web: What are the most annoying football phrases in the UK? Helena is a curator in the UK Web Archive but also has a keen interest in sport and women’s football in particular. Here, Helena shows how the Trends feature (graphs) in our SHINE service can help guide research in an easy and accessible way. Helena's talk (link to YouTube).

 

Caio de Castro Mello Santos & Daniela Cotta de Azevedo Major (School of Advanced Study, University of London): Tracking and Analysing Media Events through Web Archives. Caio was on a PhD student placement with UKWA as part of the Cleopatra project. Read about some of his work on this blog on Olympic legacy. Caio and Daniela's talk (link to YouTube).

 

Hannah Connell (King’s College London; British Library): Curating culturally themed collections online: The Russia in the UK Special Collection, UK Web Archive. Hannah has worked extensively on collecting for one of the several diaspora community collections. In addition to Russia in the UK, there are London French and Latin America UK. Hannah's talk (link to YouTube).

 

Dr. Jessica Ogden (University of Southampton) & Emily Maemura (University of Toronto): A tale of two web archives: Challenges of engaging web archival infrastructures for research. Jessica has also worked previously with UKWA on a PhD placement looking at the challenges researchers face when using web archives. This vital work helps guide our planning for the future. Jessica and Emily's talk (link to YouTube).

 

Dr. Olga Holownia (International Internet Preservation Consortium): IIPC: training, research, and outreach activities. Olga works full time for the IIPC but has been based within the UK Web Archive team at the British Library. We have been delighted to have worked with and been supported by the IIPC since it began (The British Library is a founding member).

 

Rosita Murchan (Public Record Office of Northern Ireland): PRONI Web Archive: A Collaborative Approach. PRONI maintains their own web archive but also collaborates with the UK Web Archive in collecting material specific to Northern Ireland. This is important as there currently is no Legal Deposit partner in Northern Ireland. Rosita’s talk (link to YouTube).

 

Summary

Whilst it is a shame not to meet people in person, this conference has shown me how online conferences can be a viable way forward. I’m very much looking forward to the next one.

 

See all of the pre-recorded talks on the EWA conference YouTube channel. You can find Engaging with Web Archives on Twitter and catch up on the conference discussion with the hashtag #EWAvirtual.

 

Look out for more in-depth blog posts from EWA conference speakers over the coming weeks on the UK Web Archive blog.

 

30 September 2020

National Sporting Heritage Day 2020

By Helena Byrne, Curator of Web Archives at the British Library

Women playing soccer

 

The 30th September is National Sporting Heritage Day in the UK and to celebrate we will give you a quick overview of the UK Web Archive (UKWA) sporting activities in 2020. UKWA is made up of the six UK Legal Deposit Libraries: the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

Sport is a subject that shapes and reflects society. As more publications about sport become online-only, preserving this cultural record through web archiving becomes paramount. To mark the occasion in 2018, we published a blog post outlining the UKWA sports collection policies.

We have three collections that focus on sport that are actively curated throughout the year:

  1. Sports Collection
  2. Sport: Football 
  3. Sports: International Events

 

International Internet Preservation Consortium (IIPC)

As individual institutions, the British Library and the National Library of Scotland are members of the International Internet Preservation Consortium (IIPC) and have worked on building collaborative collections covering international events such as the Summer and Winter Olympic/Paralympic Games. 2020 marks ten years of building IIPC Olympic/Paralympic web archive collections. Since the formation of the IIPC Content Development Group (CDG) in 2015, there has been a consolidated effort to build collections both on and off the playing field. All of the IIPC collections are open access. The CDG planned to build a collection on the Tokyo 2020 Games, but due to the coronavirus pandemic the Games were rescheduled for 2021, and so was the CDG's dedicated collection. However, some content around the 2020 event was included in the Novel Coronavirus (COVID-19) collection, and there will be updates made to the National Olympic and Paralympic Committees collection this year.

 

Documenting the Olympics and Paralympics

Even though Tokyo 2020 was postponed until 2021, the symposium Documenting the Olympics & Paralympics, which was originally planned as a full-day face-to-face event, went ahead online. This was a collaboration between the web archive team based at the British Library, the International Centre for Sports History and Culture (ICSHC) at De Montfort University, and the British Society of Sports History (BSSH).

A broad mix of physical, digitised and born-digital resources was covered in the presentations. You can listen back to an audio recording of this symposium on the Sport in History Podcast. The full abstracts and some of the PowerPoint slides are available on the British Library Research Repository.

 

Engaging with Web Archives Conference

The Engaging with Web Archives conference brought together practitioners and web archive researchers from around the world. There were three presentations on the programme that focused on UK Web Archive sports collections. 

  1. Robert McNicol (Librarian, Kenneth Ritchie Wimbledon Library) discussed the collaboration on developing the Tennis section of the UK Web Archive Sports Collection. 
  2. Helena Byrne (Curator of Web Archives, British Library) looked at tracing the popularity of annoying football phrases on the archived .uk web space from 1996 to 2013.
  3. Caio de Castro Mello Santos & Daniela Cotta de Azevedo Major (PhD students, School of Advanced Study, University of London) used the London 2012 and Rio 2016 Olympic Games as a case study to analyse media events through the UK Web Archive. 

A series of blog posts about the Engaging with Web Archives conference will be coming out in the next few weeks on the UK Web Archive blog.

 

Accessing the UK Web Archive

Under the Non-Print Legal Deposit Regulations 2013, we can archive UK-published websites, but we are only able to make the archived version available to people outside the Legal Deposit Libraries' Reading Rooms if the website owner has given permission.

 

Some of the websites in UKWA for which permission has already been granted include Heritage Quay, Pride Sports UK and WheelPower. Examples of websites that are available onsite only include Fans Supporting Food Banks, Barnsley Yorkshire: Tour de France and The Women's Open.

 

As the content of UKWA has mixed access, the message ‘Viewable only on Library premises’ will appear under the title of the website if you need to visit a Legal Deposit Library to view the content. If there is no message underneath, the archived version of the website should be available on your personal device.

 

Get involved with preserving sports online with the UK Web Archive

We can’t curate the whole of the UK web on our own. We need your help to ensure that information, discussion and creative output related to sport are preserved for future generations. Anyone can suggest UK-published websites to be included in the UK Web Archive by filling in our nominations form: https://www.webarchive.org.uk/en/ukwa/nominate

 

17 September 2020

Arnhem75 - a special collection of websites added to the UK Web Archive

 

By Marja Kingma, Curator of Germanic Collections, the British Library.

 

Book cover of 75 Years Battle of Arnhem by Laurens van Aggelen

 

Introduction

The idea of creating a collection of websites about the commemoration of Arnhem75 came to RAF Museum historian Harry Raffal and me while we were attending the seminar ‘The Arnhem Spirit - 75 years of Brits in Arnhem’ on 15 May 2019, organised by the Dutch Embassy in London. The event was part of a programme in which the Netherlands, Britain and other former Allied countries commemorated Operation Market Garden, the code name for the battle for the bridge across the Rhine at Arnhem that took place in September 1944. Allied forces consisted of British, American and Polish troops, with help from the Dutch resistance.

The Battle of Arnhem 1944 is of great significance to the UK and interest in it remains strong on both sides of the North Sea.

We wanted to create a lasting memory of these events and a special collection in the UK Web Archive on the subject seemed like a good idea.

 

What is included?

We kept the scope of the project quite narrow: only websites with a focus on the commemorations that took place in Britain and the Netherlands in 2019 are included, with the exception of some websites that deal with the historical facts of the Battle to give the collection some context.

So far over 150 individual websites within the UK web domain have been identified, of which 64 were selected to go into the collection. To count as part of the UK web domain, a site must either have .uk in its domain name or, if it does not, be hosted in the UK or be owned by a UK organisation or an individual with a postal address in the UK.

Some of the websites selected for this collection include the 23 Parachute Field Ambulance, Airborne at the Bridge and Arnhem Oosterbeek War Cemetery.

 

How can you access these archived websites?

Under the Non-Print Legal Deposit Regulations 2013, we can archive UK websites, but we are only able to make them available to people outside the UK Legal Deposit Libraries' reading rooms if the website owner has given permission. The UK Legal Deposit Libraries are the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

For this collection you can view what has been selected through the UK Web Archive website but will need to visit a UK Legal Deposit Library reading room to view the archived content. The reading rooms across the Legal Deposit Libraries are starting to reopen now, with some restrictions, as you can read in this blog: https://blogs.bl.uk/webarchive/2020/09/ukwa-available-in-reading-rooms-again.html

 

How can I get involved?

You can help expand this collection by sending us a URL you think may be eligible for inclusion in the Arnhem75 collection. Please go to https://www.webarchive.org.uk/en/ukwa/info/nominate to nominate a website and we’ll take it from there.

Occasionally, websites from non-UK domains can be included if they have a strong link to the UK and the website owners have given their permission to be included in the collection. Dutch organisations that were involved in the Arnhem75 commemorations are encouraged to get in touch.

We look forward to your suggestions!

 
