UK Web Archive blog

68 posts categorized "Collections"

26 July 2022

Web archiving the UEFA Women’s Euros in Sheffield

By Dr Justine Reilly, Strategic Director, Sporting Heritage

Four different photos of handmade football flags. There are six flags in total. The image is from a partnership event Sporting Heritage hosted with Sheffield Museums. The event was held on Monday 25 July 2022 at the Museum. There were four different sessions where children came together to make football flags.

The Heritage Fund awarded £500,000 to a programme which is recording the hidden history of women’s football and has launched a celebration of the game, its players, and communities in partnership with the UEFA Women’s EUROs.

Alongside this programme, the UK Web Archive is archiving UK-published websites about the tournament. In this guest blog post, we hear from Dr Justine Reilly of Sporting Heritage, who supported host city Sheffield with its contribution to the collection.

Sporting Heritage
Sporting Heritage is a UK-wide organisation that works to support the preservation, collection, access and celebration of the sporting past. Whether that be objects and archives, photographs and videos, oral histories or songs and chants, our role is to support all those who have a sporting story. We deliver a range of activities and events, for example training events, National Sporting Heritage Day, and the Sporting Heritage Awards.

What did you collect for your museum/archive while working on this project?
We supported the host city of Sheffield by developing a number of different programmes including:

How did you collect your archive material?
We reached out to local sports clubs and organisations with links to football across Sheffield to inform both our exhibition and our wider activity. This included a social media campaign to draw in voices which have previously been ignored and hidden in the story of women’s football in Sheffield.

We continue to capture new stories via our web pages, and have worked closely with partner organisations such as Football Unites, Racism Divides (FURD) and academic researchers Dr Fiona Skillen and Dr Gary James to inform our programming. Our aim was to draw on online content, cross-reference historical facts, and hear from voices of lived experience which may not previously have been part of the historical record.

What websites are important for telling the story of the UEFA Women’s Euro England 2022 tournament in your area?
The overarching web pages linking to our heritage content around the Women’s Euro in Sheffield:

The linked pages hosted by the City of Sheffield:

And FURD pages outlining their work on the physical exhibition plinths and supporting activity:

The archived versions of these web pages can be found in the Cultural Programme subsection of the UEFA Women’s Euro England 2022 collection on the UK Web Archive website.

Get Involved
Browse through the UEFA Women’s Euro England 2022 collection and let us know if there is any UK-published content that should be added. Anyone can suggest UK-published websites to be included in the UK Web Archive by filling in our nomination form:

Birmingham 2022 Commonwealth Games

By Helena Byrne, Curator of Web Archives, The British Library

A screenshot of the Commonwealth Games logo used in an article by Sport England on their website. The article was archived by the UK Web Archive on 4/20/2022, 4:44:51 AM. You can view the article here:

The Birmingham 2022 Commonwealth Games are taking place from July 28th to August 8th. There is also an extensive cultural programme running alongside the event until the end of September 2022.

The first Commonwealth Games was held in 1930 and the 2022 event is the twenty-second edition of the competition. This is the sixth time that Britain has hosted the Commonwealth Games: Scotland has hosted it three times and, including Birmingham 2022, England has hosted it three times. However, this is only the second time that Britain has hosted the event since the formation of the UK Web Archive in 2005.

Sport collecting in the web archive
In late 2017, the UK Web Archive started to formally curate sports websites by establishing three main collections on sport. They are the Sports Collection, Sports: Football and Sports: International Events. The final collection in this series, Sports: International Events, documents major sporting events mostly hosted in the UK. It is in this collection that the Commonwealth Games Glasgow 2014 and the Commonwealth Games Birmingham 2022 collections sit.

You can view the Glasgow 2014 collection here: 

You can view the Birmingham 2022 collection here:

The Birmingham 2022 collection overview
We’ve broken this collection down into six areas:

  • Competitors: Athletes' websites and social media collected during the Games
  • Cultural Programme: Any websites and social media accounts related to the cultural programme during the Games
  • Organisational Bodies/Venues: UK national Commonwealth Games bodies' sites, local government sites etc.
  • Press Media and Comment: News and comment, including the Commonwealth Games, interest groups and others
  • Sponsors: UK Websites and news articles relating to some of the official sponsors of the Games
  • Sports: The Sports subsection has twenty subsections. All governing body websites and club websites related to these sports and the Commonwealth Games will be tagged under their relevant sport

Get involved 
The UK Web Archive works across the six UK Legal Deposit Libraries and with other external partners to try and bridge gaps in our subject expertise. But we can’t curate the whole of the UK web on our own. We need your help to ensure that information, discussions and creative output related to the Commonwealth Games Birmingham 2022 are preserved for future generations.

Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nomination form.

14 July 2022

Web Archiving the UEFA Women’s Euro England 2022 tournament in Northern Ireland

By Rosita Murchan, Web Archivist, Public Record Office of Northern Ireland (PRONI)

Black and white photo of a female footballer in a black and white striped shirt, in the motion of keeping the ball up
Thanks to the Deputy Keeper of the Records, Public Record Office of Northern Ireland and the Northern Ireland Women’s Football Association for the photo

The Public Record Office of Northern Ireland (PRONI) is the official archive of Northern Ireland and is situated in the historic Titanic Quarter in Belfast. PRONI was established by the Public Records Act (Northern Ireland) in 1923, which means that in June next year we look forward to celebrating our centenary. PRONI has been collecting websites for over ten years, focusing on Government departments, local councils and websites deemed historically or culturally important to Northern Ireland. Over the years our collection has grown in both size and scope, and we now capture one terabyte of data per year. PRONI does not have legal deposit status, so working with the UK Web Archive enables us to widen the scope of our collections and ensure that other relevant content is captured.

PRONI has a rich history of celebrating women in sport, having previously curated ‘A Level Playing field – Women in sport’, an exhibition drawn from the archives held by PRONI. With images from the late nineteenth century onwards, this exhibition reminds us that women have a long history of participation in a wide range of sporting activities. PRONI also holds the papers of the Northern Ireland Women’s Football Association, which include official minutes and documents, as well as scrapbooks, programmes, newspaper clippings and other ephemera (PRONI Reference: D4633).

We are delighted to be working in partnership once again with the British Library and adding a Northern Irish perspective to their UEFA Women’s Euro England 2022 collection.

The Northern Ireland team has defied the odds to book their place in this summer’s tournament, and PRONI’s collaboration with the British Library will enable us to capture web content documenting the progress of the players who are set to make history for Northern Ireland this summer.

We plan to select as much of the news and media coverage as we can, capturing the local views, hype and excitement of Northern Ireland’s historic qualification for the Euros. We will also select content from the Northern Ireland women’s official home page within the IFA (NI Women's Football), which details all fixtures, news, team profiles and updates throughout the tournament. We will also include social media content about the tournament, Twitter feeds of organisations and team members, and general social media coverage of the competition.

In recent years, PRONI has developed a number of creative and digital engagement projects that put the public at the heart of archives, making archives more welcoming and inclusive. We plan to use our social media channels to put out a call to PRONI followers for site nominations, but anyone can suggest UK-published websites to be included in the UK Web Archive by filling in our nomination form:


05 July 2022

What to expect on the UK Web Archive blog during UEFA Women’s Euro England 2022

By Helena Byrne, Curator of Web Archives, British Library

The UEFA Women's Euro 2022 competition is taking place across England from July 6 to July 31, 2022. We are collecting websites about the UEFA Women’s Euro 2022 from around the UK.

You can view the UEFA Women’s Euro England 2022 collection here:

a blue banner image with the British Library, Inspired by England 2022, the National Football Museum and the UK Web Archive. A female football player kicking a ball and the text, Can you help us preserve football history? We are collecting websites about the UEFA Women’s EURO 2022. Nominate a website for us to archive QR code and link to the nomination form:

Over the next few weeks there will be a number of guest blog posts from the UK Web Archive and collaborators from around the UK. 

First up, we will have a blog post from the National Library of Scotland and the National Library of Wales. Neither Scotland nor Wales qualified for this edition of the tournament, but as part of the UK Web Archive, both national libraries will be contributing to the collection and ensuring that any fan events taking place are preserved. 

From 18 July there will be a number of blog posts published each week in July. There will be a guest blog post from the Public Record Office of Northern Ireland (PRONI), who will be contributing a range of content from Northern Ireland. The team from Northern Ireland made history by qualifying for their first UEFA Women’s Euro tournament.

There will be a series of blog posts from the tournament’s Arts and Heritage partners in the host cities. There were three specially commissioned projects to celebrate the rich history of women’s football and its players and to encourage more people to be inspired by the tournament. These blog posts will also include updates from across the UEFA Women’s Euro England 2022 host cities. These blog posts will give a summary of their local cultural programme activities, as well as an overview of what websites they nominated to the collection that are important for telling the story of the UEFA Women’s Euro England 2022 tournament in their area.

The final blog post in the series will be published in late September. It will reflect on the collection activities and give an overview of some personal favourites from the curator of the web archive collection, Helena Byrne.

Get involved 
Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nomination form: 

29 June 2022

What content should I nominate on the UEFA Women’s Euro to the UK Web Archive?

By Helena Byrne, Curator of Web Archives, British Library

a blue banner image with the UK Web Archive, British Library, Inspired by England 2022 and the National Football Museum. A female football player kicking a ball and the text, Can you help us preserve football history? We are collecting websites about the UEFA Women’s EURO 2022. Nominate a website for us to archive:

The UEFA Women's Euro 2022 competition is taking place across England from July 6 to July 31, 2022. We are collecting websites about the 2022 UEFA Women’s EURO from around the UK. You can view the collection here: 

This blog post runs through some examples of the type of content you might like to nominate to the collection. 

We archive websites:

  1. That are on a .uk or other UK geographic top-level domain such as .scot or .cymru.
  2. That are published in the UK.

We do not archive:

  1. Online sound or video platforms in which audio-visual material is the predominant content.
  2. Private intranets and emails.
  3. Personal data in social networking sites or websites only available to restricted groups.
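The first rule lends itself to a quick automated check. The following is a minimal Python sketch, not the UK Web Archive's actual tooling: the function name `on_uk_tld` and the `UK_TLDS` tuple are illustrative assumptions, and the tuple lists only the three domains named above rather than every UK geographic domain. The second rule (published in the UK) cannot be decided from the URL alone, so the sketch leaves it to manual review.

```python
from urllib.parse import urlsplit

# Assumption: only the three domains named in the post, not a complete list.
UK_TLDS = ("uk", "scot", "cymru")


def on_uk_tld(url: str) -> bool:
    """Return True if the URL's host ends in one of the domains in UK_TLDS."""
    # hostname is None for URLs without a scheme, so fall back to "".
    host = (urlsplit(url).hostname or "").rstrip(".")
    # The last dot-separated label of the host is its top-level domain.
    return host.rsplit(".", 1)[-1].lower() in UK_TLDS


if __name__ == "__main__":
    for candidate in (
        "https://www.example.co.uk/euro2022",
        "https://example.cymru/",
        "https://example.com/",
    ):
        verdict = "in scope by domain" if on_uk_tld(candidate) else "needs manual review"
        print(candidate, "->", verdict)
```

A URL that fails this check is not necessarily out of scope. It may still be published in the UK, which is why nominations for such sites are reviewed by hand.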

We archive as much openly available online content as we can identify as being published in the UK. Archiving is carried out through a mix of automated processes, such as an annual domain crawl, and manual selection by the UK Web Archive teams, as well as through the public nomination form.

UEFA Women’s Euro England 2022
For the UEFA Women’s Euro England 2022 we want content that specifically refers to the tournament. Some websites might only have a subsection or even just one page dedicated to the tournament so you can nominate that specific URL. 

We add the following type of web content to the collection:

  1. Full website
  2. Subsection of a website
  3. Individual page from a website
  4. Event page
  5. Twitter accounts

Unfortunately, due to technical challenges, the only social media content we can successfully archive is Twitter. If you know of any high-profile Twitter accounts (not the personal accounts of ordinary people), then please nominate them.

Examples of some website content we have added so far include:

Full website
Have you seen any new websites set up just for the UEFA Women’s Euro 2022 tournament? Most websites will, at most, just have a dedicated subsection or page for the tournament. Some websites such as the official sponsor, Visa, highlight the tournament on their home page in the run-up to and during the tournament. This is why we have added the whole website to the collection, as it is easy for the user to navigate from the home page of the archived website during the tournament to the dedicated section for the tournament. 

Subsection of a website
The FA website has a subsection dedicated to UEFA Women’s Euro 2022. The earliest captures of this subsection are from July 2020 which you can view here: 

A screenshot of the UEFA Women’s Euro 2022 subsection of the FA website from July 26 2020. The text reads Women’s Euro set for 2022. The UEFA Women’s Euro 2021 in England is postponed until the summer of 2022

Link to archived website: 

Individual page from a website
In some cases there is just one page on a website relevant to the collection subject. When thinking about women’s football, the Royal Philharmonic Orchestra (RPO) doesn’t always come top of the list of potential websites. However, they have partnered with the FA to ‘engage fans in a range of musical opportunities and public events celebrating the history, ethos and future of women’s football’. What other websites have you seen that have posted an article about the UEFA Women’s Euro 2022 tournament? 

You can listen back to the archived versions of the anthems on the RPO website here: 

Event page
There are lots of events going on around the UEFA Women’s Euro 2022. These range from official events and fan-led events to venues organising their own events such as talks, book launches or watch parties for the matches. Eventbrite is one of the most popular platforms for ticketing these events, but have you seen any other platforms or websites?

A search on Eventbrite for Euro 2022 in the United Kingdom on the day of writing comes back with 500 pages

Twitter accounts
Archived copies of Twitter accounts are only accessible through a reading room, but you can view what we have selected here:

We have already added the Twitter accounts of the players for England, Northern Ireland and other players based in the UK. However, we may have missed some, so please let us know through the nomination form.

Get involved 
Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nomination form.

30 May 2022

What UKWA did at the IIPC Web Archive Conference 2022

By Jason Webber, Engagement Manager, The British Library

Between 18 and 25 May 2022, we had the biggest annual event in the world of web archiving: the IIPC General Assembly and Web Archive Conference. Some of the sessions were for members only, but many were free and open for anyone to attend.

IIPC conference banner

Here are the UKWA staff and research partners who gave presentations at the conference with links to their pre-recorded talks that have been uploaded to our YouTube channel.



23 May 2022

Building Event Collections from Web Archives

By Sara Abdollahi, PhD student, L3S Research Center

The world frequently experiences events such as terrorist attacks, Brexit, and the migrant crisis, which have resulted in a vast amount of event-centric information on the web. Researchers, particularly digital humanities researchers and social scientists who analyse the significant events that influence and shape our societies, can benefit from web archives that reflect the perception of events as they happened at the time.

The Research challenge
Web archiving services provide a preserved state of the web that facilitates its study in the future. The ever-growing structure of web archives is one of the main challenges in accessing information for specific research. It is often difficult or even impossible for researchers to find the documents they require. Typically, web archives offer interfaces for users to access the information they need through keyword search. Researchers can then type the name of the event they are interested in and retrieve a list of web documents containing that keyword. The returned results are often overwhelming due to their quantity, potential redundancy, and irrelevance, so an additional, intensive cleaning phase is needed to arrive at the relevant web documents.

The UK Web Archive (UKWA), as well as some other web archives, offers manually curated event-centric collections to address this issue, but these can be considerably time-consuming to create. More importantly, these collections might not cover all necessary information related to a specific event.

A Potential Solution
To address this challenge, I propose automatically building event collections from web archives using knowledge graphs. Knowledge graphs such as Wikidata and DBpedia are collections of interlinked real-world entities and concepts.

In this research, I utilise the EventKG knowledge graph which provides structured information about events, their characteristics, and relationships (e.g., sub-events) and can thus be used as a resource for extending and diversifying the search space when building event collections.

Take the Arab Spring as an example: the Tunisian Revolution, the Bahraini protests of 2011, and the 2011 Yemeni revolution are three of its sub-events. The figure below demonstrates an example of using EventKG to create event collections for the Arab Spring.

Building Event collections diagram

By utilising sub-events to expand the initial user query, a more diverse initial set of documents can be retrieved. This process leads to increased precision and coverage of the final event collection. Traditional methods might miss documents related to sub-events if those documents make no mention of the main event. To advance such methods, I demonstrate the impact of event-centric features and relations from a knowledge graph on building event collections.
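The effect of query expansion can be illustrated with a small Python sketch. This is a toy illustration under stated assumptions, not the method used in the research: the `SUB_EVENTS` dictionary is a hypothetical stand-in for a knowledge-graph lookup (it does not call EventKG), the three sample documents are invented, and simple substring matching stands in for a real retrieval system.

```python
# Hypothetical stand-in for a knowledge-graph lookup. The entries are the
# three sub-events of the Arab Spring named in the post.
SUB_EVENTS = {
    "Arab Spring": [
        "Tunisian Revolution",
        "Bahraini protests of 2011",
        "2011 Yemeni revolution",
    ],
}


def expand_query(event: str) -> list[str]:
    """Return the event name followed by any known sub-event names."""
    return [event] + SUB_EVENTS.get(event, [])


def retrieve(documents: list[str], terms: list[str]) -> list[str]:
    """Return documents containing at least one term (case-insensitive)."""
    lowered = [term.lower() for term in terms]
    return [doc for doc in documents if any(t in doc.lower() for t in lowered)]


# Invented sample documents, for illustration only.
docs = [
    "Analysis: what the Arab Spring means for the region",
    "Eyewitness accounts of the Tunisian Revolution",
    "Football results from the weekend",
]

plain = retrieve(docs, ["Arab Spring"])
expanded = retrieve(docs, expand_query("Arab Spring"))

if __name__ == "__main__":
    print(len(plain), "document(s) without expansion")
    print(len(expanded), "document(s) with sub-event expansion")
```

In this toy data, the second document never mentions the main event by name, so only the expanded query finds it.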

Sara is giving a presentation of this project at IIPC Web Archive Conference 2022 (session 15) - Register for free.

12 November 2021

Welsh language websites within the UK Web Archive

By Aled Betts, Acquisitions Librarian and Web Archivist, National Library of Wales

The National Library of Wales has been collecting Welsh language websites to archive for the UK Web Archive since 2004. In 2018, we decided to collate these websites and include them in a dedicated Collection in order to make them more accessible to researchers.

Significantly, 2018 was an important milestone for the Welsh language, as it was 25 years since the passing of the Welsh Language Act in 1993, which gives effect to the principle that in the conduct of public business in Wales, the English and Welsh languages should be treated ‘on the basis of equality’. It was also 10 years since the passing of the Welsh Language (Wales) Measure 2011, which gave the Welsh language official status in Wales. Government and public bodies have since observed the principle that the Welsh language should not be treated less favourably than English. As a result, the Welsh language is clearly visible and widespread on the web, as many websites are now bilingual by law.

However, the aim of the Welsh Language Collection was not simply to list websites that were published through the medium of Welsh. The focus was more on those websites and organisations whose aim was to promote and facilitate the use of the Welsh language in all walks of life. The Collection also covers websites relating to Welsh language communities, online and physical, where Welsh is the medium of communication. It also looks at bodies that promote Welsh umbrella organisations as well as groups that campaign and lobby for the language. Furthermore, because we have been collecting Welsh language websites since 2004, we were able to showcase many of these websites and show how much they had changed over the last 17 years!

Here is just a small sample of the type of websites covered in the Welsh Language Collection.

Advocacy, campaigning and lobbying
Much of the work promoting the Welsh language across Wales is done by Mentrau Iaith (English: Language Initiatives). These are community-based organisations that operate to raise the profile of the Welsh language in a specific area. The percentage of Welsh speakers varies considerably from area to area. For instance, the highest percentage of Welsh speakers can be found in Gwynedd (64%) and the lowest in Blaenau Gwent (8%), so the challenges in each area differ. In order to capture this important work, we also archived their Twitter feeds. These feeds show us how these initiatives are promoting the Welsh language in their respective areas. Furthermore, the Menter Iaith (English: Language initiative) umbrella body website is one of the earliest sites we captured, a site we first archived in 2006.


Mentrau Iaith (English: Language initiative) website in 2021

Mentrau Iaith website

Mentrau Iaith (English: Language initiative) website in 2006

Over the last two decades, we have seen bodies and organisations evolve and grow, and some disappear. A statutory body set up under the Welsh Language Act 1993 was Bwrdd yr Iaith Gymraeg (English: Welsh Language Board). The board was responsible for administering the Welsh Language Act and for seeing that public bodies in Wales kept to its terms. The Welsh Language Board was abolished in 2012 and, following the passing of the 2011 Welsh Language (Wales) Measure, powers were transferred to the Welsh Government and the Welsh Language Commissioner, a new body promoting and facilitating the use of the Welsh language. Fortunately, we have captured this transfer of power, as we have been archiving the Welsh Language Board website since 2008 and the Welsh Language Commissioner website since 2012. In both cases, open access has been granted.


Bwrdd yr Iaith Gymraeg (English: Welsh Language Board) website in 2008


Comisiynydd y Gymraeg (English: Welsh Language Commissioner) website in 2021

Arts and Culture
The Welsh language has a lively and vibrant arts, music and literature scene. This is best exemplified by the Eisteddfod Genedlaethol (English: National Eisteddfod) and Urdd Gobaith Cymru, the Welsh language national voluntary youth organisation, which runs the Urdd Eisteddfod, arguably Europe's largest youth festival. Both sites have been archived since the early 2000s. The National Eisteddfod is held in a different location each year, alternating between north and south Wales, so naturally the content changes every year. The first National Eisteddfod we archived was Eisteddfod Genedlaethol Cymru Casnewydd a’r Cylch (English: National Eisteddfod of Wales Newport and surrounding area) in 2004 and our first Urdd National Eisteddfod was Eisteddfod yr Urdd Sir Ddinbych (English: Urdd Eisteddfod Denbighshire) in 2006! Again, open access has been granted, so both are available to view anywhere.


The Eisteddfod Genedlaethol Cymru Casnewydd a’r Cylch (English: National Eisteddfod of Wales Newport and surrounding area) 2004


Urdd Eisteddfod Denbighshire 2006

Alongside the all-important bodies, we archive a plethora of arts and culture websites, from record labels to folk groups, theatrical bodies, local eisteddfodau and Welsh language festivals. The same goes for the buoyant Welsh literature and publishing scene, with close to a hundred websites listed within our ‘literature and publishing’ sub-section.

Education and Learning
An all-important sub-section is Education and learning. Here two types of websites dominate. The first is education and learning through the medium of Welsh. Here we archive Welsh-medium education websites, from Mudiad Meithrin (English: Nursery Movement), formed in 1971 to nurture early-years Welsh speakers, to Coleg Cymraeg Cenedlaethol (English: Welsh National College), formed in 2011 to develop Welsh-language courses and resources for Higher Education students.


Coleg Cymraeg Cenedlaethol (English: Welsh National College) website in 2011

Secondly, the web has seen an explosion of language learning websites globally. This is also apparent for the Welsh language, allowing those wishing to learn it as a second language to do so through the internet.


SaySomethinginWelsh website in 2011

As of 2021, the collection has between 500 and 600 websites and is still growing. It is a significant collection, as many of the websites have been collected since the early days of web archiving in 2004. The principle of equality has been an underlying theme in Welsh language discourse, and legislation was passed to meet this demand. The Collection explores how promoting and supporting the Welsh language has changed over the past 20 years, but also shows how legislation has helped shape this change.