UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites

5 posts from September 2020

30 September 2020

National Sporting Heritage Day 2020

By Helena Byrne, Curator of Web Archives at the British Library

Women playing soccer

 

30 September is National Sporting Heritage Day in the UK, and to celebrate we are giving a quick overview of the UK Web Archive (UKWA) sporting activities in 2020. UKWA is a partnership of the six UK Legal Deposit Libraries: the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

Sport is a subject that shapes and reflects society. As more publications about sport become online-only, preserving this cultural record through web archiving becomes paramount. To mark the occasion in 2018, we published a blog post outlining the UKWA sports collection policies.

We have three collections that focus on sport that are actively curated throughout the year:

  1. Sports Collection
  2. Sport: Football 
  3. Sports: International Events

 

International Internet Preservation Consortium (IIPC)

As individual institutions, the British Library and the National Library of Scotland are members of the International Internet Preservation Consortium (IIPC) and have worked on building collaborative collections covering international events such as the Summer and Winter Olympic/Paralympic Games. 2020 marks ten years of building IIPC Olympic/Paralympic web archive collections. Since the formation of the IIPC Content Development Group (CDG) in 2015, there has been a consolidated effort to build collections covering activity both on and off the playing field. All of the IIPC collections are open access. The CDG planned to build a collection on the Tokyo 2020 Games. However, due to the coronavirus pandemic, the Games were rescheduled for 2021, and so was the CDG's dedicated collection. Some content around the 2020 event was nevertheless included in the Novel Coronavirus (COVID-19) collection, and there will be updates made to the National Olympic and Paralympic Committees collection this year.

 

Documenting the Olympics and Paralympics

Even though Tokyo 2020 was postponed until 2021, the symposium Documenting the Olympics & Paralympics, which was supposed to be a full-day, face-to-face event, went online. This was a collaboration between the web archive team based at the British Library, the International Centre for Sports History and Culture (ICSHC) at De Montfort University, and the British Society of Sports History (BSSH).

A broad mix of physical, digitised and born-digital resources was covered in the presentations. You can listen back to an audio recording of this symposium on the Sport in History Podcast. The full abstracts and some of the PowerPoint slides are available on the British Library Research Repository.

 

Engaging with Web Archives Conference

The Engaging with Web Archives conference brought together practitioners and web archive researchers from around the world. There were three presentations on the programme that focused on UK Web Archive sports collections. 

  1. Robert McNicol (Librarian, Kenneth Ritchie Wimbledon Library) discussed the collaboration on developing the Tennis section of the UK Web Archive Sports Collection. 
  2. Helena Byrne (Curator of Web Archives, British Library) looked at tracing the popularity of annoying football phrases on the archived .uk web space from 1996 to 2013. 
  3. Caio de Castro Mello Santos & Daniela Cotta de Azevedo Major (PhD students, School of Advanced Study, University of London) used the London 2012 and Rio 2016 Olympic Games as a case study to analyse media events through the UK Web Archive. 

A series of blog posts about the Engaging with Web Archives conference will be coming out in the next few weeks on the UK Web Archive blog.

 

Accessing the UK Web Archive

Under the Non-Print Legal Deposit Regulations 2013, we can archive UK published websites, but we are only able to make the archived version available to people outside the Legal Deposit Libraries Reading Rooms if the website owner has given permission.

 

Some of the websites in UKWA that have already had permission granted include Heritage Quay, Pride Sports UK and WheelPower. Some examples of websites that are onsite-only access include Fans Supporting Food Banks, Barnsley Yorkshire: Tour de France and The Women's Open.

 

As the content of UKWA has mixed access, the message ‘Viewable only on Library premises’ will appear under the title of the website if you need to visit a Legal Deposit Library to view the content. If there is no message underneath then the archived version of the website should be available on your personal device.

 

Get involved with preserving sports online with the UK Web Archive

We can’t curate the whole of the UK web on our own. We need your help to ensure that information, discussion and creative output related to sport are preserved for future generations. Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nominations form: https://www.webarchive.org.uk/en/ukwa/nominate 

 

25 September 2020

The World of Food and the UK Web Archive

 

By Helena Byrne, Curator of Web Archives at the British Library

 

A variety of food

 

Food is a subject that transcends culture, politics and leisure practices. Food has therefore been a key part of the UK Web Archive (UKWA) since it was established in 2005.

 

Recipes, restaurant menus, food blogs and online reviews are just the start of the food-related online material that UKWA collects. Even protest and campaigning can be food related. For instance, this summer footballer Marcus Rashford highlighted the issue of child poverty and the lack of access to food, especially during the school holidays.

 

For the last three years the British Library has been running a series of events around food. Due to the coronavirus pandemic, this year's Food Season moved online with a series of talks over the autumn period. 

 

The Food Season celebrates the British Library’s extensive food-related collections and explores the politics, pleasures and history of food. UKWA, which is a partnership of the six UK Legal Deposit Libraries, including the British Library, also has an extensive collection of food related websites. 

 

Food collections

In 2017, the Food Archive collection was established. This collection covers the following topics:

There are currently 333 websites or web pages in this collection. Some of the websites selected include Eat Like a Girl, the Good Grub Club and the Veggies Catering Campaign. Why not have a browse through the collection and nominate your favourite UK published food sites or restaurant websites to be included in the collection? Anyone can nominate a website by following this link: https://www.webarchive.org.uk/en/ukwa/info/nominate 

 

Even though there is a dedicated collection about food, it also features as a subsection in a number of other collections. ‘Food and Drink’ is a subsection in both the Festivals and Online Enthusiast Communities in the UK collections. In addition, individual food websites appear in several other collections. Websites related to food activism appear in both the Political Action and Communication collection and the (soccer) fan subsection of the Sport: Football Collection, as numerous supporters' clubs have organised to support their local food banks.

 

Social media is a very popular way to share food and micro-reviews of eateries. However, it is often challenging for us to archive. At present, Twitter is the only social media platform that we archive on a regular basis, but these captures are by no means comprehensive. We have experimented with other methods of archiving social media, but only on a selective basis.

 

How can you access these archived websites?

Under the Non-Print Legal Deposit Regulations 2013, we can archive UK published websites, but we are only able to make the archived version available to people outside the Legal Deposit Libraries Reading Rooms if the website owner has given permission. The UK Legal Deposit Libraries are the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

 

Some of the websites in UKWA that have already had permission granted include the Cake Fest Edinburgh, the Lancashire Pork Pie Appreciation Society and the Food Research Collaboration. Some examples of websites that are onsite-only access include the Biscuit Appreciation Society, the UK Menu Archive and Fans Supporting Food Banks.

 

As the content of UKWA has mixed access, the message ‘Viewable only on Library premises’ will appear under the title of the website if you need to visit a Legal Deposit Library to view the content. If there is no message underneath then the archived version of the website should be available on your personal device.

Due to the coronavirus pandemic, the reading rooms were closed for a number of weeks but are starting to reopen. This blog post gives an overview of opening hours and how to book a visit at the six UK Legal Deposit Libraries:

https://blogs.bl.uk/webarchive/2020/09/ukwa-available-in-reading-rooms-again.html 

 

We would especially like to see more food and drink nominations that reflect the multicultural nature of the UK and the many diaspora communities based here. Browse through what we have so far and please nominate more content here:

https://www.webarchive.org.uk/en/ukwa/info/nominate 

 

17 September 2020

Arnhem75 - a special collection of websites added to the UK Web Archive

 

By Marja Kingma, Curator of Germanic Collections, the British Library.

 

Book cover of 75 Years Battle of Arnhem by Laurens van Aggelen

 

Introduction

The idea to create a collection of websites about the commemoration of Arnhem75 came to RAF Museum historian Harry Raffal and me whilst attending the seminar ‘The Arnhem Spirit - 75 years of Brits in Arnhem’ on 15 May 2019, organised by the Dutch Embassy in London. The event was part of a programme in which the Netherlands, Britain and other former Allied countries commemorated Operation Market Garden, the code name for the battle for the bridge across the Rhine at Arnhem that took place in September 1944. Allied forces consisted of British, American and Polish troops, with help from the Dutch resistance.

The Battle of Arnhem 1944 is of great significance to the UK and interest in it remains strong on both sides of the North Sea.

We wanted to create a lasting memory of these events and a special collection in the UK Web Archive on the subject seemed like a good idea.

 

What is included?

We kept the scope of the project quite narrow; only websites with a focus on the commemorations that took place in Britain and the Netherlands in 2019 are included, with the exception of some websites that deal with the historic facts regarding the Battle to give it some context.

So far, over 150 individual websites within the UK web domain have been identified, of which 64 were selected to go into the collection. These sites are limited to the UK web domain: they either have .uk in their domain name or, if they don’t, must be hosted in the UK or owned by UK organisations or individuals with a postal address in the UK.

Some of the websites selected for this collection include the 23 Parachute Field Ambulance, Airborne at the Bridge and Arnhem Oosterbeek War Cemetery.

 

How can you access these archived websites?

Under the Non-Print Legal Deposit Regulations 2013, we can archive UK websites, but we are only able to make them available to people outside the UK Legal Deposit Libraries reading rooms if the website owner has given permission. The UK Legal Deposit Libraries are the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

For this collection you can view what has been selected through the UK Web Archive website but will need to visit a UK Legal Deposit Library reading room to view the archived content. The reading rooms across the Legal Deposit Libraries are starting to reopen now, with some restrictions, as you can read in this blog: https://blogs.bl.uk/webarchive/2020/09/ukwa-available-in-reading-rooms-again.html

 

How Can I Get Involved?

You can help expand this collection by sending us a URL you think may be eligible for inclusion in the collection Arnhem75. Please go to https://www.webarchive.org.uk/en/ukwa/info/nominate to nominate a website and we’ll take it from there.

Occasionally, websites from non-UK domains can be included if they have a strong link to the UK and the website owners have given their permission to be included in the collection. Dutch organisations that were involved in the Arnhem75 commemorations are encouraged to get in touch.

We look forward to your suggestions!

 

10 September 2020

Launching the UK Web Archive 2020 Annual Domain Crawl

By Helena Byrne, Curator of Web Archives at the British Library

Today (10 September 2020) the UK Web Archive team will be pushing the big red button to kickstart the annual Domain Crawl of the UK webspace. The current coronavirus pandemic will no doubt feature strongly in this year’s crawl. This will complement the curated collection that the web archive teams across the UK Legal Deposit Libraries are contributing to. The British Library and the National Library of Scotland are also selecting websites for the International Internet Preservation Consortium (IIPC) Content Development Group (CDG) Novel Coronavirus (COVID-19) collection.

What we collect

The UK Web Archive has been archiving UK published websites on a selective basis since 2005 and in 2020 is celebrating #15YearsOfUKWA. Domain Crawl 2020 is the seventh that has taken place. It was not until the Non-Print Legal Deposit (NPLD) Regulations were implemented in April 2013 that we were able to run a broad crawl over the UK webspace. The crawl covers anything with a .uk or other UK geographic Top Level Domain (TLD), such as .scot, .cymru or .london. It also includes websites on other TLDs that have been registered in the UK or that have been manually selected.
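
As a rough illustration of the scope rules described above, the check could be sketched in Python along the following lines. This is only a sketch and not the crawler's actual logic: the function name, the two host sets and the short TLD list are assumptions made for the example, and the real list of UK geographic TLDs is longer than the ones named in this post.

```python
from urllib.parse import urlsplit

# TLDs named in the post. Assumption: the real scope list contains more entries.
UK_TLDS = {"uk", "scot", "cymru", "london"}


def in_domain_crawl_scope(url, uk_registered_hosts=frozenset(),
                          manually_selected_hosts=frozenset()):
    """Return True if the URL's host matches the scope rules sketched here.

    uk_registered_hosts and manually_selected_hosts are hypothetical lookup
    sets standing in for whatever registration data and curator selections
    are actually used.
    """
    # urlsplit only finds a hostname when the URL carries a scheme (https://...).
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    if not host:
        return False

    # Rule 1: .uk or another UK geographic TLD.
    tld = host.rsplit(".", 1)[-1]
    if tld in UK_TLDS:
        return True

    # Rule 2: other TLDs, if registered in the UK or manually selected.
    return host in uk_registered_hosts or host in manually_selected_hosts
```

For example, `in_domain_crawl_scope("https://www.bl.uk/")` is in scope because of its TLD, while a `.com` address is only in scope if its host appears in one of the two sets.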

NPLD came into effect on 6 April 2013, and the British Library hosted a special event to launch the first Domain Crawl. This was widely covered in the national press, and you can still watch a short video from the event on The Guardian website.

How much data is collected in the Domain Crawl?

The Domain Crawl usually runs for three months of the year and each year starts at a different time of year to avoid seasonal biases. Roughly 5-10 million hosts (websites) are archived every year. However, the amount of data collected varies from year to year, and the way the data is collected and stored also changes over time. We compress the data we store, and as technology develops the amount of data that can be compressed into one terabyte changes. Last year 63.7 TB of compressed data was collected, bringing the total collected during Domain Crawls from 2013 to 2019 to 477.62 TB.

UKWA Domain Crawl 2013-2019

When can I view this content?

Due to the enormous amount of data collected each year from the annual Domain Crawl and our Frequent Crawls, there is a significant lag between when content is archived and when it is made available through the UK Web Archive website. The Frequent Crawl data collected from 2013 to 2019 was 250.34 TB, bringing the combined total to 727.96 TB of compressed data. To make searching content easier, the website allows you to search across all the Selectively Crawled content from 2005 to 2013 as well as the Frequent Crawl content from 2013 to 2017 and the Domain Crawl content from 2013 to 2015.
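
The totals quoted in this post can be checked with a few lines of Python. This is just a worked sum using the figures given above; the variable names are illustrative, and `Decimal` is used so that the two-decimal-place figures add up exactly.

```python
from decimal import Decimal

# Figures quoted in this post, in terabytes of compressed data.
domain_crawl_2019_tb = Decimal("63.7")
domain_crawl_2013_2019_tb = Decimal("477.62")
frequent_crawl_2013_2019_tb = Decimal("250.34")

# Domain Crawl plus Frequent Crawl data, 2013 to 2019.
combined_total_tb = domain_crawl_2013_2019_tb + frequent_crawl_2013_2019_tb
print(combined_total_tb)  # 727.96

# Domain Crawl data collected before the 2019 crawl.
domain_crawl_2013_2018_tb = domain_crawl_2013_2019_tb - domain_crawl_2019_tb
print(domain_crawl_2013_2018_tb)  # 413.92

# Share of all Domain Crawl data that came from the 2019 crawl alone.
share_2019_percent = domain_crawl_2019_tb / domain_crawl_2013_2019_tb * 100
print(round(share_2019_percent, 1))
```

The sum confirms the combined total of 727.96 TB, and the last line shows that the 2019 crawl alone accounts for a little over an eighth of all Domain Crawl data collected so far.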

Under the Non-Print Legal Deposit (NPLD) Regulations 2013, we can archive all UK published websites, but we are only able to make them available to people outside the Legal Deposit Libraries Reading Rooms if the website owner has given permission.

Due to the NPLD Regulations, access to the archived content is a mix of open and onsite access. The ‘Viewable only on Library premises’ message on individual records indicates that you have to visit one of the six UK Legal Deposit Libraries to view the content. The UK Legal Deposit Libraries are the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

Follow the UK Web Archive on Twitter for the latest updates on the domain crawl and other web archiving activities! 



08 September 2020

UKWA available in reading rooms again

By Jason Webber, Web Archive Engagement Manager, The British Library

Much of the UK Web Archive content is only available in the reading rooms of UK Legal Deposit Libraries, as current legislation regulates access. All libraries were closed for many months during the COVID-19 lockdown. However, a phased reopening has begun.

Below is some basic information on what access is currently available at the Legal Deposit Libraries, with links to more detail. Note that opening times were correct at the time of publishing this article; library websites should be checked for current opening times.

British Library

www.bl.uk/visit/opening-hours

London, St Pancras

Tuesday – Saturday 11.00 – 15.00

Boston Spa

Tuesday – Friday 11.00 – 15.00

You’ll need to pre-book online for whatever you would like to see at the Library. At the moment you can pre-book:

National Library of Scotland

www.nls.uk/using-the-library/opening-hours

Edinburgh reading rooms

Our Edinburgh reading rooms have reopened to existing and new library card holders, on a pre-booked basis only, with revised opening hours. Readers must book and preorder items 24 hours in advance.

General Reading Room and Special Collections Reading Room:

Tuesday-Saturday, 10.00-16.00

Kelvin Hall

We anticipate that the Library at Kelvin Hall in Glasgow, will reopen around mid-September.

National Library of Wales

www.library.wales/visit/before-your-visit/opening-times

The Reading Room is open to the public with a restricted service. You will have to book your place online before your visit. For more details on this and to read strict guidelines regarding the nature of the restricted service and what is expected of you go to Guidelines on re-opening.

(Reading Rooms only)

Monday - Friday: 10:00-12.30 and 13.30-16.00

Saturday: Closed

Bodleian Library

www.bodleian.ox.ac.uk/using/reading-rooms

The Bodleian Libraries have begun a phased reopening to staff, students and Bodleian Reader Card holders.

To help us keep you safe, and make sure we follow government and University guidelines, you'll need to book your visit in advance.

Weston Library (and several others)

Monday - Friday 1000-1600

Cambridge University Libraries

https://www.lib.cam.ac.uk/full-opening-hours

Cambridge University Library is now open for limited services from Monday-Friday. Book a visit to view non-borrowable material in the Main reading room or a Special Collections reading room. Please read more about our phased reopening of the UL and Faculty and Departmental Libraries.

Monday-Friday 10:15 -15:45 for limited services.

Trinity College, Dublin

www.tcd.ie/library/opening-hours/

Library reading rooms are now open for current staff and students. Face coverings are required. "Click and Collect" items will now be delivered to Library buildings. Goldsmith Hall is no longer used for collections or returns.

Monday-Friday 0930-1700