THE BRITISH LIBRARY

UK Web Archive blog

90 posts categorized "Web/Tech"

25 November 2020

LGBTQ+ Lives Online Web Archive Collection


By Steven Dryden, British Library LGBTQ+ Staff Network & Ash Green CILIP LGBTQ+ Network

As you’ll have read on this blog, the collaboration between the UK Web Archive (UKWA), the British Library and CILIP LGBTQ+ Network to develop LGBTQ+ content within the UK Web Archive was launched during summer 2020.

Rainbow tapestry

LGBTQ+ content was already part of the UK Web Archive before the collaboration began, with many sites in other collections overlapping LGBTQ+ themes, for example in Black and Asian Britain (blackgayblog.com), Gender Equality (Beyond the Binary) and Sport (Graces Cricket Club). Some sites cut across many collections, highlighting the intersectional nature of the UK Web Archive. For example, Gal-Dem features in the News Sites; Zines and Fanzines; Black and Asian Britain; Gender Equality; Women's Issues; and Unfinished Business: The Fight for Women’s Rights collections, as well as LGBTQ+ Lives Online. LGBTQ+ Lives Online, much like the lived experience of LGBTQ+ people, does not sit in isolation, disconnected from other aspects of UK offline and online life. LGBTQ+ people play a part in all aspects of the UK community, and are not solely defined by their gender or sexual orientation.

This UK Web Archive collection doesn’t stand in isolation either: it enriches the scope of work already begun at the British Library. LGBTQ Histories aims to explore the experiences and stories encountered in the collections, posing questions about the lived experience of LGBTQ+ people throughout history. The LGBTQ+ Lives Online collection of the UK Web Archive plays a part in CILIP LGBTQ+ Network’s ambition to raise the profile of LGBTQ+ people, and to support the development of LGBTQ+ information resources and the work of LGBTQ+ library, information and knowledge workers.

LGBTQ+ Lives Online Collection

UKWA 'ACT' tool

The collection currently contains over 400 sites and web pages in the main collection, with more of these being added to sub-collections every week. Many of the sites were already in the UKWA before the collaboration began, but were not linked to sub-collections. We are still at the stage where we are developing the structure of sub-collections but our initial indexes cover:

Since the launch of this collaborative project, we have been focused on a number of areas to both develop the project and to preserve sites within the collection. This includes:

  • Identifying sites already in the UK Web Archive to be added to the LGBTQ+ Lives Online sub-collections.
  • Identifying new sites not already in the UKWA to be included in the collection.
  • Spreading the word about the project as widely as possible via blog posts and articles such as this; social media; emails targeting specific LGBTQ+, library, and broader diversity organisations and networks.

You can browse through the collection here, and nominate a UK published site or webpage with a focus on LGBTQ+ lives to be included in the collection via: https://www.webarchive.org.uk/en/ukwa/info/nominate. We would especially like to see more nominations that reflect the multicultural nature of UK LGBTQ+ communities and the many diaspora communities based here, including UK sites written in languages other than English.

Though it can often be challenging for us to archive social media accounts, we are able to collect LGBTQ+ Twitter accounts. We have experimented with other methods of archiving social media, though only on a selective basis, and we would welcome nominations and projects that might address these challenges and how they might impact on archiving LGBTQ+ experience in the UK.

How can you access these archived websites?

UKWA search results page

Under the Non-Print Legal Deposit Regulations 2013, the UKWA can archive UK published websites, but is only able to make the archived version available to people outside the Legal Deposit Libraries’ Reading Rooms if the website owner has given permission. The UK Legal Deposit Libraries are the British Library, National Library of Scotland, National Library of Wales, Bodleian Libraries, Cambridge University Library and Trinity College Dublin Library.

Permission has already been granted for some of the websites in UKWA, including Out Stories Bristol, Trans Ageing and Care, Bi Cymru/Wales and Queer Zine Library. As the content of UKWA has mixed access, the message ‘Viewable only on Library premises’ will appear under the title of the website if you need to visit a Legal Deposit Library to view content. If there is no message underneath then the archived version of the website should be available on your personal device.

Due to the coronavirus pandemic, the reading rooms were closed for a number of weeks but are starting to reopen. This blog post gives an overview of opening hours and how to book a visit at the six UK Legal Deposit Libraries:

https://blogs.bl.uk/webarchive/2020/09/ukwa-available-in-reading-rooms-again.html 

Previous blog posts about the project can be viewed via the following links.

LGBTQ+ Lives Online project introduction

LGBTQ+ Lives Online: Introducing the Lead Curators

 

24 November 2020

Web Archive Team wins 2020 Digital Preservation Award


By Sophia Chrisafis, Internal Communications Officer, The British Library

On Thursday 5 November the UK Web Archive Team won The National Archives (UK) Award for Safeguarding the Digital Legacy at the Digital Preservation Awards 2020.

The Award was made to the UK Web Archive for ‘15 years of the UK Web Archive’, marking the anniversary of the launch of a public UK Web Archive service.

In all, there are six awards, which are presented every two years. 

Digital Preservation Award 2020

2020 Awards
This year the awards took place online, via Zoom. John Sheridan, Digital Director at The National Archives, introduced the award: 

The National Archives (UK) Award for Safeguarding the Digital Legacy celebrates the practical application of preservation skills to protect at-risk digital objects, drawing attention to the concrete efforts to ensure important elements of our generation’s digital memory can remain available for future generations. It is also for demonstrating a deep understanding of the risks that digital objects face and (the winner) should be an exemplar of digital preservation best practice and why preservation matters.

The winners were announced by judge April Miller, from the World Bank Group, who invited Ian Cooke, the British Library’s Head of Contemporary British Published Collections, to give an acceptance speech.

On behalf of the Web Archiving Team, Ian said:

‘We’re really amazingly pleased to have won.

‘It’s a huge honour for us to be recognised in this way, and to have been among such excellent finalists, such amazing projects, really inspiring ones.

‘We always say it’s not possible to understand the 21st century without the archived web, and we’ve been posting to our blog all week about the diversity and variety of our collections.

‘I’m personally always amazed and incredibly proud of the work Andy Jackson and Nicola Bingham lead for the Web Archive, and also for our whole team, both at the British Library and across the UK legal deposit libraries, and the friends we work with – the International Internet Preservation Consortium, an incredible community – and everyone we’ve worked with around the world for the past 15 years for digital preservation access and development.

‘Thank you so much.’

The UK Web Archive

The UK Web Archive (UKWA) was formed in 2003 as a response to growing awareness of an urgent digital preservation need, to collect and preserve communication using the web.

UKWA is a partnership of the six UK Legal Deposit Libraries: the National Library of Scotland; the National Library of Wales; the Bodleian Libraries, University of Oxford; Cambridge University Libraries; Trinity College, Dublin; and the British Library. In 2020, UKWA celebrated 15 years since making its first collections available publicly online.

Read more about the last 15 years of UKWA in these blog posts:

 

18 November 2020

2020 Domain Crawl Update


By Andy Jackson, Web Archiving Technical Lead at the British Library

 

On the 10th of September the 2020 Domain Crawl got underway. The annual Domain Crawl usually takes about three months to complete. It visits UK published websites on a UK Top Level Domain (TLD) like .uk, .cymru, .scot, .london etc., any web content hosted on a server registered in the UK, and all the records manually created by the UK Web Archive teams across the UK Legal Deposit Libraries.

 

Update on crawl management

Due to the billions of URLs involved, the Domain Crawl is the most technically difficult crawl we run. As the crawl frontier grows and grows, the strain starts to show, particularly on the disk space required to store all of the status information about the URLs that have been crawled or are awaiting crawling. Worst of all, we found some mysterious problems with how Heritrix3 manages this information, which meant that we could not safely stop and restart long crawls. We could usually restart once, but if we restarted again strange errors would appear, and sometimes these would be serious enough to cause the whole crawl to fail. Fortunately, in the last year, we finally tracked this down and updated the Heritrix3 crawler so that it can be safely stopped and restarted multiple times.

This has made managing the crawler much easier, as we can stop and restart the crawl with confidence if we need to change the software or hardware setup. This makes managing things like disk space much less stressful.

 

Update on the crawl performance 

In the initial phase of the crawl, we threw in the roughly 11 million web hostnames that we have seen in past crawls, which then got whittled down to about 7 million active hosts. After this bumpy start and some system tuning, the crawl settled down and has been pretty consistently processing 250-300 URLs per second.  This is acceptable, but isn’t quite as fast as we would like, so we are analysing the crawl while it runs to try and work out where the bottlenecks are.
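To put the quoted crawl rate in context, here is a minimal back-of-the-envelope sketch. It assumes a constant rate and treats "about three months" as 90 days; both are simplifications for illustration, and the function name is hypothetical rather than part of any crawler tooling.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def urls_fetched(rate_per_second: float, days: float) -> int:
    """Total URLs fetched at a constant rate over a number of days."""
    return int(rate_per_second * SECONDS_PER_DAY * days)


for rate in (250, 300):
    per_day = urls_fetched(rate, 1)
    per_crawl = urls_fetched(rate, 90)  # assumption: three months is roughly 90 days
    print(f"{rate} URLs/s -> {per_day:,} per day, {per_crawl:,} in 90 days")
```

At these rates the crawl would process roughly 21.6 to 25.9 million URLs per day, or about 1.9 to 2.3 billion over 90 days.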

 

What we have collected so far

The figure below shows the URLs collected over time.

 

Graph illustrating the number of URLs downloaded in the 2020 Domain Crawl

 

The rather jagged start shows where we were able to stop and start the crawl in order to tune the initial hardware setup, and the flatter ‘pauses’ later on are from other maintenance activities like growing the available disk space. The advantage of being able to re-tune the crawler as we go is shown by the way the line gets steeper over time, corresponding to the increased crawl rate.

 

In terms of bytes downloaded, we see a similar result:

Graph illustrating the number of TBs downloaded in the 2020 Domain Crawl

 

As you can see, we are rapidly approaching 90TB of downloaded data, which corresponds to roughly 50TB of compressed WARC.gz data.
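The two figures above imply an overall compression ratio, which the following small sketch works out. The 90TB and 50TB values come from the text; applying the same ratio to other volumes is purely an assumption for illustration, since real ratios vary with the mix of content.

```python
def compression_ratio(raw_tb: float, compressed_tb: float) -> float:
    """Compressed size as a fraction of the raw downloaded size."""
    return compressed_tb / raw_tb


def estimate_compressed(raw_tb: float, ratio: float) -> float:
    """Estimate stored size, assuming the same ratio holds (it may not)."""
    return raw_tb * ratio


ratio = compression_ratio(90, 50)
print(f"WARC.gz size is about {ratio:.0%} of the downloaded bytes")
print(f"Space saved: about {1 - ratio:.0%}")
```

On these numbers the compressed WARC.gz files take up a little over half of the raw download size.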

Despite starting the crawl relatively late in the year (due to issues around the COVID-19 outbreak), we are making good and stable progress and are on track to download over two billion URLs by the end of the year.

 

Follow the UK Web Archive on Twitter for the latest updates on the Domain Crawl and other web archiving activities! 

 

09 November 2020

A tale of two web archives: challenges of engaging web archival infrastructures for research


By Jessica Ogden, University of Bristol and Emily Maemura, University of Toronto

Web archives are quickly becoming a key source for studying the historical Web, with many recent projects and publications demonstrating the scholarly opportunities presented by national web archives, in particular. At the same time, research in and on national web archives presents a number of challenges for scholars - where a ‘sociotechnical gap’ (Ackerman 2000) can be observed between the needs of researchers and the affordances of web archives themselves.

Diagram illustrating a web archive conceptual framework

In an effort to better understand the barriers to web archival use in research, our recent paper at the Engaging with Web Archives conference shares the results of a collaborative project which compares and contrasts our experiences of using two national web archives: the UK Web Archive and the Netarchive in Denmark. In 2018, Jessica undertook a three-month research placement with the British Library looking at the challenges and opportunities of using the UKWA for social science research. Around the same time, Emily also spent three months at the Danish National Web Archive, Netarchive, in collaboration with the Royal Library and the University of Aarhus in Denmark. 

Based on our own interactions with these web archives, and interviews with staff and curators, alongside observations of web archiving activities, this paper proposes a conceptual framework that outlines the earliest stages of research alignment and engagement with national web archives. The concepts developed in the paper (orientating, auditing and constructing) provide an avenue for discussing the entanglement of researchers, curators and collections in the research process. In discussion, we make a number of observations regarding the challenges of this form of digital research - including how researchers must unpick the complex constraints of different web archives - and suggest possible ways that existing curatorial infrastructure (tools, people and curatorial knowledge and expertise) could be leveraged to better facilitate researcher engagement in future.  

To learn more about our findings, check out the recording of our EWA 2020 presentation.

Acknowledgments

This work was supported by the Social Sciences and Humanities Research Council (SSHRC) Canada Graduate Scholarship 767-2015-2217 and Michael Smith Foreign Study Supplement. Additional funding was provided by a UKRI/Economic and Social Research Council, National Centre for Research Methods placement fellowship and research funds by the University of Southampton. The authors also gratefully acknowledge the generosity and support provided by staff and researchers at the UKWA, the British Library, the Royal Library and the NetLab at Aarhus University.

05 November 2020

On World Digital Preservation Day, the UK Web Archive and the ‘Children of Lockdown’ capture the moment for future generations


By Charlotte McMillan, Founder, Storychest

Introduction from Nicola Bingham, Lead Curator, Web Archiving, British Library

Today is World Digital Preservation Day (WDPD), a celebration of all things digital preservation which is organised by the Digital Preservation Coalition and held annually on the first Thursday of November.

To mark today’s WDPD, the UK Web Archive is highlighting the ‘Children of Lockdown’ project, a recent addition to the Archive which exemplifies the diversity of voices captured in the Archive and highlights the importance of preserving our digital legacy.

Charlotte McMillan, founder of Storychest and curator of the ‘Children of Lockdown’ project writes:

“Lockdown has been like a question mark in the middle of a sentence. Unexpected, confusing and stressful”, observes Yilu, 14, from London, in one of 200 thoughtful and powerful responses to lockdown and Covid19 from children aged 3 to 17, spanning the whole of the United Kingdom and from as far afield as Australia.

In July this year, children were asked to contribute to a ‘digital time capsule’ to be included by the British Library in the UK Web Archive’s Covid19 collection. The resulting group of stories, poems and artwork, is insightful and poignant, and whilst it reflects anxiety and, in some cases, profound sadness, it is also imbued with humour, imagination and enduring hope. Disruption to education, distancing from friends and family and uncertainty about the future, are themes which have been captured through the lenses and with the clarity of the voices of the children themselves, a generation whose lives have been affected by the pandemic in so many ways.

The ‘Children of Lockdown’ collection was initiated by Charlotte McMillan, mother of 3 boys and founder of Storychest, the private digital memory box app. Witnessing the impact of the upheaval of lockdown restrictions, she encouraged her own children to record their impressions of this unprecedented period of time.

Charlotte studied history at university and remembers the amazement she felt at exploring, on microfiche, newspaper articles from a century ago stored by the British Library, which helped her to understand how events were perceived contemporaneously. She wanted to enable today’s children to record their thoughts and impressions in a lasting way, to help future generations to understand what they were going through. So Charlotte, together with a group of 5 British children’s authors, put the word out to schools and other groups for children’s submissions.

The children have captured enduring and iconic snapshots like the ‘clap for carers’, PE with Joe Wicks, the run on loo paper, empty streets and the emergence of nature, including goats taking over Llandudno.

Maddy mask

Maddy, 15, from London drew herself in monochrome, with the now all too familiar accessory of a mask, eyes peering knowingly at the viewer.

Tiggy

Tiggy, 10, from Kent felt trapped and isolated away from her friends, so drew herself behind prison bars in her own home, but added hope to her work by overlaying a rainbow in pastels.

Sholto

Charlotte’s son Sholto, 14, pictured himself caught inside his phone, referencing the use of devices in lockdown as both a window to the outside world and a trap.

There are stories of imagination and escapism: Flora, 10, from Wallingford likened the virus to a wandering wolf ‘pacing the fence line’ outside her home; Joseph, 8, from Nottingham, missing playing football, invented games using Lego, “Harry Potter, racing through on his broomstick to smash it in the goal. Godzilla in goal, nobody could beat him”.

Incredibly brave Emma, 11, from Derbyshire, was being treated for a brain tumour during lockdown. She reflected on happy moments sat on a best friend’s drive for a chat and also describes sadness that her mum was not able to be with her, due to the restrictions, when she rang the hospital bell to mark the end of her treatment.

Siomha, 11, from London grieves for the death of a beloved great uncle, who had been known as the ‘baby whisperer’ for his calming effect on her as a baby: “Where is my baby whisperer? This time he cannot stop the tears, because this time I weep for him.”

Maddy mural

And yet, through it all, there is hope. Maddy, 15, from Exeter, creates a lockdown mural in her bedroom to show hope for her family and friends. Saanvi, 11, from Leamington Spa, remembering Dumbledore’s words “Happiness can be found even in the darkest of times, if one only remembers to turn on the light”, reflects that “mankind will get through any crisis and discover positivity even where it seems impossible to find”.

Kenzi, 12, from Chesterfield who is autistic and found lockdown to be a particularly anxious time, sums it up beautifully, in his poem ‘Life in Lockdown’:

When normality returns

2020 will be remembered

In our broken battered hearts

As the time the world finally united

By staying apart

The Children of Lockdown collection can be viewed in full at https://childrenoflockdown.storychest.com/

The UK Web Archive Coronavirus Collection can be viewed here: https://www.webarchive.org.uk/en/ukwa/collection/2975

04 November 2020

Curating culturally themed collections online: The Russia in the UK Collection, UK Web Archive


By Hannah Connell, Collaborative PhD Student, King’s College London; British Library

Title slide from Hannah's presentation with a London Underground map in Russian

 

I spoke about my position as a curator for the Russia in the UK curated collection as part of the recent Engaging with Web Archives conference (EWA), which was held online from the 21st-22nd of September 2020. This conference reflected the breadth of the web archiving community, bringing together speakers from researchers to librarians, as well as curators and web archiving teams from many different countries.

As always, it was inspiring to participate in such a welcoming event. Even online, the conference retained the collaborative atmosphere which has marked my experience of research in web archiving, allowing new researchers to interact with more experienced practitioners and encouraging questions and conversations between researchers, users and archivists.

The researcher-curated collection, Russia in the UK, is part of the UK Web Archive (UKWA). I was particularly pleased to have had the opportunity to present this curated collection, a resource on the Russian-speaking community in the UK, which was first started in November 2017. Such collections play an important role in making the wide range of material preserved in the UKWA more visible to researchers.

Curators are important to the preservation work of the UKWA. Curated collections are collected manually by curators and researchers with specialist knowledge in their field. The role of a curator in creating a UKWA collection involves identifying relevant websites to be included in a collection, and recording the metadata for these websites, including the translation and transliteration of titles and descriptions in other languages.

This collection is valuable both as a resource for further research, and as a means of questioning research practices. It is not possible to capture everything on the web, and collection curators ensure that a representative sample of websites is selected for each thematic collection. The practice of creating and maintaining a collection such as Russia in the UK ultimately influences the shape of the collection and the online representation of the diasporic community it will come to reflect. As such, it is important for researchers and users to understand the decisions taken by curators in selecting and capturing websites.

My paper for EWA focused on the creation of a curation guide for curators of new curated collections. This draws on the ongoing process of curating the Russia in the UK collection, both documenting the provenance of this special collection and reflecting on this process as a model for future collections.

In documenting the creation of this collection, I hope to enable future researchers to explore and contribute to this record of the online activity of the Russian diaspora in the UK, and to question and develop the curatorial and research practices behind the curation of collections.

You can watch Hannah Connell’s presentation on the EWA YouTube channel.

 

03 November 2020

LGBTQ+ Lives Online: Introducing the Lead Curators


By Steven Dryden, British Library LGBTQ+ Staff Network & Ash Green CILIP LGBTQ+ Network

In July 2020 the British Library, the UK Web Archive and CILIP LGBTQ+ Network relaunched the LGBTQ+ Lives Online web archive collection. We have received many nominations for new sites to be collected by the UK Web Archive and work has begun to re-tag many of the websites that have been collected since the UK Web Archive began collecting the UK web in 2005.

To mark two months since the project began, LGBTQ+ Lives Online leads Steven Dryden, of the British Library, and Ash Green, of CILIP LGBTQ+ Network, write about the relevance of the World Wide Web to them as members of the LGBTQ+ community, and some of their collection highlights:

 

Steven he/him/his

Steven Dryden

I first encountered the internet in Las Vegas. It was the summer of 1998, I was 17 and my family had migrated from Newcastle Upon Tyne to the western world’s party play pit in the Nevada desert. My friend, Lilian, was talking to someone in New York City about the band Depeche Mode through America Online (AOL).

Chat rooms were online spaces that allowed groups of people to join anonymously, with the option to talk and interact within a group or in private. Chat rooms quickly became pivotal for me and my small cohort of friends, the odd balls who didn’t quite fit, as we were forming our identities in those formative late teen years, and trying to find our place in the world.

Later the same year, on October 12, 1998, Matthew Shepard would die. A gay student at the University of Wyoming, Shepard was beaten, tortured, and left to die near Laramie on the night of October 6, 1998. AOL chat rooms formed the major part of how I found out about Shepard and worked through my feelings about his murder, which was the first news story that I followed online.

The protections and general understanding of who the lesbian, gay, bisexual and transgender community are have undergone radical change in the 22 years since I first encountered the internet. I’m interested to see what survives online of the change in language relating to the community, and what evidence remains in the UK Web Archive of the online discussion. Some websites that interest me in these first months of the project include:

  • The Campaign for Homosexual Equality: an organisation which led the way to legal reform in the UK, following the passing of the Sexual Offences Act 1967, which partially decriminalised homosexuality in England and Wales.

https://www.webarchive.org.uk/wayback/en/archive/20130505124828/http://www.c-h-e.org.uk/

  • Around the Toilet: a community engaged art project exploring the accessibility and culture of toilets for the LGBTQ+ community

https://www.webarchive.org.uk/wayback/en/archive/20180606164959/https://aroundthetoilet.wordpress.com/

  • Asexual Visibility and Education Network: founded in 2001 with two distinct goals: creating public acceptance and discussion of asexuality and facilitating the growth of an asexual community

https://www.webarchive.org.uk/wayback/en/archive/20150226230020/http://www.asexuality.org/home/
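The archived links above share a common shape: a UKWA Wayback prefix, a 14-digit capture timestamp (YYYYMMDDhhmmss, as is conventional for Wayback-style URLs), and then the original address. As a minimal sketch, assuming that pattern holds for the links in this post, the pieces can be separated as follows. The function and class names are hypothetical and not part of any UKWA tooling.

```python
import re
from typing import NamedTuple, Optional


class ArchivedUrl(NamedTuple):
    timestamp: str      # 14-digit capture time, YYYYMMDDhhmmss
    original_url: str   # address of the site as it was on the live web


# Assumption: the prefix below matches the links quoted in this post.
_PATTERN = re.compile(
    r"^https://www\.webarchive\.org\.uk/wayback/en/archive/(\d{14})/(.+)$"
)


def parse_archived_url(url: str) -> Optional[ArchivedUrl]:
    """Split an archived link into timestamp and original URL, or return None."""
    match = _PATTERN.match(url)
    if match is None:
        return None
    return ArchivedUrl(match.group(1), match.group(2))


link = "https://www.webarchive.org.uk/wayback/en/archive/20130505124828/http://www.c-h-e.org.uk/"
parsed = parse_archived_url(link)
if parsed is not None:
    ts = parsed.timestamp
    print(f"Captured {ts[:4]}-{ts[4:6]}-{ts[6:8]}: {parsed.original_url}")
```

For the first link in the list, this reports a capture dated 2013-05-05 of http://www.c-h-e.org.uk/.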

 

Ash (they/them)

Ash Green

When I was studying for my BA Information and Library Management degree in the early 1990s, the internet and World Wide Web weren’t as high profile as they are now. I loved tech back then, and was into programming and creating databases as part of the degree. But I didn’t really understand what the lecturers were talking about when they mentioned the internet. At the time I had no idea how important it would be to my coming out just over 20 years later, and what a positive impact it would have.

Thinking about the lead up to my coming out in 2017, without access to sites and forums related to trans/gender non-conforming lives in particular, I doubt I would have come out at all. But when I decided to look for guidance online, I found a huge amount of information that was overwhelming at first, but eventually this helped me understand where I fitted into the world. They included medical sites; statements from WHO and other health organisations highlighting that being trans wasn’t a mental health issue; personal blogs and forums, talking about experiences and a variety of perspectives on what it means to be trans; finding out about non-binary, genderfluid, and genderqueer people’s experiences (I had no idea what these words meant); LGBTQ+ events; makeup and style tips; sites for face-to-face support groups and meetups, and sites for exhibitions such as the Museum of Transology and the Transworkers photography exhibition, which helped me understand that being trans is much broader than mainstream media would have the world believe.

Many sites were useful, but at the same time I came across quite a few that were more "Yes, this miracle herbal treatment really does change your hormones", and "You're only valid if you fit into trans box X or Y" that put my critical, digital literacy and research experience into practice. I also found supportive friends and allies, and I was able to share useful sites and sources of information I’d discovered to give them a better understanding of my experience. It’s important that these sites should be a part of the UK Web Archive LGBTQ+ Lives Online collection. Not only because they have a relevance to the UK Web Archive in general, but from a personal perspective I feel that if they had such an impact on helping me find where I fit into the world, how many other people have they also had a similar positive impact upon?

The sites I’ve chosen below from the UK Web Archive have all had a personal impact on me.

  • Museum of Transology: The UK’s most significant collection of objects representing trans, non-binary and intersex people’s lives. 

https://www.webarchive.org.uk/wayback/en/archive/20201003091027/https://www.museumoftransology.com/

  • OutStories Bristol: Collecting and preserving the social history and recollections of LGBT+ people living in or associated with Bristol, England.

https://www.webarchive.org.uk/wayback/en/archive/10000101000000/https://outstoriesbristol.org.uk/

  • Outline Surrey: Outline provides support to people with their sexuality and gender identity, including but not limited to the lesbian, gay, bi-sexual and trans community of Surrey, primarily through a helpline, website and support groups.

https://www.webarchive.org.uk/wayback/en/archive/20160107134238/http://www.outlinesurrey.org/

 

Get involved with preserving UK LGBTQ+ Lives Online with the UK Web Archive

We can’t curate the whole of the UK web on our own: we need your help to ensure that information, discussions, personal experiences and creative outputs related to the LGBTQ+ community are preserved for future generations. Anyone can suggest UK published websites to be included in the UK Web Archive by filling in our nominations form:

https://www.webarchive.org.uk/en/ukwa/nominate

 

02 November 2020

Digital archaeology in the web of links: reconstructing a late-90s web sphere


By Dr. Peter Webster, Independent Scholar, Historian and Consultant

Fiber cables for the internet

 

The historian of the late 1990s has a problem. The vast bulk of content from the period is no longer on the live web; there are few, if any, indications of what has been lost – no inventory of the 1990s web against which to check. Of the content that was captured by the Internet Archive (more or less the only archive of the Anglophone web of the period), only a superficial layer is exposed to full-text search, and the bulk may only be retrieved by a search for the URL. We do not know what was never archived, and in the archive it is difficult to find what we might want, since there is no means of knowing the URL of a lost resource. Sometimes we need, then, to understand the archived web using only the technical data about itself that it can be made to disclose.

Niels Brügger has defined a web sphere as ‘web material … related to a topic, a theme, an event or a geographic area’.  My paper at the EWA conference presents a method of reconstructing a web sphere, much of which is lost from the live web and exists only in the Internet Archive: the web estate of the many conservative Christian campaign groups in the UK in the 1990s and early 2000s.

This method of web sphere reconstruction is based not on page content but on the relationships between sites, i.e., the web of hyperlinks. The method is iterative, and tracks back and forth between big data and small. Individual archived pages and directories, printed sources, the scholarly record itself, and even traces of previous unsuccessful attempts at web archiving come into play, as does a large dataset held by the British Library. From the more than 2 billion lines in the UK Host Link Graph dataset it is possible to extract the outlines of this particular web sphere.
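One step of a link-based approach like this could be sketched roughly as follows. This is an illustrative simplification, not the method used in the paper: it assumes each line of the link graph has the form year|source host|target host|link count, uses made-up hostnames, and picks an arbitrary threshold. In practice any candidates would still need checking by hand against archived pages and other sources.

```python
from collections import defaultdict

# Hypothetical sample lines; the "year|source|target|count" layout is an assumption.
SAMPLE = [
    "1998|a.example|c.example|3",
    "1998|b.example|c.example|1",
    "1998|a.example|d.example|2",
    "2010|b.example|d.example|5",
]


def build_graph(lines, years):
    """Index outgoing and incoming host links for the chosen years."""
    out_links, in_links = defaultdict(set), defaultdict(set)
    for line in lines:
        if not line.strip():
            continue
        year, source, target, _count = line.strip().split("|")
        if int(year) in years and source != target:
            out_links[source].add(target)
            in_links[target].add(source)
    return out_links, in_links


def expand_once(seeds, out_links, in_links, min_links=2):
    """Return hosts outside the seed set connected to at least min_links seeds."""
    score = defaultdict(int)
    for seed in seeds:
        neighbours = out_links.get(seed, set()) | in_links.get(seed, set())
        for host in neighbours - set(seeds):
            score[host] += 1
    return {host for host, n in score.items() if n >= min_links}


out_links, in_links = build_graph(SAMPLE, years=range(1996, 2004))
candidates = expand_once({"a.example", "b.example"}, out_links, in_links)
print(sorted(candidates))
```

Repeating the expansion with the reviewed candidates added to the seed set gives the iterative, back-and-forth character described above: the large dataset suggests hosts, and close reading of individual archived pages decides which belong.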

You can watch Peter Webster’s presentation on his website peterwebster.me

 

Previous studies using a similar method are: 

Webster, Peter. 2019. Lessons from cross-border religion in the Northern Irish web sphere: understanding the limitations of the ccTLD as a proxy for the national web. In The Historical Web and Digital Humanities: the Case of National Web domains, eds Niels Brügger & Ditte Laursen, 110-23. London: Routledge.  http://dx.doi.org/10.17613/yms5-9v95     

Webster, Peter. 2017. Religious discourse in the archived web: Rowan Williams, archbishop of Canterbury, and the sharia law controversy of 2008. In: The Web as History, eds Niels Brügger & Ralph Schroeder, 190-203. London: UCL Press. (Available Open Access at:  https://www.uclpress.co.uk/products/84010)