UK Web Archive blog

19 June 2023

Reflections on the IIPC Web Archiving Conference 2023

By Andrew Jackson, Web Archive Technical Lead

Tessa Walsh (Webrecorder), Anders Klindt (Royal Danish Library), Ilya Kreymer (Webrecorder) and Andy Jackson (British Library) demonstrating the new Browsertrix features in the workshop 'Browser-Based Crawling for All: Getting Started with Browsertrix Cloud'

My main goal for the conference was to support the adoption and development of shared open source tools. I've been involved in the IIPC project 'Browser-based crawling for all', and at the conference I helped run a workshop where attendees could start exploring Browsertrix Cloud and give feedback to the project and to Webrecorder. There were some initial problems with the capacity of the demo system, but these were quickly resolved, and the workshop was a success that provided useful feedback for future work.

I also ended up chairing the SolrWayback session, which showed many great examples of how that search interface and the underlying indexing tools (developed by UKWA) have been used by different web archives to help explore and analyse their collections. It's heartening to see more and more web archives doing this kind of thing.

There were a lot of good presentations and discussions around tools, but I'd particularly like to recommend that you all check out Warchaeology by the National Library of Norway Web Archive, and Scoop by the Harvard Library Innovation Laboratory.

Both the Scoop presentation and the Bellingcat keynote provided important insights into what it takes for web archives to serve as legally admissible evidence (see, for example, this post about Scoop and this post from Bellingcat). There are interesting questions here about our tools and workflows, like whether the WARC or WACZ formats are sufficient in their current form, and whether there are opportunities for deeper collaboration across the domains of cultural heritage, law, and open source investigation.

Finally, across a number of presentations, the conference also raised questions about the current and future role of cultural heritage institutions. Are our approaches to information literacy fit for an age of fake news and ChatGPT pollution? Is there something libraries and archives can learn from how Bellingcat and fact checkers like Full Fact are helping people find reliable information and avoid conspiracy theories? Can web archives do more to fight disinformation? I look forward to seeing more about this at future conferences!