Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

06 November 2024

Digital Humanities Congress 2024

Research Software Engineer James Misson writes...

On the 4th and 5th of September the Digital Humanities Congress was held in Sheffield, where the University of Sheffield continues to affirm its reputation as a hub for all things DH. The conference was a testament to the wide scope of DH methods, as well as researchers' abilities to adopt cutting-edge technology to further our knowledge of human culture.

A common theme that emerged across the papers was the application of machine learning to historical linguistics. Kate Wild, from the Oxford English Dictionary, shared the initial stages of the Oxford Corpus of Historical English, which will unite a vast amount of linguistic data spanning from the fifteenth century to the present day. The equally impressive Ansund project, a comprehensive corpus of Old English texts enriched from their manuscript sources by computer vision, was presented by Mark Faulkner and Elisabetta Magnanti.

Keynote lectures were given by Melissa Terras and Simon Mahony, whose extensive experience gave them ideal vantage points from which to survey the Digital Humanities and the twists and turns it has taken since the beginnings of their careers. Likewise, Paola Marchionni and Peter Findlay (formerly of the British Library) presented the history of Jisc, elucidating its critical role within research institutes.

Conversations beyond the lecture hall were instructive for the Digital Scholarship team, especially with regard to the BL’s recovery following the cyberattack last year. It was clear that the English Short Title Catalogue is a crucial resource for many scholars in attendance, not only as a finding aid but also as a dataset, which is encouraging to know as the Library works towards getting the ESTC back online. This is especially true of Fred Schurink’s research on the importation of continental books to early modern England, which is an innovative contribution to the burgeoning field of Bibliographic Data Science. We look forward to learning more about this field at Dr Schurink’s upcoming workshop at the John Rylands Library in Manchester.

Recovered Pages: Crowdsourcing at the British Library

Digital Curator Mia Ridge writes...

While the British Library works to recover from the October 2023 cyber-attack, we're putting some information from our currently inaccessible website into an easily readable and shareable format. This blog post is based on a page captured by the Wayback Machine in September 2023.

Crowdsourcing at the British Library

Screenshot of the Zooniverse interface for annotating a historical newspaper article
Example of a crowdsourcing task

For the British Library, crowdsourcing is an engaging form of online volunteering supported by digital tools that manage tasks such as transcription, classification and geolocation that make our collections more discoverable.

The British Library has run several popular crowdsourcing projects in the past, including the Georeferencer, for geolocating historical maps, and In the Spotlight, for transcribing important information about historical playbills. We also integrated crowdsourcing activities into our flagship AI / data science project, Living with Machines.

  • Agents of Enslavement uses 18th/19th century newspapers to research slavery in Barbados and create a database of enslaved people.
  • Living with Machines is mostly based on research questions around nineteenth-century newspapers.

Crowdsourcing Projects at the British Library

  • Living with Machines (2019-2023) created innovative crowdsourced tasks, including tasks that asked the public to closely read historical newspaper articles to determine how specific words were used.
  • Agents of Enslavement (2021-2022) used 18th/19th century newspapers to research slavery in Barbados and create a database of enslaved people.
  • In the Spotlight (2017-2021) was a crowdsourcing project from the British Library that aimed to make digitised historical playbills more discoverable, while also encouraging people to closely engage with this otherwise less accessible collection of ephemera.
  • Canadian wildlife: notes from the field (2021), a project where volunteers transcribed handwritten field notes that accompany recordings of a wildlife collection within the sound archive.
  • Convert a Card (2015) was a series of crowdsourcing projects that aimed to convert scanned catalogue cards in Asian and African languages into electronic records. The project template can be found and used on GitHub.
  • Georeferencer (2012-present) enabled volunteers to create geospatial data from digitised versions of print maps by adding control points to the old and modern maps.
  • Pin-a-Tale (2012) asked people to map literary texts to British places.

 

Research Projects

The Living with Machines project included a large component of crowdsourcing research through practice, led by Digital Curator Mia Ridge.

Mia was also the Principal Investigator on the AHRC-funded Collective Wisdom project, which worked with a large group of co-authors to produce a book, The Collective Wisdom Handbook: perspectives on crowdsourcing in cultural heritage, through two 'book sprints' in 2021:

This book is written for crowdsourcing practitioners who work in cultural institutions, as well as those who wish to gain experience with crowdsourcing. It provides both practical tips, grounded in lessons often learned the hard way, and inspiration from research across a range of disciplines. Case studies and perspectives based on our experience are woven throughout the book, complemented by information drawn from research literature and practice within the field.

More Information

Our crowdsourcing projects were designed to produce data that can be used in discovery systems (such as online catalogues and our item viewer) through enjoyable tasks that give volunteers an opportunity to explore digitised collections.

Each project involves teams across the Library to supply digitised images for crowdsourcing and ensure that the results are processed and ingested into various systems. Enhancing metadata through crowdsourcing is considered in the British Library's Collection Metadata Strategy.

We previously posted on Twitter as @LibCrowds and currently post occasionally on Mastodon at https://glammr.us/@libcrowds and via our newsletter.

Past editions of our newsletter are available online.

31 October 2024

Welcome to the British Library’s new Digital Curator OCR/HTR!

Hello everyone! I am Dr Valentina Vavassori, the new Digital Curator for Optical Character Recognition/Handwritten Text Recognition at the British Library.

I am part of the Heritage Made Digital Team, which is responsible for developing and overseeing the digitisation workflow at the Library. I am also an unofficial member of the Digital Research Team, where I promote the reuse of, and access to, the Library’s collections.

My role has both an operational component (integrating and developing OCR and HTR in the digitisation workflow) and a research and engagement component (supporting OCR/HTR projects in the Library). I really enjoy these two sides of my role, as I have a background as a researcher and as a cultural heritage professional.

I joined the British Library from The National Archives, London, where I worked as a Digital Scholarship Researcher in the Digital Research Team. I worked on projects involving data visualisation, OCR/HTR, data modelling, and user experience.

Before that, I completed a PhD in Digital Humanities at King’s College London, focusing on chatbots and augmented reality in museums and their impact on users and museum narratives. Part of my thesis explored how to use these narratives using spatial humanities methods such as GIS. During my PhD, I also collaborated on various digital research projects with institutions like The National Gallery, London, and the Museum of London.

However, I originally trained as an art historian. I studied art history in Italy and worked for a few years in museums. While working there, I realised the potential of developing digital experiences for visitors and the significant impact digitisation can have on research and enjoyment in cultural heritage. I was so interested in these opportunities that I co-founded a start-up which developed a heritage geolocation app for tourists.

Joining the Library has been an amazing opportunity. I am really looking forward to learning from my colleagues and exploring all the potential collaborations within and outside the Library.