Digital scholarship blog

Enabling innovative research with British Library digital collections


11 September 2023

Join the British Library's Universal Viewer Product Team

The British Library has been a leading contributor to IIIF, the International Image Interoperability Framework, and the Universal Viewer for many years. We're about to take the next step in this work - and you can join us! We are recruiting for a Product Owner, a Research Software Engineer and a Senior Test Engineer (deadline 03 January 2024). 

In this post, Dr Mia Ridge, product owner for the Universal Viewer (UV) 2015-18, and Dr Rossitza Atanassova, UV business owner 2019-2023, share some background information on how new posts advertised for a UV product team will help shape the future of the Viewer at the Library and contribute to international work on the UV, IIIF standards and activities.

A lavishly decorated page from the fourteenth-century manuscript 'The Sherborne Missal', showing an illuminated capital with the Virgin Mary holding the baby Jesus, surrounded by the three Kings, with other illuminations in the margins and the text.
Detail from Add MS 74236 'The Sherborne Missal' displayed in the Universal Viewer

 The creation of a Universal Viewer product team is part of wider infrastructure changes at the British Library, and marks a shift from contributing via specific UV development projects to thinking of the Viewer as a product. We'll continue to work with the Open Collective while focusing on Library-specific issues to support other activities across the organisation. 

Staff across the Library have contributed to the development of the Universal Viewer, including curators, digitisation teams and technology staff. They engage through bespoke training delivered by the IIIF Consortium, participation in IIIF workshops and conferences, and experimentation with new tools, such as the digital storytelling tool Exhibit, to reach wider audiences. Other Library work with IIIF includes a collaboration with Zooniverse to enable items to be imported to Zooniverse via IIIF manifests, making crowdsourcing more accessible to organisations with IIIF items. Most recently, with funding from the Andrew W. Mellon Foundation, we updated the UV to play audio from the British Library's sound collections.

Over half a million items from the British Library's collections are already available via the Universal Viewer, and that number grows all the time. Work on the UV has already let us retire around 35 other image viewers, significantly reducing maintenance overheads and creating a more consistent experience for our readers.

However, there's a lot more to do! User expectations change as people use other document and media viewers, whether that's other IIIF tools like Mirador or the latest commercial streaming video platforms. We also need to work on some technical debt, ensure accessibility standards are met, improve infrastructure, and consolidate services for the benefit of users. Future challenges include enhancing UV capabilities to display annotations, formats such as newspapers, and complex objects such as 3D.

A view of the Library's image viewer, showing an early nineteenth century Javanese palm-leaf manuscript inside its decorated wooden covers. To the left of the image there is a list with the thumbnails of the manuscript leaves and to the right the panel displays bibliographic information about the item.
British Library Universal Viewer displaying Add MS 12278

 If you'd like to work in collaboration with an international open source community on a viewer that will reach millions of users around the world, one of these jobs may be for you!

Product Owner (job reference R00000196)

Ensure the strategic vision, development, and success of the project. Your primary goal will be to understand user needs, prioritise features and enhancements, and collaborate with the development team and community to deliver a high-quality open source product. 

Research Software Engineer (job reference R00000197)

Help identify requirements, and design and implement online interfaces to showcase our collections, help answer research questions, and support application of novel methods across team activities.

Senior Test Engineer (job reference R00000198)

Help devise requirements, develop high quality test cases, and support application of novel methods across team activities.

To apply please visit the British Library recruitment site. Applications close on 3 January 2024. Interview dates are listed in the job ads.

Please ensure you answer all application questions (CVs cannot be submitted). At the BL we can only shortlist with information that applicants provide in response to questions on the application.  Any questions about the roles or the process? Drop us a line at [email protected].

02 May 2023

Detecting Catalogue Entries in Printed Catalogue Data

This is a guest blog post by Isaac Dunford, MEng Computer Science student at the University of Southampton. Isaac reports on his Digital Humanities internship project supervised by Dr James Baker.

Introduction

The purpose of this project has been to investigate and implement different methods for detecting catalogue entries within printed catalogues. For whilst printed catalogues are easy enough to digitise and convert into machine-readable data, dividing that data by catalogue entry requires turning the visual signifiers of divisions between entries - gaps in the printed page, large or upper-case headers, catalogue references - into machine-readable information. The first part of this project involved experimenting with XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum (described by Rossitza Atanassova in a post announcing her AHRC-RLUK Professional Practice Fellowship project) and trying to find the best ways to detect individual entries and reassemble them as data (given that the text for a single catalogue entry may be spread across multiple pages of a printed catalogue). The next part of this project involved building a complete system based on this approach to take the full set of XML files for a volume and output all of the catalogue entries in a series of desired formats. This post describes our initial experiments with that data, the approach we settled on, and key features of our approach that you should be able to reapply to your own catalogue data. All data and code can be found on the project GitHub repo.

Experimentation

The catalogue data was exported from Transkribus in two different formats: an ALTO XML schema and a PAGE XML schema. The ALTO layout encodes positional information about each element of the text (that is, where each word occurs relative to the top left corner of the page), which makes spatial analysis - such as looking for gaps between lines - possible. However, it also creates data files that are heavily encoded, meaning that it can be difficult to extract the text elements from them. The PAGE schema, by contrast, makes it easier to access the text elements in the files.
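To give a sense of why the PAGE exports were easier to work with, here is a minimal sketch of pulling the transcribed text lines out of a single PAGE XML file with Python. The namespace URL and the file name are assumptions for illustration: check them against the xmlns attribute of your own Transkribus export.

    import xml.etree.ElementTree as ET

    # One common PAGE schema namespace; Transkribus exports may use a different version.
    PAGE_NS = {"pc": "http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15"}

    def page_xml_lines(path):
        """Return the transcribed text of each TextLine in the PAGE XML file."""
        root = ET.parse(path).getroot()
        lines = []
        for text_line in root.iter(f"{{{PAGE_NS['pc']}}}TextLine"):
            unicode_el = text_line.find("pc:TextEquiv/pc:Unicode", PAGE_NS)
            if unicode_el is not None and unicode_el.text:
                lines.append(unicode_el.text)
        return lines

    # Example (hypothetical file name):
    # for line in page_xml_lines("incunabula_vol8_page001.xml")[:5]:
    #     print(line)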

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the PAGE XML Schema
Raw PAGE XML for a page from volume 8 of the Incunabula Catalogue

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the ALTO XML Schema
Raw ALTO XML for a page from volume 8 of the Incunabula Catalogue

 

Spacing and positioning

One of the first approaches tried in this project was to use size and spacing to find entries. The intuition behind this is that there is generally a larger amount of white space around the headings in the text than there is between regular lines. And in the ALTO schema, there is information about the size of the text within each line as well as about the coordinates of the line within the page.

However, we found that using the size of the text line and/or the positioning of the lines was not effective, for three reasons. First, blank space between catalogue entries inconsistently contributed to the size of some lines. Second, whenever there were tables within the text, there would be large gaps in spacing compared to the normal text, which in turn caused those tables to be read as divisions between catalogue entries. And third, even though entry headings were visually further to the left on the page than regular text, and therefore should have had the smallest x coordinates, the materiality of the printed page was inconsistently represented as digital data, and so presented regular lines with small x coordinates that could be read - using this approach - as headings.

Final Approach

Entry Detection

Our chosen approach uses the data in the PAGE XML schema, and is bespoke to the data for the Catalogue of books printed in the 15th century now at the British Museum as produced by Transkribus (and indeed, the version of Transkribus: having built our code around some initial exports, running it over the later volumes - which had been digitised last - threw an error due to some slight changes to the exported XML schema).

The code takes the XML input and finds entries using a content-based approach that looks for features at the start and end of each catalogue entry. Indeed, after experimenting with different approaches, the most consistent way to detect the catalogue entries was to:

  1. Find the “reference number” (e.g. IB. 39624) which is always present at the end of an entry.
  2. Find a date that is always present after an entry heading.

This gave us the ability to contextually infer the presence of a split between two catalogue entries, the main limitation of which is the quality of the Optical Character Recognition (OCR) at the point at which the references and dates occur in the printed volumes.
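As a rough sketch of that splitting logic in Python (the project's actual regular expressions live in the GitHub repo, so the patterns below are simplified illustrations of the reference-number and date checks):

    import re

    # Simplified, illustrative patterns - not the project's exact expressions.
    REFERENCE_RE = re.compile(r"\bI[ABC]\.?\s?\d{4,6}\b")  # e.g. "IB. 39624" closes an entry
    DATE_RE = re.compile(r"\b1[45]\d{2}\b")                # a 15th-century year after a heading

    def split_entries(lines):
        """Group OCR text lines into entries, closing an entry at each reference number."""
        entries, current = [], []
        for line in lines:
            current.append(line)
            if REFERENCE_RE.search(line):
                entries.append(current)
                current = []
        if current:  # any trailing lines after the final reference
            entries.append(current)
        return entries

    def has_dated_heading(entry_lines, lookahead=3):
        """Sanity check: a dated heading should appear near the top of each entry."""
        return any(DATE_RE.search(line) for line in entry_lines[:lookahead])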

 

An image of a digitised page with a catalogue entry and the corresponding text output in XML format
XML of a detected entry

 

Language Detection

The reason for dividing catalogue entries in this way was to facilitate analysis of the catalogue data, specifically analysis that sought to define the linguistic character of descriptions in the Catalogue of books printed in the 15th century now at the British Museum and how those descriptions changed and evolved across the thirteen volumes. As segments of each catalogue entry contain text transcribed from the incunabula that was not written by a cataloguer (and is therefore not part of their cataloguing ‘voice’), and as those transcribed sections are in French, Dutch, Old English, and other languages that a machine could detect as not being modern English, one of the extensions we implemented to further facilitate research use of the final data was to label sections of each catalogue entry by language. This was achieved using a Python library for language detection and then - for a particular output type - replacing non-English sections of text with a placeholder (e.g. NON-ENGLISH SECTION). And whilst the language detection model does not detect Old English, and as a result varies in which language labels it assigns to those sections, the language detection was still able to break the blocks of text in each catalogue entry into English and non-English sections.
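A minimal sketch of that labelling step is shown below. The post only says a Python library for language detection was used, so the choice of the langdetect package here is an assumption.

    from langdetect import detect
    from langdetect.lang_detect_exception import LangDetectException

    def label_sections(sections):
        """Tag each block of text within an entry with a detected language code."""
        labelled = []
        for text in sections:
            try:
                lang = detect(text)
            except LangDetectException:  # e.g. empty or purely numeric blocks
                lang = "unknown"
            labelled.append((lang, text))
        return labelled

    def english_only(sections, placeholder="NON-ENGLISH SECTION"):
        """Replace non-English blocks with a placeholder, as in one of the output types."""
        return [text if lang == "en" else placeholder
                for lang, text in label_sections(sections)]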

 

Text files for catalogue entry number IB39624 showing the full text and the detected English-only sections.
Text outputs of the full and English-only sections of the catalogue entry

 

Poorly Scanned Pages

Another extension for this system was to use the input data to try and determine whether a page had been poorly scanned: for example, that the lines in the XML input read from one column straight into another as a single line (rather than the XML reading order following the visual signifiers of column breaks). This system detects poorly scanned pages by looking at the lengths of all lines in the page XML schema, establishing which lines deviate substantially from the mean line length, and if sufficient outliers are found then marking the page as poorly scanned.
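A minimal sketch of that outlier check might look like the following; the thresholds are illustrative rather than the values used in the project.

    from statistics import mean, pstdev

    def looks_poorly_scanned(line_lengths, z_threshold=2.0, max_outlier_ratio=0.15):
        """Flag a page when an unusually large share of its lines are length outliers."""
        if len(line_lengths) < 2:
            return False
        mu = mean(line_lengths)
        sigma = pstdev(line_lengths) or 1.0  # avoid dividing by zero on uniform pages
        outliers = [n for n in line_lengths if abs(n - mu) / sigma > z_threshold]
        return len(outliers) / len(line_lengths) > max_outlier_ratio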

Key Features

The key part of this system that can be taken and applied to a different problem is the method for detecting entries. We expect that the fundamental method of looking for marks in the page content to identify the start and end of catalogue entries in the XML files would be applicable to other data derived from printed catalogues. The only parts of the algorithm which would need changing for a new system are the regular expressions used to find the start and end of the catalogue entry headings. And as long as the XML input comes in the same schema, the code should be able to consistently divide up the volumes into individual catalogue entries.

31 March 2023

Mapping Caribbean Diasporic Networks through the Correspondence of Andrew Salkey

This is a guest post by Natalie Lucy, a PhD student at University College London, who recently undertook a British Library placement to work on a project Mapping Caribbean Diasporic Networks through the correspondence of Andrew Salkey.

Project Objectives

The project, supervised by curators Eleanor Casson and Stella Wisdom, focussed on the extensive correspondence contained within Andrew Salkey’s archive. One of the initial objectives was to digitally depict the movement of key Caribbean writers and artists, many of whom travelled between Britain and the Caribbean as well as the United States, Central and South America and Africa, as evidenced within the correspondence. Although Salkey corresponded with a diverse range of people, we therefore focused on the letters in his archive which were from Caribbean writers and academics and which illustrated patterns of movement of the Caribbean diaspora. Much of the correspondence stems from the 1960s and 1970s, a time when Andrew Salkey was particularly active both in the Caribbean Artists Movement and, as a writer and broadcaster, at the BBC.

Photograph of Andrew Salkey's head and shoulders in profile
Photograph of Andrew Salkey

Andrew Salkey was unusual not only for the panoply of writers, artists and politicians with whom he was connected, but also for the fact that he sustained those relationships, carefully preserving the correspondence which resulted from those networks. My personal interest in this project stemmed from the fact that my PhD seeks to consider the ways that the Caribbean trickster character, Anancy, has historically been reinvented to say something about heritage and identity. Significant to that question was the way that the Caribbean Artists Movement, a dynamic group of artists and writers formed in London in the mid-1960s, and of which Andrew Salkey was a founder, appropriated Anancy, reasserting him and the folktales to convey something of a literary ‘voice’ for the Caribbean. For this reason, I was also interested in the writing networks which were evidenced within the correspondence, together with their impact.

What is Gephi?

Prior to starting the project, Eleanor, who had catalogued the Andrew Salkey archive, and Digital Curator Stella had identified Gephi as a possible software application through which to visualise this data. Gephi has been used in a variety of projects, including several at Harvard University; examples of the breadth and diversity of those initiatives can be found here. Several of these projects have social networks or historical trading routes as their focus, with obvious parallels to this project. Others notably use correspondence as their main data.

Gathering the Data

Andrew Salkey was known as something of a chronicler. He was interested in letters and travel and was also a serious collector of stamps. As such, he had not only retained the majority of the letters he received but categorised them. Eleanor had originally identified potential correspondents who might be useful to the project, selecting writers who travelled widely, whose correspondence had been separately stored by Salkey, partly because of its volume, and who might be of wider interest to the public. These included the acclaimed Caribbean writers, Samuel Selvon, George Lamming, Jan Carew and Edward Kamau Brathwaite and publishers and political activists, Jessica and Eric Huntley.

Our initial intention was to limit the data to simple facts which could easily be gleaned from the letters. Gephi required that we record these in a spreadsheet, which had to conform to a particular format. In the first stages of the project, the data was confined to the dates and locations of the correspondence, information which could suggest the patterns of movement within the diaspora. However, the letters were so rich in detail that we ultimately recorded other information. This included any additional travel taken by the correspondents that was clearly evidenced in the letters, together with any passages from the correspondence which demonstrated either something of the nature and quality of the friendships or, alternatively, the mutual benefit of those relationships to the careers of so many of the writers.

Creating a visual network

Dr Duncan Hay was invited to collaborate with me on this project, as he has considerable expertise in this field: his research interests include web mapping for culture and heritage and data visualisation for literary criticism. After the initial data was collated, we discussed with Duncan what visualisations could be created. It became apparent early on that creating a visualisation of the social networks, as opposed to the patterns of movement, might be relatively straightforward via Gephi, an application which is particularly useful for this type of graph. I had prepared a spreadsheet, but Gephi requires the data to be presented in a strictly consistent way, which meant that any anomalies had to be eradicated and the data effectively ‘cleaned up’ using OpenRefine. Gephi also requires that information is presented by way of a system of ‘nodes’, ‘edges’ and ‘attributes’, with corresponding spreadsheet columns. In our project, the ‘nodes’ referred to Andrew Salkey and each of the correspondents and other individuals of interest who were specifically referred to within the correspondence. The ‘edges’ referred to the way that those people were connected which, in this case, was through correspondence. However, what added to the potential of the project was that these nodes and edges could be further described by reference to ‘attributes’. The possibility of assigning a range of ‘attributes’ to each of the correspondents allowed a wealth of additional information to be provided about the networks. As a consequence, and in order to make any visualisation as informative as possible, I also added brief biographical information for each of the writers and artists to be inputted as ‘attributes’, together with some explanation of the nature of the networks that were being illustrated.
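As a concrete illustration of that node/edge/attribute layout, the sketch below writes the two spreadsheets Gephi imports: a node table (Id, Label, plus attribute columns) and an edge table (Source, Target, plus attribute columns). The rows are invented placeholders rather than data from the archive.

    import csv

    nodes = [
        {"Id": "salkey", "Label": "Andrew Salkey", "Biography": "writer and broadcaster"},
        {"Id": "lamming", "Label": "George Lamming", "Biography": "Barbadian novelist"},
    ]
    edges = [
        {"Source": "lamming", "Target": "salkey", "Type": "Undirected",
         "Relationship": "correspondence"},
    ]

    # Write each table to a CSV file that Gephi's spreadsheet import can read.
    for filename, rows in (("nodes.csv", nodes), ("edges.csv", edges)):
        with open(filename, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)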

The visual illustration below shows not only the quantity of letters from the sample of correspondents to Andrew Salkey (the pink lines), but also which other correspondents formed part of those networks and were referenced as friends or contacts within specific items of correspondence. For example, George Lamming references the academic Rex Nettleford and the writer and activist Claudia Jones, founder of the Notting Hill Carnival, in his correspondence, connections which are depicted in grey.

Data visualisation of nodes and lines representing Andrew Salkey's Correspondence Network
Gephi: Andrew Salkey correspondence network

The aim was, however, for the visualisation to also be interactive. This required considerable further manipulation of the format and tools. In this illustration you can see the information that is revealed about the prominent Barbadian writer George Lamming, which, in an interactive format, can be accessed via the ‘i’ symbols beside many of the nodes coloured in green.

Whilst Gephi was a useful tool with which to illustrate the networks, it was less helpful as a way to demonstrate the patterns of movement, one of the primary objectives of the project. A challenge was, therefore, to create a map which could be both interactive and illustrative of the specific locations of the correspondents as well as their movement over time. With Duncan’s input and expertise, we opted for a hybrid approach, utilising two principal ways to illustrate the data: we used Gephi to create a visualisation of the ‘networks’ (above) and another software tool, Kepler.gl, to show the diasporic movement.
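To give a sense of what Kepler.gl needs in order to animate that movement, here is a minimal, hypothetical sketch of the per-letter records: a date to drive the time playback plus origin and destination coordinates for an arc layer. The names, dates and coordinates are placeholders, not values from the archive.

    import csv

    letters = [
        {"correspondent": "George Lamming", "date": "1967-05-12",
         "origin_lat": 13.10, "origin_lng": -59.61,  # Bridgetown, Barbados
         "dest_lat": 51.53, "dest_lng": -0.13},      # London
    ]

    # Saved as CSV, this table can be loaded into Kepler.gl and drawn as arcs over time.
    with open("salkey_correspondence_movement.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=letters[0].keys())
        writer.writeheader()
        writer.writerows(letters)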

A static version of what ultimately will be a ‘moving’ map (illustrating correspondence with reference to person, date and location) is shown below. As well as demonstrating patterns of movement, it should also be possible to access information about specific letters as well as their shelf numbers through this map, hopefully making the archive more accessible.

Data visualisation showing lines connecting countries on a map showing part of the Americas, Europe and Africa
Patterns of diasporic movement from Andrew Salkey's correspondence, illustrated in Kepler.gl

Whilst we are still exploring the potential of this project and how it might intersect with other areas of research and archives, it has already revealed something of the benefits of this type of data visualisation. For example, a project of this type could be used as an educational tool, providing something of a simple, but dynamic, introduction to the Caribbean Artists Movement. Being able to visualise the project has also allowed us to input information which confirms where specific letters of interest might be found within the archive. Ultimately, it is hoped that the project will offer ways to make a rich, yet arguably undervalued, archive more accessible to a wider audience with the potential to replicate something of an introductory model, or ‘pilot’ for further archives in the future. 

28 February 2023

Legacies of Catalogue Descriptions Project Events at Yale

In January James Baker and I visited the Lewis Walpole Library at Yale, which is the US partner of the Legacies of catalogue descriptions collaboration. The visit had to be postponed several times due to the pandemic, so we were delighted to finally meet in person with Cindy Roman, our counterpart at Yale. The main reason for the trip was to disseminate the findings of our project by running workshops on tools for computational analysis of catalogue data and delivering talks about Researching the Histories of Cataloguing to (Try to) Make Better Metadata. Two of these events were kindly hosted by Kayla Shipp, Programme Manager of the fabulous Franke Family Digital Humanities Lab (DH Lab).

A photo of Cindy Roman, Rossitza Atanassova, James Baker and Kayla Shipp standing in a line in the middle of the Yale Digital Humanities Lab
(left to right) Cindy Roman, Rossitza Atanassova, James Baker and Kayla Shipp in the Yale Digital Humanities Lab

This was my first visit to the Yale University campus, so I took the opportunity to explore its iconic library spaces, including the majestic Sterling Memorial Library building, a masterpiece of Gothic Revival architecture, and the world-renowned Beinecke Rare Book and Manuscript Library, whose glass tower inspired the King’s Library Tower at the British Library. As well as being amazing hubs for learning and research, the Library buildings and exhibition spaces are also open to public visitors. At the time of my visit I explored the early printed treasures on display at the Beinecke Library, the exhibit about Martin Luther King Jr’s connection with Yale and the splendid display of highlights from Yale’s Slavic collections, including Vladimir Nabokov’s CV for a job application to Yale and a family photo album that belonged to the Romanovs.

A selfie of Rossitza Atanassova with the building of the Sterling Memorial Library in the background
Outside Yale's Sterling Memorial Library

A real highlight of my visit was the day I spent at the Lewis Walpole Library (LWP), located in Farmington, about 40 miles from the Yale campus. The LWP is a research centre for eighteenth-century studies and an essential resource for the study of Horace Walpole. The collections, including important holdings of British prints and drawings, were donated to Yale by Wilmarth and Annie Lewis in the 1970s, together with several eighteenth-century historic buildings and land.

Prior to my arrival James had conducted archival research with the catalogues of the LWP satirical prints collections, a case study for our project. As well as visiting the modern reading room to take a look at the printed card catalogues, many in the hand of Mrs Lewis, we were given a tour of Mr and Mrs Lewis’ house, which is now used for classes, workshops and meetings. I enjoyed meeting the LWP staff and learned much about the history of the place, the collectors' lives and the LWP's current initiatives.

One of the historic buildings on the Lewis Walpole Library site - The Roots House, a white Georgian-style building with a terrace, used to house visiting fellows and guests
The Root House which houses residential fellows

 

One of the historic buildings on the Lewis Walpole Library site - a red-coloured building surrounded by trees
Thomas Curricomp House

 

The main house, a white Georgian-style house, seen from the side, with the entrance to the Library on the left
The Cowles House, where Mr and Mrs Lewis lived

 

The two project events I was involved with took place at the Yale DH Lab. During the interactive workshop, Yale library staff, faculty and students worked through the training materials on using AntConc for computational analysis and performed a number of tasks with the LWP satirical prints descriptions. There were discussions about the different ways of querying the data and the suitability of this tool for use with non-European languages and scripts. It was great to hear that this approach could prove useful for querying and promoting Yale’s own open access metadata.

 

James talking to a group of people seated at a table, with a screen behind him showing some text data
James presenting at the workshop about AntConc
Rossitza standing next to a screen with a slide about her talk facing the audience
Rossitza presenting her research with incunabula catalogue descriptions

 

The talks addressed questions around cataloguing labour and curatorial voices, and the extent to which computational analysis enables new research questions and can assist practitioners with remedial work involving collections metadata. I spoke about my current RLUK fellowship project with the British Library incunabula descriptions, and in particular the history of cataloguing, the process of outputting text data and some hypotheses to be tested through computational analysis. The discussion that followed raised questions about the effort that goes into this type of work and the need to balance greater user access to library and archival collections against important considerations about the quality and provenance of metadata.

During my visit I had many interesting conversations with Yale Library staff, Nicole Bouché, Daniel Lovins and Daniel Dollar, and caught up with folks I had met at the 2022 IIIF Conference: Tripp Kirkpatrick, Jon Manton and Emmanuelle Delmas-Glass. I was curious to learn about recent organisational changes aimed at unifying the Yale special collections and enhancing digital access via IIIF metadata, and about the new roles of Director of Computational Data and Methods, in charge of the DH Lab, and Cultural Heritage Data Engineer, to transform Yale data into LOUD.

This has been a truly informative and enjoyable visit and my special thanks go to Cindy Roman and Kayla Shipp who hosted my visit and project events at the start of a busy term and to James for the opportunity to work with him on this project.

This blogpost is by Dr Rossitza Atanassova, Digital Curator for Digitisation, British Library. She is on Twitter @RossiAtanassova  and Mastodon @[email protected]

06 February 2023

A Year In Three Wikithons: The Lord Chamberlain's Plays

The second year of the Wikimedia residency has allowed us to pay specific attention to the work being done on the Lord Chamberlain’s Plays, specifically the excellent research project work by Professor Kate Dossett (University of Leeds). Kate teaches American History at the University of Leeds, and is currently working on ‘Black Cultural Archives & the Making of Black Histories: Archives of Surveillance and Black Transnational Theatre’, a project supported by an Independent Social Research Foundation Fellowship and a Fellowship from the Eccles Centre. Her work focuses on the understudied area of Black theatre history in the first half of the twentieth century, and when we had the chance to collaborate, we leapt on it!

One of the things we wanted to do was run a series of three Wikithons, each celebrating a different aspect of the collection: in this case, the role of women; the ways in which censorship impacted creativity for Black theatre makers; and the political surveillance of Black creatives. Alongside these Wikithons, we are developing a Wikibase structure to enable users to search the Lord Chamberlain’s Plays index cards from anywhere in the world. A blog post on this work is forthcoming.

What transpired from our Wikithon dream was a series of three excellent events, interactions and collaborative work with a number of exceptional researchers and historians, all mixed in with a year of administrative tumult as we felt the impact of numerous strikes (academic and transport), the Royal funeral and the ongoing implications of the pandemic. 

This was an important learning opportunity for us to examine the role and impact of Wikithons, and consider different methods of delivery and engagement, tying into bigger conversations happening around Wikipedia on an international scale. It was a year in three Wikithons!

Event One (March 2022)

Our first event took place in March 2022. Having only just gotten over the dreaded Covid myself, I felt the long-term impact of the pandemic keenly: we were just out of some winter restrictions, and we felt it was best to hold this event as an online session, due to the uncertainty of the months ahead. Further to this, we had to look at dates that would not interrupt or clash with the ongoing University and College Union strikes. Once we had this in hand, we were ready to open the (virtual) doors to Black Theatre and the Archive: Making Women Visible, 1900-1950.

We were lucky to have speakers from the Library, Alexander Lock and Laura Walker, to talk about and contextualise the materials, while Kate herself offered a thematic and political overview of the importance of the work we were to embark upon. Despite the strikes, the pandemic and the demands of early 2022, nine editors added over 1,600 words and 21 references across 84 edits in total. Changes made on this day have now been viewed over 25,000 times. For a small batch of changes, that is a significant impact! Articles edited included Elisabeth Welch, Anna Lucasta and Edric Connor. I was grateful to Stuart Prior and Dr Francesca Allfrey for the training support at this event, and to Heather Pascall from the News Reference Team who offered her expertise on the day. The British Newspaper Archive also gave us access to their online resource for this event, which was both generous and very helpful.

Image of Pauline Henriques, BBC UK Government, Public domain, via Wikimedia Commons

Event Two (November 2022)

After a summer of political upheaval, a royal funeral and further transport strikes, we finally made it to Leeds Playhouse on the 7th of November 2022. As luck would have it, there was a train strike running that day, but as most of our participants were local to Leeds, there was thankfully very little impact on our numbers. Leeds Playhouse was the perfect home for this Wikithon: Furnace Producer Rio Matchett was a fantastic ambassador for the venue, and made sure we were fed and watered in style. Hope Miyoba was there to help me deliver training in both sessions, and I am so grateful for her support, particularly as my laptop wasn’t working!

We took over the Playhouse for the full day, running Wikithon sessions in the morning and afternoon, with a lunchtime talk by Joe Williams of Heritage Corner Leeds, attended by both morning and afternoon participants as well as some members of the public. Joe’s talk on Sankofa Yorkshire was a brilliant overview of Black creativity in the Leeds area throughout history, and informed a lot of our conversation around the politics and practicalities of Wiki editing in an equitable way. Articles edited included Una Marson, a central figure both in Kate’s research and in the Lord Chamberlain’s Plays.

It was fantastic to be in person again, and to meet the excellent community of creatives at Leeds Playhouse. Joe’s talk was inspirational and the questions it provoked regarding the way in which the Wikimedia guidelines for notability can negatively impact the prevalence of Black creatives on Wiki were a much needed point of discussion.

Image of Leeds Playhouse illuminated at night
Anthony Robling, CC BY-SA 4.0, via Wikimedia Commons

Event Three (January 2023)

Our arrival at the iconic National Archives building at Kew was long awaited and months in the planning. Drs Jo Pugh and Kevin Searle were exceptionally helpful and supportive as we planned our way to the ‘Black Theater Making and Surveillance’ event in January 2023. We were delighted to be in the building, and even happier to welcome Perry Blankson of the Young Historians Project to present his work on The Secret War on Black Power in Britain and the Caribbean. Gathering in a central space in the Archives, Dr Searle curated an amazing selection of archival materials for participants to view and utilise, including documents from the Information Research Department.

Some of the documents on display at Kew, image by the author

Our conversations on this day turned towards the idea of Wiki notability and the use of primary sources in establishing authority on Wikipedia in particular. I was grateful once again to Stuart Prior and Dr Francesca Allfrey for their support and training assistance, and moreover for the thoughtful and important conversations we fostered around the ways in which the politics of the present day can cloud and impact what happens on Wiki and how events and politics can be reported. A truly breathtaking moment was when Dr Searle and his colleagues allowed us to look at the Windrush manifest, a material reminder of a significant and hugely important moment in modern Britain. It was wonderful, also, to welcome Dr Cara Rodway, Head of Research Development and Philip Abraham, Deputy Head of the Eccles Centre, to join us in seeing this final event in the Wikithon series.

Image of the National Archives building in Kew on a sunny day
The National Archives, Kew by Christopher Hilton, CC BY-SA 2.0, via Wikimedia Commons

Conclusion

Despite a year of unforeseeable events, disruption and obstacles, I am immensely proud of what this series of Wikithons achieved, bringing aspects of modern society into direct conversation with our literary archives, asking questions about race, equality and diversity in Britain. We were lucky to work with creative practitioners and speakers like Joe Williams and Perry Blankson, and to be afforded the chance to really think about what it is to edit Wiki, and to try to improve the world in this way. It has allowed me to think more deeply about the wider Wiki conversations around how best to engage with and train new Wiki editors, and how to look at collections in new and impactful ways. I am very grateful to the American Trust for the British Library and the Eccles Centre for American studies for their support in achieving this work.

This blogpost is by Dr Lucy Hinnie, Wikimedian in Residence, British Library. She is on Twitter @BL_Wikimedian.

28 October 2022

Learn more about Living with Machines at events this winter

Digital Curator, and Living with Machines Co-Investigator Dr Mia Ridge writes…

The Living with Machines research project is a collaboration between the British Library, The Alan Turing Institute and various partner universities. Our free exhibition at Leeds City Museum, Living with Machines: Human stories from the industrial age, opened at the end of July. Read on for information about adult events around the exhibition…

Museum Late: Living with Machines, Thursday 24 November, 2022

6 - 10pm Leeds City Museum • £5, booking essential https://my.leedstickethub.co.uk/19101

The first ever Museum Late at Leeds City Museum! Come along to experience the museum after hours with music, a pub quiz, weaving, informal workshops and chats with curators. Local food and drinks will be available in the main hall.

Full programme: https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/museum-late-living-with-machines/

Tickets: https://my.leedstickethub.co.uk/19101

Study Day: Living with Machines, Friday December 2, 2022

10:00 am - 4:00 pm Online • Free but booking essential: https://my.leedstickethub.co.uk/18775

A unique opportunity to hear experts in the field illuminate key themes from the exhibition and learn how exhibition co-curators found stories and objects to represent research work in AI and digital history. This study day is online via Zoom so that you can attend from anywhere.

Full programme: https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/living-with-machines-study-day/

Tickets: https://my.leedstickethub.co.uk/18775

Living with Machines Wikithon, Saturday January 7, 2023

1 – 4:30pm Leeds City Museum • Free but booking essential: https://my.leedstickethub.co.uk/19104

Ever wanted to try editing Wikipedia, but haven't known where to start? Join us for a session with our brilliant Wikipedian-in-residence to help improve Wikipedia’s coverage of local lives and topics at an editathon themed around our exhibition. 

Everyone is welcome. You won’t require any previous Wiki experience but please bring your own laptop for this event. Find out more, including how you can prepare, in my blog post on the Living with Machines site, Help fill gaps in Wikipedia: our Leeds editathon.

The exhibition closes the next day, so it really is your last chance to see it!

Full programme: https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/living-with-machines-wikithon-exploring-the-margins/

Tickets: https://my.leedstickethub.co.uk/19104

If you just want to try out something more hands on with textiles inspired by the exhibition, there's also a Peg Loom Weaving Workshop, and not one but two Christmas Wreath Workshops.

You can find out more about our exhibition on the Living with Machines website.


20 September 2022

Learn more about what AI means for us at Living with Machines events this autumn

Digital Curator, and Living with Machines Co-Investigator Dr Mia Ridge writes…

The Living with Machines research project is a collaboration between the British Library, The Alan Turing Institute and various partner universities. Our free exhibition at Leeds City Museum, Living with Machines: Human stories from the industrial age, opened at the end of July. Read on for information about adult events around the exhibition…

AI evening panels and workshop, September 2022

We’ve put together some great panels with expert speakers guaranteed to get you thinking about the impact of AI with their thought-provoking examples and questions. You'll have a chance to ask your own questions in the Q&A, and to mingle with other attendees over drinks.

We’ve also collaborated with AI Tech North to offer an exclusive workshop looking at the practical aspects of ethics in AI. If you’re using or considering AI-based services or tools, this might be for you. Our events are also part of the jam-packed programme of the Leeds Digital Festival #LeedsDigi22, where we’re in great company.

The role of AI in Creative and Cultural Industries

Thu, Sep 22, 17:30 – 19:45 BST

Leeds City Museum • Free but booking required

https://www.eventbrite.com/e/the-role-of-ai-in-creative-and-cultural-industries-tickets-395003043737

How will AI change what we wear, the TV and films we watch, what we read? 

Join our fabulous Chair Zillah Watson (independent consultant, ex-BBC) and panellists Rebecca O’Higgins (Founder KI-AH-NA), Laura Ellis (Head of Technology Forecasting, BBC) and Maja Maricevic (Head of Higher Education and Science, British Library) for an evening that'll help you understand the future of these industries for audiences and professionals alike.

Maja's written a blog post on The role of AI in creative and cultural industries with more background on this event.

 

Workshop: Developing ethical and fair AI for society and business

Thu, Sep 29, 13:30 - 17:00 BST

Leeds City Museum • Free but booking required

https://www.eventbrite.com/e/workshop-developing-ethical-and-fair-ai-for-society-and-business-tickets-400345623537

 

Panel: Developing ethical and fair AI for society and business

Thu, Sep 29, 17:30 – 19:45 BST

Leeds City Museum • Free but booking required

https://www.eventbrite.com/e/panel-developing-ethical-and-fair-ai-for-society-and-business-tickets-395020706567

AI is coming, so how do we live and work with it? What can we all do to develop ethical approaches to AI to help ensure a more equal and just society? 

Our expert Chair, Timandra Harkness, and panellists Sherin Mathew (Founder & CEO of AI Tech UK), Robbie Stamp (author and CEO at Bioss International), Keely Crockett (Professor in Computational Intelligence, Manchester Metropolitan University) and Andrew Dyson (Global Co-Chair of DLA Piper’s Data Protection, Privacy and Security Group) will present a range of perspectives on this important topic.

If you missed our autumn events, we also have a study day and Wikipedia editathon this winter. You can find out more about our exhibition on the Living with Machines website.


20 April 2022

Importing images into Zooniverse with a IIIF manifest: introducing an experimental feature

Digital Curator Dr Mia Ridge shares news from a collaboration between the British Library and Zooniverse that means you can more easily create crowdsourcing projects with cultural heritage collections. There's a related blog post on Zooniverse, Fun with IIIF.

IIIF manifests - text files that tell software how to display images, sound or video files alongside metadata and other information about them - might not sound exciting, but by linking to them, you can view and annotate collections from around the world. The IIIF (International Image Interoperability Framework) standard makes images (or audio, video or 3D files) more re-usable - they can be displayed on another site alongside the original metadata and information provided by the source institution. If an institution updates a manifest - perhaps adding information from updated cataloguing or crowdsourcing - any sites that display that image automatically get the updated metadata.

Playbill showing the title after other large text

We've posted before about how we used IIIF manifests as the basis for our In the Spotlight crowdsourced tasks on LibCrowds.com. Playbills are great candidates for crowdsourcing because they are hard to transcribe automatically, and the layout and information present varies a lot. Using IIIF meant that we could access images of playbills directly from the British Library servers without needing server space and extra processing to make local copies. You didn't need technical knowledge to copy a manifest address and add a new volume of playbills to In the Spotlight. This worked well for a couple of years, but over time we'd found it difficult to maintain bespoke software for LibCrowds.
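For readers who haven't worked with manifests before, here is a rough Python sketch of what 'reading a manifest' involves: fetching the JSON and listing the image URLs it points to. It assumes a IIIF Presentation 2.x manifest (with 'sequences' and 'canvases'; version 3.0 uses 'items' instead), and the manifest address shown is a placeholder.

    import json
    from urllib.request import urlopen

    def image_urls_from_manifest(manifest_url):
        """List the image URLs referenced by a IIIF Presentation 2.x manifest."""
        with urlopen(manifest_url) as response:
            manifest = json.load(response)
        urls = []
        for canvas in manifest["sequences"][0]["canvases"]:
            for image in canvas.get("images", []):
                urls.append(image["resource"]["@id"])
        return urls

    # Example with a placeholder address - substitute a real manifest URL:
    # for url in image_urls_from_manifest("https://example.org/iiif/manifest.json"):
    #     print(url)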

When we started looking for alternatives, the Zooniverse platform was an obvious option. Zooniverse hosts dozens of historical or cultural heritage projects, and hundreds of citizen science projects. It has millions of volunteers, and a 'project builder' that means anyone can create a crowdsourcing project - for free! We'd already started using Zooniverse for other Library crowdsourcing projects such as Living with Machines, which showed us how powerful the platform can be for reaching potential volunteers. 

But that experience also showed us how complicated the process of getting images and metadata onto Zooniverse could be. Using Zooniverse for volumes of playbills for In the Spotlight would require some specialist knowledge. We'd need to download images from our servers, resize them, generate a 'manifest' list of images and metadata, then upload it all to Zooniverse; and repeat that for each of the dozens of volumes of digitised playbills.

Fast forward to summer 2021, when we had the opportunity to put a small amount of funding into some development work by Zooniverse. I'd already collaborated with Sam Blickhan at Zooniverse on the Collective Wisdom project, so it was easy to drop her a line and ask if they had any plans or interest in supporting IIIF. It turns out they had, but hadn't previously had the necessary resources or an interested organisation.

We came up with a brief outline of what the work needed to do, taking the ability to recreate some of the functionality of In the Spotlight on Zooniverse as a goal. Therefore, 'the ability to add subject sets via IIIF manifest links' was key. ('Subject set' is Zooniverse-speak for 'set of images or other media' that are the basis of crowdsourcing tasks.) And of course we wanted the ability to set up some crowdsourcing tasks with those items… The Zooniverse developer, Jim O'Donnell, shared his work in progress on GitHub, and I was very easily able to set up a test project and ask people to help create sample data for further testing. 

If you have a Zooniverse project and a IIIF address to hand, you can try out the import for yourself: add 'subject-sets/iiif?env=production' to your project builder URL. For example, if your project is number #xxx then the URL to access the IIIF manifest import would be https://www.zooniverse.org/lab/xxx/subject-sets/iiif?env=production

Paste a manifest URL into the box. The platform parses the file to present a list of metadata fields, which you can flag as hidden or visible in the subject viewer (public task interface). When you're happy, you can click a button to upload the manifest as a new subject set (like a folder of items), and your images are imported. (Don't worry if it says '0 subjects'.)

 

Screenshot of manifest import screen

You can try out our live task and help create real data for testing ingest processes at https://frontend.preview.zooniverse.org/projects/bldigital/in-the-spotlight/classify

This is a very brief introduction, with more to come on managing data exports and IIIF annotations once you've set up, tested and launched a crowdsourced workflow (task). We'd love to hear from you - how might this be useful? What issues do you foresee? How might you want to expand or build on this functionality? Email [email protected] or tweet @mia_out @LibCrowds. You can also comment on GitHub https://github.com/zooniverse/Panoptes-Front-End/pull/6095 or https://github.com/zooniverse/iiif-annotations

Digital work in libraries is always collaborative, so I'd like to thank British Library colleagues in Finance, Procurement, Technology, Collection Metadata Services and various Collections departments; the Zooniverse volunteers who helped test our first task and of course the Zooniverse team, especially Sam, Jim and Chris for their work on this.

 
