Digital scholarship blog

Enabling innovative research with British Library digital collections


07 May 2024

Recovered Pages: Computing for Cultural Heritage Student Projects

The British Library is continuing to recover from last year’s cyber-attack. While our teams work to restore our services safely and securely, one of our goals in the Digital Research Team is to get some of the information from our currently inaccessible web pages into an easily readable and shareable format. We’ll be sharing these pages via blog posts here, with information recovered from the Wayback Machine, a fantastic initiative of the Internet Archive.  

The next page in this series is all about the student projects that came out of our Computing for Cultural Heritage project with the National Archives and Birkbeck University. This student project page was captured by the Wayback Machine on 7 June 2023.  

 

Computing for Cultural Heritage Student Projects

computing for cultural heritage logo - an image of a laptop with bookshelves as the screen saver

This page provides abstracts for a selection of student projects undertaken as part of a one-year part-time Postgraduate Certificate (PGCert), Computing for Cultural Heritage, co-developed by British Library, National Archives and Birkbeck University and funded by the Institute of Coding as part of a £4.8 million University skills drive.

“I have gone from not being able to print 'hello' in Python to writing some relatively complex programs and having a much greater understanding of data science and how it is applicable to my work.”

- Jessica Green  

Key points

  • The aim of the trial was to provide professionals working in the cultural heritage sector with an understanding of basic programming and computational analytic tools to support them in their daily work 
  • During the Autumn & Spring terms (October 2019-April 2020), 12 staff members from British Library and 8 staff members from The National Archives completed two new trial modules at Birkbeck University: Demystifying computing for heritage professionals and Work-based Project 
  • Birkbeck University have now launched the Applied Data Science (Postgraduate Certificate) based on the outcomes of the trial

Student Projects

 

Transforming Physical Labels into Digital References 

Sotirios Alpanis, British Library
This project aims to use computing to convert data collected during the preparation of archive material for digitisation into a tool that can verify and validate image captures, and subsequently label them. This will take as its input physical information about each document being digitised, perform and facilitate a series of validations throughout image capture and quality assurance and result in an xml file containing a map of physical labels to digital files. The project will take place within the British Library/Qatar Foundation Partnership (BL/QFP), which is digitising archive material for display on the QDL.qa.  

Enhancing national thesis metadata with persistent identifiers

Jenny Basford, British Library 
Working with data from ISNI (International Standard Name Identifier) Agency and EThOS (Electronic Theses Online Service), both based at the British Library, I intend to enhance the metadata of both databases by identifying doctoral supervisors in thesis metadata and matching these data with ISNI holdings. This work will also feed into the European-funded FREYA project, which is concerned with the use of a wide variety of persistent identifiers across the research landscape to improve openness in research culture and infrastructure through Linked Data applications.

A software tool to support the social media activities of the Unlocking Our Sound Heritage Project

Lucia Cavorsi, British Library
Video
I would like to design a software tool able to flag forthcoming anniversaries by comparing all the dates present in SAMI (sound and moving image catalogue – Sound Archive) with the current date. The aim of this tool is to suggest potential content for the Sound Archive’s social media posts. Useful dates in SAMI which could be matched with the current date and provide material for tweets are birth and death dates of performers or authors, radio programme broadcast dates, and recording dates. I would like this tool to also match the subjects currently present in SAMI with the subjects featured in the list of anniversaries for 2020 which the social media team uses, for example anniversaries like ‘International HIV day’ and ‘International day of Lesbian visibility’. A Windows pop-up message will be designed for anniversary notifications on the day. If time permits, it would also be useful to analyse which hashtags have been used over the last year by the people who follow, or are followed by, the Sound Archive Twitter account. By extracting a list of these hashtags, further and more sound-related anniversaries could be added to the list of anniversaries currently used by UOSH’s social media team.
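The date-matching idea described above can be sketched in a few lines of Python. This is only an illustration: the record layout and the field names (`title`, `kind`, `date`) are hypothetical and are not taken from SAMI.

```python
from datetime import date

def find_anniversaries(records, today):
    """Return records whose date shares a month and day with `today`."""
    matches = []
    for record in records:
        event = record["date"]
        same_day = (event.month, event.day) == (today.month, today.day)
        if same_day and event.year < today.year:
            # Add the number of years elapsed so a post can mention it.
            matches.append({**record, "years_ago": today.year - event.year})
    return matches

# Hypothetical sample records; a real tool would read a catalogue export.
sample_records = [
    {"title": "Recording A", "kind": "recording", "date": date(1950, 5, 7)},
    {"title": "Broadcast B", "kind": "broadcast", "date": date(1975, 12, 1)},
]

if __name__ == "__main__":
    for hit in find_anniversaries(sample_records, date.today()):
        print(f"{hit['years_ago']} years since {hit['title']} ({hit['kind']})")
```

Passing `today` as a parameter, rather than calling `date.today()` inside the function, keeps the matching logic easy to check against known dates.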

Computing Cholera: Topic modelling the catalogue entries of the General Board of Health

Christopher Day, The National Archives
Blog / Other
The correspondence of the General Board of Health (1848–1871) documents the work of a body set up to deal with cholera epidemics in a period where some English homes were so filthy as to be described as “mere pigholes not fit for human beings”. Individual descriptions for each of these over 89,000 letters are available on Discovery, The National Archives (UK)’s catalogue. Now, some 170 years later, access to the letters themselves has been disrupted by another epidemic, COVID-19. This paper examines how data science can be used to repurpose archival catalogue descriptions, initially created to enhance the ‘human findability’ of records (and favoured by many UK archives due to high digitisation costs), for large-scale computational analysis. The records of the General Board will be used as a case study: their catalogue descriptions topic modelled using a latent Dirichlet allocation model, visualised, and analysed – giving an insight into how new sanitary regulations were negotiated with a divided public during an epidemic. The paper then explores the validity of using the descriptions of historical sources as a source in their own right; and asks how, during a time of restricted archival access, metadata can be used to continue research.

An Automated Text Extraction Tool for Use on Digitised Maps

Nicholas Dykes, British Library
Blog / Video
Researchers of history often have difficulty geo-locating historical place names in Africa. I would like to apply automated transcription techniques to a digitised archive of historical maps of Africa to create a resource that will allow users to search for text, and discover where, and on which maps that text can be found. This will enable identification and analysis both of historical place names and of other text, such as topographical descriptions. I propose to develop a software tool in Python that will send images stored locally to the Google Vision API, and retrieve and process a response for each image, consisting of a JSON file containing the text found, pixel coordinate bounding boxes for each instance of text, and a confidence score. The tool will also create a copy of each image with the text instances highlighted. I will experiment with the parameters of the API in order to achieve the most accurate results.  I will incorporate a routine that will store each related JSON file and highlighted image together in a separate folder for each map image, and create an Excel spreadsheet containing text results, confidence scores, links to relevant image folders, and hyperlinks to high-res images hosted on the BL website. The spreadsheet and subfolders will then be packaged together into a single downloadable resource.  The finished software tool will have the capability to create a similar resource of interlinked spreadsheet and subfolders from any batch of images.
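The processing step described above, turning one JSON response per image into spreadsheet rows, might look something like the sketch below. It makes several assumptions: that the response is a dictionary with a `textAnnotations` list whose entries carry `description` and `boundingPoly.vertices` (this should be checked against the Google Vision API documentation), that a CSV file stands in for the Excel spreadsheet, and that the sample response and place names are invented for illustration. Confidence scores and the API call itself are left out.

```python
import csv
import io

FIELDS = ["image", "text", "left", "top", "right", "bottom"]

def annotation_rows(response, image_name):
    """Flatten text annotations into rows of text plus a pixel bounding box."""
    rows = []
    for annotation in response.get("textAnnotations", []):
        vertices = annotation.get("boundingPoly", {}).get("vertices", [])
        if not vertices:
            continue
        # A missing coordinate is treated as 0 (an assumption about the data).
        xs = [v.get("x", 0) for v in vertices]
        ys = [v.get("y", 0) for v in vertices]
        rows.append({
            "image": image_name,
            "text": annotation.get("description", ""),
            "left": min(xs), "top": min(ys),
            "right": max(xs), "bottom": max(ys),
        })
    return rows

def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Invented sample response, shaped as assumed above.
sample_response = {
    "textAnnotations": [
        {"description": "Timbuktu",
         "boundingPoly": {"vertices": [{"x": 120, "y": 40}, {"x": 210, "y": 40},
                                       {"x": 210, "y": 62}, {"x": 120, "y": 62}]}},
        {"description": "R. Niger",
         "boundingPoly": {"vertices": [{"y": 300}, {"x": 95, "y": 300},
                                       {"x": 95, "y": 318}, {"y": 318}]}},
    ]
}

rows = annotation_rows(sample_response, "map_001.jpg")
```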

Reconstituting a Deconstructed Dataset using Python and SQLite

Alex Green, The National Archives
Video
For this project I will rebuild a database and establish the referential integrity of the data from CSV files using Python and SQLite. To do this I will need to study the data, read the documentation, draw an entity relationship diagram and learn more about relational databases. I want to enable users to query the data as they would have been able to in the past. I will then make the code reusable so it can be used to rebuild other databases, testing it with a further two datasets in CSV form. As an additional challenge, I plan to rearrange the data to meet the principles of ‘tidy data’ to aid data analysis.
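As a minimal sketch of what rebuilding a database from CSV files with referential integrity can involve, the example below loads two related tables into SQLite and declares a foreign key between them. The table names, columns and CSV contents are hypothetical and are not drawn from the project's data.

```python
import csv
import io
import sqlite3

# Hypothetical CSV exports of two related tables.
COLLECTIONS_CSV = "id,name\n1,Maps\n2,Letters\n"
ITEMS_CSV = "id,collection_id,title\n10,1,Map of Africa\n11,2,Letter to the Board\n"

def rebuild(conn):
    """Create both tables with a foreign key, then load the CSV rows."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, "
        "collection_id INTEGER NOT NULL REFERENCES collections(id), "
        "title TEXT NOT NULL)"
    )
    # Parent rows go in first so that child rows have something to reference.
    for row in csv.DictReader(io.StringIO(COLLECTIONS_CSV)):
        conn.execute("INSERT INTO collections VALUES (?, ?)", (int(row["id"]), row["name"]))
    for row in csv.DictReader(io.StringIO(ITEMS_CSV)):
        conn.execute(
            "INSERT INTO items VALUES (?, ?, ?)",
            (int(row["id"]), int(row["collection_id"]), row["title"]),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
rebuild(conn)

# A join across the restored relationship, as a user query might do.
rows = conn.execute(
    "SELECT items.title, collections.name FROM items "
    "JOIN collections ON collections.id = items.collection_id "
    "ORDER BY items.id"
).fetchall()
```

With the foreign key in place, an attempt to insert an item pointing at a collection that does not exist is rejected with `sqlite3.IntegrityError`, which is one way of checking that the relationships in the original data have been restored.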

PIMMS: Developing a Model Pre-Ingest Metadata Management System at the British Library

Jessica Green, British Library
GitHub / Video
I am proposing a solution to analysing and preparing for ingest a vast amount of ‘legacy’ BL digitised content into the future Digital Asset Management System (DAMPS). This involves building a prototype for a SQL database to aggregate metadata about digitised content and preparing for SIP creation. In addition, I will write basic queries to aid in our ongoing analysis about these TIFF files, including planning for storage, copyright, digital preservation and duplicate analysis. I will use Python to import sample metadata from BL sources like SharePoint, Excel and BL catalogues – currently used for analysis of ‘live’ and ‘legacy’ digitised BL collections. There is at least 1 PB of digitised content on the BL networks alone, as well as on external media such as hard-drives and CDs. We plan to only ingest one copy of each digitised TIFF file set and need to ensure that the metadata is accurate and up-to-date at the point of ingest. This database, the Pre-Ingest Metadata Management System (PIMMS), could serve as a central metadata repository for legacy digitised BL collections until then. I look forward to using Python and SQL, as well as drawing on the coding skills from others, to make these processes more efficient and effective going forward.

Exploring, cleaning and visualising catalogue metadata

Alex Hailey, British Library
Blog / Video
Working with catalogue metadata for the India Office Records (IOR) I will undertake three tasks: 1) converting c. 430,000 IOR/E index entries to descriptions within the relevant volume entries; 2) producing an SQL database for 46,500 IOR/P descriptions, allowing enhanced search when compared with the BL catalogue; and 3) creating Python scripts for searching, analysis and visualisation, to be demonstrated on dataset(s) and delivered through Jupyter Notebooks.

Automatic generation of unique reference numbers for structured archival data

Graham Jevon, British Library
Blog / Video / GitHub
The British Library’s Endangered Archives Programme (EAP) funds the digital preservation of endangered archival material around the world. Third party researchers digitise material and send the content to the British Library. This is accompanied by an Excel spreadsheet containing metadata that describes the digitised content. EAP’s main task is to clean, validate, and enhance the metadata prior to ingesting it into the Library’s cataloguing system (IAMS). One of these tasks is the creation of unique catalogue reference numbers for each record (each row of data on the spreadsheet). This is a predominantly manual process that is potentially time consuming and subject to human inputting errors. This project seeks to solve this problem. The intention is to create a Windows executable program that will enable users to upload a csv file, enter a prefix, and then click generate. The instant result will be an export of a new csv file, which contains the data from the original csv file plus automatically generated catalogue reference numbers. These reference numbers are not random. They are structured in accordance with an ordered archival hierarchy. The program will include additional flexibility to account for several variables, including language encoding, computational efficiency, data validation, and wider re-use beyond EAP and the British Library.
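To illustrate how reference numbers can follow an ordered hierarchy rather than being random, here is a small sketch. The numbering scheme is an assumption made for the example, not the scheme used by EAP: each row carries a hypothetical `level` column (1 for the top of the hierarchy, 2 for its children, and so on), and the reference is the prefix followed by a running counter for each level.

```python
import csv
import io

def add_references(rows, prefix, level_field="level"):
    """Yield copies of rows with a 'reference' reflecting their hierarchy position."""
    counters = []  # counters[i] is the running number at depth i + 1
    for row in rows:
        depth = int(row[level_field])
        if depth < 1 or depth > len(counters) + 1:
            raise ValueError(f"row skips a hierarchy level: {row}")
        # Forget counters for deeper levels, then advance or start this level.
        del counters[depth:]
        if len(counters) == depth:
            counters[-1] += 1
        else:
            counters.append(1)
        yield {**row, "reference": "/".join([prefix, *map(str, counters)])}

# Hypothetical input, in the order the records appear in the spreadsheet.
SAMPLE_CSV = "level,title\n1,Series A\n2,File one\n3,Item i\n3,Item ii\n2,File two\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
referenced = list(add_references(rows, "DEMO001"))
references = [row["reference"] for row in referenced]
```

The sketch relies on the rows already being in hierarchical order, and it raises an error when a row jumps more than one level deeper than its predecessor, which is a simple form of the data validation mentioned above.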

Automating Metadata Extraction in Born Digital Processing

Callum McKean, British Library
Video
To automate the metadata extraction section of the Library’s current work-flow for born-digital processing using Python, then interrogate and collate information in new ways using the SQLite module.

Analysis of peak customer interactions with Reference staff at the British Library: a software solution

Jaimee McRoberts, British Library
Video
The British Library, facing on-going budget constraints, has a need to efficiently deploy Reference Services staff during peak periods of demand. The service would benefit from analysis of existing statistical data recording the timestamp of each customer interaction at a Reference Desk. In order to do this, a software solution is required to extract, analyse, and output the necessary data. This project report demonstrates a solution utilising Python alongside the pandas library which has successfully achieved the required data analysis.
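The kind of analysis described above, counting interactions by time slot to find peaks, can be sketched as follows. The project itself used pandas; this sketch uses only the Python standard library so that it runs without extra packages, and the timestamps are invented.

```python
from collections import Counter
from datetime import datetime

def busiest_slots(timestamps, top=3):
    """Count interactions per (weekday, hour) slot and return the busiest ones.

    Weekdays are numbered 0 (Monday) to 6 (Sunday).
    """
    counts = Counter()
    for stamp in timestamps:
        moment = datetime.fromisoformat(stamp)
        counts[(moment.weekday(), moment.hour)] += 1
    return counts.most_common(top)

# Invented sample of enquiry timestamps in ISO 8601 format.
sample = [
    "2020-03-02T10:05:00",
    "2020-03-02T10:40:00",
    "2020-03-02T14:15:00",
    "2020-03-03T10:20:00",
]

peak = busiest_slots(sample, top=1)
```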

Enhancing the data in the Manorial Documents Register (MDR) and making it more accessible

Elisabeth Novitski, The National Archives
Video
To develop computer scripts that will take the data from the existing separate and inconsistently formatted files and merge them into a consistent and organised dataset. This data will be loaded into the Manorial Documents Register (MDR) and National Register of Archives (NRA) to provide the user with improved search ability and access to the manorial document information.

Automating data analysis for collection care research at The National Archives: spectral and textual data

Lucia Pereira Pardo, The National Archives
The day-to-day work of a conservation scientist working for the care of an archival collection involves acquiring experimental data from the varied range of materials present in the physical records (inks, pigments, dyes, binding media, paper, parchment, photographs, textiles, degradation and restoration products, among others). To this end, we use multiple and complementary analytical and testing techniques, such as X-ray fluorescence (XRF), Fourier Transform Infrared (FTIR) and Fibre Optic Reflectance spectroscopies (FORS), multispectral imaging (MSI), colour and gloss measurements, microfading (MFT) and other accelerated ageing tests.  The outcome of these analyses is a heterogeneous and often large dataset, which can be challenging and time-consuming to process and analyse. Therefore, the objective of this project is to automate these tasks when possible, or at least to apply computing techniques to optimise the time and efforts invested in routine operations, so that resources are freed for actual research and more specialised and creative tasks dealing with the interpretation of the results.

Improving efficiencies in content development through batch processing and the automation of workloads

Harriet Roden, British Library
Video
To support and enrich the curriculum, the British Library’s Digital Learning team produces large-scale content packages for online learners through individual projects. Because content delivery relies on other internal teams within the workflow, a substantial amount of resource is spent on routine tasks that duplicate collection metadata across various databases. To reduce inefficiencies, increase productivity and improve reliability, my project aimed to alleviate pressures across the workflow through workload automation, delivered in four separate phases.

The Botish Library: building a poetry printing machine with Python

Giulia Carla Rossi, British Library
Blog / Video
This project aims to build a poetry printing machine, as a creative output that unites traditional content, new media and Python. The poems will be sourced from the British Library Digitised Books dataset collection, available under Public Domain Mark; I will sort through the datasets and identify which titles can be categorised as poetry using Python. I will then create a new dataset comprising these poetry books and relative metadata, which will then be connected to the printer with a Python script. The poetry printing machine will print randomized poems from this new dataset, together with some metadata (e.g. poem title, book title, author and shelfmark ID) that will allow users to easily identify the book.

Automating data entry in the UOSH Tracking Database

Chris Weaver, British Library
The proposed software solution is the creation of a Python script (to feature as a module in a larger script) to extract data from a web-based tool (either by obtaining data in JSON format via the site's API or by accessing the database powering the site directly). The data obtained is then formatted and inserted into corresponding fields in a Microsoft SQL Server database.

Final Module

Following the completion of the trial, participants had the opportunity to complete their PGCert in Applied Data Science by attending the final module, Analytic Tools for Information Professionals, which was part of the official course launched last autumn. We followed up with some of the participants to hear more about their experience of the full course:

“The third and final module of the computing for cultural heritage course was not only fascinating and enjoyable, it was also really pertinent to my job and I was immediately able to put the skills I learned into practice.  

The majority of the third module focussed on machine learning. We studied a number of different methods and one of these proved invaluable to the Agents of Enslavement research project I am currently leading. This project included a crowdsourcing task which asked the public to draw rectangles around four different types of newspaper advertisement. The purpose of the task was to use the coordinates of these rectangles to crop the images and create a dataset of adverts that can then be analysed for research purposes. To help ensure that no adverts were missed and to account for individual errors, each image was classified by five different people.  

One of my biggest technical challenges was to find a way of aggregating the rectangles drawn by five different people on a single page in order to calculate the rectangles of best fit. If each person only drew one rectangle, it was relatively easy for me to aggregate the results using the coding skills I had developed in the first two modules. I could simply find the average (or mean) of the five different classification attempts. But what if people identified several adverts and therefore drew multiple rectangles on a single page? For example, what if person one drew a rectangle around only one advert in the top left corner of the page; people two and three drew two rectangles on the same page, one in the top left and one in the top right; and people four and five drew rectangles around four adverts on the same page (one in each corner). How would I be able to create a piece of code that knew how to aggregate the coordinates of all the rectangles drawn in the top left and to separately aggregate the coordinates of all the rectangles drawn in the bottom right, and so on?  

One solution to this problem was to use an unsupervised machine learning method to cluster the coordinates before running the aggregation method. Much to my amazement, this worked perfectly and enabled me to successfully process the total of 92,218 rectangles that were drawn and create an aggregated dataset of more than 25,000 unique newspaper adverts.” 

-Graham Jevon, EAP Cataloguer; BL Endangered Archives Programme 
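The cluster-then-aggregate approach described in the quote above can be illustrated with a small sketch. The quote does not name the clustering method that was used, so the example substitutes a very simple stand-in: rectangles are grouped when their centres lie within a chosen distance of the first rectangle in a group, and each group is then averaged coordinate by coordinate. The coordinates and the distance threshold are invented.

```python
from statistics import mean

def centre(rect):
    """Return the centre point of a rectangle given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = rect
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def cluster_rectangles(rects, max_distance):
    """Group rectangles whose centres lie close to a group's first member."""
    clusters = []
    for rect in rects:
        cx, cy = centre(rect)
        for cluster in clusters:
            sx, sy = centre(cluster[0])
            if ((cx - sx) ** 2 + (cy - sy) ** 2) ** 0.5 <= max_distance:
                cluster.append(rect)
                break
        else:
            # No existing group is near enough, so start a new one.
            clusters.append([rect])
    return clusters

def aggregate(cluster):
    """Average each coordinate across the rectangles in one group."""
    return tuple(mean(values) for values in zip(*cluster))

# Invented rectangles drawn by different people around two adverts.
sample = [
    (10, 10, 110, 60),
    (500, 10, 600, 60),
    (12, 8, 108, 62),
    (504, 14, 596, 56),
    (8, 12, 112, 58),
]

clusters = cluster_rectangles(sample, max_distance=50)
best_fit = [aggregate(cluster) for cluster in clusters]
```

A threshold-based grouping like this only works when adverts are well separated on the page; a real dataset would more likely call for an established unsupervised clustering algorithm.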

“The final module of the course was in some ways the most challenging — requiring a lot of us to dust off the statistics and algebra parts of our brain. However, I think, it was also the most powerful; revealing how machine learning approaches can help us to uncover hidden knowledge and patterns in a huge variety of different areas.  

Completing the course during COVID meant that collection access was limited, so I ended up completing a case study examining how generic tropes have evolved in science fiction across time using a dataset extracted from GoodReads. This work proved to be exceptionally useful in helping me to think about how computers understand language differently; and how we can leverage their ability to make statistical inferences in order to support our own, qualitative analyses. 

In my own collection area, working with born digital archives in Contemporary Archives and Manuscripts, we treat draft material — of novels, poems or anything else — as very important to understanding the creative process. I am excited to apply some of these techniques — particularly Unsupervised Machine Learning — to examine the hidden relationships between draft material in some of our creative archives. 

The course has provided many, many avenues of potential enquiry like this and I’m excited to see the projects that its graduates undertake across the Library.” 

- Callum McKean, Lead Curator, Digital; Contemporary British Collection

“I really enjoyed the Analytics Tools for Data Science module. As a data science novice, I came to the course with limited theoretical knowledge of how data science tools could be applied to answer research questions. The choice of using real-life data to solve queries specific to professionals in the cultural heritage sector was really appreciated as it made everyday applications of the tools and code more tangible. I can see now how curators’ expertise and specialised knowledge could be combined with tools for data analysis to further understanding of and meaningful research in their own collection area.”

- Giulia Carla Rossi, Curator, Digital Publications; Contemporary British Collection

Please note this page was originally published in Feb 2021 and some of the resources, job titles and locations may now be out of date.

15 March 2024

Call for proposals open for DigiCAM25: Born-Digital Collections, Archives and Memory conference

Digital research in the arts and humanities has traditionally tended to focus on digitised physical objects and archives. However, born-digital cultural materials that originate and circulate across a range of digital formats and platforms are rapidly expanding and increasing in complexity, which raises opportunities and issues for research and archiving communities. Collecting, preserving, accessing and sharing born-digital objects and data presents a range of technical, legal and ethical challenges that, if unaddressed, threaten the archival and research futures of these vital cultural materials and records of the 21st century. Moreover, the environments, contexts and formats through which born-digital records are mediated necessitate reconceptualising the materials and practices we associate with cultural heritage and memory. Research and practitioner communities working with born-digital materials are growing and their interests are varied, from digital cultures and intangible cultural heritage to web archives, electronic literature and social media.

To explore and discuss issues relating to born-digital cultural heritage, the Digital Humanities Research Hub at the School of Advanced Study, University of London, in collaboration with British Library curators, colleagues from Aarhus University and the Endangered Material Knowledge Programme at the British Museum, are currently inviting submissions for the inaugural Born-Digital Collections, Archives and Memory conference, which will be hosted at the University of London and online from 2-4 April 2025. The full call for proposals and submission portal is available at https://easychair.org/cfp/borndigital2025.

Text on image says Born-Digital Collections, Archives and Memory, 2 - 4 April 2025, School of Advanced Study, University of London

This international conference seeks to further an interdisciplinary and cross-sectoral discussion on how the born-digital transforms what and how we research in the humanities. We welcome contributions from researchers and practitioners involved in any way in accessing or developing born-digital collections and archives, and interested in exploring the novel and transformative effects of born-digital cultural heritage. Areas of particular (but not exclusive) interest include:

  1. A broad range of born-digital objects and formats:
    • Web-based and networked heritage, including but not limited to websites, emails, social media platforms/content and other forms of personal communication
    • Software-based heritage, such as video games, mobile applications, computer-based artworks and installations, including approaches to archiving, preserving and understanding their source code
    • Born-digital narrative and artistic forms, such as electronic literature and born-digital art collections
    • Emerging formats and multimodal born-digital cultural heritage
    • Community-led and personal born-digital archives
    • Physical, intangible and digitised cultural heritage that has been remediated in a transformative way in born-digital formats and platforms
  2. Theoretical, methodological and creative approaches to engaging with born-digital collections and archives:
    • Approaches to researching the born-digital mediation of cultural memory
    • Histories and historiographies of born-digital technologies
    • Creative research uses and creative technologist approaches to born-digital materials
    • Experimental research approaches to engaging with born-digital objects, data and collections
    • Methodological reflections on using digital, quantitative and/or qualitative methods with born-digital objects, data and collections
    • Novel approaches to conceptualising born-digital and/or hybrid cultural heritage and archives
  3. Critical approaches to born-digital archiving, curation and preservation:
    • Critical archival studies and librarianship approaches to born-digital collections
    • Preserving and understanding obsolete media formats, including but not limited to CD-ROMs, floppy disks and other forms of optical and magnetic media
    • Preservation challenges associated with the platformisation of digital cultural production
    • Semantic technology, ontologies, metadata standards, markup languages and born-digital curation
    • Ethical approaches to collecting and accessing ‘difficult’ born-digital heritage, such as traumatic or offensive online materials
    • Risks and opportunities of generative AI in the context of born-digital archiving
  4. Access, training and frameworks for born-digital archiving and collecting:
    • Institutional, national and transnational approaches to born-digital archiving and collecting
    • Legal, trustworthy, ethical and environmentally sustainable frameworks for born-digital archiving and collecting, including attention to cybersecurity and safety concerns
    • Access, skills and training for born-digital research and archives
    • Inequalities of access to born-digital collecting and archiving infrastructures, including linguistic, geographic, economic, legal, cultural, technological and institutional barriers

Options for Submissions

A number of different submission types are welcomed and there will be an option for some presentations to be delivered online.

  • Conference papers (150-300 words)
    • Presentations lasting 20 minutes. Papers will be grouped with others on similar subjects or themes to form a complete session. There will be time for questions at the end of each session.
  • Panel sessions (100 word summary plus 150-200 words per paper)
    • Proposals should consist of three or four 20-minute papers. There will be time for questions at the end of each session.
  • Roundtables (200-300 word summary and 75-100 word bio for each speaker)
    • Proposals should include between three and five speakers, inclusive of a moderator, and each session will be no more than 90 minutes.
  • Posters, demos & showcases (100-200 words)
    • These can be traditional printed posters, digital-only posters, digital tool showcases, or software demonstrations. Please indicate the form your presentation will take in your submission.
    • If you propose a technical demonstration of some kind, please include details of technical equipment to be used and the nature of assistance (if any) required. Organisers will be able to provide a limited number of external monitors for digital posters and demonstrations, but participants will be expected to provide any specialist equipment required for their demonstration. Where appropriate, posters and demos may be made available online for virtual attendees to access.
  • Lightning talks (100-200 words)
    • Talks will be no more than 5 minutes and can be used to jump-start a conversation, pitch a new project, find potential collaborations, or try out a new idea. Reports on completed projects would be more appropriately given as 20-minute papers.
  • Workshops (150-300 words)
    • Please include details about the format, length, proposed topic, and intended audience.

Proposals will be reviewed by members of the programme committee. The peer review process will be double-blind, so no names or affiliations should appear on the submissions. The one exception is proposals for roundtable sessions, which should include the names of proposed participants. All authors and reviewers are required to adhere to the conference Code of Conduct.

The submission deadline for proposals, originally 15 May 2024, has been extended to 7 June 2024, and notification of acceptance is now scheduled for early August 2024. Organisers plan to make a number of bursaries available to presenters to cover the cost of attendance and details about these will be shared when notifications are sent. 

Key Information:

  • Dates: 2 - 4 April 2025
  • Venue: University of London, London, UK & online
  • Call for papers deadline: 7 June 2024
  • Notification of acceptance: early August 2024
  • Submission link: https://easychair.org/cfp/borndigital2025

Further details can be found on the conference website and the call for proposals submission portal at https://easychair.org/cfp/borndigital2025. If you have any questions about the conference, please contact the organising committee at [email protected].

09 October 2023

Strike a Pose Steampunk style! For our Late event with Clockwork Watch on Friday 13th October

This Friday (13th October) the British Library invites you to join the world of Clockwork Watch by Yomi Ayeni, a participatory storytelling project, set in a fantastical retro-futurist vision of Victorian England, with floating cities and sky pirates, which is one of the showcased narratives in our Digital Storytelling exhibition.

Flyer with text saying Late at the Library, Digital Steampunk at the British Library, London. Friday 13 October, 19:30 – 22:30

We are delighted that Dark Box Images will be bringing their portable darkroom to the Late at the Library: Digital Steampunk event and taking portrait photographs. If this appeals to you, then please arrive early to have your picture taken. Photographer Gregg McNeill is an expert in the wet plate collodion process invented by Frederick Scott Archer in 1851. Gregg’s skill in using an authentic Victorian camera creates genuinely remarkable results that appear right in front of your eyes.

Black and white photograph of a woman wearing an elaborate outfit and a mask with her arms outstretched wide with fabric like wings
Wet plate collodion photograph of Jennifer Garside of Wyte Phantom corsetry, taken by Gregg McNeill of Dark Box Images

If you want to pose for the camera at our steampunk Late, or have a portrait drawn by artist Doctor Geof, please don’t be shy, this is an event where guests are encouraged to dress to impress! The aesthetic of steampunk fashion is inspired by Victoriana and 19th Century literature, including Jules Verne’s novels and the Sherlock Holmes stories by Sir Arthur Conan Doyle. Steampunk looks can include hats and goggles, tweed tailoring, waistcoats, corsets, fob watches and fans. Whatever your personal style, we encourage you to unleash your creativity when putting together an outfit for this event.

Furthermore, whether you are seeking a new look or some finishing touches, there will be an opportunity to browse a Night Market at this Late event, where you can purchase and admire a range of exquisite hand crafted items created by:

  • Jema Hewitt, a professional costumer and academic, will be bringing some of her unique, handmade jewellery and accessories to the Library Late event. She was one of the originators of the early artistic steampunk scene in the UK, subsequently exhibiting her costume work internationally, and having three how-to-make books published as her alter ego “Emilly Ladybird”. Jema currently specialises as a pattern cutter for film, theatre and TV, as well as lecturing and teaching workshops.
Photograph of jewellery, hats and clothing
Jewellery, hats and clothing created by Jema Hewitt/Emilly Ladybird
  • Doctor Geof, an artist, scientist, comics creator and maker of whimsical objects. His work is often satirical, usually with an historical twist, and features tea, goblins, krakens, steampunk, smut, nuns, bees, cats and more tea. Since 2004 you may have encountered him selling his comics, prints, cards, mugs, pins, and for some reason a lot of embroidered badges (including an Evil Librarian patch!) at various events. As one of the foremost Steampunk artists in the UK, Doctor Geof has worked with and exhibited at the Cutty Sark, Royal Museums Greenwich, and Discovery Museum Newcastle. He is a talented portrait artist, so please seek him out if you would like him to capture your likeness in ink and watercolour.
A round embroidered patch with a cartoon figure wearing goggles and carrying books. Text says "Evil Librarian"
Evil Librarian embroidered patch by Dr Geof

  • Jennifer Garside, a seamstress specialising in modern corsetry, which takes inspiration from historical styles. Her business, Wyte Phantom, opened in 2010, and she has made costumes for opera singers, performers and artists across the world.

  • Tracy Wells, a couture milliner based in the Lake District. She creates all kinds of hats and headpieces, often collaborating with other artists to explore new styles, concepts and genres.
Photograph of a woman wearing a steampunk hat with feathers
Millinery by Tracy Wells
  • Herr Döktor, a renowned inventor, gadgeteer, and contraptionist, who has been working in his Laboratory in the Surrey Hills for the last two decades, building a better future via the prism of history. He will be bringing a small selection of his inventions and scale models of his larger ideas. (His alter ego, Ian Crichton, is a professional model maker with thirty years’ experience as a toy prototype maker, museum and exhibition designer, and, most recently, builder of props and models for the film industry. He also lives in the Surrey Hills.)
Photograph of a man wearing a top hat and carrying a model submarine
Herr Döktor, inventor, gadgeteer, and contraptionist. Photograph by Adam Stait
  • Linette Withers established Anachronalia in 2012 to be a full-time bookbinder, producing historically-inspired books, miniature books, and quirky stationery. Her work has been shortlisted for display at the Bodleian Library at the University of Oxford as part of their ‘Redesigning the Medieval Book’ competition and exhibition in 2018 and one of her books is held in the permanent collection of The Lit & Phil in Newcastle after being part of an exhibition of bookbinding in 2021. She also teaches bookbinding in her studio in Leeds.

  • Heather Hayden of Diamante Queen Designs creates handmade vintage inspired, kitsch, macabre, noir accessories for everybody to wear and enjoy. Heather studied fashion and surface pattern design in the 80's near Leeds during the emergence of Gothic culture and has remained interested in the darker side of life ever since. She became fascinated with Steampunk after seeing Datamancer's Steampunk computer, loving the juxtaposition of new and old technology. This inspired her to make steampunk clothing and accessories using old and found items and upcycling as much as possible.
Photograph of a mannequin head wearing a headpiece with tassels, feathers, flowers and beads
Headpiece by Diamante Queen Designs
  • Matthew Chapman of Raphael's Workshop specialises in creating strange and sublime chainmail items, bringing ideas to life in metal that few would ever consider. From collars to corsets, serpents to squids, arms to armour and medals to masterpieces, you should visit his stall and see what creations spark the imagination.
Photograph of a table displaying a range of wearable items of chainmail jewellery and accessories
Chainmail jewellery and accessories created by Raphael's Workshop

We hope that this post has whetted your appetite for the delights available at the Late at the Library: Digital Steampunk event on Friday 13th October at the British Library. Tickets can be booked here.

21 September 2023

Convert-a-Card: Helping Cataloguers Derive Records with OCLC APIs and Python

This blog post is by Harry Lloyd, Research Software Engineer in the Digital Research team, British Library. You can sometimes find him at the Rose and Crown in Kentish Town.

Last week Dr Adi Keinan-Schoonbaert delved into the invaluable work that she and others have done on the Convert-a-Card project since 2015. In this post, I’m going to pick up where she left off, and describe how we’ve been automating parts of the workflow. When I joined the British Library in February, Victoria Morris and former colleague Giorgia Tolfo had prototyped programmatically extracting entities from transcribed catalogue cards and searching by title and author in the OCLC WorldCat database for any close matches. I have been building on this work, and addressing the last yellow rectangle below, “Curator disambiguation and resolution”: namely, how curators choose between OCLC results and develop a MARC record fit for ingest into British Library systems.

A flow chart of the Convert-a-card workflow. Digital catalogue cards to Transkribus to bespoke language model to OCR output (shelfmark, title, author, other text) to OCLC search and retrieval and shelfmark correction to spreadsheet with results to curator disambiguation and resolution to collection metadata ingest
The Convert-a-Card workflow at the start of 2023

 

Entity Extraction

We’re currently working with the digitised images from two drawers of cards, one Urdu and one Chinese. Adi and Giorgia used a layout model on Transkribus to successfully tag different entities on the Urdu cards. The transcribed XML output then had ‘title’, ‘shelfmark’ and ‘author’ tags for the relevant text, making them easy to extract.

On the left an image of an Urdu catalogue card, on the right XML describing the transcribed text, including a "title" tag for the title line
Card with layout model and resulting XML for an Urdu card, showing the `structure {type:title;}` parameter on line one

The same method didn’t work for the Chinese cards, possibly because the cards are less consistently structured. There is, however, consistency in the vertical order of entities on the card: shelfmark comes above title comes above author. This meant I could reuse some code we developed for Rossitza Atanassova’s Incunabula project, which reliably retrieved title and author (and occasionally an ISBN).

Two Chinese cards side-by-side, with different layouts.
Chinese cards. Although the layouts are variable, shelfmark is reliably the first line, with title and author following.
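
The vertical-order rule is simple enough to sketch in a few lines of Python. This is a minimal illustration rather than the project's actual code: it assumes each transcribed card arrives as a list of (y position, text) pairs, and the function name `extract_entities`, the `CardEntities` type and the sample values are all hypothetical.

```python
from typing import NamedTuple


class CardEntities(NamedTuple):
    shelfmark: str
    title: str
    author: str


def extract_entities(lines: list[tuple[float, str]]) -> CardEntities:
    """Assign shelfmark, title and author by vertical position.

    `lines` holds (y, text) pairs, where a smaller y means higher up
    the card. This input shape is an assumption made for the example.
    """
    # Read the card from top to bottom, skipping empty lines.
    ordered = [
        text.strip()
        for _, text in sorted(lines, key=lambda pair: pair[0])
        if text.strip()
    ]
    # Pad with empty strings so that short cards still unpack cleanly.
    ordered += [""] * (3 - len(ordered))
    shelfmark, title, author = ordered[:3]
    return CardEntities(shelfmark, title, author)


# Placeholder card content, deliberately given out of order.
card = [(120.0, "Example Title"), (40.0, "EXAMPLE.123"), (200.0, "Example Author")]
entities = extract_entities(card)
print(entities)
```

Anything below the third line is ignored here, so a real pipeline would need an extra step to pick up details such as an ISBN.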

 

Querying OCLC WorldCat

With the title and author for each card, we were set up to query WorldCat, but how to do this when there are over two thousand cards in these two drawers alone? Victoria and Giorgia made impressive progress combining Python wrappers for the Z39.50 protocol (PyZ3950) and MARC format (Pymarc). With their prototype, a lot of googling of ASN.1, BER and Z39.50, and a couple of quiet weeks drifting through the web of references between the two packages, I built something that could turn a table of titles and authors for the Chinese cards into a list of MARC records. I had also brushed up on enough UTF-8 to fix why none of the Chinese characters were encoded correctly.

For all that I enjoyed trawling through it, Z39.50 is, in the words of a 1999 tutorial, “rather hard to penetrate” and nearly 35 years old. PyZ3950, the Python wrapper, hasn’t been maintained for two years, and making any changes to the code is a painstaking process. While Z39.50 remains widely used for transferring information between libraries, that doesn’t mean there aren’t better ways of doing things, and in the name of modernity OCLC offer a suite of APIs for their services. Crucially there are endpoints on their Metadata API that allow search and retrieval of records in MARCXML format. As the British Library maintains a cataloguing subscription to OCLC, we have access to the APIs, so all that’s needed is a call to the OCLC OAuth Server, a search on the Metadata API using title and author, then retrieval of the MARCXML for any results. This is very straightforward in Python, and with the Requests package and about ten lines of code we can have our MARCXML matches.
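
The three calls described above (token, search, retrieval) might look something like the sketch below. It is an illustration under stated assumptions, not the project's code: the endpoint URLs, the scope name, the `ti:`/`au:` query labels and the `briefRecords`/`oclcNumber` response keys are my assumptions and should be checked against OCLC's current documentation. The helper names are hypothetical, and `session` is any object with `get` and `post` methods shaped like those of a Requests session.

```python
# Endpoint URLs and the scope name are assumptions for this sketch;
# confirm them against OCLC's current API documentation before use.
TOKEN_URL = "https://oauth.oclc.org/token"
API_BASE = "https://metadata.api.oclc.org/worldcat"
SCOPE = "WorldCatMetadataAPI"


def get_token(session, key: str, secret: str) -> str:
    """Request an access token using the OAuth client credentials grant."""
    response = session.post(
        TOKEN_URL,
        data={"grant_type": "client_credentials", "scope": SCOPE},
        auth=(key, secret),  # HTTP Basic auth with the API key and secret
    )
    response.raise_for_status()
    return response.json()["access_token"]


def build_query(title: str, author: str) -> str:
    """Combine title and author into one query (index labels are assumed)."""
    return f'ti:"{title}" AND au:"{author}"'


def search(session, token: str, title: str, author: str) -> list[str]:
    """Return the OCLC numbers of records matching a title and author."""
    response = session.get(
        f"{API_BASE}/search/brief-bibs",
        params={"q": build_query(title, author)},
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    response.raise_for_status()
    # "briefRecords" and "oclcNumber" are assumed response field names.
    return [rec["oclcNumber"] for rec in response.json().get("briefRecords", [])]


def fetch_marcxml(session, token: str, oclc_number: str) -> str:
    """Retrieve a single record as a MARCXML string."""
    response = session.get(
        f"{API_BASE}/manage/bibs/{oclc_number}",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/marcxml+xml",
        },
    )
    response.raise_for_status()
    return response.text
```

With the Requests package installed, `session` would be a `requests.Session()`, and looping `fetch_marcxml` over the results of `search` would give the MARCXML matches for one card. Passing the session in as an argument also makes the functions easy to exercise without network access.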

Selecting Matches

At all stages of the project we’ve needed someone to select the best match for a card from WorldCat search results. This responsibility currently lies with curators and cataloguers from the relevant collection area. With that audience in mind, I needed a way to present MARC data from WorldCat so curators could compare the MARC fields for different matches. The solution needed to let a cataloguer choose a card, show the card and a table with the MARC fields for each WorldCat result, and ideally provide filters so curators could use domain knowledge to filter out bad results. I put out a call on the cross-government data science network, and a colleague in the 10DS data science team suggested Streamlit.

Streamlit is a Python package that allows fast development of web apps without needing to be a web app developer (which is handy as I’m not one). Adding Streamlit commands to the script that processes WorldCat MARC records into a dataframe quickly turned it into a functioning web app. The app reads in a dataframe of the cards in one drawer and their potential WorldCat matches, and presents it as a table of cards to choose from. You then see the image of the card you’re working on and a MARC field table for the relevant WorldCat matches. This side-by-side view makes it easy to scan across a particular MARC field, and exclude matches that have, for example, the wrong physical dimensions. There’s a filter for cataloguing language, sort options for things like number of subject access fields and total number of fields, and the ability to remove bad matches from view. Once the cataloguer has chosen a match they can save a match to the original dataframe, or note that there were no good matches, or only a partial match.

Screenshot from the Streamlit web app, with an image of a Chinese catalogue card above a table containing MARC data for different WorldCat matches relating to the card.
Screenshot from the Streamlit Convert-a-Card web app, showing the card and the MARC table curators use to choose between matches. As the cataloguers are familiar with MARC, providing the raw fields is the easiest way to choose between matches.
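
To give a flavour of the processing behind a table like this, here is a small sketch using only the Python standard library. It is not the app's code. It assumes each match is a MARCXML string, reads the cataloguing language from field 040 subfield b, and treats 6XX fields as subject access fields; the function names and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace used by MARCXML documents.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def parse_record(marcxml: str) -> list:
    """Return a list of (tag, [(subfield code, value), ...]) data fields."""
    root = ET.fromstring(marcxml)
    return [
        (
            field.get("tag"),
            [(sub.get("code"), sub.text or "")
             for sub in field.iterfind("marc:subfield", MARC_NS)],
        )
        for field in root.iterfind(".//marc:datafield", MARC_NS)
    ]


def cataloguing_language(record: list) -> Optional[str]:
    """Return the language of cataloguing (040 $b), if the record has one."""
    for tag, subfields in record:
        if tag == "040":
            for code, value in subfields:
                if code == "b":
                    return value
    return None


def subject_field_count(record: list) -> int:
    """Count subject access fields, taken here to be the 6XX fields."""
    return sum(1 for tag, _ in record if tag.startswith("6"))


def rank_matches(records: list, language: str) -> list:
    """Keep matches catalogued in `language`, with the richest records first."""
    kept = [rec for rec in records if cataloguing_language(rec) == language]
    return sorted(
        kept, key=lambda rec: (subject_field_count(rec), len(rec)), reverse=True
    )


# A made-up record, just large enough to exercise the functions.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="040" ind1=" " ind2=" ">
    <subfield code="a">XXX</subfield><subfield code="b">eng</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Example title</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Example subject</subfield>
  </datafield>
</record>"""

record = parse_record(SAMPLE)
print(cataloguing_language(record), subject_field_count(record), len(record))
```

Keeping the filtering and sorting in plain functions like these means they can be tested separately from whatever framework draws the table.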

After some very positive initial feedback, we sat down with the Chinese curators and had them test the app out. That led to a fun, interactive feedback session focussed on user experience, and a whole host of GitHub issues on the repository for bugs and design suggestions. Behind-the-scenes discussions on where to host the app and data are ongoing and not straightforward, but this has been a remarkably easy product to prototype, and I’m optimistic it will provide a lightweight complement, with a gentle learning curve, to full deriving software like Aleph (the Library’s main cataloguing system).

Next Steps

The project currently uses a range of technologies in Transkribus, the OCLC APIs, and Streamlit, and tying these together has in itself been a success. Going forwards, we have the possibility of extracting non-English text from the cards to look forward to, and the richer list of entities this would make available. Working with the OCLC APIs has been a learning curve, and they’re not working perfectly yet, but they represent a relatively accessible option compared to Z39.50. And my hope for the Streamlit app is that it will be a useful tool beyond the project for wherever someone wants to use WorldCat to help derive records from minimal information. We still have challenges in terms of design, data storage, and hosting to overcome, but these discussions should have their own benefits in making future development easier. The goal for the automation part of the project is a smooth flow of data from Transkribus, through OCLC and on to the curators, and while it’s not perfect, we’re definitely getting there.

14 July 2023

Share Family: British National Bibliography (Beta) service is live

Contents

Introduction

Share Family and National Bibliographies

       What is a National bibliography?

       BNB in the Share Family

Benefits

Future developments

Beta service

Further information

 

Introduction

The British National Bibliography (BNB), first published in January 1950, is a weekly listing of new books and journals published or distributed in the United Kingdom and the Republic of Ireland.  Over the last seventy-three years, the BNB has adapted to changing customer needs by embracing new technologies, from cards in the 1950s to mark-up languages for data exchange in the 1970s and CD-ROM in the 1980s. The BNB now provides online access to details of over 5 million publications and forthcoming titles, ranging in scope from computer science to history, from novels to textbooks.

 

Two examples of bibliographies including information like title, author, place of publication, year, description, prices etc.
1. Examples of British National Bibliography records, April 19th 2023. Please click the image to see it in full size & detail.

In 2011, the Library launched the Linked Open Data BNB.  At that time, linked data was an emerging technology using Web protocols to link data sets, as envisaged in Sir Tim Berners-Lee’s concept of a Semantic Web[1].  Our initial foray into linked data was successful from a technical perspective. We were able to convert BNB data held in Machine Readable Cataloging (MARC) format into linked data structures and make it available in a variety of schemas under an open licence.  Nevertheless, we lacked the capacity to re-model our data in order to realise the potential of linked data.  As the technology matured, we began to look around for partners with whom we could collaborate to take BNB forward.

As described in my September 2020 blogpost, British Library Joins Share-VDE Linked Data Community, the British Library joined the Share Community (now the Share Family) to develop our linked data service. The Share Linked Data Environment is “a global family built on collaboration that brings libraries, archives and museums together with a common goal and joins their knowledge in an ever-widening network of inter-connected bibliographic data.” (Share Family, 2022).

 

Share Family and National Bibliographies

“The Share Family is a suite of innovative tools and services, developed and driven by libraries, for libraries, in an international collaborative, consortial effort. Share-VDE enables the discovery of knowledge to increase user engagement with library and cultural heritage collections.”[2]

Screenshot: Share family components showing layers like Advanced API, Advanced Entity Model, Authority Service, Deliverables etc.
2. Share family components[3]. Please click the image to see it in full size & detail.

The Share Family has supported us through the transition from our traditional MARC data to linked open data. We provided a full copy of the British National Bibliography to the Share team for identification and clustering of entities, e.g. works, publications, persons. Working with colleagues from other institutions on Share-VDE working groups, we contribute to the development of the underlying data structures and the presentation of data. This collaborative approach has enabled delivery of the British National Bibliography as the first institutional tenant of the Share Family National Bibliographies Portal.

What is a National bibliography?

“National bibliographies are a permanent record of the cultural and intellectual output of a nation or country, which is witnessed by its publishing output. They gather the bibliographic information of current publications to preserve and provide ongoing access to this record.”

IFLA Bibliography Section

The IFLA (International Federation of Library Associations and Institutions) Register of national bibliographies contains 52 entries, ranging from Andorra to Vietnam.  National bibliographies vary in scope, but each provides insights into the intellectual and cultural history of society, literature and publishing.  The Share Family National Bibliographies Portal offers the potential for clustering and searching multiple national bibliographies on a single platform.

BNB in the Share Family

Screenshot of the BNB home screen stating 'Search for people, original works and publications'
3. Screenshot BNB home screen. Please click the image to see it in full size & detail.

The British Library is proud that the British National Bibliography is the first tenant selected for the Share Family National Bibliographies Portal.

BNB is now available to explore in Beta: https://bl.natbib-lod.org. You can search for publications, original works and people, as illustrated by these examples:

You can use the national bibliography to search for a specific publication, such as a large print edition of the novel Small island by Andrea Levy.

Screenshot: Bibliographic description of large print edition of Small Island by Andrea Levy.
4. Screenshot: Bibliographic description of large print edition of Small Island by Andrea Levy. Please click the image to see it in full size & detail.

 

You can also find original works inspired by earlier works:

Screenshot: Results set for publication of the work, Small island by Helen Edmundson
5. Screenshot: Results set for publication of the work, Small island by Helen Edmundson. Please click the image to see it in full size & detail.

 

Alternatively, you can search for works by a specific author… 

Screenshot showing original works by Douglas Adams
6. Screenshot: Original works by Douglas Adams. Please click the image to see it in full size & detail.

 

…or about a specific person

Screenshot showing original works about Douglas Adams
7. Screenshot: Original works about Douglas Adams. Please click the image to see it in full size & detail.

 

…or by organization

Screenshot showing results set for BBC
8. Screenshot: Results set for BBC. Please click the image to see it in full size & detail.

 

Benefits

What benefit do we expect to gain from this collaboration?

  • We profit from practical experience our collaborators have gained through other linked data initiatives
  • We gain access to a state of the art, extensible infrastructure designed for library data
  • We gain a new channel for dissemination of the BNB, in aggregation with other national bibliographies

We are able to re-tool our metadata for the 21st Century:

  • Our data will be remodelled and clustered making it more compatible with current data models, including the IFLA Library Reference Model, RDA: Resource Description and Access, and Bibframe
  • Our data will be enriched with URIs that will make it more effective in linked data environments
  • The entity-centred view of the British National Bibliography offers new perspectives for researchers

 

Future developments

Conversion of the BNB and publication in the National Bibliographies Portal is only the beginning. 

  • The BNB data from the Cluster Knowledge base will also be published in the triple store
  • Original records will be available to the British Library as Bibframe 2.0, for dissemination or reuse as linked data
  • Users will be provided with access to the data via data dumps and a SPARQL endpoint
  • Our MARC records will be enriched with original Share URIs and URIs from external sources
  • Other national bibliographies will join BNB in the national bibliographies portal

The British National Bibliography represents only a fraction of the Library’s data.   You can explore the British Library’s collection through our catalogue, which we plan to contribute to Share-VDE in future.

 

Beta service

The British National Bibliography in the Share Family is being made available in Beta. The service is still being tested. The interface and the functionality are subject to change and may not work for everyone.  You can tell us what you think about the service or report problems by contacting [email protected].

 

Further information:

British National Bibliography https://bnb.bl.uk  

Share VDE http://www.share-family.org/

Share Family wiki https://wiki.share-vde.org/wiki/Main_Page

Share VDE Virtual Discovery Environment in linked open data https://svde.org/

National Bibliographies in Linked Open Data https://natbib-lod.org

British National Bibliography Linked Open Data Portal https://bl.natbib-lod.org

 

Footnotes

[1] Berners-Lee, Tim; James Hendler; Ora Lassila (May 17, 2001). "The Semantic Web". Scientific American, 284(5):34-43.

[2] Share-VDE: supporting the creation, management and discovery of linked open data for libraries: executive summary. Share-VDE Executive Committee. December 7th, 2022. Share-VDE Website (viewed 19th June 2023)

[3] Share Family – Linked data ecosystem. How does it work?  http://www.share-family.org/  (viewed on 23rd June 2023)

31 March 2023

Mapping Caribbean Diasporic Networks through the Correspondence of Andrew Salkey

This is a guest post by Natalie Lucy, a PhD student at University College London, who recently undertook a British Library placement to work on a project Mapping Caribbean Diasporic Networks through the correspondence of Andrew Salkey.

Project Objectives

The project, supervised by curators Eleanor Casson and Stella Wisdom, focussed on the extensive correspondence contained within Andrew Salkey’s archive. One of the initial objectives was to digitally depict the movement of key Caribbean writers and artists, as it is evidenced within the correspondence, many of whom travelled between Britain and the Caribbean as well as the United States, Central and South America and Africa. Although Salkey corresponded with a diverse range of people, we therefore focused on the letters in his archive which were from Caribbean writers and academics and which illustrated patterns of movement of the Caribbean diaspora. Much of the correspondence stems from the 1960s and 1970s, a time when Andrew Salkey was particularly active both in the Caribbean Artists Movement and, as a writer and broadcaster, at the BBC.

Photograph of Andrew Salkey's head and shoulders in profile
Photograph of Andrew Salkey

Andrew Salkey was unusual not only for the panoply of writers, artists and politicians with whom he was connected, but also in that he sustained those relationships, carefully preserving the correspondence which resulted from those networks. My personal interest in this project stemmed from the fact that my PhD seeks to consider the ways that the Caribbean trickster character, Anancy, has historically been reinvented to say something about heritage and identity. Significant to that question was the way that the Caribbean Artists Movement, a dynamic group of artists and writers formed in London in the mid-1960s, and of which Andrew Salkey was a founder, appropriated Anancy, reasserting him and the folktales to convey something of a literary ‘voice’ for the Caribbean. For this reason, I was also interested in the writing networks which were evidenced within the correspondence, together with their impact.

What is Gephi?

Prior to starting the project, Eleanor, who had catalogued the Andrew Salkey archive, and Digital Curator Stella had identified Gephi as a possible software application through which to visualise this data. Gephi has been used in a variety of projects, including several at Harvard University; examples of the breadth and diversity of those initiatives can be found here. Several of these projects have social networks or historical trading routes as their focus, with obvious parallels to this project. Others notably use correspondence as their main data.

Gathering the Data

Andrew Salkey was known as something of a chronicler. He was interested in letters and travel and was also a serious collector of stamps. As such, he had not only retained the majority of the letters he received but categorised them. Eleanor had originally identified potential correspondents who might be useful to the project, selecting writers who travelled widely, whose correspondence had been separately stored by Salkey, partly because of its volume, and who might be of wider interest to the public. These included the acclaimed Caribbean writers, Samuel Selvon, George Lamming, Jan Carew and Edward Kamau Brathwaite and publishers and political activists, Jessica and Eric Huntley.

Our initial intention was to limit the data to simple facts which could easily be gleaned from the letters. Gephi required that we did so on a spreadsheet, which had to conform to a particular format. In the first stages of the project, the data was confined to the dates and location of the correspondence, information which could suggest the patterns of movement within the diaspora. However, the letters were so rich in detail that we ultimately recorded other information. This included any additional travel taken by any of the correspondents which was clearly evidenced in the letters, together with any passages from the correspondence which demonstrated either something of the nature and quality of the friendships or, alternatively, the mutual benefit of those relationships to the careers of so many of the writers.

Creating a visual network

Dr Duncan Hay was invited to collaborate with me on this project, as he has considerable expertise in this field; his research interests include web mapping for culture and heritage and data visualisation for literary criticism. After the initial data was collated, we discussed with Duncan what visualisations could be created. It became apparent early on that creating a visualisation of the social networks, as opposed to the patterns of movement, might be relatively straightforward via Gephi, an application which was particularly useful for this type of graph. I had prepared a spreadsheet, but Gephi requires the data to be presented in a strictly consistent way, which meant that any anomalies had to be eradicated and the data effectively ‘cleaned up’ using OpenRefine. Gephi also requires that information is presented by way of a system of ‘nodes’, ‘edges’ and ‘attributes’ with corresponding spreadsheet columns. In our project, the ‘nodes’ referred to Andrew Salkey and each of the correspondents and other individuals of interest who were specifically referred to within the correspondence. The edges referred to the way that those people were connected which, in this case, was through correspondence. However, what added to the potential of the project was that these nodes and edges could be further described by reference to ‘attributes.’ The possibility of assigning a range of ‘attributes’ to each of the correspondents allowed a wealth of additional information to be provided about the networks. As a consequence, and in order to make any visualisation as informative as possible, I also added brief biographical information for each of the writers and artists to be inputted as ‘attributes’ together with some explanation of the nature of the networks that were being illustrated.
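
For readers who have not met the node and edge format before, the short Python sketch below shows the general shape of the two tables. It is an illustration only, not the spreadsheet we used: the letter records and correspondent names are placeholders, and the helper names are hypothetical. The column headings `Id`, `Label`, `Source`, `Target`, `Type` and `Weight` are the ones Gephi's spreadsheet import looks for.

```python
import csv
import io
from collections import Counter

# Hypothetical letter records as (sender, recipient); names are placeholders.
letters = [
    ("Correspondent A", "Andrew Salkey"),
    ("Correspondent A", "Andrew Salkey"),
    ("Correspondent B", "Andrew Salkey"),
]


def build_tables(letters):
    """Return (node_rows, edge_rows) shaped for Gephi's CSV import."""
    people = sorted({name for pair in letters for name in pair})
    ids = {name: index for index, name in enumerate(people, start=1)}
    node_rows = [{"Id": ids[name], "Label": name} for name in people]
    # One edge per sender/recipient pair, weighted by the number of letters.
    counts = Counter(letters)
    edge_rows = [
        {"Source": ids[sender], "Target": ids[recipient],
         "Type": "Directed", "Weight": number}
        for (sender, recipient), number in sorted(counts.items())
    ]
    return node_rows, edge_rows


def to_csv(rows):
    """Serialise a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


nodes, edges = build_tables(letters)
print(to_csv(nodes))
print(to_csv(edges))
```

Any extra columns added to the node table, such as a short biography, are imported as attributes of that node.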

The visual illustration below shows not only the quantity of letters from the sample of correspondents to Andrew Salkey (the pink lines),  but also shows which other correspondents formed part of those networks and were referenced as friends or contacts within specific items of correspondence. For example, George Lamming references academic, Rex Nettleford and writer and activist, Claudia Jones, the founder of the Notting Hill Carnival, in his correspondence, connections which are depicted in grey. 

Data visualisation of nodes and lines representing Andrew Salkey's Correspondence Network
Gephi: Andrew Salkey correspondence network

The aim was, however, for the visualisation to also be interactive. This required considerable further manipulation of the format and tools. In this illustration you can see the information that is revealed about the prominent Barbadian writer, George Lamming which, in an interactive format, can be accessed via the ‘i’ symbols beside many of the nodes coloured in green.  

Whilst Gephi was a useful tool with which to illustrate the networks, it was less helpful as a way to demonstrate the patterns of movement, one of the primary objectives of the project. A challenge was, therefore, to create a map which could be both interactive and illustrative of the specific locations of the correspondents as well as their movement over time. With Duncan’s input and expertise, we opted for a hybrid approach, utilising two principal ways to illustrate the data: we used Gephi to create a visualisation of the ‘networks’ (above) and another software tool, Kepler.gl, to show the diasporic movement.

A static version of what ultimately will be a ‘moving’ map (illustrating correspondence with reference to person, date and location) is shown below. As well as demonstrating patterns of movement, it should also be possible to access information about specific letters as well as their shelf numbers through this map, hopefully making the archive more accessible.

Data visualisation showing lines connecting countries on a map showing part of the Americas, Europe and Africa
Patterns of diasporic movement from Andrew Salkey's correspondence, illustrated in Kepler.gl
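
As a rough idea of the data behind a map like this, the sketch below builds one row per letter, with where it was sent from, where it was sent to, and when. It is a simplified illustration, not the project's dataset: the letter, the correspondent name, the column names and the rounded coordinates are all placeholders. Kepler.gl can load CSV files and draw arcs between two pairs of latitude and longitude columns, which is the layout assumed here.

```python
import csv
import io
from datetime import date

# Approximate (latitude, longitude) pairs, rounded; illustrative only.
PLACES = {
    "London": (51.51, -0.13),
    "Kingston": (17.97, -76.79),
}

# Hypothetical letters: (correspondent, sent from, sent to, date sent).
letters = [
    ("Correspondent A", "Kingston", "London", date(1970, 1, 15)),
]


def movement_rows(letters):
    """Yield one dictionary per letter with origin and destination coordinates."""
    for person, origin, destination, sent in letters:
        from_lat, from_lng = PLACES[origin]
        to_lat, to_lng = PLACES[destination]
        yield {
            "person": person,
            "date": sent.isoformat(),  # ISO dates are easy to filter by time
            "from_lat": from_lat,
            "from_lng": from_lng,
            "to_lat": to_lat,
            "to_lng": to_lng,
        }


rows = list(movement_rows(letters))
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

A shelf number column could be added to each row in the same way, so that a letter selected on the map can be traced back to the archive.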

Whilst we are still exploring the potential of this project and how it might intersect with other areas of research and archives, it has already revealed something of the benefits of this type of data visualisation. For example, a project of this type could be used as an educational tool, providing something of a simple, but dynamic, introduction to the Caribbean Artists Movement. Being able to visualise the project has also allowed us to input information which confirms where specific letters of interest might be found within the archive. Ultimately, it is hoped that the project will offer ways to make a rich, yet arguably undervalued, archive more accessible to a wider audience with the potential to replicate something of an introductory model, or ‘pilot’ for further archives in the future. 

20 March 2023

Digital Storytelling at the 2023 BL Labs Symposium

One half of the 2023 British Library Labs Symposium will be dedicated to digital storytelling. This has been a significant part of BL Labs work over the years; we have collaborated with experimental artists from David Normal’s creative reuse of British Library Flickr images for his giant lightbox collage Crossroads of Curiosity installation at the 2014 Burning Man festival, to working with first runner up in the BL Labs 2016 competition Michael Takeo Magruder on his 2019 exhibition Imaginary Cities.

People looking at lightbox collage artworks
Crossroads of Curiosity by David Normal

In the last few years, due to the COVID-19 pandemic disruption, digital stories and engagement have become mainstream across the Galleries, Libraries, Archives and Museums (GLAM) sector. New types of digital storytelling mixing social media, online exhibitions embedding narratives and digital objects, and interactive online events reaching entirely new audiences, delighted us all. However, we also discovered that there can be a saturation point with online engagement, and that many digital developments have some way to go to reach their full potential.

As we are hopefully entering healthier times, new opportunities to mix virtual and physical worlds are starting to open up. With this in mind, we felt that this is the right moment to explore a new age of digital storytelling at the 2023 BL Labs Symposium.

The idea is to explore what is changing in the world of technological possibilities and how they are continuing to develop. We have envisaged a journey that will take us from the big picture of the arising digital possibilities to more specific examples from the British Library’s work. In true BL Labs spirit we will also celebrate initiatives that creatively reuse the Library’s digital collections.

To help us look into the big trends, we are delighted to be joined by Zillah Watson, whose extraordinary breadth of experience working with the BBC, Meta, the BFI and the Royal Shakespeare Company, amongst many others, will help us to get a deeper sense of the opportunities of virtual reality (VR). Zillah will look into what it means, not just to be dazzled with technological possibilities, but also to enter the magic of storytelling.

Talking of magic, we are lucky to welcome award-winning Director Anrick Bregman and award-winning Producer Grace Baird. Anrick and Grace will take us deeper into the potential of using VR to uncover hidden stories. Anrick’s film A Convict Story is an interactive VR project built on British Library data. It brings to life a story discovered by linking data from centuries ago, using data research powered by machine learning.

Even closer to home, our own Stella Wisdom and Ian Cooke will talk about their current work on curating the British Library’s forthcoming Digital Storytelling exhibition (2 June – 15 October 2023), which will explore the ways technology provides opportunities to transform and enhance the way writers write and readers engage. The exhibition will draw on the Library’s collection of contemporary digital publications and emerging formats to highlight the work of innovative and experimental writers. It will feature interactive works that invite and respond to user input, reading experiences influenced by data feeds, and immersive story worlds created using multiple platforms and audience participation. This is an exciting development, as we can see how earlier British Library creative digital experiments, collaborations and research projects are building into an exhibition in its own right.

We hope you can join us for discussion at the BL Labs Symposium on Thursday 30 March 2023. For the full programme, and further information on all our speakers, please read our earlier blog post.

 You can book your place here

02 March 2023

BL Labs Symposium 2023: Programme and Speakers announced

Book illustration of a shelf of books with "Informed" spelled across their spines
British Library digitised image from BL Flickr Collection - When Life is Young: a collection of verse for boys and girls by Mary Elizabeth Dodge

The BL Labs Symposium 2023 is taking place on Thursday 30th March as an online webinar.

This year we will be exploring two themes – digital storytelling and innovative uses of data and AI. As always, we are aiming to hear from some guest speakers, as well as showcase recent work using the British Library digital collections. The programme also includes an update on BL Labs, including our new website and services.

We hope this will spark many further ideas and collaborations.

The full programme for the BL Labs Symposium is as follows:

14.00 – Welcome

Part 1: Digital Storytelling

14.05 – How to bring the magic of VR to audiences – Zillah Watson

14.15 – There Exists – A VR experience about hidden narratives – Anrick Bregman and Grace Baird

14.25 – Curating a Digital Storytelling exhibition – Stella Wisdom and Ian Cooke

14.35 – Panel Q&A

15.00 – In Memoriam Maurice Nicholson

15.05 – Break

15.15 – BL Labs Update – Silvija Aurylaite

Part 2: Data and AI

15.35 – Ithaca: Restoring and attributing ancient texts using deep neural networks – Yannis Assael

15.45 – Living with Machines: Using digitised newspaper collections from the British Library in a data science project – Kalle Westerling

15.55 – Locating a National Collection through audience research: how cultural heritage organisations can engage the public using geospatial data – Gethin Rees

16.05 – Panel Q&A

16.30 – END

You can register for the BL Labs Symposium here.

We are currently planning an evening networking session at the British Library, starting at 18.30 for those who can join us in London. We are aware of the train strike planned for this day, so will confirm details nearer the time.

Below are a few details about our speakers:

Head and shoulders photograph of Zillah Watson

Zillah Watson

Zillah Watson led the BBC's award-winning VR studio, which won a host of awards at festivals around the world and received an Emmy nomination. She led pioneering work taking VR to audiences in libraries around the UK. She now consults on the metaverse, and on content and audience growth strategies, for organisations including Meta, London & Partners, the BFI, International News Media Association, Arts Council England, and the Royal Shakespeare Company. She's had a long and varied media career, including 20 years at the BBC, where she was a TV and radio current affairs journalist, head of editorial standards for BBC Radio and led R&D research on future content. She is a lecturer at UCL and the new London Interdisciplinary School. She recently co-founded Phase Space, a tech-for-good start-up that uses VR to support mental health for students and young people.

Head and shoulders photograph of Anrick Bregman

Anrick Bregman

Anrick is director and founder of an R&D studio that explores the future of spatial immersive storytelling by creating experiences built with virtual and augmented reality, computer vision and machine learning. His mission is to find new and interesting ways to merge technology with meaningful narratives which explore the human experience.

Head and shoulders photograph of Grace Baird

Grace Baird

Grace is a Producer with twelve years' experience working on audience-centred projects in the Arts, TV, and Immersive industries. She is experienced in immersive and digital production and distribution, particularly entertainment content. Grace has produced a variety of innovative projects including site-specific installations, an interactive feature-film, and social-VR experiences.

Head and shoulders photograph of Stella Wisdom

Stella Wisdom

Stella is Digital Curator for Contemporary British Collections at the British Library. She promotes creative and innovative reuse of digital collections and encourages game making and digital storytelling in libraries. This includes collaborating widely with The National Videogame Museum, AdventureX, International Games Month in Libraries and the New Media Writing Prize, and on research projects with University College London’s Institute of Education and Lancaster University. Stella's research interests also explore the archiving of complex born-digital material, examining methods for the collection, preservation and curation of narrative apps, digital comics and interactive fiction.

Head and shoulders photograph of Ian Cooke

Ian Cooke

Ian is Head of Contemporary British Publications at the British Library. He has worked in academic and research libraries with a focus on 20th and 21st-century history and social sciences. His interests are in the role of publishing in contemporary communications, and the everyday experience and expression of politics. 

Head and shoulders photograph of Silvija Aurylaite

Silvija Aurylaite

Silvija Aurylaite is BL Labs Manager. She previously worked on the British Library Heritage Made Digital Programme. Her interests and domain of expertise include copyright, curation of digital collections of museums, archives and libraries, data science, design, creativity and social entrepreneurship. Previously, she initiated a new publishing project, Public Domain City, that aimed to bring new life to curious and obscure historical books on science, technology and nature. She also organized a retrospective dance film festival, Dance in Film, Choreography, Body and Image, and media dance educational activities at the National Gallery of Art in Vilnius.

Head and shoulders photograph of Yannis Assael

Yannis Assael

Dr. Yannis Assael is a Staff Research Scientist at Google DeepMind working on Artificial Intelligence, and he is featured in Forbes' "30 Under 30" distinguished scientists of Europe. In 2013, he graduated from the Department of Applied Informatics, University of Macedonia. With full scholarships, he then completed an MSc at the University of Oxford, finishing first in his year, and an MRes at Imperial College London. In 2016, he returned to Oxford for a DPhil degree with a Google DeepMind scholarship, and after a series of research breakthroughs and entrepreneurial activities, he started as a researcher at Google DeepMind. His contributions range from audio-visual speech recognition to multi-agent communication and AI for culture and the study of damaged ancient texts. His research has attracted media attention several times and has been featured on the cover of the scientific journal Nature. His work focuses on contributing to and expanding the greater good.

Head and shoulders photograph of Kalle Westerling

Kalle Westerling

Dr Kalle Westerling is a Digital Humanities Research Software Engineer with Living with Machines, a collaboration between the British Library, the Alan Turing Institute, and researchers from a range of UK universities. Kalle holds a Ph.D. in Theatre and Performance Studies from The Graduate Center, City University of New York (CUNY), where he visualised and analysed networks of itinerant nightlife performers around New York City in the 1930s. Prior to joining the British Library, Kalle managed the Scholars program at HASTAC and the Digital Humanities Research Institute at CUNY, both efforts across higher education institutions in the United States, aiming to build nation-wide infrastructures and communities for digital humanities skill-building.

Head and shoulders photograph of Gethin Rees

Gethin Rees

Gethin’s role at the British Library includes helping to manage the non-print legal deposit of digital maps and coordinating the Georeferencer crowd-sourcing project. He is interested in helping research projects to get the most out of geospatial data and tools, and was principal investigator of the AHRC-funded Locating a National Collection project. Before taking up his current position in 2018, he worked as a software developer and on two collaborative history projects funded by the ERC. His PhD in archaeology from the University of Cambridge made use of Geographical Information Systems for spatial analysis and data management.
