Digital scholarship blog

Enabling innovative research with British Library digital collections


07 May 2024

Recovered Pages: Computing for Cultural Heritage Student Projects

The British Library is continuing to recover from last year’s cyber-attack. While our teams work to restore our services safely and securely, one of our goals in the Digital Research Team is to get some of the information from our currently inaccessible web pages into an easily readable and shareable format. We’ll be sharing these pages via blog posts here, with information recovered from the Wayback Machine, a fantastic initiative of the Internet Archive.  

The next page in this series is all about the student projects that came out of our Computing for Cultural Heritage project with the National Archives and Birkbeck University. This student project page was captured by the Wayback Machine on 7 June 2023.  

 

Computing for Cultural Heritage Student Projects

computing for cultural heritage logo - an image of a laptop with bookshelves as the screen saver

This page provides abstracts for a selection of student projects undertaken as part of a one-year part-time Postgraduate Certificate (PGCert), Computing for Cultural Heritage, co-developed by British Library, National Archives and Birkbeck University and funded by the Institute of Coding as part of a £4.8 million University skills drive.

“I have gone from not being able to print 'hello' in Python to writing some relatively complex programs and having a much greater understanding of data science and how it is applicable to my work.”

- Jessica Green  

Key points

  • The aim of the trial was to provide professionals working in the cultural heritage sector with an understanding of basic programming and computational analytic tools to support them in their daily work 
  • During the Autumn & Spring terms (October 2019-April 2020), 12 staff members from British Library and 8 staff members from The National Archives completed two new trial modules at Birkbeck University: Demystifying computing for heritage professionals and Work-based Project 
  • Birkbeck University have now launched the Applied Data Science (Postgraduate Certificate) based on the outcomes of the trial

Student Projects

 

Transforming Physical Labels into Digital References 

Sotirios Alpanis, British Library
This project aims to use computing to convert data collected during the preparation of archive material for digitisation into a tool that can verify and validate image captures, and subsequently label them. This will take as its input physical information about each document being digitised, perform and facilitate a series of validations throughout image capture and quality assurance and result in an xml file containing a map of physical labels to digital files. The project will take place within the British Library/Qatar Foundation Partnership (BL/QFP), which is digitising archive material for display on the QDL.qa.  

Enhancing national thesis metadata with persistent identifiers

Jenny Basford, British Library 
Working with data from ISNI (International Standard Name Identifier) Agency and EThOS (Electronic Theses Online Service), both based at the British Library, I intend to enhance the metadata of both databases by identifying doctoral supervisors in thesis metadata and matching these data with ISNI holdings. This work will also feed into the European-funded FREYA project, which is concerned with the use of a wide variety of persistent identifiers across the research landscape to improve openness in research culture and infrastructure through Linked Data applications.

A software tool to support the social media activities of the Unlocking Our Sound Heritage Project

Lucia Cavorsi, British Library
Video
I would like to design a software tool able to flag forthcoming anniversaries, by comparing all the dates present in SAMI (sound and moving image catalogue – Sound Archive) with the current date. The aim of this tool is to suggest potential content for the Sound Archive’s social media posts. Useful dates in SAMI which could be matched with the current date and provide material for tweets are: birth and death dates of performers or authors, radio programme broadcast dates, and recording dates.  I would like this tool to also match the subjects currently present in SAMI with the subjects featured in the 2020 list of anniversaries which the social media team uses, for example anniversaries like ‘International HIV day’, ‘International day of Lesbian visibility’ etc.  A Windows pop-up message will be designed for anniversary notifications on the day.  If time permits, it would be convenient to also analyse what hashtags have been used over the last year by the people who are followed by or follow the Sound Archive Twitter account. By extracting a list of these hashtags, further and more sound-related anniversaries could be added to the list of anniversaries currently used by the UOSH’s social media team.
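As an illustration of the date-matching idea only, here is a minimal Python sketch. It is not the tool built for the project: the records, their descriptions and the seven-day window are all hypothetical, and real SAMI data would need its own parsing.

```python
from datetime import date

# Hypothetical catalogue extract: (description, event date). Not real SAMI data.
RECORDS = [
    ("Performer A born", date(1920, 5, 7)),
    ("Radio programme B first broadcast", date(1970, 5, 9)),
    ("Recording C made", date(1955, 11, 2)),
]


def upcoming_anniversaries(records, today, window_days=7):
    """Return sorted (days_until, years, description) for anniversaries in the window."""
    hits = []
    for description, event_date in records:
        # Check this year's anniversary and next year's, so that a window
        # running across New Year is still covered.
        for year in (today.year, today.year + 1):
            try:
                anniversary = event_date.replace(year=year)
            except ValueError:
                # 29 February has no anniversary in a non-leap year; skip it.
                continue
            days_until = (anniversary - today).days
            if 0 <= days_until <= window_days:
                hits.append((days_until, year - event_date.year, description))
    return sorted(hits)


for days, years, description in upcoming_anniversaries(RECORDS, date(2020, 5, 5)):
    print(f"In {days} day(s): {years} years since '{description}'")
```

The same function could feed a notification on the day itself by calling it with `window_days=0`.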

Computing Cholera: Topic modelling the catalogue entries of the General Board of Health

Christopher Day, The National Archives
Blog / Other
The correspondence of the General Board of Health (1848–1871) documents the work of a body set up to deal with cholera epidemics in a period where some English homes were so filthy as to be described as “mere pigholes not fit for human beings”. Individual descriptions for each of the more than 89,000 letters are available on Discovery, the catalogue of The National Archives (UK). Now, some 170 years later, access to the letters themselves has been disrupted by another epidemic, COVID-19. This paper examines how data science can be used to repurpose archival catalogue descriptions, initially created to enhance the ‘human findability’ of records (and favoured by many UK archives due to high digitisation costs), for large-scale computational analysis. The records of the General Board will be used as a case study: their catalogue descriptions topic modelled using a latent Dirichlet allocation model, visualised, and analysed – giving an insight into how new sanitary regulations were negotiated with a divided public during an epidemic. The paper then explores the validity of using the descriptions of historical sources as a source in their own right; and asks how, during a time of restricted archival access, metadata can be used to continue research.

An Automated Text Extraction Tool for Use on Digitised Maps

Nicholas Dykes, British Library
Blog / Video
Researchers of history often have difficulty geo-locating historical place names in Africa. I would like to apply automated transcription techniques to a digitised archive of historical maps of Africa to create a resource that will allow users to search for text, and discover where, and on which maps that text can be found. This will enable identification and analysis both of historical place names and of other text, such as topographical descriptions. I propose to develop a software tool in Python that will send images stored locally to the Google Vision API, and retrieve and process a response for each image, consisting of a JSON file containing the text found, pixel coordinate bounding boxes for each instance of text, and a confidence score. The tool will also create a copy of each image with the text instances highlighted. I will experiment with the parameters of the API in order to achieve the most accurate results.  I will incorporate a routine that will store each related JSON file and highlighted image together in a separate folder for each map image, and create an Excel spreadsheet containing text results, confidence scores, links to relevant image folders, and hyperlinks to high-res images hosted on the BL website. The spreadsheet and subfolders will then be packaged together into a single downloadable resource.  The finished software tool will have the capability to create a similar resource of interlinked spreadsheet and subfolders from any batch of images.

Reconstituting a Deconstructed Dataset using Python and SQLite

Alex Green, The National Archives
Video
For this project I will rebuild a database and establish the referential integrity of the data from CSV files using Python and SQLite. To do this I will need to study the data, read the documentation, draw an entity relationship diagram and learn more about relational databases. I want to enable users to query the data as they would have been able to in the past. I will then make the code reusable so it can be used to rebuild other databases, testing it with a further two datasets in CSV form. As an additional challenge, I plan to rearrange the data to meet the principles of ‘tidy data’ to aid data analysis.
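A minimal sketch of the general approach, assuming two hypothetical CSV exports (a `series` table and an `item` table that points to it). The table names, columns and sample rows are invented for illustration and are not taken from the project's dataset.

```python
import csv
import io
import sqlite3

# Hypothetical CSV exports standing in for the deconstructed dataset.
SERIES_CSV = "series_id,title\n1,Letters\n2,Maps\n"
ITEMS_CSV = (
    "item_id,series_id,description\n"
    "10,1,Letter to the Board\n"
    "11,2,Map of the coast\n"
)


def rebuild(series_csv, items_csv):
    """Load both CSV exports into an in-memory SQLite database with a foreign key."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE series (series_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE item ("
        "item_id INTEGER PRIMARY KEY, "
        "series_id INTEGER NOT NULL REFERENCES series(series_id), "
        "description TEXT)"
    )
    series_rows = [
        (int(r["series_id"]), r["title"])
        for r in csv.DictReader(io.StringIO(series_csv))
    ]
    item_rows = [
        (int(r["item_id"]), int(r["series_id"]), r["description"])
        for r in csv.DictReader(io.StringIO(items_csv))
    ]
    with conn:  # commit both loads together, parents before children
        conn.executemany("INSERT INTO series VALUES (?, ?)", series_rows)
        conn.executemany("INSERT INTO item VALUES (?, ?, ?)", item_rows)
    return conn


conn = rebuild(SERIES_CSV, ITEMS_CSV)
query = (
    "SELECT s.title, i.description FROM item i "
    "JOIN series s USING (series_id) ORDER BY i.item_id"
)
for title, description in conn.execute(query):
    print(title, "|", description)
```

With the foreign key in place, an item row that points to a missing series is rejected with `sqlite3.IntegrityError`, which is one way of checking referential integrity while loading.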

PIMMS: Developing a Model Pre-Ingest Metadata Management System at the British Library

Jessica Green, British Library
GitHub / Video
I am proposing a solution to analysing and preparing for ingest a vast amount of ‘legacy’ BL digitised content into the future Digital Asset Management System (DAMPS). This involves building a prototype for a SQL database to aggregate metadata about digitised content and preparing for SIP creation. In addition, I will write basic queries to aid in our ongoing analysis about these TIFF files, including planning for storage, copyright, digital preservation and duplicate analysis. I will use Python to import sample metadata from BL sources like SharePoint, Excel and BL catalogues – currently used for analysis of ‘live’ and ‘legacy’ digitised BL collections. There is at least 1 PB of digitised content on the BL networks alone, as well as on external media such as hard-drives and CDs. We plan to only ingest one copy of each digitised TIFF file set and need to ensure that the metadata is accurate and up-to-date at the point of ingest. This database, the Pre-Ingest Metadata Management System (PIMMS), could serve as a central metadata repository for legacy digitised BL collections until then. I look forward to using Python and SQL, as well as drawing on the coding skills from others, to make these processes more efficient and effective going forward.

Exploring, cleaning and visualising catalogue metadata

Alex Hailey, British Library
Blog / Video
Working with catalogue metadata for the India Office Records (IOR) I will undertake three tasks: 1) converting c. 430,000 IOR/E index entries to descriptions within the relevant volume entries; 2) producing an SQL database for 46,500 IOR/P descriptions, allowing enhanced search when compared with the BL catalogue; and 3) creating Python scripts for searching, analysis and visualisation, to be demonstrated on dataset(s) and delivered through Jupyter Notebooks.

Automatic generation of unique reference numbers for structured archival data

Graham Jevon, British Library
Blog / Video / GitHub
The British Library’s Endangered Archives Programme (EAP) funds the digital preservation of endangered archival material around the world. Third party researchers digitise material and send the content to the British Library. This is accompanied by an Excel spreadsheet containing metadata that describes the digitised content. EAP’s main task is to clean, validate, and enhance the metadata prior to ingesting it into the Library’s cataloguing system (IAMS). One of these tasks is the creation of unique catalogue reference numbers for each record (each row of data on the spreadsheet). This is a predominantly manual process that is potentially time consuming and subject to human inputting errors. This project seeks to solve this problem. The intention is to create a Windows executable program that will enable users to upload a csv file, enter a prefix, and then click generate. The instant result will be an export of a new csv file, which contains the data from the original csv file plus automatically generated catalogue reference numbers. These reference numbers are not random. They are structured in accordance with an ordered archival hierarchy. The program will include additional flexibility to account for several variables, including language encoding, computational efficiency, data validation, and wider re-use beyond EAP and the British Library.
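To make the idea of hierarchy-driven reference numbers concrete, here is a minimal sketch. It is not the project's program: the `Level` column, the three level names, the `/` separator and the `EAP001` prefix are all assumptions made for the example.

```python
import csv
import io

# Hypothetical level names, ordered from the top of the hierarchy downwards.
LEVELS = ["Collection", "Series", "File"]


def add_references(csv_text, prefix, separator="/"):
    """Return CSV text with a Reference column numbered by hierarchy position."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + ["Reference"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    counters = []  # counters[i] is the running number at depth i + 1
    for row in reader:
        depth = LEVELS.index(row["Level"])  # 0 for the collection itself
        if depth > len(counters) + 1:
            raise ValueError(f"Row skips a level of the hierarchy: {row}")
        if depth == 0:
            counters = []            # the top level carries the bare prefix
        elif depth == len(counters) + 1:
            counters.append(1)       # first child of the previous row
        else:
            counters = counters[:depth]
            counters[-1] += 1        # next sibling at this depth
        row["Reference"] = separator.join([prefix] + [str(n) for n in counters])
        writer.writerow(row)
    return out.getvalue()


SAMPLE = (
    "Level,Title\n"
    "Collection,Papers\n"
    "Series,Letters\n"
    "File,Letter 1\n"
    "File,Letter 2\n"
    "Series,Maps\n"
    "File,Map 1\n"
)
print(add_references(SAMPLE, "EAP001"))
```

The sketch relies on the rows already being in hierarchical order, so that each row's position relative to the previous one determines its number.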

Automating Metadata Extraction in Born Digital Processing

Callum McKean, British Library
Video
To automate the metadata extraction section of the Library’s current work-flow for born-digital processing using Python, then interrogate and collate information in new ways using the SQLite module.

Analysis of peak customer interactions with Reference staff at the British Library: a software solution

Jaimee McRoberts, British Library
Video
The British Library, facing on-going budget constraints, has a need to efficiently deploy Reference Services staff during peak periods of demand. The service would benefit from analysis of existing statistical data recording the timestamp of each customer interaction at a Reference Desk. In order to do this, a software solution is required to extract, analyse, and output the necessary data. This project report demonstrates a solution utilising Python alongside the pandas library which has successfully achieved the required data analysis.
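The kind of analysis described can be sketched in a few lines. The project itself used pandas; this example swaps in the Python standard library so that it runs without extra installs, and the timestamps and their format are invented for illustration.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Hypothetical interaction log: one timestamp per enquiry at a Reference Desk.
TIMESTAMPS = [
    "2019-11-04 10:15:00",
    "2019-11-04 10:45:00",
    "2019-11-04 14:05:00",
    "2019-11-05 10:30:00",
    "2019-11-11 10:20:00",
]


def busiest_slots(timestamps, top_n=3):
    """Count interactions per (weekday, hour) slot and return the busiest ones."""
    counts = Counter()
    for text in timestamps:
        moment = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
        counts[(DAY_NAMES[moment.weekday()], moment.hour)] += 1
    return counts.most_common(top_n)


for (day, hour), total in busiest_slots(TIMESTAMPS):
    print(f"{day} {hour:02d}:00 - {total} interaction(s)")
```

Grouping by weekday and hour is only one possible choice; the same counter could be keyed by desk, by month or by half-hour slot.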

Enhancing the data in the Manorial Documents Register (MDR) and making it more accessible

Elisabeth Novitski, The National Archives
Video
To develop computer scripts that will take the data from the existing separate and inconsistently formatted files and merge them into a consistent and organised dataset. This data will be loaded into the Manorial Documents Register (MDR) and National Register of Archives (NRA) to provide the user with improved search ability and access to the manorial document information.

Automating data analysis for collection care research at The National Archives: spectral and textual data

Lucia Pereira Pardo, The National Archives
The day-to-day work of a conservation scientist working for the care of an archival collection involves acquiring experimental data from the varied range of materials present in the physical records (inks, pigments, dyes, binding media, paper, parchment, photographs, textiles, degradation and restoration products, among others). To this end, we use multiple and complementary analytical and testing techniques, such as X-ray fluorescence (XRF), Fourier Transform Infrared (FTIR) and Fibre Optic Reflectance spectroscopies (FORS), multispectral imaging (MSI), colour and gloss measurements, microfading (MFT) and other accelerated ageing tests.  The outcome of these analyses is a heterogeneous and often large dataset, which can be challenging and time-consuming to process and analyse. Therefore, the objective of this project is to automate these tasks when possible, or at least to apply computing techniques to optimise the time and efforts invested in routine operations, so that resources are freed for actual research and more specialised and creative tasks dealing with the interpretation of the results.

Improving efficiencies in content development through batch processing and the automation of workloads

Harriet Roden, British Library
Video
To support and enrich the curriculum, the British Library’s Digital Learning team produces large-scale content packages for online learners through individual projects. Because the team relies on other internal teams within the content delivery workflow, a substantial amount of resource is spent on routine tasks that duplicate collection metadata across various databases. To reduce inefficiencies, increase productivity and improve reliability, my project aimed to alleviate pressures across the workflow by automating workloads in four separate phases.

The Botish Library: building a poetry printing machine with Python

Giulia Carla Rossi, British Library
Blog / Video
This project aims to build a poetry printing machine, as a creative output that unites traditional content, new media and Python. The poems will be sourced from the British Library Digitised Books dataset collection, available under Public Domain Mark; I will sort through the datasets and identify which titles can be categorised as poetry using Python. I will then create a new dataset comprising these poetry books and relative metadata, which will then be connected to the printer with a Python script. The poetry printing machine will print randomized poems from this new dataset, together with some metadata (e.g. poem title, book title, author and shelfmark ID) that will allow users to easily identify the book.

Automating data entry in the UOSH Tracking Database

Chris Weaver, British Library
The proposed software solution is the creation of a Python script (to feature as a module in a larger script) to extract data from a web-based tool (either by obtaining data in JSON format via the site's API or by accessing the database powering the site directly). The data obtained is then formatted and inserted into corresponding fields in a Microsoft SQL Server database.

Final Module

Following the completion of the trial, participants had the opportunity to complete their PGCert in Applied Data Science by attending the final module, Analytic Tools for Information Professionals, which was part of the official course launched last autumn. We followed up with some of the participants to hear more about their experience of the full course:

“The third and final module of the computing for cultural heritage course was not only fascinating and enjoyable, it was also really pertinent to my job and I was immediately able to put the skills I learned into practice.  

The majority of the third module focussed on machine learning. We studied a number of different methods and one of these proved invaluable to the Agents of Enslavement research project I am currently leading. This project included a crowdsourcing task which asked the public to draw rectangles around four different types of newspaper advertisement. The purpose of the task was to use the coordinates of these rectangles to crop the images and create a dataset of adverts that can then be analysed for research purposes. To help ensure that no adverts were missed and to account for individual errors, each image was classified by five different people.  

One of my biggest technical challenges was to find a way of aggregating the rectangles drawn by five different people on a single page in order to calculate the rectangles of best fit. If each person only drew one rectangle, it was relatively easy for me to aggregate the results using the coding skills I had developed in the first two modules. I could simply find the average (or mean) of the five different classification attempts. But what if people identified several adverts and therefore drew multiple rectangles on a single page? For example, what if person one drew a rectangle around only one advert in the top left corner of the page; people two and three drew two rectangles on the same page, one in the top left and one in the top right; and people four and five drew rectangles around four adverts on the same page (one in each corner). How would I be able to create a piece of code that knew how to aggregate the coordinates of all the rectangles drawn in the top left and to separately aggregate the coordinates of all the rectangles drawn in the bottom right, and so on?  

One solution to this problem was to use an unsupervised machine learning method to cluster the coordinates before running the aggregation method. Much to my amazement, this worked perfectly and enabled me to successfully process the total of 92,218 rectangles that were drawn and create an aggregated dataset of more than 25,000 unique newspaper adverts.” 

- Graham Jevon, EAP Cataloguer; BL Endangered Archives Programme 
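The aggregation problem Graham describes above, grouping rectangles that refer to the same advert before averaging them, can be illustrated with a small sketch. The quote does not name the clustering method he used, so this example swaps in a simple greedy grouping by overlap (intersection over union); the coordinates and the 0.5 threshold are invented for illustration.

```python
def mean_rect(cluster):
    """Average a list of rectangles given as (x1, y1, x2, y2)."""
    count = len(cluster)
    return tuple(sum(rect[i] for rect in cluster) / count for i in range(4))


def iou(a, b):
    """Intersection over union of two rectangles given as (x1, y1, x2, y2)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    if width <= 0 or height <= 0:
        return 0.0  # the rectangles do not overlap
    intersection = width * height
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return intersection / (area_a + area_b - intersection)


def aggregate(rectangles, threshold=0.5):
    """Group rectangles that overlap strongly, then return one mean rectangle per group."""
    clusters = []
    for rect in rectangles:
        for cluster in clusters:
            # Compare against the running average of the group so far.
            if iou(rect, mean_rect(cluster)) >= threshold:
                cluster.append(rect)
                break
        else:
            clusters.append([rect])  # no group matched: start a new one
    return [mean_rect(cluster) for cluster in clusters]


# Hypothetical classifications of one page: three people marked the top-left
# advert, two of them also marked the top-right advert.
DRAWN = [
    (10, 10, 110, 60), (12, 8, 108, 62), (8, 12, 112, 58),
    (300, 10, 400, 60), (302, 12, 398, 58),
]
print(aggregate(DRAWN))
```

A greedy pass like this depends on the order of the input and on the chosen threshold, which is one reason a dedicated clustering algorithm may be preferable on real crowdsourced data.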

“The final module of the course was in some ways the most challenging — requiring a lot of us to dust off the statistics and algebra parts of our brain. However, I think, it was also the most powerful; revealing how machine learning approaches can help us to uncover hidden knowledge and patterns in a huge variety of different areas.  

Completing the course during COVID meant that collection access was limited, so I ended up completing a case study examining how generic tropes have evolved in science fiction across time using a dataset extracted from GoodReads. This work proved to be exceptionally useful in helping me to think about how computers understand language differently; and how we can leverage their ability to make statistical inferences in order to support our own, qualitative analyses. 

In my own collection area, working with born digital archives in Contemporary Archives and Manuscripts, we treat draft material — of novels, poems or anything else — as very important to understanding the creative process. I am excited to apply some of these techniques — particularly Unsupervised Machine Learning — to examine the hidden relationships between draft material in some of our creative archives. 

The course has provided many, many avenues of potential enquiry like this and I’m excited to see the projects that its graduates undertake across the Library.” 

- Callum McKean, Lead Curator, Digital; Contemporary British Collection

“I really enjoyed the Analytics Tools for Data Science module. As a data science novice, I came to the course with limited theoretical knowledge of how data science tools could be applied to answer research questions. The choice of using real-life data to solve queries specific to professionals in the cultural heritage sector was really appreciated as it made everyday applications of the tools and code more tangible. I can see now how curators’ expertise and specialised knowledge could be combined with tools for data analysis to further understanding of and meaningful research in their own collection area.”

- Giulia Carla Rossi, Curator, Digital Publications; Contemporary British Collection

Please note this page was originally published in Feb 2021 and some of the resources, job titles and locations may now be out of date.

02 May 2024

Recovered Pages: A Digital Transformation Story

The British Library is continuing to recover from last year’s cyber-attack. While our teams work to restore our services safely and securely, one of our goals in the Digital Research Team is to get some of the information from our currently inaccessible web pages into an easily readable and shareable format. We’ll be sharing these pages via blog posts here, with information recovered from the Wayback Machine, a fantastic initiative of the Internet Archive.  

The second page in this series is a case study on the impact of our Digital Scholarship Training Programme, captured by the Wayback Machine on 3 October 2023. 

 

Graham Jevon: A Digital Transformation Story

'The Digital Scholarship Training Programme has introduced me to new software, opened my eyes to digital opportunities, provided inspiration for me to improve, and helped me attain new skills'


Key points

  • Graham Jevon has been an active participant in the Digital Scholarship Training Programme
  • Through gaining digital skills he has been able to build software to automate tricky processes
  • Graham went on to become a Coleridge Fellowship scholar, putting these digital skills to good use!

Find out more on what Graham has been up to on his Staff Profile

Did you know? The Digital Scholarship Training Programme has been running since 2012, and creates opportunities for staff to develop necessary skills and knowledge to support emerging areas of modern scholarship.

The Digital Scholarship Training Programme

Since I joined the Library in 2018, the Digital Scholarship Training Programme has been integral to the trajectory of both my personal development and the working practices within my team.

The very first training course I attended at the library was the introduction to OpenRefine. The key thing that I took away from this course was not necessarily the skills to use the software, but simply understanding OpenRefine’s functionality and the possibilities the software offered for my team. This inspired me to spend time after the session devising a workflow that enhanced our cataloguing efficiency and accuracy, enabling me to create more detailed and accurate metadata in less time. With OpenRefine I created a semi-automated workflow that required the kind of logical thinking associated with computer programming, but without the need to understand a computer programming language.

 

Computing for Cultural Heritage

The use of this kind of logical thinking and the introduction to writing computational expressions within OpenRefine sparked an interest in me to learn a computing language such as Python. I started a free online Python introduction, but without much context to the course my attention quickly waned. When the Digital Scholarship Computing for Cultural Heritage course was announced I therefore jumped at the chance to apply. 

I went into the Computing for Cultural Heritage course hoping to learn skills that would enable me to solve cataloguing and administrative problems, skills that would help me process data in spreadsheets more efficiently and accurately. I had one particular problem in mind and I was able to address this problem in the project module of the course. For the project we had to design a software program. I created a program (known as ReG), which automatically generates structured catalogue references for archival collections. I was extremely pleased with the outcome of this project and this piece of software is something that my team now use in our day-to-day activities. An error-prone task that could take hours or days to complete manually in Excel now takes just a few seconds and is always 100% accurate.

This in itself was a great outcome of the course that met my hopes at the outset. But this course did so much more. I came away from the course with a completely new set of data science skills that I could build on and apply in other areas. For example, I recently created another piece of software that helps my team survey any digitisation data that we receive, to help us spot any errors or problems that need fixing.

 

 

The British Library Coleridge Research Fellowship

The data science skills were particularly instrumental in enabling me to apply successfully for the British Library’s Coleridge research fellowship. This research fellowship is partly a personal development scheme and it gave me the opportunity to put my new data science skills into practice in a research environment (rather than simply using them in a cataloguing context). My previous academic research experience was based on traditional analogue methods. But for the Coleridge project I used crowdsourcing to extract data for analysis from two collections of newspapers.

A screenshot of a Guardian article that covered the work Graham has done, titled 'Secrets of rebel slaves in Barbados will finally be revealed'

The third and final Computing for Cultural Heritage module focussed on machine learning and I was able to apply these skills directly to the crowdsourcing project Agents of Enslavement. The first crowdsourcing task, for example, asked the public to draw rectangles around four specific types of newspaper advertisement. To help ensure that no adverts were missed and to account for individual errors, each image was classified by five different people. I therefore had to aggregate the results. Thanks to the new data science skills I had learned, I was able to write a Python script that used machine learning algorithms to aggregate 92,000 total rectangles drawn by the public into an aggregated dataset of 25,000 unique newspaper advertisements.

The OpenRefine and Computing for Cultural Heritage courses are just two of the many digital scholarship training sessions that I have attended. But they perfectly illustrate the value of the Digital Scholarship Training Programme, which has introduced me to new software, opened my eyes to digital opportunities, provided inspiration for me to improve, and helped me attain new skills that I have been able to put into practice both for the benefit of myself and my team.

29 April 2024

Recovered Pages: Digital Scholarship Training Programme

The British Library is continuing to recover from last year’s cyber-attack. While our teams work to restore our services safely and securely, one of our goals in the Digital Research Team is to get some of the information from our currently inaccessible web pages into an easily readable and shareable format. We’ll be sharing these pages via blog posts here, with information recovered from the Wayback Machine, a fantastic initiative of the Internet Archive.  

The first page in this series is about our Digital Scholarship Training Programme, captured by the Wayback Machine on 27 September 2023.  

 

The Digital Scholarship Training Programme 

A laptop with one of the online tutorials covered in a Hack & Yack

The Digital Scholarship Training Programme has been running since 2012, and creates opportunities for staff to develop necessary skills and knowledge to support emerging areas of modern scholarship. 

 

About 

This internal and bespoke staff training programme is one of the cornerstones of the Digital Curator Team’s work at the British Library. Running since 2012, it provides colleagues with the space and opportunity to delve into and explore all that digital content and new technologies have to offer in the research domain today. The Digital Curator team oversees the design and delivery of roughly 50-60 training events a year. Since its inception, well over a thousand individual staff members have come through the programme, on average attending three or more courses each, and the Library has seen a steep change in its capacity to support innovative digital research.  

 

Objectives 

  1. Staff are familiar and conversant with the foundational concepts, methods and tools of digital scholarship. 
  2. Staff are empowered to innovate. 
  3. Collaborative digital initiatives flourish across subject areas within the Library as well as externally.
  4. Our internal capacity for training and skill-sharing in digital scholarship is a shared responsibility across the Library. 

 

The Programme 

What's it all about? 

To celebrate our ten-year anniversary, we created a series of video testimonials from the people behind the Training Programme - coordinators, instructors, and attendees. Click 'Watch on YouTube' to view the whole series of videos.

 

Nora McGregor, Digital Curator, gives a presentation all about the Digital Scholarship Training Programme - where it started, where it's going and what it hopes to accomplish. 

 

Courses 

As digital research methods have changed over time, so too have course topics and content. Today's full course catalogue reflects this through a diversity of topics, from cleaning up data and digital storytelling to command line programming and geo-referencing. 

Courses range from half-days to full-day workshops for no more than 15 attendees at a time and are taught mainly by staff members but also external trainers where necessary. Example courses include: 

105 Crowdsourcing in Libraries, Museums and Cultural Heritage Institutions 

107 Data Visualisation for Cultural Heritage Collections 

109 Information Integration: Mash-ups, API’s and Linked Data 

118 Cleaning up Data 

 

Hack & Yacks 

We host a monthly “Hack & Yack” to run alongside the more formal training programme. During these two-hour self-paced casual meet-ups, open to all staff, the group works through a variety of online tutorials on a particular digital topic. Example sessions include: 

Transcribing Handwritten Text 

Transforming XML with XSLT 

Interactive writing platforms 

 

Digital Scholarship Reading Group 

The Digital Scholarship Reading Group holds informal discussions on the first Tuesday of each month. Each month we discuss an article, conference, podcast or video related to digital scholarship. It's a great way to keep up with new ideas or reality check trends in digital scholarship (including the digital humanities). We welcome people from any department in the Library, and take suggestions for topics that are particularly relevant to diverse teams or disciplines. 

Curious about what we cover? Check out this previous blog post that covers the last five years of our Reading Group.

 

21st Century Curatorship Talk Series 

The Digital Scholarship team hosts the 21st Century Curatorship Programme (C21st), a series of professional development talks and seminars, open to all staff, providing a forum for keeping up with new developments and emerging technologies in scholarship, libraries and cultural heritage. 
 

What’s new? 

In 2019, the British Library and partners Birkbeck University and The National Archives were awarded £222,420 in funding by the Institute of Coding (IoC) to co-develop a one-year part-time postgraduate Certificate (PGCert), Computing for Cultural Heritage, as part of a £4.8 million University skills drive. The new course aims to provide working professionals, particularly across the GLAM sector (Galleries, Libraries, Archives and Museums), with an understanding of basic programming, analytic tools and computing environments to support them in their daily work.  

 

Further information 

For more information on the Training Programme's most recent year, including our performance numbers and topics covered by the training, please see our full screen, interactive infographic.

Please also see our two conference papers from Digital Humanities 2013 and Digital Humanities 2016 for more details on how the Training Programme was established. Any queries about this project can be directed to [email protected]. 

15 March 2024

Call for proposals open for DigiCAM25: Born-Digital Collections, Archives and Memory conference

Digital research in the arts and humanities has traditionally tended to focus on digitised physical objects and archives. However, born-digital cultural materials that originate and circulate across a range of digital formats and platforms are rapidly expanding and increasing in complexity, which raises opportunities and issues for research and archiving communities. Collecting, preserving, accessing and sharing born-digital objects and data presents a range of technical, legal and ethical challenges that, if unaddressed, threaten the archival and research futures of these vital cultural materials and records of the 21st century. Moreover, the environments, contexts and formats through which born-digital records are mediated necessitate reconceptualising the materials and practices we associate with cultural heritage and memory. Research and practitioner communities working with born-digital materials are growing and their interests are varied, from digital cultures and intangible cultural heritage to web archives, electronic literature and social media.

To explore and discuss issues relating to born-digital cultural heritage, the Digital Humanities Research Hub at the School of Advanced Study, University of London, in collaboration with British Library curators, colleagues from Aarhus University and the Endangered Material Knowledge Programme at the British Museum, is currently inviting submissions for the inaugural Born-Digital Collections, Archives and Memory conference, which will be hosted at the University of London and online from 2-4 April 2025. The full call for proposals and submission portal are available at https://easychair.org/cfp/borndigital2025.

Text on image says Born-Digital Collections, Archives and Memory, 2 - 4 April 2025, School of Advanced Study, University of London

This international conference seeks to further an interdisciplinary and cross-sectoral discussion on how the born-digital transforms what and how we research in the humanities. We welcome contributions from researchers and practitioners involved in any way in accessing or developing born-digital collections and archives, and interested in exploring the novel and transformative effects of born-digital cultural heritage. Areas of particular (but not exclusive) interest include:

  1. A broad range of born-digital objects and formats:
    • Web-based and networked heritage, including but not limited to websites, emails, social media platforms/content and other forms of personal communication
    • Software-based heritage, such as video games, mobile applications, computer-based artworks and installations, including approaches to archiving, preserving and understanding their source code
    • Born-digital narrative and artistic forms, such as electronic literature and born-digital art collections
    • Emerging formats and multimodal born-digital cultural heritage
    • Community-led and personal born-digital archives
    • Physical, intangible and digitised cultural heritage that has been remediated in a transformative way in born-digital formats and platforms
  2. Theoretical, methodological and creative approaches to engaging with born-digital collections and archives:
    • Approaches to researching the born-digital mediation of cultural memory
    • Histories and historiographies of born-digital technologies
    • Creative research uses and creative technologist approaches to born-digital materials
    • Experimental research approaches to engaging with born-digital objects, data and collections
    • Methodological reflections on using digital, quantitative and/or qualitative methods with born-digital objects, data and collections
    • Novel approaches to conceptualising born-digital and/or hybrid cultural heritage and archives
  3. Critical approaches to born-digital archiving, curation and preservation:
    • Critical archival studies and librarianship approaches to born-digital collections
    • Preserving and understanding obsolete media formats, including but not limited to CD-ROMs, floppy disks and other forms of optical and magnetic media
    • Preservation challenges associated with the platformisation of digital cultural production
    • Semantic technology, ontologies, metadata standards, markup languages and born-digital curation
    • Ethical approaches to collecting and accessing ‘difficult’ born-digital heritage, such as traumatic or offensive online materials
    • Risks and opportunities of generative AI in the context of born-digital archiving
  4. Access, training and frameworks for born-digital archiving and collecting:
    • Institutional, national and transnational approaches to born-digital archiving and collecting
    • Legal, trustworthy, ethical and environmentally sustainable frameworks for born-digital archiving and collecting, including attention to cybersecurity and safety concerns
    • Access, skills and training for born-digital research and archives
    • Inequalities of access to born-digital collecting and archiving infrastructures, including linguistic, geographic, economic, legal, cultural, technological and institutional barriers

Options for Submissions

A number of different submission types are welcomed and there will be an option for some presentations to be delivered online.

  • Conference papers (150-300 words)
    • Presentations lasting 20 minutes. Papers will be grouped with others on similar subjects or themes to form a complete session. There will be time for questions at the end of each session.
  • Panel sessions (100 word summary plus 150-200 words per paper)
    • Proposals should consist of three or four 20-minute papers. There will be time for questions at the end of each session.
  • Roundtables (200-300 word summary and 75-100 word bio for each speaker)
    • Proposals should include between three and five speakers, inclusive of a moderator, and each session will be no more than 90 minutes.
  • Posters, demos & showcases (100-200 words)
    • These can be traditional printed posters, digital-only posters, digital tool showcases, or software demonstrations. Please indicate the form your presentation will take in your submission.
    • If you propose a technical demonstration of some kind, please include details of technical equipment to be used and the nature of assistance (if any) required. Organisers will be able to provide a limited number of external monitors for digital posters and demonstrations, but participants will be expected to provide any specialist equipment required for their demonstration. Where appropriate, posters and demos may be made available online for virtual attendees to access.
  • Lightning talks (100-200 words)
    • Talks will be no more than 5 minutes and can be used to jump-start a conversation, pitch a new project, find potential collaborations, or try out a new idea. Reports on completed projects would be more appropriately given as 20-minute papers.
  • Workshops (150-300 words)
    • Please include details about the format, length, proposed topic, and intended audience.

Proposals will be reviewed by members of the programme committee. The peer review process will be double-blind, so no names or affiliations should appear on the submissions. The one exception is proposals for roundtable sessions, which should include the names of proposed participants. All authors and reviewers are required to adhere to the conference Code of Conduct.

The submission deadline for proposals, originally 15 May 2024, has been extended to 7 June 2024, and notification of acceptance is now scheduled for early August 2024. Organisers plan to make a number of bursaries available to presenters to cover the cost of attendance and details about these will be shared when notifications are sent. 

Key Information:

  • Dates: 2 - 4 April 2025
  • Venue: University of London, London, UK & online
  • Call for papers deadline: 7 June 2024
  • Notification of acceptance: early August 2024
  • Submission link: https://easychair.org/cfp/borndigital2025

Further details can be found on the conference website and the call for proposals submission portal at https://easychair.org/cfp/borndigital2025. If you have any questions about the conference, please contact the organising committee at [email protected].

13 March 2024

Rethinking Web Maps to present Hans Sloane’s Collections

A post by Dr Gethin Rees, Lead Curator, Digital Mapping...

I have recently started a community fellowship working with geographical data from the Sloane Lab project. The project is titled A Generous Approach to Web Mapping Sloane’s Collections and deals with the collection of Hans Sloane, amassed in the eighteenth century and a foundation collection for the British Museum and subsequently the Natural History Museum and the British Library. The aim of the fellowship is to create interactive maps that enable users to view the global breadth of Sloane’s collections, to discover collection items and to click through to their web pages. The Sloane Lab project, funded by the UK’s Arts and Humanities Research Council as part of the Towards a National Collection programme, has created the Sloane Lab knowledge base (SLKB), a rich and interconnected knowledge graph of this vast collection. My fellowship seeks to link and visualise digital representations of British Museum and British Library objects in the SLKB and I will be guided by project researchers Andreas Vlachidis and Daniele Metilli from University College London.

Photo of a bust sculpture of a man in a curled wig on a red brick wall
Figure 1. Bust of Hans Sloane in the British Library.

The first stage of the fellowship is to use data science methods to extract place names from the records of Sloane’s collections that exist in the catalogues today. These records will then be aligned with a gazetteer, a list of places and associated data, such as World Historical Gazetteer (https://whgazetteer.org/). Such alignment results in obtaining coordinates in the form of latitude and longitude. These coordinates mean the places can be displayed on a map, and the fellowship will draw on Peripleo web map software to do this (https://github.com/britishlibrary/peripleo).
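The alignment step described above can be illustrated with a minimal Python sketch. It is not the fellowship's actual workflow: the gazetteer here is a tiny in-memory dictionary with hypothetical entries and approximate coordinates, and the `normalise` and `align` helpers are names invented for this illustration. A real alignment would query a full gazetteer such as World Historical Gazetteer and would need to handle ambiguous and historical place names.

```python
import unicodedata

# Toy gazetteer (hypothetical entries): normalised name -> approximate (latitude, longitude).
GAZETTEER = {
    "jamaica": (18.1, -77.3),
    "london": (51.5, -0.1),
    "china": (35.0, 105.0),
}


def normalise(name):
    """Strip accents, surrounding whitespace and punctuation, then lower-case."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_accents.strip(" .,;:").lower()


def align(place_names, gazetteer=GAZETTEER):
    """Match extracted place names against the gazetteer.

    Returns (matched, unmatched): matched maps each original name to its
    coordinates; unmatched lists names that need manual review.
    """
    matched, unmatched = {}, []
    for name in place_names:
        coords = gazetteer.get(normalise(name))
        if coords is None:
            unmatched.append(name)
        else:
            matched[name] = coords
    return matched, unmatched


# Place names as they might appear in catalogue records (illustrative only).
matched, unmatched = align(["Jamaica", " London.", "Terra Incognita"])
print(matched)
print(unmatched)
```

Keeping the unmatched names in a separate list matters in practice, since names that fail automatic matching are often the ones that need a curator's attention.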

Image of a rectangular map with circles overlaid on locations
Figure 2 Web map using Web Mercator projection, from the Georeferencer.

https://britishlibrary.oldmapsonline.org/api/v1/density

The fellowship also aims to critically evaluate the use of mapping technologies (eg Google Maps Embed API, MapBoxGL, Leaflet) to present cultural heritage collections on the web. One area that I will examine is the use of the Web Mercator projection as a standard option for presenting humanities data using web maps. A map projection is a method of representing part of the surface of the earth on a plane (flat) surface. The transformation from a sphere or similar to a flat representation always introduces distortion. There are innumerable projections or ways to make this transformation and each is suited to different purposes, with strengths and weaknesses. Web maps are predominantly used for navigation and the Web Mercator projection is well suited to this purpose as it preserves angles.

Image of a rectangular map with circles illustrating that countries nearer the equator are shown as relatively smaller
Figure 3 Map of the world based on Mercator projection including indicatrices to visualise local distortions to area. By Justin Kunimune. Source https://commons.wikimedia.org/wiki/File:Mercator_with_Tissot%27s_Indicatrices_of_Distortion.svg Used under CC-BY-SA-4.0 license. 
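To make the distortion shown by the indicatrices concrete, here is a short Python sketch of the standard spherical Mercator formulas that Web Mercator is based on. The function names are chosen for this illustration, and the radius is the WGS84 semi-major axis that Web Mercator conventionally uses. Because the projection preserves angles, the local scale is the same in every direction (1/cos of the latitude), so areas are inflated by the square of that factor.

```python
import math

R = 6378137.0  # sphere radius in metres (WGS84 semi-major axis, as used by Web Mercator)


def web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in metres."""
    lam = math.radians(lon_deg)
    phi = math.radians(lat_deg)
    x = R * lam
    y = R * math.log(math.tan(math.pi / 4 + phi / 2))
    return x, y


def linear_scale(lat_deg):
    """Local scale factor, identical east-west and north-south (angles preserved)."""
    return 1.0 / math.cos(math.radians(lat_deg))


def area_scale(lat_deg):
    """Local area inflation: the square of the linear scale factor."""
    return linear_scale(lat_deg) ** 2


for lat in (0, 20, 40, 60):
    print(f"latitude {lat:>2}: areas inflated by a factor of {area_scale(lat):.2f}")
```

At the equator the factor is 1, while at 60 degrees latitude it is 4, which is why landmasses far from the equator appear much larger than they are relative to equatorial regions.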

However, this does not necessarily mean it is the right projection for presenting humanities data. Indeed, it is unsuitable for the aims and scope of Sloane Lab. First, it has well-documented visual compromises (such as the inflation of landmasses like Europe at the expense of, for example, Africa and the Caribbean) that not only hamper visual analysis but also recreate and reinforce global inequities and injustices. Second, the Mercator projection has a history entangled with processes like colonialism, empire and slavery that also shaped Hans Sloane’s collections. The fellowship therefore examines the use of other projections, such as those that preserve distance and area, to represent contested collections and collecting practices in interactive maps built with libraries like Leaflet or OpenLayers. Geography is intimately connected with identity and thus digital maps offer powerful opportunities for presenting cultural heritage collections. The fellowship examines how reinvention of a commonly used visualisation form can foster thought-provoking engagement with Sloane’s collections and hopefully be applied to visualise the geography of heritage more widely.

Image of a curved map that represents the relative size of countries more accurately
Figure 4 Map of the world based on Albers equal-area projection including indicatrices to visualise local distortions to area. By Justin Kunimune. Source  https://commons.wikimedia.org/wiki/File:Albers_with_Tissot%27s_Indicatrices_of_Distortion.svg Used under CC-BY-SA-4.0 license. 
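The difference between an angle-preserving and an area-preserving projection can be checked numerically. The Python sketch below is illustrative only: it uses the Lambert cylindrical equal-area projection as a simpler stand-in for the Albers projection shown in Figure 4, works on a unit sphere, and uses function names invented for this example. It compares the true area of a latitude/longitude cell on the sphere with the area of the same cell after projection. Both projections are cylindrical, so each cell maps to a rectangle.

```python
import math


def sin_deg(lat_deg):
    return math.sin(math.radians(lat_deg))


def cell_area_on_sphere(lat1, lat2, dlon, radius=1.0):
    """True area of the cell between two parallels, dlon degrees of longitude wide."""
    return radius ** 2 * math.radians(dlon) * (sin_deg(lat2) - sin_deg(lat1))


def cell_area_equal_area(lat1, lat2, dlon, radius=1.0):
    """Projected area under Lambert cylindrical equal-area (x = R*lon, y = R*sin(lat))."""
    width = radius * math.radians(dlon)
    height = radius * (sin_deg(lat2) - sin_deg(lat1))
    return width * height


def mercator_y(lat_deg, radius=1.0):
    return radius * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))


def cell_area_mercator(lat1, lat2, dlon, radius=1.0):
    """Projected area of the same cell under the Mercator projection."""
    width = radius * math.radians(dlon)
    height = mercator_y(lat2, radius) - mercator_y(lat1, radius)
    return width * height


def mercator_inflation(lat1, lat2, dlon=10.0):
    """Ratio of Mercator-projected area to true area for one cell."""
    return cell_area_mercator(lat1, lat2, dlon) / cell_area_on_sphere(lat1, lat2, dlon)


# Compare a cell near the equator with one at the latitude of northern Europe.
for lat1, lat2 in ((0, 10), (50, 60)):
    print(f"{lat1}-{lat2} degrees: Mercator inflates area by {mercator_inflation(lat1, lat2):.2f}x")
```

Under the equal-area projection the projected and true areas agree for every cell, whereas under Mercator the inflation grows with latitude.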

28 February 2024

Safeguarding Tomorrow: The Impact of AI on Media and Information Industries

The British Library has joined forces with the Guardian to hold a summit on the complex policy impacts of AI on media and information industries. The summit, chaired by broadcaster and author Timandra Harkness, brings together politicians, policy makers, industry leaders, artists and academics to shed light on key issues facing the media, newspapers, broadcasting, library and publishing industries in the age of AI. The summit was on Monday 11 March 2024 14:00 - 17:20; networking reception 17:30 - 19:00 GMT.

The video of the event is on YouTube at https://www.youtube.com/watch?v=4muZybkzMU4 and embedded below.

 

Lucy Crompton-Reid, Chief Executive of Wikimedia UK; Sara Lloyd, Group Communications Director & Global AI Lead at Pan Macmillan and Matt Rogerson from the Guardian will tackle the issue of copyright in the age of algorithms.

Novelist Tahmima Anam; Greg Clark MP, Chair Science & Technology Committee; Chris Moran from the Guardian and Roly Keating, Chief Executive of the British Library will discuss the issue of AI generated misinformation and bias.

 

Speakers on stage at the AI Summit. Photo credit Mia Ridge

AI is rapidly changing the world as we know it, and the media and information industries are no exception. AI-powered technologies are already being used to automate tasks, create personalised content, and deliver targeted advertising. In the process AI is quickly becoming both a friend and a foe. People can use AI to flood the online environment with misinformation, creating significant worries, for example, around how deep fakes, and AI personalised and targeted content could influence democratic processes. At the same time, AI could become a key tool to combat misinformation by identifying fake news articles and social media posts.

Many creators of content - from the organisations creating and publishing content, to individual authors, artists and actors - are worried that their copyright has been infringed by AI and we have already seen a flurry of legal action, mostly in the United States. At the same time, many artists are embracing AI as a part of their creative process. The recent British Library exhibition on Digital Storytelling explored the ways technology provides new opportunities to transform and enhance the way writers write and readers engage, including interactive works that invite and respond to user input, and reading experiences influenced by data feeds.

And it is not only in the world of news that there is a danger of AI misinformation. In science, where AI is revolutionising many areas of research, from helping us discover new drugs to aiding research on the complexities of climate change, we are at the same time encountering the issue of fake, AI generated scientific articles. For libraries, AI holds the future promise of improving discovery and access to information, which would help library users to find relevant information quickly. Yet AI is also introducing significant new challenges when it comes to understanding the provenance of information sources, especially in making the public aware if the information has been created or selected by algorithms rather than human beings.

How will we know - and will we care - if our future newspapers, television programmes and library enquiries are mediated and delivered by AI? Or if the content we are consuming is a machine rather than a human creation? We are used to making judgements about people and organisations that we trust on the basis of how we perceive their professional integrity, political leanings, their stance on the issues that we care about, or just likability and charisma of the individual in front of us. How will we make similar judgments about an algorithm and its inherent bias? And how will we govern and manage this new AI-powered environment?

Governmental regulation of AI is under development in the UK, the US, the EU and elsewhere. At the beginning of February 2024 the UK government released its response to the UK AI Regulation White Paper, signalling the continuation of ‘agile’ AI regulation in the UK, which attempts to balance innovation and economic benefits of AI while also giving greater responsibility related to AI to existing regulators. The government’s response also reserves an option for more binding regulation in the future. For some, such as tech companies investing in AI products, this creates uncertainty for their future business models. For others, especially many in the creative industries and artists affected by AI, there is disappointment at the absence of regulations on training AI using content under copyright.

Inevitably, as AI further develops and becomes more prevalent, the issues of its regulation and adoption in society will continue to evolve. AI will continue to challenge the ways in which we understand creators’ rights, individual and corporate governance and management of information, and the ways in which we acquire knowledge, trust different information sources, and form our opinions on everything from what to buy to who to vote for.

Join us to discuss the challenges and opportunities ahead. You can book your place on Eventbrite: https://www.eventbrite.co.uk/e/safeguarding-tomorrow-the-impact-of-ai-in-media-information-industries-tickets-814482728767?aff=oddtdtcreator.

09 October 2023

Strike a Pose Steampunk style! For our Late event with Clockwork Watch on Friday 13th October

This Friday (13th October) the British Library invites you to join the world of Clockwork Watch by Yomi Ayeni, a participatory storytelling project, set in a fantastical retro-futurist vision of Victorian England, with floating cities and sky pirates, which is one of the showcased narratives in our Digital Storytelling exhibition.

Flyer with text saying Late at the Library, Digital Steampunk at the British Library, London. Friday 13 October, 19:30 – 22:30

We are delighted that Dark Box Images will be bringing their portable darkroom to the Late at the Library: Digital Steampunk event and taking portrait photographs. If this appeals to you, then please arrive early to have your picture taken. Photographer Gregg McNeill is an expert in the wet plate collodion process invented by Frederick Scott Archer in 1851. Gregg’s skill in using an authentic Victorian camera creates genuinely remarkable results that appear right in front of your eyes.

Black and white photograph of a woman wearing an elaborate outfit and a mask with her arms outstretched wide with fabric like wings
Wet plate collodion photograph of Jennifer Garside of Wyte Phantom corsetry, taken by Gregg McNeill of Dark Box Images

If you want to pose for the camera at our steampunk Late, or have a portrait drawn by artist Doctor Geof, please don’t be shy: this is an event where guests are encouraged to dress to impress! The aesthetic of steampunk fashion is inspired by Victoriana and 19th Century literature, including Jules Verne’s novels and the Sherlock Holmes stories by Sir Arthur Conan Doyle. Steampunk looks can include hats and goggles, tweed tailoring, waistcoats, corsets, fob watches and fans. Whatever your personal style, we encourage you to unleash your creativity when putting together an outfit for this event.

Furthermore, whether you are seeking a new look or some finishing touches, there will be an opportunity to browse a Night Market at this Late event, where you can purchase and admire a range of exquisite hand crafted items created by:

  • Jema Hewitt, a professional costumer and academic, will be bringing some of her unique, handmade jewellery and accessories to the Library Late event. She was one of the originators of the early artistic steampunk scene in the UK, subsequently exhibiting her costume work internationally, and having three how-to-make books published as her alter ego “Emilly Ladybird”. Jema currently specialises as a pattern cutter for film, theatre and TV, as well as lecturing and teaching workshops.
Photograph of jewellery, hats and clothing
Jewellery, hats and clothing created by Jema Hewitt/Emilly Ladybird
  • Doctor Geof, an artist, scientist, comics creator and maker of whimsical objects. His work is often satirical, usually with an historical twist, and features tea, goblins, krakens, steampunk, smut, nuns, bees, cats and more tea. Since 2004 you may have encountered him selling his comics, prints, cards, mugs, pins, and for some reason a lot of embroidered badges (including an Evil Librarian patch!) at various events. As one of the foremost Steampunk artists in the UK, Doctor Geof has worked with and exhibited at the Cutty Sark, Royal Museums Greenwich, and Discovery Museum Newcastle. He is a talented portrait artist, so please seek him out if you would like him to capture your likeness in ink and watercolour.
A round embroidered patch with a cartoon figure wearing goggles and carrying books. Text says "Evil Librarian"
Evil Librarian embroidered patch by Dr Geof

  • Jennifer Garside, a seamstress specialising in modern corsetry, which takes inspiration from historical styles. Her business, Wyte Phantom, opened in 2010, and she has made costumes for opera singers, performers and artists across the world.

  • Tracy Wells, a couture milliner based in the Lake District. She creates all kinds of hats and headpieces, often collaborating with other artists to explore new styles, concepts and genres.
Photograph of a woman wearing a steampunk hat with feathers
Millinery by Tracy Wells
  • Herr Döktor, a renowned inventor, gadgeteer, and contraptionist, who has been working in his Laboratory in the Surrey Hills for the last two decades, building a better future via the prism of history. He will be bringing a small selection of his inventions and scale models of his larger ideas. (His alter ego, Ian Crichton, is a professional model maker with thirty years' experience as a toy prototype maker, museum and exhibition designer, and, most recently, builder of props and models for the film industry; he also lives in the Surrey Hills.)
Photograph of a man wearing a top hat and carrying a model submarine
Herr Döktor, inventor, gadgeteer, and contraptionist. Photograph by Adam Stait
  • Linette Withers established Anachronalia in 2012 to be a full-time bookbinder, producing historically-inspired books, miniature books, and quirky stationery. Her work has been shortlisted for display at the Bodleian Library at the University of Oxford as part of their ‘Redesigning the Medieval Book’ competition and exhibition in 2018 and one of her books is held in the permanent collection of The Lit & Phil in Newcastle after being part of an exhibition of bookbinding in 2021. She also teaches bookbinding in her studio in Leeds.

  • Heather Hayden of Diamante Queen Designs creates handmade vintage inspired, kitsch, macabre, noir accessories for everybody to wear and enjoy. Heather studied fashion and surface pattern design in the 80's near Leeds during the emergence of Gothic culture and has remained interested in the darker side of life ever since. She became fascinated with Steampunk after seeing Datamancer's Steampunk computer, loving the juxtaposition of new and old technology. This inspired her to make steampunk clothing and accessories using old and found items and upcycling as much as possible.
Photograph of a mannequin head wearing a headpiece with tassels, feathers, flowers and beads
Headpiece by Diamante Queen Designs
  • Matthew Chapman of Raphael's Workshop specialises in creating strange and sublime chainmail items, bringing ideas to life in metal that few would ever consider. From collars to corsets, serpents to squids, arms to armour and medals to masterpieces, you should visit his stall and see what creations spark the imagination.
Photograph of a table displaying a range of wearable items of chainmail jewellery and accessories
Chainmail jewellery and accessories created by Raphael's Workshop

We hope that this post has whetted your appetite for the delights available at the Late at the Library: Digital Steampunk event on Friday 13th October at the British Library. Tickets can be booked here.

02 October 2023

Last chance to see the Digital Storytelling exhibition

All good things must come to an end. No, I’m not talking about the collapse of a favourite high street chain store beginning with W, but about the final few weeks of our Digital Storytelling exhibition, which closes on the 15th October 2023. If you haven’t seen it yet, then this is your last chance to book!

Digital Storytelling showcases eleven different born digital works, including interactive narratives that respond to user input, reading experiences personalised by data feeds, and immersive multimedia story worlds developed through audience participation. The works range from thought-provoking autobiographical hypertexts to data journalism, uncanny ghost stories to weather poetry, and steampunk literary adaptation to quirky Elizabethan medical comedy. 

Digital Storytelling exhibition image with art from Astrologaster, Seed, 80 Days, and Zombies, Run!

If you want to hear more about this exhibition, Digital Curator Stella Wisdom will be giving two talks later this week. The first of these will be in-person on Thursday evening, 5th October, in Richmond Lending Library for the Richmond Reads season of events, celebrating the joys and benefits of reading. The second will be held online on Friday morning, 6th October, for the DARIAH-EU autumn 2023 Friday Frontiers series.

We are also delighted to share that there is a chapter about interactive digital books written by Giulia Carla Rossi, Curator for Digital Publications, in The Book by Design, which was recently launched by our colleagues in British Library Publishing. Giulia’s chapter discusses innovative Editions at Play publications, including Seed by Joanna Walsh and Breathe by Kate Pullinger, which are both currently displayed in Digital Storytelling.

Before the Digital Storytelling exhibition closes, we'd love you to join us for a party on the evening of Friday 13th October. For one night only, transmedia storyteller Yomi Ayeni will transform the British Library into the Clockwork Watch story world for an immersive steampunk late event.

Genre-bending DJ Sacha Dieu will be spinning the best in Balkan Gypsy, Electro Swing, and Global Beats. Professor Elemental will perform live for us, and we really hope he’ll sing I Love Libraries! You'll also be able to view the Digital Storytelling exhibition, and there will be quieter areas to explore 19th Century London in Minecraft, play board games including Great Scott! The Game of Mad Invention with games librarian Marion Tessier, and to discover poetry with the Itinerant Poetry Librarian.

If you plan to party with us, book your ticket here.
