Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology.

11 November 2021

The British Library Adopts a New Persistent Identifier Policy

Since 29 September, the Library has had a new persistent identifier policy in place to support and guide the management of its collection. A persistent identifier, or PID, is a long-lasting digital reference to an entity, whether physical or digital. PIDs are a core component in providing reliable, long-term access to collections, and they improve discoverability. They also make it easier to track when and how collections are used. The Library has been using PIDs in various forms for almost a decade. However, following the creation of a case study as part of the AHRC’s Towards a National Collection funded project, PIDs as IRO Infrastructure, the Library recognised the need to document its rationale and approach to PIDs and to lay down principles and requirements for their use.

An image of the world at night from space, showing the bright lights of cities and towns
Photo by NASA on Unsplash

The Library encourages the use of PIDs across its collections and collection metadata. It recognises the role PIDs have as a component in sustainable, open infrastructure and in enabling interoperability and the use of Library resources. PIDs also support the Library’s content strategy and its goal of connecting rather than collecting, as they enable long-term, reliable access to resources.

Many different types of PIDs are used across the Library, some of which it creates for itself, e.g. ARKs, and others which it harvests from elsewhere, e.g. DOIs that are used to identify journal articles. While not all existing Library services may meet the requirements described in this policy, it provides a benchmark against which they can be measured and towards which they can develop.

To make sure staff at the Library are supported in implementing the policy, a working group has been convened to run until the end of December 2022. This group will raise awareness of the policy and ensure that guidance is available to any project or service under review, so that it can consider the use of PIDs.

A public version of the policy is available on this page and an extract with the key points is provided below. The group would like to acknowledge the Bibliothèque nationale de France’s policy, which was influential in the creation of this policy.

Principles

In its use of identifiers, the British Library adheres to the following principles, which describe the qualities that PIDs created, contributed or consumed by the Library must have.

  • A PID must never be deleted but may be marked as deprecated if required
  • A PID must be usable in perpetuity to identify its associated entity
  • A PID must only describe one entity and must never be reused for different entities 
  • A PID must have established versioning processes and procedures in place; these may be defined locally by the Library as a creator or by the PID provider  
  • A PID must have established governance mechanisms, such as contracts, in place to ensure the standards of use of the PID are met and continue to be met  
  • A PID must resolve to metadata about the entity available in both a human and machine readable format 
  • A publicly accessible PID must be resolvable via a global resolver
  • A PID must have an operating model that is sustainable for long-term persistent use 
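
As a rough illustration of the first three principles (never delete, usable in perpetuity, never reuse), here is a minimal Python sketch of an in-memory registry. The names `PidRecord` and `PidRegistry` and the sample identifier are hypothetical and are not taken from the Library's policy or systems; a real service would sit on durable storage and a resolver.

```python
from dataclasses import dataclass


@dataclass
class PidRecord:
    """One PID and the single entity it identifies (hypothetical structure)."""
    pid: str
    entity: str
    deprecated: bool = False


class PidRegistry:
    """Toy in-memory registry; assumed for illustration only."""

    def __init__(self):
        # Maps each PID string to its record. There is deliberately no delete method.
        self._records = {}

    def register(self, pid, entity):
        # A PID describes one entity only and is never reused for another.
        if pid in self._records:
            raise ValueError(f"PID {pid!r} is already assigned")
        record = PidRecord(pid, entity)
        self._records[pid] = record
        return record

    def deprecate(self, pid):
        # Deprecation only flags the record; nothing is removed.
        self._records[pid].deprecated = True

    def lookup(self, pid):
        # A deprecated PID still identifies its entity.
        return self._records[pid]
```

The design choice to notice is that deprecation is a flag on the record rather than a removal, so a lookup keeps working after a PID has been retired.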

Established user community 

  • A PID must have an established user community, which has adopted it as a standard, either through an organisation such as the International Organization for Standardization (ISO) or as a de facto standard through widespread adoption; the Library will support and develop the use of new types of PIDs where there is a defined and recognised use case which they would address 

Interoperable 

  • A PID must be able to link with the other identifiers in use at the Library through open metadata standards and the capability to cross-reference resources 

New PID types or new use 

  • New types of PIDs should only be considered for use in the Library where there is a defined need which cannot reasonably be met by a combination of PIDs already in use 
  • Any new PID type used by the Library should meet the requirements described in this policy 
  • Where a PID type is emerging and does not have an established community, the Library can seek to influence its development in line with principles for open and sustainable infrastructures 

Requirements

These requirements outline the Library’s responsibilities in using PID services and creating PIDs. While the Library uses identifiers which do not meet all of these requirements, they are included for future work and developments.  

  • The Library aspires to assign PIDs to all resources within its collections, both physical and digital, and associated entities, in alignment with the guiding principles of the Library’s content strategy 2020-2023
  • The Library has varying levels of involvement in different PID schemes, but all PIDs created by the Library must meet the requirements described in this section and the Library prefers the use of PIDs which meet the principles
  • Identifiers created by the Library must have an opaque format, i.e. not contain any semantic information within them, to ensure their longevity 
  • A PID must resolve to information about the entity to which it refers 
  • The Library must have a process to specify the granularity at which PIDs are assigned and how relationships between PIDs for component and overarching entities are managed 
  • The Library must have a process to manage versioning including changes, merges and retirement of entities 
  • Standard descriptive information about an entity, e.g. creator, should have a PID 
  • All metadata associated with a PID should comply with Collection Metadata Licensing Guidelines 
  • Where a PID referring to a citable resource resolves to a webpage, that webpage should display a suggested citation including the hyperlink to the PID to encourage ongoing use of the PID outside the Library
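
Two of these requirements, opaque identifiers and a suggested citation that includes the PID as a hyperlink, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resolver address `https://pid.example.org/`, the function names and the citation layout are all hypothetical, and the policy does not prescribe how identifiers are minted.

```python
import uuid

# Hypothetical resolver address, used for illustration; not a real Library service.
RESOLVER_BASE = "https://pid.example.org/"


def mint_opaque_id():
    """Return a random identifier that carries no meaning about the entity."""
    # uuid4 is random, so the string encodes no title, date or collection name.
    return uuid.uuid4().hex


def suggested_citation(creator, title, year, pid):
    """Build a citation string that ends with the PID as a hyperlink."""
    return f"{creator} ({year}). {title}. {RESOLVER_BASE}{pid}"
```

Because the identifier says nothing about the entity, it does not need to change if the entity's title, location or owning collection changes, which is the longevity argument the requirement makes.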

If you would like to hear more about this policy and the Library’s approach to persistent identifiers, feel free to contact the Heritage PIDs project on Twitter or email openaccess@bl.uk.

This post is by Frances Madden (@maddenfc, orcid.org/0000-0002-5432-6116), Research Associate (PIDs as IRO Infrastructure) in the Research Infrastructure Services team.

10 November 2021

BL Labs Online Symposium 2021, Special Climate Change Edition: Book your place for webinar on Tuesday 7 December 2021

In response to the Climate Emergency and issues raised by COP26, the 9th British Library Labs Symposium is devoted to looking at computational research and climate change. Registration Now Open.

Futuristic, hologram looking version of the globe overlaid with images like wind turbines, water drops, trees and graphs.

The British Library Labs is the British Library programme dedicated to enabling people to experiment with our digital collections, including deploying computational research methods and using our collections as data. This inevitably means that we, and the communities we work with, are increasingly applying computational tools and methods that have an environmental impact on our planet.

As our millions of pages of digitised content become an exciting new research frontier, and as we increasingly use machine learning methods and tools on large-scale projects such as Living with Machines, this work inevitably comes with increased use of computational resources and energy. In view of the climate emergency, we hope to ensure that climate and sustainability considerations inform everything we do. That means we need a much better understanding of digital environmental impacts and of how they should inform our practice in all things related to computational research.

We know that this is not a simple issue. Digitisation and digital preservation are often a lifeline for cultural heritage in communities where museums, libraries and archives are already endangered by climate change. For example, the British Library’s Endangered Archives Programme is dedicated to digitising and saving archives in danger of destruction, including destruction due to climate change. New digital resources, such as the UK Web Archive’s collections (the Climate Change collection in particular) and the International Internet Preservation Consortium’s Climate Change collection, are essential for climate researchers. This is especially so as we increasingly work with researchers who wish to text and data mine our collections for insights that can broaden our understanding of the changing climate and biodiversity, and of the impact of these changes on different communities.

Equally, as in all other areas related to the impacts of climate change, we are aware that digital research is strongly interdependent with issues of equality and social justice. Digital advancements are enablers of new research, helping us to better understand different communities and to broaden access and opportunities. But we also need to consider how the complexities of computational research and access, as well as the expensive set-up and energy requirements of state-of-the-art infrastructures, might disadvantage researchers and communities that do not have access to the relevant technologies, or to the prohibitively expensive and energy-demanding resources required to run them.

For this year’s BL Labs Symposium, we are bringing together a group of speakers who will consider these issues from different angles - from large-scale digitisation, to digital humanities, climate and biodiversity research, as well as the impact of AI. We will look into how our digital strategies and projects can help us fight climate change and be more inclusive, but also how we can improve our sustainability and reduce our impact on the planet.

As well as the views from our panel, there will be an opportunity for extended audience input, helping us to bring forward the views of the broader Labs community and learn together how our practice can be improved.

The 9th BL Labs Symposium takes place on Zoom on Tuesday 7th December from 16.30 until 18.00. Book your place now.

29 October 2021

Thought Bubble 2021 Wikithon Preparation

Comics fans, are you getting geared up for Thought Bubble? If you enjoy editing Wikipedia and Wikidata entries about comics, or want to learn how, please do join us and our collaborators at Leeds Libraries for our first in-person Wikithon since this residency started, on Thursday 11th November, from 1.30pm to 4.30pm, in the Sanderson Room of Leeds Central Library.

Drawing of a person reading a comic and drinking a mug of tea

Joining us in person?

Remember the first step is to book your place here, via Eventbrite

If you’d like to get a head start, you can download and read our handy guide to setting up your Wikipedia account. There is advice on creating your account, Wikipedia's username policy and how to create your user page.

Once you have done that, or if you already have a Wikipedia account, please join our Thought Bubble Wikithon dashboard (the enrollment passcode is ltspmyfa) and go through the introductory exercises, which cover:

  • Wikipedia Essentials
  • Editing Basics
  • Evaluating Articles and Sources
  • Contributing Images and Media Files
  • Sandboxes and Mainspace
  • Sources and Citations
  • Plagiarism
  • Introduction to Wikidata (for those interested in this)

These are all short exercises that will help familiarise you with Wikipedia and its processes. Don’t have time to do them? We get it, and that’s totally fine - we’ll cover the basics on the day too!

You may want to verify your Wikipedia account - this function exists to make sure that people are contributing responsibly to Wikipedia. The easiest and swiftest way to verify your account is to do 10 small edits. You could do this by correcting typos or adding in missing dates. However, another way to do this is to find articles where citations are needed, and add them via Citation Hunt. For further information on adding citations, watching this video may be useful.

When it comes to Wikidata, we are very inspired by the excellent work of the Graphic Possibilities project at the Michigan State University Department of English, and we have been learning from them. For those interested in editing Wikidata, we will be on hand to support this during our Thought Bubble Wikithon event.

Happier with a hybrid approach?

If you cannot join the event in person, but would like to contribute, please do check out and sign up to our dashboard. Although we cannot run the training as a hybrid presentation on this occasion, the online dashboard training exercises will be an excellent starting point. From there, all of your edits and contributions will be registered, and you can pat yourself firmly on the back for making the world of comics a better place from a distance.

However, if you can attend in person, please register for the Wikithon at Leeds Central Library here and check out the Thought Bubble festival programme here. Hope to see you there!

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian) and Digital Curator Stella Wisdom (@miss_wisdom).