Digital scholarship blog

Enabling innovative research with British Library digital collections

27 January 2021

Identify yourself!

On Friday, 22 January, the Digital Scholarship Team at the British Library held their first 21st Century Curatorship talk of 2021, "Identify Yourself: (Almost) everything you ever wanted to know about persistent identifiers but were afraid to ask".

This series of professional development talks and seminars is part of the Digital Scholarship Staff Training Programme. The talks are open to all British Library staff, providing a forum for them to keep up with new developments and emerging technologies in scholarship, libraries and cultural heritage. Usually 21st Century Curatorship talks are given by external guests, but this one involved six speakers from around the Library who work with persistent identifiers (PIDs) in various ways. The talk was also scheduled to coincide with PIDapalooza, the annual festival of persistent identifiers, which is taking place over 24 hours this week.

There were many speakers for a one-hour talk, but everyone gave a whistle-stop tour of their particular area. Frances Madden began with a general introduction to PIDs and then gave an overview of a couple of PID-related projects that the Library is leading or is a partner in, including FREYA and PIDs as IRO Infrastructure. (Side note: PIDs as IRO Infrastructure will feature at PIDapalooza on Thursday at 09:30 UTC.) Frances also explained that you can have persistent identifiers for many types of entities, including articles, datasets, people and organisations. These can all be connected together through the persistent identifier metadata. PIDs matter because they are reliably unique and persistent over time, both of which are important in a library!

Next up, Erin Burnand and Emma Rogoz gave an overview of ISNI. The International Standard Name Identifier is an ISO standard used to identify the public identities of parties, persons and organisations associated with creative works. Each ISNI is a sixteen-digit string and is accessible by a persistent URI. Erin gave an overview of the extensive quality assurance processes ISNI uses to ensure very high quality metadata and the work it does with other organisations to provide training and support, as well as consultation with OCLC and ISNI committees and interest groups. ISNI's use has expanded since its launch in 2010 and it now serves various communities: YouTube and Spotify are both registration agencies for the music industry.
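As a small illustration of what "sixteen digits" means in practice: the last character of an ISNI is a check character calculated from the first fifteen digits using ISO 7064 MOD 11-2, and it can be the letter X. The sketch below is not taken from the talk. The function names are made up for this post, and the `https://isni.org/isni/` URI pattern is an assumption about how the persistent URI is formed.

```python
def isni_check_character(first_15_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the 15 base digits."""
    total = 0
    for digit in first_15_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def compact_isni(isni: str) -> str:
    """Strip the spaces or hyphens often used to display an ISNI in blocks of four."""
    return isni.replace(" ", "").replace("-", "").upper()


def is_valid_isni(isni: str) -> bool:
    """Return True if the string has 16 characters and a correct check character."""
    compact = compact_isni(isni)
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return isni_check_character(base) == compact[15]


def isni_uri(isni: str) -> str:
    """Build a URI for an ISNI (assumed pattern, not confirmed in the talk)."""
    return "https://isni.org/isni/" + compact_isni(isni)
```

A check like this only catches typing and transcription errors. It says nothing about whether the identifier has actually been assigned, which is where ISNI's quality assurance work comes in.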

Emma described the ways in which the Library is working to embed ISNIs into its cataloguing workflows by adding them into the LC/NACO file, which is a collaboration between the Library of Congress and the PCC Network. There is also ongoing work to embed them in legacy bibliographic data through matching algorithms and processes. Through the UK Publishers Interest Group, the team is working to match authors in publishers' databases with ISNIs and integrate the identifiers into the data which publishers share with the Library. This work has been very successful, with high match rates. The Library is also working on a portal so that end users can add information to their own records or request that a record be created. To protect the high quality of metadata in the ISNI database, end users will not be able to change or delete any information without liaising directly with the ISNI team.

Figure 1: A screenshot of the ISNI portal

Jez Cope described how digital object identifiers (DOIs) work and the role the Library has in assigning them. A DOI is a digital identifier for an object rather than an identifier for a digital object. DOIs are generally assigned to digital objects such as journal articles and datasets, but they have been used to identify Roman coins and other physical items too. DOIs are designed primarily to identify objects for the purposes of citation. Jez went on to explain that DOIs are assigned by registration agencies, which have members. Unlike ISNI, the metadata control is not centralised and is overseen by the members. The British Library leads a UK consortium of 100+ DataCite members. Jez also mentioned that, because DOIs and their associated metadata are machine readable, they can be integrated into the PID Graph, developed in the FREYA project. This allows you to use PID metadata to answer complex queries and understand relationships between entities which are two steps away from each other, e.g. which British Library authors have received funding from a particular funding agency. Of course, all of this depends on the information being present in the metadata.

Figure 2: Example PID Graphs
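To make the "two steps away" idea concrete, here is a minimal in-memory sketch of that kind of query. It is not the FREYA PID Graph or its API: the edge list, the relation names and every identifier below are invented for illustration. It only shows how an author can be linked to a funder via the works that sit between them.

```python
from collections import defaultdict

# Hypothetical edges: (source PID, relation, target PID). All identifiers are made up.
EDGES = [
    ("person:author-a", "authored", "doi:10.0000/example-1"),
    ("person:author-b", "authored", "doi:10.0000/example-2"),
    ("person:author-c", "authored", "doi:10.0000/example-3"),
    ("doi:10.0000/example-1", "funded_by", "funder:example-council"),
    ("doi:10.0000/example-2", "funded_by", "funder:example-trust"),
    ("doi:10.0000/example-3", "funded_by", "funder:example-council"),
]


def build_index(edges):
    """Index targets by (source, relation) so each hop is a dictionary lookup."""
    index = defaultdict(set)
    for source, relation, target in edges:
        index[(source, relation)].add(target)
    return index


def authors_funded_by(edges, funder):
    """Follow author -> work -> funder and return authors linked to the funder."""
    index = build_index(edges)
    authors = set()
    for (source, relation), works in index.items():
        if relation != "authored":
            continue
        # Second hop: does any of this author's works name the funder?
        if any(funder in index.get((work, "funded_by"), set()) for work in works):
            authors.add(source)
    return authors
```

If a work's metadata has no funder recorded, the second hop simply finds nothing, which is the point Jez made about the answers depending on what is present in the metadata.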

Finally, we heard from two projects at different stages of completion which are using DOI metadata within the Library. Simon Moffatt described how the Library is using DOIs from journal articles to improve the links from records which have been acquired through different routes. This new service, known as BLDOI, improves the experience of end users using the catalogue and also has the potential to be rolled out to other libraries and users. The solution is a lookup table matching ARKs (the Library's internal identifiers) with DOIs, which is exposed via an API that feeds into the catalogue.

Figure 3: A screenshot of the new search results, displayed on Reading Room PCs, explaining how the new look-up service works.
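For readers who like to see the shape of such a thing, a lookup table of this kind could be sketched roughly as follows. This is not the BLDOI implementation: the class, the method names and the ARK and DOI values are all hypothetical. The one detail worth showing is that DOIs are case-insensitive and turn up in several forms (bare, with a `doi:` prefix, or as a URL), so it helps to normalise them before matching.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalise_doi(doi: str) -> str:
    """Strip common prefixes and lowercase, since DOIs are case-insensitive."""
    doi = doi.strip()
    lowered = doi.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


class ArkDoiLookup:
    """A two-way lookup between ARKs and DOIs (hypothetical sketch)."""

    def __init__(self, pairs):
        pairs = list(pairs)  # allow any iterable of (ark, doi) tuples
        self._ark_by_doi = {normalise_doi(doi): ark for ark, doi in pairs}
        self._doi_by_ark = {ark: normalise_doi(doi) for ark, doi in pairs}

    def ark_for_doi(self, doi):
        """Return the ARK matched to a DOI, or None if there is no match."""
        return self._ark_by_doi.get(normalise_doi(doi))

    def doi_for_ark(self, ark):
        """Return the DOI matched to an ARK, or None if there is no match."""
        return self._doi_by_ark.get(ark)


# Made-up identifiers, for illustration only.
lookup = ArkDoiLookup([("ark:/00000/example0001", "10.0000/Example.1")])
```

In a real service the table would sit behind an API rather than in memory, but the matching step would look much the same.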

Sharon Johnson closed the session by describing a project, still in its early stages, which uses Crossref DOI metadata for journal articles to identify where the Library is missing articles that it should have collected under Legal Deposit legislation. This could apply where the Library is missing articles from issues of journals it already collects, but also to journals which it should collect but does not at this point.
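At its simplest, this kind of gap analysis is a comparison between two sets of DOIs: those that publisher metadata says exist and those the Library holds. The sketch below is only a guess at the general shape, not the project's method. The function name and input format are invented, and it makes the simplifying assumption that a journal with no held articles at all is one the Library does not yet collect.

```python
from collections import defaultdict


def find_gaps(published, held_dois):
    """Compare published (journal, doi) pairs against the DOIs already held.

    Returns a dict of journal -> missing DOIs for journals that are partly
    held, and a sorted list of journals with no held articles at all.
    """
    held = {doi.lower() for doi in held_dois}  # DOIs are case-insensitive
    by_journal = defaultdict(list)
    for journal, doi in published:
        by_journal[journal].append(doi.lower())

    missing_articles = {}
    uncollected_journals = []
    for journal, dois in by_journal.items():
        missing = sorted(doi for doi in dois if doi not in held)
        if len(missing) == len(dois):
            # Assumption: nothing held means the journal is not collected yet.
            uncollected_journals.append(journal)
        elif missing:
            missing_articles[journal] = missing
    return missing_articles, sorted(uncollected_journals)
```

The two return values correspond to the two cases described above: gaps within journals already collected, and journals not collected at all.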

Miraculously, this jam-packed session was completed within an hour and there was even some time for questions at the end. The aim of the session was to provide an overview of the services the Library has related to identifiers and to illustrate their breadth and diversity, as well as the number of different teams involved. The fact that we had so many speakers and teams represented illustrates this. Hopefully we will be able to hold more detailed sessions on individual topics in the future.

This post is by Frances Madden (@maddenfc), Research Associate (PIDs as IRO Infrastructure), about a recent seminar for British Library staff.