11 November 2021
The British Library Adopts a New Persistent Identifier Policy
Since 29 September, the Library has had a new persistent identifier policy in place to support and guide the management of its collection. A persistent identifier, or PID, is a long-lasting digital reference to an entity, whether physical or digital. PIDs are a core component in providing reliable, long-term access to collections and in improving their discoverability. They also make it easier to track when and how collections are used. The Library has been using PIDs in various forms for almost a decade, but following the creation of a case study as part of the AHRC’s Towards a National Collection funded project, PIDs as IRO Infrastructure, the Library recognised the need to document its rationale and approach to PIDs and to lay down principles and requirements for their use.
The Library encourages the use of PIDs across its collections and collection metadata. It recognises the role PIDs have as a component in sustainable, open infrastructure and in enabling interoperability and the use of Library resources. PIDs also support the Library’s content strategy and its goal of connecting rather than collecting, as they enable long-term, reliable access to resources.
Many different types of PIDs are used across the Library, some of which it creates for itself, e.g. ARKs, and others which it harvests from elsewhere, e.g. DOIs that are used to identify journal articles. While not all existing Library services may meet the requirements described in this policy, it provides a benchmark against which they can be measured and towards which they can develop.
To make sure staff at the Library are supported in implementing the policy, a working group has been convened to run until the end of December 2022. This group will raise awareness of the policy and ensure that guidance is made available to any project or service which is under review to consider the use of PIDs.
A public version of the policy is available on this page and an extract with the key points is provided below. The group would like to acknowledge the Bibliothèque nationale de France’s policy, which was influential in the creation of this policy.
In its use of identifiers, the British Library adheres to the following principles, which describe the qualities PIDs created, contributed or consumed by the Library must have.
- A PID must never be deleted but may be marked as deprecated if required
- A PID must be usable in perpetuity to identify its associated entity
- A PID must only describe one entity and must never be reused for different entities
- A PID must have established versioning processes and procedures in place; these may be defined locally by the Library as a creator or by the PID provider
- A PID must have established governance mechanisms, such as contracts, in place to ensure the standards of use of the PID are met and continue to be met
- A PID must resolve to metadata about the entity, available in both human-readable and machine-readable formats
- A publicly accessible PID must be resolvable via a global resolver
- A PID must have an operating model that is sustainable for long-term persistent use
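As an illustration of the first three principles above (never deleted, usable in perpetuity, never reused), the following is a minimal sketch of a registry that enforces them. The `PidRegistry` class, its method names and the sample identifier are hypothetical and are not taken from the policy or from any Library system.

```python
from dataclasses import dataclass


@dataclass
class PidRecord:
    entity: str
    deprecated: bool = False


class PidRegistry:
    """Hypothetical in-memory registry, for illustration only."""

    def __init__(self):
        self._records = {}

    def register(self, pid, entity):
        # A PID describes one entity and is never reused for another.
        if pid in self._records:
            raise ValueError(f"PID {pid!r} is already assigned")
        self._records[pid] = PidRecord(entity)

    def deprecate(self, pid):
        # Deprecation only flags the record; nothing is removed.
        self._records[pid].deprecated = True

    def resolve(self, pid):
        # Deprecated PIDs still resolve, so the reference stays usable.
        return self._records[pid]


registry = PidRegistry()
registry.register("pid-0001", "Example manuscript")  # hypothetical identifier
registry.deprecate("pid-0001")
record = registry.resolve("pid-0001")
```

Note that the sketch deliberately offers no delete method: a retired PID is marked as deprecated and keeps resolving to its original entity.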
Established user community
- A PID must have an established user community, which has adopted it as a standard, either through an organisation such as the International Organization for Standardization (ISO) or as a de facto standard through widespread adoption; the Library will support and develop the use of new types of PIDs where there is a defined and recognised use case which they would address
- A PID must be able to link with the other identifiers in use at the Library through open metadata standards and the capability to cross-reference resources
New PID types or new use
- New types of PIDs should only be considered for use in the Library where there is a defined need which cannot reasonably be met by a combination of PIDs already in use
- Any new PID type used by the Library should meet the requirements described in this policy
- Where a PID type is emerging and does not have an established community, the Library can seek to influence its development in line with principles for open and sustainable infrastructures
These requirements outline the Library’s responsibilities in using PID services and creating PIDs. While the Library uses identifiers which do not meet all of these requirements, they are included for future work and developments.
- The Library aspires to assign PIDs to all resources within its collections, both physical and digital, and associated entities, in alignment with the guiding principles of the Library’s content strategy 2020-2023
- The Library has varying levels of involvement in different PID schemes, but all PIDs created by the Library must meet the requirements described in this section and the Library prefers the use of PIDs which meet the principles
- Identifiers created by the Library must have an opaque format, i.e. not contain any semantic information within them, to ensure their longevity
- A PID must resolve to information about the entity to which it refers
- The Library must have a process to specify the granularity at which PIDs are assigned and how relationships between PIDs for component and overarching entities are managed
- The Library must have a process to manage versioning including changes, merges and retirement of entities
- Standard descriptive information about an entity, e.g. creator, should have a PID
- All metadata associated with a PID should comply with Collection Metadata Licensing Guidelines
- Where a PID referring to a citable resource resolves to a webpage, that webpage should display a suggested citation including the hyperlink to the PID to encourage ongoing use of the PID outside the Library
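Two of the requirements above, opaque identifiers and suggested citations that include the PID hyperlink, can be sketched in a few lines. This is an illustrative sketch only: the alphabet, the identifier length, the citation layout and the example URL are assumptions, not the Library's actual minting scheme or citation style.

```python
import secrets

# Assumed alphabet: digits and consonants only, so random strings are
# unlikely to spell words or carry any meaning.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def mint_opaque_id(length=10):
    """Return a random identifier with no semantic information in it."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def suggested_citation(creator, title, pid_url):
    """Build a citation string that ends with the PID hyperlink."""
    return f"{creator}. {title}. {pid_url}"


# Hypothetical usage; example.org stands in for a real resolver.
identifier = mint_opaque_id()
citation = suggested_citation(
    "Example Creator",
    "Example Title",
    f"https://example.org/{identifier}",
)
```

Because the identifier encodes nothing about the collection, department or date, it does not become misleading if any of those change later.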