Digital scholarship blog


20 April 2022

Importing images into Zooniverse with a IIIF manifest: introducing an experimental feature

Digital Curator Dr Mia Ridge shares news from a collaboration between the British Library and Zooniverse that means you can more easily create crowdsourcing projects with cultural heritage collections. There's a related blog post on Zooniverse, Fun with IIIF.

IIIF manifests - text files that tell software how to display images, sound or video files alongside metadata and other information about them - might not sound exciting, but by linking to them, you can view and annotate collections from around the world. The IIIF (International Image Interoperability Framework) standard makes images (or audio, video or 3D files) more re-usable - they can be displayed on another site alongside the original metadata and information provided by the source institution. If an institution updates a manifest - perhaps adding information from updated cataloguing or crowdsourcing - any site that displays that image automatically gets the updated metadata.
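To make the idea of a manifest more concrete, here is a minimal Python sketch that reads a small manifest and lists its images and metadata. It assumes a manifest shaped like version 2 of the IIIF Presentation API ('sequences' containing 'canvases', each with 'images'), with plain-string labels and values; the sample manifest, its example.org URLs and the function names are invented for illustration and are not British Library data.

```python
import json

# A tiny, made-up manifest in the shape of IIIF Presentation API 2.x.
# Every URL and value here is a placeholder for illustration only.
SAMPLE_MANIFEST = json.loads("""
{
  "@id": "https://example.org/iiif/volume-1/manifest.json",
  "label": "Example volume of playbills",
  "metadata": [
    {"label": "Title", "value": "Example volume of playbills"},
    {"label": "Date", "value": "1800-1810"}
  ],
  "sequences": [
    {
      "canvases": [
        {
          "@id": "https://example.org/iiif/volume-1/canvas/1",
          "label": "Page 1",
          "images": [
            {"resource": {"@id": "https://example.org/iiif/image-1/full/full/0/default.jpg"}}
          ]
        },
        {
          "@id": "https://example.org/iiif/volume-1/canvas/2",
          "label": "Page 2",
          "images": [
            {"resource": {"@id": "https://example.org/iiif/image-2/full/full/0/default.jpg"}}
          ]
        }
      ]
    }
  ]
}
""")


def list_images(manifest):
    """Return (canvas label, image URL) pairs from a v2-style manifest."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                pairs.append((canvas.get("label", ""), image["resource"]["@id"]))
    return pairs


def manifest_metadata(manifest):
    """Return the manifest-level metadata as a simple label -> value dict."""
    return {entry["label"]: entry["value"] for entry in manifest.get("metadata", [])}


if __name__ == "__main__":
    for label, url in list_images(SAMPLE_MANIFEST):
        print(label, url)
    print(manifest_metadata(SAMPLE_MANIFEST))
```

Real manifests can be richer than this (for example, labels and values may be given in several languages), so treat the sketch as a starting point rather than a complete parser.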

Playbill showing the title after other large text

We've posted before about how we used IIIF manifests as the basis for our In the Spotlight crowdsourced tasks on LibCrowds.com. Playbills are great candidates for crowdsourcing because they are hard to transcribe automatically, and the layout and information present vary a lot. Using IIIF meant that we could access images of playbills directly from the British Library servers without needing server space and extra processing to make local copies. You didn't need technical knowledge to copy a manifest address and add a new volume of playbills to In the Spotlight. This worked well for a couple of years, but over time we'd found it difficult to maintain bespoke software for LibCrowds.

When we started looking for alternatives, the Zooniverse platform was an obvious option. Zooniverse hosts dozens of historical or cultural heritage projects, and hundreds of citizen science projects. It has millions of volunteers, and a 'project builder' that means anyone can create a crowdsourcing project - for free! We'd already started using Zooniverse for other Library crowdsourcing projects such as Living with Machines, which showed us how powerful the platform can be for reaching potential volunteers. 

But that experience also showed us how complicated the process of getting images and metadata onto Zooniverse could be. Using Zooniverse for volumes of playbills for In the Spotlight would require some specialist knowledge. We'd need to download images from our servers, resize them, generate a 'manifest' list of images and metadata, then upload it all to Zooniverse; and repeat that for each of the dozens of volumes of digitised playbills.
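As a rough illustration of the 'generate a manifest list' step in that manual process, the sketch below builds a CSV listing image filenames alongside their metadata. The column names, filenames and the build_manifest_csv helper are hypothetical, and the downloading, resizing and uploading steps are left out; this is only meant to show the kind of file that had to be prepared for each volume.

```python
import csv
import io


def build_manifest_csv(records):
    """Build CSV text from a list of dicts.

    Each record needs a 'filename' key; any other keys become metadata
    columns. Missing values are left blank.
    """
    metadata_columns = sorted({key for record in records for key in record} - {"filename"})
    fieldnames = ["filename"] + metadata_columns
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical records for two already-downloaded, already-resized images.
    example_records = [
        {"filename": "page_001.jpg", "volume": "Volume 1"},
        {"filename": "page_002.jpg", "volume": "Volume 1", "note": "torn"},
    ]
    print(build_manifest_csv(example_records))
```

Repeating this for dozens of volumes is exactly the overhead that importing directly from a IIIF manifest avoids.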

Fast forward to summer 2021, when we had the opportunity to put a small amount of funding into some development work by Zooniverse. I'd already collaborated with Sam Blickhan at Zooniverse on the Collective Wisdom project, so it was easy to drop her a line and ask if they had any plans or interest in supporting IIIF. It turned out they did, but they hadn't previously had the resources or an interested organisation to work with.

We came up with a brief outline of what the work needed to do, taking the ability to recreate some of the functionality of In the Spotlight on Zooniverse as a goal. Therefore, 'the ability to add subject sets via IIIF manifest links' was key. ('Subject set' is Zooniverse-speak for 'set of images or other media' that are the basis of crowdsourcing tasks.) And of course we wanted the ability to set up some crowdsourcing tasks with those items… The Zooniverse developer, Jim O'Donnell, shared his work in progress on GitHub, and I was very easily able to set up a test project and ask people to help create sample data for further testing. 

If you have a Zooniverse project and a IIIF address to hand, you can try out the import for yourself: add 'subject-sets/iiif?env=production' to your project builder URL. For example, if your project is number xxx, the URL to access the IIIF manifest import would be https://www.zooniverse.org/lab/xxx/subject-sets/iiif?env=production
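If you manage several projects, the URL pattern above is easy to generate. This small Python sketch simply fills the project number into that pattern; the function name and the project number 1234 are made up for illustration, and because the feature is experimental the address may change.

```python
def iiif_import_url(project_id):
    """Return the project builder URL for the experimental IIIF import page."""
    return f"https://www.zooniverse.org/lab/{project_id}/subject-sets/iiif?env=production"


if __name__ == "__main__":
    # 1234 is a placeholder project number, not a real project.
    print(iiif_import_url(1234))
```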

Paste a manifest URL into the box. The platform parses the file to present a list of metadata fields, which you can flag as hidden or visible in the subject viewer (public task interface). When you're happy, you can click a button to upload the manifest as a new subject set (like a folder of items), and your images are imported. (Don't worry if it says '0 subjects'.)

 

Screenshot of manifest import screen

You can try out our live task and help create real data for testing ingest processes at https://frontend.preview.zooniverse.org/projects/bldigital/in-the-spotlight/classify

This is a very brief introduction, with more to come on managing data exports and IIIF annotations once you've set up, tested and launched a crowdsourced workflow (task). We'd love to hear from you - how might this be useful? What issues do you foresee? How might you want to expand or build on this functionality? Email digitalresearch@bl.uk or tweet @mia_out @LibCrowds. You can also comment on GitHub https://github.com/zooniverse/Panoptes-Front-End/pull/6095 or https://github.com/zooniverse/iiif-annotations

Digital work in libraries is always collaborative, so I'd like to thank British Library colleagues in Finance, Procurement, Technology, Collection Metadata Services and various Collections departments; the Zooniverse volunteers who helped test our first task; and of course the Zooniverse team, especially Sam, Jim and Chris, for their work on this.

 

18 March 2022

Looking back at LibCrowds: surveying our participants

'In the Spotlight' is a crowdsourcing project from the British Library that aims to make digitised historical playbills more discoverable, while also encouraging people to closely engage with this otherwise less accessible collection. Digital Curator Dr Mia Ridge writes...

If you follow our @LibCrowds account on Twitter, you might have noticed that we've been working on refreshed versions of our In the Spotlight tasks on Zooniverse. That's part of a small project to enable the use of IIIF manifests on Zooniverse - in everyday language, it means that many, many more digitised items can form the basis of crowdsourcing tasks in the Zooniverse Project Builder, and In the Spotlight is the first project to use this new feature. Along with colleagues in Printed Heritage and BL Labs, I've been looking at our original Pybossa-based LibCrowds site to plan a 'graceful ending' for the first phase of the project on LibCrowds.com.

As part of our work documenting and archiving the original LibCrowds site, I'm delighted to share summary results from a 2018 survey of In the Spotlight participants, now published on the British Library's Research Repository: https://doi.org/10.23636/w4ee-yc34. Our thanks go to Susan Knight, Customer Insight Coordinator, for her help with the survey.

The survey was designed to help us understand who In the Spotlight participants were, and to help us prioritise work on the project. The 22-question survey was based on earlier surveys run by the Galaxy Zoo and Art UK Tagger projects, to allow comparison with other crowdsourcing projects, and to contribute to our understanding of crowdsourcing in cultural heritage more broadly. It was open to anyone who had contributed to the British Library's In the Spotlight project for historical playbills. The survey was distributed to LibCrowds newsletter subscribers, on the LibCrowds community forum and on social media.

Some headline findings from our survey include:

  • The typical respondent was a woman with a Master's degree, in full-time employment, in London or Southeast UK, who contributes in a break between other tasks or 'whenever they have spare time'.
  • 76% of respondents were motivated by contributing to historical or performance research

Responses to the question 'What was it about this project which caused you to spend more time than intended on it?':

  • Easy to do
  • It's so entertaining
  • Every time an entry is completed you are presented with another item which is interesting and illuminating which provides a continuous temptation regarding what you might discover next
  • simplicity
  • A bit of competitiveness about the top ten contributors but also about contributing something useful
  • I just got carried away with the fun
  • It's so easy to complete
  • Easy to want to do just a few more
  • Addiction
  • Felt I could get through more tasks
  • Just getting engrossed
  • It can be a bit addictive!
  • It's so easy to do that it's very easy to get carried away.
  • interested in the [material]

The summary report contains more rich detail, so go check it out!

 

Detail of the front page of libcrowds.com; Crowdsourcing projects from the British Library. 2,969 Volunteers. 265,648 Contributions. 175 Projects

14 March 2022

The Lotus Sutra Manuscripts Digitisation Project: the collaborative work between the Heritage Made Digital team and the International Dunhuang Project team

Digitisation has become one of the key tasks for the curatorial roles within the British Library. This is supported by two main pillars: the accessibility of the collection items to everybody around the world and the preservation of unique and sometimes very fragile items. Digitisation involves many different teams and workflow stages including retrieval, conservation, curatorial management, copyright assessment, imaging, workflow management, quality control, and the final publication to online platforms.

The Heritage Made Digital (HMD) team works across the Library to assist with digitisation projects. An excellent example of the collaborative nature of the relationship between the HMD and International Dunhuang Project (IDP) teams is the quality control (QC) of the Lotus Sutra Project’s digital files. It is crucial that images meet the quality standards of the digital process. As a Digitisation Officer in HMD, I am in charge of QC for the Lotus Sutra Manuscripts Digitisation Project, which is currently conserving and digitising nearly 800 Chinese Lotus Sutra manuscripts to make them freely available on the IDP website. The manuscripts were acquired by Sir Aurel Stein after they were discovered in a hidden cave in Dunhuang, China in 1900. They are thought to have been sealed there at the beginning of the 11th century. They are now part of the Stein Collection at the British Library and, together with the international partners of the IDP, we are working to make them available digitally.

The majority of the Lotus Sutra manuscripts are scrolls and, after they have been treated by our dedicated Digitisation Conservators, our expert Senior Imaging Technician Isabelle does an outstanding job of imaging the fragile manuscripts. My job is then to prepare the images for publication online. This includes checking that they have the correct technical metadata, such as image resolution and colour profile, that they are an accurate visual representation of the physical object, and that the text can be clearly read and interpreted by researchers. After nearly 1000 years in a cave, it would be a shame to make the manuscripts accessible to the public for the first time only for them to be obscured by a blurry image or a wayward piece of fluff!

With the scrolls measuring up to 13 metres long, most are too long to be imaged in one go. They are instead shot in individual panels, which our Senior Imaging Technicians digitally “stitch” together to form one big image. This gives online viewers a sense of the physical scroll as a whole, in a way that would not be possible in real life for those scrolls that are more than two panels in length unless you have a really big table and a lot of specially trained people to help you roll it out. 

Photo showing the three individual panels of Or.8210S/1530R with breaks in between
Or.8210/S.1530: individual panels
Photo showing the three panels of Or.8210S/1530R as one continuous image
Or.8210/S.1530: stitched image

 

This post-processing can create issues, however. Sometimes an error in the stitching process can cause a scroll to appear warped or wonky. In the stitched image for Or.8210/S.6711, the ruled lines across the top of the scroll appeared wavy and misaligned. But when I compared this with the images of the individual panels, I could see that the lines on the scroll itself were straight and unbroken. It is important that the digital images faithfully represent the physical object as far as possible; we don’t want anyone thinking these flaws are in the physical item and writing a research paper about ‘Wonky lines on Buddhist Lotus Sutra scrolls in the British Library’. Therefore, I asked the Senior Imaging Technician to restitch the images together: no more wonky lines. However, we accept that the stitched images cannot be completely accurate digital surrogates, as they are created by the Imaging Technician to represent the item as it would be seen if it were to be unrolled fully.

 

Or.8210/S.6711: distortion from stitching. The ruled line across the top of the scroll is bowed and misaligned

 

Similarly, our Senior Imaging Technician applies ‘digital black’ to make the image background a uniform colour. This is to hide any dust or uneven background and ensure the object is clear. If this is accidentally overused, it can make it appear that a chunk has been cut out of the scroll. Luckily this is easy to spot and correct, since we retain the unedited TIFFs and RAW files to work from.

 

Or.8210/S.3661, panel 8: overuse of digital black when filling in tear in scroll. It appears to have a large black line down the centre of the image.

 

Sometimes the scrolls are wonky, or dirty or incomplete. They are hundreds of years old, and this is where it can become tricky to work out whether there is an issue with the images or the scroll itself. The stains, tears and dirt shown in the images below are part of the scrolls and their material history. They give clues to how the manuscripts were made, stored, and used. This is all of interest to researchers and we want to make sure to preserve and display these features in the digital versions. The best part of my job is finding interesting things like this. The fourth image below shows a fossilised insect covering the text of the scroll!

 

Black stains: Or.8210/S.2814, panel 9
Torn and fragmentary panel: Or.8210/S.1669, panel 1
Insect droppings obscuring the text: Or.8210/S.2043, panel 1
Fossilised insect covering text: Or.8210/S.6457, panel 5

 

We want to minimise the handling of the scrolls as much as possible, so we will only reshoot an image if it is absolutely necessary. For example, I would ask a Senior Imaging Technician to reshoot an image if debris covers the text and makes it unreadable - but only after inspecting the scroll to ensure the debris can be safely removed and is not stuck to the surface. However, if some debris, such as a small piece of fluff, paper or hair, appears on the scroll’s surface but is not obscuring any text, then I would not ask for a reshoot. If it does not affect the readability of the text, or any potential future OCR (Optical Character Recognition) or handwriting analysis, it is not worth the risk of damage that could be caused by extra handling.
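The reshoot rule described above can be written down as a small decision function. This is only a sketch of the reasoning in this post, not a tool we actually use in QC; the function and parameter names are invented for illustration.

```python
def should_reshoot(affects_readability, safe_to_remove):
    """Decide whether to request a reshoot of a panel.

    affects_readability: debris obscures the text, or would interfere with
        future OCR or handwriting analysis.
    safe_to_remove: an inspection of the scroll has confirmed the debris can
        be removed safely and is not stuck to the surface.
    """
    # Extra handling risks damage, so a reshoot needs both conditions.
    return affects_readability and safe_to_remove


if __name__ == "__main__":
    print(should_reshoot(affects_readability=True, safe_to_remove=True))   # debris over text, removable
    print(should_reshoot(affects_readability=False, safe_to_remove=True))  # fluff away from the text
```

Requiring both conditions reflects the priority given to minimising handling: when in doubt, the scroll is left alone.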

Reshoot: Or.8210/S.6501: debris over text  /  No reshoot: Or.8210/S.4599: debris not covering text.

 

These are a few examples of the things to which the HMD Digitisation Officers pay close attention during QC. Only through this careful process can we ensure that the digital images accurately reflect the physicality of the scrolls and represent their original features. By developing a QC process that applies the best techniques and procedures, working to defined standards and guidelines, we succeed in making these incredible items accessible to the world.

Read more about the Lotus Sutra Project here: IDP Blog

IDP website: IDP.BL.UK

And IDP twitter: @IDP_UK

Dr Francisco Perez-Garcia

Digitisation Officer, Heritage Made Digital: Asian and African Collections

Follow us @BL_MadeDigital

10 March 2022

Scoping the connections between trusted arts and humanities data repositories

CONNECTED: Connecting trusted Arts and Humanities data repositories is a newly funded activity, supported by AHRC. It is led by the British Library, with the Archaeology Data Service and the Oxford Text Archive as co-investigators, and is supported by consultants from MoreBrains Cooperative. The CONNECTED team believes that improving discovery and curation of heritage and emergent content types in the arts and humanities will increase the impact of cultural resources, and enhance equity. Great work is already being done on discovery services for the sector, so we decided to look upstream, and focus on facilitating repository and archive deposit.

The UK boasts a dynamic institutional repository environment in the HE sector, as well as a range of subject- or field-specific repositories. Although a distributed repository landscape is now firmly established, challenges and inefficiencies remain that reduce its impact. These include issues around discovery and access, but also questions around interoperability, the relationship of specialised vs general infrastructures, and potential duplication of effort from an author/depositor perspective. Greater coherence and interoperability will effectively unite different trusted repository services to form a resilient distributed data service, which can grow over time as new individual services are required and developed. Alongside the other projects funded as part of ‘Scoping future data services for the arts and humanities’, CONNECTED will help to deliver this unified network.

As practice in the creative arts becomes more digital and the digital humanities continue to thrive, the diversity of ways in which this research is expressed continues to grow. Researchers are increasingly able to combine artefacts, documents, and materials in new and innovative ways; practice-based research in the arts is creating a diverse range of (often complex) outputs, creating new curation and discovery needs; and heritage collections often contain artefacts with large amounts of annotation and commentary amassed over years or centuries, across multiple formats, and with rich contextual information. This expansion is already exposing the limitations of our current information systems, with the potential for vital context and provenance to become invisible. Without additional, careful future-proofing, the risks of information loss and limits on access will only expand. Metadata creation, deposit, preservation, and discovery strategies should therefore be tailored to meet the very different needs of the arts and humanities.

A number of initiatives are aimed at improving interoperability between metadata sources in ways that are more oriented towards the needs of the arts and humanities. Drawing these together with the insights to be gained from the abilities (and limitations) of bibliographic and data-centric metadata and discovery systems will help to generate robust services in the complex, evolving landscape of arts and humanities research and creation.

The CONNECTED project will assemble experts, practitioners, and researchers to map current gaps in the content curation and discovery ecosystem and weave together the strengths and potentials of a range of platforms, standards, and technologies in the service of the arts and humanities community. Our activities will run until the end of May, and will comprise three phases:

Phase 1 - Discovery

We will focus on repository or archive deposit as a foundation for the discovery and preservation of diverse outputs, and also as a way to help capture the connections between those objects and the commentary, annotation, and other associated artefacts. 

A data service for the arts and humanities must be developed with researcher needs as a priority, so the project team will engage in a series of semi-structured interviews with a variety of stakeholders including researchers, librarians, curators, and information technologists. The interviews will explore the following ideas:

  • What do researchers need when engaging in discovery of both heritage materials and new outputs?
  • Are there specific needs that relate to different types of content or use-cases? For example, research involving multimedia or structured information processing at scale?
  • What can the current infrastructure support, and where are the gaps between what we have and what we need?
  • What are the feasible technical approaches to transform information discovery?

Phase 2 - Data service programme scoping and planning

The findings from phase 1 will be synthesised using a commercial product strategy approach known as a canvas analysis. Based on the initial impressions from the semi-structured interviews, it is likely that an agile, product, or value proposition canvas will be used to synthesise the findings and structure thinking so that a coherent and robust strategy can be developed. Outputs from the strategy canvas exercise will then be applied to a fully costed and scoped product roadmap and budget for a national data deposit service for the arts and humanities.

Phase 3 - Scoping a unified archiving solution

Building on the partnerships and conversations from the previous phases, the feasibility of a unified ‘deposit switchboard’ will be explored. The purpose of such a switchboard is to enable researchers, curators, and creators to easily deposit items in the most appropriate repository or archive in their field for the object type they are uploading. Using insights gained from the landscaping interviews in phase 1, the team will identify potential pathways to developing a routing service for channelling content to the most appropriate home.

We will conclude with a virtual community workshop to explore the challenges and desirability of the switchboard approach, with a special focus on the benefits this could bring to the uploader of new content and resources.

This is an ambitious project, through which we hope to deliver:

  • A fully costed and scoped technical and organisational roadmap to build the required components and framework for the National Collection
  • Improved usage of resources in the wider GLAM and institutional network, including of course the Archaeology Data Service, The British Library's Shared Research Repository, and the Oxford Text Archive
  • Steps towards a truly community-governed data infrastructure for the arts and humanities as part of the National Collection

As a result of this work, access to UK cultural heritage and outputs will be accelerated and simplified, the impact of the arts and humanities will be enhanced, and we will help the community to consolidate the UK's position as a global leader in digital humanities and infrastructure.

This post is from Rachael Kotarski (@RachPK), Principal Investigator for CONNECTED, and Josh Brown from MoreBrains.

14 February 2022

PhD Placement on Mapping Caribbean Diasporic Networks through Correspondence

Every year the British Library hosts a range of PhD placement scheme projects. If you are interested in applying for one of these, the 2022 opportunities are advertised here. There are currently 15 projects available across Library departments, all starting from June 2022 onwards and ending before March 2023. If you would like to work with born digital collections, you may want to read last week’s Digital Scholarship blog post about two projects on enhanced curation, hybrid archives and emerging formats. However, if you are interested in Caribbean diasporic networks and want to experiment with creating network analysis visualisations, then read on to find out more about the “Mapping Caribbean Diasporic Networks through correspondence (2022-ACQ-CDN)” project.

This is an exciting opportunity to be involved with the preliminary stages of a project to map the Caribbean Diasporic Network evident in the ‘Special Correspondence’ files of the Andrew Salkey Archive. This placement will be based in the Contemporary Literary and Creative Archives team at the British Library with support from Digital Scholarship colleagues. The successful candidate will be given access to a selection of correspondence files to create an item level dataset and explore the content of letters from the likes of Edward Kamau Brathwaite, C.L.R. James, and Samuel Selvon.

Photograph of Andrew Salkey
Photograph of Andrew Salkey, from the Andrew Salkey Archive, Deposit 10310. With kind permission of Jason Salkey.

The main outcome envisaged for this placement is to develop a dataset, using a sample of ten files, linking the data and mapping the correspondents’ names, the locations they were writing from, and the dates of the correspondence in a spreadsheet. The placement student will also learn how to use the Gephi Open Graph Visualisation Platform to create a visual representation of this network, associating individuals with each other and mapping their movement across the world between the 1950s and 1990s.
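To give a flavour of how a spreadsheet of letters becomes a network, here is a minimal Python sketch that turns item-level rows into the two tables Gephi's spreadsheet import works with: a node table (Id, Label) and an edge table (Source, Target, Weight). The rows, correspondent names and places below are invented placeholders rather than data from the archive, and the place and date columns are carried in the input but not yet used.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows: (sender, recipient, place written from, date).
LETTERS = [
    ("Correspondent A", "Andrew Salkey", "Place X", "1965-03-01"),
    ("Correspondent A", "Andrew Salkey", "Place Y", "1966-07-12"),
    ("Correspondent B", "Andrew Salkey", "Place X", "1971-11-30"),
]


def build_tables(letters):
    """Return (nodes, edges): one node per person, one weighted edge per pair."""
    names = sorted({name for sender, recipient, _, _ in letters for name in (sender, recipient)})
    ids = {name: number for number, name in enumerate(names, start=1)}
    nodes = [(ids[name], name) for name in names]
    # Count letters per (sender, recipient) pair to use as the edge weight.
    weights = Counter((ids[sender], ids[recipient]) for sender, recipient, _, _ in letters)
    edges = [(source, target, weight) for (source, target), weight in sorted(weights.items())]
    return nodes, edges


def to_csv(header, rows):
    """Serialise rows as CSV text with the given header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    nodes, edges = build_tables(LETTERS)
    print(to_csv(["Id", "Label"], nodes))
    print(to_csv(["Source", "Target", "Weight"], edges))
```

Saving the two outputs as separate CSV files gives a nodes table and an edges table; places and dates could later be added as extra columns to support mapping movement over time.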

Gephi is open-source software for visualising and analysing networks. Its developers provide a step-by-step guide to getting started, with the first step being to upload a spreadsheet detailing your ‘nodes’ and ‘edges’. To show how Gephi can be used, we've included an example below, which was created by previous British Library research placement student Sarah FitzGerald from the University of Sussex, using data from the Endangered Archives Programme (EAP) to create a Gephi visualisation of all EAP applications received between 2004 and 2017.

Gephi network visualisation diagram
Network visualisation of EAP Applications created by Sarah FitzGerald

In this visualisation the size of each country relates to the number of applications it features in, as country of archive, country of applicant, or both. The colours show related groups. Each line shows the direction and frequency of application. The line always travels in a clockwise direction from country of applicant to country of archive; the thicker the line, the more applications. Where the country of applicant and country of archive are the same, the line becomes a loop. If you want to read more about the other visualisations that Sarah created during her project, please check out these two blog posts:

We hope this new PhD placement will offer the successful candidate the opportunity to develop their specialist knowledge through access to the extensive correspondence series in the Andrew Salkey archive, and to undertake practical research in a curatorial context by improving the accessibility of linked metadata for this collection material. This project is a vital building block in improving the Library’s engagement with this material and exploring the ways it can be accessed by a wider audience.

If you want to apply, details are available on the British Library website at https://www.bl.uk/research-collaboration/doctoral-research/british-library-phd-placement-scheme. Applications for all 2022/23 PhD Placements close on Friday 25 February 2022, 5pm GMT. The application form and guidelines are available online here. Please address any queries to research.development@bl.uk

This post is by Digital Curator Stella Wisdom (@miss_wisdom) and Eleanor Casson (@EleCasson), Curator in Contemporary Archives and Manuscripts.

23 December 2021

Three crowdsourcing opportunities with the British Library

Digital Curator Dr Mia Ridge writes: In case you need a break from whatever combination of weather, people and news is around you, here are some ways you can entertain yourself (or the kids!) while helping make collections of the British Library more findable, or help researchers understand our past. You might even learn something or make new discoveries along the way!

Your help needed: Living with Machines

Mia Ridge writes: Living with Machines is a collaboration between the British Library and the Alan Turing Institute with partner universities. Help us understand the 'machine age' through the eyes of ordinary people who lived through it. Our refreshed task builds on our previous work, and includes fresh newspaper titles, such as the Cotton Factory Times.

What did the Victorians think a 'machine' was - and did it matter where you lived, or if you were a worker or a factory owner? Help us find out: https://www.zooniverse.org/projects/bldigital/living-with-machines

Your contributions will not only help researchers - they'll also go on display in our exhibition

Image of a Cotton Factory Times masthead
You can read articles from Manchester's Cotton Factory Times in our crowdsourced task

 

Your help needed: Agents of Enslavement? Colonial newspapers in the Caribbean and hidden genealogies of the enslaved

Launched in July this year, Agents of Enslavement? is a research project which explores the ways in which colonial newspapers in the Caribbean facilitated and challenged the practice of slavery. One goal is to create a database of enslaved people identified within these newspapers. This benefits people researching their family history as well as those who simply want to understand more about the lives of enslaved people and their acts of resistance.

Project Investigator Graham Jevon has posted some insights into how he processes the results to the project forum, which is full of fascinating discussion. Join in as you take part: https://www.zooniverse.org/projects/gjevon/agents-of-enslavement

Your help needed: Georeferencer

Dr. Gethin Rees writes: The community have now georeferenced 93% of the 1,277 maps that were added from our War Office Archive back in July (as mentioned in our previous newsletter).

Some of the remaining maps are quite tricky to georeference, so if there is a perplexing map that you would like some guidance with, do get in contact with me and our curator for modern mapping by emailing georeferencer@bl.uk and we will try to help. Please do look forward to some exciting new maps being released on the platform in 2022!

01 December 2021

Open and Engaged 2021: Review

Engagement with cultural heritage collections and the research impact beyond mainstream metrics in arts and humanities

Open and Engaged, the British Library’s annual event in Open Access Week, took place virtually on 25 October. The theme of the conference was Understanding the Impact of Open in the Arts and Humanities beyond the University, as described in a previous blog post.

The slides and the video recordings together with their transcripts are now available through the British Library’s Research Repository. This blog post will give you a flavour of the talks and the sessions in a nutshell.

Two main sessions formed the programme of the conference; one was on increasing the engagement with cultural heritage collections and the other one was on measuring and evaluating impact of open resources beyond journal articles.

British Library in the background with the piazza full of people in the front
British Library and Piazza by Paul Grundy

 

Session One: Increasing Engagement with Cultural Heritage Collections

The first session opened with a talk from Brigitte Vézina of Creative Commons (CC) on how CC supports GLAMs (Galleries, Libraries, Archives and Museums) in embracing open access and unlocking universal access to knowledge and culture. Brigitte introduced CC’s Open GLAM programme, a coordinated global effort to help GLAMs make the content they steward openly available and reusable for the public good.

The British Library’s Sam van Schaik presented the Endangered Archives Programme (EAP), which provides funding for projects to digitise and preserve archival materials at risk of destruction. The resulting digital images and sound files are made available via the British Library’s website. Sam drew attention to the ethical challenges around the CC licenses used for these digital materials and the practical considerations of working globally.

Merete Sanderhoff from the National Gallery of Denmark (SMK) raised a concern about how the GLAM sector at the institutional level is lagging behind in embracing the full potential of open cultural heritage. Merete explained that GLAM users increasingly benefit from arts and knowledge beyond institutional walls by using data from GLAM collections and by spurring on developments in digital literacy, citizen science and democratic citizenship.

The last talk of this session covered Towards a National Collection (TaNC), the research development programme funded by AHRC, and was presented by Rebecca Bailey, Programme Director at TaNC. The programme sponsors projects that are working to link collections and encourage cross-searching of multiple collection types, to enable research and enhance public engagement. Rebecca outlined the achievements and ambitions of the projects as they start to look ahead to a national collections research infrastructure.

This session highlighted that the GLAM sector should embrace its full potential in making cultural heritage open for the public good beyond its physical premises. The use of more open and public domain licences will make it easier to use digital heritage content and resources in the research and creative spheres. The challenge comes with the unethical use of digital collections in some cases, but licensing mechanisms are not the tools with which to police research ethics.

 

Session Two: Measuring and Evaluating Impact of Open Resources Beyond Journal Articles

The second half of the conference started with a metrics project, Cobaltmetrics, which works towards making altmetrics genuinely alternative by using URIs. Luc Boruta from Thunken talked about bringing algorithmic fairness to impact measurement, from web-scale attention tracking to computer-assisted data storytelling.

Gemma Derrick from University of Lancaster presented on the hidden REF experience and highlighted assessing the broader value of research culture. Gemma noted that the doubt about whether impact can be measured doesn’t come from a lack of tools; it is more that what is considered impact differs between individuals, institutions, and disciplines. As she stated, “the nature of impact and the nature of evaluation is inherently better when humans are involved, mainly because mitigating factors and mitigating aspects of our research, and what makes our research culture really important, are less likely to be overlooked by an automated system.” This is what they addressed in the hidden REF, celebrating all research outputs and every role that makes research possible.

Anne Boddington from Kingston University reflected on research impact in three parts: its definition, partnering and collaboration between GLAMs and higher education institutions, and reflections on future benefits. Anne talked about the challenges of impact, the kinds of evidence it demands and the opportunities it presents. She concluded her talk by noting that impact is here to stay and that there are significant areas for growth, and opportunities for innovation and leadership, in the context of impact.

Helen Adams from Oxford University Gardens, Libraries & Museums (GLAM) presented the Online Active Community Engagement (O-ACE) project where they combined arts and science to measure the benefits of online culture for mental health in young people. She highlighted how GLAM organizations can actively involve audiences in medical research and how cultural interventions may positively impact individual wellbeing, prior to diagnosis, treatment, or social prescribing pathways. The conference ended with this great case study on impact assessment.

In her closing remarks, Rachael Kotarski of the British Library underlined that opening up GLAM organizations not only allows us to break down the walls of our buildings to get content out there but also crosses geographic boundaries to get content in front of communities who might not have had a chance to experience it before. It also allows us to work with communities who originated content to understand their concerns and not just the concerns of our organizations. Rachael echoed that licensing restrictions are not the solution to all our questions, or to the ethical issues. It is important that we reflect on what we have learned, adjust and rethink our approach, and identify what really allows us to balance access, engagement, and creativity.

In the context of research impact, we need to centre the human in our assessment and processes. The other factor in impact assessments is the relatively short period of time available to assess impact. Examples like the O-ACE project also showed us that the creation of impact can take much longer than we think, and that the impacts that can be seen will vary through that time. So, assessing those interventions also needs a longer-term view.

Those who didn’t attend the conference or would like to revisit the talks can find the recordings in the British Library’s Research Repository. The social media interactions can be followed with the #OpenEngaged hashtag.

We are looking forward to hosting Open and Engaged 2022, hopefully in person, at the British Library.

This blog post was written by Ilkay Holt, Scholarly Communications Lead, part of the Research Infrastructure Services team.

11 November 2021

The British Library Adopts a New Persistent Identifier Policy

To support and guide the management of its collection, the Library adopted a new persistent identifier policy on 29 September. A persistent identifier or PID is a long-lasting digital reference to an entity, whether it is physical or digital. PIDs are a core component in providing reliable, long-term access to collections and improving their discoverability. They also make it easier to track when and how collections are used. The Library has been using PIDs in various forms for almost a decade, but following the creation of a case study as part of the AHRC’s Towards a National Collection funded project, PIDs as IRO Infrastructure, the Library recognised the need to document its rationale and approach to PIDs and lay down principles and requirements for their use.

An image of the world at night from space, showing the bright lights of cities and towns
Photo by NASA on Unsplash

The Library encourages the use of PIDs across its collections and collection metadata. It recognises the role PIDs have as a component in sustainable, open infrastructure and in enabling interoperability and the use of Library resources. PIDs also support the Library’s content strategy and its goal of connecting rather than collecting as they enable long term and reliable access to resources.  

Many different types of PIDs are used across the Library, some of which it creates for itself, e.g. ARKs, and others which it harvests from elsewhere, e.g. DOIs that are used to identify journal articles. While not all existing Library services may meet the requirements described in this policy, it provides a benchmark against which they can be measured and towards which they can develop.

To make sure staff at the Library are supported in implementing the policy, a working group has been convened to run until the end of December 2022. This group will raise awareness of the policy and ensure that guidance is made available to any project or service which is under review to consider the use of PIDs.

A public version of the policy is available on this page and an extract with the key points is provided below. The group would like to acknowledge the Bibliothèque nationale de France’s policy, which was influential in the creation of this policy.

Principles

In its use of identifiers, the British Library adheres to the following principles, which describe the qualities PIDs created, contributed or consumed by the Library must have.  

  • A PID must never be deleted but may be marked as deprecated if required
  • A PID must be usable in perpetuity to identify its associated entity
  • A PID must only describe one entity and must never be reused for different entities 
  • A PID must have established versioning processes and procedures in place; these may be defined locally by the Library as a creator or by the PID provider  
  • A PID must have established governance mechanisms, such as contracts, in place to ensure the standards of use of the PID are met and continue to be met  
  • A PID must resolve to metadata about the entity available in both human- and machine-readable formats
  • A publicly accessible PID must be resolvable via a global resolver
  • A PID must have an operating model that is sustainable for long-term persistent use 

Established user community 

  • A PID must have an established user community, which has adopted it as a standard, either through an organisation such as the International Organization for Standardization (ISO) or as a de facto standard through widespread adoption; the Library will support and develop the use of new types of PIDs where there is a defined and recognised use case which they would address

Interoperable 

  • A PID must be able to link with the other identifiers in use at the Library through open metadata standards and the capability to cross-reference resources 

New PID types or new use 

  • New types of PIDs should only be considered for use in the Library where there is a defined need which cannot reasonably be met by a combination of PIDs already in use 
  • Any new PID type used by the Library should meet the requirements described in this policy 
  • Where a PID type is emerging and does not have an established community, the Library can seek to influence its development in line with principles for open and sustainable infrastructures 

Requirements

These requirements outline the Library’s responsibilities in using PID services and creating PIDs. While the Library uses identifiers which do not meet all of these requirements, they are included for future work and developments.  

  • The Library aspires to assign PIDs to all resources within its collections, both physical and digital, and associated entities, in alignment with the guiding principles of the Library’s content strategy 2020-2023
  • The Library has varying levels of involvement in different PID schemes, but all PIDs created by the Library must meet the requirements described in this section and the Library prefers the use of PIDs which meet the principles
  • Identifiers created by the Library must have an opaque format, i.e. not contain any semantic information within them, to ensure their longevity 
  • A PID must resolve to information about the entity to which it refers 
  • The Library must have a process to specify the granularity at which PIDs are assigned and how relationships between PIDs for component and overarching entities are managed 
  • The Library must have a process to manage versioning including changes, merges and retirement of entities 
  • Standard descriptive information about an entity, e.g. creator, should have a PID 
  • All metadata associated with a PID should comply with Collection Metadata Licensing Guidelines 
  • Where a PID referring to a citable resource resolves to a webpage, that webpage should display a suggested citation including the hyperlink to the PID to encourage ongoing use of the PID outside the Library
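
To make the opaque-format requirement more concrete, here is a minimal sketch in Python of how an identifier with no semantic content might be minted. It is an illustration only and not the Library's actual process: the alphabet, the default length and the function names `mint_opaque_id` and `mint_unique_id` are all assumptions made for this example.

```python
import secrets

# Hypothetical alphabet (an assumption for this sketch): digits and lowercase
# letters with the look-alike characters 0, 1, l and o left out.
ALPHABET = "23456789abcdefghijkmnpqrstuvwxyz"


def mint_opaque_id(length: int = 10) -> str:
    """Return a random string that encodes nothing about the entity it names."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_unique_id(issued: set[str], length: int = 10) -> str:
    """Mint an identifier that has not been issued before and record it.

    `issued` stands in for whatever registry of existing identifiers a real
    service would keep; checking it reflects the principle that a PID is
    never reused for a different entity.
    """
    while True:
        candidate = mint_opaque_id(length)
        if candidate not in issued:
            issued.add(candidate)
            return candidate
```

Because the string carries no date, collection name or shelfmark, nothing in it goes out of date if the entity is recatalogued or moved, which is the longevity argument behind the requirement.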

If you would like to hear more about this policy and the Library’s approach to persistent identifiers, feel free to contact the Heritage PIDs project on Twitter or email openaccess@bl.uk.

This post is by Frances Madden (@maddenfc, orcid.org/0000-0002-5432-6116), Research Associate (PIDs as IRO Infrastructure) in the Research Infrastructure Services team.
