Digital scholarship blog

Enabling innovative research with British Library digital collections

150 posts categorized "Research collaboration"

04 May 2023

Webinar on Open Scholarship in GLAMs through Research Repositories

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, then join us on Thursday 18 May for an online repository training session for cultural heritage professionals.

Image of a man looking at a poster that says 'Open Scholarship in GLAMs through Research Repositories - Webinar on 18 May, Thursday - Register at bit.ly/BLrepowebinar'

This event is part of the Library’s Repository Training Programme for Cultural Heritage Professionals. It is designed, based on the input received from previous repository training events (this, this and this), to explore some areas of open scholarship further. These include, but are not limited to, research activities in GLAM, the benefits of research repositories, scholarly publishing, research data management and digital preservation in scholarly communications.

 

Who is it for?

It is intended for those working in cultural heritage or collection-holding organisations in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in these topics is welcome to attend!

 

Programme

13.00  Welcome and introductions

       Susan Miles, Scholarly Communications Specialist, British Library

Session 1  Open scholarship in GLAM research

13.15  Repositories to facilitate open scholarship

       Jenny Basford, Repository Services Lead, British Library

13.40  Scholarly publishing dynamics in the GLAM environment

       Ilkay Holt, Scholarly Communications Lead, British Library

14.05  Q&A

14.20  Break time

Session 2  Building openness in GLAM research

14.40  Research data management

       Jez Cope, Data Services Lead, British Library

15.05  Digital preservation and scholarly communications

       Neil Jefferies, Head of Innovation, Bodleian Libraries

15.30  Q&A

15.45  Closing

 

Register!

The event will take place from 13.00 to 15.45 on Thursday 18 May. Please register at this link to receive your access link for the online session.

 

What is next?

The last training event of the Library’s Repository Training Programme will be held on 31 May in Cardiff, hosted by the National Museums Cardiff. It will be an update and re-run of the previous face-to-face events. More information about the programme and registration link can be found in this blog post.

Please contact [email protected] if you have any questions or comments about the events.

 

Previous Events

31 January, in-person, Edinburgh, hosted by the National Museums Scotland

8 March, online, hosted by the British Library

31 March, in-person, York, hosted by Archaeology Data Service at the University of York

 

About British Library’s Repository Training Programme

The Library’s Repository Training Programme for cultural heritage professionals is funded as part of AHRC’s iDAH programme to support cultural heritage organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories. You can read more about the scoping report and the development of this training programme in this blog post.

02 May 2023

Detecting Catalogue Entries in Printed Catalogue Data

This is a guest blog post by Isaac Dunford, MEng Computer Science student at the University of Southampton. Isaac reports on his Digital Humanities internship project supervised by Dr James Baker.

Introduction

The purpose of this project has been to investigate and implement different methods for detecting catalogue entries within printed catalogues. Whilst printed catalogues are easy enough to digitise and convert into machine-readable data, dividing that data by catalogue entry requires turning the visual signifiers of divisions between entries - gaps in the printed page, large or upper-case headers, catalogue references - into machine-readable information. The first part of this project involved experimenting with XML-formatted data derived from the 13-volume Catalogue of books printed in the 15th century now at the British Museum (described by Rossitza Atanassova in a post announcing her AHRC-RLUK Professional Practice Fellowship project) and trying to find the best ways to detect individual entries and reassemble them as data (given that the text for a single catalogue entry may be spread across multiple pages of a printed catalogue). The next part involved building a complete system based on this approach that takes the large number of XML files for a volume and outputs all of the catalogue entries in a series of desired formats. This post describes our initial experiments with that data, the approach we settled on, and key features of our approach that you should be able to reapply to your catalogue data. All data and code can be found on the project GitHub repo.

Experimentation

The catalogue data was exported from Transkribus in two different formats: an ALTO XML schema and a PAGE XML schema. The ALTO layout encodes positional information about each element of the text (that is, where each word occurs relative to the top left corner of the page), which makes spatial analysis - such as looking for gaps between lines - possible. However, it also creates data files that are heavily encoded, meaning that it can be difficult to extract the text elements from them. The PAGE schema, by contrast, makes it easier to access the text elements in the files.
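To give a rough feel for the two exports, the sketch below pulls line text out of a PAGE XML string and line coordinates out of an ALTO XML string using only Python's standard library. The sample XML is invented and heavily simplified, the function names are hypothetical, and this is not the project's code: real Transkribus exports contain many more elements and attributes.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def page_lines(xml_text):
    """Return the text of each TextLine in a PAGE XML document."""
    lines = []
    for el in ET.fromstring(xml_text).iter():
        if local(el.tag) != "TextLine":
            continue
        # Only look at the line's direct TextEquiv child, not word-level ones.
        for child in el:
            if local(child.tag) == "TextEquiv":
                for uni in child:
                    if local(uni.tag) == "Unicode":
                        lines.append(uni.text or "")
    return lines

def alto_lines(xml_text):
    """Return (HPOS, VPOS, text) for each TextLine in an ALTO document."""
    out = []
    for el in ET.fromstring(xml_text).iter():
        if local(el.tag) == "TextLine":
            words = [s.get("CONTENT", "") for s in el if local(s.tag) == "String"]
            out.append((float(el.get("HPOS", 0)), float(el.get("VPOS", 0)), " ".join(words)))
    return out

# Invented, minimal samples for illustration only.
PAGE_SAMPLE = (
    '<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15">'
    '<Page><TextRegion><TextLine id="l1"><TextEquiv><Unicode>IB. 39624</Unicode>'
    '</TextEquiv></TextLine></TextRegion></Page></PcGts>'
)
ALTO_SAMPLE = (
    '<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#"><Layout><Page><PrintSpace>'
    '<TextBlock><TextLine HPOS="120" VPOS="300" WIDTH="400" HEIGHT="30">'
    '<String CONTENT="IB."/><String CONTENT="39624"/></TextLine></TextBlock>'
    '</PrintSpace></Page></Layout></alto>'
)

print(page_lines(PAGE_SAMPLE))
print(alto_lines(ALTO_SAMPLE))
```

Matching on local tag names rather than a fixed namespace is a deliberate choice here, since the namespace URI differs between schema versions.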

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the PAGE XML Schema
Raw PAGE XML for a page from volume 8 of the Incunabula Catalogue

 

An image of a digitised page from volume 8 of the Incunabula Catalogue and the corresponding Optical Character Recognition file encoded in the ALTO XML Schema
Raw ALTO XML for a page from volume 8 of the Incunabula Catalogue

 

Spacing and positioning

One of the first approaches tried in this project was to use size and spacing to find entries. The intuition behind this is that there is generally a larger amount of white space around the headings in the text than there is between regular lines. And in the ALTO schema, there is information about the size of the text within each line as well as about the coordinates of the line within the page.

However, we found that using the size of the text line and/or the positioning of the lines was not effective for three reasons. First, blank space between catalogue entries inconsistently contributed to the size of some lines. Second, whenever there were tables within the text, there would be large gaps in spacing compared to the normal text, which in turn caused those tables to be read as divisions between catalogue entries. And third, even though entry headings were visually further to the left on the page than regular text, and therefore should have had the smallest x coordinates, the materiality of the printed page was inconsistently represented as digital data, and so presented regular lines with small x coordinates that could be read - using this approach - as headings.
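For readers who want to try the spacing idea on their own data, here is a minimal sketch of a gap-based heuristic of the kind described above. It is not the project's code: the input format (one tuple of vertical position, height and text per line), the function name and the threshold factor are all assumptions made for illustration.

```python
from statistics import median

def gap_candidates(lines, factor=1.8):
    """Return the text of lines preceded by an unusually large vertical gap.

    lines: list of (vpos, height, text) tuples sorted from top to bottom.
    factor: assumed multiplier; a gap counts as large when it exceeds
            factor times the median gap on the page.
    """
    gaps = [cur[0] - (prev[0] + prev[1]) for prev, cur in zip(lines, lines[1:])]
    if not gaps:
        return []
    typical = median(gaps)
    return [cur[2] for gap, cur in zip(gaps, lines[1:]) if gap > factor * typical]
```

A table or an inconsistently measured line produces the same large gaps as a heading does, which is the weakness reported in the paragraph above.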

Final Approach

Entry Detection

Our chosen approach uses the data in the PAGE XML schema, and is bespoke to the data for the Catalogue of books printed in the 15th century now at the British Museum as produced by Transkribus (and indeed to the version of Transkribus: having built our code around some initial exports, running it over the later volumes - which had been digitised last - threw an error due to some slight changes to the exported XML schema).

The code takes the XML input and finds entries using a content-based approach that looks for features at the start and end of each catalogue entry. After experimenting with different approaches, we found that the most consistent way to detect the catalogue entries was to:

  1. Find the “reference number” (e.g. IB. 39624) which is always present at the end of an entry.
  2. Find a date that is always present after an entry heading.

This gave us the ability to infer from context the presence of a split between two catalogue entries. The main limitation of this approach is the quality of the Optical Character Recognition (OCR) at the points where the references and dates occur in the printed volumes.
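The two steps above can be sketched with regular expressions. This is an illustration rather than the project's code: both patterns are assumptions (the reference pattern is generalised from the single example "IB. 39624", and the date pattern simply looks for a four-digit year beginning 14 or 15), and the function names are hypothetical.

```python
import re

# Assumed pattern: "I" plus one capital letter, a full stop, then digits (e.g. "IB. 39624").
REFERENCE = re.compile(r"\bI[A-Z]\.\s*\d+\b")
# Assumed pattern: a four-digit year starting with 14 or 15.
DATE = re.compile(r"\b1[45]\d{2}\b")

def split_entries(lines):
    """Group lines into entries, closing an entry at each reference number.

    Returns (complete_entries, leftover_lines). Passing the lines of
    consecutive pages as one list lets an entry run across a page break.
    """
    entries, current = [], []
    for line in lines:
        current.append(line)
        if REFERENCE.search(line):
            entries.append(current)
            current = []
    return entries, current

def has_date_near_top(entry, window=3):
    """Check that a date appears within the first few lines of an entry."""
    return any(DATE.search(line) for line in entry[:window])
```

An entry that fails the date check is a reasonable candidate for manual review, since a missed reference number (for instance through an OCR error) will merge two entries into one.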

 

An image of a digitised page with a catalogue entry and the corresponding text output in XML format
XML of a detected entry

 

Language Detection

The reason for dividing catalogue entries in this way was to facilitate analysis of the catalogue data, specifically analysis that sought to define the linguistic character of descriptions in the Catalogue of books printed in the 15th century now at the British Museum and how those descriptions changed and evolved across the thirteen volumes. Segments of each catalogue entry contain text transcribed from the incunabula that was not written by a cataloguer (and is therefore not part of their cataloguing ‘voice’). Those transcribed sections are in French, Dutch, Old English, and other languages that a machine could detect as not being modern English. To further facilitate research use of the final data, one of the extensions we implemented was therefore to label sections of each catalogue entry by language. This was achieved using a Python library for language detection and then - for a particular output type - replacing non-English sections of text with a placeholder (e.g. NON-ENGLISH SECTION). Whilst the language detection model does not recognise Old English, and as a result assigns those sections labels for a variety of other languages, it was still able to break blocks of text in each catalogue entry into English and non-English sections.
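The placeholder step might look something like the sketch below. Because the post does not name the language detection library, the detector is passed in as a function; the toy_detect function is a crude stand-in invented for this example and is not a real language model. The function names and the "en" language code are likewise assumptions.

```python
PLACEHOLDER = "NON-ENGLISH SECTION"

def mask_non_english(sections, detect):
    """Replace every section whose detected language is not 'en'.

    sections: list of text blocks from one catalogue entry.
    detect: any function mapping text to a language code.
    """
    return [s if detect(s) == "en" else PLACEHOLDER for s in sections]

def toy_detect(text):
    """Stand-in detector for the example only: counts common English words."""
    english = {"the", "of", "and", "in", "with", "a", "is"}
    words = [w.strip(".,;:").lower() for w in text.split()]
    hits = sum(w in english for w in words)
    return "en" if words and hits / len(words) >= 0.2 else "other"

sections = ["The first leaf is blank.", "Cy commence le livre"]
print(mask_non_english(sections, toy_detect))
```

Only the English/non-English distinction matters for this output, so a detector that gives inconsistent labels to sections it cannot identify still produces usable results.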

 

Text files for catalogue entry number IB39624 showing the full text and the detected English-only sections.
Text outputs of the full and English-only sections of the catalogue entry

 

Poorly Scanned Pages

Another extension for this system was to use the input data to try to determine whether a page had been poorly scanned: for example, where the lines in the XML input read from one column straight into another as a single line (rather than the XML reading order following the visual signifiers of column breaks). The system detects poorly scanned pages by looking at the lengths of all lines in the PAGE XML file, establishing which lines deviate substantially from the mean line length, and marking the page as poorly scanned if sufficient outliers are found.
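A minimal sketch of this kind of check is given below. The post does not state what counts as a substantial deviation or how many outliers are sufficient, so the two thresholds here (two standard deviations, and at least two outlying lines) are assumptions, as is the function name.

```python
from statistics import mean, pstdev

def is_poorly_scanned(lines, z=2.0, min_outliers=2):
    """Flag a page when enough line lengths sit far from the mean.

    lines: the text lines of one page.
    z: assumed cut-off, in standard deviations from the mean length.
    min_outliers: assumed number of outlying lines needed to flag the page.
    """
    lengths = [len(line) for line in lines]
    if len(lengths) < 2:
        return False
    mu = mean(lengths)
    sigma = pstdev(lengths)
    if sigma == 0:
        return False  # every line has the same length
    outliers = [n for n in lengths if abs(n - mu) > z * sigma]
    return len(outliers) >= min_outliers
```

A page where two columns have been merged yields a few lines roughly twice as long as the rest, and it is those lines that this check picks up.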

Key Features

The key part of this system that can be taken and applied to a different problem is the method for detecting entries. We expect that the fundamental method of looking for marks in the page content to identify the start and end of catalogue entries in the XML files would be applicable to other data derived from printed catalogues. The only parts of the algorithm that would need changing for a new system are the regular expressions used to find the start and end of the catalogue entries. And as long as the XML input comes in the same schema, the code should be able to consistently divide up the volumes into individual catalogue entries.

03 April 2023

Topics in contemporary Digital Scholarship via five years of our Reading Group

Since March 2016, the Digital Scholarship Reading Group at the British Library has discussed articles, videos, podcasts, blog posts and chapters that touch on digital scholarship in libraries. Digital Curator Mia Ridge previously shared our readings up to May 2018 and took a thematic look at our readings at the intersection of digital scholarship and anti-racism in July 2020.

As the Living with Machines project draws to an end this (northern) summer, Mia provides an updated list of our readings since June 2018: I started including more pieces on deep learning, machine learning, AI ('artificial intelligence'), big data, data science, digital history, digitised newspapers, and user experience design for digital collections when we began discussing what became Living with Machines in early 2017. This was partly a way for me to catch up with relevant topics, and partly to lay the groundwork for LwM across the organisation. You can see that reflected in our topics up to May 2018 and onward.

Of course, the group continued to cover other topics, and sessions were suggested and/or led by colleagues including Adi Keinan-Schoonbaert, Annabel Gallop, Graham Jevon, Jez Cope, Lucy Hinnie, Mary Stewart, Nora McGregor, Sarah Miles, Sarah Stewart and Stella Wisdom. Especial thanks to Rossitza Atanassova and Deirdre Sullivan who’ve been helping me run the group in recent years. In 2021 we started using the January session to invite colleagues across the Library to look around and pick topics for discussion in the year ahead.

So what did we discuss from June 2018 to the end of 2022?

31 March 2023

Mapping Caribbean Diasporic Networks through the Correspondence of Andrew Salkey

This is a guest post by Natalie Lucy, a PhD student at University College London, who recently undertook a British Library placement to work on a project Mapping Caribbean Diasporic Networks through the correspondence of Andrew Salkey.

Project Objectives

The project, supervised by curators Eleanor Casson and Stella Wisdom, focussed on the extensive correspondence contained within Andrew Salkey’s archive. One of the initial objectives was to digitally depict the movement of key Caribbean writers and artists, as evidenced within the correspondence, many of whom travelled between Britain and the Caribbean as well as the United States, Central and South America and Africa. Although Salkey corresponded with a diverse range of people, we focused on the letters in his archive which were from Caribbean writers and academics and which illustrated patterns of movement of the Caribbean diaspora. Much of the correspondence stems from the 1960s and 1970s, a time when Andrew Salkey was particularly active both in the Caribbean Artists Movement and, as a writer and broadcaster, at the BBC.

Photograph of Andrew Salkey's head and shoulders in profile
Photograph of Andrew Salkey

Andrew Salkey was unusual not only for the panoply of writers, artists and politicians with whom he was connected, but also in that he sustained those relationships, carefully preserving the correspondence which resulted from those networks. My personal interest in this project stemmed from the fact that my PhD seeks to consider the ways that the Caribbean trickster character, Anancy, has historically been reinvented to say something about heritage and identity. Significant to that question was the way that the Caribbean Artists Movement, a dynamic group of artists and writers formed in London in the mid-1960s, and of which Andrew Salkey was a founder, appropriated Anancy, reasserting him and the folktales to convey something of a literary ‘voice’ for the Caribbean. For this reason, I was also interested in the writing networks which were evidenced within the correspondence, together with their impact.

What is Gephi?

Prior to starting the project, Eleanor, who had catalogued the Andrew Salkey archive, and Digital Curator Stella had identified Gephi as a possible software application through which to visualise this data. Gephi has been used in a variety of projects, including several at Harvard University; examples of the breadth and diversity of those initiatives can be found here. Several of these projects have social networks or historical trading routes as their focus, with obvious parallels to this project. Others notably use correspondence as their main data.

Gathering the Data

Andrew Salkey was known as something of a chronicler. He was interested in letters and travel and was also a serious collector of stamps. As such, he had not only retained the majority of the letters he received but categorised them. Eleanor had originally identified potential correspondents who might be useful to the project, selecting writers who travelled widely, whose correspondence had been separately stored by Salkey, partly because of its volume, and who might be of wider interest to the public. These included the acclaimed Caribbean writers, Samuel Selvon, George Lamming, Jan Carew and Edward Kamau Brathwaite and publishers and political activists, Jessica and Eric Huntley.

Our initial intention was to limit the data to simple facts which could easily be gleaned from the letters. Gephi required that we record these on a spreadsheet, which had to conform to a particular format. In the first stages of the project, the data was confined to the dates and locations of the correspondence, information which could suggest the patterns of movement within the diaspora. However, the letters were so rich in detail that we ultimately recorded other information. This included any additional travel undertaken by any of the correspondents which was clearly evidenced in the letters, together with any passages from the correspondence which demonstrated either something of the nature and quality of the friendships or, alternatively, the mutual benefit of those relationships to the careers of so many of the writers.

Creating a visual network

Dr Duncan Hay was invited to collaborate with me on this project, as he has considerable expertise in this field; his research interests include web mapping for culture and heritage and data visualisation for literary criticism. After the initial data was collated, we discussed with Duncan what visualisations could be created. It became apparent early on that creating a visualisation of the social networks, as opposed to the patterns of movement, might be relatively straightforward via Gephi, an application which was particularly useful for this type of graph. I had prepared a spreadsheet, but Gephi requires the data to be presented in a strictly consistent way, which meant that any anomalies had to be eradicated and the data effectively ‘cleaned up’ using OpenRefine. Gephi also requires that information is presented by way of a system of ‘nodes’, ‘edges’ and ‘attributes’ with corresponding spreadsheet columns. In our project, the ‘nodes’ referred to Andrew Salkey and each of the correspondents and other individuals of interest who were specifically referred to within the correspondence. The ‘edges’ referred to the way that those people were connected which, in this case, was through correspondence. However, what added to the potential of the project was that these nodes and edges could be further described by reference to ‘attributes’. The possibility of assigning a range of ‘attributes’ to each of the correspondents allowed a wealth of additional information to be provided about the networks. As a consequence, and in order to make any visualisation as informative as possible, I also added brief biographical information for each of the writers and artists to be inputted as ‘attributes’, together with some explanation of the nature of the networks that were being illustrated.
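As an illustration of the node and edge layout, the sketch below builds the two tables as CSV text. It uses the column headers that Gephi's spreadsheet import recognises (Id and Label for nodes; Source, Target, Type and Weight for edges). Everything else is an assumption made for the example: the function name, the extra "Role" attribute column and the sample letters, which are invented and do not reflect actual counts in the archive.

```python
import csv
import io
from collections import Counter

def gephi_tables(letters, attributes):
    """Build Gephi-style node and edge tables as CSV strings.

    letters: iterable of (sender, recipient) pairs, one per letter.
    attributes: dict mapping a person's name to a short description.
    """
    weights = Counter(letters)  # number of letters per (sender, recipient)
    names = sorted({name for pair in weights for name in pair})

    nodes = io.StringIO()
    writer = csv.writer(nodes, lineterminator="\n")
    writer.writerow(["Id", "Label", "Role"])
    for name in names:
        writer.writerow([name, name, attributes.get(name, "")])

    edges = io.StringIO()
    writer = csv.writer(edges, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (sender, recipient), count in sorted(weights.items()):
        writer.writerow([sender, recipient, "Directed", count])

    return nodes.getvalue(), edges.getvalue()

# Invented sample data for illustration only.
letters = [
    ("George Lamming", "Andrew Salkey"),
    ("George Lamming", "Andrew Salkey"),
    ("Samuel Selvon", "Andrew Salkey"),
]
node_csv, edge_csv = gephi_tables(letters, {"George Lamming": "writer"})
print(node_csv)
print(edge_csv)
```

Generating the tables from a single list of letters keeps names spelled identically in both files, which is the kind of consistency the clean-up step is meant to guarantee.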

The visual illustration below shows not only the quantity of letters from the sample of correspondents to Andrew Salkey (the pink lines), but also which other correspondents formed part of those networks and were referenced as friends or contacts within specific items of correspondence. For example, George Lamming references the academic Rex Nettleford and the writer and activist Claudia Jones, the founder of the Notting Hill Carnival, in his correspondence, connections which are depicted in grey.

Data visualisation of nodes and lines representing Andrew Salkey's Correspondence Network
Gephi: Andrew Salkey correspondence network

The aim was, however, for the visualisation to also be interactive. This required considerable further manipulation of the format and tools. In this illustration you can see the information that is revealed about the prominent Barbadian writer, George Lamming which, in an interactive format, can be accessed via the ‘i’ symbols beside many of the nodes coloured in green.  

Whilst Gephi was a useful tool with which to illustrate the networks, it was less helpful as a way to demonstrate the patterns of movement, one of the primary objectives of the project. A challenge was, therefore, to create a map which could be both interactive and illustrative of the specific locations of the correspondents as well as their movement over time. With Duncan’s input and expertise, we opted for a hybrid approach, utilising two principal ways to illustrate the data: we used Gephi to create a visualisation of the ‘networks’ (above) and another software tool, Kepler.gl, to show the diasporic movement.

A static version of what ultimately will be a ‘moving’ map (illustrating correspondence with reference to person, date and location) is shown below. As well as demonstrating patterns of movement, it should also be possible to access information about specific letters as well as their shelf numbers through this map, hopefully making the archive more accessible.

Data visualisation showing lines connecting countries on a map showing part of the Americas, Europe and Africa
Patterns of diasporic movement from Andrew Salkey's correspondence, illustrated in Kepler.gl

Whilst we are still exploring the potential of this project and how it might intersect with other areas of research and archives, it has already revealed something of the benefits of this type of data visualisation. For example, a project of this type could be used as an educational tool, providing something of a simple, but dynamic, introduction to the Caribbean Artists Movement. Being able to visualise the project has also allowed us to input information which confirms where specific letters of interest might be found within the archive. Ultimately, it is hoped that the project will offer ways to make a rich, yet arguably undervalued, archive more accessible to a wider audience with the potential to replicate something of an introductory model, or ‘pilot’ for further archives in the future. 

28 March 2023

BL Labs Symposium 30 March 2023: AI and GLAM data

A small bird flying with a trail of dashes behind it showing their flight pattern

Don’t forget to register for the 2023 BL Labs Symposium (https://us02web.zoom.us/webinar/register/WN_oAApT1laSFSCm28Kyfz4bA)

Following the latest advancements in AI is almost a job in itself. The constant excitement sometimes feels almost bewildering, and it leaves us little room to really get stuck into the peculiarities and joys of data and AI methods and tools emerging in Galleries, Libraries, Archives and Museums (GLAM). For the second part of the BL Labs Symposium this year, we will be looking to spend some time with examples of real data, tools and methods emerging in the GLAM AI world.

We will start our Data and AI session with an exciting presentation by Yannis Assael from DeepMind. Yannis will show us Ithaca, the first Deep Neural Network interactive interface built to restore and attribute ancient Greek inscriptions. We expect this to be a real game changer for the use of AI for collections that include complex and incomplete fragments of text.

The words Living With Machines in front of circles and cog shapes

We will also explore some British Library examples of AI and machine learning, mainly using examples of data derived from our newspaper and map collections. Kalle Westerling will reflect on the latest from the Living with Machines project, a ground-breaking research collaboration between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL, King’s College). Gethin Rees will tell us about his work, which is engaging the public with geospatial data and in the process improving our capabilities to locate national collections.

BL Labs are dedicated to opening up the British Library’s data, especially for all researchers who want to use it for different types of computational research. This remains a daunting task. But we have been working on it! Silvija Aurylaite, BL Labs Manager, will share the BL Labs direction of travel, including sharing our new BL Labs website in Beta. The site will be live for the first time, with the Symposium audience kicking off our testing and engagement phase.

We hope that this session will give us some time to share and reflect on the ongoing AI work in GLAM with all its excitement, challenges and opportunities. All going well, there may be even a chance to get your hands on some new datasets.

We hope you can join us at the BL Labs Symposium on Thursday 30 March 2023. For the full programme, and further information on all our speakers, please read our earlier blog post. We are also delighted to be going ahead with an informal drinks and networking drop in session at the Library between 6.30pm and 7.30pm and you are all most welcome to join us. Register for this and / or the Symposium here

08 March 2023

Next in York - Join us at the University of York for the Repository Training Programme for Cultural Heritage Professionals

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, this event is for you.

The British Library’s Repository Training Programme for cultural heritage professionals is funded as part of AHRC’s iDAH programme to support GLAM organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories.

We had the first event in Edinburgh, in-person, hosted by the National Museums Scotland on 31 January. An online training event followed this on 8 March tailored on the basis of audience feedback in Edinburgh.

Our third training event will be in-person: the University of York will kindly host us in York on Thursday, 23 March 2023.

Photograph of rows of empty red chairs in an auditorium
Photograph by Jonas Kakaroto from Pexels

Who is this training for?

We invite everyone who is working in a cultural heritage or collection-holding organisation in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in these topics is welcome to attend.

What will you learn?

This one-day training session is designed as a starting point to a broader set of knowledge that will help you to:

  • Understand the research landscape in cultural heritage organisations, the benefits of openness for heritage research, the basic concepts of open principles and how to influence decision makers.
  • Lay the foundations for repository services, including stakeholder engagement, policy development, technical overview and project planning.
  • Adopt common principles and frameworks, technical standards and requirements in establishing repository services in a cultural heritage organisation.
  • Explore the basics of the scholarly communications ecosystem in the context of cultural heritage practices.

Prerequisites

No previous knowledge of the topic is required. However, an understanding of open access will maximise the benefit of the taught content for attendees.

Programme

10:30   Welcome and introductions

11:00   Session 1 Opening up heritage research

This session covers the topics of understanding the research landscape in GLAM organisations, benefits of openness for heritage research, basic concepts of open principles and frameworks.

11:45   Break time

12:00   Workshop

12:30   Lunch

13:30   Session 2 Getting started with heritage GLAM repositories

This session covers the role of repository infrastructure in open access to heritage research and the positioning of research repositories in an organisation, including policy and development.

14:15   Break time

14:30   Session 3 Realising and expanding the benefits

This session covers the technical overview and requirements for running a cultural heritage repository, including an overview of the BL’s Shared Research Repository, platforms and software, content administration and technical features.

15:15   Closing remarks

15:30   Closure

Book your place

In-person sessions are planned for a maximum of 35 people per event and registrants from cultural heritage institutions will be prioritised. Registration for the event is free. Please fill in this form to book your place by 20th March. Confirmation and event details will be sent to the registered email address.

Members of the Research Infrastructure Services Team at the British Library will be delivering the training programme. The team has over ten years of broad experience and extensive knowledge in supporting open scholarship across the sector and with international partners. They also provide a Shared Research Repository Service for cultural heritage organisations.

Please contact [email protected] if you have any questions or comments about this training programme.

28 February 2023

Legacies of Catalogue Descriptions Project Events at Yale

In January James Baker and I visited the Lewis Walpole Library at Yale, who are the US partner of the Legacies of catalogue descriptions collaboration. The visit had to be postponed several times due to the pandemic, so we were delighted to finally meet in person with Cindy Roman, our counterpart at Yale. The main reason for the trip was to disseminate the findings of our project by running workshops on tools for computational analysis of catalogue data and delivering talks about Researching the Histories of Cataloguing to (Try to) Make Better Metadata. Two of these events were kindly hosted by Kayla Shipp, Programme Manager of the fabulous Franke Family Digital Humanities Lab (DH Lab).

A photo of Cindy Roman, Rossitza Atanassova, James Baker and Kayla Shipp standing in a line in the middle of the Yale Digital Humanities Lab
(left to right) Cindy Roman, Rossitza Atanassova, James Baker and Kayla Shipp in the Yale Digital Humanities Lab

This was my first visit to the Yale University campus, so I took the opportunity to explore its iconic library spaces, including the majestic Sterling Memorial Library building, a masterpiece of Gothic Revival architecture, and the world-renowned Beinecke Rare Book and Manuscript Library, whose glass tower inspired the King’s Library Tower at the British Library. As well as being amazing hubs for learning and research, the Library buildings and exhibition spaces are also open to public visitors. At the time of my visit I explored the early printed treasures on display at the Beinecke Library, the exhibit about Martin Luther King Jr’s connection with Yale and the splendid display of highlights from Yale’s Slavic collections, including Vladimir Nabokov’s CV for a job application to Yale and a family photo album that belonged to the Romanovs.

A selfie of Rossitza Atanassova with the building of the Sterling Memorial Library in the background
Outside Yale's Sterling Memorial Library

A real highlight of my visit was the day I spent at the Lewis Walpole Library (LWP), located in Farmington, about 40 miles from the Yale campus. The LWP is a research centre for eighteenth-century studies and an essential resource for the study of Horace Walpole. The collections, including important holdings of British prints and drawings, were donated to Yale by Wilmarth and Annie Lewis in the 1970s, together with several eighteenth-century historic buildings and land.

Prior to my arrival James had conducted archival research with the catalogues of the LWP satirical prints collections, a case study for our project. As well as visiting the modern reading room to take a look at the printed card catalogues, many in the hand of Mrs Lewis, we were given a tour of Mr and Mrs Lewis’ house, which is now used for classes, workshops and meetings. I enjoyed meeting the LWP staff and learned much about the history of the place, the collectors' lives and the LWP's current initiatives.

One of the historic buildings on the Lewis Walpole Library site - The Root House, a white Georgian-style building with a terrace, used to house visiting fellows and guests
The Root House which houses residential fellows

 

One of the historic buildings on the Lewis Walpole Library site - a red-coloured building surrounded by trees
Thomas Curricomp House

 

The main house, a white Georgian-style house, seen from the side, with the entrance to the Library on the left
The Cowles House, where Mr and Mrs Lewis lived

 

The two project events I was involved with took place at the Yale DH Lab. During the interactive workshop, Yale Library staff, faculty and students worked through the training materials on using AntConc for computational analysis and performed a number of tasks with the LWP satirical prints descriptions. There were discussions about the different ways of querying the data and the suitability of this tool for use with non-European languages and scripts. It was great to hear that this approach could prove useful for querying and promoting Yale’s own open access metadata.

 

James talking to a group of people seated at a table, with a screen behind him showing some text data
James presenting at the workshop about AntConc
Rossitza standing next to a screen with a slide about her talk facing the audience
Rossitza presenting her research with incunabula catalogue descriptions

 

The talks addressed questions around cataloguing labour and curatorial voices, and the extent to which computational analysis enables new research questions and can assist practitioners with remedial work involving collections metadata. I spoke about my current RLUK fellowship project with the British Library incunabula descriptions, in particular the history of cataloguing, the process to output text data and some hypotheses to be tested through computational analysis. The discussion that followed raised questions about the effort that goes into this type of work and the need to balance greater user access to library and archival collections with the very important considerations about the quality and provenance of metadata.

During my visit I had many interesting conversations with Yale Library staff, including Nicole Bouché, Daniel Lovins and Daniel Dollar, and caught up with folks I had met at the 2022 IIIF Conference: Tripp Kirkpatrick, Jon Manton and Emmanuelle Delmas-Glass. I was curious to learn about recent organisational changes aimed at unifying the Yale special collections and enhancing digital access via IIIF metadata, and about the new roles of Director of Computational Data and Methods, in charge of the DH Lab, and Cultural Heritage Data Engineer, tasked with transforming Yale data into LOUD (Linked Open Usable Data).

This has been a truly informative and enjoyable visit and my special thanks go to Cindy Roman and Kayla Shipp who hosted my visit and project events at the start of a busy term and to James for the opportunity to work with him on this project.

This blogpost is by Dr Rossitza Atanassova, Digital Curator for Digitisation, British Library. She is on Twitter @RossiAtanassova and Mastodon @[email protected]

22 February 2023

Repository Training for Cultural Heritage Professionals

If you work in the galleries, libraries, archives, and museums (GLAM) sector and want to learn more about research repositories, then save Wednesday 8 March in your diaries and register for an online repository training session for cultural heritage professionals.

This is the second event in the British Library’s Repository Training Programme for cultural heritage professionals, which is funded as part of AHRC’s iDAH programme to support GLAM organisations in establishing or expanding open scholarship activities and sharing their outputs through research repositories. You can read more about the development of this training programme in an earlier blog post.

Our first event was delivered in person in Edinburgh, hosted by the National Museum of Scotland, on 31 January 2023. It gave the audience a starting point for building broader knowledge in the areas of open scholarship. Attendees had diverse backgrounds and roles such as library manager, collections manager, research manager, research coordinator, repository staff, curatorial staff and collaborative PhD students in cultural heritage organisations. The event created a forum to discuss different aspects of research activities, and the challenges and opportunities in building openness in GLAM research and sharing research outputs.

Photograph of a group of people standing together at an event
Group photo from The British Library’s Repository Training Programme for Cultural Heritage Professionals, National Museum of Scotland, 31 January 2023.

Our upcoming online session on 8 March is designed as a follow-up to the initial in-person event, to explore further some of the topics identified by the attendees in Edinburgh. Topics include, but are not limited to, research activities in GLAM, benefits of research repositories, persistent identifiers, research data management, copyright and rights management. It is intended for those who are working in cultural heritage or a collection-holding organisation in roles where they are involved in managing digital collections, supporting the research lifecycle from funding to dissemination, providing research infrastructure and developing policies. However, anyone interested in the given topics is welcome to attend.

The event will take place from 14:00 to 17:00 on Wednesday 8 March. Please register at this link to receive your access details for this online session; the programme is displayed on the booking form.

Photograph of a group of people standing together at an event looking at post-it-notes on a wall

Many post-it-notes stuck on a wall near a drawing of a ship
Photos taken at an earlier workshop for The British Library’s Repository Training Programme for Cultural Heritage Professionals, National Museum of Scotland, 31 January 2023.


Our next in-person event will be held on Thursday 23 March in York, hosted by the University of York. It will be an updated re-run of the first face-to-face event in Edinburgh. More information about the programme will be provided in a couple of weeks, but registration is already open for those who would like to join us there.

Please contact [email protected] if you have any questions or comments about this training programme.
