Digital scholarship blog

Enabling innovative research with British Library digital collections

220 posts categorized "Projects"

28 February 2023

Legacies of Catalogue Descriptions Project Events at Yale

In January James Baker and I visited the Lewis Walpole Library at Yale, who are the US partner of the Legacies of catalogue descriptions collaboration. The visit had to be postponed several times due to the pandemic, so we were delighted to finally meet in person with Cindy Roman, our counterpart at Yale. The main reason for the trip was to disseminate the findings of our project by running workshops on tools for computational analysis of catalogue data and delivering talks about Researching the Histories of Cataloguing to (Try to) Make Better Metadata. Two of these events were kindly hosted by Kayla Shipp, Programme Manager of the fabulous Franke Family Digital Humanities Lab (DH Lab).

A photo of Cindy Roman, Rossitza Atanassova, James Baker and Kayla Shipp standing in a line in the middle of the Yale Digital Humanities Lab
(left to right) Cindy Roman, Rossitza Atanassova, James Baker and Kayla Shipp in the Yale Digital Humanities Lab

This was my first visit to the Yale University campus, so I took the opportunity to explore its iconic library spaces, including the majestic Sterling Memorial Library building, a masterpiece of Gothic Revival architecture, and the world-renowned Beinecke Rare Book & Manuscript Library, whose glass tower inspired the King’s Library Tower at the British Library. As well as being amazing hubs for learning and research, the Library buildings and exhibition spaces are also open to public visitors. At the time of my visit I explored the early printed treasures on display at the Beinecke Library, the exhibit about Martin Luther King Jr’s connection with Yale and the splendid display of highlights from Yale’s Slavic collections, including Vladimir Nabokov’s CV for a job application to Yale and a family photo album that belonged to the Romanovs.

A selfie of Rossitza Atanassova with the building of the Sterling Memorial Library in the background
Outside Yale's Sterling Memorial Library

A real highlight of my visit was the day I spent at the Lewis Walpole Library (LWP), located in Farmington, about 40 miles from the Yale campus. The LWP is a research centre for eighteenth-century studies and an essential resource for the study of Horace Walpole. The collections, including important holdings of British prints and drawings, were donated to Yale by Wilmarth and Annie Lewis in the 1970s, together with several eighteenth-century historic buildings and land.

Prior to my arrival James had conducted archival research with the catalogues of the LWP satirical prints collections, a case study for our project. As well as visiting the modern reading room to take a look at the printed card catalogues, many in the hand of Mrs Lewis, we were given a tour of Mr and Mrs Lewis’ house, which is now used for classes, workshops and meetings. I enjoyed meeting the LWP staff and learned much about the history of the place, the collectors' lives and the LWP's current initiatives.

One of the historic buildings on the Lewis Walpole Library site - The Root House, a white Georgian-style building with a terrace, used to house visiting fellows and guests
The Root House which houses residential fellows

 

One of the historic buildings on the Lewis Walpole Library site - a red-coloured building surrounded by trees
Thomas Curricomp House

 

The main house, a white Georgian-style house, seen from the side, with the entrance to the Library on the left
The Cowles House, where Mr and Mrs Lewis lived

 

The two project events I was involved with took place at the Yale DH Lab. During the interactive workshop, Yale Library staff, faculty and students worked through the training materials on using AntConc for computational analysis and performed a number of tasks with the LWP satirical prints descriptions. There were discussions about the different ways of querying the data and the suitability of this tool for use with non-European languages and scripts. It was great to hear that this approach could prove useful for querying and promoting Yale’s own open access metadata.

 

James talking to a group of people seated at a table, with a screen behind him showing some text data
James presenting at the workshop about AntConc
Rossitza standing next to a screen with a slide about her talk facing the audience
Rossitza presenting her research with incunabula catalogue descriptions

 

The talks addressed questions around cataloguing labour and curatorial voices, and the extent to which computational analysis enables new research questions and can assist practitioners with remedial work involving collections metadata. I spoke about my current RLUK fellowship project with the British Library incunabula descriptions and in particular the history of cataloguing, the process to output text data and some hypotheses to be tested through computational analysis. The following discussion raised questions about the effort that goes into this type of work and the need to balance greater user access to library and archival collections with the very important considerations about the quality and provenance of metadata.

During my visit I had many interesting conversations with Yale Library staff, Nicole Bouché, Daniel Lovins and Daniel Dollar, and caught up with folks I had met at the 2022 IIIF Conference, Tripp Kirkpatrick, Jon Manton and Emmanuelle Delmas-Glass. I was curious to learn about recent organisational changes aimed at unifying the Yale special collections and enhancing digital access via IIIF metadata, and about the new roles of Director of Computational Data and Methods, in charge of the DH Lab, and Cultural Heritage Data Engineer, to transform Yale data into LOUD (Linked Open Usable Data).

This has been a truly informative and enjoyable visit and my special thanks go to Cindy Roman and Kayla Shipp who hosted my visit and project events at the start of a busy term and to James for the opportunity to work with him on this project.

This blogpost is by Dr Rossitza Atanassova, Digital Curator for Digitisation, British Library. She is on Twitter @RossiAtanassova  and Mastodon @[email protected]

06 February 2023

A Year In Three Wikithons: The Lord Chamberlain's Plays

The second year of the Wikimedia residency has allowed us to focus on the work being done on the Lord Chamberlain’s Plays, in particular the excellent research by Professor Kate Dossett (University of Leeds). Kate teaches American History at the University of Leeds, and is currently working on ‘Black Cultural Archives & the Making of Black Histories: Archives of Surveillance and Black Transnational Theatre’, a project supported by an Independent Social Research Foundation Fellowship and a Fellowship from the Eccles Centre. Her work focuses on the understudied area of Black theatre history in the first half of the twentieth century, and when we had the chance to collaborate, we leapt at it!

One of the things we wanted to do was run a series of three Wikithons, each celebrating a different aspect of the collection: in this case, the role of women; the ways in which censorship impacted creativity for Black theatre makers; and the political surveillance of Black creatives. Alongside these Wikithons, we are developing a Wikibase structure to enable users to search the Lord Chamberlain’s Plays index cards from anywhere in the world. A blog on this work is forthcoming.

What transpired from our Wikithon dream was a series of three excellent events, interactions and collaborative work with a number of exceptional researchers and historians, all mixed in with a year of administrative tumult as we felt the impact of numerous strikes (academic and transport), the Royal funeral and the ongoing implications of the pandemic. 

This was an important learning opportunity for us to examine the role and impact of Wikithons, and consider different methods of delivery and engagement, tying into bigger conversations happening around Wikipedia on an international scale. It was a year in three Wikithons!

Event One (March 2022)

Our first event took place in March 2022. Having only just gotten over the dreaded Covid myself, the long-term impact of the pandemic was sorely felt: we were just out of some winter restrictions, and we felt it was best to hold this event as an online session, due to the uncertainty of the months ahead. Further to this, we had to look at dates that would not interrupt or clash with the ongoing University and College Union strikes. Once we had this in hand, we were ready to open the (virtual) doors to Black Theatre and the Archive: Making Women Visible, 1900-1950.

We were lucky to have speakers from the Library, Alexander Lock and Laura Walker, to talk about and contextualise the materials, while Kate herself offered a thematic and political overview of the importance of the work we were to embark upon. Despite the strikes, the pandemic and the demands of early 2022, 9 editors added over 1,600 words, 21 references and 84 total edits. Changes made on this day have now been viewed over 25,000 times. For a small batch of changes, that is a significant impact! Articles edited included Elisabeth Welch, Anna Lucasta and Edric Connor. I was grateful to Stuart Prior and Dr Francesca Allfrey for the training support at this event, and to Heather Pascall from the News Reference Team who offered her expertise on the day. The British Newspaper Archive also gave us access to their online resource for this event, which was both generous and very helpful.

Image of Pauline Henriques, BBC UK Government, Public domain, via Wikimedia Commons

Event Two (November 2022)

After a summer of political upheaval, a royal funeral and further transport strikes, we finally made it to Leeds Playhouse on the 7th of November 2022. As luck would have it, there was a train strike running that day, but as most of our participants were local to Leeds, there was thankfully very little impact on our numbers. Leeds Playhouse was the perfect home for this Wikithon: Furnace Producer Rio Matchett was a fantastic ambassador for the venue, and made sure we were fed and watered in style. Hope Miyoba was there to support me in training both sessions and I am so grateful to her for her support, particularly as my laptop wasn’t working!

We took over the Playhouse for the full day, running Wikithon sessions in the morning and afternoon, with a lunchtime talk by Joe Williams of Heritage Corner Leeds which was attended by morning and afternoon attendees, as well as some members of the public. Joe’s talk on Sankofa Yorkshire was a brilliant overview of Black creativity in the Leeds area throughout history, and informed a lot of our conversation around the politics and practicalities of Wiki editing in an equitable way. Articles edited included Una Marson, a central figure in Kate’s research and the Lord Chamberlain’s Plays.

It was fantastic to be in person again, and to meet the excellent community of creatives at Leeds Playhouse. Joe’s talk was inspirational and the questions it provoked regarding the way in which the Wikimedia guidelines for notability can negatively impact the prevalence of Black creatives on Wiki were a much needed point of discussion.

Image of Leeds Playhouse illuminated at night
Anthony Robling, CC BY-SA 4.0, via Wikimedia Commons

Event Three (January 2023)

Our arrival at the iconic National Archives building at Kew was long awaited and months in the planning. Drs Jo Pugh and Kevin Searle were exceptionally helpful and supportive as we planned our way to the ‘Black Theater Making and Surveillance’ event in January 2023. We were delighted to be in the building, and even happier to welcome Perry Blankson of the Young Historians Project to present his work on The Secret War on Black Power in Britain and the Caribbean. We gathered in a central space in the Archives, where Dr Searle had curated an amazing selection of archival materials for participants to view and utilise, including documents from the Information Research Department.

Some of the documents on display at Kew, image by the author

Our conversations on this day turned towards the idea of Wiki notability and the use of primary sources in establishing authority on Wikipedia in particular. I was grateful once again to Stuart Prior and Dr Francesca Allfrey for their support and training assistance, and moreover for the thoughtful and important conversations we fostered around the ways in which the politics of the present day can cloud and impact what happens on Wiki and how events and politics can be reported. A truly breathtaking moment was when Dr Searle and his colleagues allowed us to look at the Windrush manifest, a material reminder of a significant and hugely important moment in modern Britain. It was wonderful, also, to welcome Dr Cara Rodway, Head of Research Development and Philip Abraham, Deputy Head of the Eccles Centre, to join us in seeing this final event in the Wikithon series.

Image of the National Archives building in Kew on a sunny day
The National Archives, Kew by Christopher Hilton, CC BY-SA 2.0, via Wikimedia Commons

Conclusion

Despite a year of unforeseeable events, disruption and obstacles, I am immensely proud of what this series of Wikithons achieved, bringing aspects of modern society into direct conversation with our literary archives, asking questions about race, equality and diversity in Britain. We were lucky to work with creative practitioners and speakers like Joe Williams and Perry Blankson, and to be afforded the chance to really think about what it is to edit Wiki, and to try to improve the world in this way. It has allowed me to think more deeply about the wider Wiki conversations around how best to engage with and train new Wiki editors, and how to look at collections in new and impactful ways. I am very grateful to the American Trust for the British Library and the Eccles Centre for American Studies for their support in achieving this work.

This blogpost is by Dr Lucy Hinnie, Wikimedian in Residence, British Library. She is on Twitter @BL_Wikimedian.

28 October 2022

Learn more about Living with Machines at events this winter

Digital Curator, and Living with Machines Co-Investigator Dr Mia Ridge writes…

The Living with Machines research project is a collaboration between the British Library, The Alan Turing Institute and various partner universities. Our free exhibition at Leeds City Museum, Living with Machines: Human stories from the industrial age, opened at the end of July. Read on for information about adult events around the exhibition…

Museum Late: Living with Machines, Thursday 24 November, 2022

6 - 10pm Leeds City Museum • £5, booking essential https://my.leedstickethub.co.uk/19101

The first ever Museum Late at Leeds City Museum! Come along to experience the museum after hours with music, a pub quiz, weaving, informal workshops and chats with curators. Local food and drinks in the main hall.

Full programme: https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/museum-late-living-with-machines/

Tickets: https://my.leedstickethub.co.uk/19101

Study Day: Living with Machines, Friday December 2, 2022

10:00 am - 4:00 pm Online • Free but booking essential: https://my.leedstickethub.co.uk/18775

A unique opportunity to hear experts in the field illuminate key themes from the exhibition and learn how exhibition co-curators found stories and objects to represent research work in AI and digital history. This study day is online via Zoom so that you can attend from anywhere.

Full programme: https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/living-with-machines-study-day/

Tickets: https://my.leedstickethub.co.uk/18775

Living with Machines Wikithon, Saturday January 7, 2023

1 – 4:30pm Leeds City Museum • Free but booking essential: https://my.leedstickethub.co.uk/19104

Ever wanted to try editing Wikipedia, but haven't known where to start? Join us for a session with our brilliant Wikipedian-in-residence to help improve Wikipedia’s coverage of local lives and topics at an editathon themed around our exhibition. 

Everyone is welcome. You won’t require any previous Wiki experience but please bring your own laptop for this event. Find out more, including how you can prepare, in my blog post on the Living with Machines site, Help fill gaps in Wikipedia: our Leeds editathon.

The exhibition closes the next day, so it really is your last chance to see it!

Full programme: https://museumsandgalleries.leeds.gov.uk/events/leeds-city-museum/living-with-machines-wikithon-exploring-the-margins/

Tickets: https://my.leedstickethub.co.uk/19104

If you just want to try out something more hands on with textiles inspired by the exhibition, there's also a Peg Loom Weaving Workshop, and not one but two Christmas Wreath Workshops.

You can find out more about our exhibition on the Living with Machines website.


20 September 2022

Learn more about what AI means for us at Living with Machines events this autumn

Digital Curator, and Living with Machines Co-Investigator Dr Mia Ridge writes…

The Living with Machines research project is a collaboration between the British Library, The Alan Turing Institute and various partner universities. Our free exhibition at Leeds City Museum, Living with Machines: Human stories from the industrial age, opened at the end of July. Read on for information about adult events around the exhibition…

AI evening panels and workshop, September 2022

We’ve put together some great panels with expert speakers guaranteed to get you thinking about the impact of AI with their thought-provoking examples and questions. You'll have a chance to ask your own questions in the Q&A, and to mingle with other attendees over drinks.

We’ve also collaborated with AI Tech North to offer an exclusive workshop looking at the practical aspects of ethics in AI. If you’re using or considering AI-based services or tools, this might be for you. Our events are also part of the jam-packed programme of the Leeds Digital Festival #LeedsDigi22, where we’re in great company.

The role of AI in Creative and Cultural Industries

Thu, Sep 22, 17:30 – 19:45 BST

Leeds City Museum • Free but booking required

https://www.eventbrite.com/e/the-role-of-ai-in-creative-and-cultural-industries-tickets-395003043737

How will AI change what we wear, the TV and films we watch, what we read? 

Join our fabulous Chair Zillah Watson (independent consultant, ex-BBC) and panellists Rebecca O’Higgins (Founder KI-AH-NA), Laura Ellis (Head of Technology Forecasting, BBC) and Maja Maricevic, (Head of Higher Education and Science, British Library) for an evening that'll help you understand the future of these industries for audiences and professionals alike. 

Maja's written a blog post on The role of AI in creative and cultural industries with more background on this event.

 

Workshop: Developing ethical and fair AI for society and business

Thu, Sep 29, 13:30 - 17:00 BST

Leeds City Museum • Free but booking required

https://www.eventbrite.com/e/workshop-developing-ethical-and-fair-ai-for-society-and-business-tickets-400345623537

 

Panel: Developing ethical and fair AI for society and business

Thu, Sep 29, 17:30 – 19:45 BST

Leeds City Museum • Free but booking required

https://www.eventbrite.com/e/panel-developing-ethical-and-fair-ai-for-society-and-business-tickets-395020706567

AI is coming, so how do we live and work with it? What can we all do to develop ethical approaches to AI to help ensure a more equal and just society? 

Our expert Chair, Timandra Harkness, and panellists Sherin Mathew (Founder & CEO of AI Tech UK), Robbie Stamp (author and CEO at Bioss International), Keely Crockett (Professor in Computational Intelligence, Manchester Metropolitan University) and Andrew Dyson (Global Co-Chair of DLA Piper’s Data Protection, Privacy and Security Group) will present a range of perspectives on this important topic.

If you missed our autumn events, we also have a study day and Wikipedia editathon this winter. You can find out more about our exhibition on the Living with Machines website.


20 April 2022

Importing images into Zooniverse with a IIIF manifest: introducing an experimental feature

Digital Curator Dr Mia Ridge shares news from a collaboration between the British Library and Zooniverse that means you can more easily create crowdsourcing projects with cultural heritage collections. There's a related blog post on Zooniverse, Fun with IIIF.

IIIF manifests - text files that tell software how to display images, sound or video files alongside metadata and other information about them - might not sound exciting, but by linking to them, you can view and annotate collections from around the world. The IIIF (International Image Interoperability Framework) standard makes images (or audio, video or 3D files) more re-usable - they can be displayed on another site alongside the original metadata and information provided by the source institution. If an institution updates a manifest - perhaps adding information from updated cataloguing or crowdsourcing - any sites that display that image automatically get the updated metadata.
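To make the idea of a manifest more concrete, here is a minimal sketch in Python of reading image addresses and metadata out of one. It assumes the manifest follows the IIIF Presentation API 2.x layout (sequences containing canvases containing images), with plain-string labels and values. The sample manifest, its URLs and its shelfmark are made up for illustration and are not a real British Library manifest.

```python
# A tiny, made-up manifest in the shape of IIIF Presentation API 2.x.
# All identifiers and values below are hypothetical.
SAMPLE_MANIFEST = {
    "@context": "http://iiif.io/api/presentation/2/context.json",
    "@id": "https://example.org/iiif/playbills-vol1/manifest.json",
    "@type": "sc:Manifest",
    "label": "Playbills, volume 1 (hypothetical)",
    "metadata": [{"label": "Shelfmark", "value": "Example 123"}],
    "sequences": [
        {
            "@type": "sc:Sequence",
            "canvases": [
                {
                    "@type": "sc:Canvas",
                    "label": "1",
                    "images": [
                        {"resource": {"@id": "https://example.org/iiif/img1/full/full/0/default.jpg"}}
                    ],
                },
                {
                    "@type": "sc:Canvas",
                    "label": "2",
                    "images": [
                        {"resource": {"@id": "https://example.org/iiif/img2/full/full/0/default.jpg"}}
                    ],
                },
            ],
        }
    ],
}


def list_images(manifest):
    """Return (canvas label, image URL) pairs from a 2.x-style manifest."""
    pairs = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                pairs.append((canvas.get("label"), image["resource"]["@id"]))
    return pairs


def metadata_fields(manifest):
    """Return the manifest-level metadata as a simple label -> value dict."""
    return {entry["label"]: entry["value"] for entry in manifest.get("metadata", [])}


images = list_images(SAMPLE_MANIFEST)
fields = metadata_fields(SAMPLE_MANIFEST)
```

Because the images are referenced by URL rather than copied, a site built on a function like `list_images` would pick up any changes the next time it reads the manifest. Manifests in other versions of the standard are structured differently, so this sketch would need adapting for them.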

Playbill showing the title after other large text

We've posted before about how we used IIIF manifests as the basis for our In the Spotlight crowdsourced tasks on LibCrowds.com. Playbills are great candidates for crowdsourcing because they are hard to transcribe automatically, and the layout and information present varies a lot. Using IIIF meant that we could access images of playbills directly from the British Library servers without needing server space and extra processing to make local copies. You didn't need technical knowledge to copy a manifest address and add a new volume of playbills to In the Spotlight. This worked well for a couple of years, but over time we'd found it difficult to maintain bespoke software for LibCrowds.

When we started looking for alternatives, the Zooniverse platform was an obvious option. Zooniverse hosts dozens of historical or cultural heritage projects, and hundreds of citizen science projects. It has millions of volunteers, and a 'project builder' that means anyone can create a crowdsourcing project - for free! We'd already started using Zooniverse for other Library crowdsourcing projects such as Living with Machines, which showed us how powerful the platform can be for reaching potential volunteers. 

But that experience also showed us how complicated the process of getting images and metadata onto Zooniverse could be. Using Zooniverse for volumes of playbills for In the Spotlight would require some specialist knowledge. We'd need to download images from our servers, resize them, generate a 'manifest' list of images and metadata, then upload it all to Zooniverse; and repeat that for each of the dozens of volumes of digitised playbills.
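As a rough illustration of the 'manifest list' step in that manual process, the sketch below writes a small table with one row per image and its metadata, using Python's standard csv module. The filenames, column names and values are hypothetical, and the exact columns a given Zooniverse project needs may differ, so treat this as an outline of the kind of work involved, not as the required format.

```python
import csv
import io


def build_manifest_csv(rows, fieldnames):
    """Write one CSV row per image, with a header row, and return the text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical images from one volume of playbills, already downloaded and resized.
rows = [
    {"filename": "vol1_001.jpg", "volume": "Volume 1", "page": 1},
    {"filename": "vol1_002.jpg", "volume": "Volume 1", "page": 2},
]

manifest_csv = build_manifest_csv(rows, ["filename", "volume", "page"])
```

Even in this simplified form, the table has to be regenerated and uploaded with the images for every volume, which is the repetition the IIIF import was intended to remove.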

Fast forward to summer 2021, when we had the opportunity to put a small amount of funding into some development work by Zooniverse. I'd already collaborated with Sam Blickhan at Zooniverse on the Collective Wisdom project, so it was easy to drop her a line and ask if they had any plans or interest in supporting IIIF. It turns out they had, but hadn't had the resources or an interested organisation necessary before.

We came up with a brief outline of what the work needed to do, taking the ability to recreate some of the functionality of In the Spotlight on Zooniverse as a goal. Therefore, 'the ability to add subject sets via IIIF manifest links' was key. ('Subject set' is Zooniverse-speak for 'set of images or other media' that are the basis of crowdsourcing tasks.) And of course we wanted the ability to set up some crowdsourcing tasks with those items… The Zooniverse developer, Jim O'Donnell, shared his work in progress on GitHub, and I was very easily able to set up a test project and ask people to help create sample data for further testing. 

If you have a Zooniverse project and a IIIF address to hand, you can try out the import for yourself: add 'subject-sets/iiif?env=production' to your project builder URL. e.g. if your project is number #xxx then the URL to access the IIIF manifest import would be https://www.zooniverse.org/lab/xxx/subject-sets/iiif?env=production
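If you manage several projects, the address pattern above can be written as a small helper. This is only a sketch in Python: the function name is my own, the project number 1234 is a placeholder, and the pattern is simply the one quoted in this post for an experimental feature, so it may change.

```python
def iiif_import_url(project_id):
    """Build the project builder address for the experimental IIIF import."""
    return f"https://www.zooniverse.org/lab/{project_id}/subject-sets/iiif?env=production"


# 1234 is a placeholder, not a real project number.
example_url = iiif_import_url(1234)
```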

Paste a manifest URL into the box. The platform parses the file to present a list of metadata fields, which you can flag as hidden or visible in the subject viewer (public task interface). When you're happy, you can click a button to upload the manifest as a new subject set (like a folder of items), and your images are imported. (Don't worry if it says '0 subjects'.)

 

Screenshot of manifest import screen

You can try out our live task and help create real data for testing ingest processes at https://frontend.preview.zooniverse.org/projects/bldigital/in-the-spotlight/classify

This is a very brief introduction, with more to come on managing data exports and IIIF annotations once you've set up, tested and launched a crowdsourced workflow (task). We'd love to hear from you - how might this be useful? What issues do you foresee? How might you want to expand or build on this functionality? Email [email protected] or tweet @mia_out @LibCrowds. You can also comment on GitHub https://github.com/zooniverse/Panoptes-Front-End/pull/6095 or https://github.com/zooniverse/iiif-annotations

Digital work in libraries is always collaborative, so I'd like to thank British Library colleagues in Finance, Procurement, Technology, Collection Metadata Services and various Collections departments; the Zooniverse volunteers who helped test our first task and of course the Zooniverse team, especially Sam, Jim and Chris for their work on this.

 

18 March 2022

Looking back at LibCrowds: surveying our participants

'In the Spotlight' is a crowdsourcing project from the British Library that aims to make digitised historical playbills more discoverable, while also encouraging people to closely engage with this otherwise less accessible collection. Digital Curator Dr Mia Ridge writes...

If you follow our @LibCrowds account on Twitter, you might have noticed that we've been working on refreshed versions of our In the Spotlight tasks on Zooniverse. That's part of a small project to enable the use of IIIF manifests on Zooniverse - in everyday language, it means that many, many more digitised items can form the basis of crowdsourcing tasks in the Zooniverse Project Builder, and In the Spotlight is the first project to use this new feature. Along with colleagues in Printed Heritage and BL Labs, I've been looking at our original Pybossa-based LibCrowds site to plan a 'graceful ending' for the first phase of the project on LibCrowds.com.

As part of our work documenting and archiving the original LibCrowds site, I'm delighted to share summary results from a 2018 survey of In the Spotlight participants, now published on the British Library's Research Repository: https://doi.org/10.23636/w4ee-yc34. Our thanks go to Susan Knight, Customer Insight Coordinator, for her help with the survey.

The survey was designed to help us understand who In the Spotlight participants were, and to help us prioritise work on the project. The 22-question survey was based on earlier surveys run by the Galaxy Zoo and Art UK Tagger projects, to allow comparison with other crowdsourcing projects, and to contribute to our understanding of crowdsourcing in cultural heritage more broadly. It was open to anyone who had contributed to the British Library's In the Spotlight project for historical playbills. The survey was distributed to LibCrowds newsletter subscribers, on the LibCrowds community forum and on social media.

Some headline findings from our survey include:

  • Respondents were most likely to be women with a Masters degree, in full-time employment, in London or Southeast UK, who contribute in a break between other tasks or 'whenever they have spare time'.
  • 76% of respondents were motivated by contributing to historical or performance research

Responses to the question 'What was it about this project which caused you to spend more time than intended on it?':

  • Easy to do
  • It's so entertaining
  • Every time an entry is completed you are presented with another item which is interesting and illuminating which provides a continuous temptation regarding what you might discover next
  • simplicity
  • A bit of competitiveness about the top ten contributors but also about contributing something useful
  • I just got carried away with the fun
  • It's so easy to complete
  • Easy to want to do just a few more
  • Addiction
  • Felt I could get through more tasks
  • Just getting engrossed
  • It can be a bit addictive!
  • It's so easy to do that it's very easy to get carried away.
  • interested in the [material]

The summary report contains more rich detail, so go check it out!

 

Detail of the front page of libcrowds.com; Crowdsourcing projects from the British Library. 2,969 Volunteers. 265,648 Contributions. 175 Projects

14 March 2022

The Lotus Sutra Manuscripts Digitisation Project: the collaborative work between the Heritage Made Digital team and the International Dunhuang Project team

Digitisation has become one of the key tasks for the curatorial roles within the British Library. This is supported by two main pillars: the accessibility of the collection items to everybody around the world and the preservation of unique and sometimes very fragile items. Digitisation involves many different teams and workflow stages including retrieval, conservation, curatorial management, copyright assessment, imaging, workflow management, quality control, and the final publication to online platforms.

The Heritage Made Digital (HMD) team works across the Library to assist with digitisation projects. An excellent example of the collaborative nature of the relationship between the HMD and International Dunhuang Project (IDP) teams is the quality control (QC) of the Lotus Sutra Project’s digital files. It is crucial that images meet the quality standards of the digital process. As a Digitisation Officer in HMD, I am in charge of QC for the Lotus Sutra Manuscripts Digitisation Project, which is currently conserving and digitising nearly 800 Chinese Lotus Sutra manuscripts to make them freely available on the IDP website. The manuscripts were acquired by Sir Aurel Stein after they were discovered  in a hidden cave in Dunhuang, China in 1900. They are thought to have been sealed there at the beginning of the 11th century. They are now part of the Stein Collection at the British Library and, together with the international partners of the IDP, we are working to make them available digitally.

The majority of the Lotus Sutra manuscripts are scrolls and, after they have been treated by our dedicated Digitisation Conservators, our expert Senior Imaging Technician Isabelle does an outstanding job of imaging the fragile manuscripts. My job is then to prepare the images for publication online. This includes checking that they have the correct technical metadata such as image resolution and colour profile, are an accurate visual representation of the physical object and that the text can be clearly read and interpreted by researchers. After nearly 1000 years in a cave, it would be a shame to make the manuscripts accessible to the public for the first time only for them to be obscured by a blurry image or a wayward piece of fluff!

With the scrolls measuring up to 13 metres long, most are too long to be imaged in one go. They are instead shot in individual panels, which our Senior Imaging Technicians digitally “stitch” together to form one big image. This gives online viewers a sense of the physical scroll as a whole, in a way that would not be possible in real life for those scrolls that are more than two panels in length unless you have a really big table and a lot of specially trained people to help you roll it out. 

Photo showing the three individual panels of Or.8210S/1530R with breaks in between
Or.8210/S.1530: individual panels
Photo showing the three panels of Or.8210S/1530R as one continuous image
Or.8210/S.1530: stitched image

 

This post-processing can create issues, however. Sometimes an error in the stitching process can cause a scroll to appear warped or wonky. In the stitched image for Or.8210/S.6711, the ruled lines across the top of the scroll appeared wavy and misaligned. But when I compared this with the images of the individual panels, I could see that the lines on the scroll itself were straight and unbroken. It is important that the digital images faithfully represent the physical object as far as possible; we don’t want anyone thinking these flaws are in the physical item and writing a research paper about ‘Wonky lines on Buddhist Lotus Sutra scrolls in the British Library’. Therefore, I asked the Senior Imaging Technician to restitch the images together: no more wonky lines. However, we accept that the stitched images cannot be completely accurate digital surrogates, as they are created by the Imaging Technician to represent the item as it would be seen if it were to be unrolled fully.

 

Or.8210/S.6711: distortion from stitching. The ruled line across the top of the scroll is bowed and misaligned

 

Similarly, our Senior Imaging Technician applies ‘digital black’ to make the image background a uniform colour. This is to hide any dust or uneven background and ensure the object is clear. If this is accidentally overused, it can make it appear that a chunk has been cut out of the scroll. Luckily this is easy to spot and correct, since we retain the unedited TIFFs and RAW files to work from.

 

Or.8210/S.3661, panel 8: overuse of digital black when filling in tear in scroll. It appears to have a large black line down the centre of the image.

 

Sometimes the scrolls themselves are wonky, dirty or incomplete. They are hundreds of years old, and this is where it can become tricky to work out whether there is an issue with the images or with the scroll itself. The stains, tears and dirt shown in the images below are part of the scrolls and their material history. They give clues to how the manuscripts were made, stored, and used. This is all of interest to researchers and we want to make sure to preserve and display these features in the digital versions. The best part of my job is finding interesting things like this. The fourth image below shows a fossilised insect covering the text of the scroll!

 

Black stains: Or.8210/S.2814, panel 9
Torn and fragmentary panel: Or.8210/S.1669, panel 1
Insect droppings obscuring the text: Or.8210/S.2043, panel 1
Fossilised insect covering text: Or.8210/S.6457, panel 5

 

We want to minimise the handling of the scrolls as much as possible, so we will only reshoot an image if it is absolutely necessary. For example, I would ask a Senior Imaging Technician to reshoot an image if debris is covering the text and making it unreadable, but only after inspecting the scroll to ensure the debris can be safely removed and is not stuck to the surface. However, if some debris, such as a small piece of fluff, paper or hair, appears on the scroll’s surface but is not obscuring any text, then I would not ask for a reshoot. If it does not affect the readability of the text, or any potential future OCR (Optical Character Recognition) or handwriting analysis, it is not worth the risk of damage that could be caused by extra handling.

Reshoot: Or.8210/S.6501: debris over text  /  No reshoot: Or.8210/S.4599: debris not covering text.
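
The reshoot rule described above can be written down as a very small decision function. This is a sketch only, with a hypothetical `DebrisObservation` record and `should_reshoot` function; in practice both inputs come from a person looking at the image and then at the scroll itself.

```python
from dataclasses import dataclass


@dataclass
class DebrisObservation:
    """What a QC reviewer has recorded about debris on one image (hypothetical)."""
    obscures_text: bool      # debris covers text and makes it unreadable
    safely_removable: bool   # scroll inspected: debris is loose, not stuck to the surface


def should_reshoot(obs: DebrisObservation) -> bool:
    """Reshoot only when text is obscured AND the debris can be removed safely.

    Debris that leaves the text readable is never worth the extra handling.
    """
    return obs.obscures_text and obs.safely_removable
```

Under these assumptions, `should_reshoot(DebrisObservation(obscures_text=True, safely_removable=True))` is the only combination that asks for a reshoot.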

 

These are a few examples of the things to which the HMD Digitisation Officers pay close attention during QC. Only through this careful process can we ensure that the digital images accurately reflect the physicality of the scrolls and represent their original features. By developing a QC process that applies the best techniques and procedures, working to defined standards and guidelines, we succeed in making these incredible items accessible to the world.

Read more about the Lotus Sutra Project here: IDP Blog

IDP website: IDP.BL.UK

And IDP Twitter: @IDP_UK

Dr Francisco Perez-Garcia

Digitisation Officer, Heritage Made Digital: Asian and African Collections

Follow us @BL_MadeDigital

10 March 2022

Scoping the connections between trusted arts and humanities data repositories

CONNECTED: Connecting trusted Arts and Humanities data repositories is a newly funded activity, supported by AHRC. It is led by the British Library, with the Archaeology Data Service and the Oxford Text Archive as co-investigators, and is supported by consultants from MoreBrains Cooperative. The CONNECTED team believes that improving discovery and curation of heritage and emergent content types in the arts and humanities will increase the impact of cultural resources, and enhance equity. Great work is already being done on discovery services for the sector, so we decided to look upstream, and focus on facilitating repository and archive deposit.

The UK boasts a dynamic institutional repository environment in the HE sector, as well as a range of subject- or field-specific repositories. Although a distributed repository landscape is now firmly established, challenges and inefficiencies remain that reduce its impact. These include issues around discovery and access, but also questions around interoperability, the relationship between specialised and general infrastructures, and potential duplication of effort from an author/depositor perspective. Greater coherence and interoperability will effectively unite different trusted repository services to form a resilient distributed data service, which can grow over time as new individual services are required and developed. Alongside the other projects funded as part of ‘Scoping future data services for the arts and humanities’, CONNECTED will help to deliver this unified network.

As practice in the creative arts becomes more digital and the digital humanities continue to thrive, the diversity of ways in which this research is expressed continues to grow. Researchers are increasingly able to combine artefacts, documents, and materials in new and innovative ways; practice-based research in the arts is creating a diverse range of (often complex) outputs, creating new curation and discovery needs; and heritage collections often contain artefacts with large amounts of annotation and commentary amassed over years or centuries, across multiple formats, and with rich contextual information. This expansion is already exposing the limitations of our current information systems, with the potential for vital context and provenance to become invisible. Without additional, careful future-proofing, the risks of information loss and limits on access will only expand. Metadata creation, deposit, preservation, and discovery strategies should therefore be tailored to meet the very different needs of the arts and humanities.

A number of initiatives are aimed at improving interoperability between metadata sources in ways that are more oriented towards the needs of the arts and humanities. Drawing these together with the insights to be gained from the abilities (and limitations) of bibliographic and data-centric metadata and discovery systems will help to generate robust services in the complex, evolving landscape of arts and humanities research and creation.

The CONNECTED project will assemble experts, practitioners, and researchers to map current gaps in the content curation and discovery ecosystem and weave together the strengths and potentials of a range of platforms, standards, and technologies in the service of the arts and humanities community. Our activities will run until the end of May, and will comprise three phases:

Phase 1 - Discovery

We will focus on repository or archive deposit as a foundation for the discovery and preservation of diverse outputs, and also as a way to help capture the connections between those objects and the commentary, annotation, and other associated artefacts. 

A data service for the arts and humanities must be developed with researcher needs as a priority, so the project team will engage in a series of semi-structured interviews with a variety of stakeholders including researchers, librarians, curators, and information technologists. The interviews will explore the following ideas:

  • What do researchers need when engaging in discovery of both heritage materials and new outputs?
  • Are there specific needs that relate to different types of content or use-cases? For example, research involving multimedia or structured information processing at scale?
  • What can the current infrastructure support, and where are the gaps between what we have and what we need?
  • What are the feasible technical approaches to transform information discovery?

Phase 2 - Data service programme scoping and planning

The findings from phase 1 will be synthesised using a commercial product strategy approach known as a canvas analysis. Based on the initial impressions from the semi-structured interviews, it is likely that an agile, product, or value proposition canvas will be used to synthesise the findings and structure thinking so that a coherent and robust strategy can be developed. Outputs from the strategy canvas exercise will then be applied to a fully costed and scoped product roadmap and budget for a national data deposit service for the arts and humanities.

Phase 3 - Scoping a unified archiving solution

Building on the partnerships and conversations from the previous phases, the feasibility of a unified ‘deposit switchboard’ will be explored. The purpose of such a switchboard is to enable researchers, curators, and creators to easily deposit items in the most appropriate repository or archive in their field for the object type they are uploading. Using insights gained from the landscaping interviews in phase 1, the team will identify potential pathways to developing a routing service for channelling content to the most appropriate home.
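
To make the routing idea concrete, here is a minimal sketch of what a switchboard lookup might look like. Everything in it is an assumption for illustration: the project is still at the scoping stage, and the routing table, the field and object-type labels and the `route_deposit` function are hypothetical rather than a description of any planned service.

```python
from typing import Dict, Optional, Tuple

# Hypothetical routing table: (research field, object type) -> repository.
# A real service would draw these rules from the phase 1 interviews.
ROUTES: Dict[Tuple[str, str], str] = {
    ("archaeology", "dataset"): "Archaeology Data Service",
    ("literature", "text corpus"): "Oxford Text Archive",
}


def route_deposit(field: str,
                  object_type: str,
                  routes: Optional[Dict[Tuple[str, str], str]] = None
                  ) -> Optional[str]:
    """Suggest a repository for a deposit, or None if no rule matches.

    None signals that the item needs a human decision rather than a guess.
    """
    table = ROUTES if routes is None else routes
    key = (field.strip().lower(), object_type.strip().lower())
    return table.get(key)
```

Returning `None` instead of a default home is a deliberate choice in this sketch: it keeps unmatched deposits visible, which is where the gaps in the current landscape would show up.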

We will conclude with a virtual community workshop to explore the challenges and desirability of the switchboard approach, with a special focus on the benefits this could bring to the uploader of new content and resources.

This is an ambitious project, through which we hope to deliver:

  • A fully costed and scoped technical and organisational roadmap to build the required components and framework for the National Collection
  • Improved usage of resources in the wider GLAM and institutional network, including of course the Archaeology Data Service, The British Library's Shared Research Repository, and the Oxford Text Archive
  • Steps towards a truly community-governed data infrastructure for the arts and humanities as part of the National Collection

As a result of this work, access to UK cultural heritage and outputs will be accelerated and simplified, the impact of the arts and humanities will be enhanced, and we will help the community to consolidate the UK's position as a global leader in digital humanities and infrastructure.

This post is from Rachael Kotarski (@RachPK), Principal Investigator for CONNECTED, and Josh Brown from MoreBrains.
