Digital Research and the year that was
With the conclusion of another successful British Library Labs Symposium, and what has been a rather unusual year, it is a good time to reflect on some of the things that the Digital Research Team at the British Library has been busy with – and some of our plans for the coming year too. Despite pandemic-related challenges, we managed to deliver various strands of work towards fulfilling our mission: to enable the use of the British Library’s digital collections for research, inspiration, creativity, and enjoyment.
We undertook innovative research, projects and collaborations, using digital methods on our collections to showcase their potential and improve access for our users. One such project is the Legacies of Catalogue Descriptions and Curatorial Voice: Opportunities for Digital Scholarship, with which Rossitza Atanassova from the team is involved. It is a 12-month collaboration between the Sussex Humanities Lab, the British Library, and Yale University Library, funded under the AHRC UK-US Collaboration for Digital Scholarship in Cultural Institutions: Partnership Development Grants scheme. The project covers computational, critical, and curatorial analysis of collection catalogues, and combines corpus linguistic methods and archival research to characterise curatorial “voice”. It also develops sectoral capability in digital scholarship through co-produced training materials and workshops.
Another example is our Living with Machines project, which has been thriving. For those of you who (still!) don’t know it, it is a collaborative project between the Alan Turing Institute, academics at UK universities and the British Library. Mia Ridge from the team is one of the project co-investigators. Drawing on her expertise in crowdsourcing, the project launched its first crowdsourcing tasks in late 2019, and watched volunteers finish them in the first weeks of lockdown. Contributors were asked to identify accidents reported in 19th-century newspapers – this proved to be very popular! As a next step, the project is launching more ‘language’-related tasks, such as reading short sections of newspaper articles that mention a ‘machine’ and then identifying what kind of machine was meant in each example.
Mia’s other crowdsourcing initiative, the In the Spotlight project hosted on the Library’s LibCrowds platform, is almost ready to celebrate a whopping quarter of a million contributions. She plans to bring both of those experiences to the process of collaboratively writing a book on crowdsourcing and digital participation in cultural heritage in a fortnight, through a networking project called Collective Wisdom, funded by the same AHRC scheme mentioned above.
One of our main goals is to make digital content as accessible as possible, and one of the most efficient ways to do this is to create machine-readable text from digitised material through OCR/HTR (Optical Character Recognition and Handwritten Text Recognition). Both Tom Derrick, who works on the Two Centuries of Indian Print project, and I (Adi) have been involved in work in this area. We made Arabic and Bengali OCR/HTR datasets available through the British Library repository, and delivered OCR/HTR training to BL staff (a one-day course and a Transkribus workshop).
Tom used Transkribus to OCR an entire series of books (ca. 150 early Bengali books) using the Bangla-trained model, and is now using Wikimedia Commons and Wikisource to present the text of the books alongside the scanned images. He plans to run a competition in partnership with the Bengali Wikisource community in spring 2021, encouraging volunteers to improve the OCR of Bengali books in Wikisource. The plan is also to make these transcriptions available as an open dataset and keyword searchable through the Library viewer.
We welcomed two British Library Collaborative Doctoral Students researching different aspects of the creation, production, consumption, value and collecting of UK digital comics: Linda Berube, “Understanding UK digital comics information and publishing practices: From creation to consumption”, City, University of London, supervised by Ian Cooke; and Thomas Gebhart, “Collecting UK Digital Comics: Social, cultural and technological factors for cultural institutions”, University of the Arts, supervised by Stella Wisdom.
Digital comics in the UK are at the cutting edge of how imaginative, immediate and emotionally engaging stories can be told in the 21st century. New creative tools, new formats, and new methods of distribution have expanded the reach of ideas communicated through digital comics. On top of embracing technological change, digital comics have the potential to reflect, embrace and contribute to social and cultural change in the UK. We look forward to reading more about Thomas and Linda’s research!
The British Library and its partners, Birkbeck University and The National Archives, have been awarded £222,420 in funding by the Institute of Coding (IoC) to co-develop a one-year, part-time Postgraduate Certificate (PGCert), Computing for Cultural Heritage, as part of a £4.8M university skills drive. Nora McGregor has co-ordinated this trial, aimed at information professionals working in the cultural heritage sector.
From deploying simple scripts for everyday tasks, to developing tools for analysing collections data, the British Library and The National Archives explored different ways to meet demand for such skills, arising particularly from colleagues in curatorial and collection-based roles. This trial explored a model whereby cultural heritage professionals could gain crucial computational skills, immediately relevant to their roles, while earning a formal qualification in computer science, with the express support of their institution.
A cohort of 20 staff from the British Library (12) and The National Archives in the UK (8) undertook two newly designed modules at Birkbeck University as part of the trial: Demystifying Computing with Python and Work-based project: Digital project design and development. Examples of some of the exciting projects the cohort undertook can be found on the project page. A final module, Analytic Tools for Information Professionals, is currently under development and will be launched as part of the full Applied Data Science Postgraduate Certificate starting in January 2021, and a final report on the trial will be available in spring 2021.
Boosting Staff Skills
This year we kept running our usual Digital Scholarship Training Programme, transitioning fully to online delivery. This is an internal training programme aimed at enabling Library staff to support and/or undertake digital research. We run different types of events, including courses, hands-on workshops (Hack & Yacks), talks and a reading group. We’ve had Deirdre Sullivan helping us run many events as smoothly as possible, around topics such as Library Carpentry, OpenRefine, Machine Learning and AI, OCR/HTR, emerging formats, and Wikimedia training.
Last week I finished co-ordinating and co-delivering a course on digital mapping with Gethin Rees and a few other colleagues. We took on board some of the lessons from the last time we ran the course, which were published in a Journal of Map & Geography Libraries article.
In total, this year, we hosted 50 events, with 1,167 overall attendees. These training opportunities reached 343 colleagues across 42 teams. Looking forward to seeing more of you next year!
One area of training worth focusing on is our collaboration with Wikimedia UK. During 2020, Stella Wisdom ran a programme of four online talks and five training sessions for BL staff, to raise awareness and understanding of the Wikimedia family of platforms, including Wikisource and Wikidata. The training sessions, entitled Wikimedia Wednesdays, included the following:
- Getting Started with Wikipedia
- Contributing to Wikimedia Commons
- Transcribing and Translating with Wikisource
- Hack & Yack Wikipedia edit-a-thon session
- Working with Wikidata and an Introduction to SPARQL
Looking ahead to 2021, we plan to host a new British Library Wikimedian-in-Residence who will collaborate with Library colleagues to increase engagement with the Wikimedia community.
And here’s to a better, vaccine-fuelled, 2021!