Digital scholarship blog

Enabling innovative research with British Library digital collections

158 posts categorized "BL Labs"

28 March 2023

BL Labs Symposium 30 March 2023: AI and GLAM data

A small bird flying with a trail of dashes behind it showing their flight pattern

Don’t forget to register for the 2023 BL Labs Symposium (https://us02web.zoom.us/webinar/register/WN_oAApT1laSFSCm28Kyfz4bA)

Following the latest advancements in AI is almost a job in itself. The constant excitement sometimes feels almost bewildering, and it leaves us little room to really get stuck into the peculiarities and joys of the data, AI methods and tools emerging in Galleries, Libraries, Archives and Museums (GLAM). For the second part of the BL Labs Symposium this year, we will spend some time with examples of real data, tools and methods emerging in the GLAM AI world.

We will start our Data and AI session with an exciting presentation by Yannis Assael from DeepMind. Yannis will show us Ithaca, the first Deep Neural Network interactive interface built to restore and attribute ancient Greek inscriptions. We expect this to be a real game changer for the use of AI with collections that include complex and incomplete fragments of text.

The words Living With Machines in front of circles and cog shapes

We will also explore some British Library examples of AI and machine learning, mainly using data derived from our newspaper and map collections. Kalle Westerling will reflect on the latest from the Living with Machines project, a ground-breaking research collaboration between The Alan Turing Institute, the British Library, and the Universities of Cambridge, East Anglia, Exeter, and London (QMUL, King’s College). Gethin Rees will tell us about his work engaging the public with geospatial data and, in the process, improving our capabilities to locate national collections.

BL Labs are dedicated to opening up the British Library’s data, especially for all researchers who want to use it for different types of computational research. This remains a daunting task. But we have been working on it! Silvija Aurylaite, BL Labs Manager, will share the BL Labs direction of travel, including sharing our new BL Labs website in Beta. The site will be live for the first time, with the Symposium audience kicking off our testing and engagement phase.

We hope that this session will give us some time to share and reflect on the ongoing AI work in GLAM with all its excitement, challenges and opportunities. All going well, there may even be a chance to get your hands on some new datasets.

We hope you can join us at the BL Labs Symposium on Thursday 30 March 2023. For the full programme, and further information on all our speakers, please read our earlier blog post. We are also delighted to be going ahead with an informal drinks and networking drop in session at the Library between 6.30pm and 7.30pm and you are all most welcome to join us. Register for this and / or the Symposium here

20 March 2023

Digital Storytelling at the 2023 BL Labs Symposium

One half of the 2023 British Library Labs Symposium will be dedicated to digital storytelling. This has been a significant part of BL Labs work over the years; we have collaborated with experimental artists from David Normal’s creative reuse of British Library Flickr images for his giant lightbox collage Crossroads of Curiosity installation at the 2014 Burning Man festival, to working with first runner up in the BL Labs 2016 competition Michael Takeo Magruder on his 2019 exhibition Imaginary Cities.

People looking at lightbox collage artworks
Crossroads of Curiosity by David Normal

In the last few years, due to the COVID-19 pandemic disruption, digital stories and engagement have become mainstream across the Galleries, Libraries, Archives and Museums (GLAM) sector. New types of digital storytelling mixing social media, online exhibitions embedding narratives and digital objects, and interactive online events reaching entirely new audiences, delighted us all. However, we also discovered that there can be a saturation point with online engagement, and that many digital developments have some way to go to reach their full potential.

As we are hopefully entering healthier times, new opportunities to mix virtual and physical worlds are starting to open up. With this in mind, we felt that this is the right moment to explore a new age of digital storytelling at the 2023 BL Labs Symposium.

The idea is to explore what is changing in the world of technological possibilities and how they are continuing to develop. We have envisaged a journey that will take us from the big picture of the arising digital possibilities to more specific examples from the British Library’s work. In true BL Labs spirit we will also celebrate initiatives that creatively reuse the Library’s digital collections.

To help us look into the big trends, we are delighted to be joined by Zillah Watson, whose extraordinary breadth of experience working with the BBC, Meta, the BFI and the Royal Shakespeare Company, amongst many others, will help us to get a deeper sense of the opportunities of virtual reality (VR). Zillah will look into what it means, not just to be dazzled with technological possibilities, but also to enter the magic of storytelling.

Talking of magic, we are lucky to welcome award winning Director, Anrick Bregman, and award winning Producer, Grace Baird. Anrick and Grace will take us deeper into the potential of using VR to uncover hidden stories. Anrick’s film A Convict Story is an interactive VR project built on British Library data that brings to life a story discovered by the linking of data from centuries ago, using data research powered by machine learning.

Even closer to home, our own Stella Wisdom and Ian Cooke will talk about their current work on curating the British Library’s forthcoming Digital Storytelling exhibition (2 June – 15 October 2023), which will explore the ways technology provides opportunities to transform and enhance the way writers write and readers engage. The exhibition draws on the Library’s collection of contemporary digital publications and emerging formats to highlight the work of innovative and experimental writers. It will feature interactive works that invite and respond to user input, reading experiences influenced by data feeds, and immersive story worlds created using multiple platforms and audience participation. This is an exciting development, as we can see how earlier British Library creative digital experiments, collaborations and research projects are building into an exhibition in its own right.

We hope you can join us for discussion at the BL Labs Symposium on Thursday 30 March 2023. For the full programme, and further information on all our speakers, please read our earlier blog post.

 You can book your place here

02 March 2023

BL Labs Symposium 2023: Programme and Speakers announced

Book illustration of a shelf of books with "Informed" spelled across their spines
British Library digitised image from BL Flickr Collection - When Life is Young: a collection of verse for boys and girls by Mary Elizabeth Dodge

The BL Labs Symposium 2023 is taking place on Thursday 30th March as an online webinar.

This year we will be exploring two themes – digital storytelling and innovative uses of data and AI. As always, we are aiming to hear from some guest speakers, as well as showcase recent work using the British Library digital collections. The programme also includes an update on BL Labs, including our new website and services.

We hope this will spark many further ideas and collaborations.

The full programme for the BL Labs Symposium is as follows:

14.00 – Welcome

Part 1: Digital Storytelling

14.05 – How to bring the magic of VR to audiences – Zillah Watson

14.15 – There Exists – A VR experience about hidden narratives – Anrick Bregman and Grace Baird

14.25 – Curating a Digital Storytelling exhibition – Stella Wisdom and Ian Cooke

14.35 – Panel Q&A

15.00 – In Memoriam Maurice Nicholson

15.05 – Break

15.15 – BL Labs Update – Silvija Aurylaite

Part 2 – Data and AI

15.35 - Ithaca: Restoring and attributing ancient texts using deep neural networks - Yannis Assael

15.45 – Living with Machines: Using digitised newspaper collections from the British Library in a data science project – Kalle Westerling

15.55 – Locating a National Collection through audience research. How cultural heritage organisations can engage the public using geospatial data – Gethin Rees

16.05 – Panel Q&A

16.30 – END

You can register for the BL Labs Symposium here.

We are currently planning an evening networking session at the British Library, starting at 18.30 for those who can join us in London. We are aware of the train strike planned for this day, so will confirm details nearer the time.

Below are a few details about our speakers:

Head and shoulders photograph of Zillah Watson
Zillah Watson

Zillah Watson led the BBC's award winning VR studio, winning a host of awards at festivals around the world, including an Emmy nomination. She led pioneering work taking VR to audiences in libraries around the UK. She now consults on the metaverse, and content and audience growth strategies for organisations including Meta, London & Partners, the BFI, International News Media Association, Arts Council England, and the Royal Shakespeare Company. She's had a long and varied media career, including 20 years at the BBC, where she was a TV and radio current affairs journalist, head of editorial standards for BBC Radio and led R&D research on future content. She is a lecturer at UCL and the new London Interdisciplinary School. She recently co-founded Phase Space, a tech for good start-up to use VR to support mental health for students and young people.

Head and shoulders photograph of Anrick Bregman
Anrick Bregman

Anrick is director and founder of an R&D studio that explores the future of spatial immersive storytelling by creating experiences built with virtual and augmented reality, computer vision and machine learning. His mission is to find new and interesting ways to merge technology with meaningful narratives which explore the human experience.

Head and shoulders photograph of Grace Baird
Grace Baird

Grace is a Producer with twelve years' experience working on audience-centred projects in the Arts, TV, and Immersive industries. She is experienced in immersive and digital production and distribution, particularly entertainment content. Grace has produced a variety of innovative projects including site-specific installations, an interactive feature-film, and social-VR experiences.

Head and shoulders photograph of Stella Wisdom
Stella Wisdom

Stella is Digital Curator for Contemporary British Collections at the British Library. She promotes creative and innovative reuse of digital collections and encourages game making and digital storytelling in libraries, collaborating widely with The National Videogame Museum, AdventureX, International Games Month in Libraries and the New Media Writing Prize, and on research projects with University College London’s Institute of Education and Lancaster University. Stella’s research interests also explore the archiving of complex born-digital material, examining methods for the collection, preservation and curation of narrative apps, digital comics and interactive fiction.

Head and shoulders photograph of Ian Cooke
Ian Cooke

Ian is Head of Contemporary British Publications at the British Library. He has worked in academic and research libraries with a focus on 20th and 21st-century history and social sciences. His interests are in the role of publishing in contemporary communications, and the everyday experience and expression of politics. 

Head and shoulders photograph of Silvija Aurylaite
Silvija Aurylaite

Silvija Aurylaite is BL Labs Manager. She previously worked on the British Library Heritage Made Digital Programme. Her interests and domain of expertise include copyright, curation of digital collections of museums, archives and libraries, data science, design, creativity and social entrepreneurship. Previously, she was an initiator of a new publishing project Public Domain City that aimed to bring a new life into curious & obscure historical books on science, technology and nature. She also organized a retrospective dance film festival Dance in Film, Choreography, Body and Image, and media dance educational activities at the National Gallery of Art in Vilnius.

Head and shoulders photograph of Yannis Assael
Yannis Assael

Dr. Yannis Assael is a Staff Research Scientist at Google DeepMind working on Artificial Intelligence, and he is featured in Forbes' "30 Under 30" distinguished scientists of Europe. In 2013, he graduated from the Department of Applied Informatics, University of Macedonia, and with full scholarships, he did an MSc at the University of Oxford, finishing first in his year, and an MRes at Imperial College London. In 2016, he returned to Oxford for a DPhil degree with a Google DeepMind scholarship, and after a series of research breakthroughs and entrepreneurial activities, he started as a researcher at Google DeepMind. His contributions range from audio-visual speech recognition to multi-agent communication and AI for culture and the study of damaged ancient texts. Throughout this time, his research has attracted the media's attention several times, has been featured on the cover of the scientific journal Nature, and focuses on contributing to and expanding the greater good.

Head and shoulders photograph of Kalle Westerling
Kalle Westerling

Dr Kalle Westerling is a Digital Humanities Research Software Engineer with Living with Machines, a collaboration between the British Library, the Alan Turing Institute, and researchers from a range of UK universities. Kalle holds a Ph.D. in Theatre and Performance Studies from The Graduate Center, City University of New York (CUNY), where he visualised and analysed networks of itinerant nightlife performers around New York City in the 1930s. Prior to joining the British Library, Kalle managed the Scholars program at HASTAC and the Digital Humanities Research Institute at CUNY, both efforts across higher education institutions in the United States, aiming to build nation-wide infrastructures and communities for digital humanities skill-building.

Head and shoulders photograph of Gethin Rees
Gethin Rees

Gethin’s role at the British Library includes helping to manage the non-print legal deposit of digital maps and coordinating the Georeferencer crowd-sourcing project. He is interested in helping research projects to get the most out of geospatial data and tools and was principal investigator of the AHRC-funded Locating a National Collection project. Before taking up his current position in 2018 he worked on two collaborative history projects funded by the ERC and as a software developer. His PhD in archaeology from University of Cambridge made use of Geographical Information Systems for spatial analysis and data management.

12 April 2022

Making British Library collections (even) more accessible

Daniel van Strien, Digital Curator, Living with Machines, writes:

The British Library’s digital scholarship department has made many digitised materials available to researchers. This includes a collection of digitised books created by the British Library in partnership with Microsoft. This is a collection of books that have been digitised and processed using Optical Character Recognition (OCR) software to make the text machine-readable. There is also a collection of books digitised in partnership with Google. 

Since being digitised, this collection of digitised books has been used for many different projects. This includes recent work to try and augment this dataset with genre metadata and a project using machine learning to tag images extracted from the books. The books have also served as training data for a historic language model.

This blog post will focus on two challenges of working with this dataset: size and documentation, and discuss how we’ve experimented with one potential approach to addressing these challenges. 

One of the challenges of working with this collection is its size. The OCR output is over 20GB. This poses some challenges for researchers and other interested users wanting to work with these collections. Projects like Living with Machines are one avenue in which the British Library seeks to develop new methods for working at scale. For an individual researcher, one of the possible barriers to working with a collection like this is the computational resources required to process it. 

Recently we have been experimenting with a Python library, datasets, to see if this can help make this collection easier to work with. The datasets library is part of the Hugging Face ecosystem. If you have been following developments in machine learning, you have probably heard of Hugging Face already. If not, Hugging Face is a delightfully named company focusing on developing open-source tools aimed at democratising machine learning. 

The datasets library is a tool that aims to make it easier for researchers to share and process large datasets for machine learning efficiently. Whilst this was the library’s original focus, there may also be other use cases in which the datasets library can help make datasets held by the British Library more accessible.

Some features of the datasets library:

  • Tools for efficiently processing large datasets 
  • Support for easily sharing datasets via a ‘dataset hub’ 
  • Support for documenting datasets hosted on the hub (more on this later). 

As a result of these and other features, we have recently worked on adding the British Library books dataset to the Hugging Face hub. Making the dataset available via the datasets library has made it more accessible in a few different ways.

Firstly, it is now possible to download the dataset in two lines of Python code: 

Image of a line of code: "from datasets import load_dataset ds = load_dataset('blbooks', '1700_1799')"

We can also use the Hugging Face library to process large datasets. For example, we might only want to include data with a high OCR confidence score (this partially helps filter out text with many OCR errors):

Image of a line of code: "ds.filter(lambda example: example['mean_wc_ocr'] > 0.9)"
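
To make the filtering step concrete without downloading anything, here is a minimal sketch in plain Python. The three records are made up for illustration; only the `mean_wc_ocr` and `text` field names come from the code examples in this post, and the 0.9 threshold mirrors the one used there. The `keep_confident` helper is hypothetical and simply mimics the idea of a lazy, row-by-row filter; it is not how the datasets library is implemented.

```python
from typing import Dict, Iterable, Iterator

# Made-up records standing in for rows of the books dataset (illustrative only).
records = [
    {"title": "A", "text": "The dog barked.", "mean_wc_ocr": 0.95},
    {"title": "B", "text": "Tbe d0g barkcd.", "mean_wc_ocr": 0.62},
    {"title": "C", "text": "A quiet garden.", "mean_wc_ocr": 0.91},
]


def keep_confident(rows: Iterable[Dict], threshold: float = 0.9) -> Iterator[Dict]:
    """Yield only rows whose mean OCR word confidence exceeds the threshold."""
    for row in rows:
        if row["mean_wc_ocr"] > threshold:
            yield row


confident = list(keep_confident(records))
confident_titles = [row["title"] for row in confident]
print(confident_titles)
```

Because the helper is a generator, rows are examined one at a time, so the same pattern would work on a source too large to hold in a list.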

One of the particularly nice features here is that the library uses memory mapping to store the dataset under the hood. This means that you can process data that is larger than the RAM you have available on your machine. This can make the process of working with large datasets more accessible. We could also use this as a first step in processing data before getting back to more familiar tools like pandas. 
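
Memory mapping is also available in Python's standard library, which makes the idea easy to try on a small scale. The sketch below writes a throwaway text file, maps it with `mmap`, and searches it without first reading the whole file into a Python string. The file contents and the `count_occurrences` helper are invented for illustration; this shows the general technique only, not the internals of the datasets library.

```python
import mmap
import os
import tempfile


def count_occurrences(path: str, needle: bytes) -> int:
    """Count non-overlapping occurrences of `needle` in a file via a memory map."""
    count = 0
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the OS loads pages on demand.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + len(needle))
    return count


# Build a small throwaway file (a real OCR dump would be many gigabytes).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"the dog sat\n" * 1000 + b"the cat sat\n" * 500)
    tmp_path = tmp.name

dog_lines = count_occurrences(tmp_path, b"dog")
os.remove(tmp_path)
print(dog_lines)
```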

Image of a line of code: "dogs_data = ds['train'].filter(lambda example: 'dog' in example['text'].lower()); df = dogs_data.to_pandas()"

In a follow-on blog post, we’ll dig into the technical side of datasets in more detail. Whilst making the technical processing of datasets more accessible is one part of the puzzle, there are also non-technical challenges to making a dataset more usable.

 

Documenting datasets 

One of the challenges of sharing large datasets is documenting the data effectively. Traditionally libraries have mainly focused on describing material at the ‘item level’, i.e. documenting one item at a time. However, there is a difference between documenting one book and 100,000 books. There are no easy answers to this, but libraries could explore one possible avenue by using Datasheets. Timnit Gebru et al. proposed the idea of Datasheets in ‘Datasheets for Datasets’. A datasheet aims to provide a structured format for describing a dataset. This includes questions like how and why it was constructed, what the data consists of, and how it could potentially be used. Crucially, datasheets also encourage a discussion of the bias and limitations of a dataset. Whilst you can identify some of these limitations by working with the data, there is also a crucial amount of information known by curators of the data that might not be obvious to end-users of the data. Datasheets offer one possible way for libraries to begin communicating this information more systematically.
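
As a rough illustration of what a "structured format" can mean in practice, the sketch below models a much-reduced datasheet as a small Python dataclass and renders it as plain text. The section names loosely follow the questions listed above; the `Datasheet` class, its methods and the example answers are assumptions made for this sketch, not an official template.

```python
from dataclasses import dataclass, fields


@dataclass
class Datasheet:
    """A hypothetical, much-reduced datasheet: one free-text answer per section."""

    motivation: str          # why was the dataset created?
    composition: str         # what does the data consist of?
    collection_process: str  # how was it constructed?
    uses: str                # how could it be used?
    limitations: str         # known biases and limitations

    def render(self) -> str:
        """Return the datasheet as plain text, one titled section per field."""
        sections = []
        for f in fields(self):
            title = f.name.replace("_", " ").capitalize()
            sections.append(f"{title}\n{getattr(self, f.name)}")
        return "\n\n".join(sections)

    def missing_sections(self) -> list:
        """List the sections a curator has not yet filled in."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


sheet = Datasheet(
    motivation="Make a collection of digitised books easier to use for research.",
    composition="OCR text from digitised books.",
    collection_process="Books were digitised and processed with OCR software.",
    uses="",  # left blank on purpose to show the completeness check
    limitations="OCR quality varies; some text contains many OCR errors.",
)
print(sheet.render())
print(sheet.missing_sections())
```

A check like `missing_sections` is one way to nudge curators to record what they know, including limitations, before a dataset is shared.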

The dataset hub adopts the practice of writing datasheets and encourages users of the hub to write a datasheet for their dataset. For the British Library books, we have attempted to write one of these datacards. Whilst it is certainly not perfect, it hopefully begins to outline some of the challenges of this dataset and gives end-users a better sense of how they should approach it.

14 March 2022

The Lotus Sutra Manuscripts Digitisation Project: the collaborative work between the Heritage Made Digital team and the International Dunhuang Project team

Digitisation has become one of the key tasks for the curatorial roles within the British Library. This is supported by two main pillars: the accessibility of the collection items to everybody around the world and the preservation of unique and sometimes very fragile items. Digitisation involves many different teams and workflow stages including retrieval, conservation, curatorial management, copyright assessment, imaging, workflow management, quality control, and the final publication to online platforms.

The Heritage Made Digital (HMD) team works across the Library to assist with digitisation projects. An excellent example of the collaborative nature of the relationship between the HMD and International Dunhuang Project (IDP) teams is the quality control (QC) of the Lotus Sutra Project’s digital files. It is crucial that images meet the quality standards of the digital process. As a Digitisation Officer in HMD, I am in charge of QC for the Lotus Sutra Manuscripts Digitisation Project, which is currently conserving and digitising nearly 800 Chinese Lotus Sutra manuscripts to make them freely available on the IDP website. The manuscripts were acquired by Sir Aurel Stein after they were discovered in a hidden cave in Dunhuang, China in 1900. They are thought to have been sealed there at the beginning of the 11th century. They are now part of the Stein Collection at the British Library and, together with the international partners of the IDP, we are working to make them available digitally.

The majority of the Lotus Sutra manuscripts are scrolls and, after they have been treated by our dedicated Digitisation Conservators, our expert Senior Imaging Technician Isabelle does an outstanding job of imaging the fragile manuscripts. My job is then to prepare the images for publication online. This includes checking that they have the correct technical metadata, such as image resolution and colour profile, that they are an accurate visual representation of the physical object, and that the text can be clearly read and interpreted by researchers. After nearly 1000 years in a cave, it would be a shame to make the manuscripts accessible to the public for the first time only for them to be obscured by a blurry image or a wayward piece of fluff!

With the scrolls measuring up to 13 metres long, most are too long to be imaged in one go. They are instead shot in individual panels, which our Senior Imaging Technicians digitally “stitch” together to form one big image. This gives online viewers a sense of the physical scroll as a whole, in a way that would not be possible in real life for those scrolls that are more than two panels in length unless you have a really big table and a lot of specially trained people to help you roll it out. 

Photo showing the three individual panels of Or.8210S/1530R with breaks in between
Or.8210/S.1530: individual panels
Photo showing the three panels of Or.8210S/1530R as one continuous image
Or.8210/S.1530: stitched image

 

This post-processing can create issues, however. Sometimes an error in the stitching process can cause a scroll to appear warped or wonky. In the stitched image for Or.8210/S.6711, the ruled lines across the top of the scroll appeared wavy and misaligned. But when I compared this with the images of the individual panels, I could see that the lines on the scroll itself were straight and unbroken. It is important that the digital images faithfully represent the physical object as far as possible; we don’t want anyone thinking these flaws are in the physical item and writing a research paper about ‘Wonky lines on Buddhist Lotus Sutra scrolls in the British Library’. Therefore, I asked the Senior Imaging Technician to restitch the images together: no more wonky lines. However, we accept that the stitched images cannot be completely accurate digital surrogates, as they are created by the Imaging Technician to represent the item as it would be seen if it were to be unrolled fully.

 

Or.8210/S.6711: distortion from stitching. The ruled line across the top of the scroll is bowed and misaligned

 

Similarly, our Senior Imaging Technician applies ‘digital black’ to make the image background a uniform colour. This is to hide any dust or uneven background and ensure the object is clear. If this is accidentally overused, it can make it appear that a chunk has been cut out of the scroll. Luckily this is easy to spot and correct, since we retain the unedited TIFFs and RAW files to work from.

 

Or.8210/S.3661, panel 8: overuse of digital black when filling in tear in scroll. It appears to have a large black line down the centre of the image.
Or.8210/S.3661, panel 8: overuse of digital black when filling in tear in scroll

 

Sometimes the scrolls are wonky, or dirty or incomplete. They are hundreds of years old, and this is where it can become tricky to work out whether there is an issue with the images or the scroll itself. The stains, tears and dirt shown in the images below are part of the scrolls and their material history. They give clues to how the manuscripts were made, stored, and used. This is all of interest to researchers and we want to make sure to preserve and display these features in the digital versions. The best part of my job is finding interesting things like this. The fourth image below shows a fossilised insect covering the text of the scroll!

 

Black stains: Or.8210/S.2814, panel 9
Torn and fragmentary panel: Or.8210/S.1669, panel 1
Insect droppings obscuring the text: Or.8210/S.2043, panel 1
Fossilised insect covering text: Or.8210/S.6457, panel 5

 

We want to minimise the handling of the scrolls as much as possible, so we will only reshoot an image if it is absolutely necessary. For example, I would ask a Senior Imaging Technician to reshoot an image if debris is covering the text and makes it unreadable - but only after inspecting the scroll to ensure it can be safely removed and is not stuck to the surface. However, if some debris such as a small piece of fluff, paper or hair, appears on the scroll’s surface but is not obscuring any text, then I would not ask for a reshoot. If it does not affect the readability of the text, or any potential future OCR (Optical Character Recognition) or handwriting analysis, it is not worth the risk of damage that could be caused by extra handling. 
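
The reshoot rule described above can be written down as a small decision function. This is only a sketch of the reasoning in this post: the `Debris` record and its field names are hypothetical, and in practice the judgement is made by a person inspecting the scroll, not by code.

```python
from dataclasses import dataclass


@dataclass
class Debris:
    """Hypothetical description of debris spotted on one imaged panel."""

    obscures_text: bool     # does it make any text unreadable?
    safely_removable: bool  # confirmed by inspecting the scroll itself


def needs_reshoot(debris: Debris) -> bool:
    """Reshoot only when text is obscured and the debris can be removed safely."""
    if not debris.obscures_text:
        # Not worth the extra handling if readability is unaffected.
        return False
    # Debris stuck to the surface stays in place, so a reshoot would not help.
    return debris.safely_removable


print(needs_reshoot(Debris(obscures_text=True, safely_removable=True)))
print(needs_reshoot(Debris(obscures_text=False, safely_removable=True)))
```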

Reshoot: Or.8210/S.6501: debris over text  /  No reshoot: Or.8210/S.4599: debris not covering text.

 

These are a few examples of the things to which the HMD Digitisation Officers pay close attention during QC. Only through this careful process can we ensure that the digital images accurately reflect the physicality of the scrolls and represent their original features. By developing a QC process that applies the best techniques and procedures, working to defined standards and guidelines, we succeed in making these incredible items accessible to the world.

Read more about the Lotus Sutra Project here: IDP Blog

IDP website: IDP.BL.UK

And IDP twitter: @IDP_UK

Dr Francisco Perez-Garcia

Digitisation Officer, Heritage Made Digital: Asian and African Collections

Follow us @BL_MadeDigital

10 February 2022

In conversation: Meet Silvija Aurylaitė, the new British Library Labs Manager

The newly appointed manager of the British Library Labs (BL Labs), Silvija Aurylaitė, is excited to start leading the BL Labs transformation with a new focus on computational creative thinking. BL Labs is a welcoming space for everyone curious about computational research and using the British Library’s digital collections. We welcome all researchers - data scientists, digital humanists, artists, creative practitioners, and everyone curious about digital research.

Image of BL Labs Manager Silvija Aurylaite
Introducing Silvija Aurylaitė, new manager of BL Labs

Find out more from Silvija, in conversation with Maja Maricevic, BL Head of Higher Education and Science.

 

Maja: The Labs have a proud history of experimenting and innovating with the British Library’s digital collections. Can you tell us more about your own background?

Silvija: Ever since I discovered the BL Labs in London 8 years ago, I have been immersed in the world of experimentation with digital collections. I started researching collections from open GLAMs (galleries, libraries, archives and museums) around the world and the implications of copyright and licensing for creative reuse. In a large ecosystem of open digital collections, my special interest has been identifying content that people can use to bring their creative ideas to life, such as new design works.

Inspired by the Labs, I started developing my own curatorial web project, which won the Europeana Creative Design Challenge in 2015. The award gave me the chance to work with a team of international experts to learn new skills in areas such as IT, copyright and social entrepreneurship. This experience later evolved into ‘Revivo Images’, a pilot website that gives guidance on open image collections around the world, carefully selected for quality and for the reliability of their copyright and licence information, with explanations of how to use the databases. It was the result of collaboration with a great interdisciplinary team including an IT lead, programmers, curators, designers and a copywriter.

All this gave me invaluable experience in overseeing a digital collections web project from vision to implementation. I learned about curating content from across collections, building an image database and mapping metadata using various standards. We also used AI and human input to create keywords and thematic catalogs and designed a simple minimalist user interface.

What I most enjoyed about this journey, actually, was meeting a great range of creative people in many creative fields, from professional animators to students looking for a theme for their BA final thesis - and learning what excited them most, and what barriers they faced in using open collections. I met many of them at various art festivals, universities, design schools and events where I delivered talks and creative workshops in my free time to spread the word about open digital collections for creativity. For two years I was also responsible for the ‘Bridgeman Education’ online database, one of the largest digital image collections with over 1,300,000 images from the GLAM sector, designed for the use of art images in higher education curricula. I had the opportunity to talk to many librarians, lecturers and students from around the world about what they find most useful in this new digital turn.

As a result of this, I am particularly excited about introducing the Labs to university students: from students in computer science departments with coding skills to researchers in social sciences and humanities, to creativity champions in fashion, graphic design or jewelry, who might be attracted to the aesthetic qualities of our collections, and those looking to pick up creative coding skills.

The landscape has changed a lot in the 8 years since I learned about the Labs, and I gradually started my own journey of learning to code and think algorithmically. Already in my previous role at the British Library, as the Rights Officer for the Heritage Made Digital project, we approached digital collections as data. Now we are all embracing computational data science methods to gain new insights into digital collections, and that is what the future British Library Labs is going to celebrate.

 

Maja: You have had a strong connection to the BL Labs since you were a Labs volunteer 8 years ago. What most inspired you when you first heard of the Labs?

Silvija: Personally, the Labs were my first professional experience abroad after my MA studies in intellectual history at the American university in Budapest, and happened to be one of the main incentives to stay in London.

This city has attracted me for its serendipity - you can have a great range of urban experiences, from attending the oldest special interest societies and visiting antiquarian bookshops to meeting founders of the latest startups at their regular gatherings and getting up to speed with the mindset of perpetual innovation.

When I first heard about the Labs at one of its public events, this sentence struck me: “experiment with the BL digital collections to create something new”, with the “new” being undefined and open. I had this idea of perpetuity - the possibility of endlessly combining the knowledge and aesthetics of the past, safeguarded by one of the biggest libraries in the world, with the creative visions, skills and technology of today and tomorrow.

Such endless new experiences of digital collections can be accelerated by creating a dedicated space for experimentation - a collider or a matchmaker - that contributes to the diverse serendipitous urban experience of London itself. This is how I see the Labs.

Looking from a user's point of view, I am particularly excited about ‘semiotic democracy’, or ‘the ability of users to produce and disseminate new creations and to take part in public cultural discourse’[1] (Stark, 2006). I believe this new playful approach to digitised out-of-copyright cultural materials will fundamentally change the way we see GLAMs. We’ll look at them less and less as spaces where we only learn about the past as it used to be, as recipients, and more and more as spaces where we act as co-creators, able to enter into a meaningful dialogue and reshape meanings, narratives and experiences.

 

Maja: Prior to your Labs appointment, you also gained significant rights management experience. What have you learned that will be useful for the Labs?

Silvija: It was a delight to work with Matthew Lambert, the Head of Copyright, Policy & Assurance, for the Heritage Made Digital project, led by Sandra Tuppen, in setting up the British Library’s copyright workflow for both current and historical digitisation projects. This project now allows users to explore the BL’s digital images in the Universal Viewer with attributed rights statements and usage terms.

These last 3.5 years were a great exercise in dealing with very large, often very messy, data to create complex systems, policies and procedures which allow oversight of all important aspects of the digital data, including copyright and licensing, data protection and sensitivities. Of course, such work in the Library is of massive importance because it affects the level of freedom we later have to experiment, reuse and do further research based on this data.

Personally, the Heritage Made Digital project is also very precious to me because of its collaborative nature. The team uses the MS SharePoint tool to facilitate data contributions from across many departments in the BL. And they are just fantastic at promoting and celebrating digitisation as a common effort to make content publicly accessible. I will definitely use this experience to suggest solutions on how to register and document both the BL’s datasets and related reuse projects as a similar collaborative project within the Library.

 

Maja: There is so much that is changing in digital research all the time. Are there particular current developments that you find exciting and why?

Silvija: Yes! First, I find the moment of change itself exciting - there is no book about the tools we use today that won’t be out of date tomorrow. This is a good neuroplasticity exercise that trains the mind not to sleep and to be constantly attentive to new developments and opportunities.

Second, I absolutely love to see how many people, from creators to researchers and library staff, are gradually and naturally embracing coding languages. With this comes associated critical thinking, such as the ability to get past often outdated database interfaces and reveal exciting data insights simply by having a liberating package of new digital skills.

And, third, I am super excited about the possibility of upscaling and creating a bigger impact with existing breakthrough projects and brilliant ideas relating to the British Library’s data. I believe this could be done by finding consensus on how we want to register and document data science initiatives - finalised, ongoing and most wanted, both internally and externally - and then by promoting this knowledge further.

This would allow us to enter a new stage of the BL Labs. The new ecosystem of re-use would promote sustainability, reproducibility, adaptation and crowdsourced improvement of existing projects, giving us new superpowers!

[1] Stark, Elisabeth (2006). Free culture and the internet: a new semiotic democracy. opendemocracy.net (June 20). URL: https://www.opendemocracy.net/en/semiotic_3662jsp

30 November 2021

BL Labs Online Symposium 2021, Special Climate Change Edition: Speakers Announced!

BL Labs 9th Symposium – Special Climate Change Edition is taking place on Tuesday 7 December 2021. This special event is devoted to looking at computational research and climate change.

A polar bear jumping off an iceberg with the rear of a ship showing. Image captioned: 'A Bear Plunging Into The Sea'
British Library digitised image from page 303 of "A Voyage of Discovery, made under the orders of the Admiralty, in his Majesty's ships Isabella and Alexander for the purpose of exploring Baffin's Bay, and enquiring into the possibility of a North-West Passage".

To help us explore a range of complex issues at the intersection of computational research and climate change we are delighted to announce our expert panel:

  • Schuyler Esprit – Founding Director of Create Caribbean Research Institute & Research Officer at the School of Graduate Studies and Research at the University of the West Indies
  • Helen Hardy – Science Digital Programme Manager at the Natural History Museum, London, responsible for mass digitisation of the Museum’s collections of 80 million items
  • Joycelyn Longdon – Founder of ClimateInColour, a platform at the intersection of climate science and social justice, and PhD Student on the Artificial Intelligence for Environmental Risk programme at University of Cambridge
  • Gavin Shaddick – Chair of Data Science and Statistics, University of Exeter, Director of the UKRI funded Centre for Doctoral Training in Environmental Intelligence: Data Science and AI for Sustainable Futures, co-Director of the University of Exeter-Met Office Joint Centre for Excellence in Environmental Intelligence and an Alan Turing Fellow
  • Richard Sandford – Professor of Heritage Evidence, Foresight and Policy at the Institute of Sustainable Heritage at University College London
  • Joseph Walton – Research Fellow in Digital Humanities and Critical and Cultural Theory at the University of Sussex

Join us for this exciting discussion addressing issues such as how digitisation can improve research efficiency, discussing pros and cons of AI and machine learning in relation to climate change, and the links between new technologies, climate and social justice.

You can see more details about our panel and book your place here.

10 November 2021

BL Labs Online Symposium 2021, Special Climate Change Edition: Book your place for webinar on Tuesday 7 December 2021

In response to the Climate Emergency and issues raised by COP26, the 9th British Library Labs Symposium is devoted to looking at computational research and climate change. Registration Now Open.

Futuristic, hologram looking version of the globe overlaid with images like wind turbines, water drops, trees and graphs.

The British Library Labs is the British Library programme dedicated to enabling people to experiment with our digital collections, including deploying computational research methods and using our collections as data. This inevitably means that we, and the communities we work with, are increasingly applying computational tools and methods that have an environmental impact on our planet.

As our millions of pages of digitised content are becoming an exciting new research frontier, and we are increasingly using machine learning methods and tools on large-scale projects, such as the Living with Machines project, it is also inevitable that this exciting new work comes with increased use of computational resources and energy. In view of the climate emergency, we are hoping to ensure that climate and sustainability considerations inform everything we do – meaning that we need a much better understanding of digital environmental impacts and how this should inform our practice in all things related to computational research.

We know that this is not a simple issue. Digitisation and digital preservation are often a lifeline for cultural heritage in communities where museums, libraries and archives are already endangered by climate change. For example, the British Library’s Endangered Archives Programme is dedicated to digitising and saving archives in danger of destruction, including destruction due to climate change. New digital resources, such as the UK Web Archive’s collections (the Climate Change collection in particular) and the International Internet Preservation Consortium’s Climate Change collection, are essential for climate researchers, especially as we are increasingly working with researchers who wish to text and data mine our collections for insights that can broaden our understanding of changing climate and biodiversity, and of the impact of these changes on different communities.

Equally, as in all other areas related to the impacts of climate change, we are aware that digital research is strongly interdependent with issues of equality and social justice. Digital advancements are enablers of new research, helping us to better understand different communities and to broaden access and opportunities. But we also need to consider how the complexities of computational research and access, as well as the expensive set-up and energy requirements of state-of-the-art infrastructure, might disadvantage researchers and communities that do not have access to the relevant technologies, or to the prohibitively expensive and energy-demanding resources required to run them.

For this year’s BL Labs Symposium, we are bringing together a group of speakers who will consider these issues from different angles - from large-scale digitisation, to digital humanities, climate and biodiversity research, as well as the impact of AI. We will look into how our digital strategies and projects can help us fight climate change and be more inclusive, but also how we can improve our sustainability and reduce our impact on the planet.

As well as the views from our panel, there will be an opportunity for extended audience input, helping us to bring forward the views of the broader Labs community and learn together how our practice can be improved.

The 9th BL Labs Symposium takes place on Zoom on Tuesday 7th December from 16.30 until 18.00. Book your place now.
