Digital scholarship blog

22 October 2021

Thought Bubble 2021 Wikithon

We are so excited to be working with Thought Bubble and our friends at Leeds Libraries to run our first in-person Wikithon since this residency started. Thought Bubble is an amazing comics festival spread across Yorkshire, culminating in a two-day convention in Harrogate, where the British Library will be having a stall and curating a panel discussion. More details about these can be found here.

Thought Bubble Comic Convention Banner

The Thought Bubble website sums it up best when it says: ‘[w]e use our festival week to promote the power of comics! We believe they can inspire, educate and bring people together like no other medium [...]’. We at the library quite agree.

On Thursday November 11, from 1.30pm to 4.30pm, we’ll be taking up residence in the Sanderson Room of Leeds Central Library to demonstrate how to update, create and improve Wikipedia articles, and we'll even dabble in a bit of basic Wikidata editing for those who are interested. The Comics Wikithon event is free, but please book here.

Photograph of Leeds Central Library on a clear sunny day
Leeds Central Library by Lad 2011, CC BY-SA 4.0 via Wikimedia Commons

We’ll be focusing on underrepresented and marginalised voices in graphic novels and comics. We’re particularly interested in exploring the way Black, Asian and minority ethnic, disabled and LGBTQ+ creators and characters are represented, and want to amplify representation at all levels!

As with all our Wikithons, no previous experience of editing Wikipedia is required. If you can write an email, you can edit Wikipedia! Whether it’s Widdershins, The Walking Dead or Wolverine that you like best, come along and learn some new skills and expand your comic horizons.

For those of you keen to get started, we’ll be following up next week with a blog post on how to get set up for the event. In the meantime you can freely register for the Comics Wikithon event here. 

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian).

29 September 2021

Sailing Away To A Distant Land - Mahendra Mahey, Manager of BL Labs - final post

Posted by Mahendra Mahey, former Manager of British Library Labs or "BL Labs" for short

[estimated reading time of around 15 minutes]

This is my last day working as manager of BL Labs, and also my final posting on the Digital Scholarship blog. I thought I would take this chance to reflect on my journey of almost 9 years helping to set up and maintain BL Labs, and enabling it to become a permanent fixture at the British Library (BL).

BL Labs was the first digital Lab in a national library, anywhere in the world, that gets people to experiment with its cultural heritage digital collections and data. There are now several Gallery, Library, Archive and Museum Labs or 'GLAM Labs' for short around the world, with an active community which I helped build, from 2018.

I am really proud I was there from the beginning to implement the original proposal which was written by several colleagues, but especially Adam Farquhar, former head of Digital Scholarship at the British Library (BL). The project was at first generously funded by the Andrew W. Mellon Foundation through four rounds of funding as well as support from the BL. In April 2021, the project became a permanently funded fixture, helped very much by my new manager Maja Maricevic, Head of Higher Education and Science.

The great news is that BL Labs is going to stay after I have left. The position of leading the Lab will soon be advertised. Hopefully, someone will get a chance to work with my helpful and supportive colleague Dr Filipe Bento, Technical Lead of Labs, the bright, talented and very hard-working Maja, and other great colleagues in Digital Research and across the wider BL.

The beginnings, the BL and me!

I met Adam Farquhar and Aly Conteh (Former Head of Digital Research at the BL) in December 2012. They must have liked something about me because I started working on the project in January 2013, though I officially started in March 2013 to launch BL Labs.

I must admit, I had always felt a bit intimidated by the BL. My first visit, as a Psychology student, was in the early 1980s, before the St Pancras site was opened (in 1997). I remember coming up from Wolverhampton on the train to get a research paper about "Serotonin Pathways in Rats when sleeping" by Lidov, feeling nervous and excited at the same time. It felt like a place for 'really intelligent educated people' and for those who were one of the intellectual elites in society. It also felt to me a bit like it represented the British empire and its troubled history of colonialism, especially some of the collections, which made me feel uncomfortable as to why they were there in the first place.

I remember thinking that the BL probably wasn't a place for someone like me, a child of Indian Punjabi immigrants from humble beginnings who came to England in the 1960s. Actually, I felt like an imposter and not worthy of being there.

Nearly 9 years later, I can say I learned to respect and even cherish what was inside it, especially the incredible collections, though I also became more confident about expressing stronger views about the decolonisation of some of these. I became very fond of some of the people who work in or use it; there are some really good kind-hearted souls at the BL. However, I never completely lost that 'imposter and being an outsider' feeling.

What I remember from that time, going for my interview, was wondering what would happen if I got the position, and asking myself 'What would be the one thing I would try and change?'. It came easily to me, namely that I would try and get more new people through the doors literally or virtually by connecting them to the BL's collections (especially the digital). New people like me, who may never have set foot in, or been motivated to step into, the building before. This has been one of the most important reasons for me to get up in the morning and go to work at BL Labs.

So what have been my highlights? Let's have a very quick pass through!

BL Labs Launch and Advisory Board

I launched BL Labs in March 2013, one week after I had started. It was at the launch event organised by my wonderfully supportive and innovative colleague, Digital Curator Stella Wisdom. I distinctly remember in the afternoon session (which I did alone), I had to present my 'ideas' of how I might launch the first BL Labs competition where we would be trying to get pioneering researchers to work with the BL's digital collections.

God it was a tough crowd! They asked pretty difficult questions, questions I myself was asking too and to which I still didn't know the answers.

I remember Professors Tim Hitchcock (now at Sussex University and who eventually sat (and is still sitting) on the BL Labs Advisory Board) and Laurel Brake (now Professor Emerita of Literature and Print Culture, Birkbeck, University of London) being in the audience together with staff from the Royal Library of the Netherlands, who 6 months later launched their own brilliant KB Lab. Subsequently, I became good colleagues with Lotte Wilms who led their Lab for many years and is now Head of Research support at Tilburg University.

My first gut feeling overall after the event was, this is going to be hard work. This feeling and reality remained a constant throughout my time at BL Labs.

In early May 2013, we launched the competition, which was a really quick and stressful turnaround as I had only officially started in mid March (one and a half months). I remember worrying as to whether anyone would even enter! All the final entries were pretty much submitted a few minutes before the deadline. I remember being alone that evening on deadline day, near to midnight, waiting by my laptop and thinking: what happens if no one enters? It's going to be a disaster and I will lose my job. Luckily that didn't happen; in the end, we received 26 entries.

I am a firm believer that we can help make our own luck, but sometimes luck can be quite random! Perhaps BL Labs had a bit of both!

After that, I never really looked back! BL Labs developed its own kind of pattern and momentum each year:

  • hunting around the BL for digital collections to make into datasets and make available
  • helping to make more digital collections openly licensed
  • having hundreds of conversations with people interested in connecting with the BL's digital collections in the BL and outside
  • working with some people more intensively to carry out experiments
  • developing ideas further into prototype projects
  • telling the world of successes and failures in person, meetings, events and social media
  • launching a competition and awards in April or May
  • roadshows before and after with invitations to speak at events around the world
  • the summer working with competition winners
  • late October/November the international symposium showcased things from the year
  • working on special projects
  • repeat!

The winners were announced in July 2013, and then we worked with them on their entries, showcasing them at our annual BL Labs Symposium in November, around 4 months later.

'Nothing interesting happens in the office' - Roadshows, Presentations, Workshops and Symposia!

One of the highlights of BL Labs was to go out to universities and other places to explain what the BL is and what BL Labs does.  This ended up with me pretty much seeing the world (North America, Europe, Asia, Australia, and giving virtual talks in South America and Africa).

My greatest challenge in BL Labs was always to get people to truly and passionately 'connect' with the BL's digital collections and data in order to come up with cool ideas of what to actually do with them. What I learned from my very first trip was that telling people what you have is great, they definitely need to know what you have! However, once you do that, the hard work really begins as you often need to guide and inspire many of them, help and support them to use the collections creatively and meaningfully. It was also important to understand the back story of the digital collection and learn about the institutional culture of the BL if people also wanted to work with BL colleagues.  For me and the researchers involved, inspirational engagement with digital collections required a lot of intellectual effort and emotional intelligence. Often this means asking the uncomfortable questions about research such as 'Why are we doing this?', 'What is the benefit to society in doing this?', 'Who cares?', 'How can computation help?' and 'Why is it necessary to even use computation?'.

Making those connections between people and data does feel like magic when it really works. It's incredibly exciting, suddenly everyone has goose bumps and is energised. This feeling, I will take away with me, it's the essence of my work at BL Labs!

A full list of over 200 presentations, roadshows, events and 9 annual symposia can be found here.

Competitions, Awards and Projects

Another significant way BL Labs has tried to connect people with data has been through Competitions (tell us what you would like to do, and we will choose an idea and work collaboratively with you on it to make it a reality), Awards (show us what you have already done) and Projects (collaborative working).

At the last count, we have supported and / or highlighted over 450 projects in research, artistic, entrepreneurial, educational, community based, activist and public categories, mostly through competitions, awards and project collaborations.

We also set up awards for British Library staff, which have been a wonderful way to highlight the fantastic work our staff do with digital collections and give them the recognition they deserve. I have noticed over the years that the number of staff who have been working on digital projects has increased significantly. Sometimes this was with the help of BL Labs but often because of the significant Digital Scholarship Training Programme, run by my Digital Curator colleagues in Digital Research for staff to understand that the BL isn't just about physical things but digital items too.

Browse through our project archive to get inspiration from the various projects BL Labs has been involved in or highlighted.

Putting the digital collections 'where the light is' - British Library platforms and others

When I started at BL Labs it was clear that we needed to make a fundamental decision about how we saw digital collections. Quite early on, we decided we should treat collections as data to harness the power of computational tools to work with each collection, especially for research purposes. Each collection should have a unique Digital Object Identifier (DOI) so researchers can cite them in publications.  Any new datasets generated from them will also have DOIs, allowing us to understand the ecosystem through DOIs of what happens to data when you get it out there for people to use.

In 2014, https://data.bl.uk was born and today, all our 153 datasets (as of 29/09/2021) are available through the British Library's research repository.

However, BL Labs has not stopped there! We always believed that it's important to put our digital collections where others are likely to discover them (we can't assume that researchers will want to come to BL platforms), 'where the light is' so to speak.  We were very open and able to put them on other platforms such as Flickr and Wikimedia Commons, not forgetting that we still needed to do the hard work to connect data to people after they have discovered them, if they needed that support.

Our greatest success by far was placing 1 million largely undescribed images that were digitally snipped from 65,000 digitised public domain books from the 19th Century on Flickr Commons in 2013. The number of images on the platform has grown since then by another 50 to 60 thousand from collections elsewhere in the BL. There has been significant interaction from the public to generate crowdsourced tags that make it easier to find specific images. The number of views has reached a staggering figure of over 2 billion over this time. There has also been an incredible array of projects which have used the images, from artistic use to using machine learning and artificial intelligence to identify them. It's my favourite collection, probably because there are no restrictions in using it.

Read the most popular blog post the BL has ever published by my former BL Labs colleague, the brilliant and inspirational Ben O'Steen, a million first steps and the 'Mechanical Curator' which describes how we told the world why and how we had put 1 million images online for anyone to use freely.

It is wonderful to know that George Oates, the founder of Flickr Commons and still a BL Labs Advisory Board member, has been involved in the creation of the Flickr Foundation which was announced a few days ago! Long live Flickr Commons! We loved it because it also offered a computational way to access the collections, critical for powerful and efficient computational experiments, through its Application Programming Interface (API).

More recently, we have experimented with browser based programming / computational environments - Jupyter Notebooks. We are huge fans of Tim Sherratt who was a pioneer and brilliant advocate of OPEN GLAM in using them, especially through his GLAM Workbench. He is a one person Lab in his own right, and it was an honour to recognise his monumental efforts by giving him the BL Labs Research Award 2020 last year. You can also explore the fantastic work of Gustavo Candela and colleagues on Jupyter Notebooks and the ones my colleague Filipe Bento created.

Art Exhibitions, Creativity and Education

I am extremely proud to have been involved in enabling two major art exhibitions to happen at the BL, namely:

Crossroads of Curiosity by David Normal

Imaginary Cities by Michael Takeo Magruder

I loved working with artists, it's my passion! They are so creative and often not restricted by academic thinking, see the work of Mario Klingemann for example! You can browse through our archives for various artistic projects that used the BL's digital collections, it's inspiring.

I was also involved in the first British Library Fashion Student Competition, won by Alanna Hilton and held at the BL, which used the BL's Flickr Commons collection as inspiration for the students to design new fashion ranges. It was organised by my colleague Maja Maricevic, the British Fashion Colleges Council and Teatum Jones who were great fun to work with. I am really pleased to say that Maja has gone from strength to strength working with the fashion industry and continues to run the competition to this day.

We also had some interesting projects working with younger people, such as Vittoria's world of stories and the fantastic work of Terhi Nurmikko-Fuller at the Australian National University. This is something I am very much interested in exploring further in the future, especially around ideas of computational thinking and have been trying out a few things.

GLAM Labs community and Booksprint

I am really proud of helping to create the international GLAM Labs community with over 250 members, established in 2018 and still active today. I affectionately call them the GLAM Labbers, and I often ask people to explore their inner 'Labber' when I give presentations. What is a Labber? It's the experimental and playful part of us we all had as children and unfortunately many have lost when becoming an adult. It's the ability to be fearless, having the audacity and perhaps even naivety to try crazy things even if they are likely to fail! Unfortunately society values success more than it does failure. In my opinion, we need to recognise, respect and revere those that have the courage to try but failed. That courage to experiment should be honoured and embraced and should become the bedrock of our educational systems from the very outset.

Two years ago, many of us Labbers 'ate our own dog food' or 'practised what we preached' when 15 other colleagues and I came together for 5 days to produce a book through a booksprint, probably the most rewarding professional experience of my life. The book is about how to set up, maintain, sustain and even close a GLAM Lab and is called 'Open a GLAM Lab'. It is available as public domain content and I encourage you to read it.

Online drop-in goodbye - today!

I organised a 30 minute ‘online farewell drop-in’ on Wednesday 29 September 2021, 1330 BST (London), 1430 (Paris, Amsterdam), 2200 (Adelaide), 0830 (New York) on my very last day at the British Library. It was heart-warming that the session was 'maxed out' at one point with participants from all over the world. I honestly didn't expect over 100 colleagues to show up. I guess when you leave an organisation you get to find out who you actually made an impact on, who shows up, and who tells you, otherwise you may never know.

Those that know me well know that I would have much rather had a farewell do ‘in person’, over a pint and praying for the ‘chip god’ to deliver a huge portion of chips with salt/vinegar and tomato sauce magically and mysteriously to the table. The pub would have been Mc'Glynns (http://www.mcglynnsfreehouse.com/) near the British Library in London. I wonder who the chip god was? I never found out ;)

The answer to who the chip god was is in the text following this sentence, in white-on-white text... you will be very shocked to know who it was!

Spoiler alert it was me after all, my alter ego

Mahendra's online farewell to BL Labs, Wednesday 29 September, 1330 BST, 2021.
Left: Flowers and wine from the GLAM Labbers arrived in Tallinn, 20 mins before the meeting!
Right: Some of the participants of the online farewell

Leave a message of good will to see me off on my voyage!

It would be wonderful if you would like to leave me your good wishes, comments, memories, thoughts, scans of handwritten messages, pictures, photographs etc. on the following Google doc:

http://tiny.cc/mahendramahey

I will leave it open for a week or so after I have left. Reading positive sincere heartfelt messages from colleagues and collaborators over the years has already lifted my spirits. For me it provides evidence that you perhaps did actually make a difference to someone's life. I will definitely be re-reading them during the cold dark Baltic nights in Tallinn.

I would love to hear from you and find out what you are doing, or if you prefer, you can email me, the details are at the end of this post.

BL Labs Sailor and Captain Signing Off!

It's been a blast and lots of fun! Of course there is a tinge of sadness in leaving! For me, it's also been intellectually and emotionally challenging as well as exhausting, with many ‘highs’ and a few ‘lows’ or choppy waters, some professional and others personal.

I have learned so much about myself and there are so many things I am really really proud of. There are other things of course I wish I had done better. Most of all, I learned to embrace failure, my best teacher!

I think I did meet my original wish of wanting to help to open up the BL to as many new people as possible who perhaps would never have engaged with the Library before. That was either by using digital collections and data for cool projects and/or simply walking through the doors of the BL in London or Boston Spa and having a look around and being inspired to do something because of it.

I wish the person who takes over my position lots of success! My only piece of advice is if you care, you will be fine!

Anyhow, what a time this has been for us all on this planet! I have definitely struggled at times. I, like many others, have lost loved ones and thought deeply about life and its true meaning. I have also managed to find the courage to know what’s important and act accordingly, even if that has been a bit terrifying and difficult at times. Leaving the BL for example was not an easy decision for me, and I wish perhaps things had turned out differently, but I know I am doing the right thing for me, my future and my loved ones.

Though there have been a few dark times for me both professionally and personally, I hope you will be happy to know that I have also found peace and happiness too. I am in a really good place.

I would like to thank alumni of BL Labs: Ben O'Steen, Technical Lead for BL Labs from 2013 to 2018, and Hana Lewis (2016 - 2018) and Eleanor Cooper (2018-2019), both BL Labs Project Officers, as well as the many other people I worked with through BL Labs, in the wider Library and outside it on my journey.

Where am I off to and what am I doing?

My professional plans are 'evolving', but one thing is certain, I will be moving country!

To Estonia to be precise!

I plan to live, settle down with my family and work there. I was never a fan of Brexit, and this way I get to stay a European.

I would like to finish with this final sweet video created by writer and filmmaker Ling Low and her team in 2016, entitled 'Hey there Young Sailor' which they all made as volunteers for the Malaysian band, the 'Impatient Sisters'. It won the BL Labs Artistic Award in 2016. I had the pleasure and honour of meeting Ling over a lovely lunch in Kuala Lumpur, Malaysia, where I had also given a talk at the National Library about my work and looked for remnants of my grandfather who had settled there many years ago.

I wish all of you well, and if you are interested in keeping in touch with me, working with me or just saying hello, you can contact me via my personal email address: mr.mahendra.mahey@gmail.com or follow my progress on my personal website.

Happy journeys through this short life to all of you!

Mahendra Mahey, former BL Labs Manager / Captain / Sailor signing off!

23 September 2021

Computing for Cultural Heritage: Trial Outcomes and Final Report

Six months ago, twenty members of staff from the British Library and The National Archives UK completed Computing for Cultural Heritage, a project that trialled Birkbeck University and Institute of Coding’s new PGCert, Applied Data Science. In this blog post we explore the necessity of this new course, the final report of this trial, and the lasting impact that this PGCert has made on some of the participants. 

 

 

Background 

Information professionals have been experiencing a massive shift to digital in the way collections are being donated, held and accessed. In the British Library’s digital collections there are e-books, maps, digitised newspapers, journal titles, sound recordings and over 500 terabytes of preserved data from the UK Web Archive. Yearly, the library sees 6 million catalogue searches by web users with almost 4 million items consulted online. This amounts to a vast amount of potential cultural heritage data available to researchers, and it requires complex digital workflows to curate, collect, manage, provide access, and help researchers computationally make sense of it all. 

Staff at collecting institutions like the British Library and the National Archives, UK are engaging in computationally driven projects like never before, but often without the benefit of data skills and computational thinking to support them. That is where a program like Computing for Cultural Heritage can help information professionals, allowing them to upskill and tackle issues – like building new digital systems and services, supporting collaborative, computational and data-driven research using digital collections and data, or deploying simple scripts to make everyday tasks easier – with confidence.  

Image of a laptop with the screen showing a bookshelf

 

Learning Aims 

The trial course was broken into two modules, a taught lesson on ‘Demystifying Computing with Python’ and a written ‘Industry Project’ on a software solution to a work-based problem.  A third module, Analytic Tools for Information Professionals, would be offered to participants outside of the trial as part of the full live course in order to earn their PGCert.

By the end of the trial, participants were able to: 

  • Demonstrate satisfactory knowledge of programming with Python. 
  • Understand techniques for Python data structures and algorithms. 
  • Work on case studies to apply data analytics using Python. 
  • Understand the programming paradigm of object-oriented programming. 
  • Use Python to apply the techniques learned on the module to real-world problems. 
  • Demonstrate the ability to develop an algorithm to carry out a specified task and to convert this into an executable program. 
  • Demonstrate the ability to debug a program. 
  • Understand the concepts of data security and general data protection regulations and standards. 
  • Develop a systematic understanding and critical awareness of a commonly agreed problem between the work environment and the academic supervisor in the area of computing. 
  • Develop a software solution for a work-based problem using the skills developed from the taught modules, for example develop software using the programming languages and software tools/libraries taught. 
  • Present a critical discussion on existing approaches in the particular problem area and position their own approach within that area and evaluate their contribution. 
  • Gain experience in communicating complex ideas/concepts and approaches/techniques to others by writing a comprehensive, self-contained report. 

The learning objectives were designed and delivered with the cultural heritage context in mind, and as such incorporated, for instance, examples and datasets from the British Library Music collections in the Python programming elements of the taught module. Additionally, there was a lecture focused on a British Library use case involving the design and implementation of a Database Management System. 

Following the completion of the trial, participants had the opportunity to complete their PGCert in Applied Data Science by attending the final module, Analytic Tools for Information Professionals, which was part of the official course launched last autumn. 

 

The Lasting Impact of Computing for Cultural Heritage 

Now that we’re six months on from the end of the trial, and the participants who opted in have earned their full PGCert, we followed up with some of the learners to hear about their experiences and the lasting effects of the course: 

“The third and final module of the computing for cultural heritage course was not only fascinating and enjoyable, it was also really pertinent to my job and I was immediately able to put the skills I learned into practice.  

The majority of the third module focussed on machine learning. We studied a number of different methods and one of these proved invaluable to the Agents of Enslavement research project I am currently leading. This project included a crowdsourcing task which asked the public to draw rectangles around four different types of newspaper advertisement. The purpose of the task was to use the coordinates of these rectangles to crop the images and create a dataset of adverts that can then be analysed for research purposes. To help ensure that no adverts were missed and to account for individual errors, each image was classified by five different people.  

One of my biggest technical challenges was to find a way of aggregating the rectangles drawn by five different people on a single page in order to calculate the rectangles of best fit. If each person only drew one rectangle, it was relatively easy for me to aggregate the results using the coding skills I had developed in the first two modules. I could simply find the average (or mean) of the five different classification attempts. But what if people identified several adverts and therefore drew multiple rectangles on a single page? For example, what if person one drew a rectangle around only one advert in the top left corner of the page; people two and three drew two rectangles on the same page, one in the top left and one in the top right; and people four and five drew rectangles around four adverts on the same page (one in each corner). How would I be able to create a piece of code that knew how to aggregate the coordinates of all the rectangles drawn in the top left and to separately aggregate the coordinates of all the rectangles drawn in the bottom right, and so on?  

One solution to this problem was to use an unsupervised machine learning method to cluster the coordinates before running the aggregation method. Much to my amazement, this worked perfectly and enabled me to successfully process the total of 92,218 rectangles that were drawn and create an aggregated dataset of more than 25,000 unique newspaper adverts.” 

-Graham Jevon, EAP Cataloguer; BL Endangered Archives Programme 
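
To make the approach Graham describes a little more concrete, here is a minimal Python sketch of "cluster first, then average". It is not the project's actual code, and the post does not say which unsupervised method was used. The clustering rule here (a simple distance threshold on rectangle centres), the function names, the `max_dist` value and the sample coordinates are all assumptions made purely for illustration.

```python
from statistics import mean

# Each rectangle is (x, y, width, height) in pixels.
# The coordinates below are invented sample data, not project data.

def centre(rect):
    """Return the centre point of a rectangle."""
    x, y, w, h = rect
    return (x + w / 2, y + h / 2)

def cluster_rectangles(rects, max_dist=50.0):
    """Group rectangles whose centres lie close together.

    A rectangle joins the nearest existing cluster if its centre is within
    max_dist of that cluster's mean centre; otherwise it starts a new cluster.
    This is a deliberately simple stand-in for an unsupervised clustering method.
    """
    clusters = []
    for rect in rects:
        cx, cy = centre(rect)
        best, best_dist = None, max_dist
        for members in clusters:
            mx = mean(centre(m)[0] for m in members)
            my = mean(centre(m)[1] for m in members)
            dist = ((cx - mx) ** 2 + (cy - my) ** 2) ** 0.5
            if dist <= best_dist:
                best, best_dist = members, dist
        if best is None:
            clusters.append([rect])
        else:
            best.append(rect)
    return clusters

def aggregate(cluster):
    """Average the coordinates of all rectangles in one cluster."""
    return tuple(mean(r[i] for r in cluster) for i in range(4))

# Five people classify the same page, as in the example above:
# one rectangle, two rectangles (twice) and four rectangles (twice).
drawn = [
    (10, 12, 200, 100),                                            # person 1
    (12, 10, 198, 102), (700, 11, 210, 100),                       # person 2
    (8, 9, 202, 98), (705, 10, 205, 101),                          # person 3
    (11, 10, 200, 100), (702, 12, 208, 99),
    (10, 800, 200, 100), (700, 805, 210, 100),                     # person 4
    (9, 11, 201, 101), (698, 9, 207, 100),
    (12, 798, 199, 102), (703, 800, 208, 98),                      # person 5
]

clusters = cluster_rectangles(drawn)
adverts = [aggregate(c) for c in clusters]
for advert in adverts:
    print(advert)
```

With this sample data the thirteen drawn rectangles fall into four groups, one per corner of the page, and each group is reduced to a single averaged rectangle. In practice the distance threshold would need tuning to the image size, and an established clustering algorithm from a library would likely be more robust than this hand-rolled rule.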

 

“The final module of the course was in some ways the most challenging — requiring a lot of us to dust off the statistics and algebra parts of our brain. However, I think, it was also the most powerful; revealing how machine learning approaches can help us to uncover hidden knowledge and patterns in a huge variety of different areas.  

Completing the course during COVID meant that collection access was limited, so I ended up completing a case study examining how generic tropes have evolved in science fiction across time using a dataset extracted from GoodReads. This work proved to be exceptionally useful in helping me to think about how computers understand language differently; and how we can leverage their ability to make statistical inferences in order to support our own, qualitative analyses. 

In my own collection area, working with born digital archives in Contemporary Archives and Manuscripts, we treat draft material — of novels, poems or anything else — as very important to understanding the creative process. I am excited to apply some of these techniques — particularly Unsupervised Machine Learning — to examine the hidden relationships between draft material in some of our creative archives. 

The course has provided many, many avenues of potential enquiry like this and I’m excited to see the projects that its graduates undertake across the Library.” 

-Callum McKean, Lead Curator, Digital; Contemporary British Collection

 

"I really enjoyed the Analytics Tools for Data Science module. As a data science novice, I came to the course with limited theoretical knowledge of how data science tools could be applied to answer research questions. The choice of using real-life data to solve queries specific to professionals in the cultural heritage sector was really appreciated as it made everyday applications of the tools and code more tangible. I can see now how curators’ expertise and specialised knowledge could be combined with tools for data analysis to further understanding of and meaningful research in their own collection area."

-Giulia Carla Rossi, Curator, Digital Publications; Contemporary British Collection

 

Final Report 

The Computing for Cultural Heritage project concluded in February 2021 with a virtual panel session that highlighted the learners’ projects and allowed discussion of the course and feedback to the key project coordinators and contributors. Case studies of the participants’ projects, as well as links to other blog posts and project pages, can be found on our Computing for Cultural Heritage Student Projects page. 

The final report highlights these projects, along with demographic statistics on the participants and feedback gathered through an anonymous survey at the end of the trial. To evaluate the students’ experience of the PGCert, we composed a list of questions designed to provide insight into various aspects of the course, including how learners fitted the coursework around their work commitments and how well they met the learning objectives. 

 

Why Computing for Cultural Heritage? 

Bar graph showing the results of the question 'Why did you choose to do this course' with the results discussed in the text below
Figure 1: Why did you choose to do this course? Results breakdown by topic and gender

When asked why the participants chose to take part in the course, we found that one of the most common answers was to develop methods for automating repetitive, manual tasks – such as generating unique identifiers for digital records and copying data between Excel spreadsheets – to free up more curatorial time for their digital collections. One participant said:  

“I wanted to learn more about coding and how to use it to analyse data, particularly data that I knew was rich and had value but had been stuck in multiple spreadsheets for quite some time.” 

There was also a desire to learn new skills, either for personal or professional development: 

“I believe in continuous professional development and knew that this would be an invaluable course to undertake for my career.”  

“I felt I was lagging behind and my job was getting static, and the feeling that I was behind [in digital] and I wanted to kind of catch up.” 

Bar graph showing the results to the question 'Did the course help you meet your aims?' with 14 answering yes, 1 answering no and 1 answering 'mixed'
Figure 2: 'Did the course help you meet your aims?' Results broken down by answer and gender.

A follow-up question asked whether these goals and aims were met by the course. Happily, most participants indicated that they had been, citing increased confidence, help in developing new computational skills, and a deeper knowledge of information technology. 

 

What was the most enjoyed aspect of the course? 

Bar graph showing the results of the question 'What did you enjoy most about the course' with the results discussed in the text below
Figure 3: 'What did you enjoy most about the course?' Results breakdown by topic and gender

When broken down, the responses to ‘What did you enjoy most’ largely reflect the student experience, whether it was being in taught modules (4), getting hands on experience (4), or being in a learning environment again (6). Participants also indicated that networking with peers was an enjoyable part of the experience: 

“Day out of work with like minded people made it really easy to stick with rather than just doing it online.”  

“Spending a day away from work and meeting the people I had never met at the NA, and also speaking to people from the BL about what they did.”  

“I enjoyed being a student again, learning a new skill amongst my peers, which week after week is a really valuable experience…” 

“Learning with colleagues and people working in similar fields was also a plus, as our interests often overlapped...” 

While only two responses named the project module as one of the most enjoyable components, it was useful to see how the course afforded participants the opportunity to apply their learning to a work-based problem, to the benefit of their role, department or digital collection: 

“I really enjoyed being able to apply my learning to a real-world work-based project and to finally analyze some of the data that has been lying around the department for over a decade without any further analysis.”  

“The design and create aspect of the project. Applying what I learned to solving a genuine problem was the most enjoyable part - using Python and solving problems to achieve something tangible. This is where I really consolidated my learning.” 

 

What was the most challenging aspect of the course? 

Bar graph showing the results of the question 'What did you find the most challenging and why?' with the results discussed in the text below
Figure 4: 'What did you find the most challenging and why?' Results breakdown by topic and gender.

When discussing the most challenging aspect of the course, most of the learners focused on the practical Python lab sessions and the work-based project module. Interestingly, participants also stated that they were able to overcome the challenges through personal perseverance and the learning provided by the course itself: 

“I found the initial hurdle of learning how [to] code very challenging, but after the basics it became possible to become more creative and experimental.”  

“The work-based project was a huge challenge. We'd only really done 5 weeks of classes and, having never done anything like this before, it was hard to envisage an end product let alone how to put it together. But got there in the end!” 

While the majority of the cohort found the practical components of the PGCert trial most challenging, the feedback also suggested that the inclusion of the second module – which will be available as part of the full programme – will provide more opportunity to practise practical programming skills, such as working with software tools and APIs. 

 

The Effectiveness of Computing for Cultural Heritage 

Bar graph showing the results of the question 'Have you applied anything you have learnt?' with 2 results for 'Data analysis concepts', 12 results for 'Python coding' and 2 results for 'Nothing'
Figure 5: 'Have you applied anything you have learnt?' Results breakdown by topic and gender.

Participants were asked whether they had used any of the knowledge or skills acquired in the PGCert trial. Even after sitting just the first and third modules, participants responded that they were able to apply their learning to their current role in some form.  

“I now regularly use the software program I built as part of my day-to-day job. This program performs a task in a few seconds, which otherwise could take hours or days, and which is otherwise subject to human error. I have since adapted this so that it can also be used by a colleague in another department.”  

“Python helps me perform tasks that I previously did not know how to achieve. I have also led a couple of training sessions within the library, introducing Python to beginners (using the software I built in the project as a cultural heritage use case to frame the introduction).” 

“I changed [job] role at the end of the course so I think that helped me also in getting this promotion. And in this new role I have many more data analysis tasks to perform [quickly] for actions that would take months so yeah I managed to write that with a few scripts in my new role.” 

It was great to hear that the impacts of the trial were being felt so immediately by the participants, and that they were able to not only retain but also apply the new skills that they had gained.  

 This blog post was written by Deirdre Sullivan, Business Support Officer for Digital Scholarship Training Initiatives, part of the Digital Research and Curators Team. Special thanks to Nora McGregor, Digital Curator for the European and American Collection for support on the blog post and Martyn Harris, Institute of Coding Manager, for his work on the final report, as well as Giulia Rossi, Callum McKean and Graham Jevon for sharing their experiences.

National Libraries Now: Wikimedians Unite!

On Friday 17th September 2021, I was delighted to participate in a conference panel for the National Libraries Now Conference. I had worked to assemble a veritable dream team of Wikimedia and library talent, to talk about Wikimedia Residencies from a four-nation perspective. 

Joining me on the panel were Stella Wisdom (British Library), Jason Evans (National Library of Wales), Rebecca O’Neill (Wikimedia Community Ireland) and Ruth Small (Digital Productions Operator, National Library of Scotland). Stuart Prior (Programme Coordinator, Wikimedia UK) kindly agreed to be our chair. We pre-recorded presentations that were circulated to participants, so that our time on the 17th could be devoted to questions and discussion.

Going over my notes now, the best way to try to reflect the discussion is to look at some of the questions asked and the responses garnered. Please bear in mind that some remarks may be out of chronological order!

  • How do you think working with Wikimedia helps your institution’s strategic goals?

We reflected as a group on the move from WikiPedians in Residence to WikiMedians in Residence [emphasis my own] and how this shows a shift in institutional thinking towards the potential of larger Wikimedia projects, and the use of platforms such as Commons, Wikisource and Wikibase.

Jason spoke about the way that lower onsite footfall at NLW, a result of its physical location, enhances the importance of digital work and online outreach. He also described training, promotion and contribution through Wikimedia platforms as being just as valuable as, if not more valuable than, the total number of views gained.

Image of National Library of Wales, Aberystwyth
It might not be digital, but it is a beauty! Ian Capper, via Wikimedia Commons.

 

The National Library of Scotland is in the heart of Edinburgh, so it does not face the same issues with footfall. However, as Ruth pointed out, a key strategic goal of the Library is to reach people, and digitising is not the end of the road. Engagement with collections like the NLS Data Foundry is crucial, and the groundbreaking Scottish Chapbooks project run by the NLS was born out of the pandemic, showing a new imagining of institutional goals.

  • How do you incorporate Wikimedia work into your ‘normal’ work?

It was agreed that the inclusion of Wiki in job descriptions could help change at an institutional level, while Rebecca pointed out that the inclusion of Wiki activity as an outreach activity in funding applications is often a good way forward for inclusion of this work as part of major research projects. Again, advocacy and emphasis on the ease with which Wiki work can be undertaken was a key focal point, showing colleagues that their interests and our tools can align well.

  • How do you implement elements of quality control to what is ultimately crowdsourced work?

Jason suggested that we start to think about ‘context’ control: we can upload content and edit and amend details from the beginning, but how we contextualise this material and the activity of Wiki engagement is crucial. There is a high level of quality in curation already, and often Wiki datasets will link back to other repositories such as Flickr or institutional catalogues.

The classic counterpoint of ‘anyone can edit’ and ‘everyone can edit’ came to the fore here: as was rightly pointed out, the early 00s impression of Wikipedia as a free-for-all is largely outdated. In fact, expectations are often inverted, as the enthusiastic and diligent Wiki community are quick to act upon misinformation or inaccuracies. We spoke about the beauty of the process in Wikimedia whereby information picks up value and enriched data along the way, an active evolution of resources.

Image of Wikipedia welcome page stating 'the free encyclopedia that anyone can edit'
The Wikipedia landing page: anyone can edit!

 

  • What about decolonisation and Wikimedia?

Decolonisation is a huge question for Wikimedia: movements around the world are examining what we can do to better serve the larger cause of anti-racist practice. For the British Library, I spoke about the work we have done on the India Office Records in offering a template for content warnings and working with the input of our colleagues to make this as robust a model as we can.

Rebecca’s experience of working in Ireland was incredibly insightful: she shared with us the experience of working with Irish material that is shaped by colonial ideas of what Ireland is, and how the culture has formed. Despite being a white, European, primarily English-speaking nation, the influence of colonialism is still felt.

The use of Wikimedia as a tool for breaking down barriers is vital, as each of our speakers illustrated. Jason spoke about the digital repatriation of items, and gave an example of the Red Book of Hergest, held by Jesus College Oxford (MS 111) and now available through Wikimedia Commons. Though this kind of action cannot always stand in place of physical repatriation, the move towards collaboration is notable and important.

 

An image of anti-Irish propaganda, featuring an Irish Frankenstein figure
'The Irish Frankenstein', a piece of anti-Irish propaganda from 1882. John Tenniel, Public domain, via Wikimedia Commons.

 

An hour was simply not enough! National Libraries Now was an incredibly important experience for me, at this point in my residency. I was particularly delighted with the dedication and enthusiasm of my co-panelists, and hope that we were able to shed some light on the Wikimedian-in-Residence role for those attending.

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian).

25 August 2021

Dabbling in DCMI

One of the best bits of working in digital scholarship is the variety of learning, training and knowledge exchange we can participate in. I have come to my post as a Wikimedian with a background in digital humanities and voluntary experience, and the opportunity to solidify my skills through training courses is really exciting.

Shortly after I started at the library, I had the chance to participate in the Library Juice Academy’s course ‘Introduction to Metadata’. Metadata has always fascinated me: as someone who can still remember when the internet was installed in their house, by means of numerous AOL compact discs, the way digital information has developed is something I have had direct experience of, even if I didn’t realise it.

Green and yellow CD with 1990s AOL branding.
Image of AOL CD, courtesy of archive.org.

Metadata, simply put, is data about data. It tells us about a resource you might find in a library or museum: the author of a book, the composer of a song, the artist behind a painting. In analogue terms, this is like the title page in a novel. In digital terms, it sits alongside the content of the resource, in attached records or headers. In the Dublin Core Metadata Initiative format, one of the most common ways of expressing metadata, there are fifteen separate ‘elements’ you can apply to describe a resource, such as title, date, format and publisher.
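
To make the idea of a fixed element set concrete, here is a minimal Python sketch of a Dublin Core style record. The fifteen element names and the namespace URI come from the Dublin Core Metadata Element Set. The record itself, the `record` wrapper element and the helper function names are invented for illustration and do not describe a real catalogue entry or any particular library system.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# The fifteen Dublin Core elements.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def unknown_elements(record):
    """Return any keys in the record that are not Dublin Core elements."""
    return sorted(set(record) - DC_ELEMENTS)


def to_xml(record):
    """Serialise a record as XML, with one dc: element per entry."""
    ET.register_namespace("dc", DC_NAMESPACE)
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NAMESPACE}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Invented record, for illustration only.
record = {
    "title": "An Example Novel",
    "creator": "A. N. Author",
    "date": "1999",
    "format": "text/html",
    "publisher": "Example Press",
}

problems = unknown_elements(record)
record_xml = to_xml(record)

print(problems)
print(record_xml)
```

A key such as `author` would be flagged by `unknown_elements`, because the Dublin Core element for that role is `creator`.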

Wikidata houses an amazing amount of data, which is unusual as it is not bounded by a set number of ‘elements’. There are many different ways of describing the items on Wikidata, and many properties and statements can be added to each item. There have been initiatives to integrate Wikidata and metadata in a meaningful way, such as the WikiProject Source Metadata and WikiCite. I have certainly found it very useful to have a sound understanding of metadata and its function, in order to utilise Wikidata effectively.

Image of Wikicite logo, with birthday branding.
Wikicite 8th Birthday Logo by bleeptrack.

The Library Juice Academy course was asynchronous and highly useful. Over four weeks, we completed modules involving self-selected readings, discussion forum posts and video seminars. I particularly enjoyed the varied selection of readings: the group of participants came from a breadth of backgrounds and experiences, and the readings reflected this. The balance between theoretical reading and practical application was excellent, and I enjoyed getting to work with MarcEdit for the first time.

I completed the course in May 2021, and was delighted to receive my certificate by email. I have a much stronger handle on the professional standard of metadata in the GLAM sector and how this intersects with the potential of the vast array of data descriptors available in Wikidata. It was also a great opportunity to think about the room for nuance, subjectivity and bias in data. During Week One, we considered ‘Misinformation and Bias in Data Processing’ by Thornburg and Oskins. I said the following in our forum discussion:

“What I have taken from this piece is a real sense of the hard work that goes into the preparation of resources, and the many different forms bias can take, often inadvertently. It has made me think about and appreciate the difficult decisions that have to be made, and the processes that underlie these practices.”

Overall, participating in this course and expanding my skills into more traditional librarianship fields was fascinating, and left me eager to learn more about metadata and start working more closely with our collections and Wikidata.

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian).

12 August 2021

Dates to discuss Wikidata at Wikimania 2021

Wikimania is often the highlight of any Wikimedian’s calendar. Hosted by the Wikimedia Foundation, Wikimania is a conference like no other. A large number of participants take part in the annual celebration of open knowledge and Wikimedia projects. Previous events have taken place in Stockholm (2019), Cape Town (2018), Montreal (2017) and Italy (2016). Due to the ongoing global pandemic, this year's conference, held 13-17 August 2021, is taking place entirely online, something Wikimania is ideally suited for!

Logo for Wikimania 2021: 4 squares, the 1st with a drawing of 12 people's faces as if they are in a video call, the 2nd of 2 jigsaw puzzle pieces, the 3rd of paper confetti and the 4th showing 2 people sitting at a table talking

In addition to more traditional conference sessions, Wikimania will be running an Unconference, a Community Village, and a community Hackathon. Communication is encouraged through a variety of channels including Telegram, IRC and Wiki talk pages.

Telegram machine
A photograph of an old telegraph key by Sandra Tan on Unsplash

Looking at the programme, so many interesting topics are on the table for presentation and discussion: from copyright reform, to innovation and community development, there’s a wide spectrum of material to interest all Wikimedians of every level. Handily, events are rated in terms of their suitability for beginners, to make things as welcoming as possible. There is a whole strand of presentations devoted to Wikidata, which you can view here.

I am very excited to be presenting remotely at this conference on behalf of the British Library. I will be introducing the work of Tom Derrick on the Bengali Books Wikisource Competition, and Dominic Kane (UCL) on the India Office Records project. We have shaped our panel to show what GLAM institutions can do to promote and effectively utilise Wiki platforms for public engagement with library and archive collections. Our panel will run on Sunday 15th of August at 8.15pm (7.15pm UTC).

Wikimania is free to attend online from 13 to 17 August 2021, and registration is open until midnight on Thursday 12th August. We hope to see you there!

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian)

03 August 2021

Automating the Recognition of Chinese Manuscripts: New Chevening British Library Fellowship

 

The Chevening Fellowship Programme is the UK government’s international awards scheme aimed at fostering knowledge exchange and collaboration, and developing global leaders. Since 2015, the Foreign, Commonwealth & Development Office (FCDO) has partnered with the British Library to offer professionals two new fellowships every year, and the two organisations recently announced the renewal of their partnership until 2024/25.

Chevening logo and the British Library logo

These fellowships are unique opportunities for one-year placements at the Library, working with exceptional collections under the Library’s custodianship. The Library has hosted international fellows through this scheme since 2016, with each fellowship framing a distinct project inspired by Library collections. Past and present Chevening Fellows at the Library have focused on geographically diverse collections, from Latin America through Africa to South Asia, with different themes such as archival material from Latin America and the Caribbean, African-language printed books, Nationalism, Independence, and Partition in South Asia and Big Data and Libraries.

We are thrilled to (re-)announce that one of the two placements available for the 2022/2023 academic year will focus on automating the recognition of historical Chinese handwritten texts. This fellowship, originally announced two years ago, had to be postponed due to the pandemic – and we are excited to be able to offer it again. This is a special opportunity to work in the Library’s Digital Research Team, and engage with unique historical collections digitised as part of the International Dunhuang Project and the Lotus Sutra Manuscripts Digitisation Project. Focusing on material from Dunhuang (China), part of the Stein collection, this fellowship will engage with new digital tools and techniques in order to explore possible solutions to automate the transcription of these handwritten texts.

End piece of a Chinese Lotus Sutra Scroll (shelfmark: Or.8210/S.1606). Digitised as part of the Lotus Sutra Manuscripts Digitisation Project.

 

The context for this fellowship is the Library’s efforts towards making its collection items available in machine-readable format, to enable full-text search and analysis. The Library has been digitising its collections at scale for over two decades, with digitisation opening up access to rich and diverse collections. However, it is important for us to further support discovery and digital research by unlocking the huge potential in automatically transcribing our collections. Until recently, print collections in Western languages have been the main focus, especially newspaper collections. A flagship collaboration with the Alan Turing Institute, the Living with Machines project, has been applying Optical Character Recognition (OCR) technology to UK newspapers, designing and implementing new methods in data science and artificial intelligence, and analysing these materials at scale.

Taking a broader perspective on Library collections, we have been exploring opportunities with non-Western collections too. Library staff have been engaging closely with the exploration of OCR and Handwritten Text Recognition (HTR) systems for English, Bangla and Arabic. Digital Curators Tom Derrick, Nora McGregor and Adi Keinan-Schoonbaert have teamed up with PRImA Research Lab and the Alan Turing Institute to run four competitions in 2017-2019, inviting providers of text recognition methods to try them out on our historical material. We have been working with Transkribus as well: for example, Alex Hailey, Curator for Modern Archives and Manuscripts, used the software to automatically transcribe 19th century botanical records from the India Office Records. Ongoing work led by Tom Derrick aims to OCR our collection of Bengali printed texts, digitised as part of the Two Centuries of Indian Print project.

 

Regions, text lines and illustrations demarcated as ground truth, as shown in Transkribus (Shelfmark: Or 3366). Digitised and available on Qatar Digital Library.
 
 
Another screenshot from Transkribus, showing automatically transcribed Bengali printed text (Shelfmark: VT 1914 d). Digitised as part of the Two Centuries of Indian Print project.

 

The Chevening Fellow will contribute to our efforts to identify OCR/HTR systems that can tackle digitised historical collections. They will explore the current landscape of Chinese handwritten text recognition, look into methods, challenges, tools and software, use them to test our material, and demonstrate digital research opportunities arising from the availability of these texts in machine-readable format.

This fellowship programme will start in September 2022 for a 12-month period of project-based activity at the British Library. The successful candidate will receive support and supervision from Library staff, and will benefit from professional development opportunities, networking and stakeholder engagement, gaining access to a range of organisational training and development opportunities (such as the Digital Scholarship Training Programme), as well as staff-level access to unique British Library collections and research resources.

For more information and to apply, please visit the Chevening British Library Fellowship page: https://www.chevening.org/fellowship/british-library/, and the “Automating the recognition of historical Chinese handwritten texts” fellowship page: https://www.chevening.org/fellowship/british-library-historical-chinese-texts/.

Applications open on 3 August, 12:00 (midday) BST and close on 2 November, 12:00 (midday) GMT.

Good Luck!

This post is by Dr Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections, British Library. She is on twitter as @BL_AdiKS

 

22 July 2021

Building the New Media Writing Prize Special Collection

The New Media Writing Prize is awarded annually to interactive works that use technology and digital tools in exciting and innovative ways. Organised by Bournemouth University, the prize is now in its 12th year and open for entries until 26th November 2021.

Banner saying "Innovative, Immersive, Interactive. The 2021 New Media Writing Prize is open for entries. Find out more."
The homepage banner on the New Media Writing Prize website

The British Library hosted a Digital Conversations event to celebrate the 10th anniversary of the prize in 2019. As part of our work on collecting and preserving emerging formats, last year we started building a special collection in the UK Web Archive to archive all shortlisted and winning entries to the prize. Thanks go to Joan Francis for her valued support in adding targets and metadata to the Annotation and Curation Tool. At the time of writing, the collection stands at 226 websites, including not only all the works that were web-based and live at the moment of collection, but also blog posts, press kits, online reviews and authors’ websites. This kind of contextual information (like the data recorded on the ELMCIP Knowledge Base website) is especially valuable in those instances where the work itself couldn’t be captured, whether because of the limitations of web archiving tools or because it had already disappeared from the Internet. More information on how the collection was conceived and developed is available in the Collection Scoping Document on the British Library Research Repository.

In order to improve access to the collection and assure quality for the websites we captured, a PhD placement project started at the beginning of this June. Tegan Pyke, from Cardiff Metropolitan University, is working on the collection to identify best captures for each of these works and is also developing a creative response to the collection.

Tegan writes:

From the New Media Writing Prize shortlists, a total of 78 works have been captured, with each work averaging 13 instances to compare and contrast. Each instance represents a web crawl undertaken by the team from the Emerging Formats project.

Screen capture of UKWA search results
A screenshot showing the instances collected for Serge Bouchardon’s 2011 Main Prize winning piece, "Loss of Grasp".

One of the most difficult aspects of this work has been deciding what, exactly, constitutes an ‘acceptable’ capture. By nature digital works are highly complex—featuring audio, visual, and kinetic assets—and using bespoke platforms, formats, and code. These attributes are heightened by the speed at which technology changes; what was acceptable a decade ago may be entirely defunct today, as is the case with Adobe removing their Flash Player support.

After an initial overview of the collection, I came to the conclusion that a strict set of criteria wouldn’t be appropriate. Nor would the capture of all aspects of a work, as many—such as Amira Hanafi’s What I’m Wearing and J R Carpenter’s The Gathering Cloud—make use of external links or externally hosted image and video files. If these lie outside the UK Legal Deposit’s scope, capturing them in their entirety becomes more difficult and sometimes impossible.

Instead, I decided to focus on narrative, asking three questions as I approached each instance: 

  • Can viewers complete the narrative? 
  • Does the theme remain understandable?
  • Is the atmosphere (the overall mood of the piece) intact?

If the answer to all three questions is yes, the instance is acceptable, with the most complete of those captures being identified as suitable for display in the archive.
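
The three-question test lends itself to a small Python sketch. This is only an illustration of the decision logic as described, not a tool used in the placement. The `Instance` class, its field names and the sample crawl dates are hypothetical, and the post does not say how the 'most complete' capture is measured, so an assumed `assets_captured` score between 0 and 1 stands in for that judgement.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """One archived capture of a work (all field names are hypothetical)."""
    crawl_date: str
    narrative_completable: bool   # can viewers complete the narrative?
    theme_understandable: bool    # does the theme remain understandable?
    atmosphere_intact: bool       # is the overall mood of the piece intact?
    assets_captured: float        # assumed completeness score, 0.0 to 1.0


def is_acceptable(instance):
    """An instance is acceptable only if all three answers are yes."""
    return (
        instance.narrative_completable
        and instance.theme_understandable
        and instance.atmosphere_intact
    )


def best_capture(instances):
    """Return the most complete acceptable instance, or None if none qualify."""
    acceptable = [inst for inst in instances if is_acceptable(inst)]
    if not acceptable:
        return None
    return max(acceptable, key=lambda inst: inst.assets_captured)


# Invented instances of a single work, for illustration only.
instances = [
    Instance("2020-06-01", True, True, False, 0.95),
    Instance("2020-09-15", True, True, True, 0.70),
    Instance("2021-02-03", True, True, True, 0.85),
]

chosen = best_capture(instances)
print(chosen.crawl_date if chosen else "no acceptable capture")
```

The first invented instance has the highest completeness score but is still rejected, because a capture that loses the atmosphere fails the narrative test however many assets it holds.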

At this point, I’m half-way through comparing instances for the collection. Of the pieces captured, just under half meet the criteria above. Out of these, most can be improved by additional crawls that capture the missing assets. Those that cannot be improved have, for the most part, been affected by software deprecation or EOL (end-of-life), where support has been completely removed.

I’m aiming to finish my review of the collection over the next couple of months, at which point I hope to provide further insight into the process. I’ve also started a collaboration with the BL's Wikimedian-in-Residence, Lucy Hinnie, to plan a Wikidata project related to the collection aiming to make use of contextual data points collected during its creation—I’m sure you’ll read about this work here soon!

This post is by Giulia Carla Rossi, Curator of Digital Publications (on twitter as @giugimonogatari), and Tegan Pyke, a PhD student at Cardiff Metropolitan University currently undertaking a placement in Contemporary British Published Collections at the British Library.
