Digital scholarship blog


29 October 2021

Thought Bubble 2021 Wikithon Preparation

Comics fans, are you getting geared up for Thought Bubble? If you enjoy editing Wikipedia and Wikidata articles about comics, or want to learn how, please do join us and our collaborators at Leeds Libraries for our first in-person Wikithon since this residency started, on Thursday 11th November, from 1.30pm to 4.30pm, in the Sanderson Room of Leeds Central Library.

Drawing of a person reading a comic and drinking a mug of tea

Joining us in person?

Remember the first step is to book your place here, via Eventbrite

If you’d like to get a head start, you can download and read our handy guide to setting up your Wikipedia account. There is advice on creating your account, Wikipedia's username policy and how to create your user page.

Once you have done that, or if you already have a Wikipedia account, please join our Thought Bubble Wikithon dashboard (the enrollment passcode is ltspmyfa) and go through the introductory exercises, which cover:

  • Wikipedia Essentials
  • Editing Basics
  • Evaluating Articles and Sources
  • Contributing Images and Media Files
  • Sandboxes and Mainspace
  • Sources and Citations
  • Plagiarism
  • Introduction to Wikidata (for those interested in this)

These are all short exercises that will help familiarise you with Wikipedia and its processes. Don’t have time to do them? We get it, and that’s totally fine - we’ll cover the basics on the day too!

You may want to verify your Wikipedia account - this function exists to make sure that people are contributing responsibly to Wikipedia. The easiest and swiftest way to verify your account is to make 10 small edits. You could do this by correcting typos or adding in missing dates. Another way is to find articles where citations are needed, and add them via Citation Hunt. For further information on adding citations, watching this video may be useful.

When it comes to Wikidata, we are very inspired by the excellent work of the Graphic Possibilities project at the Michigan State University Department of English, and we have been learning from them. For those interested in editing Wikidata, we will be on hand to support this during our Thought Bubble Wikithon event.

Happier with a hybrid approach?

If you cannot join the physical event in person, but would like to contribute, please do check out and sign up to our dashboard. Although we cannot run the training as a hybrid presentation on this occasion, the online dashboard training exercises will be an excellent starting point. From there, all of your edits and contributions will be registered, and you can pat yourself firmly on the back for making the world of comics a better place from a distance.

However, if you can attend in person, please register for the Wikithon at Leeds Central Library here and check out the Thought Bubble festival programme here. Hope to see you there!

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian) and Digital Curator Stella Wisdom (@miss_wisdom).

26 October 2021

On Digital Technologies, Our Cultural Heritage and Global Warming. How do they come together in Venice?

Global warming does not only affect the environment; it affects the entire system we live in. We can’t think of it as detached from gender, social and racial inequalities, nor as something separate from our cultural heritage. For this reason, when we think about actions we shouldn’t focus only on emissions reductions, but also think about how to preserve our cultural and artistic production and learn how this, with the aid of new technologies, can help us find new ways to shape our future.

Last year, during my spare time, with the help of Marco Magini (writer and environmental policy adviser), Paolo Nelli (writer) and Maddalena Vatti (producer), I started investigating what role digital technologies play in a city like Venice, which is notoriously under threat from rising waters, and even more so as global warming intensifies.

On the 13th of November 2019 an exceptional acqua alta (a high tide) hit the city, bringing some of the worst devastation of the last century. Various archives, buildings, commercial activities, homes and cultural venues were damaged. This prompted a question: what can we understand from an event like this? Is the case of Venice an isolated one or is it a cautionary tale for humanity? After all, Venice is not the only city which is sinking and where rising tides threaten to unravel the urban fabric. We should not simply mourn the devastation and start to repair the damage; we should consider the event as an opportunity to think about the direct impact of global warming on our cultural heritage and what we can do to reduce it.

While conducting interviews with scholars, experts, professionals and citizens with the aim of producing a podcast, we slowly came to understand the role and potential of digital technologies in the study of the evolution of a city in respect to changing climate and urban conditions, as well as the role these play in its preservation.

Digital preservation, 3D rendering and water sensors

A fantastic example of digital preservation is the one carried out between the 6th and 17th of July 2020 by a team from the Factum Foundation for Digital Technology in Conservation in collaboration with the Cini Foundation, EPFL and Iconem (https://www.factumfoundation.org/pag/1640/recording-the-island-of-san-giorgio-maggiore). They spent twelve days in Venice recording the Island of San Giorgio Maggiore in its entirety. The result was a virtual rendering of the island made using a mix of long-range LiDAR scanning, to capture the overall shape of the buildings and their external and internal views, and high resolution photogrammetry, to add the surface detail. The island was recorded from more than 600 different recording spots, from which a massive 60,000 million-point cloud was generated. The data acquired through photogrammetry is currently being merged with the point clouds with the aim of creating a 3D model of the whole island.

two images of the same statue side by side, the one on the right uses high resolution photogrammetry
First (left) and final (right) data processing of the render of one of the statues on the façade © Factum Foundation for ARCHiVe

This massive work enabled researchers not only to study the sculptures and the inscriptions that are high up on the façade of San Giorgio, but also to analyse the way that the plaster covering the walls was being affected by salt and peeling off.

Thanks to these data it is now possible to record the breakdown of a surface in real detail and to monitor the speed at which the cobalt coverings are being blown off by the salt (the speed of decay), creating data to inform the discussion of how best to preserve the material heritage on the island.

Camera obscura, painting and digital image analysis: what can the past tell us about the present and the future

It is also possible to use paintings and buildings to look at the past and learn about our present. In fact, these artifacts can unconsciously record events and phenomena that postdate their own creation, carrying them into the future.

The researcher in atmospheric physics and cultural heritage Dario Camuffo has conducted a scientific analysis of the works of Venetian painters, Canaletto in particular, depicting buildings and compared them with the state of the very same buildings today in an attempt to calculate the impact of land subsidence in Venice.

Painting of The Grand Canal in Venice
Canaletto (Venice 1697-Venice 1768) - The Grand Canal looking East from the Carità towards the Bacino

As professor Camuffo has written, “in general paintings provide a qualitative image, but in Venice’s case, a quantitative evaluation of the apparent sea level rise is possible, thanks to accurate paintings by Canaletto and Bellotto, drawn with the aid of the camera obscura. The paintings accurately reproduce all of the details with a high degree of precision, including the algae belt. […] By analysing these paintings, and comparing them with the algae level we see today, we can extend our knowledge of Venice’s submersion, reaching back in time almost as far back as three centuries.”

How many stories, and how much information, are buried in the archives? Deep learning image analysis can help to reveal them; we just need to think creatively.

Maps and algorithms, space syntax, literature and architecture

Maps and literature can also reveal more stories about a city than we think.

UCL/Bartlett Institute Professor Sophia Psarra, drawing inspiration from Italo Calvino’s Invisible Cities and Le Corbusier’s discarded project for the Venice Hospital, has studied the urban evolution of Venice by computing the distribution of, and distances between, bridges, calli (tiny alleys), squares and wells over time. The analysis, which is based on the approaches developed within the world of space syntax, has shown that Venice has evolved, and still evolves, as a system that resembles a highly probabilistic ‘algorithm’.

What seems a chaotic evolution is in fact the result of the interaction between space and social activity. Maps and data analysis can reveal the modularity of a city and the traces of how social activities have interacted and forged the space. These can help see new connections between literary imagination and the evolution of our society but also help us understand how we can imagine a future which is affected by growing uncertainties.
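To give a flavour of the kind of distance computation described above, here is a minimal sketch in Python. It is not Professor Psarra's method or data: the place names and connections are a hypothetical toy network, and 'mean depth' (the average number of steps from one place to all the others) is used here only as one simple measure of the sort associated with space syntax.

```python
from collections import deque


def mean_depth(graph, start):
    """Average number of steps from `start` to every other reachable place."""
    depth = {start: 0}
    queue = deque([start])
    # Breadth-first search: visit places in order of increasing step count.
    while queue:
        place = queue.popleft()
        for neighbour in graph[place]:
            if neighbour not in depth:
                depth[neighbour] = depth[place] + 1
                queue.append(neighbour)
    others = [d for name, d in depth.items() if name != start]
    return sum(others) / len(others) if others else 0.0


# Hypothetical toy network (an assumption for illustration, not real Venice data):
# each place lists the places it connects to directly.
toy_city = {
    "square_a": ["calle_1", "bridge_x"],
    "calle_1": ["square_a", "well_1"],
    "well_1": ["calle_1"],
    "bridge_x": ["square_a", "square_b"],
    "square_b": ["bridge_x"],
}

for place in toy_city:
    print(place, mean_depth(toy_city, place))
```

In this toy network the central square has the lowest mean depth and the dead-end well the highest, which is the sort of pattern that, computed over real maps from different periods, can hint at how social activity and space have shaped one another.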

Digital technologies applied to our cultural heritage, as these three examples have shown, are an aid to studying the past and imagining the future. They can help us understand how we as a society can evolve, but also how all our cultural productions are sources of incredible information if we know how to look at them. We can measure the impact of global warming on our cultural artifacts and try to imagine a better future.

To find out more about the role of Venice as a vantage point from which to look at the growing emergencies surrounding us (environmental, cultural, social, and technological), you can listen to the podcast The Fifth Siren (thefifthsiren.com) and join us for a British Library free online event on Monday 8th November with Professor Sophia Psarra and architectural artists Ila Bêka and Louise Lemoine. More info here: https://www.bl.uk/events/venice-tales-of-a-sinking-city.

This post is by Dr Giorgia Tolfo (@giorgiatolfo), Data and Content Manager for the Living with Machines project.

22 October 2021

Thought Bubble 2021 Wikithon

We are so excited to be working with Thought Bubble and our friends at Leeds Libraries to run our first in-person Wikithon since this residency started. Thought Bubble is an amazing comics festival spread across Yorkshire, culminating in a two day convention in Harrogate, where the British Library will be having a stall and curating a panel discussion. More details about these can be found here.

Thought Bubble Comic Convention Banner

The Thought Bubble website sums it up best when it says: ‘[w]e use our festival week to promote the power of comics! We believe they can inspire, educate and bring people together like no other medium [...]’. We at the library quite agree.

On Thursday November 11, from 1.30pm to 4.30pm, we’ll be taking up residence in the Sanderson Room of Leeds Central Library to demonstrate how to update, create and improve Wikipedia articles, and we'll even dabble in a bit of basic Wikidata editing for those who are interested. The Comics Wikithon event is free, but please book here.

Photograph of Leeds Central Library on a clear sunny day
Leeds Central Library by Lad 2011, CC BY-SA 4.0 via Wikimedia Commons

We’ll be focusing on underrepresented and marginalised voices in graphic novels and comics. We’re particularly interested in exploring the way Black, Asian and minority ethnic, disabled and LGBTQ+ creators and characters are represented, and we want to amplify representation at all levels!

As with all our Wikithons, no previous experience of editing Wikipedia is required. If you can write an email, you can edit Wikipedia! Whether it’s Widdershins, The Walking Dead or Wolverine that you like best, come along and learn some new skills and expand your comic horizons.

For those of you keen to get started, we’ll be following up next week with a blog post on how to get set up for the event. In the meantime you can freely register for the Comics Wikithon event here. 

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian).

29 September 2021

Sailing Away To A Distant Land - Mahendra Mahey, Manager of BL Labs - final post

Posted by Mahendra Mahey, former Manager of British Library Labs or "BL Labs" for short

[estimated reading time of around 15 minutes]

This is my last day working as manager of BL Labs, and also my final posting on the Digital Scholarship blog. I thought I would take this chance to reflect on my journey of almost 9 years in helping to set up and maintain BL Labs, and in enabling it to become a permanent fixture at the British Library (BL).

BL Labs was the first digital Lab in a national library, anywhere in the world, that gets people to experiment with its cultural heritage digital collections and data. There are now several Gallery, Library, Archive and Museum Labs or 'GLAM Labs' for short around the world, with an active community which I helped build, from 2018.

I am really proud I was there from the beginning to implement the original proposal, which was written by several colleagues, but especially Adam Farquhar, former head of Digital Scholarship at the British Library (BL). The project was at first generously funded by the Andrew W. Mellon Foundation through four rounds of funding, as well as support from the BL. In April 2021, the project became a permanently funded fixture, helped very much by my new manager Maja Maricevic, Head of Higher Education and Science.

The great news is that BL Labs is going to stay after I have left. The position of leading the Lab will soon be advertised. Hopefully, someone will get a chance to work with my helpful and supportive colleague Dr Filipe Bento, Technical Lead of Labs; with the bright, talented and very hard working Maja; and with other great colleagues in Digital Research and more widely at the BL.

The beginnings, the BL and me!

I met Adam Farquhar and Aly Conteh (Former Head of Digital Research at the BL) in December 2012. They must have liked something about me because I started working on the project in January 2013, though I officially started in March 2013 to launch BL Labs.

I must admit, I had always felt a bit intimidated by the BL. My first visit was as a Psychology student in the early 1980s, before the St Pancras site was opened (in 1997). I remember coming up from Wolverhampton on the train to get a research paper about "Serotonin Pathways in Rats when sleeping" by Lidov, feeling nervous and excited at the same time. It felt like a place for 'really intelligent educated people' and for those who were one of the intellectual elites in society. It also felt to me a bit like it represented the British empire and its troubled history of colonialism, especially some of the collections, which made me feel uncomfortable as to why they were there in the first place.

I remember thinking that the BL probably wasn't a place for someone like me, a child of Indian Punjabi immigrants from humble beginnings who came to England in the 1960s. Actually, I felt like an imposter and not worthy of being there.

Nearly 9 years later, I can say I learned to respect and even cherish what was inside it, especially the incredible collections, though I also became more confident about expressing stronger views about the decolonisation of some of these. I became very fond of some of the people who work in it or use it; there are some really good kind-hearted souls at the BL. However, I never completely lost that 'imposter and being an outsider' feeling.

What I remember at that time, going for my interview, was thinking about what would happen if I got the position, and asking myself 'What would be the one thing I would try and change?'. The answer came easily to me: I would try to get more new people through the doors, literally or virtually, by connecting them to the BL's collections (especially the digital ones). New people like me, who may never have set foot in, or been motivated to step into, the building before. This has been one of the most important reasons for me to get up in the morning and go to work at BL Labs.

So what have been my highlights? Let's have a very quick pass through!

BL Labs Launch and Advisory Board

I launched BL Labs in March 2013, one week after I had started. It was at the launch event organised by my wonderfully supportive and innovative colleague, Digital Curator Stella Wisdom. I distinctly remember in the afternoon session (which I did alone), I had to present my 'ideas' of how I might launch the first BL Labs competition where we would be trying to get pioneering researchers to work with the BL's digital collections.

God it was a tough crowd! They asked pretty difficult questions, questions I was asking myself too and to which I still didn't know the answers.

I remember Professors Tim Hitchcock (now at Sussex University, who eventually joined and still sits on the BL Labs Advisory Board) and Laurel Brake (now Professor Emerita of Literature and Print Culture, Birkbeck, University of London) being in the audience, together with staff from the Royal Library of the Netherlands, who 6 months later launched their own brilliant KB Lab. Subsequently, I became good colleagues with Lotte Wilms, who led their Lab for many years and is now Head of Research Support at Tilburg University.

My first gut feeling overall after the event was, this is going to be hard work. This feeling and reality remained a constant throughout my time at BL Labs.

In early May 2013, we launched the competition, which was a really quick and stressful turnaround as I had only officially started in mid March (one and a half months earlier). I remember worrying as to whether anyone would even enter! All the final entries were pretty much submitted a few minutes before the deadline. I remember being alone that evening on deadline day, near to midnight, waiting by my laptop and thinking: what happens if no one enters? It's going to be a disaster and I will lose my job. Luckily that didn't happen; in the end, we received 26 entries.

I am a firm believer that we can help make our own luck, but sometimes luck can be quite random! Perhaps BL Labs had a bit of both!

After that, I never really looked back! BL Labs developed its own kind of pattern and momentum each year:

  • hunting around the BL for digital collections to make into datasets and make available
  • helping to make more digital collections openly licensed
  • having hundreds of conversations with people interested in connecting with the BL's digital collections in the BL and outside
  • working with some people more intensively to carry out experiments
  • developing ideas further into prototype projects
  • telling the world of successes and failures in person, meetings, events and social media
  • launching a competition and awards in April or May
  • roadshows before and after with invitations to speak at events around the world
  • the summer working with competition winners
  • showcasing things from the year at the international symposium in late October/November
  • working on special projects
  • repeat!

The winners were announced in July 2013, and then we worked with them on their entries, showcasing them at our annual BL Labs Symposium in November, around 4 months later.

'Nothing interesting happens in the office' - Roadshows, Presentations, Workshops and Symposia!

One of the highlights of BL Labs was to go out to universities and other places to explain what the BL is and what BL Labs does.  This ended up with me pretty much seeing the world (North America, Europe, Asia, Australia, and giving virtual talks in South America and Africa).

My greatest challenge in BL Labs was always to get people to truly and passionately 'connect' with the BL's digital collections and data in order to come up with cool ideas of what to actually do with them. What I learned from my very first trip was that telling people what you have is great, they definitely need to know what you have! However, once you do that, the hard work really begins as you often need to guide and inspire many of them, help and support them to use the collections creatively and meaningfully. It was also important to understand the back story of the digital collection and learn about the institutional culture of the BL if people also wanted to work with BL colleagues.  For me and the researchers involved, inspirational engagement with digital collections required a lot of intellectual effort and emotional intelligence. Often this means asking the uncomfortable questions about research such as 'Why are we doing this?', 'What is the benefit to society in doing this?', 'Who cares?', 'How can computation help?' and 'Why is it necessary to even use computation?'.

Making those connections between people and data does feel like magic when it really works. It's incredibly exciting, suddenly everyone has goose bumps and is energised. This feeling, I will take away with me, it's the essence of my work at BL Labs!

A full list of over 200 presentations, roadshows, events and 9 annual symposia can be found here.

Competitions, Awards and Projects

Another significant way BL Labs has tried to connect people with data has been through Competitions (tell us what you would like to do, and we will choose an idea and work collaboratively with you on it to make it a reality), Awards (show us what you have already done) and Projects (collaborative working).

At the last count, we have supported and / or highlighted over 450 projects in research, artistic, entrepreneurial, educational, community based, activist and public categories, mostly through competitions, awards and project collaborations.

We also set up awards for British Library Staff, which have been a wonderful way to highlight the fantastic work our staff do with digital collections and give them the recognition they deserve. I have noticed over the years that the number of staff who have been working on digital projects has increased significantly. Sometimes this was with the help of BL Labs, but often it was because of the significant Digital Scholarship Training Programme, run by my Digital Curator colleagues in Digital Research to help staff understand that the BL isn't just about physical things but digital items too.

Browse through our project archive to get inspiration of the various projects BL Labs has been involved in or highlighted.

Putting the digital collections 'where the light is' - British Library platforms and others

When I started at BL Labs it was clear that we needed to make a fundamental decision about how we saw digital collections. Quite early on, we decided we should treat collections as data to harness the power of computational tools to work with each collection, especially for research purposes. Each collection should have a unique Digital Object Identifier (DOI) so researchers can cite them in publications.  Any new datasets generated from them will also have DOIs, allowing us to understand the ecosystem through DOIs of what happens to data when you get it out there for people to use.

In 2014, https://data.bl.uk was born and today, all our 153 datasets (as of 29/09/2021) are available through the British Library's research repository.

However, BL Labs has not stopped there! We always believed that it's important to put our digital collections where others are likely to discover them (we can't assume that researchers will want to come to BL platforms), 'where the light is' so to speak.  We were very open and able to put them on other platforms such as Flickr and Wikimedia Commons, not forgetting that we still needed to do the hard work to connect data to people after they have discovered them, if they needed that support.

Our greatest success by far was placing 1 million largely undescribed images that were digitally snipped from 65,000 digitised public domain books from the 19th Century on Flickr Commons in 2013. The number of images on the platform has grown since then by another 50 to 60 thousand from collections elsewhere in the BL. There has been significant interaction from the public to generate crowdsourced tags that make it easier to find specific images. The number of views has reached a staggering total of over 2 billion over this time. There has also been an incredible array of projects which have used the images, from artistic use to using machine learning and artificial intelligence to identify them. It's my favourite collection, probably because there are no restrictions in using it.

Read the most popular blog post the BL has ever published by my former BL Labs colleague, the brilliant and inspirational Ben O'Steen, a million first steps and the 'Mechanical Curator' which describes how we told the world why and how we had put 1 million images online for anyone to use freely.

It is wonderful to know that George Oates, the founder of Flickr Commons and still a BL Labs Advisory Board member, has been involved in the creation of the Flickr Foundation which was announced a few days ago! Long live Flickr Commons! We loved it because it also offered a computational way to access the collections, critical for powerful and efficient computational experiments, through its Application Programming Interface (API).
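As a rough illustration of what 'computational access' through an API can look like, the sketch below builds (but does not send) a request URL for Flickr's REST endpoint using the `flickr.people.getPublicPhotos` method. The API key and user ID shown are placeholders, not real values, and the helper function name is made up for this example; check the Flickr API documentation before relying on the parameters.

```python
from urllib.parse import urlencode

FLICKR_REST_ENDPOINT = "https://api.flickr.com/services/rest/"


def build_public_photos_url(api_key, user_id, per_page=100, page=1):
    """Build a URL asking Flickr for one page of an account's public photos.

    This helper is hypothetical (written for this example); it only
    assembles the URL and makes no network request.
    """
    params = {
        "method": "flickr.people.getPublicPhotos",
        "api_key": api_key,      # placeholder: you need your own key
        "user_id": user_id,      # placeholder: the account's Flickr ID
        "per_page": per_page,
        "page": page,
        "format": "json",
        "nojsoncallback": 1,     # ask for plain JSON rather than JSONP
    }
    return FLICKR_REST_ENDPOINT + "?" + urlencode(params)


url = build_public_photos_url("YOUR_API_KEY", "EXAMPLE_USER_ID")
print(url)
```

Stepping through the `page` parameter in a loop is what makes it practical to work through a very large collection programmatically instead of clicking through it by hand.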

More recently, we have experimented with browser based programming / computational environments - Jupyter Notebooks. We are huge fans of Tim Sherratt, who was a pioneer and brilliant advocate of Open GLAM in using them, especially through his GLAM Workbench. He is a one person Lab in his own right, and it was an honour to recognise his monumental efforts by giving him the BL Labs Research Award 2020 last year. You can also explore the fantastic work of Gustavo Candela and colleagues on Jupyter Notebooks and the ones my colleague Filipe Bento created.

Art Exhibitions, Creativity and Education

I am extremely proud to have been involved in enabling two major art exhibitions to happen at the BL, namely:

Crossroads of Curiosity by David Normal

Imaginary Cities by Michael Takeo Magruder

I loved working with artists, it's my passion! They are so creative and often not restricted by academic thinking; see the work of Mario Klingemann for example! You can browse through our archives for various artistic projects that used the BL's digital collections, it's inspiring.

I was also involved in the first British Library Fashion Student Competition, won by Alanna Hilton and held at the BL, which used the BL's Flickr Commons collection as inspiration for the students to design new fashion ranges. It was organised by my colleague Maja Maricevic, the British Fashion Colleges Council and Teatum Jones, who were great fun to work with. I am really pleased to say that Maja has gone from strength to strength working with the fashion industry and continues to run the competition to this day.

We also had some interesting projects working with younger people, such as Vittoria's world of stories and the fantastic work of Terhi Nurmikko-Fuller at the Australian National University. This is something I am very much interested in exploring further in the future, especially around ideas of computational thinking, and I have been trying out a few things.

GLAM Labs community and Booksprint

I am really proud of helping to create the international GLAM Labs community with over 250 members, established in 2018 and still active today. I affectionately call them the GLAM Labbers, and I often ask people to explore their inner 'Labber' when I give presentations. What is a Labber? It's the experimental and playful part of us we all had as children and that unfortunately many of us lose on becoming adults. It's the ability to be fearless, having the audacity and perhaps even naivety to try crazy things even if they are likely to fail! Unfortunately society values success more than it does failure. In my opinion, we need to recognise, respect and revere those who have the courage to try but fail. That courage to experiment should be honoured and embraced and should become the bedrock of our educational systems from the very outset.

Two years ago, many of us Labbers 'ate our own dog food' or 'practised what we preached' when 15 other colleagues and I came together for 5 days to produce a book through a booksprint, probably the most rewarding professional experience of my life. The book is about how to set up, maintain, sustain and even close a GLAM Lab and is called 'Open a GLAM Lab'. It is available as public domain content and I encourage you to read it.

Online drop-in goodbye - today!

I organised a 30 minute ‘online farewell drop-in’ on Wednesday 29 September 2021, 1330 BST (London), 1430 (Paris, Amsterdam), 2200 (Adelaide), 0830 (New York) on my very last day at the British Library. It was heart-warming that the session was 'maxed out' at one point with participants from all over the world. I honestly didn't expect over 100 colleagues to show up. I guess when you leave an organisation you get to find out who you actually made an impact on, who shows up, and who tells you, otherwise you may never know.

Those that know me well know that I would have much rather had a farewell do ‘in person’, over a pint and praying for the ‘chip god’ to deliver a huge portion of chips with salt/vinegar and tomato sauce magically and mysteriously to the table. The pub would have been McGlynn's (http://www.mcglynnsfreehouse.com/) near the British Library in London. I wonder who the chip god was? I never found out ;)

The answer to who the chip god was is in the white-on-white text following this sentence... you will be very shocked to know who it was!

Spoiler alert: it was me after all, my alter ego

Mahendra's online farewell to BL Labs, Wednesday 29 September, 1330 BST, 2021.
Left: Flowers and wine from the GLAM Labbers arrived in Tallinn, 20 mins before the meeting!
Right: Some of the participants of the online farewell

Leave a message of good will to see me off on my voyage!

It would be wonderful if you would like to leave me your good wishes, comments, memories, thoughts, scans of handwritten messages, pictures, photographs etc. on the following Google doc:

http://tiny.cc/mahendramahey

I will leave it open for a week or so after I have left. Reading positive, sincere, heartfelt messages from colleagues and collaborators over the years has already lifted my spirits. For me it provides evidence that I perhaps did actually make a difference to someone's life. I will definitely be re-reading them during the cold dark Baltic nights in Tallinn.

I would love to hear from you and find out what you are doing, or if you prefer, you can email me, the details are at the end of this post.

BL Labs Sailor and Captain Signing Off!

It's been a blast and lots of fun! Of course there is a tinge of sadness in leaving! For me, it's also been intellectually and emotionally challenging as well as exhausting, with many ‘highs’ and a few ‘lows’ or choppy waters, some professional and others personal.

I have learned so much about myself and there are so many things I am really really proud of. There are other things of course I wish I had done better. Most of all, I learned to embrace failure, my best teacher!

I think I did meet my original wish of wanting to help to open up the BL to as many new people as possible who perhaps would never have engaged with the Library before. That was either by using digital collections and data for cool projects and/or simply walking through the doors of the BL in London or Boston Spa, having a look around and being inspired to do something because of it.

I wish the person who takes over my position lots of success! My only piece of advice is if you care, you will be fine!

Anyhow, what a time this has been for us all on this planet! I have definitely struggled at times. I, like many others, have lost loved ones and thought deeply about life and its true meaning. I have also managed to find the courage to know what’s important and act accordingly, even if that has been a bit terrifying and difficult at times. Leaving the BL, for example, was not an easy decision for me, and I wish perhaps things had turned out differently, but I know I am doing the right thing for me, my future and my loved ones.

Though there have been a few dark times for me both professionally and personally, I hope you will be happy to know that I have also found peace and happiness too. I am in a really good place.

I would like to thank the alumni of BL Labs: Ben O'Steen, Technical Lead for BL Labs from 2013 to 2018, and Hana Lewis (2016-2018) and Eleanor Cooper (2018-2019), both BL Labs Project Officers, as well as the many other people I worked with through BL Labs, in the wider Library and outside it, on my journey.

Where am I off to and what am I doing?

My professional plans are 'evolving', but one thing is certain, I will be moving country!

To Estonia to be precise!

I plan to live, settle down with my family and work there. I was never a fan of Brexit, and this way I get to stay a European.

I would like to finish with this final sweet video created by writer and filmmaker Ling Low and her team in 2016, entitled 'Hey there Young Sailor', which they all made as volunteers for the Malaysian band, the 'Impatient Sisters'. It won the BL Labs Artistic Award in 2016. I had the pleasure and honour of meeting Ling over a lovely lunch in Kuala Lumpur, Malaysia, where I had also given a talk at the National Library about my work and looked for remnants of my grandfather, who had settled there many years ago.

I wish all of you well, and if you are interested in keeping in touch with me, working with me or just saying hello, you can contact me via my personal email address: mr.mahendra.mahey@gmail.com or follow my progress on my personal website.

Happy journeys through this short life to all of you!

Mahendra Mahey, former BL Labs Manager / Captain / Sailor signing off!

23 September 2021

Computing for Cultural Heritage: Trial Outcomes and Final Report

Six months ago, twenty members of staff from the British Library and The National Archives UK completed Computing for Cultural Heritage, a project that trialled Birkbeck University and Institute of Coding’s new PGCert, Applied Data Science. In this blog post we explore the necessity of this new course, the final report of this trial, and the lasting impact that this PGCert has made on some of the participants. 

 

 

Background 

Information professionals have been experiencing a massive shift to digital in the way collections are being donated, held and accessed. In the British Library’s digital collections there are e-books, maps, digitised newspapers, journal titles, sound recordings and over 500 terabytes of preserved data from the UK Web Archive. Yearly, the library sees 6 million catalogue searches by web users with almost 4 million items consulted online. This amounts to a vast amount of potential cultural heritage data available to researchers, and it requires complex digital workflows to curate, collect, manage, provide access, and help researchers computationally make sense of it all. 

Staff at collecting institutions like the British Library and the National Archives, UK are engaging in computationally driven projects like never before, but often without the benefit of data skills and computational thinking to support them. That is where a program like Computing for Cultural Heritage can help information professionals, allowing them to upskill and tackle issues – like building new digital systems and services, supporting collaborative, computational and data-driven research using digital collections and data, or deploying simple scripts to make everyday tasks easier – with confidence.  

Image of a laptop with the screen showing a bookshelf

 

Learning Aims 

The trial course was broken into two modules, a taught lesson on ‘Demystifying Computing with Python’ and a written ‘Industry Project’ on a software solution to a work-based problem.  A third module, Analytic Tools for Information Professionals, would be offered to participants outside of the trial as part of the full live course in order to earn their PGCert.

By the end of the trial, participants were able to: 

  • Demonstrate satisfactory knowledge of programming with Python. 
  • Understand techniques for Python data structures and algorithms. 
  • Work on case studies to apply data analytics using Python. 
  • Understand the programming paradigm of object-oriented programming. 
  • Use Python to apply the techniques learned on the module to real-world problems. 
  • Demonstrate the ability to develop an algorithm to carry out a specified task and to convert this into an executable program. 
  • Demonstrate the ability to debug a program. 
  • Understand the concepts of data security and general data protection regulations and standards. 
  • Develop a systematic understanding and critical awareness of a commonly agreed problem between the work environment and the academic supervisor in the area of computing. 
  • Develop a software solution for a work-based problem using the skills developed from the taught modules, for example develop software using the programming languages and software tools/libraries taught. 
  • Present a critical discussion on existing approaches in the particular problem area and position their own approach within that area and evaluate their contribution. 
  • Gain experience in communicating complex ideas/concepts and approaches/techniques to others by writing a comprehensive, self-contained report. 

The learning objectives were designed and delivered with the cultural heritage context in mind, and as such incorporated, for instance, examples and datasets from the British Library Music collections in the Python programming elements of the taught module. Additionally, there was a lecture focused on a British Library use case involving the design and implementation of a Database Management System. 

Following the completion of the trial, participants had the opportunity to complete their PGCert in Applied Data Science by attending the final module, Analytic Tools for Information Professionals, which was part of the official course launched last autumn. 

 

The Lasting Impact of Computing for Cultural Heritage 

Now that we’re six months on from the end of the trial, and the participants who opted in have earned their full PGCert, we followed up with some of the learners to hear about their experiences and the lasting effects of the course: 

“The third and final module of the computing for cultural heritage course was not only fascinating and enjoyable, it was also really pertinent to my job and I was immediately able to put the skills I learned into practice.  

The majority of the third module focussed on machine learning. We studied a number of different methods and one of these proved invaluable to the Agents of Enslavement research project I am currently leading. This project included a crowdsourcing task which asked the public to draw rectangles around four different types of newspaper advertisement. The purpose of the task was to use the coordinates of these rectangles to crop the images and create a dataset of adverts that can then be analysed for research purposes. To help ensure that no adverts were missed and to account for individual errors, each image was classified by five different people.  

One of my biggest technical challenges was to find a way of aggregating the rectangles drawn by five different people on a single page in order to calculate the rectangles of best fit. If each person only drew one rectangle, it was relatively easy for me to aggregate the results using the coding skills I had developed in the first two modules. I could simply find the average (or mean) of the five different classification attempts. But what if people identified several adverts and therefore drew multiple rectangles on a single page? For example, what if person one drew a rectangle around only one advert in the top left corner of the page; people two and three drew two rectangles on the same page, one in the top left and one in the top right; and people four and five drew rectangles around four adverts on the same page (one in each corner). How would I be able to create a piece of code that knew how to aggregate the coordinates of all the rectangles drawn in the top left and to separately aggregate the coordinates of all the rectangles drawn in the bottom right, and so on?  

One solution to this problem was to use an unsupervised machine learning method to cluster the coordinates before running the aggregation method. Much to my amazement, this worked perfectly and enabled me to successfully process the total of 92,218 rectangles that were drawn and create an aggregated dataset of more than 25,000 unique newspaper adverts.” 

-Graham Jevon, EAP Cataloguer; BL Endangered Archives Programme 
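
To make the idea Graham describes a little more concrete, here is a minimal Python sketch of "cluster first, then average". It is not the project's code. The quote does not name the unsupervised method used, so this sketch swaps in a simple greedy, distance-based grouping. The rectangle format, the max_distance threshold and the sample coordinates are all assumptions invented for illustration.

```python
from statistics import mean

# Assumed format: a rectangle is (x, y, width, height) in pixels.


def centre(rect):
    """Return the centre point of a rectangle."""
    x, y, w, h = rect
    return (x + w / 2, y + h / 2)


def cluster_rectangles(rects, max_distance):
    """Greedily group rectangles whose centres lie near a cluster's mean centre.

    This is a simple stand-in for an unsupervised clustering method,
    not the method used in the research project.
    """
    clusters = []
    for rect in rects:
        cx, cy = centre(rect)
        for cluster in clusters:
            mx = mean(centre(r)[0] for r in cluster)
            my = mean(centre(r)[1] for r in cluster)
            if ((cx - mx) ** 2 + (cy - my) ** 2) ** 0.5 <= max_distance:
                cluster.append(rect)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([rect])
    return clusters


def aggregate(cluster):
    """Average each coordinate across the rectangles in one cluster."""
    return tuple(mean(values) for values in zip(*cluster))


# Hypothetical drawings made by three volunteers on a single page.
drawings = [
    (10, 12, 100, 50),   # volunteer 1, top-left advert
    (12, 10, 98, 52),    # volunteer 2, top-left advert
    (400, 11, 90, 60),   # volunteer 2, top-right advert
    (11, 11, 102, 48),   # volunteer 3, top-left advert
    (402, 9, 94, 58),    # volunteer 3, top-right advert
]

adverts = [aggregate(c) for c in cluster_rectangles(drawings, max_distance=50)]
print(adverts)
```

With these invented coordinates the five drawings collapse into two averaged rectangles, one per advert. A real page would need a threshold tuned to the image size, and a dedicated clustering method would cope better with overlapping or unevenly sized adverts.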

 

“The final module of the course was in some ways the most challenging — requiring a lot of us to dust off the statistics and algebra parts of our brain. However, I think, it was also the most powerful; revealing how machine learning approaches can help us to uncover hidden knowledge and patterns in a huge variety of different areas.  

Completing the course during COVID meant that collection access was limited, so I ended up completing a case study examining how generic tropes have evolved in science fiction across time using a dataset extracted from GoodReads. This work proved to be exceptionally useful in helping me to think about how computers understand language differently; and how we can leverage their ability to make statistical inferences in order to support our own, qualitative analyses. 

In my own collection area, working with born digital archives in Contemporary Archives and Manuscripts, we treat draft material — of novels, poems or anything else — as very important to understanding the creative process. I am excited to apply some of these techniques — particularly Unsupervised Machine Learning — to examine the hidden relationships between draft material in some of our creative archives. 

The course has provided many, many avenues of potential enquiry like this and I’m excited to see the projects that its graduates undertake across the Library.” 

-Callum McKean, Lead Curator, Digital; Contemporary British Collection

 

"I really enjoyed the Analytics Tools for Data Science module. As a data science novice, I came to the course with limited theoretical knowledge of how data science tools could be applied to answer research questions. The choice of using real-life data to solve queries specific to professionals in the cultural heritage sector was really appreciated as it made everyday applications of the tools and code more tangible. I can see now how curators’ expertise and specialised knowledge could be combined with tools for data analysis to further understanding of and meaningful research in their own collection area."

-Giulia Carla Rossi, Curator, Digital Publications; Contemporary British Collection

 

Final Report 

The Computing for Cultural Heritage project concluded in February 2021 with a virtual panel session that highlighted the learners’ projects and allowed discussion of the course and feedback to the key project coordinators and contributors. Case studies of the participants’ projects, as well as links to other blog posts and project pages can be found on our Computing for Cultural Heritage Student Projects page. 

The final report highlights these projects as well as demographic statistics on the participants and feedback gained through an anonymous survey at the end of the trial. To evaluate the experience of the students on the PGCert, we composed a list of questions that would provide insight into various aspects of the course, including how learners fitted the coursework around their work commitments and how well they met the learning objectives. 

 

Why Computing for Cultural Heritage? 

Bar graph showing the results of the question 'Why did you choose to do this course' with the results discussed in the text below
Figure 1: Why did you choose to do this course? Results breakdown by topic and gender

When asked why the participants chose to take part in the course, we found that one of the most common answers was to develop methods for automating repetitive, manual tasks – such as generating unique identifiers for digital records and copying data between Excel spreadsheets – to free up more curatorial time for their digital collections. One participant said:  

“I wanted to learn more about coding and how to use it to analyse data, particularly data that I knew was rich and had value but had been stuck in multiple spreadsheets for quite some time.” 

There was also a desire to learn new skills, either for personal or professional development: 

“I believe in continuous professional development and knew that this would be an invaluable course to undertake for my career.”  

“I felt I was lagging behind and my job was getting static, and the feeling that I was behind [in digital] and I wanted to kind of catch up.” 

Bar graph showing the results to the question 'Did the course help you meet your aims?' with 14 answering yes, 1 answering no and 1 answering 'mixed'
Figure 2: 'Did the course help you meet your aims?' Results breakdown by answer and gender.

A follow-up question asked whether these goals and aims were met by the course. Happily, most participants indicated that they had been met, for reasons of increased confidence, help in developing new computational skills, and a deeper knowledge of information technology. 

 

What was the most enjoyed aspect of the course? 

Bar graph showing the results of the question 'What did you enjoy most about the course' with the results discussed in the text below
Figure 3: 'What did you enjoy most about the course?' Results breakdown by topic and gender

When broken down, the responses to ‘What did you enjoy most’ largely reflect the student experience, whether it was being in taught modules (4), getting hands-on experience (4), or being in a learning environment again (6). Participants also indicated that networking with peers was an enjoyable part of the experience: 

“Day out of work with like minded people made it really easy to stick with rather than just doing it online.”  

“Spending a day away from work and meeting the people I had never met at the NA, and also speaking to people from the BL about what they did.”  

“I enjoyed being a student again, learning a new skill amongst my peers, which week after week is a really valuable experience…” 

“Learning with colleagues and people working in similar fields was also a plus, as our interests often overlapped...” 

While only two responses named the project module as one of the most enjoyable components, it was useful to see how the course afforded learners the opportunity to apply their learning to a work-based problem, providing some benefit to their role, department or digital collection: 

“I really enjoyed being able to apply my learning to a real-world work-based project and to finally analyze some of the data that has been lying around the department for over a decade without any further analysis.”  

“The design and create aspect of the project. Applying what I learned to solving a genuine problem was the most enjoyable part - using Python and solving problems to achieve something tangible. This is where I really consolidated my learning.” 

 

What was the most challenging aspect of the course? 

Bar graph showing the results of the question 'What did you find the most challenging and why?' with the results discussed in the text below
Figure 4: 'What did you find the most challenging and why?' Results breakdown by topic and gender.

When discussing the most challenging aspect of the course, most of the learners focused on the practical Python lab sessions and the work-based project module. Interestingly, participants also stated that they were able to overcome the challenges through personal perseverance and the learning provided by the course itself: 

“I found the initial hurdle of learning how [to] code very challenging, but after the basics it became possible to become more creative and experimental.”  

“The work-based project was a huge challenge. We'd only really done 5 weeks of classes and, having never done anything like this before, it was hard to envisage an end product let alone how to put it together. But got there in the end!” 

While the majority of the cohort found the practical components of the PGCert trial most challenging, the feedback also suggested that the inclusion of the second module – which will be available as part of the full programme – will provide more opportunity to practise practical programming skills, such as working with software tools and APIs. 

 

The Effectiveness of Computing for Cultural Heritage 

Bar graph showing the results of the question 'Have you applied anything you have learnt?' with 2 results for 'Data analysis concepts', 12 results for 'Python coding' and 2 results for 'Nothing'
Figure 5: 'Have you applied anything you have learnt?' Results breakdown by topic and gender.

Participants were asked whether they had used any of the knowledge or skills acquired in the PGCert trial. Even after sitting just the first and third modules, participants responded that they were able to apply their learning to their current role in some form.  

“I now regularly use the software program I built as part of my day-to-day job. This program performs a task in a few seconds, which otherwise could take hours or days, and which is otherwise subject to human error. I have since adapted this so that it can also be used by a colleague in another department.”  

“Python helps me perform tasks that I previously did not know how to achieve. I have also led a couple of training sessions within the library, introducing Python to beginners (using the software I built in the project as a cultural heritage use case to frame the introduction).” 

“I changed [job] role at the end of the course so I think that helped me also in getting this promotion. And in this new role I have many more data analysis tasks to perform [quickly] for actions that would take months so yeah I managed to write that with a few scripts in my new role.” 

It was great to hear that the impacts of the trial were being felt so immediately by the participants, and that they were able to not only retain but also apply the new skills that they had gained.  

 This blog post was written by Deirdre Sullivan, Business Support Officer for Digital Scholarship Training Initiatives, part of the Digital Research and Curators Team. Special thanks to Nora McGregor, Digital Curator for the European and American Collection for support on the blog post and Martyn Harris, Institute of Coding Manager, for his work on the final report, as well as Giulia Rossi, Callum McKean and Graham Jevon for sharing their experiences.

National Libraries Now: Wikimedians Unite!

On Friday 17th September 2021, I was delighted to participate in a conference panel for the National Libraries Now Conference. I had worked to assemble a veritable dream team of Wikimedia and library talent, to talk about Wikimedia Residencies from a four-nation perspective. 

Joining me on the panel were Stella Wisdom (British Library), Jason Evans (National Library of Wales), Rebecca O’Neill (Wikimedia Community Ireland) and Ruth Small (Digital Productions Operator, National Library of Scotland). Stuart Prior (Programme Coordinator, Wikimedia UK) kindly agreed to be our chair. We pre-recorded presentations that were circulated to participants, so that our time on the 17th could be devoted to questions and discussion.

Going over my notes now, the best way to try to reflect the discussion is to look at some of the questions asked and the responses garnered. Please bear in mind that some remarks may be out of chronological order!

  • How do you think working with Wikimedia helps your institution’s strategic goals?

We reflected as a group on the move from WikiPedians in Residence to WikiMedians in residence [emphasis my own] and how this shows a shift in institutional thinking towards the potential of larger Wikimedia projects, and the use of platforms such as Commons, Wikisource and WikiBase.

Jason spoke about the way that lower onsite footfall at NLW, because of its physical location, enhances the importance of digital work and online outreach. He also spoke about training, promotion and contribution through Wikimedia platforms as being just as valuable as, if not more so than, the total number of views gained.

Image of National Library of Wales, Aberystwyth
It might not be digital, but it is a beauty! Ian Capper, via Wikimedia Commons.

 

The National Library of Scotland is in the heart of Edinburgh, so does not face the same issues with footfall. However, as Ruth pointed out, a key strategic goal of the Library is to reach people, and digitising is not the end of the road. Engagement with collections like the NLS Data Foundry is crucial, and the groundbreaking Scottish Chapbooks project run by the NLS was born out of the pandemic, showing a new imagining of institutional goals.

  • How do you incorporate Wikimedia work into your ‘normal’ work?

It was agreed that the inclusion of Wiki in job descriptions could help change at an institutional level, while Rebecca pointed out that the inclusion of Wiki activity as an outreach activity in funding applications is often a good way forward for inclusion of this work as part of major research projects. Again, advocacy and emphasis on the ease with which Wiki work can be undertaken was a key focal point, showing colleagues that their interests and our tools can align well.

  • How do you implement elements of quality control to what is ultimately crowdsourced work?

Jason suggested that we start to think about ‘context’ control: we can upload content and edit and amend details from the beginning; however, how we contextualise this material and the activity of Wiki engagement is crucial. There is a high level of quality in curation already, and often Wiki datasets will link back to other repositories such as Flickr or institutional catalogues.

The classic counterpoint of ‘anyone can edit’ and ‘everyone can edit’ came to the fore here: as was rightly pointed out, the early 00s impression of Wikipedia as a free-for-all is largely outdated. In fact, expectations are often inverted, as the enthusiastic and diligent Wiki community are quick to act upon misinformation or inaccuracies. We spoke about the beauty of the process in Wikimedia whereby information picks up value and enriched data along the way, an active evolution of resources.

Image of Wikipedia welcome page stating 'the free encyclopedia that anyone can edit'
The Wikipedia landing page: anyone can edit!

 

  • What about decolonisation and Wikimedia?

Decolonisation is a huge question for Wikimedia: movements around the world are examining what we can do to better serve the larger cause of anti-racist practice. For the British Library, I spoke about the work we have done on the India Office Records in offering a template for content warnings and working with the input of our colleagues to make this as robust a model as we can.

Rebecca’s experience of working in Ireland was incredibly insightful: she shared with us the experience of working with Irish material that is shaped by colonial ideas of what Ireland is, and how the culture has formed. Despite being a white, European, primarily English-speaking nation, the influence of colonialism is still felt.

The use of Wikimedia as a tool for breaking down barriers is vital, as each of our speakers illustrated. Jason spoke about the digital repatriation of items, and gave an example of the Red Book of Hergest, held by Jesus College Oxford (MS 111) and now available through Wikimedia Commons. Though this kind of action cannot always stand in place of physical repatriation, the move towards collaboration is notable and important.

 

An image of anti-Irish propaganda, featuring an Irish Frankenstein figure
'The Irish Frankenstein', a piece of anti-Irish propaganda from 1882. John Tenniel, Public domain, via Wikimedia Commons.

 

An hour was simply not enough! National Libraries Now was an incredibly important experience for me, at this point in my residency. I was particularly delighted with the dedication and enthusiasm of my co-panelists, and hope that we were able to shed some light on the Wikimedian-in-Residence role for those attending.

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian).

25 August 2021

Dabbling in DCMI

One of the best bits of working in digital scholarship is the variety of learning, training and knowledge exchange we can participate in. I have come to my post as a Wikimedian with a background in digital humanities and voluntary experience, and the opportunity to solidify my skills through training courses is really exciting.

Shortly after I started at the library, I had the chance to participate in the Library Juice Academy’s course ‘Introduction to Metadata’. Metadata has always fascinated me: as someone who can still remember when the internet was installed in their house, by means of numerous AOL compact discs, the way digital information has developed is something I have had direct experience of, even if I didn’t realise it.

Green and yellow CD with 1990s AOL branding.
Image of AOL CD, courtesy of archive.org.

Metadata, simply put, is data about data. It tells us information about a resource you might find in a library or museum: the author of a book, the composer of a song, the artist behind a painting. In analogue terms, this is like the title page in a novel. In digital terms, it sits alongside the content of the resource, in attached records or headers. In the Dublin Core Metadata Initiative format, one of the most common ways of expressing metadata, there are fifteen separate ‘elements’ you can apply to describe a resource, such as title, date, format and publisher.
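
For anyone who likes to see this in code, here is a minimal Python sketch of what a Dublin Core description might look like as a simple dictionary, with a check that every key is one of the fifteen elements. The element names come from the Dublin Core Metadata Element Set. The record itself and the unknown_elements helper are hypothetical, invented purely for illustration.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def unknown_elements(record):
    """Return any keys in the record that are not Dublin Core elements."""
    return set(record) - DC_ELEMENTS


# Hypothetical record: every value here is made up for the example.
record = {
    "title": "An Example Novel",
    "creator": "A. N. Author",
    "date": "1999",
    "format": "text/html",
    "publisher": "Example Press",
}

# An empty set means every key is a recognised element.
print(unknown_elements(record))

# 'author' is not a Dublin Core element (the element is 'creator'),
# so it is reported as unknown.
print(unknown_elements({**record, "author": "A. N. Author"}))
```

The fixed set of elements is what makes a check like this possible, which is a useful contrast with the open-ended properties of Wikidata.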

Wikidata houses an amazing amount of data, which is unusual as it is not bounded by a set number of ‘elements’. There are many different ways of describing the items on Wikidata, and many properties and statements can be added to each item. There have been initiatives to integrate Wikidata and metadata in a meaningful way, such as the WikiProject Source Metadata and WikiCite. I have certainly found it very useful to have a sound understanding of metadata and its function, in order to utilise Wikidata effectively.

Image of Wikicite logo, with birthday branding.
Wikicite 8th Birthday Logo by bleeptrack.

The Library Juice Academy course was asynchronous and highly useful. Over four weeks, we completed modules involving self-selected readings, discussion forum posts and video seminars. I particularly enjoyed the varied selection of readings: the group of participants came from a breadth of backgrounds and experiences, and the readings reflected this. The balance between theoretical reading and practical application was excellent, and I enjoyed getting to work with MarcEdit for the first time.

I completed the course in May 2021, and was delighted to receive my certificate by email. I have a much stronger handle on the professional standard of metadata in the GLAM sector and how this intersects with the potential of the vast array of data descriptors available in Wikidata. It was also a great opportunity to think about the room for nuance, subjectivity and bias in data. During Week One, we considered ‘Misinformation and Bias in Data Processing’ by Thornburg and Oskins. I said the following in our forum discussion:

“What I have taken from this piece is a real sense of the hard work that goes into the preparation of resources, and the many different forms bias can take, often inadvertently. It has made me think about and appreciate the difficult decisions that have to be made, and the processes that underlie these practices.”

Overall, participating in this course and expanding my skills into more traditional librarianship fields was fascinating, and left me eager to learn more about metadata and start working more closely with our collections and Wikidata.

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian).

12 August 2021

Dates to discuss Wikidata at Wikimania 2021

Wikimania is often the highlight of any Wikimedian’s calendar. Hosted by the Wikimedia Foundation, Wikimania is a conference like no other. A large number of participants take part in the annual celebration of open knowledge and Wikimedia projects. Previous events have taken place in Stockholm (2019), Cape Town (2018), Montreal (2017) and Italy (2016). Due to the ongoing global pandemic, this year's conference, being held 13-17 August 2021, is taking place entirely online, something Wikimania is ideally suited for!

Logo for Wikimania 2021: 4 squares, the 1st with a drawing of 12 people's faces as if they are in a video call, the 2nd of 2 jigsaw puzzle pieces, the 3rd of paper confetti and the 4th showing 2 people sitting at a table talking

In addition to more traditional conference sessions, Wikimania will be running an Unconference, a Community Village, and a community Hackathon. Communication is encouraged through a variety of channels including Telegram, IRC and Wiki talk pages.

Telegram machine
A photograph of an old telegraph key by Sandra Tan on Unsplash

Looking at the programme, so many interesting topics are on the table for presentation and discussion: from copyright reform, to innovation and community development, there’s a wide spectrum of material to interest all Wikimedians of every level. Handily, events are rated in terms of their suitability for beginners, to make things as welcoming as possible. There is a whole strand of presentations devoted to Wikidata, which you can view here.

I am very excited to be presenting remotely at this conference on behalf of the British Library. I will be introducing the work of Tom Derrick on the Bengali Books Wikisource Competition, and Dominic Kane (UCL) on the India Office Records project. We have shaped our panel to show what GLAM institutions can do to promote and effectively utilise Wiki platforms for public engagement with library and archive collections. Our panel will run on Sunday 15th of August at 8.15pm (7.15pm UTC).

Wikimania is free to attend online, 13-17 August 2021; registration is open until midnight on Thursday 12th August. We hope to see you there!

This post is by Wikimedian in Residence Lucy Hinnie (@BL_Wikimedian)
