Digital scholarship blog

Enabling innovative research with British Library digital collections

03 November 2016

Black Abolitionist Performances and their Presence in Britain - An update!

Posted by Hannah-Rose Murray, finalist in the BL Labs Competition 2016.

Reflecting on an incredible and interesting journey, it is remarkable how quickly the last five months have flown by! In May, I was chosen as one of the finalists for the British Library Labs Competition 2016, and my project has focused on black abolitionist performances and their presence in Britain during the nineteenth century. Black men and women had an impact in nearly every part of Great Britain, and it is no surprise to learn that their lectures were held in famous meeting halls, taverns, the houses of wealthy patrons, theatres, and churches across the country: we inevitably and unknowingly walk past sites with a rich history of Black Britain every day.

I was inspired to apply for this competition by last year’s winner, Katrina Navickas. Her project focused on the Chartist movement, and in particular using the nineteenth century digitised newspaper database to find locations of Chartist meetings around the country. Katrina and the Labs team wrote code to identify these meetings in the Chartist newspaper, and churned out hundreds of results that would have taken her years to search manually.

I wanted to do the same thing, but with black abolitionist speeches. However, there was an inherent problem: these abolitionists travelled to Britain between 1830 and 1900 and gave lectures in large cities and small towns, so their lectures were covered in numerous city and provincial newspapers. The scale of the project was perhaps one of the most difficult things we have had to deal with.

When searching the newspapers, one of the first things we found was that the OCR (Optical Character Recognition) is patchy at best. OCR is the process of turning scanned images into machine-readable text, and the quality of the output depends on many factors – from the quality of the scan itself, to the quality of the paper the newspaper was printed on, to whether it has been damaged or ‘muddied.’ If the OCR is unintelligible, the data will not be ‘read’ properly – hence there could be hundreds of references to Frederick Douglass that are not accessible or ‘readable’ to us through an electronic search (see the image below).

An excerpt from a newspaper article about a public meeting about slavery, from the Leamington Spa Courier, 20 February 1847

In order to ‘clean’ and sort through the ‘muddied’ OCR and the ‘clean’ OCR, we need to teach the computer what is ‘positive text’ (i.e., language that uses words such as ‘abolitionist’, ‘black’, ‘fugitive’ or ‘negro’) and ‘negative text’ (language that does not relate to abolition). For example, the image above shows an advert for one of Frederick Douglass’s lectures (Leamington Spa Courier, 20 February 1847). The key words in this particular advert that are likely to appear in other adverts, reports and commentaries are ‘Frederick Douglass’, ‘fugitive’, ‘slave’, ‘American’, and ‘slavery.’ I can search for this advert through the digitised database, but there are perhaps hundreds more waiting to be uncovered.

We found examples where the name ‘Frederick’ had been ‘read’ as ‘F!e83hrick’ or something similar. The image below shows some OCR from the Aberdeen Journal, 5 February 1851, from an article about “three fugitive slaves.” The term ‘Fugitive Slaves’ as a heading is completely illegible, as is William’s name before ‘Crafts.’ If I used a search engine to search for William Craft, it is unlikely this result would be highlighted because of the poor OCR.

OCR from the Aberdeen Journal, 5 February 1851, from an article about “three fugitive slaves.”
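
To illustrate why an exact search misses a garbled name, and how an approximate comparison might still catch it, here is a minimal sketch in Python. It is not the method used in the project: the function names, the sample tokens (apart from ‘F!e83hrick’, quoted above) and the 0.6 threshold are all assumptions made for illustration, using the standard library's difflib module.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_garbled(tokens, target, threshold=0.6):
    """Return the tokens whose spelling is close enough to `target`.

    The threshold is an assumed value and would need tuning on real OCR.
    """
    return [t for t in tokens if similarity(t, target) >= threshold]


# Hypothetical OCR output; only the garbled name comes from the post.
ocr_tokens = ["Mr.", "F!e83hrick", "Douglass", "addressed", "the", "meeting"]

exact_hits = [t for t in ocr_tokens if t == "Frederick"]   # exact search finds nothing
fuzzy_hits = find_garbled(ocr_tokens, "Frederick")         # approximate search
print(exact_hits, fuzzy_hits)
```

A threshold that is too low would start matching unrelated words, so in practice any such value would have to be checked by hand against known articles.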

I have spent several years transcribing black abolitionist speeches, and most of these will act as the ‘positive’ text. ‘Negative’ text can refer to other lectures that have a similar structure but do not relate to abolition specifically, for example prison reform meetings or meetings about church finances. This contrast should make the abolitionist language easier for the computer to pick out. We can then test the performance of this against some of the data we already have, and once the probabilities suggest we are on the right track, we can apply it to a larger data set.

All of this data is built into what is called a classifier, created by Ben O’Steen, Technical Lead of BL Labs. This classifier will read the OCR and collect newspaper references, but it works differently to a search engine because it measures words by weight and frequency. It also relies on probability: for example, if an article mentions ‘fugitive’ and ‘slave’ in the same section, the classifier assigns a higher probability that the article is discussing someone like Frederick Douglass or William Craft. A search engine, on the other hand, might match the words ‘fugitive’ and ‘slave’ even when they appear in different articles on the same page of a newspaper.
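
The general idea of weighing words by frequency and turning that into a probability can be sketched with a very small naive Bayes classifier. This is only an illustration of the technique and is not the BL Labs classifier, whose internals are not described here; the class, the method names and the four training sentences are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Tiny two-class multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"positive": Counter(), "negative": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Add one labelled example ('positive' or 'negative')."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def probability_positive(self, text):
        """Estimate the probability that `text` belongs to the positive class."""
        vocab = set(self.word_counts["positive"]) | set(self.word_counts["negative"])
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for word in tokenize(text):
                # Words seen often in this class raise the score; add-one
                # smoothing keeps unseen words from zeroing it out.
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            log_scores[label] = score
        # Convert the two log scores into a probability for "positive".
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        return exp["positive"] / sum(exp.values())


# Invented training sentences, standing in for transcribed speeches
# ('positive') and unrelated meeting reports ('negative').
clf = NaiveBayes()
clf.train("Frederick Douglass, a fugitive slave, will lecture on American slavery", "positive")
clf.train("William Craft, a fugitive slave from America, addressed the meeting", "positive")
clf.train("A public meeting was held to discuss prison reform", "negative")
clf.train("The vestry met to consider church finances", "negative")

p_abolition = clf.probability_positive("The fugitive slave spoke on slavery")
p_other = clf.probability_positive("The committee discussed church finances")
print(round(p_abolition, 2), round(p_other, 2))
```

With so few examples the numbers mean little, but the sketch shows why words such as ‘fugitive’ and ‘slave’ appearing together push an article towards the ‘positive’ side.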

We’re currently processing the results of the classifier, and adjusting it accordingly to try to reach a higher accuracy. This involves some degree of human effort while I double check the references to see whether the results actually contain an abolitionist speech. So far, we have had a few references to abolitionist speeches, but the classifier’s biggest difficulty is language. For example, there were hundreds of results from the 1830s and the 1860s. I instantly knew that many of these would be references to the Chartist movement, because the language the Chartists used would include words like ‘slavery’ when describing labour conditions, and they frequently compared these conditions to ‘negro slavery’ in the US. The large number of references from the 1860s highlights the renewed interest in American slavery because of the American Civil War, and there are thousands of articles discussing the Union, the Confederacy, slavery and the position of black people as fugitives or soldiers. Several times, the results focused on fugitive slaves in America and not in Britain.

Another result we had referred to a West Indian lion tamer in London! This is a fascinating story and part of the hidden history we see as a central part of the project, but it is obviously not an abolitionist speech. We are currently working on restricting our date range to the years 1845 to 1860 to start with, to avoid numerous mentions of the Chartists and the Civil War. This is one way in which we have had to be flexible with the initial proposal of the project.

Aside from the work on the classifier, we have also been working on numerous ways to improve the OCR – is it better to apply OCR correction software, is it more beneficial to completely re-OCR the collection, or is a combination of both best? We have sent some small samples to Overproof, a company based in Canberra, Australia, which specialises in OCR correction and has provided promising results. Obviously the results are on a small scale, but it’s been really interesting so far to see the improvements in today’s software compared to when some of these newspapers were originally scanned ten years ago. We have also sent the same sample to the IMPACT Centre of Competence in Digitisation, whose mission is to make the digitisation of historical printed text “better, faster, cheaper” and which provides tools, services and facilities to further advance the state of the art in the field of document imaging, language technology and the processing of historical text. Preliminary results will be presented at the Labs Symposium.

Updated website

Before I started working with the Library, I had designed a website at http://www.frederickdouglassinbritain.com. The structure was rudimentary and slightly awkward, dwarfed by the numerous pages I kept adding to it. As the project progressed, I wanted to improve the website at the same time, and with the invaluable help of Dr Mike Gardner from the University of Nottingham, I re-launched my website at the end of October. Initially, I had two maps, one showing the speaking locations of Frederick Douglass, and another map showing speaking locations by other black abolitionists such as William and Ellen Craft, William Wells Brown and Moses Roper (shown below).

Left map showing the speaking locations of Frederick Douglass. Right map showing speaking locations by other black abolitionists such as William and Ellen Craft, William Wells Brown and Moses Roper.

Working with Mike, we not only improved the aesthetics of the website and the maps (making them more professional) but also used clustering to highlight the areas where these men and women spoke the most. The old maps had one pin per location, so clustering avoided the ‘busy’ appearance of the first maps and allowed visitors to explore individual places and lectures more efficiently. Furthermore, on the black abolitionist speaking locations map (below right), a user can choose an individual and see only their lectures, or choose two or three in order to compare patterns in who gave these lectures and where they travelled.

The new map interface for my website.
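
For readers curious about what clustering map pins involves, here is a minimal sketch of one simple approach: grouping pins that fall into the same square of a grid and replacing them with a single marker carrying a count. This is an assumption about the general technique, not a description of how the website's maps are built; the function name, the cell size and the coordinates are all made up for illustration.

```python
import math
from collections import defaultdict


def cluster_pins(pins, cell_size=0.5):
    """Group (name, lat, lon) pins into square grid cells of `cell_size` degrees.

    Each cluster is reported at the average position of its members.
    """
    cells = defaultdict(list)
    for name, lat, lon in pins:
        key = (math.floor(lat / cell_size), math.floor(lon / cell_size))
        cells[key].append((name, lat, lon))

    clusters = []
    for members in cells.values():
        clusters.append({
            "lat": sum(m[1] for m in members) / len(members),
            "lon": sum(m[2] for m in members) / len(members),
            "count": len(members),
            "names": [m[0] for m in members],
        })
    return clusters


# Illustrative coordinates only, not the project's lecture data:
# three pins close together and one far away.
pins = [
    ("venue-1", 51.51, -0.13),
    ("venue-2", 51.52, -0.10),
    ("venue-3", 51.50, -0.12),
    ("venue-4", 57.15, -2.09),
]

clusters = cluster_pins(pins)
for c in clusters:
    print(c["count"], c["names"])
```

Interactive map libraries usually recompute such clusters at each zoom level, so that zooming in gradually separates a cluster back into individual pins.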

Events

I am very passionate about public engagement and regard it as an essential part of being an academic, since it is so important to engage and share with, and learn from, the public. We have created two events. As part of Black History Month, on the 6th October we had a performance here at the Library celebrating the lives of two formerly enslaved individuals named William and Ellen Craft. Joe Williams of Heritage Corner in Leeds – an actor and researcher who has performed as numerous people such as Frederick Douglass and the black circus entertainer Pablo Fanque – had been writing a play about the Crafts, and because it fitted so well with the project, we invited Joe and actress Martelle Edinborough, who played Ellen, to London for a performance. Both Joe and Martelle were incredible and it really brought the Crafts’ story and the project to life. We had a Q&A afterwards where everyone was very responsive and positive about the performance and the Crafts’ story of heroism and bravery.

(Left to Right) Martelle Edinborough, Hannah-Rose Murray and Joe Williams

The next event is a walking tour, taking place on Saturday 26 November. I’ve devised this tour around central London, highlighting six sites where black activists made an indelible mark on British society during the nineteenth century. It is a way of showing how we walk past these sites on a daily basis, and how we need to recognise the contributions of these individuals to British history.

Hopefully this project will inspire others to research and use digital scholarship to find more ‘hidden voices’ in the archive. In terms of black history specifically, people of colour were actors, sailors, boxers, students and authors as well as lecturers, and there is so much more to uncover about their contribution to British history. My personal journey with the Library and the Labs team has also been a rewarding experience. It has further convinced me that we need stronger networks of collaboration between scholars and computer scientists, and of the value of digital humanities in general. Academics could harness the power of technology to bring their research to life, an important and necessary tool for public engagement. I hope to continue working with the Labs team fine-tuning some of the results, as well as writing some pages about black abolitionists for the new website. I’m very grateful to the Library and the Labs team for their support, patience, and this amazing opportunity, as I’ve learned so much about digital humanities, and I see this project – with its combination of manual and technological methods – as a larger model for how we should move forward in the future. The project will shape my career in new and exciting ways, and the opportunity to work with one of the best libraries in the world has been a really gratifying experience.

I am really excited that I will be in London in a few days’ time to present my findings. Why don't you come and join us at the British Library Labs Symposium, between 0930 and 1730 on Monday 7th of November, 2016?
