Digital scholarship blog

20 posts categorized "Modern history"

21 July 2017

Russian Language Books Research Project by Nadya Miryanova

Finding digitised books in the Russian language in a collection of 65,000 books

Posted by Nadya Miryanova, BL Labs School Work Placement Student, currently studying at Lady Eleanor Holles, working with Mahendra Mahey, Manager of BL Labs.


Although there are 200 million items in the British Library, only 1-2% of them are digitised, contrary to popular belief. The ‘Microsoft’ books are 65,000 digitised volumes (about 22.5 million pages) which were published between 1789 and 1914 and digitised in partnership with Microsoft. They cover a wide range of subject areas, including philosophy, poetry and history, and they include Optically Character Recognised (OCR) text from the millions of pages.

In discussion with Mahendra Mahey, Project Manager of BL Labs, we explored making a ‘sub collection’ from this larger set which will hopefully be of use to the library in the future. At first, I simply brainstormed possible ideas for the project, and since 2017 marks a century since the Russian Revolution, I decided to do some research into the concept of ‘revolution’.


Definition - A forcible overthrow of a government or social order, in favour of a new system.

Etymology - From Late Latin ‘revolvere’, meaning to roll back, which became the Old French or Late Latin ‘revolutio’, from which our contemporary English word ‘revolution’ derives.

Revolutions date back as far as 2730 BC, when the Set rebellion broke out against the reign of the pharaoh Seth-Peribsen of the Second Dynasty of Egypt. The most recent happened only last year, in 2016, with the attempted coup d'état in Turkey.

About the Russian Revolution

The British Library has recently opened an exhibition that captures not only the events of this particularly intense period in history, but also the atmosphere that was omnipresent at the time. On my very first day here at the British Library, I got the chance to explore and study this fascinating exhibition in great depth.

The Russian Revolution was initiated by Lenin and the Bolsheviks, who hoped to create a socialist government, and in 1917, they successfully dismantled Tsarist autocracy in the hope of making society less stratified. The revolution resulted in the rise of the USSR and in the words of Karl Liebknecht, “The Russian revolution was to an unprecedented degree the cause of the proletariat of the whole world becoming more revolutionary”. However, this revolution also led to months of social and political turmoil and provoked the tragedy of the Russian Civil War on an unforeseeable scale, in which 10 million lives were lost. The revolution also produced myths that entered the artistic and intellectual fabric of the modern world, which the exhibition uncovers and investigates. Learn more about the Russian Revolution by booking your tickets for the Russian Revolution Exhibition at the British Library on the website.

Russian Revolution Poster
Russian Revolution Exhibition Poster at the British Library

As part of my research project, I also wanted to incorporate some of the other subjects that I had studied at GCSE, and so I thought this would be a brilliant opportunity to compare the Russian Revolution to the French Revolution, both French and Russian being subjects that I wish to study at A-level. The French Revolution was a period of far-reaching social and political upheaval in France that lasted from 1789 until 1799, and was partially carried forward by Napoleon during the later expansion of the French Empire.

Below is a mind-map I made detailing the differences and similarities between the French and Russian Revolutions.

Russian and French Revolution Research
French and Russian Revolution Comparison

Although my initial focus for the project was revolution, we soon established that it was too specific a topic and it would be more beneficial to focus on something broader, that would be useful to a larger group of researchers.

I soon discovered that the Russian titles within the digitised collection had never previously been separated and categorised and, being a native Russian speaker, I thought that this would be a better avenue to explore. This would be a project in commemoration of the 100th anniversary of the Russian Revolution, which would hopefully help researchers looking at books in the Russian language in the future.

Facts about the Russian Language

  • The most widely spoken native language in Europe.
  • 7th most spoken language in the world.
  • There are only 200,000 words in the Russian language in comparison to 1,000,000 in English.
  • The stress pattern in a word can drastically change its meaning, e.g.:
    • я плачу́ (emphasis on second syllable) - I pay.
    • я пла́чу (emphasis on first syllable) - I cry.


My first task was to examine a huge spreadsheet containing information about the 65,000 books in the collection.

  • In order to make this task a little less daunting, I first used the ‘Filter’ function in the language column of my Excel spreadsheet, and selected the Russian language. As a result, I found 583 books in total that were written in the Russian Language.
  • I now had to think of a way to organise these books. The possibilities seemed endless: should I sort them into history books? Science books? Books about Russia?
  • In the end, I decided to establish two broad categories, fiction and non-fiction, as this seemed like a logical place to start.
  • In order to access a Russian keyboard, I used a website that turns normal Latin letters into Cyrillic.
  • I typed in a Russian word, using the English keyboard, that related to one of my two categories, e.g. for non-fiction, I wanted to find history related books, so used the simple word ‘history’, which translates as история.
  • I then copied this word, and pasted it into my spreadsheet.
  • I used the filter function on the 'Titles' section, and this would hopefully produce a number of books that included the word history in their title.
Spread Sheet Screenshot
Screenshot of my spread sheet.
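The filtering steps above were carried out with Excel's ‘Filter’ function, but the same idea can be sketched in a few lines of Python. This is only an illustrative sketch: the column names `language` and `title`, the language code `rus` and the sample rows are all assumptions made up for the example, not the real spreadsheet.

```python
import csv
import io

# Hypothetical sample of the spreadsheet; real column names and codes may differ.
SAMPLE_CSV = """title,language
История России,rus
A History of England,eng
Стихотворения,rus
"""

def filter_titles(rows, language, keyword):
    """Return the rows in one language whose title contains the keyword."""
    keyword = keyword.casefold()
    return [
        row for row in rows
        if row["language"] == language and keyword in row["title"].casefold()
    ]

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Step 1: keep only the Russian-language books.
russian = [row for row in rows if row["language"] == "rus"]
print(len(russian), "Russian titles in the sample")

# Step 2: look for a keyword in the titles of those books.
for row in filter_titles(rows, "rus", "история"):
    print(row["title"])
```

Lower-casing both the keyword and the title with `casefold()` means the search is not tripped up by capital letters at the start of a title.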


In this project, I found that I had to overcome a number of difficulties.

  • In Russian, nouns can have up to 12 inflections and adjectives can have as many as 16, so I had to look up several different forms of the same word.
  • As I said earlier, I first experimented with simple words, such as history. You would think that there would definitely be books relating to history lurking somewhere in a collection of nearly 600 Russian titles. However, when I conducted my search, the spreadsheet returned no results. Confused, I tried another simple word, and once again had no definitive results.

Scanning more closely through the list of books, I soon noticed that there were certain spellings and letters that I did not recognise. I decided to research this matter more closely, looking at the history of the Russian language, and found out that the Russian of the 19th century does not directly resemble the Russian language used today. Why? Because of the Russian Revolution, of course.

1918 Spelling Reform Research
Bolshevik Spelling Reform of 1918 Research, detailing the causes for the reform and the changes made to the Russian language

Suddenly, everything made a lot more sense.

This discovery meant that I had to change my approach a little. Rather than typing in Russian words in the modern spelling that I knew, I had to hunt through the spreadsheet for words that appeared in the titles of several books and could therefore define a category. In a way, this made the project even more interesting, even though it took longer.
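One way to bridge the old and new spellings, which I did not use in this project, would be to convert pre-reform titles into modern spelling before searching. The sketch below is a simplification that only covers the letters abolished by the 1918 reform (ѣ, і, ѳ, ѵ) and the hard sign at the end of a word; it ignores other changes such as adjective endings, and the function name `modernise` is just an illustration.

```python
import re

# Letters removed by the 1918 reform, mapped to their modern replacements.
OLD_TO_NEW = str.maketrans({
    "\u0463": "\u0435",  # ѣ (yat) -> е
    "\u0462": "\u0415",  # Ѣ -> Е
    "\u0456": "\u0438",  # і (decimal i) -> и
    "\u0406": "\u0418",  # І -> И
    "\u0473": "\u0444",  # ѳ (fita) -> ф
    "\u0472": "\u0424",  # Ѳ -> Ф
    "\u0475": "\u0438",  # ѵ (izhitsa) -> и
    "\u0474": "\u0418",  # Ѵ -> И
})

# A hard sign (ъ) at the very end of a word was dropped by the reform.
FINAL_HARD_SIGN = re.compile(r"[\u044a\u042a]\b")

def modernise(title: str) -> str:
    """Approximate the modern spelling of a pre-1918 Russian title."""
    title = title.translate(OLD_TO_NEW)
    return FINAL_HARD_SIGN.sub("", title)

# A title from the digitised collection, written in the old spelling.
print(modernise("По Сѣверо-Западу Россіи"))
```

A hard sign in the middle of a word is left alone, because it is still used in modern Russian.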

As I mentioned in my previous blog, the majority of the Russian language books were actually non-fiction. As a result, I decided to create sub-categories for the non-fiction set, which can be seen in the speech-bubble I created below.

Non-fiction categories
Speech bubble containing non-fiction categories

To help me in this task, I decided to create a colour-coding system for classification, so that I could keep track of my progress.

  • Yellow = classified
  • Purple = латиница (Latin letters): quite often I found titles which were written in Russian but using Latin letters. Purple was also used for titles written in another language
  • Blue = unknown classification
  • Orange = near classification
Colour coding system
Screenshot of my spread sheet showing the colour coding system that I used.
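I picked out the ‘purple’ titles by eye, but titles written in Latin letters could in principle be flagged automatically. The following is a rough sketch of that idea rather than something I actually ran; the function name `script_of` and the transliterated sample title are made up for illustration.

```python
import unicodedata

def script_of(title: str) -> str:
    """Classify a title as 'cyrillic', 'latin', 'mixed' or 'none' by its letters."""
    scripts = set()
    for ch in title:
        if not ch.isalpha():
            continue  # skip digits, spaces and punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("CYRILLIC"):
            scripts.add("cyrillic")
        elif name.startswith("LATIN"):
            scripts.add("latin")
    if len(scripts) == 2:
        return "mixed"
    if scripts:
        return scripts.pop()
    return "none"

# A hypothetical Russian title written in Latin letters would be flagged 'latin'.
print(script_of("Istoriia Rossii"))
```

This cannot tell transliterated Russian apart from a title in another language, which is why both ended up purple in my colour coding.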


In conclusion, I managed to categorise the Russian language books into two broad categories, fiction and non-fiction, and I created 25 sub-collections within the non-fiction category. This project has been extremely enjoyable to work on, and although there were many challenges involved in the process, I have learnt lots during my research journey. In order to improve this project, I would definitely say that more work needs to be done on splitting up the 'history' sub-collection of my non-fiction category, since it is very broad and covers political accounts, as well as books about Russian History. Additionally, I think that this project would also considerably benefit from undergoing a thorough check with curators, in order to help classify some of the books I have not organised into separate collections yet.

Picture from Russian Book
An illustration from one of the Russian books, По Сѣверо-Западу Россіи, available in the digitised collections. Image can be accessed on British Library Flickr Commons.



21 December 2016

Mobius programme – on the beach of learning

This guest post is by Virve Miettinen, who spent four months with various teams at the British Library.

Every morning there’s a 100-metre queue in front of the British Library. It seems to say a lot about the unashamed nerdiness and love of learning in this city. Usually all the queuers have already put the things they might need in the Reading Room in a clear plastic bag, so they can head straight down to the lockers, stow away their coats, handbags and laptop cases and secure a place on the beach of learning.

Virve Miettinen

The Mobius fellowship programme, organised by the Finnish Institute in London, enables mobility for visual arts, museum, library and archives professionals, and customised working periods as part of the host organisation’s staff, in my case the British Library. The programme is a great opportunity to break away from daily routines, to think about one’s professional identity, find fresh ideas, compare the practices and methods between two countries, share knowledge and build meaningful networks.

Learn, relearn and unlearn from each other

Learning isn’t a destination, it’s a never-ending road of discovery, challenge, inspiration and wonder. Each learning moment builds character, shapes thoughts, guides futures. But what makes us learn? For me the answer is other people, and during the Mobius Fellowship I’ve been blessed with the chance to work with talented people willing to share their knowledge at the British Library.

I’ve familiarised myself with the British Library Learning Team, which is responsible for the library’s engagement with all kinds of learners. The Learning Team offers workshops, activities and resources for schools, teachers and learners of all ages.

I’ve been following the work of the Digital Scholarship team and BL Labs project to learn more about the incredible digital collections the library has to offer, and how to open them up for the public through various activities such as competitions, events and projects.

I’ve worked with the Knowledge Quarter, a network of now 76 partners within a one-mile radius of King’s Cross who actively create and disseminate knowledge. Partners include over 49 academic, cultural, research, scientific and media organisations large and small: from the British Library and University of the Arts London to the School of Life, Connected Digital Economy Catapult, Francis Crick Institute and Google.

I’ve assisted the Library’s Community Engagement Manager Emma Morgan. She has been working as a community engagement manager for six months now and the aim of her work is to create meaningful, long-lasting, mutually beneficial relationships with the surrounding community, i.e. residents, networks and organisations.

Inside the British Library

I’ve observed the library’s marketing and communications unit in action, and learned for example how they measure and research the customer experience, i.e. who visits and uses the BL, what they think of their experience and how the BL might improve it.


I’ve got many 'mental souvenirs' to take back home with me. If they interest you, read more on my Mobius blog.

100 digital stories about Finnish-British relations

As part of the Mobius programme I’ve been working on a co-operative project between the British Library, the National Library in Finland, the Finnish National Archives, The Finnish Institute in London and the Finnish Embassy. In the last three decades, contacts between Finland and the UK, two relatively distant nations, have multiplied. At the same time, the network of cultural relationships has tightened into a seamless 'love-story', something that would not have been easy to predict just 50 years ago. In the coming year of 2017 the Finnish Institute celebrates the centennial anniversary of Finland’s independence by telling the story of two nations: the aim is to make the history, the interaction and the links between these two countries tangible and visible.

We are collaborating to create a digital gallery open to all, which offers its visitors carefully curated pieces of the shared history of the two countries and their political, cultural and economic relations. It will offer new information on the relations and influences between the two countries. It consists of digitised historical materials, like letters, news, cards, photographs, tickets and maps. The British Library and other partners will select 100 digitised items to create the basis of the gallery.

The gallery will be expanded further through co-creation. In the spirit of the theme of Finland’s centenary 'together', the gallery is open to all and easily accessible. With the call 'Wanted – make your own heritage' we invite people to share their own stories and interpretations, and record history through them. The gallery feeds curiosity, creates interaction and engages users to share their own memories relating to Finnish-British experiences. The users are invited to interpret recent history from a personal point of view.

The work continues after my Mobius-period and the gallery will open in September 2017. Join us and share your memories. Be frank, withdrawn, furious, imaginative, witty or sad. Through your story you create history.

P.S. The British Library Reading Room is actually far from The Beach of Learning; it’s more like The Coolest Place To Be. I found myself freezing in the air-conditioned Rare Books Reading Room despite wearing my leather jacket and an extra pair of leggings.

Virve Miettinen works at Helsinki City Library / Central Library as a participation planner. Her job is to engage citizens and partners to design the library of the future. For Helsinki City Library, co-operative planning and service design means designing the premises and services together with the library users while taking advantage of user-centric methods. Her interests involve co-design, service design, community engagement and community-led city development. At the moment she is also working on her PhD under the title 'Co-creative practices in library services'.

12 August 2016

Black Abolitionist Performances and their Presence in Britain

Posted by Mahendra Mahey on behalf of Hannah-Rose Murray, finalist of the BL Labs Competition 2016

Overview of the project

The Black Abolitionist project focuses on African American lives, experiences and lectures in Britain between 1830 and 1895. It builds on my PhD project, which I am currently studying for at the Department of American and Canadian Studies, University of Nottingham. Working with the British Library has already proved a fortunate and enriching opportunity, and by harnessing the power of technology, we want to work together to search through thousands of newspapers to find abolitionist speeches, a process that would take years by hand. By reading black abolitionist speeches in the Nineteenth Century Newspaper Collection (and using the Flickr collection to illustrate), we can get a sense of their performances and how their lectures reached nearly every corner of Britain. Newspapers can also provide us with the locations of these meetings, and for the first time, I have mapped these locations to gather an estimate of how many lectures black abolitionists gave in Britain and to allow their hidden voices to be heard. I am updating my website to reflect this project.

These are the maps I have so far: the map (below left) chronicles the lectures of Frederick Douglass, and the second one (on the right) represents the lectures given by other black abolitionists such as Josiah Henson, Sarah Remond, Moses Roper, William Wells Brown, Henry ‘Box’ Brown, Ida B. Wells, James Watkins and William and Ellen Craft (to name a few).

African Americans visited Britain for a variety of reasons. Many came to publish slave narratives, teach Britons about slavery and look for their support in the abolitionist cause. Others came to live in Britain safely, away from the ever-watchful eyes of slave-catchers, while several wanted to raise money to purchase family members from the jaws of slavery. 

Black abolitionists made their mark in nearly every part of Great Britain, and it is no surprise to learn that they had a strong impact on London too. Lectures were held in famous meeting halls, taverns, the houses of wealthy patrons, theatres, and churches across London: we inevitably and unknowingly walk past sites with a rich history of Black Britain every day.

When searching the newspapers, what we have found so far is that the OCR (Optical Character Recognition) is patchy at best. OCR refers to scanned images that have been turned into machine-readable text, and the quality of the OCR can depend on many factors – from the quality of the scan itself, to the quality of the paper the newspaper was printed on, to whether it has been damaged or ‘muddied.’ If the OCR is unintelligible, the data will not be ‘read’ properly – hence there could be hundreds of references to Frederick Douglass that are not accessible or ‘readable’ to us through an electronic search (see the image below).


In order to clean and sort through the ‘muddied’ OCR and the ‘clean’ OCR, we need to teach the computer what is ‘positive text’ (i.e., language that uses the word ‘abolitionist’, ‘black’, ‘fugitive’, ‘negro’) and ‘negative text’ (language that does not relate to abolition). For example, the image to the left shows an advert for one of Frederick Douglass’s lectures (Leamington Spa Courier, 20 February 1847). The key words in this particular advert that are likely to appear in other adverts, reports and commentaries are ‘Frederick Douglass’, ‘fugitive’, ‘slave’, ‘American’, and ‘slavery.’ I can search for this advert through the digitized database, but there are perhaps hundreds more waiting to be uncovered.

I have spent several years transcribing many of Frederick Douglass’ speeches and most of this will act as the ‘positive’ text. ‘Negative’ text can refer to other lectures that have a similar structure but do not relate to abolition specifically, for example prison reform meetings or meetings about church finances. Training on both kinds of text should make the abolitionist language easier for the computer to pick out. We can then test the performance of this against some of the data we already have, and once the probabilities suggest we are on the right track, we can apply it to a larger data set.
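To give a flavour of what ‘teaching the computer’ could look like, here is a minimal sketch of one common technique, a naive Bayes text classifier. It is an assumption that something along these lines would be used; the project's actual method is not described here, and the four training sentences below are invented for illustration rather than taken from the newspapers.

```python
import math
from collections import Counter

LABELS = ("positive", "negative")

def tokenize(text):
    return text.lower().split()

class NaiveBayes:
    """Tiny word-count classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text, label):
        """Log-probability of the label for this text, up to a constant."""
        counts = self.word_counts[label]
        vocab = set().union(*self.word_counts.values())
        total = sum(counts.values())
        log_prob = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            # Add-one smoothing so an unseen word does not zero the score.
            log_prob += math.log((counts[word] + 1) / (total + len(vocab)))
        return log_prob

    def classify(self, text):
        return max(LABELS, key=lambda label: self.score(text, label))

model = NaiveBayes()
# Invented examples standing in for transcribed speeches and unrelated meetings.
model.train("frederick douglass the fugitive slave will lecture on american slavery", "positive")
model.train("lecture by a fugitive slave on the abolition of slavery", "positive")
model.train("meeting on prison reform held at the town hall", "negative")
model.train("the church finance committee will hold a meeting", "negative")

print(model.classify("a fugitive slave will lecture on slavery"))
```

Because the classifier works on whole words, badly garbled OCR would still be missed, so cleaning the text remains a separate step.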

The prospect of uncovering hidden speeches by African Americans is incredibly exciting, and hopefully this will add to our knowledge of the black presence in Britain: we can use these extensive sources to build a more complete picture of Victorian London in particular.


11 July 2016

Finding digitised books and images about Finland in a collection of 65,000 books

Posted by Ruby Dixon, currently a student at Graveney School and on work-experience at BL Labs.


The ‘Microsoft’ books are 65,000 digitised volumes - about 22.5 million pages - which were published between 1789 and 1914; they were digitised in partnership with Microsoft. They cover a wide range of subject areas including philosophy, poetry, history and literature and they include Optically Character Recognised (OCR) text from the millions of pages.

In discussion with Mahendra Mahey, Project Manager of BL Labs, we explored making a ‘sub collection’ from this larger set which will hopefully help researchers in the future. After thinking about making a collection of ‘works of fiction’, ‘bibles’ or titles about ‘slavery’ I decided that identifying a collection of books about Finland would be the most interesting and realistic thing to do as part of my mini-project at the Library.

The collection I am creating will hopefully help a project that the Library might be working on which celebrates the 100th year of independence of Finland in 2017.

Facts about Finland

When starting this mini-project, I thought it would be wise to do some background research about Finland. I thought this would be a great way to put my GCSEs in Geography and History to use. Knowing more about the history and geography of Finland would help me in my ‘detective’ hunt through the collection of books. I would learn about important keywords I might need to use to help me identify relevant books in the digitised collection.

Here are some useful facts that you may not know about Finland:

  • Finland became an autonomous part of Russia on 29 March 1809.
  • Finland declared independence on 6 December 1917.
  • Finland joined the European Union on 1 January 1995.

These and more facts can be accessed online:

A map showing Finland, taken from Wikipedia.

This suggested that there may in fact be several books in the collection in the Russian language that cover Finland, given that Finland was an autonomous part of Russia from 1809. Looking at the map of Finland, I also realised that bordering countries would most likely have books about Finland as well.


Analysing the collection spreadsheet 

A screenshot of a section of the spreadsheet containing 65,000 records of digitised books in the ‘Microsoft Books’ collection.

My first task was to examine the huge spreadsheet containing information about the 65,000 books in the collection.

There were several lines of ‘attack’ we could take in finding information about Finland in this collection, some of which involve using the ‘Filter’ function in Excel.

Screenshot from the Microsoft Books spreadsheet: 1. The 'Filter' function in Excel. 2. The filter has been applied on the language code for Finnish, ‘fin’.

We came up with the following strategy:

  1. Find words relating to 'Finland' in the Title field in the spreadsheet for the books.
  2. This task would have to be done in several languages as there are 28 languages listed in the language code field (column C). I decided I would prioritise English and languages of bordering nations around Finland and if I had time would look at the other languages too.
  3. I knew I would have to use Google Translate to find equivalent words in each language relating to Finland to help me with filtering.

In terms of thinking of what words I might use for the filtering, Mahendra suggested that it might be useful to create a word cloud about all things 'Finnish'; this might help me decide which words were the most important and to use first in filtering.

Here is the word cloud I made using the Wikipedia page about Finland:

Word cloud created using Tagul, based on the Wikipedia page in English about Finland.

From this, we decided to use the following words (the number of words was limited due to time): Finland, Finnish, Helsinki and Finn.
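A word cloud sizes each word by how often it appears, so the same ranking can be approximated by simply counting words. The sketch below is not how the word cloud tool works internally; the sample text and the short stop-word list are made up for illustration.

```python
import re
from collections import Counter

# A tiny, hand-picked stop-word list; a real one would be much longer.
STOPWORDS = {"the", "of", "and", "is", "in", "a", "to", "are"}

def top_words(text, n=5):
    """Return the n most frequent words, ignoring case and stop words."""
    # Note: this simple pattern only keeps the letters a-z.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(word for word in words if word not in STOPWORDS)
    return counts.most_common(n)

sample = (
    "Finland is a Nordic country. The capital of Finland is Helsinki. "
    "Finnish and Swedish are the official languages of Finland."
)
print(top_words(sample, 3))
```

The words at the top of the list are natural candidates for the first filters to try.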

We also filtered using Danish, Swedish, German, English, Finnish and Russian languages and using related words about Finland in those languages.

Below is a summary table showing the number of books we found by applying a filter to the 'Title' field in the spreadsheet about words related to 'Finland'.

Table 1. The table above shows the number of books I found using various filters in the digitised collection.
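The strategy of filtering the 'Title' field with different keywords for each language could be sketched like this. The English keywords are the ones listed above and ‘fin’ is the language code seen in the spreadsheet; the other codes, the non-English keyword lists and the sample titles are assumptions made up for the example.

```python
# Keywords to look for in titles, per language code (partly hypothetical).
KEYWORDS = {
    "eng": ["Finland", "Finnish", "Helsinki", "Finn"],
    "swe": ["Finland", "Helsingfors"],
    "fin": ["Suomi", "Suomen"],
}

def count_matches(books, keywords):
    """Count, per language code, the titles containing any keyword for that language."""
    totals = {}
    for language, title in books:
        words = keywords.get(language, [])
        if any(word.casefold() in title.casefold() for word in words):
            totals[language] = totals.get(language, 0) + 1
    return totals

# Invented (language code, title) pairs standing in for spreadsheet rows.
books = [
    ("eng", "Travels in Finland and Lapland"),
    ("eng", "A History of England"),
    ("swe", "Minnen från Helsingfors"),
    ("fin", "Suomen historia"),
]
print(count_matches(books, KEYWORDS))
```

Keeping the keywords in one table per language makes it easy to add more words, or more languages, later on.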

Please note that I didn’t have time to look further into the books we found in some of the non-English languages, as I am not a native speaker of any of them. More time would be needed to filter this collection. The spreadsheet is available here.

What is interesting, however, is that we know there are 582 books in the collection in the Russian language, details of which I sent to Katya Rogatchevskaia, Lead Curator of East European Collections. 

Images in the books about Finland

I learned how the images from the 'Microsoft' books were extracted and placed on The British Library’s Flickr page. This slide from a BL Labs presentation nicely summarises how it all happened: 

Flickr process pic

Taken from the BL Labs Slideshare account.

More information is available from a blog post written by Ben O’Steen, Technical Lead of BL Labs, which explains this process in much more detail.

What I realised was that there must be images identified in these books which relate to Finland. Mahendra suggested that I first look at some work done by the Wikimedia community on trying to find maps within these images.

Wikimedia commons synoptic index

The Wikimedia Commons Synoptic Index for the Mechanical Curator images contains a really handy breakdown of the images by geographical place.

Wikimedia pic

Image taken from British Library/Mechanical Curator collection/Synoptic index, Europe.

From this, I was able to find that there were 12 books that had been identified as having images which had something to do with Finland in them.

Image taken from Wikimedia Commons page.

This was a great way to start, but now I thought I would try the British Library’s Flickr Commons site to see if there were more images about Finland that had been tagged with Finland-related words.

British Library Flickr Commons

As of 07/07/16 there are 1,023,705 images on the British Library’s Flickr Commons page; a large proportion of these come from images snipped out of the digitised books that I have been working on.

The site has had an incredible 400,000,000 plus views and users have tagged over 100,000 images with around 500,000 tags. I am really looking forward to seeing what the winners of the Labs Competition 2016 will do on their SherlockNet project, as they are hoping to tag all the images using computer code!

For now, I wanted to use the tags already there to see if I could find images relating to Finland.

Here is an example image which has several tags added, some of which relate to Finland:

  Image from Flickr 1 Flickr tags pic
Tags added to an example image on the British Library Flickr Commons page.

Here you can see tags such as ‘Finland’, ‘Suomi’ (Finnish for ‘Finland’), ‘Helsinki’, ‘Helsingfors’ (Swedish for ‘Helsinki’) etc. which have been added by Flickr users (grey tags). Please note that tags in white are those added automatically by Flickr itself.

I have summarised the images I have found on the British Library’s Flickr Commons collection below:

 Keyword(s) used and link to BL Flickr Commons   Number of images found 
Finland 917
Helsinki 18
Suomi 3
Suomen 418
Suomalaiset 15
Finns 42
Finnish 352
Gulf of Finland 43
Kulturbilder ur Finlands historie 1
Turku 3
Pori 4
Tampere 1
Kuopio 2
Hanko 177
Lapland 148
Suomenlinna 2
Kemi 1
Total 2147

 Table showing links and number of British Library Flickr Commons images about Finland

What is clear from this initial research is that there are definitely more books with images about Finland than the 12 identified through Wikimedia Commons. Much more work will be needed on this. Also, I would recommend that all the images that I have found be downloaded so that they may be used for the Finnish 100 year independence project.

In conclusion, I have enjoyed being able to participate in this project and have loved getting involved in some work on it. Although it has been relatively challenging, this new experience has been very interesting and I have definitely enjoyed spending my time on it. On the other hand, I would say that more time is certainly needed on this project to find more books in the 65,000 collection as I have only had a limited amount of time to spend on it. Furthermore, I would recommend that more words relating to Finland should be found and used in several languages to filter the master spreadsheet, in order to add more books to the Finnish collection. Lastly, one other thing that could be done to develop this project even further is to work with the curators of other languages to help identify Finland-related books.

If you would like to find more sub collections in the Microsoft books collection, please email, they would love to hear from you!

Tomorrow I will blog about my work experience at the library.



15 April 2016

The Georgian Pingbacks Project

Posted by Mahendra Mahey, Manager of BL Labs on behalf of Dr. Melodee Beals, Lecturer in Digital History, Department of Politics, History and International Relations, Loughborough University.

Georgian Pingbacks

In the wild west of the World Wide Web, if you compose a hilarious joke, provide a simple solution to a complex problem or break a major new story, it is almost certain that your work will be copied. Although intellectual property laws exist, they are inconsistently enforced because of the sheer number of sites where reposting occurs - a number that increases with each passing second. If you are lucky, and your re-poster is honest, you may discover how far your ideas have spread through a pingback, an automatically generated comment on your original blog post with a link to its reprint.

In the nineteenth century, reprinting (especially unauthorised reprinting) was the backbone of Atlantic journalism but, unlike modern bloggers, these authors had no effective means of discovering the fate of their quips or queries, except through chance encounters with competing papers or their readers. Although concerns of commercial losses are long past, this lack of attribution continues to plague researchers working with newspapers. Without a precise date of composition or of original publication, and without a specific or even a corporate author, the provenance of these texts remains frustratingly uncertain.

One solution to this problem is to track reprinting through text-matching. Using plagiarism detection software, we can carefully reconnect different versions appearing in a wide range of publications. Yet, however efficient our text-matching processes become, two major problems remain. First, text-matching requires machine-readable versions of the articles: electronic texts rather than images. While the sheer number of historical newspapers that have been digitised is impressive, the number that have high-quality, searchable text is deceptively limited. Many community sites have uploaded images of their physical or microfilm archives but do not have the resources to create fully searchable transcriptions. Others, created by state or commercial providers, have relied upon optical character recognition, the accuracy of which is subject to wild variations. Even when OCR texts are excellent, these represent a considerable investment to providers and often remain locked behind subscription fees.

Reprints within the British Library's 19th Century Newspaper Database, 1818-1819, based on analysis with Copyfind

Thanks to the efforts of public institutions, including the British Library, National Library of Wales, National Library of Australia and the Library of Congress, machine-readable transcriptions for a large number of nineteenth-century newspapers are now available to researchers. But within these collections, a second, more sinister problem arises. No matter how diligently archivists have worked to provide a representative or diverse selection, these digital holdings remain only a slice of the sprawling news network that once existed. Even if we find every single digital copy of a text, how can we know for sure that the original is among them? It is here that the humble pingback returns to the fore. Whether prompted by the innate honesty of editors or by their desire to establish the authenticity of their materials, a significant minority of newspaper articles contained an attribution. Whether appearing as an introductory dateline or a concluding tagline, these Georgian pingbacks offer tantalising clues as to the true origins of these anonymised texts. Yet, because only a minority of articles contain these attributions, because they can appear in many different forms or locations within the article text and because OCR is frustratingly inconsistent in transcribing italic and gothic typefaces, searching for datelines algorithmically is exceedingly difficult.

A Snippet from the Ipswich Journal, 13 January 1821. Courtesy of the British Library.

That is where the crowd come in. Although computers can process data very quickly, the human brain is still more adept at finding patterns when the parameters for those patterns are particularly fuzzy. Because of this, it was easier for astronomers to train volunteers to identify dusty debris disks in nebulae than to train computers to do the same thing. And what is true for nebulae is equally true of these Georgian pingbacks. Using thousands of images from the British Library's 19th-Century Newspapers collection, we have created a new site where you can help spot these attributions and provide researchers with what Georgian authors could only dream of: an in-depth understanding of just who was stealing from whom! The site includes a detailed tutorial on the structure of nineteenth-century newspaper articles as well as three different ways you can help us tag the database. So, whether you have a smartphone and 5 minutes waiting for your train or want to explore the collection in more depth at your home PC, please visit Georgian Pingbacks and try your hand at uncovering a 200-year-old case of plagiarism.

Dr M. H. Beals is a historian of migration and media at Loughborough University. She would like to thank the following undergraduate students at Loughborough University's Department of Politics, History and International Relations for their work on this project: Will Dickinson, Alice Gilbert, Ollie Luhrs, Alex Mackinder, Pooja Makwana, Matthew McCulloch, Jonny Ord, Emily Stanyard and Rebecca Thompson.

26 February 2016

'Why Londoners need not stand in fear of Drought': Depictions of Late Nineteenth Century London in the Pall Mall Gazette

Today we have a guest post from Dr. Tessa Hauswedell. We love hearing from people who've used our digital collections - get in touch if you've got a story to share!

London according to the ‘picture book’. Taken from Paul Villars, London and its Environs. A picturesque survey of the metropolis and the suburbs, p.5, 1888. Image from the British Library Flickr Images.

This January I had the opportunity to give a talk at the Digital History Seminar Series at the Institute of Historical Research in London on ‘European or Imperial Metropolis: Depictions of London in British Newspapers, 1870-1900’. The talk was based on research I undertook with some material from the British Newspaper Archive, the nineteenth-century London-based newspaper Pall Mall Gazette, to which the British Library kindly granted full-text access.[1] This work is part of a larger research project in which I am involved, entitled ‘Asymmetrical Encounters’, which traces cultural and historical references and themes in European newspaper corpora from the nineteenth and twentieth centuries.

I am interested in how the meanings of terms change over time in public usage, because tracking changing meanings gives us insights into certain preferences, mentalities and world views which are unique to a specific culture and epoch. Some terms lend themselves quite readily to such a historical semantic analysis – think of broad terms such as ‘freedom’, ‘liberty’ and ‘democracy’, all of which may mean very different things in 18th century France and 21st century China (to take one obvious example). The ‘metropolis’ is another such capacious and evocative term. Often, the term is applied to the European cities of the nineteenth century, with London at the forefront in commercial, financial and political terms.

But to what degree was London implicated in a pan-European discourse with the other ‘metropoles’ of its age, and what role did comparisons with these European cities play in public discourse? I sought to answer this question by looking at the Pall Mall Gazette’s digital archive in the last thirty years of the nineteenth century.

An advertisement from the Illustrated London News, selling ‘Vaissier’s Congo Soap’ as the soap to be had in every ‘metropolis in Europe’. Image taken from the Illustrated London News 18 Oct. 1890: 507. The Illustrated London News Historical Archive, 1842-2003.

To do this, I used a tool called AntConc, which was built originally for corpus linguistics research at the University of Lancaster, and did a collocation analysis. Collocations are essentially words that are frequently and habitually used together, and analyzing them allows you to establish common contexts and discourse fields around a given term.
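For readers who have not met collocation analysis before, the basic idea can be sketched in a few lines of Python. This is only an illustration and not AntConc's own method: it counts which words fall within a fixed window either side of a search term, whereas dedicated corpus tools generally go further and rank collocates with a statistical measure of association. The function name, the window size and the sample text are all invented for this example.

```python
import re
from collections import Counter


def collocates(text, node, window=4):
    """Count the words occurring within `window` words of `node`."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, word in enumerate(words):
        if word == node:
            start = max(0, i - window)
            # Words to the left of the search term, then words to the right.
            counts.update(words[start:i])
            counts.update(words[i + 1:i + 1 + window])
    return counts


if __name__ == "__main__":
    # Invented sample text, not taken from the Pall Mall Gazette.
    sample = (
        "The water supply of the metropolis was debated at length. "
        "Visitors to the metropolis enjoy the music halls."
    )
    for word, count in collocates(sample, "metropolis", window=3).most_common(5):
        print(word, count)
```

On a real corpus you would normally filter out very common words such as ‘the’ and ‘to’ before reading the results, since raw counts are dominated by them.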

For example, we find frequent references to poverty and related themes, indicated by terms such as ‘poor’, ‘pauperism’, ‘vagrants’ and ‘destitution’. This in itself is unsurprising, given that we know from previous historiography that discussions about the plight of the working poor, especially in cities, were frequent during the late 1870s and 1880s. Some other obvious themes come into view quickly – references to ‘management’, ‘local management’, to ‘water’ and ‘water supply’ and discussions about ‘railways’ and ‘railway lines’ leading in and out of London.

A picture from the Illustrated London News showing reservoirs in proximity to London, intended to alleviate fears about the safety of the water supply in the city. Taken from The Illustrated London News August 5, 1911, Issue 3772, p.228-229, The Illustrated London News Historical Archive, 1842-2003.

We also see increasingly strong associations with London as a place of entertainment and tourism over the thirty-year period: references to ‘music halls’ and ‘nightlife’ as well as to ‘visitors to the metropolis’ are on the increase. Notably, however, references to other European cities or to terms indicating a ‘German’, ‘British’ or ‘European’ metropolis are missing. This does not mean that the Pall Mall Gazette did not report on cities from abroad; its outlook was far from provincial and it had broad coverage of international events. But the collocation analysis suggested that the term ‘metropolis’ was almost exclusively applied to the city of London and hardly to other cities. This proved an interesting finding, because it indicated that the European dimension of the metropolis would perhaps have been less relevant to the late nineteenth century newspaper reader, and instead London was presented to its readers mostly as a metropolis quite distinct from its European counterparts.


[1] Thanks are due especially to James Baker, then a Digital Curator at the British Library, for assistance in obtaining access to the Pall Mall Gazette, and especially to Melvin Wevers, University of Utrecht, who prepared the data for text-mining purposes.

28 January 2016

Book Now! Nottingham @BL_Labs Roadshow event - Wed 3 Feb (12.30pm-4pm)

Do you live in or near Nottingham and are you available on Wednesday 3 Feb between 1230 - 1600? Come along to the FREE UK @BL_Labs Roadshow event at GameCity and The National Video Game Arcade, Nottingham (we have some places left and booking is essential for anyone interested).


BL Labs Roadshow at GameCity and The National Video Game Arcade, Nottingham, hosted by the Digital Humanities and Arts (DHA) Praxis project based at the University of Nottingham, Wed 3 Feb (1230 - 1600)
  • Discover the digital collections the British Library has, understand some of the challenges of using them and even take some away with you.
  • Learn how researchers found and revived forgotten Victorian jokes and political meetings from our digital archives.
  • Understand how special games and computer code have been developed to help tag un-described images and make new art.
  • Find out about a tool that links digitised handwritten manuscripts to transcribed texts and one that creates statistically representative samples from the British Library’s book collections.
  • Consider how the intuitions of a DJ could be used to mix and perform the Library's digital collections.
  • Talk to Library staff about how you might use some of the Library's digital content innovatively.
  • Get advice, pick up tips and feedback on your ideas and projects for the 2016 BL Labs Competition (deadline 11 April) and Awards (deadline 5 September). 

Our hosts are the Digital Humanities and Arts (DHA) Praxis project at the University of Nottingham who are kindly providing food and refreshments and will be talking about two amazing projects they have been involved in:

ArtMaps: Putting the Tate Collection on the map

Dr Laura Carletti will be talking about the ArtMaps project which is getting the public to accurately tag the locations of the Tate's 70,000 artworks.

The 'Wander Anywhere' free mobile app developed by Dr Benjamin Bedwell.

Dr Benjamin Bedwell, Research Fellow at the University of Nottingham, will talk about the free mobile app he developed called 'Wander Anywhere'. The app offers users new ways to experience art, culture and history by guiding them to locations and then downloading to their mobile device stories relevant to where they are, which combine art, local history, architecture and anecdotes.

For more information, a detailed programme and to book your place, visit the Labs and Digital Humanities and Arts Praxis Workshop event page.

Posted by Mahendra Mahey, Manager of BL Labs.

The BL Labs project is funded by the Andrew W. Mellon Foundation.

27 January 2016

Come to our first @BL_Labs Roadshow event at #citylis London Mon 1 Feb (5pm-7.30pm)

Do you live in or near North-East London and are you available on Monday 1 Feb between 1700 - 1930? Come along to the first FREE UK Labs Roadshow event of 2016 (we have a few places left and booking is essential for anyone interested) and:

#citylis at the Department for Information Science, City University London,
the first BL Labs Roadshow event Mon 1 Feb (1700 - 1930)
  • Discover the digital collections the British Library has, understand some of the challenges of using them and even take some away with you.
  • Learn how researchers found and revived forgotten Victorian jokes and political meetings from our digital archives.
  • Understand how special games and computer code have been developed to help tag un-described images and make new art.
  • Talk to Library staff about how you might use some of the Library's digital content innovatively.
  • Get advice, pick up tips and feedback on your ideas and projects for the 2016 BL Labs Competition (deadline 11 April) and Awards (deadline 5 September). 

Our first hosts are the Department for Information Science (#citylis) at City University London. #citylis have kindly organised some refreshments, nibbles and also an exciting discussion panel of students, who will talk about their experiences of working on digital projects at the British Library. The panellists are:

#citylis student panel.
Top-left, Ludi Price 
Top-right, Dimitra Charalampidou
Bottom-left, Alison Pope
Bottom-right, Daniel van Strien

For more information, a detailed programme and to book your place (essential), visit the BL Labs Workshop at #citylis event page.

Posted by Mahendra Mahey, Manager of BL Labs.

The BL Labs project is funded by the Andrew W. Mellon Foundation.