THE BRITISH LIBRARY

Digital scholarship blog

11 posts from February 2019

28 February 2019

The World Wide Lab: Building Library Labs - Part II

BL Flickr Copenhagen 1

We're setting sail for Denmark! Along with colleagues from the UK, Austria, Belgium, Egypt, Finland, Germany, Ireland, Latvia, Luxembourg, the Netherlands, Qatar, Spain, Sweden and the USA, we will be mooring at Copenhagen's Black Diamond, waterfront home to Denmark's Royal Library, for the second International Building Library Labs event: 4-5 March 2019.

Danish lib & BL logos

For some time now, leading national, state, university and public libraries around the world have been creating 'digital lab type environments'. The purpose of these 'laboratories' is to afford access to their institutions' digital content - the digitised and 'born digital' collections as well as data - and to provide a space where users can experiment and work with that content in creative, innovative and inspiring ways. Our shared ethos is to open up our collections for everyone: digital researchers, artists, entrepreneurs, educators, and everyone in between.

BL Labs has been running in such a capacity for six years. In September 2018, we hosted a 2-day workshop at the British Library in London for invited participants from national, state and university libraries - the first event of its kind in the world. It was a resounding success, and it was decided that we should organise a second event, this time in collaboration with our colleagues in Copenhagen.

Next week's participants, from over 30 institutions, will be sharing lessons learned, talking about innovative projects and services that have used their digital collections and data in clever ways, and continuing to establish the foundations for an international network of Library Labs. We aim to work together in the spirit of collaboration so that we can continue to build even better Library Labs for our users in the future.

Our packed programme is available to view on Eventbrite or as a Google Doc. We still have a few spaces left, so if you are interested in coming along, you can book here. As well as presentations and plenary debates, we will have eight lightning talks with topics ranging from how to handle big data to how to run a data visualisation lab. To accommodate our many delegates, with their own interests and specialisms, we will break out into 12 parallel discussion groups focusing on subjects such as how to set up a lab; how to get access to data; moving from 'project' lab to 'business as usual'; data curation; how to deal with large datasets; and using Labs & Makerspaces for data-driven research and innovation in creative industries.

We will blog again after the event, and provide links to some of the presentations and outputs. Watch this space! 

Danish-themed images trawled from our British Library Flickr Images set: pages 37, 126, and 15 of Copenhagen, the Capital of Denmark, published by the Danish Tourist Society, 1898. Find the original book here.

Posted by Eleanor Cooper on behalf of BL Labs

26 February 2019

Competition to automate text recognition for printed Bangla books

You may have seen the exciting news last week that the British Library has launched a competition on recognition of historical Arabic scientific manuscripts that will run as part of ICDAR2019. We thought it only fair to cover printed material too! So we’re running another competition, also at ICDAR, for automated text recognition of rare and unique printed books written in Bangla that have been digitised through the Library's Two Centuries of Indian Print project.

Some of you may remember the Bangla printed books competition at ICDAR2017, which generated significant interest among academic institutions and technology providers both in India and across the world. The 2017 competition set the challenge of finding an optimal solution for automating recognition of Bangla printed text and resulted in Google’s method performing best for both text detection and layout analysis.

Fast forward to 2019 and, thanks to Jadavpur University in Kolkata, we have added more ground truth transcriptions for competition entrants to train their OCR systems with. We hope that the competition again attracts submissions using cutting-edge OCR methods, leading to a solution that can truly open up these historic books, dating between 1713 and 1914, for text mining, enabling scholars of South Asian studies to explore hundreds of thousands of pages on a scale that has not been possible until now.

Image showing a transcribed page from one of the Bengali books featured in the ICDAR2019 competition

As with the Arabic competition, we are collaborating with PRImA (Pattern Recognition & Image Analysis Research Lab) who will provide expert and objective evaluation of OCR results produced through the competition. The final results will be revealed at the ICDAR2019 conference in Sydney in September.
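The measures PRImA applies are not detailed in this post. As one common way of scoring OCR output against a ground truth transcription, the sketch below computes a character error rate from the Levenshtein edit distance. The function names and sample strings are illustrative assumptions and are not part of the competition.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # drop a reference character
                current[j - 1] + 1,      # add a hypothesis character
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the ground truth text."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    # Note: Python counts Unicode code points, so combining marks in scripts
    # such as Bangla are counted as separate characters here.
    return edit_distance(reference, hypothesis) / len(reference)
```

A lower rate means the OCR output is closer to the ground truth; a rate of 0.0 means the two strings are identical.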

So if you missed out last time but are interested in testing your OCR systems on our books, the competition is now open! For instructions on how to apply and more about the competition, please visit https://www.primaresearch.org/REID2019/

This post is by Tom Derrick, Digital Curator for Two Centuries of Indian Print, British Library. He is on Twitter as @TommyID83, and Two Centuries of Indian Print tweet from @BL_IndianPrint.

21 February 2019

Automatic Transcription of Historical Arabic Scientific Manuscripts - Round 2

I am very pleased to announce that the British Library, in collaboration with PRImA (Pattern Recognition & Image Analysis Research Lab) and the Alan Turing Institute, is launching the ICDAR2019 Competition on Recognition of Historical Arabic Scientific Manuscripts.

Why are we doing this?

The British Library has a significant collection of Arabic manuscripts, among the largest in Europe and North America. These include copies of major religious, historical, literary and scientific works. As a post-digitisation step, we aim to make their contents more discoverable and usable by creating machine-readable text from scanned images. Opening up this content for full-text search and enabling text analysis at scale can revolutionise research!

Add MS 7474_0032

What did we do last year?

In collaboration with the aforementioned partners, we launched a competition as part of the 16th International Conference on Frontiers in Handwriting Recognition (ICFHR2018). This competition was aimed at finding an optimal solution for the automatic Recognition of Historical Arabic Scientific Manuscripts (RASM2018).

For this purpose, we provided competition participants with a ground truth set – digitised images and XML files – derived from the British Library/Qatar Foundation Partnership digitised collection of historical Arabic manuscripts available on the Qatar Digital Library. These files indicated the different text regions and lines, alongside their accurate transcription. The set was used to train participants’ text recognition systems to automatically identify Arabic script in other images. We supplied participants with an additional set of 85 digitised images to try this out – and then PRImA evaluated the results using objective comparative evaluation methods.
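To give a feel for what such a ground truth file contains, here is a minimal sketch that reads text regions, lines and transcriptions from an XML string. The element and attribute names (`page`, `region`, `line`, `text`, `points`) are a simplified, hypothetical layout invented for illustration; the competition's actual schema may differ, so treat this as an outline rather than a working reader for the real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified ground truth file: one page image, its text
# regions, the lines inside each region, and a transcription per line.
SAMPLE = """
<page image="example_page.jpg">
  <region id="r1" points="10,10 200,10 200,80 10,80">
    <line id="r1l1" points="10,10 200,10 200,40 10,40">
      <text>first line of main text</text>
    </line>
    <line id="r1l2" points="10,45 200,45 200,80 10,80">
      <text>second line of main text</text>
    </line>
  </region>
  <region id="r2" points="210,10 260,10 260,80 210,80">
    <line id="r2l1" points="210,10 260,10 260,80 210,80">
      <text>note in the margin</text>
    </line>
  </region>
</page>
"""


def parse_points(points: str) -> list:
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_ground_truth(xml_text: str) -> list:
    """Return one record per text line: region, outline and transcription."""
    root = ET.fromstring(xml_text)
    records = []
    for region in root.iter("region"):
        for line in region.iter("line"):
            records.append({
                "region": region.get("id"),
                "line": line.get("id"),
                "outline": parse_points(line.get("points", "")),
                "text": line.findtext("text", default=""),
            })
    return records


if __name__ == "__main__":
    for record in read_ground_truth(SAMPLE):
        print(record["region"], record["line"], record["text"])
```

Pairing each line outline with its transcription is what lets a recognition system learn from the images and then be scored on pages it has not seen.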

Who won?

Participants had to address one or more of these three challenges: page segmentation, text line detection and Optical Character Recognition (OCR).

We had two winners, for two different tasks:

  • Page segmentation: Berat Kurar Barakat, Ben-Gurion University of the Negev
  • Text line segmentation & text recognition: Hany Ahmed, RDI Company, Cairo University

You can read more about it in this article, published in the proceedings of ICFHR2018.

Why another competition?

The field of OCR and HTR (Handwritten Text Recognition) is rapidly evolving, and we would like to provide text recognition communities with a larger and more enhanced ground truth set to train their systems. Our goal is to leave the research community with the most useful dataset for developing state-of-the-art solutions for Arabic HTR.

We are also adding another challenge in the current competition! Our Arabic manuscripts provide text recognition systems with many challenges, such as varying text column widths and font sizes, different text directions, faded ink, non-rectangular text regions, decorations and much more. This time we are trying to tackle marginalia – text written in the margins of the manuscripts – which is often less standardised and legible than the main text, and frequently goes in different directions.

Now what?

We are now inviting anyone with text recognition software to try it out with our unique Arabic material. This competition is held in the context of the 15th International Conference on Document Analysis and Recognition (ICDAR2019).

This is the official RASM2019 website: https://www.primaresearch.org/RASM2019/

Here you will be able to find more information on this competition, its schedule and resources. To enter the competition please e-mail rasm2019@primaresearch.org

Organisers:

  • Prof Apostolos Antonacopoulos, Professor of Pattern Recognition, University of Salford and Head of the Pattern Recognition and Image Analysis (PRImA) research lab
  • Christian Clausner, Research Fellow at the Pattern Recognition and Image Analysis (PRImA) research lab
  • Dr Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections at the British Library
  • Lynda Barraclough, Head of Curatorial Operations for the British Library’s partnership with the Qatar Foundation
  • Daniel Lowe, Curator for Arabic Collections at the British Library
  • Dr Bink Hallum, Arabic Scientific Manuscripts Curator for the British Library/Qatar Foundation Partnership
  • Daniel Wilson-Nunn, PhD student at the University of Warwick & Turing PhD Student based at the Alan Turing Institute

Any questions – do get in touch with digitalresearch@bl.uk or rasm2019@primaresearch.org

Good Luck!

Delhi Arabic 1901_0154

 

This post is by Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections, British Library. She is on Twitter as @BL_AdiKS.

19 February 2019

BL Labs 2018 Teaching & Learning Award Runner Up: 'Pocahontas and After'

This guest blog is by Border Crossings, the 2018 BL Labs Teaching & Learning Award Runners Up, for their project, 'Pocahontas and After'.

BorderCross image 1

Two images, each showing two young women dressed to show their culture, their pride, their sense of self. The first image dates from 1907, and shows The Misses Simeon, from the Stoney-Nakoda people of Western Canada, photographed by Byron Harmon. The second was taken in 2018 by John Cobb at Marlborough Primary School, West London, and shows a pupil of Iraqi heritage called Rose Al Saria, pictured with her sister. It was Rose who chose the particular archive image as the basis for her self-portrait, and who conceptualised the way it would be configured and posed.

This pair of photos is just one example in Border Crossings' exhibition Pocahontas and After, which was recently honoured in the Teaching and Learning category of the British Library Labs Awards. The exhibition - which was seen by more than 20,000 people at Syon House last summer, and goes to St Andrews in February - represents the culmination of a sustained period of education and community work, beginning with the 2017 ORIGINS Festival. During the Festival, we not only held a ceremony for three indigenous women to commemorate Pocahontas at Syon, where she had stayed in the summer of 1616; we also brought indigenous artists into direct contact with the diverse communities around the House, in the two Primary Schools where they led workshops and study sessions, in the wonderful CARAS refugee group, and through our network of committed and energetic festival volunteers. In the following months, a distilled group from each of these partners worked closely with heritage experts from the archives (including the British Library’s own Dr. Philip Hatfield), Native American cultural consultants, and our own artistic staff to explore the ways in which Native American people have been presented in the past.

Their journeys into the archives were rich and challenging. What we think of as "realistic" photographs of indigenous people often turned out to be nothing of the kind. Edward Curtis, for example, apparently carried a chest of "authentic" costumes and props with him, which he used in his photographs to recreate the life of "the vanishing race" as he imagined it might have been in some pre-contact Romantic idyll. In other words, the archive photos are often about the photographer and the viewer, far more than they are about the subject.

BorderCross image 2

BorderCross image 3

As our volunteers came to realise this, they became more and more assertive about the need for agency in contemporary portraiture. Complex and fascinating decisions started to be made, placing the generation of meaning in the bodies of the people photographed. For example, Sebastian Oliver Wallace-Odi, who has Ghanaian heritage, saw how Ronald Mumford’s archive photo had been contrived to show “British patriotism” from First Nations chiefs, riding a car bedecked in a Union Jack, during the First World War. Philip showed him how other photos demonstrated the presence of Mounties at the shoot, emphasising the lack of agency from the subjects. Sebastian countered it with an image in which the red, white and blue flag is the symbol of the London Underground where his father works, and the car, like his shirt, is distinctly African.

What I love about this exhibition is that the meaning generated does not reside in one image or the other within the pair - but is rather in the energising of the space between, the dialogue between past and present, between different cultures, between human beings portrayed in different ways. It seems to me to be at once a way of honouring the indigenous subjects portrayed in the archive photographs, and of reinventing the form that was often too reductive in its attempts to categorise them.

Thanks to the Heritage Lottery Fund for supporting this project. Photos from the British Library digital collections.

Michael Walling - Artistic Director, Border Crossings. www.bordercrossings.org.uk

Watch the Border Crossings team receiving their Runner Up award and talking about their project on our YouTube channel (clip runs from 3.46 to 10.09):

Find out more about Digital Scholarship and BL Labs. If you have a project which uses British Library digital content in innovative and interesting ways, consider applying for an award this year! The 2019 BL Labs Symposium will take place on Monday 11 November at the British Library.

18 February 2019

Updated Eighteenth-Century Collections Online

The traditional, somewhat stereotypical image of the researcher of things past has not changed much in recent times. There is nothing easier than to imagine a scholar sitting at a scarcely illuminated wooden desk, surrounded by piles of old hardbound volumes, spending hours on end rummaging through the sheets in search of a clue.

In the field of eighteenth-century studies, this is certainly still the case. Scholars often go on a pilgrimage to prestigious repositories such as the British Library. However, in the last fifteen years or so, technology has started to offer attractive alternatives to the pleasure of travelling to London. Powered by Gale-Cengage, the Eighteenth-Century Collections Online (commonly referred to as ECCO) is a well-known resource that provides access to English-language and foreign-language publications printed in Britain, Ireland and the American colonies during the eighteenth century. This extensive collection contains over 180,000 titles (200,000 volumes) and allows full-text searching of some 32 million pages. These are digital editions based on the Eighteenth Century microfilming that started in 1981 and the English Short Title Catalogue.

New ECCO home page

Moving away from its classic web-1.0 design, the Gale-Cengage team recently decided to revamp the layout of ECCO – indeed, of their entire portfolio of archive products, which includes, among others, the Seventeenth- and Eighteenth-Century Burney Newspapers Collection. The aim is to make the Gale Primary Sources experience more consistent and intuitive for the user. At the head of this delicate operation are product managers Doran Steele and Megan Sullivan, who lead a nine-person team of software developers, content engineers, researchers and designers. Not quite the IT-only type of personnel, Doran and Megan are scholars themselves, holding degrees in History and Information Science respectively and sharing a remarkable passion for all things past. They are responsible for the maintenance of the existing ECCO interface, as well as the development of the upcoming design refresh.

During a recent interview they gave to the authors of this post, Doran and Megan declared their objective of evolving ECCO in line ‘with user expectations of modern online research experiences’. Their driving force was stated very clearly as a bottom-up process. ‘This redesign’, they explained, ‘is informed by user feedback and market research’. A beta version of the new site has been available since the second half of 2018 to enable the Gale-Cengage team to gather feedback about the new design. The product managers specified that the final transition to the ‘new’ ECCO will only be completed once they feel confident that the new experience ‘successfully meets the needs of our users’. The final goal is a better user experience, ‘one that is faster and more intuitive’. To achieve this, a range of new features has been included, such as more filters on search results; results more relevant to the search queries; data visualization tools; improved subject indexing; more options for adjusting the image; and the ability to download in a text format the OCR (optical character recognition) version of a volume. The latter feature will be a particularly welcome innovation for scholars who often need to look up the occurrence of a single word or cut and paste long chunks of text.

New ECCO search results screen

The options for adjusting the page view are another significant novelty. The beta version boasts new settings to quickly select the preferred zoom level, as well as sliders to increase or decrease the brightness and contrast of the page. These improvements are particularly welcome considering that the quality of the scans remains unchanged. The page quality is not directly related to ECCO. The portal simply allows the consultation of the digitised microfilms included in the first collection (also known as ECCO 1, comprising over 154,000 texts) and the digitisation of a second, smaller collection of books (ECCO 2, over 52,000 titles). This raises an important issue. A plethora of relatively unknown, yet precious eighteenth-century material remains difficult to consult because, on top of the uneven quality in the texts that came out of eighteenth-century printing presses, the original microfilming technology that was employed for the first collection yielded relatively low-resolution results. This causes some hiccups with OCR, thus discouraging the use of quantitative methodologies. But the issue is all the more salient when the category of eighteenth-century visuals is taken into account. At a time when British engravers multiplied in numbers to illustrate the newly-discovered wonders of the natural world or the archaeological remains of Roman cities in England, illustrations became an essential aspect of the eighteenth-century book market and reading experience. While for essential texts such as William Stukeley’s Itinerarium curiosum (1724) or Eleazar Albin and William Derham’s A Natural History of Birds (1734) more refined scans can be found elsewhere, a large number of texts are digitally available only through ECCO 1.

Scholars interested in images must either focus on well-known texts that have been digitised by other providers – with serious consequences in terms of canonicity – or plan a visit to major libraries to consult the relevant volumes in person, which rather defeats the very idea of digital reading. Either way, the study of visual culture is somewhat inhibited. Nevertheless, the ‘new’ ECCO promises to enhance the user experience and to offer even more opportunities to engage with outstanding repositories of primary material. If you have already had a chance to use the new version, we encourage you to get in touch with Doran and Megan, as your feedback and suggestions can improve ECCO even further.

New ECCO image viewer screen

This post is by Alessio Mattana, Teaching Assistant in Eighteenth-Century Literature at the University of Leeds (on Twitter as @mattanaless), and Dr Giacomo Savani, Teaching Assistant in Ancient History at the University of Leeds (on Twitter as @GiacomoSavani).

13 February 2019

Sign up for a research workshop on books written for mobile devices!

Can you help us shape the future of our digital collection? London workshop 20th February

On 7-8 February, we ran some user experience sessions in one of our Reading Rooms at the British Library, in order to better understand users’ expectations and requirements when accessing complex digital publications within our collections. Our focus is on “emerging formats” – eBook mobile apps and web-based interactive narratives – and exploring options for their future preservation and access. The first round of sessions was really informative and we are now pleased to announce that our research is continuing with a second round in London.

We are now looking for new participants to take part in a two-and-a-half-hour workshop, this time based at our partner agency, Bunnyfoot. As with the previous research sessions, we are looking for people who have familiarity with book apps and web-based interactive narratives designed for mobile devices, whether they’re using them for pleasure or as part of their practice. We would like participants to join a workshop on Wednesday 20th February, either in the morning or the afternoon, at Bunnyfoot’s research labs in London. Bunnyfoot will offer a £120 incentive for anyone taking part.

If you are interested in taking part, please follow the link and complete the short survey to sign up: https://www.surveygizmo.com/s3/4841789/British-Library-Screener-Survey

To find out further information about the Emerging Formats project, please see our previous post on Emerging Formats and our project page.

This post is by Ian Cooke, Head of Contemporary British Publications (on Twitter as @IanCooke13), and Giulia Carla Rossi, Curator of Digital Publications (on Twitter as @giugimonogatari).

BL Labs 2018 Artistic Award Runner Up: 'Nomad'

Nomad is a collaborative project between Abira Hussein, an independent researcher and curator, and Sophie Dixon and Ed Silverton of Mnemoscene. They were the runners up in the BL Labs Artistic Award category for 2018, and they've written a guest blog post about their project for the Digital Scholarship blog.

Nomad: Reconnecting Somali heritage

The project has been supported by the Heritage Lottery Fund and premiered at the British Library and British Museum during the Somali Week Festival 2018. Centred around workshops engaging Somali communities in London, Nomad explores the creative use of Mixed Reality and web-based technology to contextualise archival Somali objects with the people and traditions to which they belong.

Nomad 1

Nomad began with three Somali heritage objects - a headrest, bowl, and incense burner - which had been digitised at the British Museum. Thanks to Object Journeys, a previous project Abira was involved in, they were freely available to use.

Our goal was to reflect the utilitarian nature of the objects by showing their intended use. Furthermore, in Somali culture, songs and poetry are very important and we wanted to reconnect the objects to the sounds and traditions to which they belonged.

Our approach was to use Microsoft’s Mixed Reality HoloLens headset to show a Nomadic Somali family using the objects in real, everyday spaces. When wearing the headset the user can select different objects to reveal different members of the family, seeing how the object would be used, and hearing the songs which would have accompanied their use.

You can get a taste of the HoloLens experience in this short video (1 minute).

To create these ephemeral figures we used motion capture and 3D modelling, creating the clothing by referencing archival photographs held at the Powell-Cotton Museum in Kent.

We used the British Library’s John Low collection as the source for the sounds you hear in the Mixed Reality experience. John Low travelled across Somalia between 1983 and 1986 working for an NGO to support community development. In his spare time he made field recordings of different tribes and dialects, providing an insight into the diversity of Somali oral traditions. The collection includes work songs reflecting pastoral life and poems, also known as Gabay, which are often recited in communal settings.

Workshop held at the British Museum during the Somali Week Festival

With support from the Heritage Lottery Fund we toured the Mixed Reality experience to different Somali communities in London. The immersive experience became a way to inspire and encourage communities to share their own stories, to be part of an openly accessible archive representing their own narratives for Somali cultural heritage.

These workshops were exciting events in which participants handled real objects, tried the Mixed Reality experience and took part in the photogrammetry process to capture 3D models of the objects they had brought to the workshops.

To make the objects and sounds accessible to all, we also created Web-based Augmented Reality postcards to be used in the workshops. 

Workshop participants looking at 3D objects using web-based Augmented Reality on their mobile phones

From the workshops we have 3D models, photographs and audio recordings which we’re currently adding to an online archive using the Universal Viewer. For updates about the archive and to find out more about our project please visit us at nomad-project.co.uk.

Watch the Nomad team receiving their award and talking about their project on our YouTube channel (clip runs from 4:15 to 8:16):

07 February 2019

BL Labs 2018 Research Award Honourable Mention: 'HerStories: Sites of Suffragette Protest and Sabotage'

At our symposium in November 2018, BL Labs awarded two Honourable Mentions in the Research category for projects using the British Library's digital collections. This guest blog is by the recipients of one of these - a collaborative project by Professor Krista Cowman at the University of Lincoln and Tamsin Silvey, Rachel Williams, Ben Ellwood and Rosie Ryder at Historic England. 

HerStories: Sites of Suffragette Protest and Sabotage

The project marked the commemoration of the centenaries of some British women winning the Parliamentary vote in February 1918, the right to stand as MPs in November 1918 and of the first election in which women voted in December 1918. The centenary year caught the public imagination and resulted in numerous commemorative events. Our project added to these by focussing on the suffragette connections of England’s historic buildings. Its aim was to uncover the suffragette stories hidden in the bricks and mortar of England’s historic buildings and to highlight the role that the historic built environment played in the militant suffrage movement. The Women’s Social and Political Union co-ordinated a national campaign of militant activities across the country in the decade before the First World War. Buildings were integral to this. The Union rented shops and offices in larger towns and cities. It held large public meetings in the streets and inside meeting halls.

Suffragettes also identified buildings as legitimate targets for political sabotage. The WSPU’s leader, Emmeline Pankhurst, famously urged her followers to strike at the enemy through property, and suffragettes broke windows, set fires and placed bombs as part of their campaign to force the government to give votes to women.

The project used the newly-digitised resources of Votes for Women and The Suffragette to identify historic buildings connected with the militant suffrage campaign.  Local reports in both papers were consulted to compile a database of sites connected to the WSPU across England.

HerStories image 1

This revealed a huge diversity in locations and activities. Over 5,000 entries from more than 300 geographical locations were logged. Some were obscure and mundane, such as 6 Bronte Street in Keighley, the contact address for the local WSPU branch for 1908. Others were much more high-profile, including St Paul’s Cathedral, where a number of services were disrupted by suffragettes and a bomb was planted. All of the sites on the database were then compared with the National Heritage List, the official record of England’s protected historic buildings compiled and maintained by Historic England: https://historicengland.org.uk/listing/the-list/

This provided a new data set of over a hundred locations whose historic significance had already been recognised through listing but whose connection to militant suffrage was currently unrecognised. 
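The comparison step can be pictured as a simple name match between two lists. The sketch below is only an illustration under stated assumptions: the record fields, the `normalise` and `match_sites` helpers and the sample entries are all hypothetical, and the post does not describe how the team actually carried out the matching.

```python
def normalise(name: str) -> str:
    """Lower-case a site name, replace punctuation with spaces, collapse gaps."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower()
    )
    return " ".join(cleaned.split())


def match_sites(site_records: list, listed_names: list) -> list:
    """Pair each database record with a listed building of the same name."""
    listed = {normalise(name): name for name in listed_names}
    matches = []
    for record in site_records:
        key = normalise(record["site"])
        if key in listed:
            matches.append((record["site"], listed[key]))
    return matches


if __name__ == "__main__":
    # Illustrative records only; not taken from the project database
    # or from the National Heritage List.
    database = [
        {"site": "St Paul’s Cathedral", "activity": "services disrupted"},
        {"site": "6 Bronte Street, Keighley", "activity": "branch contact address"},
    ]
    heritage_list = ["St Paul's Cathedral", "Example Town Hall"]
    print(match_sites(database, heritage_list))
```

Exact name matching like this would miss buildings recorded under different names or addresses, so in practice any automated pass would need checking by hand.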

These sites were further researched using the British Library’s collection of historic local newspapers to retrieve more detail about their suffragette connections, including their contemporary reception. This showed previously unknown detail, including an attempted attack on the old Grammar School, King’s Norton, where the Nottingham Evening Post reported how suffragettes who broke in did no damage but left a message on the blackboard saying that they had refrained from damaging its ‘olde worlde’ rooms.

HerStories image 2

The team selected 41 sites and updated their entries on The List to include their newly-uncovered suffragette connections. 

The amended entries can be seen in more detail on Historic England’s searchable map at https://historicengland.org.uk/whats-new/news/suffragette-protest-and-sabotage-sites 

The results provided a significant addition to the suffragette centenary commemorations by marking the important connections between the suffragettes’ fight for the vote and England’s historic listed buildings.

Watch Krista Cowman and Tamsin Silvey receiving their Honourable Mention award on behalf of their team, and talking about their project on our YouTube channel (clip runs from 10.45 to 13.33): 
