THE BRITISH LIBRARY

Digital scholarship blog

123 posts categorized "Collaborations"

21 April 2018

On the Road (Again)

Flickr image: Wanderer
Image from the British Library’s Million Images on Flickr, found on p 198 of 'The Cruise of the Land Yacht “Wanderer”; or, thirteen hundred miles in my caravan, etc' by William Gordon Stables, 1886.

Now that British Summer Time has officially arrived, and with it some warmer weather, British Library Labs are hitting the road again with a series of events in universities around the UK. The aim of these half-day roadshows is to inspire people to think about using the library's digitised collections and datasets in their research, artworks, sound installations, apps, businesses... you name it!

A digitised copy of a manuscript is a very convenient medium to work on, especially if you are unable to visit the library in person and order an original item up to a reading room. But there are so many other uses for digitised items! Come along to one of the BL Labs Roadshows at a university department near you and find out more about the methods used by researchers in Digital Scholarship, from data mining and crowdsourcing to optical character recognition for transcribing the words from an imaged page into searchable text.

At each of the roadshow events, there will be speakers from the host institution describing some of the research projects they have already completed using digitised materials, as well as members of the British Library who will be able to talk with you about proposed research plans involving digitised resources. 

The locations of this year's roadshows are: 

Mon 26th March - BL Labs Roadshow 2018 (CityLIS) - internal event

Mon 9th April - BL Labs Roadshow 2018 (Open University) - internal event

Thu 12th April - BL Labs Roadshow 2018 (University of Bristol & Cardiff Digital Cultures Network)

Tue 24th April - BL Labs Roadshow 2018 (UCL)

Wed 25th April - BL Labs Roadshow 2018 (University of Kent)

Wed 2nd May - BL Labs Roadshow 2018 (University of Edinburgh)

Tue 15th May - BL Labs Roadshow 2018 (University of Wolverhampton)

Wed 16th May - BL Labs Roadshow 2018 (University of Lincoln)

Tue 5th June - BL Labs Roadshow 2018 (University of Leeds)

Wed 27th June - BL Labs Roadshow 2018 (University of Birmingham)

  BL Labs Roadshows 2018
See a full programme and book your place using the Eventbrite page for each event.

If you want to discover more about the Digital Collections, and Digital Scholarship at the British Library, follow us on Twitter @BL_Labs, read our Blog Posts, and get in touch with BL Labs if you have some burning research questions!

13 April 2018

Gaming the Gothic on Friday the 13th

“The bats have left the bell tower, the victims have been bled”  - Happy Friday the 13th to those of you with gothic sensibilities! I’ve been enjoying singing along to the wonderful CHVRCHES cover of “Bela Lugosi’s Dead” originally by Bauhaus, while preparing for the Gaming the Gothic conference, which takes place at the University of Sheffield today, and where @GamingTheGothic have promised both cake and badges!

I am giving a paper on the Off the Map videogame design competition, which accompanied the British Library’s exhibition ‘Terror and Wonder: The Gothic Imagination’. In 2014 the exhibition celebrated 250 years of gothic literature and culture, starting from the publication of Horace Walpole’s The Castle of Otranto.

The Off the Map competition challenged higher education students based in the UK to create videogames inspired by the British Library’s collections. In 2014, three students from the University of South Wales created the winning entry, an underwater game in which the player rebuilds Fonthill Abbey, the once-stunning Gothic revival country house in Wiltshire that was home to author William Beckford. The house was demolished in 1846 after the collapse of its spectacular 300-foot tower twenty years earlier.

Image from 2014 Off the Map winning game Nix

 

Image taken from "Delineations of Fonthill and its Abbey", by John Rutter; published by the author, 1823 (BL 191.e.6-81)

The winning team used images, maps of the estate and sounds held in the British Library’s collections to create Nix, a game for the first-generation Oculus Rift, a revolutionary virtual reality headset for 3D gaming. Tim Pye, curator of the British Library’s exhibition Terror and Wonder, said this about their entry:

“What is so impressive about the Nix game is the way in which it takes the stunning architecture of the Abbey, combines it with elements from its troubled history and infuses it all with a very ghostly air. The game succeeds in transforming William Beckford’s stupendously Gothic building into a magical, mysterious place reminiscent of the best Gothic novels.”

Keeping the gothic flames burning in 2018, and to mark the 200th anniversary of the publication of Frankenstein, the British Library’s Digital Scholarship team is pleased to be collaborating on Gothic Novel Jam with Read Watch Play, an online reading group that has monthly themes. Last year we partnered on Odyssey Jam and it was inspiring to see the end results, which I blogged about here.

To get involved in Gothic Novel Jam, participants need to make something creative inspired by the gothic novel genre, then upload or share it on the itch.io Gothic Novel Jam site by the 31st July. Entries can include stories, poetry, art, games, music, films, pictures, soundscapes, or any other type of digital media response.

Gothic Novel Jam, #GothNovJam, promotional postcard

As part of the jam we want participants to use images from the British Library Flickr account as inspiration for submissions. They’re freely available for anyone to use and the following albums may be particularly inspiring:

However, don't feel limited to using just those images; the full list of albums can be found here. There are also the Off the Map Gothic Collections of images on Wikimedia Commons and sounds on SoundCloud, which you are free to use. If you want to learn more about the gothic genre and its authors, check out this hugely informative section of the Discovering Literature website.

Although the gothic novel is the main jam theme, we’ll also be announcing a sub-theme on the 1st July, so please follow the #GothNovJam hashtag on social media for more news and also to see what others are creating for the jam. Good luck and have fun!

Button badges made for the Gaming the Gothic conference. I really hope I get a #CakeAndDeath one!

This post is by resident goth, Digital Curator Stella Wisdom, on Twitter as @miss_wisdom.

12 April 2018

The 2018 BL Labs Awards: enter before midnight Thursday 11th October!

With six months to go before the submission deadline, we would like to announce the 2018 British Library Labs Awards!

The BL Labs Awards are a way of formally recognising outstanding and innovative work that has been created using the British Library’s digital collections and data.

Have you been working on a project that uses digitised material from the British Library's collections? If so, we'd like to encourage you to enter that project for an award in one of our categories.

This year, the BL Labs Awards are commending work in four key areas:

  • Research - A project or activity which shows the development of new knowledge, research methods, or tools.
  • Commercial - An activity that delivers or develops commercial value in the context of new products, tools, or services that build on, incorporate, or enhance the Library's digital content.
  • Artistic - An artistic or creative endeavour which inspires, stimulates, amazes and provokes.
  • Teaching / Learning - Quality learning experiences created for learners of any age and ability that use the Library's digital content.

BLAwards2018
BL Labs Awards 2017 winners. Top left: Research Award winner, A large-scale comparison of world music corpora with computational tools. Top right: Commercial Award winner, Movable Type: The Card Game. Bottom left: Artistic Award winner, Imaginary Cities. Bottom right: Teaching / Learning Award winner, Vittoria’s World of Stories.

There is also a Staff award which recognises a project completed by a staff member or team, with the winner and runner up being announced at the Symposium along with the other award winners.

The closing date for entering your work for the 2018 round of BL Labs Awards is midnight BST on Thursday 11th October 2018. Please submit your entry and/or help us spread the word to all interested and relevant parties over the next few months. This will ensure we have another year of fantastic digital-based projects highlighted by the Awards!

The entries will be shortlisted after the submission deadline (11/10/2018) has passed, and selected shortlisted entrants will be notified via email by midnight BST on Friday 26th October 2018. 

A prize of £500 will be awarded to the winner and £100 to the runner up in each of the Awards categories at the BL Labs Symposium on 12th November 2018 at the British Library, St Pancras, London.

The talent of the BL Labs Awards winners and runners up from 2017, 2016 and 2015 has resulted in a remarkable and varied collection of innovative projects. You can read about some of the 2017 Awards winners and runners up in our other blogs, links below:

BLAwards2018-Staff
British Library Labs Staff Award Winner – Two Centuries of Indian Print


  • Research category Award (2017) winner: 'A large-scale comparison of world music corpora with computational tools', by Maria Panteli, Emmanouil Benetos and Simon Dixon, Centre for Digital Music, Queen Mary University of London

  • Research category Award (2017) runner up: 'Samtla' by Dr Martyn Harris, Prof Dan Levene, Prof Mark Levene and Dr Dell Zhang
  • Commercial Award (2017) winner: 'Movable Type: The Card Game' by Robin O'Keeffe
  • Artistic Award (2017) winner: 'Imaginary Cities' by Michael Takeo Magruder
  • Artistic Award (2017) runner up: 'Face Swap', by Tristan Roddis and Cogapp
  • Teaching and Learning (2017) winner: 'Vittoria's World of Stories' by the pupils and staff of Vittoria Primary School, Islington
  • Teaching and Learning (2017) runner up: 'Git Lit' by Jonathan Reeve
  • Staff Award (2017) winner: 'Two Centuries of Indian Print' by Layli Uddin, Priyanka Basu, Tom Derrick, Megan O’Looney, Alia Carter, Nur Sobers-Khan, Laurence Roger and Nora McGregor
  • Staff Award (2017) runner up: 'Putting Collection metadata on the map: Picturing Canada', by Philip Hatfield and Joan Francis

For any further information about BL Labs or our Awards, please contact us at labs@bl.uk.

11 April 2018

Ambient Literature Festival

As the final months of the Ambient Literature project approach, the research team are convening a series of final events (more on which below), but are also spending time drawing out conclusions and reflections regarding the last two years of work. Below is a guest post discussing this by Tom Abba of the Ambient Literature project and the University of the West of England, who is on Twitter as @tomabba:

When we began in May 2016, we were upfront about the challenges of the work we were going to make and address. Here’s what we said at the launch event at Hachette’s (then shiny new) headquarters in Blackfriars:

Here’s an admission at the start of a research programme:

We don’t know what Ambient Literature is.

We’ve started to map the territory, to define by identifying borders and by testing the edges. It’s important to note, though, that we don’t want to reduce the idea to something tight and defined; rather, our intention is to open it up, to show by doing, making and thinking. We do know that Ambient Literature asks for writing to be specific, to be for this form. That there are rules, grammars of making and thinking about readers and texts in new ways.

Twenty-three months later, I think we know what this is, and we’ve made progress toward a set of rules and grammars for making work in this form. Each of our three commissions demonstrates how Ambient Literature might work, and each does so in a completely different way. Duncan Speakman’s It Must Have Been Dark By Then, James Attlee’s The Cartographer’s Confession and Kate Pullinger’s Breathe (made with Editions at Play) ask something of their audience that is particular to the decisions each writer made, how those were translated into a technologically mediated form, and the goals at the heart of each of those works. In different ways, for different reasons, we’re very proud of each of them.

Ambient lit

Ambient Literature has been an extended conversation about storytelling, situation, audience, presence and much, much more. We opened that conversation up last year at our half-way Symposium, and want to take it much further now. We want to show, and to talk to you all, and celebrate everything that’s been part of this journey. If you’re interested in being part of that conversation and celebration, then our Showcase Festival takes place on 23rd April at the British Library Conference Centre. We’ll be sharing our secrets and discoveries, and letting you look behind the scenes at how each of our projects was created. The event will feature workshops with Duncan Speakman and Kate Pullinger and talks, as well as a guided tour through the London of The Cartographer’s Confession with its author James Attlee and producer Emma Whittaker. We’re aiming the event at publishing industry professionals, students and practitioners, as well as anyone interested in the future of reading and writing. We can promise at the very least you’ll come away knowing something new about digital storytelling. If you would like to attend, please register here and book places on the workshops.

Schedule for Ambient Literature Festival

We’re also taking the whole project to the Hay Festival in May. We’re running workshops, hosting a panel discussion (with guests including Dan Franklin and Joanna Walsh) and are making a new piece of work - Words We Never Wrote - specially for Hay. It premieres at the Festival and explores the meaning of writing, language and storytelling. We’re incredibly proud of this piece - it asks questions about linearity and form, art and suggestion that we’ve been aching to address for years. We’re delighted to be at Hay and, if you want to join us there, we can promise you a little bit of magic when you visit. 

05 April 2018

Digital Conversations @BL: Digital Comics

Venturing off the page, into multimedia and new narrative forms – we invite you to join us for an evening exploring the worlds of digital comics.

Over the past year, our Contemporary British Collections team have been busy exploring how comics are created and distributed in the 21st century. You can read these blog posts about projects done by our PhD placement researchers:

The Proper Serious Work of Preserving Digital Comics and Collecting Webcomics in the UK Web Archive by Jen Aggleton, who created a UK Web Archive collection of web comics.

21st Century British Comics by Olivia Hicks.

An extract from The Archivist by Daniel Merlin Goodbrey

Continuing this work, we have been collaborating with John Freeman and Graham Baines of British Comics website downthetubes.net to bring together an exciting panel of comics creators for our next Digital Conversation at the British Library. We’ll be exploring the fortunes of comics in the online world, and looking ahead at what’s next for digital comics. Our panel brings together a great range of experience in creating comics and digital media in many forms:

Kate Ashwin – has been creating internet comics since 2002 and is the creator of Widdershins, a series mixing magic, comedy and adventure, set in a fictional West Yorkshire town.

Yomi Ayeni – creates work across different media, including film and digital projects as well as comics. His Clockwork Watch series was voted best Graphic Novel in 2015 by readers of Steampunk Chronicle.

Daniel Merlin Goodbrey – is a pioneer of digital comics, experimenting with the hypercomic form and “infinite canvas” comics (an extract from The Archivist can be seen above). Daniel’s comics, and writing about comics, can be found on e-merl.com

Bryan Talbot – is one of Britain’s best known comic artists and credited as one of the creators of the graphic novel form. His work includes The Adventures of Luther Arkwright, Alice in Sunderland, the Grandville series of steampunk detective thrillers, Dotter of Her Father’s Eyes and Sally Heathcote: Suffragette.    

Page from Widdershins Vol 5. by Kate Ashwin

Comics have always provided an immediate and emotionally-engaging way of telling imaginative stories. British comics creators have been at the vanguard of innovation, and this has been true also of digital comics. Join our Digital Conversation to find out how new technologies are leading to new forms of story-telling, plus what the challenges and opportunities are for building web comics collections. 

The Digital Comics Conversation event takes place in The Terrace Restaurant at the British Library on Wednesday 18th April, 18.30-20.30; for more details including booking, visit: https://www.bl.uk/events/digital-conversation-digital-comics.

This is a guest post by Ian Cooke, Head of Contemporary British Publications, on Twitter as @IanCooke13.

21 March 2018

BL Labs 2017 Symposium: Vittoria's World of Stories, Learning & Teaching Award Winner

Vittoria’s 'World of Stories' - the BL Labs Learning and Teaching Award Winner 2017 - is a project led by parents at Vittoria Primary School through the PTA, with the support of school staff. The aim of the project is to collect and share traditional tales from around the world and creative work by current pupils through workshops, the production of a book, school assemblies, readings and performances, and via the creation of audio, text and images for the school website during the current academic year. The illustrations for the project are drawn from the British Library’s Flickr collection and are displayed alongside pupils’ artwork.

VS 1
The front cover of Vittoria primary school's 'World of Stories'

Our school is a diverse community of learners: pupils’ families come from a wide range of ethnic and cultural backgrounds. Languages spoken by pupils at home include Arabic, Bengali, Vietnamese, Russian, Chechen, Turkish and Somali. One of the pedagogical goals of the project was to make visible the similarities between well-loved traditional tales and explore how different cultures use the same cast of characters - heroes and heroines, tricksters and magicians, villains and monsters – in order to speak across generations about what it means to be human. We wanted to promote and celebrate the diversity of the multi-cultural community which makes up our school, and show parents and children that the characters and stories they love are shared by others from different cultures.   

 The stories in the book include original works by pupils, gathered through a story-writing competition with winning entries selected by the PTA committee. We also asked parents to nominate traditional tales for inclusion in the collection, and held a bi-lingual (English and Arabic) story-sharing workshop for parents organised by the PTA. During the workshop, parents spoke about well-known traditional tales which they remembered from childhood and discussed the contrasts and similarities between characters and narratives from different cultures. For example, the section of the book which presents ‘bogeyman’ type monsters was developed from discussions in the workshop. We discovered that the Beast from Beauty and the Beast is called ‘Al-Ba’ati’ in Sudan, where the story is known as ‘Jamila wal Ba’ati’. Sudanese parents discussed how ‘Al-Ba’ati’ is used to encourage good behaviour in children, which prompted another parent to share her family’s stories of ‘The Boogerman’ who plays a similar role in persuading children to stay in bed at bedtime.

VS 4
One of the British Library's Flickr images, used as an illustration

The project also links with our work within the classroom to develop children’s reading skills, through promoting a love of reading and books at home. By showing that we value and celebrate the oral culture of storytelling between parents and children, and by collecting and translating tales from languages other than English, we aim to encourage parents to read with their children and support their learning at home.

The project has had a positive effect within the school community, by promoting dialogue and interaction between parents from different cultures through the parents’ workshop, and by providing a vehicle to celebrate pupils’ achievements with the school community. Parents have also bought copies of the book to share with family and friends. One of our parent contributors took copies of the book to share with older generations of her family in Sudan during a recent visit, and we hope that other parents will do the same.

During the next phase of the project we will be organising a series of readings and performances using the book with different year groups and making audio recordings which we will publish on the school website for parents to download and listen to with their children at home.

If this blog post has stimulated your interest in working with the British Library's digital collections, start a project and enter it for one of the BL Labs 2018 Awards! Join us on 12 November 2018 for the BL Labs annual Symposium at the British Library.

Posted by BL Labs on behalf of Vittoria Primary school

14 March 2018

Working with BL Labs in search of Sir Jagadis Chandra Bose

The 19th Century British Library Newspapers Database offers a rich mine of material for a comprehensive view of British life in the nineteenth and early twentieth centuries. The online archive comprises 101 full-text titles of local, regional, and national newspapers across the UK and Ireland, and thanks to optical character recognition, they are all fully searchable. This allows for extensive data mining across several million newspaper pages. It’s like going through the proverbial haystack looking for the equally proverbial needle, but with a magnet in hand.
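To give a flavour of what searching OCR text can look like in practice, here is a minimal Python sketch. It does not use the database's own search interface; it assumes the page text has already been exported into simple records, and the `Page` class, the `find_mentions` function and the two sample pages are all hypothetical, invented purely for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    """One OCR'd newspaper page (hypothetical record layout)."""
    title: str  # newspaper title
    date: str   # publication date, ISO format
    text: str   # plain text produced by OCR


def find_mentions(pages, phrase, context=40):
    """Return (title, date, snippet) for each case-insensitive match of phrase."""
    # Allow any run of whitespace between words, because OCR text
    # often breaks a name across two lines.
    words = [re.escape(word) for word in phrase.split()]
    pattern = re.compile(r"\s+".join(words), re.IGNORECASE)
    hits = []
    for page in pages:
        for match in pattern.finditer(page.text):
            start = max(0, match.start() - context)
            end = min(len(page.text), match.end() + context)
            # Collapse line breaks so the snippet reads as one line.
            snippet = " ".join(page.text[start:end].split())
            hits.append((page.title, page.date, snippet))
    return hits


# Invented sample pages, not real newspaper text.
sample_pages = [
    Page("Example Gazette", "1937-11-23",
         "Death of Sir Jagadis\nChandra Bose, the Indian scientist."),
    Page("Example Courier", "1901-12-14",
         "Wireless signals were received across the Atlantic."),
]

hits = find_mentions(sample_pages, "jagadis chandra bose")
for title, date, snippet in hits:
    print(title, date, snippet)
```

Real OCR output also contains misread characters, so an exact phrase search like this one will miss some mentions; trying spelling variants (for example "Jagadish") is a sensible next step.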

For my current research project on the role of the radio during the British Raj, I wanted to find out more about Sir Jagadis Chandra Bose (1858–1937), whose contributions to the invention of wireless telegraphy were hardly acknowledged during his lifetime and all but forgotten during the twentieth century.

J.C.Bose
Jagadish Chandra Bose in Royal Institution, London
(Image from Wikimedia Commons)

The person who is generally credited with having invented the radio is Guglielmo Marconi (1874–1937). In 1909, he and Karl Ferdinand Braun (1850–1918) were awarded the Nobel Prize in Physics “in recognition of their contributions to the development of wireless telegraphy”. What is generally not known is that almost ten years before that, Bose invented a coherer that would prove to be crucial for Marconi’s successful attempt at wireless telegraphy across the Atlantic in 1901. Bose never patented his invention, and Marconi reaped all the glory.

In his book Jagadis Chandra Bose and the Indian Response to Western Science, Subrata Dasgupta gives us four reasons as to why Bose’s contributions to radiotelegraphy have been largely forgotten in the West throughout the twentieth century. The first reason, according to Dasgupta, is that Bose changed his research interests around 1900. Instead of continuing and focusing his work on wireless telegraphy, Bose became interested in the physiology of plants and the similarities between inorganic and living matter in their responses to external stimuli. Bose’s name thus lost currency in his former field of study.

A second reason that contributed to the erasure of Bose’s name is that he did not leave a legacy in the form of students. He did not, as Dasgupta puts it, “found a school of radio research” that could promote his name despite his personal absence from the field. Also, and thirdly, Bose sought no monetary gain from his inventions and patented only one of them. Had he patented and profited from more of them, chances are that his name would have echoed loudly through the century, just as Marconi’s has done.

“Finally”, Dasgupta writes, “one cannot ignore the ‘Indian factor’”. Dasgupta wonders how seriously the Western scientific elite really took Bose, who was the “outsider”, the “marginal man”, the “lone Indian in the hurly-burly of western scientific technology”. And he wonders how this affected “the seriousness with which others who came later would judge his significance in the annals of wireless telegraphy”.

And this is where the BL’s online archive of nineteenth-century newspapers comes in. Newspaper coverage of Bose in the British press at the time suggests that his contributions to wireless telegraphy were all but forgotten even within his lifetime. When Bose died in 1937, Reuters Calcutta put out a press release that was reprinted in several British newspapers. As an example, the following notice was published in the Derby Evening Telegraph of November 23rd, 1937, on Bose’s death:

Newspaper clipping announcing death of JC Bose
Notice in the Derby Evening Telegraph of November 23rd, 1937

This notice is as short as it is telling in what it says and does not say about Bose and his achievements: he is remembered as the man “who discovered a heart beat in trees”. He is not remembered as the man who almost invented the radio. He is remembered for the Western honours that were bestowed upon him (the Knighthood and his Fellowship of the Royal Society), and he is remembered as the founder of the Bose Research Institute. He is not remembered for his career as a researcher and inventor, a career that spanned five decades and saw him travel extensively in India, Europe and the United States.

The Derby Evening Telegraph is not alone in this act of partial remembrance. Similar articles appeared in Dundee’s Evening Telegraph and Post and The Gloucestershire Echo on the same day. The Aberdeen Press and Journal published a slightly extended version of the Reuters press release on November 24th that includes a brief account of a lecture by Bose in Whitehall in 1929, during which Bose demonstrated “that plants shudder when struck, writhe in the agonies of death, get drunk, and are revived by medicine”. However, there is again no mention of Bose’s work as a physicist or of his contributions to wireless telegraphy. The same is true for obituaries published in The Nottingham Evening Post on November 23rd, The Western Daily Press and Bristol Mirror on November 24th, another article published in the Aberdeen Press and Journal on November 26th, and two articles published in The Manchester Guardian on November 24th.

The exception to the rule is the obituary published in The Times on November 24th. Granted, with a total of 1116 words it is significantly longer than the Reuters press release, but this is also partly the point, as it allows for a much more comprehensive account of Bose’s life and achievements. But even if we only take the first two sentences of The Times obituary, which roughly add up to the word count of the Reuters press release, we are already presented with a different account altogether:

“Our Calcutta Correspondent telegraphs that Sir Jagadis Chandra Bose, F.R.S., died at Giridih, Bengal, yesterday, having nearly reached the age of 79. The reputation he won by persistent investigation and experiment as a physicist was extended to the general public in the Western world, which he frequently visited, by his remarkable gifts as a lecturer, and by the popular appeal of many of his demonstrations.”

We know that he was a physicist; the focus is on his skills as a researcher and on his talents as a lecturer rather than on his Western titles and honours, which are mentioned in passing as titles to his name; and we immediately get a sense of the significance of his work within the scientific community and for the general public. And later on in the article, it is finally acknowledged that Bose “designed an instrument identical in principle with the 'coherer' subsequently used in all systems of wireless communication. Another early invention was an instrument for verifying the laws of refraction, reflection, and polarization of electric waves. These instruments were demonstrated on the occasion of his first appearance before the British Association at the 1896 meeting at Liverpool”.

Posted by BL Labs on behalf of Dr Christin Hoene, a BL Labs Researcher in Residence at the British Library. Dr Hoene is a Leverhulme Early Career Fellow in English Literature at the University of Kent. 

If you are interested in working with the British Library's digital collections, why not come along to one of the events we are holding at universities around the UK this year? We will be holding a roadshow at the University of Kent on 25 April 2018. You can see a programme for the day and book your place through this Eventbrite page.

12 March 2018

The Ground Truth: Transcribing historical Arabic Scientific Manuscripts for OCR research

Announcing a collaborative transcription project to support state-of-the-art research in automatic handwritten text recognition for historical Arabic texts

Cultural heritage institutions around the world are digitising hundreds of thousands of pages of historical Arabic manuscript and archive collections. Making these fully text searchable has the potential to truly transform scholarship, opening up this rich content for discovery and enabling large-scale analysis.

Computer scientists and scholars are working on this challenge, building systems which can automatically transcribe images of handwritten text, but for historical Arabic script a solution remains just out of reach.

Our aim is to contribute to continued research in this area by building an open image and ground truth dataset of historical handwritten Arabic texts, ensuring historical Arabic collections benefit from state-of-the-art developments in handwritten text recognition.

What is Ground Truth?

Optical Character Recognition (OCR) systems essentially turn a picture of text into text itself—in other words, producing something like a .TXT or .DOC file from a scanned .JPG of a printed or handwritten page. Most OCR systems require ground truth, a set of files which represent the truthful record of elements of an image, for training and evaluation purposes.

The ground truth of an image’s text content, for instance, is the complete and accurate record of every character and word in the image.

By knowing what the system is supposed to recognise on a page of handwritten text, researchers can both train their system to recognise the characters and test how well the system does once trained.
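As a rough illustration of the "test" half of that sentence, one widely used measure is the character error rate: the number of single-character insertions, deletions and substitutions needed to turn the system's output into the ground truth, divided by the length of the ground truth. The Python sketch below is an assumption about how such a check might look, not a description of any particular project's evaluation code, and the function names are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    # prev[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,         # delete char_a
                curr[j - 1] + 1,     # insert char_b
                prev[j - 1] + cost,  # substitute, or keep if equal
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ground_truth: str, ocr_output: str) -> float:
    """Edit distance divided by the length of the ground truth."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return edit_distance(ground_truth, ocr_output) / len(ground_truth)


# The transcriber typed four letters; the system recognised only three.
print(character_error_rate("كتاب", "كتب"))  # 0.25
```

Python compares strings code point by code point, so Arabic diacritics count as separate characters here. Whether to count or strip them before scoring is a decision each project has to make for itself.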

Transcription
 

  
View more transcriptions in progress from this manuscript (Or 3366) on the platform 

A collaborative approach

This project is a proof of concept exploring whether the creation of such a dataset can be done collaboratively at scale, using the collective expertise of volunteers around the world. At the heart of this approach is the Library’s enduring commitment to creating new and interesting ways to connect diverse communities of interest and expertise around our collections, be they scholars, members of the general public, computer scientists, students or curators. For this we are utilising a free and open-source platform, From the Page, which allows anyone with an interest in historical Arabic manuscripts to experience them up close, many for the first time, and to discuss, learn and share expertise in their transcription.

Helping transform research

The Digital Scholarship Department was able to fund the development of this open source platform to support Right-to-Left transcription, a feature which will benefit any scholar wishing to use the software for their own transcription needs. Any transcriptions produced in this pilot will be transformed into ground truth resources, hosted by the British Library and made freely available, without rights restriction, for anyone wishing to advance the state of the art in optical character recognition technology. Specifically, resources created will be contributed to ground-breaking projects already underway such as Transkribus, the Open Islamicate Texts Initiative, the IMPACT Centre of Competence Image and Ground Truth Resources and more!

Visit the new Arabic Scientific Manuscripts of the British Library transcription platform and download our Getting Started Guide for more detail (an Arabic version will be available shortly). 

  

Posted by Nora McGregor, Digital Curator, British Library