05 February 2018
8th Century Arabic science meets today's computer science
Or, Announcing a Competition for the Automatic Transcription of Historical Arabic Scientific Manuscripts
“An impartial view of Digital Humanities (DH) scholarship in the present day reveals a stark divide between ‘the West and the rest’…Far fewer large-scale DH initiatives have focused on Asia and the non-Western world than on Western Europe and the Americas…Digital databases and text corpora – the ‘raw material’ of text mining and computational text analysis – are far more abundant for English and other Latin alphabetic scripts than they are for Chinese, Japanese, Korean, Sanskrit, Hindi, Arabic and other non-Latin orthographies…Troves of unread primary sources lie dormant because no text mining technology exists to parse them.”
-Dr. Thomas Mullaney, Associate Professor of Chinese History at Stanford University
Supporting the use of Asian & African Collections in digital scholarship means shining a light on this stark divide and seeking ways to close the gap. In this spirit, we are excited to announce the ICFHR2018 Competition on Recognition of Historical Arabic Scientific Manuscripts.
The Competition
Drawing together experts from the British Library, The Alan Turing Institute, the Qatar Digital Library and PRImA Research Lab, we are launching this competition to play an active role in advancing the state of the art in handwritten text recognition technologies for Arabic. For our first challenge we are focussing on finding an optimal solution for accurately and automatically transcribing handwritten historical Arabic scientific manuscripts.
Though such technologies are still in their infancy, unlocking historical handwritten Arabic manuscripts for large-scale text analysis has the potential to truly transform research. In conjunction with the competition we hope to build, and make freely and openly available, a substantial image and ground truth dataset to support continued efforts in this area.
Organisers
• Apostolos Antonacopoulos, Professor of Pattern Recognition, University of Salford, and Head of the Pattern Recognition and Image Analysis (PRImA) research lab
• Christian Clausner, Research Fellow at the PRImA research lab
• Nora McGregor, Digital Curator at the British Library, Asian & African Collections
• Daniel Lowe, Curator at the British Library, Arabic Collections
• Daniel Wilson-Nunn, PhD student at the University of Warwick and Turing PhD student based at The Alan Turing Institute
• Bink Hallum, Arabic Scientific Manuscripts Curator at the British Library/Qatar Foundation Partnership
Further reading
For more on recent Digital Research Team text recognition and transcription projects see:
- Using Transkribus for handwritten text recognition with the India Office Records
- Introducing...Playbills In the Spotlight
- A workshop on Optical Character Recognition for Bangla
This post is by Nora McGregor, Digital Curator, British Library. She is on Twitter as @ndalyrose.