THE BRITISH LIBRARY

Digital scholarship blog

167 posts categorized "Data"

13 December 2019

Do you want to see my butterfly collection?


Posted on behalf of Sara Lucas Agutoli, artist, associate professor at the Accademia di Belle Arti di Bologna, BL Labs Artist in residence and runner up in the BL Labs Artistic Award 2019.

Artist: Sara Lucas Agutoli
(Copyright: Ilenia Arosio)

Sara Lucas Agutoli lives and works between London and Bologna.  Her academic research focuses on the concepts of true and false in art, in particular in photography. In her art S. L. Agutoli merges popular themes with a learned and symbolic system of citations. Working with different media, she reflects on the idea of ongoing transformation – of the spaces, of the body, as well as of aesthetics – and creates personal architectures drawing on her inner experiences, knowledge and visions.

When occupied with my full time job, I often spend the time wandering on the net, looking for pictures that trigger my interest, either because they are odd and curious or aesthetically pleasant and elegant.

Since 2011 I’ve enjoyed calling myself a cyber-flâneur¹: unlike the Parisian strollers described by Baudelaire, I walked through cyber avenues, getting lost amid different digital archives. I glanced through collections of images instead of windows, stared at close-ups of manuscripts instead of sunsets on rivers. The net was my city and I just followed my nose walking through it. I wanted to make my curiosity an aesthetic operation. In doing so I’ve come to believe that online archives are my personal church of Saint-Julien-le-Pauvre, the chosen venue for my cyber-dadaist performances (see: https://www.moma.org/collection/works/184056).

For years my working activity followed a pattern: a few months of research, during which I spent hours and hours on Flickr Commons browsing the online archives of museums and institutions and saving selected images on my hard disk, followed by months in the studio working creatively with the pictures I had accumulated.

I accumulated images and emotions, from advertising to family album pictures. I wanted to explore how photography was used in different parts of the world, in different eras and in different economic contexts.

In 2011, while in Montreal for my first art residency, I analysed the different uses of vernacular photography in the 50s in North America and Italy. To do so, I used the open archives of many North American libraries and institutions (New York Public Library, Congregation of the Sisters of St. Joseph in Canada, California Historical Society and many others) and a private physical archive located in a tin box in my grandmother's house.

This led to a series of pictures inspired by this contrast. The series was exhibited in a solo show called Fermez les yeux.

Sara Mickey: Fermez les yeux

The vastness and the richness of topics of the images I accumulated constantly triggered my creativity and my sense of humour. They often made me ask myself “why do those pictures exist?”

The images – especially those more vernacular, random and unforeseen – became the objets trouvés I could rework using my imagination and reality.

During this dadaist-inspired net-surfing, the most fertile encounter of recent years has been the one with the collections of two major London institutions: the British Library and the Wellcome Collection digital archives.

I was about to move from Italy to London and so my artistic research was about to change, inspired by this encounter.

I started to become interested in the aesthetics of the Victorian era and in the concept of the museum as an extension of a wunderkammer.

I started collecting images of naturalia² and decided to transform them into artificialia in my studio. And so I did, creatively merging and morphing these images. In 2013 I produced a digital collage of a scientific illustration of a butterfly and a medical lithograph of a vulva, which was exhibited in public space in Bologna during the CHEAP poster festival.

Cheap Poster Festival
Posters as part of the CHEAP poster festival

This collage of images from the British Library and the Wellcome Collection became the first piece of the larger project Il muro delle meraviglie – the wall of wonders – for which I chose to use the wall of my living room in my home/atelier in NW London.

Il muro delle meraviglie started as a joke to mock the colonialist aesthetic of Victorian museum collections and it became a work of art. Among the wonders I subsequently added, you can find that first collage of the butterfly and the vulva, which I decided to call “Do you want to see my butterfly collection?” to make my queer/feminist perspective encounter the delicacy of naturalistic illustrations of butterflies.

The title, in Italian, refers to an apparently naïve question which has an explicit sexual allusion.

The person who says “come see my butterfly collection” might be suggesting it to obtain something more, as the butterfly is used as a metaphor for the female sex.

Sara Deep Thrash
Installation at DEEP THRASH

This work criticises the male chauvinist obsession with cataloguing, understood as an activity aimed more at showing off than simply showing.

It represents a feminist critique and re-appropriation of such images.

Here the butterflies become proper “c*nts” and give visibility to the female genitalia.

It was first exhibited in 2013 on the streets of Bologna (IT) during the CHEAP festival and at a queer demonstration, thanks to C*ntemporary.

If I hadn’t had access to the BL and the Wellcome digital archives, none of this would have been possible.

Finally, I would like to thank BL Labs for the support I have received, and I am excited about the new experiments and projects waiting for me around the corner.

Footnotes

  1. Flâneur: Flâneur is a French term meaning ‘stroller’ or ‘loafer’ used by nineteenth-century French poet Charles Baudelaire to identify an observer of modern urban life. Dada raised the tradition of flânerie to the level of an aesthetic operation. The Parisian walk described by Walter Benjamin in the 1920s is utilized as an art form that inscribes itself directly in real space and time, rather than on a medium.
  2. Naturalia: creatures and natural objects, with a particular interest in monsters.

29 November 2019

Introducing Filipe Bento - BL Labs Technical Lead


Posted by Filipe Bento, BL Labs Technical Lead

I am passionate about libraries and digital initiatives within them, and am particularly interested in Open Knowledge, scholarly communication, scientific information dissemination, (Linked) Open Data, and all the innovative services that can be offered to promote their ultimate dissemination and usage, not only within academia, but also within the wider community such as industry and society. I have over twenty years' experience in developing and supporting library tools, some of which have replaced manual methods with automation to make life easier for the people who work in or use libraries.

Before working at the British Library, I was an independent consultant in the areas of digital strategies and initiatives, library technologies, information management, digital policies, Software as a Service (SaaS) and Open Source Software (OSS). Previous to that, I worked at EBSCO Information Services in several roles, firstly as the Discovery Service Engineering Support Team Manager (Europe and Latin America) and for three years as the Software Services, Application Programming Interfaces (API) and Applications (Apps) manager. My last role at EBSCO was implementing and managing the EBSCO App Store which involved working with several departments within the organisation such as marketing and legal.

Filipe Bento giving a talk at the BAD conference in the Azores
Giving a talk at the National Congress of BAD (Portuguese Librarians, Archivists and Documentalists Association), in the Azores

I helped the University of Aveiro's Library become the first Portuguese adopter of reference Open Source Software (OSS)  - OJS [Open Journal Systems] and implemented the institutional digital repository DSpace for the university (which included a massive data transformation and records deposit, often from citations exported from Scopus). I started my career as a lecturer and then as a computer specialist at the University of Aveiro’s Library, coordinating the development of information systems for its many branches for over fifteen years.

My PhD research in Information and Communication in Digital Platforms gave me the opportunity to connect with my professional interests in libraries, especially in the areas of information discovery. In my PhD, I was able to implement VuFind with innovative community features, as a proposal for the university, which involved engaging actively in its developer community, providing general and technical support in the process. My thesis is available via the link "Search 4.0: Integration and Cooperation Confluence in Scientific Information Discovery".

University of Aveiro (main campus), Portugal

I have also been very active in a number of communities: I am a former chairman of the board of USE.pt, the Portuguese Ex Libris Systems’ Users Association, and a former member of the DigiMedia Research Center - Digital Media and Interaction at the University of Aveiro.

In my personal life I have been a radio and club DJ and worked on a number of personal music projects. I enjoy photography and video and am a keen traveler. I especially like being behind the wheels of cars / motorbikes and the propellers of drones.

I am really excited to be joining the BL Labs team as I believe it provides an excellent opportunity to apply my skills, knowledge and expertise in library digital collections development, systems, data and APIs in a digital scholarship and wider context. I am really looking forward to offering practical advice and implementations in providing access to data, data curation, data visualisation, text and data mining and interactive web-based computing environments such as Jupyter Notebooks, to name a few. BL Labs and the British Library offer a rich, innovative and stimulating environment to explore what their staff and users want to do with the Library's incredible and diverse digital collections.

30 October 2019

Workshop on “Digitisation Workflows & Digital Research Studies Methodologies”


In this post, Nicolas Moretto, Metadata Systems Analyst at the British Library, reflects on his work trip to India.

Earlier this year I was given the opportunity to attend a workshop on “Digitisation Workflows & Digital Research Studies Methodologies” held at the National Centre for Biological Sciences (NCBS) in Bangalore, India.

The workshop, which was held on the NCBS campus in the northern part of Bangalore, was jointly organised by Tom Derrick (Two Centuries of Indian Print - 2CIP) and our host Venkat Srinivasan who is the archivist at NCBS. Tom represented the 2CIP project while I attended to cover different metadata aspects. The event was attended by colleagues from 26 different institutions. Tom and I were kindly provided with accommodation on the campus.

a photo showing the workshop participants sitting outside the main building at NCBS campus

Attendees of the workshop outside the NCBS main building

The workshop was intended as an opportunity to learn more about cataloguing, digitisation and OCR, and for the Indian participants to meet colleagues from Bangalore and other parts of India, share experiences, exchange ideas and discuss common standards and best practices. The chance to meet with colleagues working on similar activities – and encountering similar challenges – was an important aspect of the workshop. Most attendees were not professional archivists but had come into archives from academic and other backgrounds and had been exposed to archives and cultural heritage in different ways. All participants shared a high level of enthusiasm for archives and a passion for preserving cultural heritage and the memory of their communities.

On the left: The Safeda Room at NCBS. On the right: the NCBS campus offered space for discussions during the breaks

 

The topics of the two-day workshop ranged from talks on description and arrangement of material (archival and related discovery standards), presentations on specific projects to digitisation workflows and OCR. Tom gave a practical demo of OCR tools for Indic scripts. I gave a presentation on each day, covering metadata description as well as reuse and discovery.

Ten of the Indian institutions presented five-minute lightning talks covering a diverse range of initiatives and describing their archival collections. The Ashoka Archives of Contemporary India presented their collection, which includes the Mahatma Gandhi papers as well as material from other Indian politicians and academics. The Keystone Foundation gave an overview of the opportunities and challenges around their work with indigenous communities in India. Their aim is to challenge traditional portrayals of indigenous culture by employing oral history interviews, which give a voice to parts of the culture that would otherwise remain unheard. The French Institute of Pondicherry featured material that had been digitised for several Endangered Archives Programme (EAP) projects, including ceiling murals and glass frames. The participants from FLAME University presented a project of digitising Indian cookbooks, showing the interdependencies between caste and cooking. The multimedia resource Sahapedia (https://www.sahapedia.org/) was presented as a way of curating Indian heritage in an online environment. All participants were looking for ways to make cultural heritage more accessible using digital tools. On the afternoon of the second day, the participants had an opportunity to undertake a hands-on activity testing OCR tools using their own material.

The workshop was well received and feedback was overall positive. The participants voiced interest in receiving more in-depth practical training and how-to guides around cataloguing and metadata capture, setting up systems as well as preservation and conservation.

On the left: Maya Dodd from FLAME University presents the Indian recipes project. On the right: Venkat giving a tour of the NCBS archive

 

On the evening of the first day, Venkat gave us a tour of the NCBS archives, which he had built up from scratch, working with NCBS researchers and with the help of student volunteers. The archive was remarkably open, inviting in students and staff even if they did not have an explicit research interest. Venkat was very interested in maintaining it as an open space. His archive is accompanied by an open and evolving exhibition space, which students can contribute to.

Setting up archives in India is not an easy undertaking, and Venkat has put in a tremendous effort to make it work. Even the essentials can be difficult to come by: for example, there is no supplier of archival materials in India, so Venkat had to import all his acid-free boxes from Germany.

On my last day, I accompanied Tom on a visit to the Karnataka State Central Library. The Director of the Department of Public Libraries, Dr. Satish Kumar Hosamani was not present, but his team kindly offered to give us a tour of the library. The Librarian showed us the round reading room and newspaper reading room and the collection of rare books and manuscripts. The State Library is planning to digitise these in the near future. This activity is currently awaiting approval and funding from the Karnataka state government.


On the left: Karnataka State Central Library in Cubbon Park. On the right: the round reading room in the State Central Library

 

Trying to find our way to the library, we discovered the existence of a “British Library Road” in Bangalore but were unable to reach it due to the customary extremely heavy traffic in Bangalore. Getting to and from destinations usually took a long time. The best way to get around over short distances was by “Tuk-tuk”, the ever-present means of transport in Indian cities.

On the left: British Library Road in Bangalore. On the right: view from a Tuk-Tuk - the traffic in Bangalore was eternally gridlocked!

 

03 October 2019

BL Labs Symposium (2019): Book your place for Mon 11-Nov-2019


Posted by Mahendra Mahey, Manager of BL Labs

The BL Labs team are pleased to announce that the seventh annual British Library Labs Symposium will be held on Monday 11 November 2019, from 9:30 - 17:00* (see note below) in the British Library Knowledge Centre, St Pancras. The event is FREE, and you must book a ticket in advance to reserve your place. Last year's event was the largest we have ever held, so please don't miss out and book early!

*Please note that, directly after the Symposium, we have teamed up with an interactive/immersive theatre company called 'Uninvited Guests' for a specially organised early evening event for Symposium attendees (the full cost is £13 with some concessions available). Read more at the bottom of this posting!

The Symposium showcases innovative and inspiring projects which have used the British Library’s digital content. Last year's Award winners drew attention to artistic, research, teaching & learning, and commercial activities that used our digital collections.

The annual event provides a platform for the development of ideas and projects, facilitating collaboration, networking and debate in the Digital Scholarship field as well as being a focus on the creative reuse of the British Library's and other organisations' digital collections and data in many other sectors. Read what groups of Master's Library and Information Science students from City University London (#CityLIS) said about the Symposium last year.

We are very proud to announce that this year's keynote will be delivered by scientist Armand Leroi, Professor of Evolutionary Biology at Imperial College, London.

Armand Leroi
Professor Armand Leroi from Imperial College
will be giving the keynote at this year's BL Labs Symposium (2019)

Professor Armand Leroi is an author, broadcaster and evolutionary biologist.

He has written and presented several documentary series on Channel 4 and BBC Four. His latest documentary was The Secret Science of Pop for BBC Four (2017), presenting the results of an analysis of over 17,000 western pop songs from the US Billboard top 100 charts from 1960 to 2010, carried out together with colleagues from Queen Mary University, with further work published through the Royal Society. Armand has a special interest in how we can apply techniques from evolutionary biology to ask important questions about culture, humanities and what is unique about us as humans.

Previously, Armand presented Human Mutants, a three-part documentary series about human deformity for Channel 4, which was accompanied by an award-winning book, Mutants: On Genetic Variety and the Human Body. He also wrote and presented a two-part series, What Makes Us Human, also for Channel 4. On BBC Four Armand presented the documentaries What Darwin Didn't Know and Aristotle's Lagoon, also releasing the book The Lagoon: How Aristotle Invented Science, which looks at Aristotle's impact on science as we know it today.

Armand's keynote will reflect on his interest and experience in applying techniques he has used over many years from evolutionary biology such as bioinformatics, data-mining and machine learning to ask meaningful 'big' questions about culture, humanities and what makes us human.

The title of his talk will be 'The New Science of Culture'. Armand will follow in the footsteps of previous prestigious BL Labs keynote speakers: Dan Pett (2018); Josie Fraser (2017); Melissa Terras (2016); David De Roure and George Oates (2015); Tim Hitchcock (2014); Bill Thompson and Andrew Prescott (2013).

The symposium will be introduced by the British Library's new Chief Librarian Liz Jolly. The day will include an update and exciting news from Mahendra Mahey (BL Labs Manager at the British Library) about the work of BL Labs, highlighting innovative collaborations BL Labs has been working on, including how it is working with Labs around the world to share experiences, knowledge and lessons learned. There will be news from the Digital Scholarship team about the exciting projects they have been working on, such as Living with Machines and other initiatives, together with a special insight from the British Library’s Digital Preservation team into how they attempt to preserve our digital collections and data for future generations.

Throughout the day, there will be several announcements and presentations showcasing work from projects nominated for the BL Labs Awards 2019, which recognise work that has used the British Library’s digital content in artistic, research, educational and commercial activities.

There will also be a chance to find out who has been nominated and recognised for the British Library Staff Award 2019 which highlights the work of an outstanding individual (or team) at the British Library who has worked creatively and originally with the British Library's digital collections and data (nominations close midday 5 November 2019).

As is our tradition, the Symposium will have plenty of opportunities for networking throughout the day, culminating in a reception for delegates and British Library staff to mingle and chat over a drink and nibbles.

Finally, we have teamed up with the interactive/immersive theatre company 'Uninvited Guests' who will give a specially organised performance for BL Labs Symposium attendees, directly after the symposium. This participatory performance will take the audience on a journey through a world that is on the cusp of a technological disaster. Our period of history could vanish forever from human memory because digital information will be wiped out for good. How can we leave a trace of our existence to those born later? Don't miss out on the chance to book this unique event at 5pm, specially organised to coincide with the end of the BL Labs Symposium. For more information, and for booking (spaces are limited), please visit here (the full cost is £13 with some concessions available). Please note, if you are unable to join the 5pm show, there will be another performance at 19:45 the same evening (book here for that one).

So don't forget to book your place for the Symposium today, as we predict it will be another full house and we don't want you to miss out.

We look forward to seeing new faces and meeting old friends again!

For any further information, please contact labs@bl.uk

02 October 2019

The 2019 British Library Labs Staff Award - Nominations Open!


Looking for entries now!

A set of 4 light bulbs presented next to each other; the third light bulb is switched on. The image is meant as a metaphor for an 'idea'
Nominate a British Library staff member or a team that has done something exciting, innovative and cool with the British Library’s digital collections or data.

The 2019 British Library Labs Staff Award, now in its fourth year, gives recognition to current British Library staff who have created something brilliant using the Library’s digital collections or data.

Perhaps you know of a project that developed new forms of knowledge, or an activity that delivered commercial value to the library. Did the person or team create an artistic work that inspired, stimulated, amazed and provoked? Do you know of a project developed by the Library where quality learning experiences were generated using the Library’s digital content? 

You may nominate a current member of British Library staff, a team, or yourself (if you are a member of staff), for the Staff Award using this form.

The deadline for submission is 12:00 (BST), Tuesday 5 November 2019.

Nominees will be highlighted on Monday 11 November 2019 at the British Library Labs Annual Symposium where some (winners and runners-up) will also be asked to talk about their projects.

You can see the projects submitted by members of staff for the last two years' awards in our online archive, as well as blogs for last year's winners and runners-up.

The Staff Award complements the British Library Labs Awards, introduced in 2015, which recognise outstanding work that has been done in the broader community. Last year's winner focused on the brilliant work of the 'Polonsky Foundation England and France Project: Digitising and Presenting Manuscripts from the British Library and the Bibliothèque nationale de France, 700–1200'.

The runner up for the BL Labs Staff Award last year was the 'Digital Documents Harvesting and Processing Tool (DDHAPT)' which was designed to overcome the problem of finding individual known documents in the United Kingdom's Legal Deposit Web Archive.

In the public competition, last year's winners drew attention to artistic, research, teaching & learning, and commercial activities that used our digital collections.

British Library Labs is a project within the Digital Scholarship department at the British Library that supports and inspires the use of the Library's digital collections and data in exciting and innovative ways. It was previously funded by the Andrew W. Mellon Foundation and is now solely funded by the British Library.

If you have any questions, please contact us at labs@bl.uk.

 

20 September 2019

Labbers of the world unite to write a book in 1 week through a Book Sprint


Posted by Mahendra Mahey, Manager of BL Labs

I can't believe it's been a year since people from national, state, regional, university libraries (as well as a few galleries, archives and museums) met in London to attend the first global 'Library Labs' event at the British Library on 13th and 14th of September 2018. These 'Labs' are increasingly found in cultural heritage and academic institutions around the world and offer a space for their users to experiment and innovate on-site and on-line with their own (and others') digitised and born digital collections and data.

We had over 70 people from 43 institutions and 20 countries attend the London event and it was really wonderful, with a very full programme. There was a palpable sense of excitement and a willingness to share experiences, build new professional relationships and witness the birth of a new international 'Labs' community. Through the event, we were able to understand more about the digital 'Labs' landscape around the world from the results of the Library Labs survey. For example, we learned that many institutions were in the process of planning a 'Lab', and many wanted to learn more about how to set them up, maintain and sustain them, and learn the lessons from those that had already done it. About half of the attendees in London had already set up Labs in their organisations and wanted to share their experiences with other professionals, so that others could build better Labs without having to reinvent the wheel, saving time and precious resources.

Growing an international Cultural Heritage Labs community
Some of the presenters from the first Building Library Labs Event at the
British Library, London, UK on 13-14 September 2018

The event was a mixture of presentations and lightning talks, stories of how labs are developing, parallel discussion groups and debates, many of which were videoed. At the end of the event, the collaborative document we had created contained over 60 edited pages of notes, together with a folder of other useful documents and presentations. We concluded that it would be wonderful to come together to convert these shared experiences into a useful book/guide, perhaps through a Book Sprint. A Book Sprint is where up to 15 people come together for a week and, with minimal distractions, work together to create a book. Each day, while the participants sleep, a team of illustrators and editors working remotely transforms their content for the next day. The week ends with a finished book! A great idea for busy people! We felt it was a nice fit for the Labs community we work in or want to create, which is largely based on a 'mindset' of experimentation, taking risks and being prepared to learn from your mistakes. I started to research how it might be possible to hold such a Book Sprint by talking to the Book Sprint company, which has over 20 years' experience organising and running these book creation events.

Collectively as a group we decided that we would continue to build the Labs community and establish a mailing list. Clemens Neudecker wrote an excellent blog post about the event.

A screen grab from a virtual Zoom meeting of the Building Library Labs community

Subsequently, we held various meetings from October 2018 through to February 2019 (some virtual and some face to face) and agreed to hold our next global Labs meeting at the Royal Danish Library in Copenhagen, Denmark on 4-5 March 2019, again with an action-packed programme, organised with the help of Katrine Gasser and her team at kbtechlab. Directly after that event, some of us participated in a pre-conference workshop as part of Digital Humanities Nordic 2019, DHN-Labs - Digital Humanities and the National and University Libraries and Archives (in the Nordic and Baltic Countries), on 6 March 2019.


Royal Danish Library, Copenhagen, Denmark where the second
Building Library Labs event was held between 4-5 March, 2019

Over 50 people attended the 2-day event in Copenhagen. Although it was similar to the previous event in London, this time we agreed to hold it under the Chatham House Rule (an idea from Kirsty Lingstadt from the University of Edinburgh), which many of us found very liberating.

Again, we managed to produce over 60 pages of notes and collect other relevant and helpful information. It was even more abundantly clear at the end of this event that we would definitely need to find a way for some of us to come together to write a book through the Book Sprint methodology previously proposed.

A very kind and generous offer to explore funding from her institution was made by Milena Dobreva-McPherson, Associate Professor of Library and Information Studies at University College London Qatar. Abigail Potter from the Library of Congress Labs also kindly suggested that she and her team may be able to hold the next global Labs meeting in Washington, USA, on 4-6 May 2020.

Milena and I met in Qatar in May 2019 at the first Museums and Big Data conference, organised by her colleague Georgios Papaioannou, Associate Professor of Museum Studies. We formulated a proposal to UCL Qatar (funded by the Qatar Foundation), which was successful. Milena also managed to obtain funding from the University of Qatar. There has also been support from British Library Labs, the Library of Congress Labs, Qatar National Library and Book Sprint Ltd, who agreed to donate half of the Book Sprint fee to run the event.

What was important from the outset was that the digital version of the book should be made FREELY available on the web to reuse, in line with the spirit and ethos of the group.

Milena also managed to secure funding for research assistants Somia Salim and Fidelity Phiri to help create a global directory of organisations which are doing Labs-style things or might want to. They have also been helping at various Labs-style events, including the Book Sprint.

From the first Building Library Labs event in September 2018 to the present day, there have been various events where the work of this community has been mentioned. Here is a small sample:

In July 2019, we released an open invitation to apply to be part of the Book Sprint and received some fantastic entries. We would like to thank everyone who sent an application, and to reassure them that they can still contribute to the community even if they were not chosen on this occasion.

We can now finally announce who will be attending the Book Sprint...drum roll...:

  1. Abigail Potter, Senior Innovation Specialist with the Library of Congress Digital Innovation Lab. She tweets at @opba.
  2. Aisha Al Abdulla, Section Head of the Digital Repository and Archives at Qatar University Library.
  3. Caleb Derven, Head of Technical and Digital Services at the University of Limerick with overall responsibility for strategy and operations related to collections, electronic resources and library systems. He tweets at @calebderven.
  4. Ditte Laursen, Head of Department, The Royal Library Denmark responsible for the acquisition of digitally born cultural heritage materials, long-term preservation of digital heritage collections, and access to digital cultural heritage collections. She tweets at @DitteDla.
  5. Gustavo Candela, Associate Professor at the University of Alicante and member of the Research and Development department at The Biblioteca Virtual Miguel de Cervantes. He tweets at @gus_candela.
  6. Katrine Gasser, Section Head of IT at The Royal Library Denmark managing a team of 40 IT experts in programming, networking and research. She tweets at @blackat_ and kbtechlab
  7. Kristy Kokegei, Director of Public Engagement at the History Trust of South Australia who oversees the organisation’s public programming, digital engagement, marketing, learning and education programs across 4 State Government funded museums and supporting and enabling 350 community museums and historical societies across South Australia. She tweets at @KristyKokegei and @SAGLAMLab.
  8. Lotte Wilms, Digital Scholarship advisor managing the KB Research Lab and Digital Humanities in libraries advocate, co-chair for the LIBER working group Digital Humanities and a board member of the IMPACT Centre of Competence. She tweets at @Lottewilms.
  9. Mahendra Mahey, Manager of British Library Labs (BL Labs), an Andrew W. Mellon foundation and British Library funded initiative supporting and inspiring the use of its data in innovative ways with scholars, artists, entrepreneurs, educators and innovators through competitions, awards and other engagement activities. He tweets at @BL_Labs and @mahendra_mahey.
  10. Milena Dobreva-McPherson, Associate Professor Library and Information Studies at UCL Qatar with international experience of working in Bulgaria, Scotland and Malta. She tweets at @Milena_Dobreva.
  11. Paula Bray, DX Lab Leader at the State Library of NSW and responsible for developing and promoting an innovation lab utilising emerging and existing web technologies to deliver new ways to explore the Library’s collections and its data. She tweets at @paulabray #dxlab @statelibrarynsw
  12. Sally Chambers, Digital Humanities Research Coordinator at Ghent Centre for Digital Humanities, Ghent University, Belgium and National Coordinator for DARIAH, the Digital Research Infrastructure for the Arts and Humanities in Belgium. She tweets at @schambers3, @GhentCDH and @KBRbe
  13. Sarah Ames, Digital Scholarship Librarian at the National Library of Scotland, responsible for developing a Digital Scholarship Service and launching the Data Foundry. She tweets at @semames1.
  14. Sophie-Carolin Wagner, Co-Founder of RIAT Research Institute for Art and Technology, Co-Editor of the Journal for Research Cultures and Project Manager of ONB Labs at the Austrian National Library.
  15. Stefan Karner, Technical Lead of the ONB Labs at the Austrian National Library, providing access to diverse data and metadata sources within the library, developing a platform for users of the digital library to create and share annotations and other user generated data with each other and the public.
  16. Armin Straube, Teaching Fellow in Library and Information Studies at UCL Qatar. He is an archivist with work experience in data curation, digital preservation and web archiving and tweets at @ArminStraube.

Laia Ros Gasch will be facilitating the Book Sprint and has 10 years of experience as a cultural producer working all over the world with all kinds of groups. Laia speaks English, French, Spanish and Catalan. 

More detailed biographies are available here.

WE NEED YOUR HELP!

We all realise how incredibly lucky and privileged we are to be chosen. However, we want to hear from those of you who are interested in this area. What do you think we should be writing about, who should it be for, and what style of writing should we use? Please HELP us by completing this questionnaire by Monday 23 September at 0600 BST! We will consider your thoughts and opinions seriously when we sit down to write the book on Monday morning in Doha, Qatar.

We would also like your help in disseminating information about how to get hold of the book, on social media and at various events around the world, especially to coincide with International Open Access week 2019 (21-27 October 2019). Planned activities in 2019-2020 include:

We plan to run a 'Read Sprint' in the near future to review the Book and perhaps create an improved version. We know what we will produce next week won't be perfect!

We have plans to ensure that the book is published on an interactive platform so that it becomes a 'living' book, to which others can add chapters, amendments, enhancements and new case studies. We will be making announcements about this soon after the book has been completed.

On a personal note, I feel incredibly grateful, lucky and privileged to have been involved at the very start of this journey. I also feel daunted to be part of the Book Sprint but excited too!

I really want us to create a useful handbook to help cultural heritage organisations build better innovation labs, which are often strapped for resources and need help. I have a strong desire that our ‘Book’ will genuinely help and inspire galleries, libraries, archives, museums, universities and other cultural heritage organisations to learn and benefit from those of us who can talk honestly about and share our experiences. I want to share the risks we have taken and the mistakes we have made, provide realistic lessons and give sensible advice about what we have learned over many years of setting up, maintaining and sustaining innovation labs. I believe this approach could spare many institutions from having to reinvent the wheel and save them time, money and resources too.

The people in this community have a passionate desire to create something useful and meaningful that will help all of us be better at our jobs and build better innovation labs for the benefit of all our users. Hopefully, we will follow the principles of kindness, generous sharing, and understanding and empathy for the contexts in which we work. In short, we sincerely hope it makes a difference and proves that sharing and kindness really can change things.

Now that I have written this, I realise I have done it again: I have written too much! However, I am glad I have written the story of how we got here. What I realise is what a busy year it’s been for everyone, and particularly for people in this community. It’s amazing what we have achieved, and I want to thank everyone who has played an active role, no matter how small. Let’s hope it continues to grow.

Monday morning, fifteen of us have got to write a book, gulp!

14 September 2019

BL Labs Awards 2019: enter before 2100 on Sunday 29th September! (deadline extended)

Add comment

We have extended the deadline for our BL Labs Awards to 21:00 (BST) on Sunday 29th September. Submit your entry here. If you have already entered, you don't have to resubmit; however, we are happy to receive updated entries too.

The BL Labs Awards formally recognise outstanding and innovative work that has been created using the British Library’s digital collections and data.

Submit your entry, and help us spread the word to all interested parties!

This year, BL Labs is commending work in four key areas:

  • Research - A project or activity that shows the development of new knowledge, research methods, or tools.
  • Commercial - An activity that delivers or develops commercial value in the context of new products, tools, or services that build on, incorporate, or enhance the Library's digital content.
  • Artistic - An artistic or creative endeavour that inspires, stimulates, amazes and provokes.
  • Teaching / Learning - Quality learning experiences created for learners of any age and ability that use the Library's digital content.

After the submission deadline of 21:00 (BST) on Sunday 29th September for entering the BL Labs Awards has passed, the entries will be shortlisted. Selected shortlisted entrants will be notified via email by midnight BST on Thursday 10th October 2019. 

A prize of £500 will be awarded to the winner and £100 to the runner up in each Awards category at the BL Labs Symposium on 11th November 2019 at the British Library, St Pancras, London.

The talent of the BL Labs Awards winners and runners up over the last four years has led to the production of a remarkable and varied collection of innovative projects. In 2018, the Awards commended work in four main categories – Research, Artistic, Commercial and Teaching & Learning:

Photo collage

  • Research category Award (2018) winner: The Delius Catalogue of Works: the production of a comprehensive catalogue of works by the composer Delius, based on research using (and integrated with) the BL’s Archives and Manuscripts Catalogue by Joanna Bullivant, Daniel Grimley, David Lewis and Kevin Page from Oxford University’s Music department.
  • Artistic Award (2018) winner: Another Intelligence Sings (AI Sings): an interactive, immersive sound-art installation, which uses AI to transform environmental sound recordings from the BL’s sound archive by Amanda Baum, Rose Leahy and Rob Walker, independent artists and experience designers.
  • Commercial Award (2018) winner: Fashion presentation for London Fashion Week by Nabil Nayal: the Library collection - a fashion collection inspired by digitised Elizabethan-era manuscripts from the BL, culminating in several fashion shows/events/commissions including one at the BL in London.
  • Teaching and Learning (2018) winner: Pocket Miscellanies: ten online pocket-book ‘zines’ featuring images taken from the BL digitised medieval manuscripts collection by Jonah Coman, PhD student at Glasgow School of Art.

For further information about BL Labs or our Awards, please contact us at labs@bl.uk.

Posted by Mahendra Mahey, Manager of British Library Labs.

13 September 2019

Results of the RASM2019 Competition on Recognition of Historical Arabic Scientific Manuscripts

Add comment

This blog post is by Dr Adi Keinan-Schoonbaert, Digital Curator for Asian and African Collections, British Library. She's on Twitter as @BL_AdiKS.

 

Earlier this year, the British Library in collaboration with PRImA Research Lab and the Alan Turing Institute launched a competition on the Recognition of Historical Arabic Scientific Manuscripts, or in short, RASM2019. This competition was held in the context of the 15th International Conference on Document Analysis and Recognition (ICDAR2019). It was the second competition of this type, following RASM2018 which took place in 2018.

The Library has an extensive collection of Arabic manuscripts, comprising almost 15,000 works. We have been digitising several hundred manuscripts as part of the British Library/Qatar Foundation Partnership, making them available on Qatar Digital Library. A natural next step would be the creation of machine-readable content from scanned images, for enhanced search and whole new avenues of research.

Running a competition helps us identify software providers and tool developers, and introduces us to the specific challenges that pattern recognition systems face when dealing with historic, handwritten materials. For this year’s competition we provided a ground truth set of 120 images and associated XML files: 20 pages to be used to train text recognition systems to automatically identify Arabic script, and 100 pages to evaluate the training.

Aside from providing larger training and evaluation sets, for this year’s competition we’ve added an extra challenge – marginalia. Notes written in the margins are often less consistent and less coherent than main blocks of text, and can go in different directions. The competition set out three different challenges: page segmentation, text line detection and Optical Character Recognition (OCR). Tackling marginalia was a bonus challenge!

We had just one submission for this year’s competition – RDI Company, Cairo University, who previously participated in 2018 and did very well. RDI submitted three different methods, and participated in two challenges: text line segmentation and OCR. When evaluating the results, PRImA compared established systems used in industry and academia – Tesseract 4.0, ABBYY FineReader Engine 12 (FRE12), and Google Cloud Vision API – to RDI’s submitted methods. The evaluation approach was the same as last year’s, with PRImA evaluating page analysis and recognition methods using different evaluation metrics, in order to gain an insight into the algorithms.

 

Results

Challenge 1 - Page Layout Analysis

The first challenge set out to identify regions in a page and find out where blocks of text are located. RDI did not participate in this challenge, so only the established industry software mentioned above was analysed. The results can be seen in the chart below:

Chart showing RASM2019 page segmentation results

Google did relatively well here, and the results are quite similar to last year’s. Despite dealing with the more challenging marginalia text, Google’s previous accuracy score (70.6%) has gone down only very slightly to a still impressive 69.3%.

Example image showing Google’s page segmentation

Tesseract 4 and FRE12 scored very similarly, with Tesseract decreasing from last year’s 54.5%. Interestingly, FRE12’s performance on text blocks including marginalia (42.5%) was better than last year’s FRE11 performance without marginalia (40.9%). Analysis showed that Tesseract and FRE often misclassified text areas as illustrations, with FRE doing better than Tesseract in this regard.

Challenge 2 - Text Line Segmentation

The second challenge looked into segmenting text into distinct text lines. RDI submitted three methods for this challenge, all of which returned the text lines of the main text block (as they did not wish to participate in the marginalia challenge). Results were then compared with Tesseract and FineReader, and are reflected below:

Chart showing RASM2019 text line segmentation results

RDI did very well with its three methods, with an accuracy level ranging between 76.6% and 77.6%. However, despite not attempting to segment marginalia text lines, their methods did not perform as well as last year’s method (with 81.6% accuracy). Their methods did seem to detect some marginalia, though very little overall, as seen in the screenshot below.

Example image showing RDI’s text line segmentation results

Tesseract and FineReader again scored lower than RDI, both with decreasing accuracy compared to RASM2018’s results (Tesseract 4 with 44.2%, FRE11 with 43.2%). This is due to the additional marginalia challenge. The Google method does not detect text lines, so the Text Line chart above does not include its results.

Challenge 3 - OCR Accuracy

The third and last challenge was all about text recognition, tackling the correct identification of characters and words in the text. Evaluation for this challenge was conducted four times: 1) on the whole page, including marginalia, 2) only on main blocks of text, excluding marginalia, 3) using the original texts, and 4) using normalised texts. Text normalisation was performed for both ground truth and OCR results, due to the historic nature of the material, occasional unusual spelling, and use/lack of diacritics. All methods performed slightly better when not tested on marginalia; accuracy rates are demonstrated in the charts below:

Chart showing OCR accuracy results, for main text body only (normalised, no marginalia)

Chart showing OCR accuracy results for all text regions (normalised, with marginalia)

It is evident that there are minor differences in the character accuracies for the three RDI methods, with RDI2 performing slightly better than the others. When comparing the OCR accuracy between texts with and without marginalia, there are slightly higher success rates for the latter, though the difference is not significant. This means that the tested methods performed almost as well on the marginalia as they did on the main text, which is encouraging.

Compared with RASM2018, RDI’s results are good but not as good as last year’s (85.44% accuracy), which is likely to be a result of adding marginalia to the recognition challenge. Google performed very well too, considering they did not specifically train or optimise for this competition. Tesseract’s results went down from 30.45% to 25.13%, and FineReader Engine 12 performed better than its previous version FRE11, going up from 12.23% to 17.53% accuracy. However, this is still very low, as handwritten texts are not part of its target material.
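
For readers who want a feel for why the 'original' and 'normalised' evaluations in Challenge 3 can give different scores, here is a minimal Python sketch. It is only an illustration under stated assumptions: the post does not describe PRImA's evaluation tooling or its exact normalisation rules, so the function names (`normalise`, `edit_distance`, `character_accuracy`) are hypothetical, normalisation is approximated by stripping Unicode combining marks (which includes Arabic diacritics), and character accuracy is approximated as one minus the edit distance divided by the ground truth length.

```python
import unicodedata


def normalise(text: str) -> str:
    """Assumed normalisation: drop combining marks and collapse whitespace."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_marks.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete a character from a
                    current[j - 1] + 1,      # insert a character into a
                    previous[j - 1] + cost,  # substitute (or match)
                )
            )
        previous = current
    return previous[-1]


def character_accuracy(
    ground_truth: str, ocr_output: str, normalised: bool = False
) -> float:
    """Approximate character accuracy: 1 - errors / ground truth length."""
    if normalised:
        ground_truth = normalise(ground_truth)
        ocr_output = normalise(ocr_output)
    if not ground_truth:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(ground_truth, ocr_output)
    return max(0.0, 1.0 - errors / len(ground_truth))


# Ground truth with diacritics (kaf, fatha, ta, fatha, ba, fatha)
# against OCR output that recognised only the base letters.
gt = "\u0643\u064e\u062a\u064e\u0628\u064e"
ocr = "\u0643\u062a\u0628"
print(character_accuracy(gt, ocr))                   # 0.5
print(character_accuracy(gt, ocr, normalised=True))  # 1.0
```

In this toy case the OCR output gets every base letter right but omits the diacritics, so it scores 50% against the original text and 100% once both sides are normalised. Real normalisation for historical Arabic material would need more careful rules than simply removing combining marks.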

 

Further Thoughts

RDI-Corporation has its own historical Arabic handwritten and typewritten OCR system, which has been built using different historical manuscripts. Its methods have done well, given the very challenging nature of the documents. Neither Tesseract nor ABBYY FineReader produces usable results, but that’s not surprising since they are both optimised for printed texts, and target contemporary material rather than historical manuscripts.

As next steps, we would like to test these materials with Transkribus, which produced promising results for early printed Indian texts (see e.g. Tom Derrick’s blog post – stay tuned for some even more impressive results!), and potentially Kraken as well. All ground truth will be released through the Library’s future Open Access repository (now in testing phase), as well as through the website of the IMPACT Centre of Competence. Watch this space for any developments!