THE BRITISH LIBRARY

Digital scholarship blog

14 May 2018

Seeing British Library collections through a digital lens

Digital Curator Mia Ridge writes: in this guest post, Dr Giles Bergel describes some experiments with the Library's digitised images...

The University of Oxford’s Visual Geometry Group has been working with a number of British Library curators to apply computer vision technology to their collections. On April 5 of this year I was invited by BL Digital Curator Dr. Mia Ridge to St. Pancras to showcase some of this work and to give curators the opportunity to try the tools out for themselves.  

Visual Geometry’s VISE tool matching two identical images from separate books digitised for the British Library’s Two Centuries of Indian Print project.

Computer vision - the extraction of meaning from images - has made considerable strides in recent years, particularly through the application of so-called ‘deep learning’ to large datasets. Cultural collections provide some of the most interesting test-cases for computer vision researchers, owing to their complexity, to the intensity of interest that researchers bring to them, and to their importance for human well-being. Can computers see collections as humans do? Computer vision is perhaps better regarded as a powerful lens than as a substitute for human curation. A computer can search a large collection of images far more quickly than a single picture researcher can: while it will not bring the same contextual understanding to bear on an image, it has the advantage of speed and comprehensiveness. Sometimes, a computer vision system can surprise the researcher by suggesting similarities that weren’t readily apparent.

As a relatively new technology, computer vision attracts legitimate concerns about privacy, ethics and fairness. By making its state-of-the-art tools freely available, Visual Geometry hope to encourage experimentation and responsible use, and to enlist users to help determine what the tools can and cannot do. Cultural collections provide a searching test-case for the state of the art, due to their diversity as media (prints, paintings, stamped images, photographs, film and more), each of which invites different responses. One BL curator made a telling point by searching the BBC News collection with the term 'football': the system was presented with images previously tagged with that word that related to American, Gaelic, Rugby and Association football. Although inconclusive due to lack of sufficiently specific training data, the test asked whether a computer could (or should) pick the most popular instances, attempt to generalise across multiple meanings, or discern separate usages. Despite increases in processing power and improvements in software methods, computers' ability to generalise, to extract semantic meaning from images or texts, and to cope with overlapping or ambiguous concepts remains very basic.

Other tests with BL images have been more immediately successful. Visual Geometry's Traherne tool, developed originally to detect differences in typesetting in early printed books, worked well with many materials that exhibit small differences, such as postage stamps or doctored photographs. Visual Geometry's Image Search Engine (VISE) has shown itself capable of retrieving matching illustrations in books digitised for the Library's Indian Print project, as well as certain bookbinding features, or popular printed ballads. Some years ago Visual Geometry produced a search interface for the Library's 1 Million Images release. A collaboration between the Library's Endangered Archives programme and Oxford researcher David Zeitlyn on the archive of Cameroonian studio photographer Jacques Toussele employed facial recognition as well as pattern detection. VGG's facial recognition software works on video (BBC News, for example) as well as still photographs and art, and is soon to be freely released to join other tools under the banner of the Seebibyte Project.    

I'll be returning to the Library in June to help curators explore using the tools with their own images. For more information on the work of Visual Geometry on cultural collections, subscribe to the project's Google Group or contact Giles Bergel.      

Dr. Giles Bergel is a digital humanist based in the Visual Geometry Group in the Department of Engineering Science at the University of Oxford.  

The event was supported by the Seebibyte project under EPSRC Programme Grant EP/M013774/1.

 

08 May 2018

The Italian Academies database – now available in XML

Dr Mia Ridge writes: in 2017, we made XML and image files from a four-year, AHRC-funded project, The Italian Academies 1525-1700, available through the Library's open data portal. The original data structure was quite complex, so we would be curious to hear feedback from anyone reusing the converted form for research or visualisations.

In this post, Dr Lisa Sampson, Reader in Early Modern Italian Studies at UCL, and Dr Jane Everson, Emeritus Professor of Italian literature, RHUL, provide further information about the project...

New research opportunities for students of Renaissance and Baroque culture! The Italian Academies database is now available for download. It's in a format called XML which represents the original structure of the database.
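For readers new to XML, here is a minimal sketch of how such a file might be read with Python's standard library. The element names (`academy`, `name`, `city`, `member`) and the sample records are assumptions made purely for illustration; the real export has its own, more complex structure, so check the downloaded files before adapting this.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element names and layout are assumptions,
# not the actual schema of the Italian Academies export.
SAMPLE = """
<academies>
  <academy id="a1">
    <name>Accademia degli Intronati</name>
    <city>Siena</city>
    <member>Placeholder Member One</member>
    <member>Placeholder Member Two</member>
  </academy>
  <academy id="a2">
    <name>Accademia degli Incogniti</name>
    <city>Naples</city>
    <member>Laura Terracina</member>
  </academy>
</academies>
"""


def academies_by_city(xml_text):
    """Group academy names by city."""
    root = ET.fromstring(xml_text.strip())
    by_city = {}
    for academy in root.iter("academy"):
        name = academy.findtext("name", default="").strip()
        city = academy.findtext("city", default="unknown").strip()
        by_city.setdefault(city, []).append(name)
    return by_city


def member_counts(xml_text):
    """Count the <member> children recorded for each academy."""
    root = ET.fromstring(xml_text.strip())
    return {
        academy.findtext("name", default="").strip(): len(academy.findall("member"))
        for academy in root.iter("academy")
    }


if __name__ == "__main__":
    print(academies_by_city(SAMPLE))
    print(member_counts(SAMPLE))
```

Because XML preserves the nesting of the original database, the same pattern of iterating over elements and reading their children should carry over once the real element names are substituted.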

This dedicated database results from an eight-year project, funded by the Arts and Humanities Research Council UK, and provides a wealth of information on the Italian learned academies. Around 800 such institutions flourished across the peninsula over the sixteenth and seventeenth centuries, making major contributions to the cultural and scientific debates and innovations of the period, as well as forming intellectual networks across Europe. This database lists a total of 587 Academies from Venice, Padua, Ferrara, Bologna, Siena, Rome, Naples, and towns and cities in southern Italy and Sicily active in the period 1525-1700. Also listed are more than 7,000 members of one or more academies (including major figures like Galileo, as well as women and artists), and almost 1,000 printed works connected with academies held in the British Library. The database therefore provides an essential starting point for research into early modern culture in Italy and beyond. It is also an invitation to further scholarship and data collection, as these totals constitute only a fraction of the data relating to the Academies.

Laura Terracina, nicknamed Febea, of the Accademia degli Incogniti, Naples

The database is designed to permit searches from many different perspectives and to allow easy searching across categories. In addition to the three principal fields – Academies, People, Books – searches can be conducted by title keyword, printer, illustrator, dedicatee, censor, language, gender, nationality among others. The database also lists and illustrates the mottoes and emblems of the Academies (where known) and similarly of individual academy members. Illustrations from the books entered in the database include frontispieces, colophons, and images from within texts.

Emblem of the Accademia degli Intronati, Siena


The database thus aims to promote research on the Italian Academies in disciplines ranging from literature and history, through art, science, astronomy, mathematics, printing and publishing, censorship, politics, religion and philosophy.

The Italian Academies project, which created this database, began in 2006 as a collaboration between the British Library and Royal Holloway University of London, funded by the Arts and Humanities Research Council and led by Jane Everson. The objective was the creation of a dedicated resource on the publications and membership of the Italian learned Academies active in the period between 1525 and 1700. The software for the database was designed in-house by the British Library, and the first tranche of data was completed in 2009, listing information for academies in four cities (Naples, Siena, Bologna and Padua). A second phase, between 2010 and 2014, developed the database further by listing information for many more cities, including in southern Italy and Sicily, with a major research grant from the AHRC and collaboration with the University of Reading.

The exciting possibilities now opened up by the British Library’s digital data strategy look set to stimulate new research and collaborations by making the records even more widely available, and easily downloadable, in line with Open Access goals. The Italian Academies team is now working to develop the project further with the addition of new data, and the incorporation into a hub of similar resources.

The Italian Academies project team members welcome feedback on the records and on the adoption of the database for new research (contact: www.italianacademies.org).

The original database remains accessible at http://www.bl.uk/catalogues/ItalianAcademies/Default.aspx 

An Introduction to the database, its aims, contents and objectives is available both at this site and at the new digital data site: https://data.bl.uk/iad/

Jane E. Everson, Royal Holloway University of London

Lisa Sampson, University College, London

25 April 2018

Some challenges and opportunities for digital scholarship in 2018

In this post, Digital Curator Dr Mia Ridge shares her presentation notes for a talk on 'challenges and opportunities for digital scholarship' at the British Library's first Research Collaboration 'Open House'.

I'm part of a team that supports the creation and innovative use of the British Library's digital collections. Our working definition of digital scholarship is 'using computational methods to answer existing research questions or challenge existing theoretical paradigms'. In this post/talk, my perspective is informed by my knowledge of the internal processes necessary to support digital scholarship and of the issues that some scholars face when using digital/digitised collections, so I'm not by any means claiming this is a complete list.

Opportunities in digital scholarship

  • Scale: you can explore a bigger body of material computationally - 'reading' thousands, or hundreds of thousands, of volumes of text, images or media files - while retaining the ability to examine individual items as research questions arise from that distant reading
  • Perspective: you can see trends, patterns and relationships not apparent from close reading individual items, or gain a broad overview of a topic
  • Speed: you can test an idea or hypothesis on a large dataset; prototype new interfaces; generate classification data about people, places, concepts; transcribe content

Together, these opportunities enable new research questions.

Sample digital scholarship tools and methods

Some of these processes help get data ready for analysis (e.g. turning images of items into transcribed and annotated texts), while others support the analysis of large collections at scale, improve discoverability or enable public engagement.

  • OCR, HTR - optical character recognition, handwritten text recognition
  • Data visualisation for analysis or publication
  • Text and data mining - applying classifications to or analysing texts, images or media. Key terms include natural language processing, corpus linguistics, sentiment analysis, applied machine learning. Examples include: Voyant tools, Clarifai image classification.
  • Mapping and GIS - assigning coordinates to quantitative or qualitative data
  • Public participation and learning including crowdsourcing, citizen science/history. Examples include In the Spotlight, transcribing information from historical playbills.
  • Creative and emerging formats including games
An experiment with image classification with Clarifai

Putting it all together, we have case studies like Political Meetings Mapper by Dr. Katrina Navickas, a BL Labs winner in 2015. This project, based on digitised 19th century newspapers, used Python scripts to calculate meeting dates and to extract and geocode meeting locations, creating a map of Chartist meetings.
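To give a flavour of what such scripts involve, here is a minimal sketch of the general approach, not the project's actual code. It assumes a notice phrased like 'at Manchester on Monday next', works out the meeting date from the newspaper's issue date, and looks the place up in a tiny hypothetical gazetteer with approximate coordinates. The pattern, the function names and the gazetteer are all illustrative assumptions.

```python
import re
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Hypothetical gazetteer with approximate (latitude, longitude) pairs;
# a real project would use a much larger historical gazetteer.
GAZETTEER = {
    "Manchester": (53.48, -2.24),
    "Leeds": (53.80, -1.55),
}

# Assumed phrasing: "... meeting ... at <Place> on <Weekday> next".
NOTICE = re.compile(
    r"meeting\b.*?\bat (?P<place>[A-Z][a-z]+),? on (?P<day>"
    + "|".join(WEEKDAYS)
    + r") next"
)


def next_weekday(issue_date, weekday_name):
    """Return the first date after issue_date that falls on weekday_name."""
    target = WEEKDAYS.index(weekday_name)
    days_ahead = (target - issue_date.weekday()) % 7
    if days_ahead == 0:
        days_ahead = 7  # "Monday next" in a Monday paper means a week later
    return issue_date + timedelta(days=days_ahead)


def extract_meeting(notice_text, issue_date):
    """Return place, date and coordinates for a notice, or None if no match."""
    match = NOTICE.search(notice_text)
    if match is None:
        return None
    place = match.group("place")
    return {
        "place": place,
        "date": next_weekday(issue_date, match.group("day")),
        "coords": GAZETTEER.get(place),  # None if the place is not listed
    }


if __name__ == "__main__":
    text = "A public meeting will be held at Manchester on Monday next."
    print(extract_meeting(text, date(1841, 5, 1)))
```

Real newspaper text is far messier than this: OCR errors, varied phrasing and ambiguous place names all need handling, which is where much of the research effort goes.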

The Library has created a data portal, data.bl.uk, containing openly licensed datasets. We aim to describe collections in terms of their data format (images, full text, metadata, etc.), licences, temporal and geographic scope, originating purpose (e.g. specific digitisation projects or exhibitions) and collection, and related subjects or themes. Other datasets may be available by request, or digitised via funded partnerships.

We're aware that, currently, it can be hard to use the datasets from data.bl.uk as they can be too large to easily download, store and manipulate. This leads me neatly onto...

Challenges in digital scholarship

  • Digitisation and cataloguing backlog - the material you want mightn't be available without a special digitisation project
  • Providing access to assets for individual items - between copyright and technology, scholars don't always have the ability to download OCR/HTR text, or download all digitised media about an item
  • Providing access to collections as datasets - moving more material into the 'sweet spot' of material that's nicely digitised in suitable formats, usable sizes, with open licences allowing for re-use is an on-going (and expensive, time-consuming) process
  • 'Cleaning' historical data and dealing with gaps in both tools provision and source collections - none of these processes are straightforward
  • Providing access to platforms or suites of tools - how much should the Library take on for researchers, and how much should other institutions or individuals provide?
  • Skills - where will researchers learn digital scholarship methods?
  • Peer review - what if your discipline lacks DS-skilled peers? How can peers judge a website or database if they've only had experience with monographs or articles? How can scholars overcome prejudice about the 'digital'?
  • Versioning datasets as annotations or classifications change, software tools improve over time, transcriptions are corrected, etc - some of these changes may affect the argument you're making
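On the versioning point, one lightweight habit is to record a fingerprint of the exact data you analysed, so that you and your readers can tell later whether the dataset has changed. The sketch below is one possible approach using only Python's standard library; the function name and the sample records are hypothetical.

```python
import hashlib


def dataset_fingerprint(records):
    """Return a SHA-256 hex digest over text records, in the order given."""
    digest = hashlib.sha256()
    for record in records:
        digest.update(record.encode("utf-8"))
        digest.update(b"\n")  # separator so record boundaries affect the hash
    return digest.hexdigest()


# Hypothetical transcriptions before and after a correction.
version_1 = ["playbill 1: Theatre Royal", "playbill 2: Theater Royal"]
version_2 = ["playbill 1: Theatre Royal", "playbill 2: Theatre Royal"]

if __name__ == "__main__":
    print(dataset_fingerprint(version_1))
    print(dataset_fingerprint(version_2))
```

Citing the fingerprint alongside the download date makes it clear which version of a dataset an argument rests on, even if a single corrected transcription is the only difference.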

Overall, I hope the opportunities outweigh the challenges, and it's certainly possible to start with small projects with existing tools and digital sources to explore the potential of a larger project.

If you've used BL data, you can enter the BL Labs awards - they don't close until October so you have time to start an experimental project now! You can also ask the Labs team to reality check your digital scholarship idea based on Library collections and data.

Digital scholarship is constantly shifting so on another date I might have come up with different opportunities and challenges. Let me know if you have challenges or opportunities that you think could be included in this very brief overview!

21 April 2018

On the Road (Again)

Image from the British Library’s Million Images on Flickr, found on p 198 of 'The Cruise of the Land Yacht “Wanderer”; or, thirteen hundred miles in my caravan, etc' by William Gordon Stables, 1886.

Now that British Summer Time has officially arrived, and with it some warmer weather, British Library Labs are hitting the road again with a series of events in Universities around the UK. The aim of these half-day roadshows is to inspire people to think about using the library's digitised collections and datasets in their research, art works, sound installations, apps, businesses... you name it!

A digitised copy of a manuscript is a very convenient medium to work on, especially if you are unable to visit the library in person and order an original item up to a reading room. But there are so many other uses for digitised items! Come along to one of the BL Labs Roadshows at a University department near you and find out more about the methods used by researchers in Digital Scholarship, from data-mining and crowd sourcing to optical character recognition for transcribing the words from an imaged page into searchable text. 

At each of the roadshow events, there will be speakers from the host institution describing some of the research projects they have already completed using digitised materials, as well as members of the British Library who will be able to talk with you about proposed research plans involving digitised resources. 

The locations of this year's roadshows are: 

Mon 9th April - BL Labs Roadshow 2018 (Open University) - internal event

Mon 26th March - BL Labs Roadshow 2018 (CityLIS) - internal event

Thu 12th April - BL Labs Roadshow 2018 (University of Bristol & Cardiff Digital Cultures Network)

Tue 24th April - BL Labs Roadshow 2018 (UCL)

Wed 25th April - BL Labs Roadshow 2018 (University of Kent)

Wed 2nd May - BL Labs Roadshow 2018 (University of Edinburgh)

Tue 15th May - BL Labs Roadshow 2018 (University of Wolverhampton)

Wed 16th May - BL Labs Roadshow 2018 (University of Lincoln)

Tue 5th June - BL Labs Roadshow 2018 (University of Leeds)

See a full programme and book your place using the Eventbrite page for each event.

If you want to discover more about the Digital Collections, and Digital Scholarship at the British Library, follow us on Twitter @BL_Labs, read our Blog Posts, and get in touch with BL Labs if you have some burning research questions!

12 April 2018

The 2018 BL Labs Awards: enter before midnight Thursday 11th October!

With six months to go before the submission deadline, we would like to announce the 2018 British Library Labs Awards!

The BL Labs Awards are a way of formally recognising outstanding and innovative work that has been created using the British Library’s digital collections and data.

Have you been working on a project that uses digitised material from the British Library's collections? If so, we'd like to encourage you to enter that project for an award in one of our categories.

This year, the BL Labs Awards is commending work in four key areas:

  • Research - A project or activity which shows the development of new knowledge, research methods, or tools.
  • Commercial - An activity that delivers or develops commercial value in the context of new products, tools, or services that build on, incorporate, or enhance the Library's digital content.
  • Artistic - An artistic or creative endeavour which inspires, stimulates, amazes and provokes.
  • Teaching / Learning - Quality learning experiences created for learners of any age and ability that use the Library's digital content.

BL Labs Awards 2017 winners. Top left: Research Award winner, 'A large-scale comparison of world music corpora with computational tools'. Top right: Commercial Award winner, 'Movable Type: The Card Game'. Bottom left: Artistic Award winner, 'Imaginary Cities'. Bottom right: Teaching / Learning Award winner, 'Vittoria’s World of Stories'.

There is also a Staff award which recognises a project completed by a staff member or team, with the winner and runner up being announced at the Symposium along with the other award winners.

The closing date for entering your work for the 2018 round of BL Labs Awards is midnight BST on Thursday 11th October 2018. Please submit your entry and/or help us spread the word to all interested and relevant parties over the next few months. This will ensure we have another year of fantastic digital-based projects highlighted by the Awards!

The entries will be shortlisted after the submission deadline (11/10/2018) has passed, and selected shortlisted entrants will be notified via email by midnight BST on Friday 26th October 2018. 

A prize of £500 will be awarded to the winner and £100 to the runner up in each of the Awards categories at the BL Labs Symposium on 12th November 2018 at the British Library, St Pancras, London.

The talent of the BL Labs Awards winners and runners up from 2017, 2016 and 2015 has resulted in a remarkable and varied collection of innovative projects. You can read about some of the 2017 Awards winners and runners up in our other blogs, links below:

British Library Labs Staff Award Winner – Two Centuries of Indian Print


  • Research category Award (2017) winner: 'A large-scale comparison of world music corpora with computational tools', by Maria Panteli, Emmanouil Benetos and Simon Dixon, Centre for Digital Music, Queen Mary University of London
  • Research category Award (2017) runner up: 'Samtla' by Dr Martyn Harris, Prof Dan Levene, Prof Mark Levene and Dr Dell Zhang
  • Commercial Award (2017) winner: 'Movable Type: The Card Game' by Robin O'Keeffe
  • Artistic Award (2017) winner: 'Imaginary Cities' by Michael Takeo Magruder
  • Artistic Award (2017) runner up: 'Face Swap', by Tristan Roddis and Cogapp
  • Teaching and Learning (2017) winner: 'Vittoria's World of Stories' by the pupils and staff of Vittoria Primary School, Islington
  • Teaching and Learning (2017) runner up: 'Git Lit' by Jonathan Reeve
  • Staff Award (2017) winner: 'Two Centuries of Indian Print' by Layli Uddin, Priyanka Basu, Tom Derrick, Megan O’Looney, Alia Carter, Nur Sobers-Khan, Laurence Roger and Nora McGregor
  • Staff Award (2017) runner up: 'Putting Collection metadata on the map: Picturing Canada', by Philip Hatfield and Joan Francis

For any further information about BL Labs or our Awards, please contact us at labs@bl.uk.

11 April 2018

Ambient Literature Festival

As the final months of the Ambient Literature project approach, the research team are convening a series of final events (more on which below), but are also spending time drawing out conclusions and reflections regarding the last two years of work. Below is a guest post discussing this by Tom Abba from the Ambient Literature project and the University of the West of England. You can follow him on Twitter as @tomabba:

When we began in May 2016, we were upfront about the challenges of the work we were going to make and address. Here’s what we said at the launch event at Hachette’s (then shiny new) headquarters in Blackfriars:

Here’s an admission at the start of a research programme:

We don’t know what Ambient Literature is.

We’ve started to map the territory, to define by identifying borders and by testing the edges. It’s important to note, though, that we don’t want to reduce the idea to something tight and defined; rather, our intention is to open it up, to show by doing, making and thinking. We do know that Ambient Literature asks for writing to be specific, to be for this form. That there are rules, grammars of making and thinking about readers and texts in new ways.

Twenty three months later, I think we know what this is, and we’ve made progress toward a set of rules and grammars for making work in this form. Each of our three commissions demonstrates how Ambient Literature might work, and each does so in a completely different way. Duncan Speakman’s It Must Have Been Dark By Then, James Attlee’s The Cartographer’s Confession and Kate Pullinger’s Breathe (made with Editions at Play) ask something of their audience that is particular to the decisions each writer made, how those were translated into a technologically mediated form, and the goals at the heart of each of those works. In different ways, for different reasons, we’re very proud of each of them.

Ambient Literature has been an extended conversation about storytelling, situation, audience, presence and much much more. We opened that conversation up last year at our half-way Symposium, and want to take it much further now. We want to show, and to talk to you all, and celebrate everything that’s been part of this journey. If you’re interested in being part of that conversation and celebration, then our Showcase Festival takes place on 23rd April at the British Library Conference Centre. We’ll be sharing our secrets and discoveries, and letting you look behind the scenes at how each of our projects was created. The event will feature workshops with Duncan Speakman and Kate Pullinger and talks, as well as a guided tour through the London of The Cartographer’s Confession with its author James Attlee and producer Emma Whittaker. We’re aiming the event at publishing industry professionals, students and practitioners, as well as anyone interested in the future of reading and writing. We can promise at the very least you’ll come away knowing something new about digital storytelling. If you would like to attend, please register here and book places on the workshops.

Schedule for Ambient Literature Festival

We’re also taking the whole project to the Hay Festival in May. We’re running workshops, hosting a panel discussion (with guests including Dan Franklin and Joanna Walsh) and are making a new piece of work - Words We Never Wrote - specially for Hay. It premieres at the Festival and explores the meaning of writing, language and storytelling. We’re incredibly proud of this piece - it asks questions about linearity and form, art and suggestion that we’ve been aching to address for years. We’re delighted to be at Hay and, if you want to join us there, we can promise you a little bit of magic when you visit. 

21 March 2018

BL Labs 2017 Symposium: Vittoria's World of Stories, Learning & Teaching Award Winner

Vittoria’s 'World of Stories' - the BL Labs Learning and Teaching Award Winner 2017 - is a project led by parents at Vittoria Primary School through the PTA, with the support of school staff. The aim of the project is to collect and share traditional tales from around the world and creative work by current pupils through workshops, the production of a book, school assemblies, readings and performances, and via the creation of audio, text and images for the school website during the current academic year. The illustrations for the project are drawn from the British Library’s Flickr collection and are displayed alongside pupils’ artwork.

The front cover of Vittoria primary school's 'World of Stories'

Our school is a diverse community of learners: pupils’ families come from a wide range of ethnic and cultural backgrounds. Languages spoken by pupils at home include Arabic, Bengali, Vietnamese, Russian, Chechen, Turkish and Somali. One of the pedagogical goals of the project was to make visible the similarities between well-loved traditional tales and explore how different cultures use the same cast of characters - heroes and heroines, tricksters and magicians, villains and monsters – in order to speak across generations about what it means to be human. We wanted to promote and celebrate the diversity of the multi-cultural community which makes up our school, and show parents and children that the characters and stories they love are shared by others from different cultures.   

 The stories in the book include original works by pupils, gathered through a story-writing competition with winning entries selected by the PTA committee. We also asked parents to nominate traditional tales for inclusion in the collection, and held a bi-lingual (English and Arabic) story-sharing workshop for parents organised by the PTA. During the workshop, parents spoke about well-known traditional tales which they remembered from childhood and discussed the contrasts and similarities between characters and narratives from different cultures. For example, the section of the book which presents ‘bogeyman’ type monsters was developed from discussions in the workshop. We discovered that the Beast from Beauty and the Beast is called ‘Al-Ba’ati’ in Sudan, where the story is known as ‘Jamila wal Ba’ati’. Sudanese parents discussed how ‘Al-Ba’ati’ is used to encourage good behaviour in children, which prompted another parent to share her family’s stories of ‘The Boogerman’ who plays a similar role in persuading children to stay in bed at bedtime.

One of the British Library's Flickr images, used as an illustration

The project also links with our work within the classroom to develop children’s reading skills, through promoting a love of reading and books at home. By showing that we value and celebrate the oral culture of storytelling between parents and children, and by collecting and translating tales from languages other than English, we aim to encourage parents to read with their children and support their learning at home.

The project has had a positive effect within the school community, by promoting dialogue and interaction between parents from different cultures through the parents’ workshop, and by providing a vehicle to celebrate pupils’ achievements to the school community. Parents have also bought copies of the book to share with family and friends. One of our parent contributors took copies of the book to share with older generations of her family in Sudan during a recent visit, and we hope that other parents will do the same.

During the next phase of the project we will be organising a series of readings and performances using the book with different year groups and making audio recordings which we will publish on the school website for parents to download and listen to with their children at home.

If this blog post has stimulated your interest in working with the British Library's digital collections, start a project and enter it for one of the BL Labs 2018 Awards! Join us on 12 November 2018 for the BL Labs annual Symposium at the British Library.

Posted by BL Labs on behalf of Vittoria Primary school

07 March 2018

Breathe, A Digital Ghost Story

Recently I posted about The Cartographer's Confession, an immersive digital story based in London, where readers interact with the app on location. However, the ‘beast from the east’ then arrived in the UK and made walking in London rather a bracing and slippy experience last week!

So if during the cold weather you prefer seeking chills of a different kind, you may like to read Breathe, a digital ghost story, from the comfort and warmth of your sofa or bed. The story takes about fifteen minutes to read, is designed for mobile devices and it is available for free. To start reading, go to http://www.breathe-story.com/.

Written by Kate Pullinger, the work is a collaboration with Editions at Play, which is itself a collaboration between Google Creative Labs Sydney and London-based publisher Visual Editions. The result is a literary experience delivered using Application Programming Interfaces (APIs) and context recognition technology. The app uses place, time, context and environment to place the reader in the story, making the experience individual and personal to each reader. Kate has blogged about creating Breathe on The Writing Platform.

As with the other two Ambient Literature commissioned literary works, the research project team are looking for participants to try out Breathe and talk to them about their reading experience. If you are interested in assisting, please fill out this form and one of the researchers will be in touch via email to arrange a time to talk. If you have any questions about this process, please contact Dr Michael Marcinkowski.

This post is by Digital Curator Stella Wisdom, on Twitter as @miss_wisdom and a member of the Ambient Literature Advisory Board.