THE BRITISH LIBRARY

Digital scholarship blog

34 posts categorized "Manuscripts"

26 November 2020

Using British Library Cultural Heritage Data for a Digital Humanities Research Course at the Australian National University

Posted on behalf of Terhi Nurmikko-Fuller, Senior Lecturer, Centre for Digital Humanities Research, Australian National University by Mahendra Mahey, Manager of BL Labs.

The teaching philosophy and pedagogy of the Centre for Digital Humanities Research (CDHR) at the Australian National University (ANU) focus on research-fuelled, practice-led, object-orientated learning. We value collaboration, experimentation, and individual growth, rather than adhering to a standardised evaluation matrix of exams or essays. Instead, students enrolled in jointly-taught undergraduate and postgraduate courses are given a task: to innovate at the intersection of digital technologies and cultural heritage sector institutions. They are given a great degree of autonomy, and are trusted to deliver. Their aim is to create digital prototypes, which open up GLAM sector material to a new audience.

HUMN2001: Digital Humanities Theories and Projects, and its postgraduate equivalent HUMN6001, are core courses for the programs delivered from the CDHR. HUMN2001 is a compulsory course for both the Minor and the Major in Digital Humanities for the Bachelor of Arts; HUMN6001 is a core, compulsory course in the Masters of Digital Humanities and Public Culture. Initially the course structure was quite different: experts would be invited to guest lecture on their Digital Humanities projects, and the students were tasked with carrying out critical evaluations of digital resources of various kinds. What quickly became apparent was that, without experience of digital projects, the students struggled to meaningfully and thoughtfully evaluate the projects they encountered. Many focused exclusively on the user-interface; too often critical factors like funding sources were ignored; and the critical evaluative context in which the students operated was greatly skewed by their experiences of tools such as Google and platforms such as Facebook.

The solution to the problem became clear - students would have to experience the process of developing digital projects themselves before they could reasonably be expected to evaluate those of others. This revelation brought on a paradigm shift in the way in which the CDHR engages with students, projects, and their cultural heritage sector collaborators.

In 2018, we reached out to colleagues at the ANU for small-scale projects for the students to complete. The chosen project was the digitisation and the creation of metadata records for a collection of glass slides that form part of the Heritage in the Limelight project. The enthusiasm, diligence, and care that the students applied to working with this external dataset (external only to the course, since this was an ANU-internal project) gave us confidence to pursue collaborations outside of our own institution. In Semester 1 of 2019, Dr Katrina Grant’s course HUMN3001/6003: Digital Humanities Methods and Practices ran in collaboration with the National Museum of Australia (NMA) to almost unforeseeable success: the NMA granted five of the top students a one-off stipend of $1,000 each, and continued working with the students on their projects, which were then added to the NMA’s Defining Moments Digital Classroom, launched in November 2020. This collaboration was featured in a piece in the ANU Reporter, the University’s internal circular. 

Encouraged by the success of Dr Grant’s course, and presented with a serendipitous opportunity to meet up at the Australasian Association for Digital Humanities (aaDH) conference in 2018, where Mahendra Mahey was giving the keynote, I reached out to him to propose a similar collaboration. In Semester 2, 2019 (July to November), HUMN2001/6001 ran in collaboration with the British Library.

Our experiences of working with students and cultural heritage institutions in the earlier semester had highlighted some important heuristics. As a result, the delivery of HUMN2001/6001 in 2019 was much more structured than that of HUMN3001/6003 (which had offered the students more freedom and opportunity for independent research). Rather than focus on a theoretical framework per se, HUMN2001/6001 focused on the provision of transferable skills that improved the delivery and reporting of the projects, and could be cited directly in future employment opportunities as a skills-base. These included project planning and time management (such as Gantt charts and SCRUM as a form of agile project management), and each project was to be completed in groups.

The demographic make-up of each group had to follow three immutable rules:

  • First, each team had to be interdisciplinary, with students from more than one degree program.
  • Second, the groups had to be multilingual: the members could not all share the same first language, or be monolingual in the same language.
  • Third, the group had to represent more than one gender.

Although not all groups strictly implemented these rules, the ones that did benefitted from the diversity and critical lens afforded by this richness of perspective, and produced the top projects.

Three examples that best showcase the diversity (and the creative genius!) of these groups and their approach to the British Library’s collection include a virtual reality (VR) concert hall, a Choose-Your-Own-Adventure game travelling through Medieval manuscripts, and an interactive treasure hunt mobile app.

Examples of student projects

(VR)²: Virtuoso Rachmaninoff in Virtual Reality

Research Team: Angus Harden, Noppakao (Angel) Leelasorn, Mandy McLean, Jeremy Platt, and Rachel Watson

Figure 1: Angel Leelasorn testing out (VR)²
Figure 2: Snapshots documenting the construction of (VR)²

This project is a VR experience of the grand auditorium of the Bolshoi Theatre in Moscow. It has an audio accompaniment of Sergei Rachmaninoff’s Prelude in C# Minor, Op.3, No.2, the score for which forms part of the British Library’s collection. Reflective of the personal experiences of some of the group members, the project was designed to increase awareness of mental health, and throughout the experience the user can encounter notes written by Rachmaninoff during bouts of depression. The sense of isolation is achieved by the melody playing in an empty auditorium. 

The VR experience was built using Autodesk Maya and Unreal Engine 4. The music was produced using MIDI data, with each note individually entered into Logic Pro X, and finally played through the Addictive Keys Studio Grand virtual instrument.

The project is available through a website with a disclosure, and links to various mental health helplines, accessible at: https://virtuosorachmaninoff.wixsite.com/vrsquared

Fantastic Bestiary

Research Team: Jared Auer, Victoria (Vick) Gwyn, Thomas Larkin, Mary (May) Poole, Wen (Raven) Ren, Ruixue (Rachel) Wu, Qian (Ariel) Zhang

Figure 3: Homepage of A Fantastic Bestiary

This project is a bilingual Choose-Your-Own-Adventure hypertext game that engages with the British Library’s collection of Medieval manuscripts (such as Royal MS 12 C. xix, folios 12v-13, based on the Greek Physiologus and the Etymologiae of St. Isidore of Seville), first discovered through the Turning the Pages digital feature. The project workflow included design and background research, resource development, narrative writing, animation, translation, audio recording, and web development. Not only does it open up the Medieval manuscripts to the public in an engaging and innovative way through five fully developed narratives (~2,000-3,000 words each), but all the content is also available in Mandarin Chinese.

The team used a plethora of different tools, including Adobe Animate, Photoshop, Illustrator, and Audition, as well as Audacity. The website was developed using HTML, CSS, and JavaScript in the Microsoft Visual Studio Integrated Development Environment.

The project is accessible at: https://thomaslarkin7.github.io/hypertextStory/

ActionBound

Research Team: Adriano Carvalho-Mora, Conor Francis Flannery, Dion Tan, Emily Swan

Figure 4: (Left) Testing the app at the Australian National Botanical Gardens, (Middle) An example of one of the tasks to complete in ActionBound (Right) Example of sound file from the British Library (a dingo)

This project is a mobile application, designed as a location-based authoring tool inspired by the Pokemon Go! augmented reality mobile game. This educational scavenger-hunt aims to educate players about endangered animals. Using sounds of endangered or extinct animals from the British Library’s collection, but geo-locating the app at the Australian National Botanical Gardens, this project is a perfect manifestation of truly global information sharing and enrichment.

The team used a range of available tools and technologies to build this Serious Game or Game-With-A-Purpose. These include GPS and other geo-locating (and geo-caching) technologies; the team also created QR codes to be scanned during the hunt, and mapped locations using OpenStreetMap.

The app can be downloaded from: https://en.actionbound.com/bound/BotanicGardensExtinctionHunt

Course Assessment

Such a diverse and dynamic learning environment presents some pedagogical challenges and requires a new approach to student evaluation and assessment. The obvious question is how to grade such vastly different projects fairly, objectively, and comprehensively, especially since they differ not only in methodology and data but also in the existing level of skills within each group. The approach I took to grading these assignments is one that I believe will have longevity and, to some extent, scalability. Indeed, I have successfully applied the same rubric in the evaluation of similarly diverse projects created for the course in 2020, when it ran in collaboration with the National Film and Sound Archives of Australia.

The assessment rubric for this course rewards students on two axes: ambition and completeness. This means that projects that were not quite completed due to their scale or complexity are rewarded for their vision, and for the willingness of the students to push boundaries, do new things, and take on a challenge. The grading system allows for four possible outcomes: a High Distinction (for 80% or higher), Distinction (70-79%), Credit (60-69%), and Pass (50-59%). Projects which are ambitious and completed to a significant extent land in the 80s; projects that are either ambitious but not fully developed, or relatively simple but completed, receive marks in the 70s; those that very literally engaged with the material and implemented a technologically straightforward solution (such as building a website using WordPress or Wix, or using one of the suite of tools from Northwestern University’s Knightlab) were awarded marks in the 60s. Students were also rewarded for engaging with tools and technologies they had no prior knowledge of. Furthermore, in week 10 of a 12-week course, we ran a Digital Humanities Expo! event, in which the students showcased their projects and received user-feedback from staff and students at the ANU. Students able to factor these evaluations into their final project exegeses were also rewarded by the marking scheme.
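For readers who think in code, the grade bands above can be written as a simple lookup. This is only an illustrative sketch: the function name `grade_band` is hypothetical, and the "Fail" label for marks below 50 is an assumption, since only the four passing outcomes are named here.

```python
def grade_band(mark: float) -> str:
    """Map a percentage mark to one of the grade bands named in the post."""
    if not 0 <= mark <= 100:
        raise ValueError("mark must be between 0 and 100")
    if mark >= 80:
        return "High Distinction"
    if mark >= 70:
        return "Distinction"
    if mark >= 60:
        return "Credit"
    if mark >= 50:
        return "Pass"
    # Assumption: the post does not name an outcome for marks below 50.
    return "Fail"
```

The lookup says nothing about how a mark is reached; weighing ambition against completeness remains a matter of the marker's judgement.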

Notably, the vast majority of the students completed the course with marks of 70 or higher (in the top two grade brackets). Undoubtedly, the unconventional nature of the course is one of its greatest assets. Engaging with a genuine cultural heritage institution acted as motivation for the students. The autonomy and trust placed in them was empowering. The freedom to pursue the projects that they felt best reflected their passions and interests, in response to a national collection of international fame, resulted, almost invariably, in the students rising to the challenge and even exceeding expectations.

This was a learning experience beyond the rubric. To succeed, students had to develop the transferable skills of project-planning, time-management and client interaction that would support a future employment portfolio. The most successful groups were also the most diverse groups. Combining voices from different degree programs, languages, cultures, genders, and interests helped promote internal critical evaluations throughout the design process, and helped the students engage with the materials, the projects, and each other in a more thoughtful way.

Figure 5: Two groups discussing their projects with Mahendra Mahey
Figure 6: National Museum of Australia curator Dr Lily Withycombe user-testing a digital project built using British Library data, 2019.
Figure 7: User-testing feedback! Staff and students came to see the projects and support our students in the Digital Humanities Expo in 2019.

Terhi Nurmikko-Fuller Biography

Dr. Terhi Nurmikko-Fuller

Terhi Nurmikko-Fuller is a Senior Lecturer in Digital Humanities at the Australian National University. She examines the potential of computational tools and digital technologies to support and diversify scholarship in the Humanities. Her publications cover the use of Linked Open Data with musicological information, library metadata, the narrative in ancient Mesopotamian literary compositions, and the role of gamification and informal online environments in education. She has created 3D digital models of cuneiform tablets, carved boab nuts, animal skulls, and the Black Rod of the Australian Senate. She is a British Library Labs Researcher in Residence and a Fellow of the Software Sustainability Institute, UK; an eResearch South Australia (eRSA) HASS DEVL (Humanities Arts and Social Sciences Data Enhanced Virtual Laboratory) Champion; an iSchool Research Fellow at the University of Illinois at Urbana-Champaign, USA (2019 - 2021); a member of the Australian Government Linked Data Working Group; and, since September 2020, a member of the Territory Records Advisory Council for the Australian Capital Territory Government.

BL Labs Public Awards 2020 - REMINDER - Entries close NOON (GMT) 30 November 2020

Inspired by this work that uses the British Library's digitised collections? Have you done something innovative using the British Library's digital collections and data? Why not consider entering your work for a BL Labs Public Award 2020 and win fame, glory and even a bit of money?

This year's public awards are open for submission; the deadline for entry is NOON (GMT), Monday 30 November 2020.

Whilst we welcome projects on any use of our digital collections and data (especially in research, artistic, educational and community categories), we are particularly interested in entries to our public awards that focus on anti-racist work, that address the pandemic, or that use computational methods such as Jupyter Notebooks.

Work will be showcased at the online BL Labs Annual Symposium between 1400 - 1700 on Tuesday 15 December. For more information and a booking form, please visit the BL Labs Symposium 2020 webpage.

11 November 2020

BL Labs Online Symposium 2020 : Book your place for Tuesday 15-Dec-2020

Posted by Mahendra Mahey, Manager of BL Labs

The BL Labs team are pleased to announce that the eighth annual British Library Labs Symposium 2020 will be held on Tuesday 15 December 2020, from 13:45 - 16:55* (see note below) online. The event is FREE, but you must book a ticket in advance to reserve your place. Last year's event was the largest we have ever held, so please don't miss out and book early, see more information here!

*Please note that directly after the Symposium, we are organising an experimental online networking session between 16:55 and 17:30!

The British Library Labs (BL Labs) Symposium is an annual event and awards ceremony showcasing innovative projects that use the British Library's digital collections and data. It provides a platform for highlighting and discussing the use of the Library’s digital collections for research, inspiration and enjoyment. The awards this year will recognise outstanding use of British Library's digital content in the categories of Research, Artistic, Educational, Community and British Library staff contributions.

This is our eighth annual symposium and you can see previous Symposia videos from 2019, 2018, 2017, 2016, 2015, 2014 and our launch event in 2013.

Ruth Ahnert will be giving the BL Labs Symposium 2020 keynote this year.

We are very proud to announce that this year's keynote will be delivered by Ruth Ahnert, Professor of Literary History and Digital Humanities at Queen Mary University of London, and Principal Investigator on 'Living With Machines' at The Alan Turing Institute.

Her work focuses on Tudor culture, book history, and digital humanities. She is author of The Rise of Prison Literature in the Sixteenth Century (Cambridge University Press, 2013), editor of Re-forming the Psalms in Tudor England, as a special issue of Renaissance Studies (2015), and co-author of two further books: The Network Turn: Changing Perspectives in the Humanities (Cambridge University Press, 2020) and Tudor Networks of Power (forthcoming with Oxford University Press). Recent collaborative work has taken place through AHRC-funded projects ‘Living with Machines’ and ‘Networking the Archives: Assembling and analysing a meta-archive of correspondence, 1509-1714’. With Elaine Treharne she is series editor of the Stanford University Press’s Text Technologies series.

Ruth's keynote is entitled: Humanists Living with Machines: reflections on collaboration and computational history during a global pandemic

You can follow Ruth on Twitter.

There will be Awards announcements throughout the event for the Research, Artistic, Community, Teaching & Learning and Staff categories, and this year the audience will vote for their favourite among the shortlisted projects: a people's BL Labs Award!

There will be a final talk near the end of the conference and we will announce the speaker for that session very soon.

So don't forget to book your place for the Symposium today: we predict it will be another full house, the first one online, and we don't want you to miss out. See more detailed information here.

We look forward to seeing new faces and meeting old friends again!

For any further information, please contact labs@bl.uk

04 November 2020

Transforming Legacy Indexes into Catalogue Entries

This guest post is by Alex Hailey, Curator of Modern Archives and Manuscripts. He's on Twitter as @ajrhailey.

In late 2019 I was lucky enough to join BL and National Archives staff to trial a PG Certificate in Computing for Cultural Heritage at Birkbeck. The course provided an introduction to programming with Python, the basics of SQL, and using the two to work with data. Fellow attendees Graham, Nick, Chris and Giulia have written about their work previously, and I am going to briefly introduce one of my project tasks addressing issues with legacy metadata within the India Office Records.

 

The original data

The IOR/E/4 Correspondence with India series consists of 1,112 volumes dating from 1703-1858: four series of letters received by the East India Company (EIC) Court of Directors from the administration in India, and four series of dispatches sent to India. Catalogue entries for these volumes contain only basic information – title, dates, language, reference and former references – and subject, name and place access to the dispatches is provided through 72 index volumes (reference IOR/Z/E/4), which contain around 430,000 entries.

Sample catalogue record of an index entry, IOR/Z/E/4/42/P133, titled 'Pensions, Carnatic, Proceedings respecting'

The original indexes were produced from 1901-1929 by staff of the Secretarial Bureau, led by indexing pioneer Mary Petherbridge; my colleague Antonia Moon has written about Petherbridge’s work in a previous post. When these indexes were converted to the catalogue in the early 2010s, entries within the index volumes were entered as child or sub-items of the index volumes themselves, with information on the related correspondence volumes entered into the free-text Related material field, as shown in the image above.

 

Problem and solution

This approach has caused some issues. Firstly, users attempting to order the related correspondence regularly end up trying to place an order for an index volume instead, which is frustrating. Secondly, it makes it practically impossible to determine the whole contents of a particular volume in a quick and easy manner, which frustrates access and use.

Manually working through 430,000 entries to group the entries by volume would be an impossible task, but I was able to use Python and a library called Pandas, which has a number of useful features for examining and manipulating catalogue data: methods for reading and writing data from multiple sources, flexible reshaping of datasets, and methods for aggregation, indexing, splitting and replacing strings, including regular expressions.

Using Pandas I was able to separate information in the Related material field, restructure the data so that each instance of an index entry formed an individual record, and then group these by volume and further arrange them alphabetically or by page order.
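The steps described above can be sketched in a few lines of Pandas. This is a minimal illustration, not the code used in the project: the column names (`index_ref`, `title`, `related_material`), the semicolon separator, the `IOR/E/4/<volume> p<page>` citation shape, the volume numbers and the second sample title are all invented for the example, and the real Related material field is considerably messier.

```python
import pandas as pd

# Hypothetical sample rows; the real catalogue export uses different columns
# and a less regular free-text format.
records = pd.DataFrame(
    {
        "index_ref": ["IOR/Z/E/4/42/P133", "IOR/Z/E/4/42/P134"],
        "title": [
            "Pensions, Carnatic, Proceedings respecting",
            "Pensions, Madras, Rules for",
        ],
        "related_material": [
            "IOR/E/4/940 p377; IOR/E/4/938 p87",
            "IOR/E/4/938 p12",
        ],
    }
)


def split_index_entries(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per (index entry, correspondence volume, page) citation."""
    out = df.copy()
    # Split the free-text field into a list of individual citations ...
    out["citation"] = out["related_material"].str.split(";")
    # ... and give every citation a row of its own.
    out = out.explode("citation", ignore_index=True)
    out["citation"] = out["citation"].str.strip()
    # Pull the volume reference and page number out with a regular expression.
    parts = out["citation"].str.extract(
        r"^(?P<volume>IOR/E/4/\d+)\s+p(?P<page>\d+)$"
    )
    out["volume"] = parts["volume"]
    out["page"] = pd.to_numeric(parts["page"])
    return out.drop(columns=["related_material"])


entries = split_index_entries(records)

# Two arrangements of the same data: by page order and alphabetically by title.
by_page = entries.sort_values(["volume", "page"])
by_title = entries.sort_values(["volume", "title"])

# The contents of each correspondence volume, as a list of index entry titles.
volume_contents = by_title.groupby("volume")["title"].apply(list)
```

Citations that do not match the assumed pattern come out with empty `volume` and `page` values rather than raising an error, which keeps them available for later checking.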

 

Index entries for reference IOR/Z/E/4/42/P133 split into separate records

Outputs and analysis

Examining these outputs gave us new insights into the data. We now know that the indexes cover 230 volumes of the dispatches only. We were also able to identify incomplete references originally recorded in the Related material field, as well as what appear to be keying errors (references which fall outside of the range of the dispatches series). We can now follow these up and correct errors in the catalogue which were previously unknown.
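A check of this kind can be sketched in plain Python. The example below is an assumption-laden illustration rather than the method actually used: the citation pattern, the function name `check_citation` and the volume bounds passed to it are all hypothetical placeholders, and the real bounds would need to come from the catalogue itself.

```python
import re

# Assumed citation shape: a volume reference followed by a page number.
CITATION = re.compile(r"^IOR/E/4/(?P<volume>\d+)\s+p(?P<page>\d+)$")


def check_citation(citation: str, first_volume: int, last_volume: int) -> str:
    """Classify a citation as 'ok', 'incomplete' or 'out_of_range'."""
    match = CITATION.match(citation.strip())
    if match is None:
        # Missing page number, truncated reference, or some other shape.
        return "incomplete"
    volume = int(match.group("volume"))
    if not first_volume <= volume <= last_volume:
        # Possible keying error: the volume falls outside the series.
        return "out_of_range"
    return "ok"


# Placeholder bounds (800 to 1000) chosen purely for illustration.
samples = ["IOR/E/4/938 p87", "IOR/E/4/93", "IOR/E/4/9380 p87"]
statuses = {c: check_citation(c, 800, 1000) for c in samples}
```

Anything flagged this way is only a candidate for correction; each one still needs checking against the original index volume.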

Comparing the data at volume level arranged alphabetically and by page order, we could appreciate just how much depth there was to the index. Traditional indexes are written with a lot of information redundancy, which isn’t immediately apparent until you group the entries according to their location within a particular volume:

Example of index entries arranged by page order: 'Chart, Maps & Surveys, Harbours, Dalrymples' plans of, sent to India, pp87, 377' is followed by 'East Indian Ports, Plans of Dalrymple publishing, pp87, 377', and so on.

After discussion with the IOR team we have decided to take the alphabetically arranged data and import it to the archives catalogue, so that users selecting a dispatches volume are presented with the relevant index entries immediately.

The original dataset and derived datasets have been uploaded to the Library’s research repository where they are available for download and reuse under a CC0 licence.

To enable further analysis of the index data I have also tried my hand at creating a Jupyter Notebook to use with the derived data. This is intended to introduce colleagues to using Notebooks, Python and the Pandas library to examine catalogue metadata, conducting basic queries, producing a visualisation and exporting subsets for further investigation.

Wordcloud based on terms contained in the IOR/Z/E/4 data, generated within the Jupyter Notebook. The most prominent terms include 'respecting', 'Army', 'India', 'Administration', 'Department' and 'Madras'; smaller ones include 'late', 'allowances', 'paid', 'appointment' and 'repair'.
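A word cloud is drawn from term frequencies, and counting those needs nothing beyond the Python standard library. The sketch below is not taken from the Notebook: the function name `term_counts` and the short stop-word list are assumptions, and the sample titles are simply index entries quoted elsewhere in this post.

```python
import re
from collections import Counter

# Assumed stop-word list; a real analysis would likely use a longer one.
STOP_WORDS = {"of", "to", "the", "and", "for", "in"}


def term_counts(titles):
    """Count how often each term occurs across a list of index entry titles."""
    counts = Counter()
    for title in titles:
        # Lower-case the title and keep runs of letters only.
        words = re.findall(r"[A-Za-z]+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


titles = [
    "Pensions, Carnatic, Proceedings respecting",
    "Chart, Maps & Surveys, Harbours, Dalrymples' plans of, sent to India",
    "East Indian Ports, Plans of Dalrymple publishing",
]
counts = term_counts(titles)
```

Calling `counts.most_common(20)` then gives the terms a visualisation would draw largest.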

My Birkbeck project also included work to create place and institution authority files for the Proceedings of the Governments of India series using keyword extraction with existing catalogue metadata, and this will be discussed in a future post.

Huge thanks must go to Nora McGregor, Jo Pugh and the folks at Birkbeck Department of Computer Science for developing the course and providing us with this opportunity; Antonia Moon and the IOR team for helpful discussions about the IOR data; and the rest of the cohort for moral support when the computer just wouldn’t behave.

Alex Hailey

Curator of Modern Archives and Manuscripts

19 October 2020

The 2020 British Library Labs Staff Award - Nominations Open!

Looking for entries now!

Nominate an existing British Library staff member or a team that has done something exciting, innovative and cool with the British Library’s digital collections or data.

The 2020 British Library Labs Staff Award, now in its fifth year, gives recognition to current British Library staff who have created something brilliant using the Library’s digital collections or data.

Perhaps you know of a project that developed new forms of knowledge, or an activity that delivered commercial value to the library. Did the person or team create an artistic work that inspired, stimulated, amazed and provoked? Do you know of a project developed by the Library where quality learning experiences were generated using the Library’s digital content? 

You may nominate a current member of British Library staff, a team, or yourself (if you are a member of staff), for the Staff Award using this form.

The deadline for submission is NOON (GMT), Monday 30 November 2020.

Nominees will be highlighted on Tuesday 15 December 2020 at the online British Library Labs Annual Symposium where some (winners and runners-up) will also be asked to talk about their projects (everyone is welcome to attend, you just need to register).

You can see the projects submitted by members of staff and public for the awards in our online archive.

Last year's winning entry (2019) focused on the brilliant work of the Imaging Team for the 'Qatar Foundation Partnership Project Hack Days', which were sessions organised for the team to experiment with the Library's digital collections.

The runner-up for the BL Labs Staff Award in 2019 was the Heritage Made Digital team and their social media campaign to promote the British Library's digital collections one language a week from letters 'A' to 'U' (#AToUnknown).

In the public Awards, last year's winners (2019) drew attention to artistic, research, teaching & learning, and community activities that used our data and / or digital collections.

British Library Labs is a project within the Digital Scholarship department at the British Library that supports and inspires the use of the Library's digital collections and data in exciting and innovative ways. It was previously funded by the Andrew W. Mellon Foundation and is now solely funded by the British Library.

If you have any questions, please contact us at labs@bl.uk.

14 September 2020

Digital geographical narratives with Knight Lab’s StoryMap

Visualising the journey of a manuscript’s creation

Working for the Qatar Digital Library (QDL), I recently catalogued British Library oriental manuscript 2361, a musical compendium copied in Mughal India during the reign of Aurangzeb (1618-1707; ruled from 1658). The QDL is a British Library-Qatar Foundation collaborative project to digitise and share Gulf-related archival records, maps and audio recordings as well as Arabic scientific manuscripts.

Figure 1: Equestrian portrait of Aurangzeb. Mughal, c. 1660-70. British Library, Johnson Album, 3.4. Public domain.

The colophons to Or. 2361’s fourteen texts contain an unusually large – but jumbled-up – quantity of information about the places and dates it was copied and checked, revealing that it was largely created during a journey taken by the imperial court in 1663.

Figure 2: Colophon to the copy of Kitāb al-madkhal fī al-mūsīqī by al-Fārābī, transcribed in Delhi, 3 Jumādá I, 1073 hijrī/14 December 1662 CE, and checked in Lahore, 22 Rajab 1073/2 March 1663. Or. 2361, f. 240r.

Seeking to make sense of the mass of bibliographic information and unpick the narrative of the manuscript’s creation, I recorded all this data in a spreadsheet. This helped to clarify some patterns, but wasn’t fun to look at! To accompany an Asian and African Studies blog post, I wanted to find an interactive digital tool to develop the visual and spatial aspects of the story and convey the landscapes and distances experienced by the manuscript’s scribes and patron during its mobile production.

Figure 3: Dull but useful spreadsheet of copy data for Or. 2361.

Many fascinating digital tools can present large datasets, including map co-ordinates. However, I needed to retell a linear, progressive narrative with fewer data points. Inspired by a QNF-BL colleague’s work on Geoffrey Prior’s trip to Muscat, I settled on StoryMap, one of an expanding suite of open-source reporting, data management, research, and storytelling tools developed by Knight Lab at Northwestern University, USA.

 

StoryMap: Easy but fiddly

Requiring no coding ability, the back-end of this free, easy-to-use tool resembles PowerPoint. The user creates a series of slides to which text, images, captions and copyright information can be added. Links to further online media, such as the millions of images published on the QDL, can easily be added.

Figure 4: Back-end view of StoryMap's authoring tool.

The basic incarnation of StoryMap is accessed via an author interface which is intuitive and clear, but has its quirks. Slide layouts can’t be varied, and image manipulation must be completed pre-upload, which can get fiddly. Text was faint unless entirely in bold, especially against a backdrop image. A bug randomly rendered bits of uploaded text as hyperlinks, whereas intentional hyperlinks are not obvious.

 

The mapping function

StoryMap’s most interesting feature is an interactive map that uses OpenStreetMap data. Locations are inputted as co-ordinates, or manually by searching for a place-name or dropping a pin. This geographical data links together to produce an overview map summarised on the opening slide, with subsequent views zooming to successive locations in the journey.

Figure 5: StoryMap summary preview showing all location points plotted.

I had to add location data manually as the co-ordinates input function didn’t work. Only one of the various map styles suited the historical subject-matter; however, its modern street layout felt contradictory. The ‘ideal’ map – structured with global co-ordinates but correct for a specific historical moment – probably doesn’t exist (one for the next project?).

Screen shot of a point dropped on a local map, showing modern street layout
Figure 6: StoryMap's modern street layout implies New Delhi existed in 1663...

With clearly signposted advanced guidance, a support forum, and a link to a GitHub repository, more technically-minded users could take StoryMap to the next level, not least by importing custom maps via Mapbox. Alternative platforms such as Esri’s Classic Story Maps can of course also be explored.

However, for many users, Knight Lab StoryMap’s appeal will lie in its ease of use and accessibility; it produces polished, engaging outputs quickly with a bare minimum of technical input and is easy to embed in web-text or social media. Thanks to Knight Lab for producing this free tool!

See the finished StoryMap, A Mughal musical miscellany: The journey of Or. 2361.

 

This is a guest post by Jenny Norton-Wright, Arabic Scientific Manuscripts Curator from the British Library Qatar Foundation Partnership. You can follow the British Library Qatar Foundation Partnership on Twitter at @BLQatar.

11 September 2020

BL Labs Public Awards 2020: enter before NOON GMT Monday 30 November 2020! REMINDER

The sixth BL Labs Public Awards 2020 formally recognises outstanding and innovative work that has been carried out using the British Library’s data and / or digital collections by researchers, artists, entrepreneurs, educators, students and the general public.

The closing date for entering the Public Awards is NOON GMT on Monday 30 November 2020 and you can submit your entry any time up to then.

Please help us spread the word! We want to encourage anyone interested to submit over the next few months. Who knows, you could even win fame and glory: priceless! We really hope to have another year of fantastic projects, inspired by our digital collections and data, to showcase at our annual online awards symposium on 15 December 2020 (which is open for registration too).

This year, BL Labs is commending work in four key areas that has used or been inspired by our digital collections and data:

  • Research - A project or activity that shows the development of new knowledge, research methods, or tools.
  • Artistic - An artistic or creative endeavour that inspires, stimulates, amazes and provokes.
  • Educational - Quality learning experiences created for learners of any age and ability that use the Library's digital content.
  • Community - Work that has been created by an individual or group in a community.

What kind of projects are we looking for this year?

Whilst we are really happy for you to submit your work on any subject that uses our digital collections, in this significant year we are particularly interested in entries that focus on anti-racist work, or on projects about lockdown and the global pandemic. We are also curious and keen to have submissions that have used Jupyter Notebooks to carry out computational work on our digital collections and data.

After the submission deadline has passed, entries will be shortlisted and selected entrants will be notified via email by midnight on Friday 4th December 2020. 

A prize of £150 in British Library online vouchers will be awarded to the winner and £50 in the same format to the runner up in each Awards category at the Symposium. Of course, entering is at the very least a chance to showcase your work to a wide audience, and in the past this has often resulted in major collaborations.

The talent of the BL Labs Awards winners and runners up over the last five years has led to the production of a remarkable and varied collection of innovative projects described in our 'Digital Projects Archive'. In 2019, the Awards commended work in four main categories – Research, Artistic, Community and Educational:

BL Labs Award Winners for 2019
(Top-Left) Full-Text search of Early Music Prints Online (F-TEMPO) - Research, (Top-Right) Emerging Formats: Discovering and Collecting Contemporary British Interactive Fiction - Artistic
(Bottom-Left) John Faucit Saville and the theatres of the East Midlands Circuit - Community commendation
(Bottom-Right) The Other Voice (Learning and Teaching)

For further detailed information, please visit BL Labs Public Awards 2020, or contact us at labs@bl.uk if you have a specific query.

Posted by Mahendra Mahey, Manager of British Library Labs.

06 May 2020

What did you call me?!

This guest blog post is by Michael St John-McAlister, Western Manuscripts Cataloguing Manager at the British Library.

The coronavirus lockdown is a good opportunity to carry out some of those house-keeping tasks that would never normally get done (and I do not mean re-grouting the bathroom). Anticipating that we would be sent home and knowing I would be limited in the work I could do at home, I asked IT to download all the name authorities in our archives and manuscripts cataloguing system (all 324,106 of them) into a spreadsheet that I would be able to work on at home.

Working through the names, looking for duplicate records, badly-formed names, and typos, my eye was caught by the variety of epithets that have been used over 267 years of manuscripts cataloguing.

For the uninitiated, an epithet is part of a name authority or index term, in the form of a short descriptive label, used to help distinguish people of the same name. Imagine you are writing a biography of a John Smith. You search the Explore Archives and Manuscripts catalogue for any relevant primary sources, only to find three entries for Smith, John, 1800-1870. How would you know which John Smith’s letters and diaries to call up for your research? (Humour me: let us assume our three Smiths all have the same vital dates, unlikely I know, and that the papers are not fully catalogued so the catalogue descriptions of the papers themselves cannot help you decide as they would normally).

Now imagine your catalogue search for John Smith turned up the following entries instead:

Smith, John, 1810-1880, baker

Smith, John, 1810-1880, butcher

Smith, John, 1810-1880, candlestick maker

Instantly, you can see which of the three John Smiths is relevant to your ground-breaking research into the history of candlestick making in the West Riding in the early Victorian era.

The epithet is one element of a well-formed index term and it tends to be a position in life (King of Jordan; Queen of Great Britain and Ireland), a former or alternative name (née Booth; pseudonym ‘Jane Duncan’), a career or occupation (soldier; writer), or a relationship to another person (husband of Rebecca West; son of Henry VII).
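For anyone wanting to try this kind of tidying on their own data, here is a minimal Python sketch of how index terms like the John Smith examples could be split into their parts and grouped, so that the epithets recorded for one name and date range sit side by side. It assumes, purely for illustration, that every term follows the four-part pattern 'surname, forename, dates, epithet' shown above; the names `IndexTerm`, `parse_index_term` and `group_epithets` are hypothetical, and nothing here describes how the Library's cataloguing system actually stores its name authorities.

```python
from collections import defaultdict
from typing import NamedTuple


class IndexTerm(NamedTuple):
    """One name authority, split into its assumed four parts."""
    surname: str
    forename: str
    dates: str
    epithet: str


def parse_index_term(term: str) -> IndexTerm:
    # Split on the first three commas only, so that an epithet which
    # itself contains commas stays in one piece.
    parts = [part.strip() for part in term.split(",", 3)]
    # Pad with empty strings when a term has fewer than four parts.
    parts += [""] * (4 - len(parts))
    return IndexTerm(*parts)


def group_epithets(terms):
    """Group epithets under a (surname, forename, dates) key."""
    groups = defaultdict(list)
    for term in terms:
        parsed = parse_index_term(term)
        key = (parsed.surname.lower(), parsed.forename.lower(), parsed.dates)
        groups[key].append(parsed.epithet)
    return dict(groups)


terms = [
    "Smith, John, 1810-1880, baker",
    "Smith, John, 1810-1880, butcher",
    "Smith, John, 1810-1880, candlestick maker",
]
for key, epithets in group_epithets(terms).items():
    print(key, epithets)
```

A key with several different epithets is a candidate for three genuinely different John Smiths; a key whose epithets are identical or near-identical is more likely to be a duplicate record worth checking by eye.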

Scrolling through the spreadsheet, in amongst the soldiers, writers, composers, politicians, Earls of this, and Princesses of that, I stumbled across a fascinating array of epithets, some obvious, some less so.

There are plenty of examples of the perhaps slightly everyday, but important all the same: bricklayer; plumber; glazier; carpenter. As well as the trades common to us today, some of the trades used as epithets seem very much of their times: button-maker; coach and harness maker; dealer in fancy goods; butterman; copperplate printer; hackney coachman.

Those from the edges of law-abiding society loom large, with people described as burglar and prisoner (presumably the former led to his becoming the latter), convict, assassin, murderer, pickpocket, forger, felon, regicide, and rioter. There are even 50 pirate’s wives in the catalogue (but only seven pirates!). The victims of conflict and persecution also crop up, including prisoner of war, martyr, and galley slave, as well as, occasionally, their tormentors (inquisitor, head jailer, arms dealer).

Some of the epithets have a distinct air of mystery about them (codebreaker; conspirator; spy; alchemist; child prodigy; fugitive; renegade priest; hermit; recluse; mystic; secret agent; intercept operator; dream interpreter) whilst others exude a certain exoticism or loucheness: casino owner; dance band leader; acrobat; mesmerist; jazz poet; pearl fisher; showman; diamond tycoon; charioteer.

Many of the epithets relate to services provided to others. Where would the great and the good be without people to drive them around, manage their affairs, assist in their work, take their letters, make their tea, cook their food, and treat them when they fall ill? So, Marcel Proust’s chauffeur, Charlie Chaplin’s business manager, Gustav Holst’s many amanuenses, Laurence Olivier’s secretary, Virginia Woolf’s charwoman, as well as her cook, and HG Wells’s physician all make appearances in the catalogue.

Then there are the epithets which are less than useful and do not really enlighten us about their subjects: appraiser (of what?); connoisseur (ditto); purple dyer (why only purple?); political adventurer; official. The less said about the usefulness, or otherwise, of epithets such as Mrs, widow, Mr Secretary, and Libyan the better.  Some fall into the ‘What is it?’ category: coastwaiter (and landwaiter, for that matter); pancratiast; paroemiographer; trouvère.*

Another interesting category contains epithets of people with more than one string to their bow. One’s mind boggles at the career path of the ‘music scribe and spy’, or the ‘inn-keeper, gunner, and writer on mathematics’; is awed by the variety of skills of the ‘composer and physician’; marvels at the multi-talented ‘army officer, footballer, and Conservative politician’; and wonders what occurred in someone’s life to earn them the epithet ‘coach-painter and would-be assassin’.

As we have discovered, an epithet can help identify individuals, thus making the reader’s life easier, but if all else fails, and it is not possible to say who someone is, you can always say who they are not. Hence one of our manuscripts cataloguing forebears leaving us with Barry, Garrett; not Garrett Barry of Lisgriffin, county Cork as an index term.

* A type of Customs officer; ditto; a participant in a boxing or wrestling contest, esp. in ancient Greece; a writer or collector of proverbs; a medieval epic poet.

20 April 2020

BL Labs Research Award Winner 2019 - Tim Crawford - F-Tempo

Posted on behalf of Tim Crawford, Professorial Research Fellow in Computational Musicology at Goldsmiths, University of London and BL Labs Research Award winner for 2019 by Mahendra Mahey, Manager of BL Labs.

Introducing F-TEMPO

Early music printing

Music printing, introduced in the later 15th century, enabled the dissemination of the greatest music of the age, which until that time was the exclusive preserve of royal and aristocratic courts or the Church. A vast repertory of all kinds of music is preserved in these prints, and they became the main conduit for the spread of the reputation and influence of the great composers of the Renaissance and early Baroque periods, such as Josquin, Lassus, Palestrina, Marenzio and Monteverdi. As this music became accessible to the increasingly well-heeled merchant classes, entirely new cultural networks of taste and transmission became established and can be traced in the patterns of survival of these printed sources.

Music historians have tended to neglect the analysis of these patterns in favour of a focus on a canon of ‘great works’ by ‘great composers’, with the consequence that there is a large sub-repertory of music that has not been seriously investigated or published in modern editions. By including this ‘hidden’ musical corpus, we could explore for the first time, for example, the networks of influence, distribution and fashion, and the effects on these of political, religious and social change over time.

Online resources of music and how to read them

Vast amounts of music, mostly audio tracks, are now available using services such as Spotify, iTunes or YouTube. Music is also available online in great quantity in the form of PDF files rendering page-images of either original musical documents or modern, computer-generated music notation. These are a surrogate for paper-based books used in traditional musicology, but offer few advantages beyond convenience. What they don’t allow is full-text search, unlike the text-based online materials which are increasingly the subject of ‘distant reading’ in the digital humanities.

With good score images, Optical Music Recognition (OMR) programs can sometimes produce useful scores from printed music of simple texture; however, in general, OMR output contains errors due to misrecognised symbols. The results often amount to musical gibberish, severely limiting the usefulness of OMR for creating large digital score collections. Our OMR program is Aruspix, which is highly reliable on good images, even when they have been digitised from microfilm.

Here is a screen-shot from Aruspix, showing part of the original page-image at the top, and the program’s best effort at recognising the 16th-century music notation below. It is not hard to see that, although the program does a pretty good job on the whole, there are not a few recognition errors. The program includes a graphical interface for correcting these, but we don’t make use of that for F-TEMPO for reasons of time – even a few seconds of correction per image would slow the whole process catastrophically.

The Aruspix user-interface

Finding what we want – error-tolerant encoding

Although OMR is far from perfect, online users are generally happy to use computer methods on large collections containing noise; this is the principle behind the searches in Google Books, which are based on Optical Character Recognition (OCR).

For F-TEMPO, we take the output of the Aruspix OMR program for each page of music and extract a ‘string’ representing the pitch-name and octave of each note in sequence. Since certain errors (especially wrong or missing clefs or accidentals) affect all subsequent notes, we encode the intervals between notes rather than the notes themselves; this also means that we can match transposed versions of the sequences or parts of them. We then use a simple alphabetic code to represent the intervals in the computer.

Here is an example of a few notes from a popular French chanson, showing our encoding method.

A few notes from a Crequillon chanson, and our encoding of the intervals
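
The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than F-TEMPO's actual code: it assumes notes arrive as pitch-name plus octave (for example 'G4'), it ignores accidentals altogether, and it uses a made-up alphabet in which '-' marks a repeated note, capital letters mark rising steps (A = up one step, B = up two, and so on) and lower-case letters mark falling ones. The function names are hypothetical, and the short phrase used as input is invented, not taken from the chanson in the figure.

```python
import re

# Position of each pitch-name within the octave, counted in diatonic steps.
STEP_INDEX = {name: i for i, name in enumerate("CDEFGAB")}


def diatonic_position(note: str) -> int:
    """Map a note such as 'G4' to a step count (7 steps per octave).

    An optional '#' or 'b' is accepted but ignored (an assumption made
    here so that missing accidentals do not change the encoding).
    """
    match = re.fullmatch(r"([A-G])[#b]?(-?\d+)", note)
    if match is None:
        raise ValueError(f"unrecognised note: {note!r}")
    letter, octave = match.groups()
    return int(octave) * 7 + STEP_INDEX[letter]


def interval_code(step: int) -> str:
    """Turn a signed step difference into one character."""
    if step == 0:
        return "-"
    size = min(abs(step), 26)  # clamp very large leaps to 'Z' / 'z'
    letter = chr(ord("a") + size - 1)
    return letter.upper() if step > 0 else letter


def encode_intervals(notes) -> str:
    """Encode a note sequence as a string of interval characters."""
    positions = [diatonic_position(n) for n in notes]
    return "".join(interval_code(b - a) for a, b in zip(positions, positions[1:]))


phrase = ["G4", "A4", "B4", "G4", "G4", "D5"]
transposed = ["C5", "D5", "E5", "C5", "C5", "G5"]  # same phrase, a fourth higher

print(encode_intervals(phrase))      # AAb-D
print(encode_intervals(transposed))  # AAb-D
```

Because only the differences between neighbouring notes are stored, a phrase read with the wrong clef, or printed at a different pitch, yields the same string, which is what makes the encoding tolerant of this class of error.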

F-TEMPO in action

F-TEMPO uses state-of-the-art, scalable retrieval methods to search almost 60,000 page-images for those similar to a query page in less than a second. It successfully recovers matches when the query page is not complete, e.g. when page-breaks are different. Also, close non-identical matches, as between voice-parts of a polyphonic work in imitative style, are highly ranked in results; similarly, different works based on the same musical content are usually well-matched.
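
This post does not go into the retrieval methods themselves, so the following Python sketch uses a generic stand-in technique, not F-TEMPO's own: it breaks each page's interval string into overlapping four-character chunks (n-grams) and ranks pages by how many chunks they share with the query, measured by Jaccard similarity. The page identifiers, the example strings and the function names are all hypothetical. It is meant only to show why a search of this kind can tolerate a few recognition errors or a partial page.

```python
def ngrams(code: str, n: int = 4) -> set:
    """Return the set of overlapping n-character chunks in a string."""
    return {code[i:i + n] for i in range(len(code) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Shared chunks divided by all chunks; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_pages(query: str, pages: dict, n: int = 4) -> list:
    """Return (page_id, score) pairs, best match first."""
    query_chunks = ngrams(query, n)
    scored = [
        (page_id, jaccard(query_chunks, ngrams(code, n)))
        for page_id, code in pages.items()
    ]
    # Sort by descending score, then by page id for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical interval strings for three pages.
pages = {
    "page-1": "AAb-DAAb-D",
    "page-2": "AAb-DAcb-D",  # same music with one recognition error
    "page-3": "ccccCCCC",    # unrelated music
}

for page_id, score in rank_pages("AAb-DAAb-D", pages):
    print(page_id, round(score, 2))
```

A single wrong symbol spoils only the few chunks that overlap it, so the damaged page still scores well above unrelated music. Comparing every page in turn, as this sketch does, would not scale to hundreds of thousands of pages; a real system needs an index.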

Here is a screen-shot from the demo interface to F-TEMPO. The ‘query’ image is on the left, and searches are done by hitting the ‘Enter’ or ‘Return’ key in the normal way. The list of results appears in the middle column, with the best match (usually the query page itself) highlighted and displayed on the right. As other results are selected, their images are displayed on the right. Users can upload their own images of 16th-century music that might be in the collection to serve as queries; we have found that even photos taken with a mobile phone work well. However, don’t expect coherent results if you upload other kinds of image!

F-Tempo-User Interface

The F-TEMPO web-site can be found at: http://f-tempo.org

Click on the ‘Demo’ button to try out the program for yourself.

What more can we do with F-TEMPO?

Using the full-text search methods enabled by F-TEMPO’s API we might begin to ask intriguing questions, such as:

  • ‘How did certain pieces of music spread and become established favourites throughout Europe during the 16th century?’
  • ‘How well is the relative popularity of such early-modern favourites reflected in modern recordings since the 1950s?’
  • ‘How many unrecognised arrangements are there in the 16th-century repertory?’

In early testing we identified an instrumental ricercar as a wordless transcription of a Latin motet, hitherto unknown to musicology. As the collection grows, we are finding more such unexpected concordances, and can sometimes identify the composers of works labelled in some printed sources as by ‘Incertus’ (Uncertain). We have also uncovered some conflicting attributions which could provoke interesting scholarly discussion.

Early Music Online and F-TEMPO

From the outset, this project has been based on the Early Music Online (EMO) collection, the result of a 2011 JISC-funded Rapid Digitisation project between the British Library and Royal Holloway, University of London. This digitised about 300 books of early printed music at the BL from archival microfilms, producing black-and-white images which have served as an excellent proof of concept for the development of F-TEMPO. The c.200 books in EMO judged suitable for our early methods contain about 32,000 pages of music, and form the basis for our resource.

The current version of F-TEMPO includes just under 30,000 more pages of early printed music from the Polish National Library, Warsaw, as well as a few thousand from the Bibliothèque nationale, Paris. We will be incorporating no fewer than a further half a million pages from the Bavarian State Library collection in Munich as soon as we have run them through our automatic indexing system.

 (This work was funded for the past year by the JISC / British Academy Digital Humanities Research in the Humanities scheme. Thanks are due to David Lewis, Golnaz Badkobeh and Ryaan Ahmed for technical help and their many suggestions.)