Digital scholarship blog

Enabling innovative research with British Library digital collections

Introduction

Tracking exciting developments at the intersection of libraries, scholarship and technology. Read more

14 July 2020

Legacies of Catalogue Descriptions and Curatorial Voice: Training Sessions

This guest post is by James Baker, Senior Lecturer in Digital History and Archives at the University of Sussex.

This month the team behind "Legacies of Catalogue Descriptions and Curatorial Voice: Opportunities for Digital Scholarship" ran two training sessions as part of our Arts and Humanities Research Council funded project. Each standalone session provided instruction in using the software tool AntConc and approaches from computational linguistics for the purposes of examining catalogue data. The objectives of the sessions were twofold: to test our in-development training materials, and to seek feedback from the community in order to better understand their needs and to develop our training offer.

Rather than host open public training, we decided to foster existing partnerships by inviting a small number of individuals drawn from attendees at events hosted as part of our previous Curatorial Voice project (funded by the British Academy). In total thirteen individuals from the UK and US took part across the two sessions, with representatives from libraries, archives, museums, and galleries.

Screenshot of the website for the lesson entitled Computational Analysis of Catalogue Data

Screenshot of the content page and timetable for the lesson
Carpentries-style lesson about analysing catalogue data in AntConc


The training was delivered in the style of a Software Carpentry workshop, drawing on their wonderful lesson template, pedagogical principles, and rapid response to moving coding and data science instruction online in light of the Covid-19 crisis (see ‘Recommendations for Teaching Carpentries Workshops Online’ and ‘Tips for Teaching Online from The Carpentries Community’). In terms of content, we started with the basics: how to get data into AntConc, the layout of AntConc, and settings in AntConc. After that we worked through two substantial modules. The first focused on how to generate, interact with, and interpret a word list, and this was followed by a module on searching, adapting, and reading concordances. The tasks and content of both modules avoided generic software instruction and instead focused on the analysis of free text catalogue fields, with attendees asked to consider what they might infer about a catalogue from its use of tense, what a high volume of capitalised words might tell us about cataloguing style, and how adverb use might be a useful proxy for the presence of controlled vocabulary.
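For readers who like to see these ideas in code, the kind of word-list analysis AntConc performs can be sketched in a few lines of Python. This is an illustrative sketch only (the sample descriptions are invented, and it is not part of the lesson materials), but it shows how a word list and a crude capitalisation measure might be computed from free text catalogue fields:

```python
# A minimal sketch of word-list analysis applied to free-text catalogue
# fields -- illustrative only, with invented sample descriptions.
from collections import Counter
import re

descriptions = [
    "Satire on the South Sea Bubble. Engraving, hand-coloured.",
    "A VIEW of LONDON BRIDGE; etching, published 1751.",
]

# Tokenise: keep alphabetic words only.
tokens = [t for d in descriptions for t in re.findall(r"[A-Za-z]+", d)]

# Word list: token frequencies, as in AntConc's Word List tool.
word_list = Counter(t.lower() for t in tokens)

# Proportion of fully capitalised words -- a rough proxy for cataloguing style.
capitalised = sum(1 for t in tokens if t.isupper() and len(t) > 1)
print(word_list.most_common(3))
print(f"{capitalised / len(tokens):.0%} of tokens are fully capitalised")
```

The same frequency data could then feed concordance-style questions, such as which words surround a recurring term across descriptions.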

Screenshot of three tasks and solutions in the Searching Concordances section
Tasks in the Searching Concordances section

Running Carpentries-style training over Zoom was new to me, and was - frankly - very odd. During live coding I missed hearing the clack of keyboards as people followed along in response. I missed seeing the sticky notes go up as people completed the task at hand. During exercises I missed hearing the hubbub that accompanies pair programming. And more generally, without seeing the micro-gestures of concentration, relief, frustration, and joy on the faces of learners, I felt somehow isolated as an instructor from the process of learning.

But from the feedback we received the attendees appear to have been happy. It seems we got the pace right (we assumed teaching online would be slower than face-to-face, and it was). The attendees enjoyed using AntConc and were surprised, to quote one attendee, "to see just how quickly you could draw some conclusions". The breakout rooms we used for exercises were a hit. And importantly we have a clear steer on next steps: that we should pivot to a dataset that better reflects the diversity of catalogue data (for this exercise we used a catalogue of printed images that I know very well), that learners would benefit from having a list of suggested readings and resources on corpus linguistics, and that we might - to quote one attendee - provide "more examples up front of the kinds of finished research that has leveraged this style of analysis".

These comments and more will feed into the development of our training materials, which we hope to complete by the end of 2020 and which - in line with the open values of the project - is happening in public. In the meantime, the materials are there for the community to use, adapt and build on (more or less) as they wish. Should you take a look and have any thoughts on what we might change or include for the final version, we always appreciate an email or a note on our issue tracker.

"Legacies of Catalogue Descriptions and Curatorial Voice: Opportunities for Digital Scholarship" is a collaboration between the Sussex Humanities Lab, the British Library, and Yale University Library that is funded under the Arts and Humanities Research Council (UK) “UK-US Collaboration for Digital Scholarship in Cultural Institutions: Partnership Development Grants” scheme. Project Reference AH/T013036/1.

07 July 2020

Readings at the intersection of digital scholarship and anti-racism

Digital Curator Mia Ridge writes, 'It seems a good moment to share some of the articles we've discussed as a primer on how and why technologies and working practices in libraries and digital scholarship are not neutral'.

'Do the best you can until you know better. Then when you know better, do better.'

― Attributed to Maya Angelou 

The Digital Scholarship Reading Group is one of the ways the Digital Research team help British Library staff grapple with emerging technologies and methods that could be used in research and scholarship with collections. Understanding the impact of the biases that new technologies such as AI and machine learning can introduce through algorithmic or data sourcing decisions has been an important aspect of these discussions since the group was founded in 2016. As we began work on what would eventually become the Living with Machines project, our readings became particularly focused on AI and data science, aiming to ensure that we didn't do more harm than good.

Reading is only the start of the anti-racism work we need to do. However, reading and discussing together, and bringing the resulting ideas and questions into discussions about procuring, implementing and prioritising digital platforms in cultural and research institutions is a relatively easy next step.

I've listed the topics under the dates we discussed them, and sometimes added a brief note on how it is relevant to intersectional issues of gender, racism and digital scholarship or commercial digital methods and tools. We always have more to learn about these issues, so we'd love to hear your recommendations for articles or topics (contact details here).


Digitizing and Enhancing Description Across Collections to Make African American Materials More Discoverable on Umbra Search African American History by Dorothy Berry

Abstract: This case study describes a project undertaken at the University of Minnesota Libraries to digitize materials related to African American materials across the University's holdings, and to highlight materials that are otherwise undiscoverable in existing archival collections. It explores how historical and current archival practices marginalize material relevant to African American history and culture, and how a mass digitization process can attempt to highlight and re-aggregate those materials. The details of the aggregation process — e.g. the need to use standardized vocabularies to increase aggregation even when those standardized vocabularies privilege majority representation — also reveal important issues in mass digitization and aggregation projects involving the history of marginalized groups.

Discussed June 2020.

The Nightmare of Surveillance Capitalism, Shoshana Zuboff

For this Reading Group Session, we will be doing something a little different and discussing a podcast on The Nightmare of Surveillance Capitalism. This podcast is hosted by Talking Politics, and is a discussion with Shoshana Zuboff, who recently published The Age of Surveillance Capitalism (January 2019).

For those of you who would also like to bring some reading to the table, we can also consult the reviews of this book as a way of engaging with reactions to the topic. Listed below are a few examples, but please bring along any reviews that you find to be especially thought-provoking:

Discussed November 2019. Computational or algorithmic 'surveillance' and capitalism have clear links to structural inequalities. 

You and AI – Just An Engineer: The Politics of AI (video), Kate Crawford

Kate Crawford, Distinguished Research Professor at New York University, a Principal Researcher at Microsoft Research New York, and the co-founder and co-director of the AI Now Institute, discusses the biases built into machine learning, and what that means for the social implications of AI. The talk is the fourth event in the Royal Society’s 2018 series: You and AI.

Discussed October 2018.

'Facial Recognition Is Accurate, if You’re a White Guy'

Read or watch any one of:

'Facial Recognition Is Accurate, if You’re a White Guy' By Steve Lohr

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification by Joy Buolamwini, Timnit Gebru

Abstract: Recent studies demonstrate that machine learning algorithms can discriminate based on classes like race and gender. In this work, we present an approach to evaluate bias present in automated facial analysis algorithms and datasets with respect to phenotypic subgroups. Using the dermatologist approved Fitzpatrick Skin Type classification system, we characterize the gender and skin type distribution of two facial analysis benchmarks, IJB-A and Adience. We find that these datasets are overwhelmingly composed of lighter-skinned subjects (79.6% for IJB-A and 86.2% for Adience) and introduce a new facial analysis dataset which is balanced by gender and skin type. We evaluate 3 commercial gender classification systems using our dataset and show that darker-skinned females are the most misclassified group (with error rates of up to 34.7%). The maximum error rate for lighter-skinned males is 0.8%. The substantial disparities in the accuracy of classifying darker females, lighter females, darker males, and lighter males in gender classification systems require urgent attention if commercial companies are to build genuinely fair, transparent and accountable facial analysis algorithms.

How I'm fighting bias in algorithms (TED Talk) by Joy Buolamwini

Abstract: MIT grad student Joy Buolamwini was working with facial analysis software when she noticed a problem: the software didn't detect her face -- because the people who coded the algorithm hadn't taught it to identify a broad range of skin tones and facial structures. Now she's on a mission to fight bias in machine learning, a phenomenon she calls the "coded gaze." It's an eye-opening talk about the need for accountability in coding ... as algorithms take over more and more aspects of our lives.

Discussed April 2018, topic suggested by Adam Farquhar.

Feminist Research Practices and Digital Archives, Michelle Moravec

Abstract: In this article I reflect on the process of conducting historical research in digital archives from a feminist perspective. After reviewing issues that arose in conjunction with the British Library’s digitisation of the feminist magazine Spare Rib in 2013, I offer three questions researchers should consider before consulting materials in a digital archive. Have the individuals whose work appears in these materials consented to this? Whose labour was used and how is it acknowledged? What absences must be attended to among an abundance of materials? Finally, I suggest that researchers should draw on the existing body of scholarship about these issues by librarians and archivists.

Discussed October 2017.

Pedagogies of Race: Digital Humanities in the Age of Ferguson by Amy E. Earhart, Toniesha L. Taylor.

From their introduction: 'we are also invested in the development of a practice-based digital humanities that attends to the crucial issues of race, class, gender, and sexuality in the undergraduate classroom and beyond. Our White Violence, Black Resistance project merges foundational digital humanities approaches with issues of social justice by engaging students and the community in digitizing and interpreting historical moments of racial conflict. The project exemplifies an activist model of grassroots recovery that brings to light timely historical documents at the same time that it exposes power differentials in our own institutional settings and reveals the continued racial violence spanning 1868 Millican, Texas, to 2014 Ferguson, Missouri.'

Discussed August 2017.

Recovering Women’s History with Network Analysis: A Case Study of the Fabian News, Jana Smith Elford

Abstract: Literary study in the digital humanities is not exempt from reproducing historical hierarchies by focusing on major or canonical figures who have already been recognized as important historical or literary figures. However, network analysis of periodical publications may offer an alternative to the biases of human memory, where one has the tendency to pay attention to a recognizable name, rather than one that has had no historical significance. It thus enables researchers to see connections and a wealth of data that has been obscured by traditional recovery methodologies. Machine reading with network analysis can therefore contribute to an alternate understanding of women’s history, one that reinterprets cultural and literary histories that tend to reconstruct gender-based biases. This paper uses network analysis to explore the Fabian News, a late nineteenth-century periodical newsletter produced by the socialist Fabian Society, to recover women activists committed to social and political equality.

Discussed July 2017.

Do Artifacts Have Politics? by Langdon Winner

From the introduction: At issue is the claim that the machines, structures, and systems of modern material culture can be accurately judged not only for their contributions of efficiency and productivity, not merely for their positive and negative environmental side effects, but also for the ways in which they can embody specific forms of power and authority.

Discussed April 2017. A classic text from 1980 that describes how seemingly simple design factors can contribute to structural inequalities.

Critical Questions for Big Data by Danah Boyd & Kate Crawford

Abstract: Diverse groups argue about the potential benefits and costs of analyzing genetic sequences, social media interactions, health records, phone logs, government records, and other digital traces left by people. Significant questions emerge. Will large-scale search data help us create better tools, services, and public goods? Or will it usher in a new wave of privacy incursions and invasive marketing? Will data analytics help us understand online communities and political movements? Or will it be used to track protesters and suppress speech? Will it transform how we study human communication and culture, or narrow the palette of research options and alter what ‘research’ means? Given the rise of Big Data as a socio-technical phenomenon, we argue that it is necessary to critically interrogate its assumptions and biases. In this article, we offer six provocations to spark conversations about the issues of Big Data: a cultural, technological, and scholarly phenomenon that rests on the interplay of technology, analysis, and mythology that provokes extensive utopian and dystopian rhetoric.

Discussed August 2016, suggested by Aquiles Alencar Brayner.

 

This blog post is by Mia Ridge, Digital Curator for Western Heritage Collections and Co-Investigator for Living with Machines. She's on twitter at @mia_out.

06 July 2020

Archivists, Stop Wasting Your Ref-ing Time!

“I didn’t get where I am today by manually creating individual catalogue references for thousands of archival records!”

One of the most laborious yet necessary tasks of an archivist is the generation of catalogue references. This was once the bane of my life. But I now have a technological solution, which anyone can download and use for free.

Animated image showing Reference Generator being abbreviated to ReG

Meet ReG: the newest team member of the Endangered Archives Programme (EAP). He’s not as entertaining as Reginald D Hunter. She’s not as lyrical as Regina Spektor. But like 1970s sitcom character Reggie Perrin, ReG provides a logical solution to the daily grind of office life - though less extreme and hopefully more successful.

 

Two pictures of musicians, Reginald Hunter and Regina Spektor
Reginald D Hunter (left),  [Image originally posted by Pete Ashton at https://flickr.com/photos/51035602859@N01/187673692]; Regina Spektor (right), [Image originally posted by Beny Shlevich at https://www.flickr.com/photos/17088109@N00/417238523]

 

Reggie Perrin’s boss CJ was famed for his “I didn’t get where I am today” catchphrase, and as EAP’s resident GJ, I decided to employ my own ReG, without whom I wouldn’t be where I am today. Rather than writing this blog, my eyes would be drowning in metadata, my mind gathering dust, and my ears fleeing from the sound of colleagues and collaborators banging on my door, demanding to know why I’m so far behind in my work.

 

Image of two men at their offices from British sitcom The Rise and Fall of Reginald Perrin
CJ (left) [http://www.leonardrossiter.com/reginaldperrin/12044.jpg] and Reginald Perrin (right) [https://www.imdb.com/title/tt0073990/mediaviewer/rm1649999872] from The Rise and Fall of Reginald Perrin.

 

The problem

EAP metadata is created in spreadsheets by digitisation teams all over the world. It is then processed by the EAP team in London and ingested into the British Library’s cataloguing system.

When I joined EAP in 2018 one of the first projects to process was the Barbados Mercury and Bridgetown Gazette. It took days to create all of the catalogue references for this large newspaper collection, which spans more than 60 years.

Microsoft Excel’s fill down feature helped automate part of this task, but repeating this for thousands of rows is time-consuming and error-prone.

Animated image displaying the autofill procedure being carried out

I needed to find a solution to this.

During 2019 I established new workflows to semi-automate several aspects of the cataloguing process using OpenRefine - but OpenRefine is primarily a data cleaning tool, and its difficulty in understanding hierarchical relationships meant that it was not suitable for this task.

 

Learning to code

For some time I toyed with the idea of learning to write computer code using the Python programming language. I dabbled with free online tutorials. But it was tough to make practical sense of these generic tutorials, hard to find time, and my motivation dwindled.

When the British Library teamed up with The National Archives and Birkbeck, University of London to launch a PG Cert in Computing for Information Professionals, I jumped at the chance to take part in the trial run.

It was a leap certainly worth taking because I now have the skills to write code for the purpose of transforming and analysing large volumes of data. And the first product of this new skillset is a computer program that accurately generates catalogue references for thousands of rows of data in mere seconds.

 

The solution - ReG in action

By coincidence, one of the first projects I needed to catalogue after creating this program was another Caribbean newspaper digitised by the same team at the Barbados Archives Department: The Barbadian.

This collection was a similar size and structure to the Barbados Mercury, but the generation of all the catalogue references took just a few seconds. All I needed to do was:

  • Open ReG
  • Enter the project ID for the collection (reference prefix)
  • Enter the filename of the spreadsheet containing the metadata

Animated image showing ReG working to file references

And Bingo! All my references were generated in a new file.

Before and After image explaining 'In just a few seconds, the following transformation took place in the 'Reference' column' showing the new reference names

 

How it works in a nutshell

The basic principle of the program is that it reads a single column in the dataset, which contains the hierarchical information. In the example above, it read the “Level” column.

It then uses this information to calculate the structured numbering of the catalogue references, which it populates in the “Reference” column.

 

Reference format

The generated references conform to the following format:

  • Each reference begins with a prefix that is common to the whole dataset. This is the prefix that the user enters at the start of the program. In the example above, that is “EAP1251”.
  • Forward slashes ( / ) are used to indicate a new hierarchical level.
  • Each record is assigned its own number relative to its sibling records, and that number is shared with all of the children of that record.

 

In the example above, the reference for the first collection is formatted:

Image showing how the reference works: 'EAP1251/1' is the first series

The reference for the first series of the first collection is formatted:

Image showing how the reference works: 'EAP1251/1/1' is the first series of the first collection

The reference for the second series of the first collection is:

Image showing how the reference works: 'EAP1251/1/2' is the second series of the first collection

No matter how complex the hierarchical structure of the dataset, the program will quickly and accurately generate references for every record in accordance with this format.
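The numbering logic described above can be sketched in Python. To be clear, this is a hypothetical re-implementation of the scheme as described in this post, not ReG's actual code; the hierarchy, prefix, and sample levels are assumptions for illustration:

```python
# Hypothetical sketch of the reference-numbering scheme described above --
# not ReG's actual code. The hierarchy and prefix are configurable in ReG.
HIERARCHY = ["Collection", "Series", "File"]

def generate_references(prefix, levels):
    counters = [0] * len(HIERARCHY)
    refs = []
    for level in levels:
        depth = HIERARCHY.index(level)            # position in the hierarchy
        counters[depth] += 1                      # next sibling number
        for d in range(depth + 1, len(counters)):
            counters[d] = 0                       # reset deeper levels
        refs.append("/".join([prefix] + [str(c) for c in counters[: depth + 1]]))
    return refs

levels = ["Collection", "Series", "File", "File", "Series", "File"]
print(generate_references("EAP1251", levels))
# -> ['EAP1251/1', 'EAP1251/1/1', 'EAP1251/1/1/1', 'EAP1251/1/1/2',
#     'EAP1251/1/2', 'EAP1251/1/2/1']
```

Note how each record's number is carried down to its children, and how starting a new series resets the file counter, exactly as in the worked examples above.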

 

Download for wider re-use

While ReG was designed primarily for use by EAP, it should work for anyone who generates reference numbers using the same format.

For users of the Calm cataloguing software, ReG could be used to complete the “RefNo” column, which determines the tree structure of a collection when a spreadsheet is ingested into Calm.

With wider re-use in mind, some settings can be configured to suit individual requirements.

For example, you can configure the names of the columns that ReG reads and generates references in. For EAP, the reference generation column is named “Reference”, but for Calm users, it could be configured as “RefNo”.

Users can also configure their own hierarchy. You have complete freedom to set both the hierarchical terms applicable to your institution and the hierarchical order of those terms.

It is possible that some minor EAP idiosyncrasies might preclude reuse of this program for some users. If this is the case, by all means get in touch; perhaps I can tweak the code to make it more applicable to users beyond EAP - though some tweaks may be more feasible than others.

 

Additional validation features

While generating references is ReG's core function, it also includes several validation features to help you spot and correct problems with your data.

Unexpected item in the hierarchy area

For catalogue references to be calculated, all the data in the level column must match a term within the configured hierarchy. The program therefore checks this; if a discrepancy is found, the user is notified and given two options to proceed.
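The check itself can be sketched roughly as follows (again hypothetical code, not ReG's implementation; the configured hierarchy is an assumption):

```python
# Sketch of the validation check described above: flag any term in the
# "Level" column that is not part of the configured hierarchy.
HIERARCHY = ["Collection", "Series", "File"]

def unexpected_terms(levels):
    """Return the distinct level terms that are not in the configured hierarchy."""
    return sorted({lvl for lvl in levels if lvl not in HIERARCHY})

print(unexpected_terms(["Collection", "Series", "Files", "File"]))
# -> ['Files']
```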

Option 1: Rename unexpected terms

First, users have the option to rename any unexpected terms. This is useful for correcting typographical errors, such as this example - where “Files” should be “File”.

Animated image showing option 1: renaming unexpected 'files' to 'file'

Before and after image showing the change of 'files' to 'file'

Option 2: Build a one-off hierarchy

Alternatively, users can create a one-off hierarchy that matches the terms in the dataset. In the following example, the unexpected hierarchical term “Specimen” is a bona fide term. It is just not part of the configured hierarchy.

Rather than force the user to quit the program and amend the configuration file, they can simply establish a new, one-off hierarchy within the program.

Animated image showing option 2: adding 'specimen' to the hierarchy under 'file'

This hierarchy will not be saved for future instances. It is just used for this one-off occasion. If the user wants “Specimen” to be recognised in the future, the configuration file will also need to be updated.

 

Single child records

To avoid redundant information, it is sometimes advisable for an archivist to eliminate single child records from a collection. ReG will identify any such records, notify the user, and give them three options to proceed:

  1. Delete single child records
  2. Delete the parents of single child records
  3. Keep the single child records and/or their parents

Depending on how the user chooses to proceed, ReG will produce one of three results, which affects the rows that remain and the structure of the generated references.

In this example, the third series in the original dataset contains a single child - a single file.

Image showing the three possible outcomes to a single child record: A. delete child so it appears just as a series, B. delete parent so it appears just as a file, and C. keep the child record and their parents so it appears as a series followed by a single file

The most notable result is option B, where the parent was deleted. Looking at the “Level” column, the single child now appears to be a sibling of the files from the second series. But the reference number indicates that this file is part of a different branch within the tree structure.

This is more clearly illustrated by the following tree diagrams.

Image showing a tree hierarchy of the three possible outcomes for a single child record: A. a childless series, B. a file at the same level as other series, C. a series with a single child file

This functionality means that ReG will help you spot any single child records that you may otherwise have been unaware of.

But it also gives you a means of creating an appropriate hierarchical structure when cataloguing in a spreadsheet. If you intentionally insert dummy parents for single child records, ReG can generate references that map the appropriate tree structure and then remove the dummy parent records in one seamless process.
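Spotting single child records from a flat "Level" column can be sketched like this (a hypothetical illustration of the idea, not ReG's code; each term's position in the configured hierarchy is used as its depth):

```python
# Hypothetical sketch: find records with exactly one direct child in a
# flat "Level" column, using hierarchy position as depth.
HIERARCHY = ["Collection", "Series", "File"]

def single_child_parents(levels):
    """Return row indices of records that have exactly one direct child."""
    depths = [HIERARCHY.index(lvl) for lvl in levels]
    child_counts = {}
    parents = []          # stack of open row indices, one per hierarchical level
    for i, depth in enumerate(depths):
        del parents[depth:]               # close deeper (and sibling) records
        if parents:
            child_counts[parents[-1]] = child_counts.get(parents[-1], 0) + 1
        parents.append(i)
    return [i for i, n in child_counts.items() if n == 1]

levels = ["Collection", "Series", "File", "File", "Series", "File"]
print(single_child_parents(levels))   # row 4 (the second "Series") has one child
```

Once such parents are identified, deleting the child, deleting the parent, or keeping both is a matter of filtering rows before the references are generated.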

 

And finally ...

If you’ve got this far, you probably recognise the problem and have at least a passing interest in finding a solution. If so, please feel free to download the software, give it a go, and get in touch.

If you spot any problems, or have any suggested enhancements, I would welcome your input. You certainly won’t be wasting my time - and you might just save some of yours.

 

Download links

For making this possible, I am particularly thankful to Jody Butterworth, Sam van Schaik, Nora McGregor, Stelios Sotiriadis, and Peter Wood.

This blog post is by Dr Graham Jevon, Endangered Archives Programme cataloguer. He is on twitter as @GJHistory.

15 June 2020

Marginal Voices in UK Digital Comics

I am an AHRC Collaborative Doctoral Partnership student based at the British Library and Central Saint Martins, University of the Arts London (UAL). The studentship is funded by the Arts and Humanities Research Council’s Collaborative Doctoral Partnership Programme.

Supervised jointly by Stella Wisdom from the British Library, Roger Sabin and Ian Hague from UAL, my research explores the potential for digital comics to take advantage of digital technologies and the digital environment to foster inclusivity and diversity. I aim to examine the status of marginal voices within UK digital comics, while addressing the opportunities and challenges these comics present for the British Library’s collection and preservation policies.

A cartoon strip of three vertical panel images, in the first a caravan is on the edge of a cliff, in the second a dog asleep in a bed, in the third the dog wakes up and sits up in bed
The opening panels from G Bear and Jammo by Jaime Huxtable, showing their caravan on The Gower Peninsula in South Wales, copyright © Jaime Huxtable

Digital comics have been identified as complex digital publications, meaning this research project is connected to the work of the broader Emerging Formats Project. On top of embracing technological change, digital comics have the potential to reflect, embrace and contribute to social and cultural change in the UK. Digital comics present new ways not only of telling stories, but also of determining whose stories are told.

One of the comic creators, whose work I have been recently examining is Jaime Huxtable, a Welsh cartoonist/illustrator based in Worthing, West Sussex. He has worked on a variety of digital comics projects, from webcomics to interactive comics, and also runs various comics related workshops.

Samir's Christmas by Jaime Huxtable, this promotional comic strip was created for Freedom From Torture’s 2019 Christmas Care Box Appeal. This comic was  made into a short animated video by Hands Up, copyright © Jaime Huxtable

My thesis will explore whether the ways UK digital comics are published and consumed mean that they can foreground marginal, alternative voices in a similar way to underground comix and zine culture. Comics scholarship has focused on the technological aspects of digital comics, meaning their potentially significant contribution to reflecting and embracing social and cultural change in the UK has not been explored. I want to establish whether the fact that digital comics can circumvent traditional gatekeepers means they provide space to foreground marginal voices. I will also explore the challenges and opportunities digital comics might present for legal deposit collection development policy.

As well as being a member of the Comics Research Hub (CoRH) at UAL, I have already begun working with colleagues from the UK Web Archive, and hope to be able to make a significant contribution to the Web Comic Archive. Issues around collection development and management are central to my research, so I feel very fortunate to be based at the British Library and to have the chance to learn from and hopefully contribute to practice here.

If anyone would like to know more about my research, or recommend any digital comics for me to look at, please do contact me at Tom.Gebhart@bl.uk or @thmsgbhrt on Twitter. UK digital comic creators and publishers can use the ComicHaus app to send their digital comics directly to The British Library digital archive. More details about this process are here.

This post is by British Library collaborative doctoral student Thomas Gebhart (@thmsgbhrt).

12 June 2020

Making Watermarks Visible: A Collaborative Project between Conservation and Imaging

Some of the earliest documents being digitised by the British Library Qatar Foundation Partnership are a series of ships’ journals dating from 1605-1705, relating to the East India Company’s voyages. Whilst working with these documents, conservators Heather Murphy and Camille Dekeyser-Thuet noticed within the papers a series of interesting examples of early watermark design. Curious about the potential information these could give regarding the journals, Camille and Heather began undertaking research, hoping to learn more about the date and provenance of the papers, trade and production patterns involved in the paper industry of the time, and the practice of watermarking paper. There is a wealth of valuable and interesting information to be gained from the study of watermarks, especially within a project such as the BLQFP, which provides the opportunity for study within both IOR and Arabic manuscript material. We hope to publish more information relating to this online with the Qatar Digital Library in the form of Expert articles and visual content.

The first step within this project involved tracing the watermark designs with the help of a light sheet in order to begin gathering a collection of images to form the basis of further research. It was clear that in order to make the best possible use of the visual information contained within these watermarks, they would need to be imaged in a way which would make them available to audiences in both a visually appealing and academically beneficial form, beyond the capabilities of simply hand tracing the designs.

Hand tracings of the watermark designs

This began a collaboration with two members of the BLQFP imaging team, Senior Imaging Technician Jordi Clopes-Masjuan and Senior Imaging Support Technician Matt Lee, who, together with Heather and Camille, were able to devise and facilitate a method of imaging and subsequent editing which enabled new access to the designs. The next step involved the construction of a bespoke support made from Vivak (commonly used for exhibition mounts and stands). This inert plastic is both pliable and transparent, which allowed the simultaneous backlighting and support of the journal pages required to successfully capture the watermarks.

Creation of the Vivak support
Imaging of pages using backlighting
Studio setup for capturing the watermarks

Before capturing, Jordi suggested we create two comparison images of the watermarks. This involved capturing the watermarks as they normally appear on the digitised image (almost or completely invisible), and how they appear illuminated when the page is backlit. The theory behind this was quite simple: “to obtain two consecutive images from the same folio, in the exact same position, but using a specific light set-up for each image”.

The idea was for the first image to appear the same as the standard, searchable images on the QDL portal. To create these standard captures, the studio lights were placed near the camera, with incident light directed towards the document.

The second image was taken immediately afterwards, but this time only backlight was used (light behind the document). Between the two lighting techniques, the first image allowed us to see the content of the document, while the second revealed the texture and character of the paper, including conservation marks, possible corrections to the writing, and the watermarks.

One unexpected complication during imaging was that, due to the varying texture and thickness of the papers, the power of the backlight had to be re-adjusted for each watermark.

First image taken under normal lighting conditions
Second image of the same page taken using backlighting

https://www.qdl.qa/en/archive/81055/vdc_100000001273.0x000342

Before settling on this approach, other imaging techniques were also investigated:

  • Multispectral photography: by capturing the same folio under different lights (from UV to IR) the watermarks, along with other types of hidden content such as faded ink, would appear. However, it was decided that this process would take too long for the number of watermarks we were aiming to capture.
  • Light sheet: although light sheets are extremely slim and slightly flexible, we experienced issues when attempting the double capture. On many occasions the sheet was not flexible enough and moved the page when pushed towards the gutter (for a successful final presentation of the images, it was essential that the folio remained still across both captures).

Once we had successfully captured the images, Photoshop proved vital in allowing us to increase the contrast of the watermarks and make them more visible. Because every capture was different, each image required a different editing approach, with varying adjustments of levels, curves, saturation and brightness, combined with different fusion modes to attain the best result. In the end, the tools used were less important than the final image. The last stage in Photoshop was to crop and export both images of the same folio with exactly the same settings, allowing the comparative images to match as precisely as possible.
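The kind of contrast enhancement described above can be illustrated, very loosely, in code. The team worked interactively with Photoshop's Levels and Curves tools rather than with scripts, but the core idea of a linear levels adjustment, stretching a narrow band of faint watermark tones across the full tonal range, can be sketched on a handful of grayscale values:

```python
# Toy illustration (not the team's actual workflow): a linear "levels"
# stretch remaps pixel values so the darkest pixel becomes 0 and the
# brightest becomes 255, pulling faint watermark tones further apart.

def stretch_levels(pixels):
    """Linearly rescale grayscale values (0-255) to the full range."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:                      # flat image: nothing to stretch
        return list(pixels)
    scale = 255 / (hi - lo)
    return [round((p - lo) * scale) for p in pixels]

# A faint watermark might occupy only a narrow band of mid-tones:
faint = [118, 120, 125, 122, 130, 127]
print(stretch_levels(faint))  # values now span the full 0-255 range
```

A curves adjustment generalises this from a straight line to an arbitrary tone-mapping function, which is why different images needed different treatments.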

The next step involved creating a digital line drawing of each watermark. Matt imported the high-resolution captures onto an iPad and used the Procreate drawing app to trace the watermarks with a stylus. To develop an approach that gave accurate and consistent results, he first tested brushes and experimented with line qualities and thicknesses. Selecting the Dry Ink brush, he traced the light outlines of each watermark on a separate transparent layer. The tracings were initially drawn in white to highlight the designs on the paper, and were later inverted to create black line drawings that could be edited and refined.

Tracing the watermarks directly from the screen of an iPad provided a level of accuracy and efficiency that would be difficult to achieve on a computer with a graphics tablet, trackpad or computer mouse. There were several challenges in tracing the watermarks from the image captures. For example, the technique employed by Jordi was very effective in highlighting the watermarks, but it also made the laid and chain lines in the paper more prominent and these would merge or overlap with the light outline of the design.

Some of the watermarks also appeared distorted or incomplete, or had handwritten text obscuring the details of the design. It was important that the tracings remained accurate, so some gaps had to be left. However, through the drawing process the eye began to pick out more detail, and the most exciting moment was when a vague outline of a horse revealed itself to be a unicorn with inset lettering.

Vector image of unicorn watermark

In total 78 drawings of varying complexity and design were made for this project. To preserve the transparent backgrounds of the drawings, they were exported first as PNG files. These were then imported into Adobe Illustrator and converted to vector drawings that can be viewed at a larger size without loss of image quality.

Vector image of watermark featuring heraldic designs

Once the drawings were complete, we had three images: the ‘traditional view’ (the page as it would normally appear), the ‘translucid view’ (the same page backlit, showing the watermark) and the ‘translucid + white view’ (the translucid view with the digitally traced watermark overlaid in place on the page).

Traditional view
Translucid view
Translucid view with watermark highlighted by digital tracing

Jordi then took these images and, using a multiple slider tool, displayed them on an offline website. This allowed us to demonstrate the tool to our team and to present the watermarks in the way we had wished from the beginning, letting people both study and appreciate the designs.

Watermarks Project Animated GIF

This is a guest post by Heather Murphy, Conservator, Jordi Clopes-Masjuan, Senior Imaging Technician and Matt Lee, Senior Imaging Support Technician from the British Library Qatar Foundation Partnership. You can follow the British Library Qatar Foundation Partnership on Twitter at @BLQatar.

10 June 2020

International Conference on Interactive Digital Storytelling 2020: Call for Papers, Posters and Interactive Creative Works

It has been heartening to see many joyful responses to our recent post featuring The British Library Simulator: an explorable, miniature, virtual version of the British Library’s building in St Pancras.

If you would like to learn more about our Emerging Formats research, which is informing our work in collecting examples of complex digital publications, including works made with Bitsy, then my colleague Giulia Carla Rossi (who built the Bitsy Library) is giving a Leeds Libraries Tech Talk on Digital Literature and Interactive Storytelling this Thursday, 11th June at 12 noon, via Zoom.

Giulia will be joined by Leeds Libraries Central Collections Manager Rhian Isaac, who will showcase some of Leeds Libraries’ exciting collections, and by Izzy Bartley, Digital Learning Officer at Leeds Museums and Galleries, who will talk about her role in making collections interactive and accessible. Places are free, but please book here.

If you are a researcher, or a writer/artist/maker of experimental interactive digital stories, then you may want to check out the current call for submissions for The International Conference on Interactive Digital Storytelling (ICIDS), organised by the Association for Research in Digital Interactive Narratives, a community of academics and practitioners concerned with the advancement of all forms of interactive narrative. The deadline for proposing Research Papers, Exhibition Submissions, Posters and Demos has been extended to 26th June 2020; submissions can be made via the ICIDS 2020 EasyChair Site.

The ICIDS 2020 dates, 3-6 November, on a photograph of Bournemouth beach

ICIDS showcases and shares research and practice in game narrative and interactive storytelling, including the theoretical, technological, and applied design practices. It is an interdisciplinary gathering that combines computational narratology, narrative systems, storytelling technology, humanities-inspired theoretical inquiry, empirical research and artistic expression.

For 2020, the special theme is Interactive Digital Narrative Scholarship, and ICIDS will be hosted by the Department of Creative Technology of Bournemouth University (also hosts of the New Media Writing Prize, which I have blogged about previously). Their current intention is to host a mixed virtual and physical conference. They are hoping that the physical meeting will still take place, but all talks and works will also be made available virtually for those who are unable to attend physically due to the COVID-19 situation. This means that if you submit work, you will still need to register and present your ideas, but for those who are unable to travel to Bournemouth, the conference organisers will be making allowances for participants to contribute virtually.

ICIDS also includes a creative exhibition showcasing interactive digital artworks, which for 2020 will explore the curatorial theme “Texts of Discomfort”. The exhibition call is currently seeking interactive digital artworks that generate discomfort through their form and/or their content, and which may also inspire radical changes in the way we perceive the world.

Creatives are encouraged to mix technologies, narratives and points of view to create interactive digital artworks that unsettle interactors’ assumptions by tackling the world’s global issues, and/or artworks that throw interactors’ relationship with language into crisis, innovating in the way they intertwine narrative and technology. Artworks can include, but are not limited to:

  • Augmented, mixed and virtual reality works
  • Computer games
  • Interactive installations
  • Mobile and location-based works
  • Screen-based computational works
  • Web-based works
  • Webdocs and interactive films
  • Transmedia works

Submissions to the ICIDS art exhibition should be made using this form by 26th June. Any questions should be sent to icids2020arts@gmail.com. Good luck!

This post is by Digital Curator Stella Wisdom (@miss_wisdom).

29 May 2020

IIIF Week 2020

As a founding member of the International Image Interoperability Framework (IIIF) Consortium, here at the British Library we are looking forward to the upcoming IIIF Week, a programme of free online events taking place from 1 to 5 June.

IIIF Week sessions will discuss digital strategy for cultural heritage, introduce IIIF’s capabilities and community through introductory presentations and demonstrations of use cases, and explore the future of IIIF and digital research needs more broadly.

IIIF logo with text saying International Image Interoperability Framework

Converting the IIIF annual conference into a virtual event held over Zoom provides an opportunity to bring together a wider group of the IIIF community, enabling many people, myself included, to attend who would otherwise not have been able to join the in-person event in Boston due to budget, travel restrictions, or other obligations.

Both IIIF newbies and experienced implementers will find events scheduled at convenient times, to allow attendees to form regional community connections in their parts of the world. Attendees can sign up for all events during the week, or just the ones that interest them. Proceedings will be in English unless otherwise indicated, and all sessions will be recorded, then made available following the conference on the IIIF YouTube channel.

To those who know me, it will come as no surprise that I’m especially looking forward to the Fun with IIIF session on Friday 5 June, 4-5pm BST, facilitated by Tristan Roddis from Cogapp. Most uses of IIIF have focused on scholarly and research applications. This session, however, will look at the opposite extreme: the state of the art in playful and fun applications of the IIIF APIs, from tile puzzles to arcade games, via terapixel fractals, virtual galleries, 3D environments, and the Getty's really cool Nintendo Animal Crossing integration.

In addition to the IIIF Week programme, for anyone wanting more in-depth, practical hands-on teaching there is a free workshop on getting started with IIIF in the week following the online conference. This pilot course will run over five days, from 8 to 12 June; participation is limited to 25 places, available on a first come, first served basis. It will cover:

  • Getting started with the Image API
  • Creating IIIF Manifests with the Bodleian manifest editor
  • Annotating IIIF resources and setting up an annotation server
  • Introduction to various IIIF tools and techniques for scholarship

Tutors will assist participants to create an IIIF project and demonstrate it on a Zoom call at the end of the week.
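For readers curious what “getting started with the Image API” involves, the heart of it is just a URL pattern: every image request spells out a region, size, rotation, quality and format. The sketch below is mine, not part of the course materials, and the server base URL and identifier are hypothetical; it simply shows how such a request URL is assembled:

```python
# Minimal sketch of the IIIF Image API URL syntax:
#   {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
# The base URL and identifier below are made up for illustration.

def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API 3.0-style request URL."""
    return f"{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{fmt}"

# The whole image at full size:
print(iiif_image_url("https://example.org/iiif", "ms-folio-42r"))

# A 500px-wide crop of a region (x,y,w,h in pixels) - the sort of
# request that could pull a single watermark out of a larger scan:
print(iiif_image_url("https://example.org/iiif", "ms-folio-42r",
                     region="512,1024,800,600", size="500,"))
```

Because the whole request lives in the URL, any IIIF-compliant server can answer it, which is what makes viewers, annotation tools and manifests interoperable across institutions.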

You can view and sign up for IIIF Week events at https://iiif.io/event/2020/iiifweek/. All attendees are expected to adhere to the IIIF Code of Conduct and encouraged to join the IIIF-Week Slack channel for ongoing questions, comments, and discussion (you’ll need to join the IIIF Slack first, which is open to anyone).

To follow and participate in more open discussion on Twitter, use the hashtags #IIIF and #IIIFWeek, and if you have any specific questions about the event, please get in touch with the IIIF staff at events@iiif.io.

See you there :-)

This post is by Digital Curator Stella Wisdom (@miss_wisdom).

21 May 2020

The British Library Simulator

The British Library Simulator is a mini game built using the Bitsy game engine, where you can wander around a pixelated (and much smaller) version of the British Library building in St Pancras. Bitsy is known for its compact format and limited colour palette: you can often recognise your avatar and the items you can interact with by the fact that they use a different colour from the background.

The British Library building depicted in Bitsy
The British Library Simulator Bitsy game

Use the arrow keys on your keyboard (or the WASD buttons) to move around the rooms and interact with other characters and objects you meet on the way - you might discover something new about the building and the digital projects the Library is working on!

Bitsy works best in the Chrome browser. If you’re playing on your smartphone, use a sliding movement to move your avatar and tap on the text box to progress through the dialogue.

Most importantly: have fun!

The British Library, together with the other five UK Legal Deposit Libraries, has been collecting examples of complex digital publications, including works made with Bitsy, as part of the Emerging Formats Project. This collection area is continuously expanding as we include new examples of digital media and interactive storytelling. The formats and tools used to create these publications are varied, and allow for innovative and often immersive experiences that could only be delivered via a digital medium. You can read more about freely available tools to write interactive fiction here.

This post is by Giulia Carla Rossi, Curator of Digital Publications (@giugimonogatari).