THE BRITISH LIBRARY

Digital scholarship blog

130 posts categorized "Experiments"

19 June 2019

The Shape of Contemporary British Interactive Fiction


When I started this Innovation Placement, I had no idea what I was doing. Six months on, and the main thing I’ve learned is that I know even less than I thought I did. Which is not to say that I haven’t learned a lot, just that archiving interactive narrative is an even more complex and varied task than I had imagined, as are the works of interactive fiction themselves.

One of my key goals was to explore how to preserve interactive works for future researchers. My first task was finding suitable works – they had to be web-based (no downloadable files), be recognisable as interactive narratives in some way and be identifiably created in the UK. Sites such as IFDB (the Interactive Fiction Database) and Sub-Q (the only commercial IF-focussed magazine) and competitions such as Spring Thing and IFComp were invaluable sources, but determining whether the authors were UK-based was more difficult. Some remained entirely anonymous, or gave no indication as to their location on their website or social media, which meant it was not possible to include them in this particular project.

Once found, capturing the works initially didn’t appear to be too much of a challenge. The UK Web Archive’s crawlers were able to get most hypertexts while Webrecorder made it possible to collect most other works. However, playback was where the difficulties crept in. Some works captured well, but wouldn’t play back. Or played back, but with errors. Or showed that actually, the works had still been pulling information from the live web, and when placed in the archive and severed from this outside contact, no longer worked. You can see the Webrecorder collection here, and the UKWA Collection here, although the latter is a work-in-progress. A full list of all works reviewed (some of which were not collectable for various reasons) can be found here.

If you’re a maker of interactive works, I strongly suggest that you submit your work to UKWA or make a copy on Webrecorder and download the WARC (Web ARChive) files it creates (or both), because it will likely be some time before libraries develop systematic collecting policies for these works due to the many challenges associated with collecting and sharing them. Having your work backed up in WARC format may help you stay ahead of the curve!

My other key goal was to get a sense of the ‘shape’ of contemporary British web-based interactive fiction. If I had to draw it, I’d probably do something like this:

[Image: Shapes1]

Or maybe even like this:

[Image: Shapes2]

It’s messy and disruptive and gloriously so. But that’s not to say there aren’t some common threads running through the work. Some themes and motifs cropped up many times in many different guises.  Trains, cats, mental health and interactive fiction itself were all addressed by multiple creators, some taking on several of these topics at once in one work. Librarians and archivists were surprisingly well-represented as creators of interactive works, with a piece by the British Library’s own Andy Jackson included in the collection, and creators based at various other UK libraries also contributing works.

Naturally, I wrote some more formal reports on the types of works being created, the tools being used, and the methods used to collect them. However, I felt that the only way to truly summarise the experience of reading and playing and attempting to collect all these amazing works was to create a piece of interactive fiction that mimics the experience of reading and playing and attempting to capture all these amazing works. The result was The Memory Archivist which hopefully goes some way towards conveying the challenges faced by archivists of complex digital works, but also why tackling those challenges is important. I hope you enjoy it.

This post is by the Library's Innovation Fellow for Interactive Fiction Lynda Clark, on twitter as @Notagoth. You can find out more about the Library's Emerging Formats project here.

10 June 2019

Collaborative Digital Scholarship in Action: A Case Study in Designing Impactful Student Learning Partnerships


The Arts and Sciences (BASc) department at University College London has been at the forefront of pioneering a renascence of liberal arts and sciences degrees in the UK. As part of its Core modules offering, students select an interdisciplinary elective in Year 2 of their academic programme – from a range of modules specially designed for the department by University College London academics and researchers.

When creating my own module – Information Through the Ages (BASC0033) – as part of this elective set, I was keen to ensure that the student learning experience was both supported and developed in tandem with professional practices and standards. Enabling students to carry the skills developed on the module beyond its own assignments would aid them in their own academic degree programmes and provide future employers with substantial evidence of their employability and skills base. Partnering with the British Library to design a data science and data curation project as part of the module’s core assignments therefore seemed an excellent opportunity, offering students both a research-based educative framework and a valuable chance to engage in a real-world collaboration. Providing students with external industry partners can give an important fillip to their motivation and to the learning experience overall, as they see their assessed work move beyond the confines of the academy to have an impact in the wider world.

Through discussions with my British Library co-collaborators, Mahendra Mahey and Stella Wisdom, we alighted on the Microsoft Books/BL 19th Century collection dataset as providing excellent potential for student groups to work with for their data curation projects. With its 60,000 public domain volumes, associated metadata and 1 million+ extracted images, it presented as exciting, undiscovered territory across which our student groups might roam and rove, with the results of their work having the potential to benefit future British Library researchers.

We felt that structuring the group project around wrangling a subset of this data (discovering, researching, cleaning and refining it, with each group producing a curated version of the original dataset) presented a number of significant benefits. Students were able to explore and develop skills such as data curation, software knowledge, archival research, report writing, project development and collaborative working practices, alongside a real-world digital scholarship learning experience, with the outcomes in turn supporting the British Library’s Digital Scholarship remit of enabling innovative research based on the British Library’s digital collections.

Students observed that “working with the data did give me more practical insight to the field of work involved with digitisation work, and it was an enriching experience”, including how they “appreciated how involved and hands-on the projects were, as this is something that I particularly enjoy”. Data curation training was provided on site at the British Library, with the session focused on the use of OpenRefine, “a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.”[1] Student feedback also told us that we could have provided further software training, and more guided dataset exploration/navigation resources, with groups keen to learn more nuanced data curation techniques – something we will aim to respond to in future iterations of the module – but overall, as one student succinctly noted, “I had no idea of the digitalization process and I learned a lot about data science. The training was very useful and I acquired new skills about data cleaning.”
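To give a flavour of the kind of cleaning covered in the training, here is a minimal Python sketch of grouping variant spellings of the same value. It is loosely modelled on the key-collision ("fingerprint") style of clustering that OpenRefine documents, but it is not OpenRefine's own code, and the function names and publisher strings are invented for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalised key so that near-identical strings collide."""
    # Strip accents, lowercase, and drop punctuation.
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", "", text.lower())
    # Sort and de-duplicate the tokens so word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical publisher strings, invented for illustration only.
publishers = ["Macmillan & Co.", "macmillan & co", "Co. & Macmillan", "John Murray"]
print(cluster(publishers))
```

A curator would still review each suggested group by hand before merging, since a shared key is only a hint that two values mean the same thing.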

Overall, we had five student groups wrangling the BL 19th Century collection, producing final data subsets in the following areas: Christian and Christian-related texts; Queens of Britain 1510-1946; female authors, 1800-1900 (here's a heatmap this student group produced of the spread of published titles by female authors in the 19th century); Shakespearean works, other authors’ adaptations of those works, and any commentary on Shakespeare or his writing; and travel-related books.

In particular, it was excellent to see students fully engaging with the research process around their chosen data subset – exploring its cultural and institutional contexts, as well as navigating metadata/data schemas, requirements and standards.

For example, the Christian texts group considered the issue of different languages as part of their data subset of texts, following this up with textual content analysis to enable accurate record querying and selection. In their project report they noted that “[u]sing our dataset and visualisations as aids, we hope that researchers studying the Bible and Christianity can discover insights into the geographical and temporal spread of Christian-related texts. Furthermore, we hope that they can also glean new information regarding the people behind the translations of Bibles as well as those who wrote about Christianity.”

Similarly, the student group focused on travel-related texts discussed in their team project summary that “[t]he particular value of this curated dataset is that future researchers may be able to use it in the analysis of international points of view. In these works, many cities and nations are being written about from an outside perspective. This perspective is one that can be valuable in understanding historical relations and frames of reference between groups around the world: for instance, the work “Travels in France and Italy, in 1817 and 1818”, published in New York, likely provides an American perspective of Europe, while “Four Months in Persia, and a Visit to Trans-Caspia”, published in London, might detail an extended visit of a European in Persia, both revealing unique perspectives about different groups of people. A comparable work, that may have utilized or benefitted from such a collection, is Hahner’s (1998) “Women Through Women’s Eyes: Latin American Women in Nineteenth Century Travel Accounts.” In it, Hahner explores nineteenth century literature written to unearth the perspectives on Latin American women, specifically noting that the primarily European author’s writings should be understood in the context of their Eurocentric view, entrenched in “patriarchy” and “colonialism” (Hahner, 1998:21). Authors and researchers with a similar intent may use [our] curated British Library dataset comparably – that is, to locate such works.”

Data visualisation by travel books group

Over the ten weeks of the module, alongside their group data curation projects, students covered lecture topics as varied as Is a Star a Document?, "Truthiness" and Truth in a Post-Truth World, Organising Information: Classification, Taxonomies and Beyond!, and Information & Power; worked on an individual archival GIF project which drew on an institutional archival collection to create (and publish on social media) an animated GIF; and spent time in classroom discussions considering questions such as What happens when information is used for dis-informing or mis-informing purposes?; How do the technologies available to us in the 21st century potentially impact on the (data) collection process and its outputs and outcomes?; How might ideas about collections and collecting be transformed in a digital context?; What exactly do we mean by the concepts of Data and Information?; How we choose to classify or group something first requires we have a series of "rules" or instructions which determine the grouping process – but who decides on what the rules are and how might such decisions in fact influence our very understandings of the information the system is supposedly designed to facilitate access to? These dialogues were all situated within the context of both "traditional" collections systems and atypical sites of information storage and collection, with the module aiming to enable students to gain an in-depth knowledge, understanding and critical appreciation of the concept of information, from historical antecedents to digital scientific and cultural heritage forms, in the context of libraries, archives, galleries and museums (including alternative, atypical and emergent sources), and how technological, social, cultural and other changes fundamentally affect our concept of “information.”

“I think this module was particularly helpful in making me look at things in an interdisciplinary light”, one student observed in module evaluation feedback, with others going on to note that “I think the different formats of work we had to do was engaging and made the coursework much more interesting than just papers or just a project … the collaboration with the British Library deeply enriched the experience by providing a direct and visible outlet for any energies expended on the module. It made the material seem more applicable and the coursework more enjoyable … I loved that this module offered different ways of assessment. Having papers, projects, presentations, and creative multimedia work made this course engaging.”

Situating the module’s assessments within such contexts, I hope, encouraged students to understand the critical, interdisciplinary focus of the field of information studies, in particular the use of information in the context of empire-making and consolidation, and how histories of information, knowledge and power intersect. Combined with a collaborative, interdisciplinary curriculum design approach, which encouraged and supported students to gain technical abilities and navigate teamwork practices, we hope this module can point some useful ways forward in creating and developing engaging learning experiences that have real-world impact.

This blog post is by Sara Wingate-Gray (UCL Senior Teaching Fellow & BASC0033 module leader), Mahendra Mahey (BL Labs Manager) and Stella Wisdom (BL Digital Curator for Contemporary British Collections).

16 April 2019

BL Labs 2018 Commercial Award Winner: 'The Library Collection'


This guest blog post is by the team led by fashion designer, Nabil Nayal - winner of the BL Labs Commercial Award for 2018 - for his Spring/Summer 2019 collection, presented at the 2018 London Fashion Week.

Nabil Nayal's SS19 Collection: fashion shoot at the British Library

The Nabil Nayal SS19 collection (The Library Collection) made history by becoming the first fashion show, on the official London Fashion Week schedule, to be hosted at the iconic British Library. The British Library’s digital archives deeply informed the collection. The Tilbury Speech, delivered by Queen Elizabeth I ahead of the attempted invasion of England by the Spanish Armada in 1588, was central to the use of print, as were other manuscripts, digitised images, maps and hymn sheets from the era. The collection encapsulates Nabil’s obsession with Elizabethan craftsmanship, whilst symbolising the power and strength of a woman who succeeded in bringing England into its Golden Age.

Nabil undertook historical research in the British Library for his PhD on Elizabethan dress, so the opportunity to collaborate with the Library in order to emphasise the importance of research in fashion education and practice was something he felt passionately about doing. Paying particular attention to the Library’s Elizabethan and Medieval Manuscripts archives, Nabil conducted his research with guidance from expert curators and with support from the Reading Room staff. Using keyword search terms and date limitations to search through the digitised archives was particularly useful for finding historically accurate documents to incorporate into the collection.

Nabil's design takes inspiration from the British Library's digitised 1588 manuscript of Queen Elizabeth I's 'Tilbury Speech'  © Nabil Nayal 2018

Elizabethan silhouettes were modernised in this collection by printing these manuscripts onto Nabil’s designs, including a three-metre-long cloak featuring the Tilbury Speech. A UK-based supplier, Silk Bureau, digitally printed the archival material on to a range of fine silks and cottons, which were then used to make garments within the collection. Nabil’s love of the classic white shirt was further explored too, offering a puritan backdrop that ‘whitewashes’ the complex hand-cut embellishments made of bonded poplins and marcella.

The designs in the SS19 collection have been sold to prestigious international stores such as Dover Street Market and Joyce and the collection will be launching exclusively in Selfridges this May (2019). The presentation also generated a huge response in key press and social media, including coverage in Vogue.

Nabil's Elizabethan-inspired designs at the BL Fashion Shoot © Nabil Nayal 2018

Nabil’s interest in promoting historical research within fashion was not limited to this collection. Currently, the brand is working with Collette Taylor of Vega Associates to continue to raise awareness of the potential of the Library’s collections to inspire the next generation of fashion researchers. Nabil held a Research Masterclass at the British Library in November 2018 to work with emerging designers as part of a fashion research competition to develop a capsule collection inspired by the Library’s collections.

This collaboration between Nabil Nayal and the British Library highlights the importance of design education and research for the future-proofing and continued success of UK creative industries, which is a pressing issue. Since 2010, there has been a 34% drop in GCSE entries across the arts, despite the fact that the UK fashion industry supports over 880,000 jobs and delivered a direct contribution of £28 billion to the UK economy in 2015. The wealth of free resources at the British Library provides ample opportunity for design students to explore how education and research can enrich their creativity and allow them to succeed within the fashion industry.

Nabil’s work has received praise from the late Karl Lagerfeld and celebrities such as Rihanna, Lorde and Florence Welch. His SS19 collection epitomises the way that the use of archival research within fashion can generate commercial success, suggesting that the ever-changing fashion industry can benefit from becoming more historically informed and that modernity can be evoked through an interest in the past.

Watch Jennifer Davies receiving the Commercial award on behalf of Nabil's team, and talking about the collection on our YouTube channel (clip runs from 7.26): 

You can read other blogs about Nabil Nayal at London Fashion Week and the fashion show at the British Library, and if you're feeling inspired, use the British Library's online Fashion resources.

Find out more about Digital Scholarship and BL Labs. If you have a project which uses British Library digital content in innovative and interesting ways, consider applying for an award this year! The 2019 BL Labs Symposium will take place on Monday 11 November at the British Library.

29 March 2019

Staying Late at the Library ... to Algorave


Blog article by Algorave audio-visual artist Coral Manton. Coral is curating this British Library Lates Algorave in collaboration with British Library Events, BL Labs, Digital Scholarship and The Alan Turing Institute.

On 5 April British Library Lates will host an Algorave in the atrium. Algorave artists will live-code music and visuals, writing code sequences generating algorithmic beats beneath the iconic King’s Library Tower.

Alex McLean AKA Yaxu

The scene grew out of a reaction to ‘black-boxing’ in electronic music - where the audience is unable to interface with the ‘live-ness’ of what the performer is making. Nothing is hidden at an Algorave. In an Algorave you can see what the performer is doing through code projected onto walls in real time. The creative process is open and shared with the audience. Code is shared freely. Performers share their screens with the crowd, taking them on a journey through making - unmaking - remaking, thought processes laid bare in lines of improvised code weaving its way through practised shaping of sound.

Coral Manton AKA Coral

As a female coder, becoming part of the Algorave community has led me to reflect on the power of seeing women coding live, and how this encourages greater participation from women. Algorave attempts to maintain a positive gender balance. More than this, there is the joy of seeing women confidently and openly experimenting with code, sharing their practice, making mistakes, revelling in uncertainty and error, crashing-restarting-crashing again to cheers from the supportive crowd willing the performances to continue, sharing the anarchic joy of failure in a community where failure leads to new possibilities.

ALGOBABEZ AKA Shelly Knotts and Joanne Armitage

Algorave is a fun word - an algorithmic rave - a scene where people come together to create and dance to music generated by code. Technically Algorave is described as “sounds wholly or partly characterised by the emission of a succession of repetitive conditionals”. The performers write lines of code that create cyclic patterns of music, layered to create an evolving composition. The same is applied to the visuals: live coded audio reactive patterns, showing shapes bouncing, revolving, repeating to the beat of the music. All of this creates a shared club experience like no other.
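For readers who have never seen live-coded music, the idea of layering cyclic patterns can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the syntax of any live-coding tool; the layer names, the patterns and the `layer` function are all invented for this example.

```python
from itertools import cycle, islice

# Each pattern is a short loop: "x" marks a hit and "." a rest.
# The names and patterns are invented for illustration.
patterns = {
    "kick": "x...",
    "snare": "..x.",
    "hat": "x.x",  # a three-step loop drifts against the four-step ones
}


def layer(patterns, steps):
    """Return, for each step, the names of the layers that sound on it."""
    loops = {name: list(islice(cycle(p), steps)) for name, p in patterns.items()}
    return [[name for name in patterns if loops[name][i] == "x"] for i in range(steps)]


for step, sounding in enumerate(layer(patterns, 12)):
    print(step, " ".join(sounding) or "-")
```

Because the loops have different lengths, the combined pattern only repeats every twelve steps, which hints at how a few short cycles can add up to an evolving composition.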

Visual artist Antonio Roberts AKA hellocatfood: “I like to do Algorave because I think it turns an otherwise perfect black box computer into a live performance instrument. Playing at an Algorave forces me to abandon what I know and respond to everything happening around me. It shows me that even something as meticulously designed as a computer is a living tool that is subject to randomness and mistakes.”

Antonio Roberts AKA hellocatfood

Algorave is an open, non-hierarchical global community, with its hub in Sheffield. There have been Algoraves in over 50 cities around the world. Algorave is not a franchise; it is a free culture. Anyone can put on an Algorave, provided their approach aligns with the ethos of the community. Algorave collapses hierarchies - headliners are generally frowned upon. Diversity is key to the Algorave community. Algorave is open to everyone and actively promotes diversity in line-ups and audiences. The community is active both online and at live events organised by community members. The software people use is created within the community and open-source. There is little barrier to participation. If you are interested in Algorave come along, speak to the performers, join the online community, download some software (e.g. IXI Lang, Pure Data, Max/MSP, SuperCollider, Extempore, Fluxus, TidalCycles, Gibber, Sonic Pi, FoxDot and Cyril) and get coding.

If this sounds like your scene or you want to know more, please join us at the Algorave Late Event. Tickets available here: https://www.bl.uk/events/late-at-the-library-algorave

Also check out https://algorave.com & https://toplap.org

26 March 2019

BL Labs Staff Award Runners Up: 'The Digital Documents Harvester'


This guest blog is by Jennie Grimshaw on behalf of her team who were the BL Labs Staff Award runners up for 2018.


The UK Legal Deposit Web Archive (LDWA) contains terabytes of data harvested from the UK web domain. It has a public search interface at https://webarchive.org.uk/, but finding individual documents in what is in effect a vast unstructured dataset is challenging. The analogy of looking for a needle in a haystack comes to mind as being entirely appropriate.

The Digital Documents Harvesting and Processing Tool (DDHAPT) was designed to overcome the problem of finding individual known documents in the LDWA. It is an adaptation of the web archiving software that enables selectors to set up regular in-depth crawls of document-heavy target websites. The system then extracts new PDFs published since its previous visit from the target websites and presents them to the selector in a list with the most recent at the top:

[Image: DDH image 1]

The selector can then view an image of the document on the screen by clicking on the title. If the document is in scope, basic metadata is created by completing an on-screen form. If the document doesn’t make the grade for the creation of an individual record, it can be removed from the list of new documents for selection by clicking on the green Ignore button on the right of the screen.
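The selection list described above (new PDFs since the previous visit, most recent first) can be illustrated with a short Python sketch. This is not the DDHAPT's actual code: the `HarvestedDocument` class, its fields, the `new_documents` function and the sample data are all hypothetical, and the real tool's data model is not described here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class HarvestedDocument:
    # Hypothetical fields; the post does not describe the tool's data model.
    url: str
    title: str
    first_seen: date


def new_documents(harvested, last_visit):
    """Keep PDFs first seen after the previous crawl, most recent first."""
    fresh = [
        doc for doc in harvested
        if doc.url.lower().endswith(".pdf") and doc.first_seen > last_visit
    ]
    return sorted(fresh, key=lambda doc: doc.first_seen, reverse=True)


# Invented sample data using a placeholder domain.
harvested = [
    HarvestedDocument("https://example.org/reports/a.pdf", "Report A", date(2019, 1, 10)),
    HarvestedDocument("https://example.org/reports/b.pdf", "Report B", date(2019, 3, 2)),
    HarvestedDocument("https://example.org/news.html", "News page", date(2019, 3, 5)),
    HarvestedDocument("https://example.org/reports/c.pdf", "Report C", date(2019, 2, 20)),
]

for doc in new_documents(harvested, last_visit=date(2019, 2, 1)):
    print(doc.first_seen, doc.title)
```

Filtering on the file extension is a simplification made for the sketch; a real crawler would more likely rely on the content type reported by the server.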

The metadata we create records the title and subtitle, publication year and publisher, edition, series, personal and corporate authors and ISBN (if present). Some fields such as title, publication year and publisher are automatically populated.  A broad subject heading is assigned from a pick list. Our aim is to create a “good enough” record that can stand without upgrading by the digital cataloguers, avoiding double handling.
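As a rough illustration, the record described above could be modelled along the following lines. This is a sketch under stated assumptions rather than the tool's real schema: the class name, field names and types are guesses based on the list in the previous paragraph, and the subject pick list is entirely invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical pick list; the post does not give the real subject headings.
SUBJECT_HEADINGS = {"Health", "Education", "Transport"}


@dataclass
class DocumentRecord:
    # Fields the post says can be populated automatically.
    title: str
    publication_year: int
    publisher: str
    # Broad subject heading chosen by the selector from the pick list.
    subject: str
    # Optional fields completed by the selector where present.
    subtitle: Optional[str] = None
    edition: Optional[str] = None
    series: Optional[str] = None
    personal_authors: List[str] = field(default_factory=list)
    corporate_authors: List[str] = field(default_factory=list)
    isbn: Optional[str] = None

    def __post_init__(self):
        # Reject headings that are not on the pick list.
        if self.subject not in SUBJECT_HEADINGS:
            raise ValueError(f"unknown subject heading: {self.subject!r}")


# Invented example record.
record = DocumentRecord(
    title="Annual Report",
    publication_year=2019,
    publisher="Department for Examples",
    subject="Transport",
    corporate_authors=["Department for Examples"],
)
print(record.title, record.publication_year, record.subject)
```

Validating the subject at creation time mirrors the idea of a record that is "good enough" at the point of selection, so that it does not need to be handled a second time.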


To save time and avoid transcription errors, the system allows the selector to highlight information in the document such as personal author, publisher, series title or ISBN. Releasing the mouse button then calls up a list of fields. Clicking on the appropriate field automatically transfers the data into it.


Once the metadata has been created, the selector clicks on a submit button which starts the process of loading it into the British Library catalogue and the catalogues of the other five legal deposit libraries – the national libraries of Scotland and Wales, the Universities of Oxford and Cambridge, and Trinity College Dublin. The document remains in the Legal Deposit Web Archive. Its URL in the web archive is recorded in the metadata and creates the link between the document and its catalogue record. Readers who find the record in the British Library’s public catalogue or those of any of the legal deposit libraries can then click on the “I want this” button and view the document on screen.

The DDHAPT is currently being used to monitor the publications of Westminster government departments and help us ensure that future generations of researchers can reliably access known official documents via the catalogues of the six legal deposit libraries. However, we intend to extend its use to cover the output of other non-commercial publishers such as campaigning charities, think tanks, academic research centres, and pressure groups as a way of making their archived publications easily discoverable.

Normally material collected under the non-print legal deposit regulations can by law only be viewed on the premises of one of the six legal deposit libraries. However, the Libraries have negotiated licences with the UK government and many other non-commercial online publishers that allow us to make their archived websites and the documents on them open and available remotely. These licences lift non-print legal deposit restrictions and allow us to make the documents covered by them available 24/7 from anywhere in the world.

In these ways the DDHAPT improves the discoverability of non-commercially published documents collected under non-print legal deposit, facilitates metadata creation through auto-population of some fields, and avoids double handling through creation of good quality metadata at the point of selection.

Watch the Digital Documents Harvester team receiving their award and talking about their project on our YouTube channel (clip runs from 8.15 to 14.45):


19 March 2019

BL Labs 2018 Commercial Award Runner Up: 'The Seder Oneg Shabbos Bentsher'


This guest blog was written by David Zvi Kalman on behalf of the team that received the runner up award in the 2018 BL Labs Commercial category.


The bentsher is a strange book, both invisible and highly visible. It is not among the more well known Jewish books, like the prayerbook, Hebrew Bible, or haggadah. You would be hard pressed to find a general-interest bookstore selling a copy. Still, enter the house of a traditional Jew and you’d likely find at least a few, possibly a few dozen. In Orthodox communities, the bentsher is arguably the most visible book of all.

Bentshers are handbooks containing the songs and blessings, including the Grace after Meals, that are most useful for Sabbath and holiday meals, as well as larger gatherings. They are, as a rule, quite small. These days, bentshers are commonly given out as party favors at Jewish weddings and bar/bat mitzvahs, since meals at those events require them anyway. Many bentshers today have personalized covers relating the events at which they were given.

Bentshers have never gone out of print. By this I mean that printing began with the invention of the printing press and has never stopped. They are small, but they have always been useful. Seder Oneg Shabbos, the version which I designed, was released 500 years after the first bentsher was published. It is, in a sense, a Half Millennium Anniversary Special Edition.


Bentshers, like other Jewish books, could be quite ornate; some were written and illustrated by hand. Over the years, however, bentshers have become less and less interesting, largely in order to lower the unit cost. In order to make it feasible for wedding planners to order hundreds at a time, all images were stripped from the books, the books themselves became very small, and any interest in elegant typography was quickly eliminated. My grandfather, who designed custom covers for wedding bentshers, simply called the book, “the insert.” Custom prayerbooks were no different from custom matchbooks.

This particular bentsher was created with the goal of bucking this trend; I attempted to give the book the feel of some of the Jewish books and manuscripts of the past, using the research I was able to gather as a graduate student in the field of Jewish history. Doing this required a great deal of image research; for this, the British Library’s online resources were incredibly valuable. Of the more than one hundred images in the book, a plurality are from the British Library’s collections.

https://data.bl.uk/hebrewmanuscripts/

https://www.bl.uk/hebrew-manuscripts


In addition to its visual element, this bentsher differs from others in two important ways. First, it contains ritual language that is inclusive of those in the LGBTQ community, and especially of those conducting same-sex weddings. In addition, the book contains songs not just in Hebrew, but in Yiddish, as well; this was a homage to two Yiddishists who aided in creating the bentsher’s content. The bentsher was first used at their wedding.


More here: https://shabb.es/sederonegshabbos/

Watch David accepting the runner up award and talking about the Seder Oneg Shabbos Bentsher on our YouTube channel (clip runs from 5.33 to 7.26): 

David Zvi Kalman was responsible for the book’s design, including the choice of images. He is a doctoral candidate at the University of Pennsylvania, where he focuses on the relationship between Jewish history and the history of technology. Sarah Wolf is a specialist in rabbinics and is an assistant professor at the Jewish Theological Seminary of America. Joshua Schwartz is a doctoral student at New York University, where he studies Jewish mysticism. Sarah and Joshua were responsible for most of the book’s translations and transliterations. Yocheved and Yudis Retig are Yiddishists and were responsible for the book’s Yiddish content and translations.

Find out more about Digital Scholarship and BL Labs. If you have a project which uses British Library digital content in innovative and interesting ways, consider applying for an award this year! The 2019 BL Labs Symposium will take place on Monday 11 November at the British Library.

28 February 2019

The World Wide Lab: Building Library Labs - Part II

We're setting sail for Denmark! Along with colleagues from the UK, Austria, Belgium, Egypt, Finland, Germany, Ireland, Latvia, Luxembourg, the Netherlands, Qatar, Spain, Sweden and the USA, we will be mooring at Copenhagen's Black Diamond, waterfront home to Denmark's Royal Library, for the second International Building Library Labs event: 4-5 March 2019.

For some time now, leading national, state, university and public libraries around the world have been creating 'digital lab type environments'. The purpose of these 'laboratories' is to afford access to their institutions' digital content - the digitised and 'born digital' collections as well as data - and to provide a space where users can experiment and work with that content in creative, innovative and inspiring ways. Our shared ethos is to open up our collections for everyone: digital researchers, artists, entrepreneurs, educators, and everyone in between.

BL Labs has been running in such a capacity for six years. In September 2018, we hosted a 2-day workshop at the British Library in London for invited participants from national, state and university libraries - the first event of its kind in the world. It was a resounding success, and it was decided that we should organise a second event, this time in collaboration with our colleagues in Copenhagen.

Next week's participants, from over 30 institutions, will be sharing lessons learned, talking about innovative projects and services that have used their digital collections and data in clever ways, and continuing to establish the foundations for an international network of Library Labs. We aim to work together in the spirit of collaboration so that we can continue to build even better Library Labs for our users in the future.

Our packed programme is available to view on Eventbrite or as a Google Doc. We still have a few spaces left, so if you are interested in coming along, you can still book here. As well as presentations and plenary debates, we will have eight lightning talks with topics ranging from how to handle big data to how to run a data visualisation lab. To accommodate our many delegates, with their own interests and specialisms, we will break out into 12 parallel discussion groups focusing on subjects such as how to set up a lab; how to get access to data; moving from 'project' lab to 'business as usual'; data curation; how to deal with large datasets; and using Labs & Makerspaces for data-driven research and innovation in creative industries.

We will blog again after the event, and provide links to some of the presentations and outputs. Watch this space! 

Danish-themed images trawled from our British Library Flickr Images set: pages 37, 126, and 15 of Copenhagen, the Capital of Denmark, published by the Danish Tourist Society, 1898. Find the original book here.

Posted by Eleanor Cooper on behalf of BL Labs

26 February 2019

Competition to automate text recognition for printed Bangla books

You may have seen the exciting news last week that the British Library has launched a competition on recognition of historical Arabic scientific manuscripts that will run as part of ICDAR2019. We thought it only fair to cover printed material too! So we’re running another competition, also at ICDAR, for automated text recognition of rare and unique printed books written in Bangla that have been digitised through the Library's Two Centuries of Indian Print project.

Some of you may remember the Bangla printed books competition at ICDAR2017, which generated significant interest among academic institutions and technology providers both in India and across the world. The 2017 competition set the challenge of finding an optimal solution for automating recognition of Bangla printed text and resulted in Google’s method performing best for both text detection and layout analysis.

Fast forward to 2019 and, thanks to Jadavpur University in Kolkata, we have added more ground truth transcriptions for competition entrants to train their OCR systems with. We hope that the competition encourages submissions again from cutting-edge OCR methods leading to a solution that can truly open up these historic books, dating between 1713 and 1914, for text mining, enabling scholars of South Asian studies to explore hundreds of thousands of pages on a scale that has not been possible until now.

Image showing a transcribed page from one of the Bengali books featured in the ICDAR2019 competition

As with the Arabic competition, we are collaborating with PRImA (Pattern Recognition & Image Analysis Research Lab) who will provide expert and objective evaluation of OCR results produced through the competition. The final results will be revealed at the ICDAR2019 conference in Sydney in September.

So if you missed out last time but are interested in testing your OCR systems on our books, the competition is now open! For instructions on how to apply and more about the competition, please visit https://www.primaresearch.org/REID2019/

This post is by Tom Derrick, Digital Curator for Two Centuries of Indian Print, British Library. He is on Twitter as @TommyID83, and Two Centuries of Indian Print tweets from @BL_IndianPrint.