THE BRITISH LIBRARY

Digital scholarship blog

10 posts categorized "Rare books"

08 May 2018

The Italian Academies database – now available in XML


Dr Mia Ridge writes: in 2017, we made XML and image files from a four-year, AHRC-funded project, The Italian Academies 1525-1700, available through the Library's open data portal. The original data structure was quite complex, so we would be curious to hear feedback from anyone reusing the converted form for research or visualisations.

In this post, Dr Lisa Sampson, Reader in Early Modern Italian Studies at UCL, and Dr Jane Everson, Emeritus Professor of Italian literature, RHUL, provide further information about the project...

New research opportunities for students of Renaissance and Baroque culture! The Italian Academies database is now available for download. It's in a format called XML which represents the original structure of the database.
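For readers opening the download for the first time, a quick survey of which elements nest inside which can help make sense of a complex structure. The sketch below is only an illustration: the element and attribute names in `SAMPLE` are invented, and they are not the schema of the published files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: the element and attribute names below are invented
# for illustration and are NOT the schema of the published files.
SAMPLE = """<database>
  <academy id="a1">
    <name>Accademia degli Intronati</name>
    <city>Siena</city>
    <member ref="p1"/>
  </academy>
  <person id="p1">
    <name>Example Member</name>
  </person>
</database>"""


def survey_paths(root):
    """Count how often each element path occurs, to reveal the nesting."""
    counts = Counter()

    def walk(element, prefix):
        path = f"{prefix}/{element.tag}"
        counts[path] += 1
        for child in element:
            walk(child, path)

    walk(root, "")
    return counts


path_counts = survey_paths(ET.fromstring(SAMPLE))
for path, count in sorted(path_counts.items()):
    print(f"{count:>3}  {path}")
```

To try the same survey on a downloaded file, replace `ET.fromstring(SAMPLE)` with `ET.parse(filename).getroot()`, where `filename` is the path to whichever XML file you saved locally.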

This dedicated database results from an eight-year project, funded by the Arts and Humanities Research Council UK, and provides a wealth of information on the Italian learned academies. Around 800 such institutions flourished across the peninsula over the sixteenth and seventeenth centuries, making major contributions to the cultural and scientific debates and innovations of the period, as well as forming intellectual networks across Europe. This database lists a total of 587 Academies from Venice, Padua, Ferrara, Bologna, Siena, Rome, Naples, and towns and cities in southern Italy and Sicily active in the period 1525-1700. Also listed are more than 7,000 members of one or more academies (including major figures like Galileo, as well as women and artists), and almost 1,000 printed works connected with academies held in the British Library. The database therefore provides an essential starting point for research into early modern culture in Italy and beyond. It is also an invitation to further scholarship and data collection, as these totals constitute only a fraction of the data relating to the Academies.

Laura Terracina, nicknamed Febea, of the Accademia degli Incogniti, Naples

The database is designed to permit searches from many different perspectives and to allow easy searching across categories. In addition to the three principal fields – Academies, People, Books – searches can be conducted by title keyword, printer, illustrator, dedicatee, censor, language, gender and nationality, among others. The database also lists and illustrates the mottoes and emblems of the Academies (where known) and those of individual academy members. Illustrations from the books entered in the database include frontispieces, colophons, and images from within texts.

Emblem of the Accademia degli Intronati, Siena


The database thus aims to promote research on the Italian Academies in disciplines ranging from literature and history, through art, science, astronomy, mathematics, printing and publishing, censorship, politics, religion and philosophy.

The Italian Academies project, which created this database, began in 2006 as a collaboration between the British Library and Royal Holloway University of London, funded by the Arts and Humanities Research Council and led by Jane Everson. The objective was the creation of a dedicated resource on the publications and membership of the Italian learned Academies active in the period between 1525 and 1700. The software for the database was designed in-house by the British Library, and the first tranche of data was completed in 2009, listing information for academies in four cities (Naples, Siena, Bologna and Padua). A second phase, listing information for many more cities, including in southern Italy and Sicily, developed the database further between 2010 and 2014, with a major research grant from the AHRC and collaboration with the University of Reading.

The exciting possibilities now opened up by the British Library’s digital data strategy look set to stimulate new research and collaborations by making the records even more widely available, and easily downloadable, in line with Open Access goals. The Italian Academies team is now working to develop the project further with the addition of new data, and the incorporation into a hub of similar resources.

The Italian Academies project team members welcome feedback on the records and on the adoption of the database for new research (contact: www.italianacademies.org).

The original database remains accessible at http://www.bl.uk/catalogues/ItalianAcademies/Default.aspx 

An Introduction to the database, its aims, contents and objectives is available both at this site and at the new digital data site: https://data.bl.uk/iad/

Jane E. Everson, Royal Holloway University of London

Lisa Sampson, University College London

22 March 2017

British Library Launches OCR Competition for Rare Indian Books


Calling all transcription enthusiasts! We’ve launched a competition to find an accurate and automated transcription solution for our rare Indian books and printed catalogue records, currently being digitised through the Two Centuries of Indian Print project. 

The competition, in partnership with the University of Salford’s PRIMA Research Lab, is part of the International Conference on Document Analysis and Recognition, taking place in Kyoto, Japan this November. The winners will be announced at a special event during the conference.

Digitised images of the books will be made openly available through the library’s website and we hope this competition will produce transcriptions that enable full text search and discovery of this rich material. Sharing XML transcriptions will also give researchers the foundation to apply computational tools and methods such as text mining that may lead to new insights into book and publishing history in India.   

The competition is split into two challenges, and participants can enter either or both.

The first challenge is to find an automated transcription for the 19th century printed books written in Bengali script. Optical Character Recognition of many non-Latin scripts is a developing area, but still presents a considerable barrier for libraries and other cultural institutions hoping to open up their material for scholarly research.


Above: A page from 'Animal Biography', one of the Bengali books being digitised as part of Two Centuries of Indian Print (VT 1712)

 

Challenge number two involves our printed catalogue records, known as ‘Quarterly Lists’. These describe books published in India between 1867 and 1967. The lists are arranged in tables, so accurately representing the layout of the data is important if researchers are to be able to use computational methods to identify chunks of information such as the place of publication and cost of the book.


 Above: A typical double page from the Quarterly Lists (SV 412/8)

 

With the competition now open, we’ve already gone some way to helping participants by manually transcribing a few pages to create ‘ground truth’ using PRIMA's editing tool, Aletheia.  You can watch a video introducing the competition. So if you or anyone you know would like to enter, do please register and you could be contributing to this landmark project, and picking up an award for your troubles!   
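One common way to compare an automated transcription against ground truth is the character error rate: the number of single-character edits needed to turn the OCR output into the reference, divided by the length of the reference. The sketch below is an illustration of that general measure, not a description of how the competition entries are scored, and the sample strings are invented.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """Edits needed per reference character (0.0 means a perfect match)."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Invented example: the OCR output mistakes the letter "l" for the digit "1".
ground_truth = "Animal Biography"
ocr_output = "Anima1 Biography"
print(character_error_rate(ground_truth, ocr_output))  # 0.0625
```

Python compares strings code point by code point, so for Bengali and other scripts that combine several code points into one visible character, a measure based on grapheme clusters might be more appropriate.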

09 March 2017

Archaeologies of reading: guest post from Matthew Symonds, Centre for Editing Lives and Letters


Digital Curator Mia Ridge: today we have a guest post by Matthew Symonds from the Centre for Editing Lives and Letters on the Archaeologies of reading project, based on a talk he did for our internal '21st century curatorship' seminar series. Over to Matt...

Some people get really itchy about the idea of making notes in books, and dare not defile the pristine printed page. Others leave their books a riot of exclamation marks, sarcastic incredulity and highlighter pen.

Historians – even historians disciplined by spending years in the BL’s Rare Books and Manuscripts rooms – would much prefer it if people did mark books, preferably in sentences like “I, Famous Historical Personage, have read this book and think the following having read it…”. It makes it that much easier to investigate how people engaged with the ideas and information they read.

Brilliantly for us historians, rare books collections are filled with this sort of material. The problem is it’s also difficult to catalogue and make discoverable (nota bene – it’s hard because no institutions could afford to employ and train sufficient cataloguers, not because librarians don’t realise this is an issue).

The Archaeology of Reading in Early Modern Europe (AOR) takes digital images of books owned and annotated by two Renaissance readers, the professional reader Gabriel Harvey and the extraordinary polymath John Dee. It transcribes and translates all the comments in the margins, marks up every trace of a reader’s intervention in the printed book, and puts the whole thing on the Internet in a way designed to be useful and accessible to researchers and the general public alike.

Screenshot, The Archaeology of Reading in Early Modern Europe

AOR is a digital humanities collaboration between the Centre for Editing Lives and Letters (CELL) at University College London, Johns Hopkins University and Princeton University, and is generously funded by the Andrew W. Mellon Foundation.

More importantly, it’s also a collaboration between academic researchers, librarians and software engineers. An absolutely vital consideration in how we planned AOR, how we work on it and how we’re planning to expand it was to identify a project that could offer common ground to these three interests, where each party would have something to gain.

As one of the researchers, it was really important to me to avoid forming some sort of “client-provider” relationship with the librarians who curate and know so much about my sources, and the software engineers who build the digital infrastructure.

But we do use an academic problem as a means of giving our project a focus. In 1990, Antony Grafton and the late Lisa Jardine published their seminal article ‘“Studied for Action”: how Gabriel Harvey read his Livy’ in the journal Past & Present.

One major insight of the article is that people read books in conjunction with one another, often for specific, pragmatic purposes. People didn’t pick up a book from their shelves, open at page one and proceed through to the finis, marking up as they went. They put other books next to them, books that explained, clarified, argued with one another.

By studying the marginalia, it’s possible to reconstruct these pathways across a library, recreating the strategies people used to manage the vast quantities of information they had at their disposal.

In order to produce this archaeology of reading, we’ve built a “digital bookwheel”, an attempt to recreate the revolving reading desk of the Renaissance period, which allowed the lucky owner to move back and forth between their books. From here, the user can call up the books we’ve digitised, read the transcriptions, and search for particular words and concepts.

Screenshot, The Archaeology of Reading in Early Modern Europe


It’s built out of open source materials, leveraging the International Image Interoperability Framework (IIIF) and the IIIF-compliant Mirador 2 Viewer. Interested parties can download the XML files of our transcriptions, as well as the data produced in the process.
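For readers unfamiliar with IIIF, its Image API addresses an image, or any region of it, through a URL made of fixed path segments: identifier, region, size, rotation, then quality and format. The sketch below assembles such a URL. The server address and identifier are placeholders rather than AOR's real endpoints, and the values a given server accepts depend on which version of the API it supports.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble a IIIF Image API request URL from its path segments."""
    # The identifier is percent-encoded so that any slashes inside it
    # are not mistaken for path separators.
    encoded_id = quote(identifier, safe="")
    return (f"{base_url.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Placeholder server and identifier, for illustration only.
url = iiif_image_url(
    "https://iiif.example.org/images",
    "book-1/page-7",
    region="0,0,1200,800",  # x, y, width, height in pixels
    size="600,",            # scale the region to 600 pixels wide
)
print(url)
```

Requesting a region rather than the whole page is what lets a viewer zoom in on a single marginal note without downloading the full-resolution image.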

The exciting thing for us is that all the work on creating this digital infrastructure – which is very much a work in progress – has provided us with the raw materials for asking new research questions, questions that can only be asked by getting away from our computers and returning to the rare books room.

24 January 2017

Publication of Quarterly Lists: Catalogues of Indian Books


The Two Centuries of Indian Print project is pleased to announce the online availability of some wonderful catalogues held by the library, generally known as the Quarterly Lists. They record books published quarterly and by province of British India between 1867 and 1947.

Digitised for the first time, the Quarterly Lists can now be accessed as searchable PDFs via the British Library's datasets portal, data.bl.uk. Researchers will be able to examine rich bibliographic data about books published throughout India, including the names and addresses of printers and publishers, publication price and how many copies were sold.

 

SV_412_8_1875-78_0003

 

Our next steps will be to OCR the Quarterly Lists to create ALTO XML for every page, which is designed to show accurate representations of the content layout. This will allow researchers to apply computational tools and methods to look across all of the lists to answer their questions about book history. So if a researcher is interested in what the history of book publishing reveals about a particular time period and place, we would like to make that possible by giving them full access to this dataset.
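To give a sense of what ALTO XML offers, each recognised word is stored as a `String` element with its text in `CONTENT` and its position on the page in attributes such as `HPOS` and `VPOS`. The sketch below uses those positions to sort words into table columns. The sample page, its contents and the column positions are all invented, and real Quarterly List layouts are likely to need something more robust than nearest-column matching.

```python
import xml.etree.ElementTree as ET

# Invented sample: a real ALTO file carries much more detail, and the
# namespace version depends on the OCR software that produced it.
ALTO_NS = {"alto": "http://www.loc.gov/standards/alto/ns-v3#"}
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace><TextBlock>
    <TextLine>
      <String CONTENT="Calcutta" HPOS="100" VPOS="50" WIDTH="80" HEIGHT="12"/>
      <String CONTENT="1875" HPOS="400" VPOS="50" WIDTH="40" HEIGHT="12"/>
    </TextLine>
    <TextLine>
      <String CONTENT="Bombay" HPOS="102" VPOS="70" WIDTH="70" HEIGHT="12"/>
      <String CONTENT="1876" HPOS="398" VPOS="70" WIDTH="40" HEIGHT="12"/>
    </TextLine>
  </TextBlock></PrintSpace></Page></Layout>
</alto>"""


def rows_by_column(root, column_starts):
    """Assign each word to the nearest column start, line by line."""
    rows = []
    for line in root.iterfind(".//alto:TextLine", ALTO_NS):
        cells = [[] for _ in column_starts]
        for word in line.iterfind("alto:String", ALTO_NS):
            hpos = float(word.get("HPOS"))
            nearest = min(range(len(column_starts)),
                          key=lambda i: abs(column_starts[i] - hpos))
            cells[nearest].append(word.get("CONTENT"))
        rows.append([" ".join(cell) for cell in cells])
    return rows


# Column positions are assumed here; in practice they would have to be
# worked out for each layout style.
table = rows_by_column(ET.fromstring(SAMPLE), column_starts=[100, 400])
for row in table:
    print(row)
```

Because the coordinates survive alongside the text, a researcher can rebuild the rows and columns of a list instead of working from one undifferentiated stream of words.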

To get to this point however, we will have to overcome the layout challenge that the Quarterly Lists present. Across all of the lists we have found a few different layout styles which are rather tricky for OCR solutions to handle meaningfully. Note for instance how the list below compares to the one from the Calcutta Gazette above. Through the Digital Research strand of the project we will be seeking out innovative research groups willing to take a crack at improving the OCR quality and accuracy of tabular text extraction from the Quarterly Lists. 

The Quarterly Lists available on data.bl.uk are out of copyright and openly licensed for reuse. If you or anyone you know is interested in using the Quarterly Lists in your research, or simply wants to find out more about them, feel free to drop me an email at Tom.Derrick@bl.uk or follow the project @BL_IndianPrint.

You can read more about the history of the Quarterly Lists in a previous blog I wrote last year.

03 November 2016

Quarterly Lists: Digitally Researching Catalogues of Indian Books


As well as digitising rare early printed Indian books, the Two Centuries of Indian Print project is making available online some wonderful catalogues held by the library, generally known as the Quarterly Lists, recording all books published quarterly and by province of British India between 1867 and 1947.

The catalogues will complement the Bengali printed books and I’d like to use this blog to share a bit more about what the Quarterly Lists are and what we are doing to make them as accessible as possible for researchers of book history who want to apply digital research methods to explore their rich contents.

Firstly, a little more about the origins of these catalogues. With the passing of The (Indian) Press and Registration of Books Act, 1867, it became mandatory for all books published in provinces of British India to be sent to the provincial secretariat library for registration. Both the India Office Library and the British Museum Library in London, later to be united in the British Library’s collection, were separately given the privilege of requesting books from these lists free of charge in what amounted to a colonial legal deposit arrangement. The act was passed with the aim of recording the ever growing number of publications originating from the various printing presses throughout India, its purpose political as well as archival. Not all works that issued from the presses were recorded in the lists and only a small percentage were actually deposited in the London collections. The library curators in London selected only those works which they thought were important or interesting. The Quarterly Lists were originally published as appendices in the official provincial newspapers, such as the Calcutta Gazette (below).

  SV_412_8_1875-78_0003

 

SV_412_8_1875-78_0004

 

Although Independence brought an end to the arrangement for depositing publications with the India Office Library and British Museum Library, the practice of publishing catalogues of registered printed books continued until the late 1960s.

Now digitised for the first time, the Quarterly Lists will be made available as searchable PDFs via the British Library's new datasets portal, data.bl.uk, in November. Researchers will be able to examine rich bibliographic data about books published throughout India, including the names and addresses of printers and publishers. If you are interested in accessing this collection please contact Tom.Derrick@bl.uk.

Our next steps will be to OCR the Quarterly Lists to create ALTO XML for every page, which is designed to show accurate representations of the content layout. This will allow researchers to apply computational tools and methods to look across all of the lists to answer their questions about book history. So if a researcher is interested in what the history of book publishing reveals about a particular time period and place, we would like to make that possible by giving them full access to this dataset.

To get to this point however, we will have to overcome the layout challenge that the Quarterly Lists present. Across all of the lists we have found a few different layout styles which are rather tricky for OCR solutions to handle meaningfully. Note for instance how the list below compares to the one from the Calcutta Gazette above. Through the Digital Research strand of the project we will be seeking out innovative research groups willing to take a crack at improving the OCR quality and accuracy of tabular text extraction from the Quarterly Lists. 

  SV_412_8_1935_0016

If you or anyone you know is interested in using the Quarterly Lists in your research, or simply wants to find out more about them, feel free to drop me an email at Tom.Derrick@bl.uk or follow the project @BL_IndianPrint.

 

04 July 2016

Two Centuries of Indian Print: Enhancing Scholarly Research


Tom Derrick will be working as a Digital Curator within the Digital Research Team at the British Library on a project titled ‘Two Centuries of Indian Print’. This project will digitise rare Bengali printed books and provide opportunities for innovative research at the intersection of Digital Humanities and South Asian studies. He tweets @tommyid83, and can also be contacted by email at Tom.Derrick@bl.uk.

 

Only a week into my new role, I can already see the benefits of the work that the digital research team delivers. I attended a fascinating presentation of the two latest BL Labs award-winning projects. I was impressed to see how young researchers are collaborating with the digital research team here to find innovative methods to open up new avenues for their own research as well as for other academics and the general public.

I have joined the British Library from a digital publisher of historical primary sources and am excited to use my experience engaging with researchers to facilitate academic interrogation of the Two Centuries of Indian Print project data. This two-year pilot will make digitised Bengali books freely available online, drawn from the extensive South Asian printed book collection at the British Library along with a selection from SOAS. The books digitised as part of the pilot, the bulk of which are religious tracts, will span 1801-1867. The pilot is part of a wider initiative by the British Library to catalogue and make available printed Indian books in 22 South Asian languages, covering 1714-1914.

Ab haval, a poetical account in Gujarati on the disastrous floods at Ahmadabad, 1875

 

Over the course of the next two years, I'll be engaging with researchers, particularly in the fields of South Asian studies and Digital Humanities, to explore the opportunities and challenges involved in applying digital research methods and tools to this newly digitised collection. A key area I'll be looking at is how to ensure the metadata and digitised text produced will cater to the needs and interests of an academic community interested in performing large-scale data analysis. This will involve finding an optimal solution to making the Bengali script machine readable so the full text can be searched and ‘mined’ by researchers. We'll also be developing a series of workshops to help academics and professionals from Indian institutions, particularly the GLAM (Galleries, Libraries, Archives and Museums) sector, gain new skills to support digital research.

Illustration from an early printed edition of the Adityahṛdayam, a devotional hymn in Sanskrit to the Sun God, seen here on his chariot drawn by seven horses, Bombay, 1862

 

It is a privilege to be here working for the British Library, an institution I have always admired for its mission and core values, and I am proud to support that continued effort by stimulating an international community of researchers to access what will prove to be a fascinating collection. We’ll be posting further blogs describing the progress of the project, so watch this space! If you have any questions about the project or ideas relating to innovative use of the collection, please do email me at Tom.Derrick@bl.uk.

28 January 2016

Book Now! Nottingham @BL_Labs Roadshow event - Wed 3 Feb (12.30pm-4pm)


Do you live in or near Nottingham, and are you available on Wednesday 3 Feb between 1230 - 1600? Come along to the FREE UK @BL_Labs Roadshow event at GameCity and The National Video Game Arcade, Nottingham (we have some places left and booking is essential for anyone interested) and:

 

BL Labs Roadshow at GameCity and The National Video Game Arcade, Nottingham, hosted by the Digital Humanities and Arts (DHA) Praxis project based at the University of Nottingham, Wed 3 Feb (1230 - 1600)
  • Discover the digital collections the British Library has, understand some of the challenges of using them and even take some away with you.
  • Learn how researchers found and revived forgotten Victorian jokes and Political meetings from our digital archives.
  • Understand how special games and computer code have been developed to help tag un-described images and make new art.
  • Find out about a tool that links digitised handwritten manuscripts to transcribed texts and one that creates statistically representative samples from the British Library’s book collections.
  • Consider how the intuitions of a DJ could be used to mix and perform the Library's digital collections.
  • Talk to Library staff about how you might use some of the Library's digital content innovatively.
  • Get advice, pick up tips and feedback on your ideas and projects for the 2016 BL Labs Competition (deadline 11 April) and Awards (deadline 5 September). 

Our hosts are the Digital Humanities and Arts (DHA) Praxis project at the University of Nottingham who are kindly providing food and refreshments and will be talking about two amazing projects they have been involved in:

ArtMaps: Putting the Tate Collection on the map

Dr Laura Carletti will be talking about the ArtMaps project which is getting the public to accurately tag the locations of the Tate's 70,000 artworks.

The 'Wander Anywhere' free mobile app developed by Dr Benjamin Bedwell.

Dr Benjamin Bedwell, Research Fellow at the University of Nottingham, will talk about the free mobile app he developed called 'Wander Anywhere'. The mobile software offers users new ways to experience art, culture and history by guiding them to locations where it downloads to their mobile device stories relevant to where they are, combining art, local history, architecture and anecdotes.

For more information, a detailed programme and to book your place, visit the Labs and Digital Humanities and Arts Praxis Workshop event page.

Posted by Mahendra Mahey, Manager of BL Labs.

The BL Labs project is funded by the Andrew W. Mellon Foundation.

22 January 2016

BL Labs Competition and Awards for 2016


Today the Labs team is launching the fourth annual Competition and Awards for 2016. Please help us spread the word by tweeting, re-blogging and telling anyone who might be interested about it!

British Library Labs Competition 2016

The annual Competition is looking for transformative project ideas which use the British Library’s digital collections and data in new and exciting ways. Two Labs Competition finalists will be selected to work 'in residence' with the BL Labs team between May and early November 2016, where they will get expert help, access to the Library’s resources and financial support to realise their projects.

Winners will receive a first prize of £3000 and runners up £1000 courtesy of the Andrew W. Mellon Foundation at the Labs Symposium on 7th November 2016 at the British Library in London where they will showcase their work.

The deadline for entering is midnight British Summer Time (BST) on 11th April 2016.

Labs Competition winners from previous years have produced an amazing range of creative and innovative projects. For example:

(Top-left) Adam Crymble's Crowdsource Arcade and some specially developed games to help with tagging images
(Bottom-left) Katrina Navickas' Political Meetings Mapper and a photo from a Chartist re-enactment 
(Right) Bob Nicholson's Mechanical Comedian

A further range of inspiring and creative ideas have been submitted in previous years and some have been developed further.

British Library Labs Awards 2016

The annual Awards, introduced in 2015, formally recognise outstanding and innovative work that has been carried out using the British Library’s digital collections and data. This year, they will be commending work in four key areas:

  • Research - A project or activity which shows the development of new knowledge, research methods, or tools.
  • Commercial - An activity that delivers or develops commercial value in the context of new products, tools, or services that build on, incorporate, or enhance the Library's digital content.
  • Artistic - An artistic or creative endeavour which inspires, stimulates, amazes and provokes.
  • Teaching / Learning - Quality learning experiences created for learners of any age and ability that use the Library's digital content.

A prize of £500 will be awarded to the winner and £100 for the runner up for each category at the Labs Symposium on 7th November 2016 at the British Library in London, again courtesy of the Andrew W. Mellon Foundation.

The deadline for entering is midnight BST on 5th September 2016.

The Awards winners for 2015 produced a remarkable and varied collection of innovative projects in the Research, Creative/Artistic and Entrepreneurship categories, along with a special Jury's prize:

(Top-left) Spatial Humanities research group at Lancaster University plotting mentions of disease in Victorian newspapers on a map,
(Top-right) A computer generated work of art, part of 'The Order of Things' by Mario Klingemann,
(Bottom-left) A bow tie made by Dina Malkova inspired by a digitised original manuscript of Alice in Wonderland
(Bottom-right) Work on Geo-referencing maps discovered from a collection of digitised books at the British Library that James Heald is still involved in.
  • Research: “Representation of disease in 19th century newspapers” by the Spatial Humanities research group at Lancaster University analysed the British Library's digitised London-based newspaper, The Era, through innovative and varied selections of qualitative and quantitative methods in order to determine how, when and where disease was discussed in the Victorian era.
  • Creative / Artistic:  “The Order of Things” by Mario Klingemann involved the use of semi-automated image classification and machine learning techniques in order to add meaningful tags to the British Library’s one million Flickr Commons images, creating thematic collections as well as new works of art.
  • Entrepreneurship: “Redesigning Alice” by Dina Malkova produced a range of bow ties and other gift products inspired by the incredible illustrations from a digitised British Library original manuscript of Alice's Adventures Under Ground by Lewis Carroll and sold them through the Etsy platform and in the Alice Pop up shop at the British Library in London.
  • Jury's Special Mention: Indexing the BL 1 million and Mapping the Maps by volunteer James Heald describes both the work he has led and his collaboration with others to produce an index of 1 million 'Mechanical Curator collection' images on Wikimedia Commons from the British Library Flickr Commons images. This led to the discovery of 50,000 maps within this collection, partly through a map-tag-a-thon; these maps are now being geo-referenced.

A further range of inspiring work has been carried out with the British Library's digital content and collections.

If you are thinking of entering, please make sure you visit our Competition and Awards pages for further details.

Finally, if you have a specific question that can't be answered through these pages, feel free to contact us at labs@bl.uk, or why not come to one of the 'BL Labs Roadshow 2016' UK events we have scheduled between February and April 2016 to learn more about our digital collections and discuss your ideas?

We really look forward to reading your entries!

Posted by Mahendra Mahey, Manager of British Library Labs.

The British Library Labs project is funded by the Andrew W. Mellon Foundation.