24 January 2017
Publication of Quarterly Lists: Catalogues of Indian Books
The Two Centuries of Indian Print project is pleased to announce the online availability of some wonderful catalogues held by the library, generally known as the Quarterly Lists. They record the books published in each province of British India, quarter by quarter, between 1867 and 1947.
Digitised for the first time, the Quarterly Lists can now be accessed as searchable PDFs via the British Library's datasets portal, data.bl.uk. Researchers will be able to examine rich bibliographic data about books published throughout India, including the names and addresses of printers and publishers, publication price and how many copies were sold.
Our next step will be to OCR the Quarterly Lists to create ALTO XML for every page. ALTO is an XML format designed to record the layout of a page as well as its text. This will allow researchers to apply computational tools and methods across all of the lists to answer their questions about book history. So if a researcher is interested in what the history of book publishing reveals about a particular time period and place, we would like to make that possible by giving them full access to this dataset.
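For readers wondering what working with ALTO XML might look like, here is a minimal sketch in Python using only the standard library. It assumes the usual ALTO structure, in which `TextLine` elements hold `String` elements with `CONTENT`, `HPOS` and `VPOS` attributes. The sample page, its contents and the `alto_lines` helper are made up for illustration and are not taken from the Quarterly Lists or from any project code.

```python
import xml.etree.ElementTree as ET

# Hypothetical ALTO fragment, invented for illustration only.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace>
    <TextBlock>
      <TextLine HPOS="100" VPOS="250">
        <String CONTENT="Calcutta" HPOS="100" VPOS="250"/>
        <SP/>
        <String CONTENT="1885" HPOS="220" VPOS="250"/>
      </TextLine>
      <TextLine HPOS="100" VPOS="200">
        <String CONTENT="Example" HPOS="100" VPOS="200"/>
        <SP/>
        <String CONTENT="Title" HPOS="210" VPOS="200"/>
      </TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""


def local_name(tag):
    """Drop the '{namespace}' prefix so any ALTO version can be read."""
    return tag.rsplit("}", 1)[-1]


def alto_lines(xml_text):
    """Return each text line with its page position, top to bottom."""
    root = ET.fromstring(xml_text)
    lines = []
    for elem in root.iter():
        if local_name(elem.tag) != "TextLine":
            continue
        # Each String child carries one recognised word in CONTENT.
        words = [
            child.get("CONTENT", "")
            for child in elem
            if local_name(child.tag) == "String"
        ]
        lines.append({
            "hpos": float(elem.get("HPOS", 0)),
            "vpos": float(elem.get("VPOS", 0)),
            "text": " ".join(words),
        })
    # Sort by vertical, then horizontal position to follow the page layout.
    return sorted(lines, key=lambda line: (line["vpos"], line["hpos"]))


if __name__ == "__main__":
    for line in alto_lines(SAMPLE):
        print(line["vpos"], line["hpos"], line["text"])
```

Because every word keeps its coordinates, the same positions could in principle be used to group words into the columns of a table, although real pages would need far more care than this sketch takes.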
To get to this point, however, we will have to overcome the layout challenge that the Quarterly Lists present. Across all of the lists we have found a few different layout styles, which are rather tricky for OCR solutions to handle meaningfully. Note, for instance, how the list below compares to the one from the Calcutta Gazette above. Through the Digital Research strand of the project we will be seeking out innovative research groups willing to take a crack at improving the OCR quality and the accuracy of tabular text extraction from the Quarterly Lists.
The Quarterly Lists available on data.bl.uk are out of copyright and openly licensed for reuse. If you or anyone you know is interested in using the Quarterly Lists in your research, or simply wants to find out more about them, feel free to drop me an email at [email protected] or follow the project @BL_IndianPrint.
You can read more about the history of the Quarterly Lists in a previous blog post I wrote last year.