14 May 2018
Digital Curator Mia Ridge writes: in this guest post, Dr Giles Bergel describes some experiments with the Library's digitised images...
The University of Oxford’s Visual Geometry Group has been working with a number of British Library curators to apply computer vision technology to their collections. On April 5 of this year I was invited by BL Digital Curator Dr. Mia Ridge to St. Pancras to showcase some of this work and to give curators the opportunity to try the tools out for themselves.
Computer vision - the extraction of meaning from images - has made considerable strides in recent years, particularly through the application of so-called ‘deep learning’ to large datasets. Cultural collections provide some of the most interesting test-cases for computer vision researchers, due to their complexity; the intensity of interest that researchers bring to them; and to their importance for human well-being. Can computers see collections as humans do? Computer vision is perhaps better regarded as a powerful lens rather than as a substitute for human curation. A computer can search a large collection of images far more quickly than can a single picture researcher: while it will not bring the same contextual understanding to bear on an image, it has the advantage of speed and comprehensiveness. Sometimes, a computer vision system can surprise the researcher by suggesting similarities that weren’t readily apparent.
As a relatively new technology, computer vision attracts legitimate concerns about privacy, ethics and fairness. By making its state of the art tools freely available, Visual Geometry hope to encourage experimentation and responsible use, and to enlist users to help determine what they can and cannot do. Cultural collections provide a searching test-case for the state of the art, due to their diversity as media (prints, paintings, stamped images, photographs, film and more) each of which invite different responses. One BL curator made a telling point by searching the BBC News collection with the term 'football': the system was presented with images previously tagged with that word that related to American, Gaelic, Rugby and Association football. Although inconclusive due to lack of sufficiently specific training data, the test asked whether a computer could (or should) pick the most popular instances; attempt to generalise across multiple meanings; or discern separate usages. Despite increases in processing power and in software methods, computers' ability to generalise; to extract semantic meaning from images or texts; and to cope with overlapping or ambiguous concepts remains very basic.
Other tests with BL images have been more immediately successful. Visual Geometry's Traherne tool, developed originally to detect differences in typesetting in early printed books, worked well with many materials that exhibit small differences, such as postage stamps or doctored photographs. Visual Geometry's Image Search Engine (VISE) has shown itself capable of retrieving matching illustrations in books digitised for the Library's Indian Print project, as well as certain bookbinding features, or popular printed ballads. Some years ago Visual Geometry produced a search interface for the Library's 1 Million Images release. A collaboration between the Library's Endangered Archives programme and Oxford researcher David Zeitlyn on the archive of Cameroonian studio photographer Jacques Toussele employed facial recognition as well as pattern detection. VGG's facial recognition software works on video (BBC News, for example) as well as still photographs and art, and is soon to be freely released to join other tools under the banner of the Seebibyte Project.
I'll be returning to the Library in June to help curators explore using the tools with their own images. For more information on the work of Visual Geometry on cultural collections, subscribe to the project's Google Group or contact Giles Bergel.
The event was supported by the Seebibyte project under an EPSRC Programme Grant EP/M013774/1