Show me the data
Libraries just worry about books, right? Wrong! We also worry about data. If you want to provide a useful service to the research community (and that community includes anyone who wants to do research), you need to think about all the information, including research data sets, that people may need. But we recognise that isn’t always easy to do.
The Royal Society’s 2012 report on science as an open enterprise focused on the value of research data. At a recent meeting, Professor Geoffrey Boulton, who led the study, noted that ‘open science’ approaches are not new. Henry Oldenburg, the 17th-century German natural philosopher and first Secretary of the Royal Society, ensured that all his scientific correspondence was written in the vernacular (and not Latin, as was the norm), and that all his observations were supported by supplementary evidence (and not just assertions).
Boulton reflected that, while the value of supporting reproducibility and providing an evidence base was recognised very early on, many journals no longer publish results in tandem with the underlying data. Fortunately, technology now allows many publishers and others to provide better access to the data.
In some areas of science there is an established culture of data sharing. Researchers who sequence DNA from any species are asked to submit it to GenBank, a database established to ensure that scientists have access to the most up-to-date and comprehensive DNA sequence information. Most publishers require researchers to provide evidence that they have added their data to GenBank before publication. So, if you work on sequencing DNA, getting access to other people’s data is relatively easy – but that is not necessarily the case in many other areas of science.
The reasons are complex. In many areas of research, there are no established or permanent stores for the many types of data that are produced. For researchers, the data they collect or generate is the primary output of their research and therefore constitutes their intellectual capital. Many are concerned about receiving appropriate credit for their efforts, which may not happen if they share their data with all and sundry. But that objection could be tackled if data could be cited, so that the researchers who produced it are recognised for their contribution.
The British Library is a founding member of an organisation called DataCite which, as the name suggests, was established to enable data to be cited. We have been working with a range of organisations responsible for managing, storing and preserving data from a variety of areas – everything from archaeology to atmospheric science – to enable them to attach a ‘digital tag’ to data that allows it to be referenced. This tag is ‘persistent’: even if the data is no longer available, it will still be possible to find out what has happened to that resource. We hope that when someone says ‘show me the data’, we will have played a role in making that possible.
Lee-Ann Coleman and Allan Sudlow