Science blog

29 posts categorized "Data"

06 February 2015

DataCite Case Study: at the University of Leeds

In June last year, we held a DataCite workshop hosted by the University of Glasgow. We've now turned our speaker's use of Digital Object Identifiers (DOIs) for rainforest data into a video and printed case study.

You can still find a short summary of that event here. Our thanks go to Gabriela Lopez-Gonzalez for taking the time to come and film with us.


We hope that this case study will help institutions promote the idea of data citation and use of DOIs for data to their researchers, and that this in turn will encourage more submission of data to institutional repositories.


A DataCite DOI is not just for data

During January we have also been trying to spread the word that DOIs from DataCite aren't necessarily just for data. We've been working with the British Library's EThOS service to look at how UK institutions might give DOIs to their electronic theses and dissertations.

There was an initial workshop to define the issues in November 2014, and on 16th January we held a bigger workshop, bringing more institutions together to look at how we might start to establish a common way of identifying e-theses in the UK.

The technical step of assigning a DOI to a thesis is relatively straightforward. Once an institution is working with DataCite (or CrossRef) they can use their established systems to assign a DOI to a thesis. But the policies surrounding the issue and management of this process are more complex. We're hoping that these workshops have helped everyone to pull in the same direction and collaborate on answers to common questions.
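To give a flavour of what that "technical step" looks like at the level of the identifier itself, here is a minimal Python sketch that splits a DOI into its prefix and suffix and builds the matching resolver link. The DOI used is made up for illustration and the function names are our own. Nothing here talks to DataCite or CrossRef: actually registering a DOI and its metadata happens through whatever systems an institution has set up with its registration agency.

```python
from urllib.parse import quote


def split_doi(doi: str) -> tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.strip().partition("/")
    # Every DOI prefix starts with "10." and is followed by a non-empty suffix.
    if not sep or not suffix or not prefix.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the https://doi.org/ link that resolves the DOI."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters in the suffix that are unsafe in a URL.
    return f"https://doi.org/{prefix}/{quote(suffix, safe='/')}"


# Hypothetical thesis DOI, used only as an example.
example = "10.1234/thesis-0001"
print(split_doi(example))
print(resolver_url(example))
```

The prefix identifies who assigned the DOI, while the suffix is chosen by the institution, which is exactly where the policy questions (what gets a DOI, and how suffixes are structured) come in.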

This work has given rise to a proposal to look at how to improve the connection between a thesis and the data it is built on. By triggering the consideration of sharing the data supporting a thesis, maybe we can "get 'em young" and introduce good data sharing practice as early in the research career as possible. Connecting the thesis and its data also increases the visibility of both, helping early career researchers to reap the benefits of their hard work sooner.

Watch this space to see what happens next!


12 December 2014

Wishing you a Merry Crystal-mas from DataCite UK

As 2014 draws to a close, it has been another busy year for us here at the Library running DataCite UK. Over the past 12 months the number of organisations that are now using DataCite DOIs in the UK has gone up to 26.

One highlight from earlier in the year was the minting of the 3 millionth DOI. This was minted as part of the work by the Cambridge Crystallographic Data Centre (CCDC) to assign DOIs to their crystallographic datasets. This has been a particularly nice milestone to have as 2014 has been the International Year of Crystallography.

In this year of crystallography, CCDC are by no means the only crystallography database getting DOIs for their data. Both eCrystals, based at Southampton, and the SPECTRa project at Imperial are doing the same thing.

This work now means that there are DOIs available for the crystal structures of caffeine, paracetamol and theobromine, all things that you might want to (or might need to) partake of this Christmas.

Theobromine is a key flavour compound in milk and dark chocolate, and the reason you can't feed chocolate to your pets: theobromine is particularly toxic to animals. Image from Flickr, CC-BY-NC-SA.



29 August 2014

Seeing Is Believing: Picturing the Nation's Health

Our latest Beautiful Science video looks back at a fantastic evening in which we welcomed Professor David Spiegelhalter and Dame Sally Davies to the Library for a discussion with Michael Blastland about the way in which public health messages are communicated.

In our recent Beautiful Science exhibition, we brought together some classics of data visualisation in the field of public health, showing the impact that powerful images can have in transforming the way we think about our own health and that of our society. But is John Snow's map of cholera deaths, or Florence Nightingale's rose diagram of deaths in the Crimean War, really better than a table of numbers, like John Graunt’s Table of Casualties, based on his amalgamation of the data contained within the London Bills of Mortality? When it comes to our health, how and why do we make decisions to reform, or not reform, our unhealthy behaviours?

Discussing this important question are:

Sir David Spiegelhalter, Winton Professor for the Public Understanding of Risk at Cambridge University

Dr. Dame Sally Davies, Chief Medical Officer for England

Michael Blastland, writer, broadcaster and author of The Tiger That Isn’t



Johanna Kieniewicz

04 August 2014

Beautiful Science 2014: Picturing Data, Inspiring Insight

As regular readers of this blog will know, earlier in 2014 we hosted the British Library’s first science-led exhibition: Beautiful Science. From classic diagrams from the Library’s collections to contemporary digital displays, Beautiful Science explored how the visualisation of scientific data is crucial for making new discoveries and for communicating those discoveries effectively. Nearly 70,000 people visited the Beautiful Science exhibition over its three-month run at the Library, with thousands more experiencing the exhibition at Cheltenham Science Festival.

Beautiful Science also comprised a spectacular season of events that ranged from serious debate to comedy, from family fun days to data visualisation workshops, from competitions to hands-on experiments.

If you missed out on the fun, or just want to remind yourself of what happened, then you can watch a highlights video of the season.


Over the coming weeks we will be posting videos from some of the key events in the season so watch this space…

Katie Howe

24 June 2014

UK DataCite on the road

Our data citation workshops have gone on the road. This blog post summarises the recent event at the University of Glasgow.

On Friday 13 June, we held an Introduction to DataCite workshop at the University of Glasgow. As well as introducing what DataCite is and what it does, we demonstrated the various ways you can look at what you put a DOI on (see this previous blog post), and considered issues around the versioning of data and its ‘granularity’: whether you apply a DOI to a collection of data, individual data files or some other slice of the data. Slides from the day are available on our website.

We had two really enlightening talks from users of the service, Gabriela Lopez-Gonzalez and Graham Blythe, both from the University of Leeds.

Gabriela is a researcher at Leeds, and runs a site that is part of international work to share longitudinal inventory data from permanent forest plots. Gabriela spoke passionately about how important data citation is for her and her community, and how having a persistent identifier such as a DOI for that data will help to acknowledge not just the researchers who collected the data, but also the research assistants, data managers and curators. These people play a vital role in the quality of the data, and in making sure the data are available for further research, but they do not traditionally get recognition in subsequent research papers. And most of them spend just as much time camping out in the rainforest, enduring mosquitoes, snakes and spiders, as those who are recognised on research papers!



Making good quality research data available for reuse involves many people. Image credit: Gabriela Lopez-Gonzalez


Graham is part of the research data management team at Leeds. Gabriela provided them with a great test case early on in their planning. He talked about the process Leeds has been through in deciding how to use DOIs for their data. He was wonderfully honest in talking about where, like many institutions we’re talking to, not all the possibilities have been decided on – or even uncovered yet.

Some of the issues around using DOIs seem difficult at first, for instance what data should get a DOI and when. It can be hard to make those decisions when you’re aware of how diverse an institution’s research and data are – no one wants to set policies that will exclude important data. But while it’s good to have general rules on assigning your DOIs, it is important to be flexible as best practice evolves.

We hope to run further data citation workshops around the country, not just to provide details on working with DataCite, but also to bring institutions dealing with these issues together – keep an eye on our webpages and Twitter feed for details.

12 June 2014

The World we live in

Natalie Bevan looks back on last week’s World Environment Day #WED and considers the role of environmental data, outlining some examples of global ecological information sources available today.

Last week saw the annual celebration and public awareness campaign for all things green, World Environment Day. With this in mind it seems an opportune time to consider some of the outstanding information resources available today for those interested or working in the broad discipline of environmental sciences.

Taking biodiversity as an example, data generated from scientific research are being used in a variety of innovative and progressive ways by organisations and individuals. As more data is made available, better analysis can be undertaken regarding the risks and threats faced by the natural environment. Most vitally this data can also be interpreted to discover effective solutions to complex global ecological problems.

The United Nations Environment Programme and the World Conservation Monitoring Centre - which support World Environment Day - have recreated this Conservation Dashboard. It provides easy access to snapshots of key ecological profiles country by country.

Biodiversity data are available to view, download and analyse in a variety of tools via this site. These include, to name but a few: Ocean Data Viewer, a tool that provides access to, and geospatial navigation of, data on the conservation of coastal and marine ecology; and Protected Planet, which gives access to the most comprehensive global data on the world’s protected areas, which can be explored through the map or searched for specific datasets.


We are also making strides here at the British Library in providing more visibility to datasets available on the web via our online catalogue, making a limited number of selected scientific research dataset records available in Explore the British Library. Search for datasets here.

We have also made datasets relevant to flooding discoverable through our Envia tool.

Finally, has it ever occurred to you that you might be interested in generating or collecting biodiversity data yourself? There are a number of interesting citizen science initiatives afoot in this area. It's worth checking out:

  • The Great British Bee Count – It’s well known that bees are suffering loss of habitat and food sources, but there is not a detailed picture of overall bee health across the UK. This project invites you to download an app, and record the bees you spot in your daily activities.
  • OPAL (Open Air Laboratories) – OPAL runs nation-wide surveys on the state of the environment. From the health of the trees in your neighbourhood, to the bugs in your hedge, learn a bit more about the environment around you and contribute to important scientific research!


11 April 2014

The Evolution of Evolution: Picturing the Tree of Life

Johanna Kieniewicz introduces the section of Beautiful Science that explores the Tree of Life.

In our Beautiful Science exhibition, we explore the evolution of evolution, with a section of the exhibition dedicated to the ways in which we have pictured the tree of life—simultaneously image and metaphor for our relationship and connection to life on Earth.

We start out at the beginning, with an illustration of the universe by Renaissance alchemist Robert Fludd. The ‘Great Chain of Being’  is an ancient Greek concept that classifies life on earth into a hierarchical order with respect to the rest of the universe. A great ladder links God and other divine beings to astronomical bodies, man, animals, plants and minerals. Each animal is fixed on a rung in order of perfection (upwards towards man). This sort of hierarchical organisation of life laid the groundwork for the development of biological classification systems and ultimately evolutionary trees.

A complex circular diagram with concentric layers. A nude woman is pictured in the middle, with her right hand raised and holding a staff in her left.
Great Chain of Being, Robert Fludd, Utriusque Cosmi majoris scilicet et minoris ... Oppenheim; Frankfurt, 1617

In On the Origin of Species (1859, 1st ed.), Charles Darwin famously used the metaphor of a tree to articulate his ideas around evolution.

‘The affinities of all the beings of the same class have sometimes been represented by a great tree… The green and budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species. From the first growth of the tree, many a limb and branch has decayed and dropped off; and these fallen branches of various sizes may represent those whole orders, families, and genera which have now no living representatives, and which are known to us only in a fossil state… As buds give rise by growth to fresh buds, and these, if vigorous, branch out and overtop on all sides many a feebler branch, so by generation I believe it has been with the great Tree of Life, which fills with its dead and broken branches the crust of the earth, and covers the surface with its ever-branching and beautiful ramifications.’


A hand-written "tree of life" diagram
Darwin's diagram picturing his ideas of evolution from On the Origin of Species, Charles Darwin, 1859


German scientist (and talented illustrator) Ernst Haeckel was greatly inspired by Darwin’s ideas and sought to devise a great number of trees organising all life on Earth. In The Evolution of Man, Haeckel illustrates the evolutionary history of humans with a great tree, whose trunk represents our ancestral history, as our progenitors moved through stages, such as primitive worms, amphibians and apes. This tree reflects Haeckel’s (albeit not terribly Darwinian) belief that evolution was a process of perfecting, and that humans represented the pinnacle of evolution. Although the diagram reflects the attitudes of its time, it may be seen as a link between the early attempts to hierarchically organise life and contemporary approaches based on ancestral relationships and genetics.


Two pages of an open book. The left page shows a diagrammatic family tree, while the right shows a naturalistic, bare tree with labels on the branches.
The Pedigree of Man. Ernst Haeckel, The evolution of man. London, 1879.


Whilst the relationships pictured in early evolutionary trees were generally based on inference and shared traits, today’s phylogenetic trees are based on vast amounts of genomic data. In Beautiful Science, we show a ‘molecular time tree’ depicting the evolutionary relationships of all 9,993 living species of birds, illustrating when individual species diverged. The oldest species diverge closest to the centre of the circle, with more recent diversification closer to the edge.  Although modern birds first evolved some 145 – 66 million years ago, this diagram shows that they began to diversify exceptionally rapidly about 50 million years ago. This is particularly apparent for the songbirds, waterfowl, gulls and woodpeckers.


A multicoloured circular diagram showing families of birds broadening out from the centre.
Avian Tree of Life (c) Gavin Thomas, Walter Jetz, Jeff Joy, Arne Mooers, Klaas Hartmann, 2012. First published in Nature.


And indeed, here we are dealing with such vast amounts of information that one might begin to wonder whether there is any way in which we could meaningfully depict all of life on Earth on an A4 sheet of paper. There have been some attempts, but Imperial College London researcher James Rosindell has come up with an ingenious solution. One Zoom Tree, an interactive tree of life, allows viewers to zoom into the tree of life and explore the evolutionary relationships between tens of thousands of mammals, birds, reptiles and amphibians. It uses a branch of mathematics known as fractal geometry to create an attractive visualisation that can be explored by zooming in to get ever more detail. The data include sounds from the British Library’s collections and evolutionary data from scientific literature, including the ‘Avian Tree of Life’, showing how the same data can be pictured in different ways.


A section of a "tree of life" drawn in stylised manner as a branching fractal, with labels indicating the name of the group and date of divergence.
James Rosindell, Imperial College London, One Zoom Tree
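To give a feel for why a fractal layout suits this problem, here is a minimal Python sketch of a self-similar branching tree. It is not One Zoom Tree's actual algorithm: the shrink ratio, the branching angle and the function name are all assumptions chosen purely for illustration. Each branch spawns two children that are a fixed fraction of its length, so the number of tips doubles at every level while the whole drawing stays inside a bounded area, and zooming in on any branch reveals a smaller copy of the same shape.

```python
import math


def fractal_tree(depth, x=0.0, y=0.0, angle=math.pi / 2, length=1.0,
                 ratio=0.6, spread=math.pi / 6):
    """Return line segments ((x0, y0), (x1, y1)) for a binary fractal tree."""
    # End point of the current branch.
    x1 = x + length * math.cos(angle)
    y1 = y + length * math.sin(angle)
    segments = [((x, y), (x1, y1))]
    if depth > 0:
        # Two child branches, each shorter by `ratio` and turned by `spread`.
        for turn in (-spread, spread):
            segments += fractal_tree(depth - 1, x1, y1, angle + turn,
                                     length * ratio, ratio, spread)
    return segments


levels = 10
segments = fractal_tree(depth=levels)
tallest = max(end[1] for _, end in segments)
print("branches drawn:", len(segments))
print("tips at the finest level:", 2 ** levels)
print("tallest point:", round(tallest, 3))
```

Because the branch lengths form a shrinking geometric series, no path from the trunk to a tip can be longer than 1 / (1 - 0.6) = 2.5 units here, however many levels are added. That is the property that lets an enormous tree fit on one screen.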

Highly engaging, One Zoom Tree shows how far we’ve come in our ability to rationalise our relationship to the rest of life on Earth.  As you move along the trunk of the tetrapod tree, you see where different branches diverge, and can see relationships between different species. Did you know that the elephant’s closest relatives are hyraxes and sea cows?


Darwin wrote at the end of On the Origin of Species,

“It is interesting to contemplate a tangled bank, clothed with many plants of many kinds, with birds singing on the bushes, with various insects flitting about, and with worms crawling through the damp earth, and to reflect that these elaborately constructed forms, so different from each other, and dependent upon each other in so complex a manner, have all been produced by laws acting around us. . . There is grandeur in this view of life, with its several powers, having been originally breathed by the Creator into a few forms or into one; and that, whilst this planet has gone cycling on according to the fixed law of gravity, from so simple a beginning endless forms most beautiful and most wonderful have been, and are being, evolved.”

These new ways of picturing phylogenetic data are allowing us to better understand this beautiful, messy, complex tangled bank of life on Earth, and are equipping us with new ways of protecting its incredible biodiversity.


 Beautiful Science, sponsored by Winton Capital Management, is on display at the British Library until 26 May 2014 in the Folio Society Gallery. Admission is free.

13 March 2014

I Chart the British Library - Who Ate All the Pie Charts?

Festival of the Spoken Nerd and Special Guest Geeks explore the highs and lows of data visualisation as part of the Beautiful Science events season at the British Library. Rebecca Withers and Allan Sudlow report on the laughs and graphs during an evening for the sci-curious.

Monday night was not a typical night at the British Library. Over 250 self-identifying nerds and geeks poured into the Conference Centre for a night of graphs and gaffs for our data-related science comedy event, "I Chart the British Library". The show was hosted by our friends Festival of the Spoken Nerd (the phenomenal trio of geeky songstress Helen Arney, experiment maestro Steve Mould and stand-up mathematician Matt Parker) and supported by an outstanding set (collective noun) of guest nerds.

In the first half of the show Steve taught us the difference between Venn and Euler diagrams in classic FOTSN cheeky style, whilst Matt plumbed the depths of bad data visualisation, exposing the eye-watering attempts to make marketing guff look more 'mathsy'. Helen - in wonderful periodic table couture - explored with our very own Richard Ranft (Head of BL Sound & Vision) how wildlife calls had been visualised before recorded sound had been invented, and what new science the analysis of animal vocalisation data can reveal.

Erinma Ochu - one of our special guest nerds - talked about her citizen science projects, including the fantastic sunflower project she worked on with another of our guest nerds, Jonathan Swinton. A current crowdsourcing data project - hookedonmusic - inspired Helen to finish the first half with a song to test with the audience whether she was able to write a catchy tune, or not! 


The interval was crammed with data-tastic activities giving the audience a chance to get hands on, literally in the case of Matt, who was analysing audience arm spans. Steve used social media to capture numbers from the audience for some surprising statistics in the second half of the show. As well as the aforementioned hookedonmusic and vocal visualisations with Helen and Richard, the audience explored multispectral imaging forensics with Christina Duffy, part of the Conservation Science Team at the BL. We also got a sneak-peek behind-the-scenes tour of the Beautiful Science exhibition with our 'stand-up' curators: Johanna Kieniewicz and Nora McGregor.


After the break, we were treated to some analytical mayhem from the Nerds and Jonathan, as we examined some of the graphs and gaffs generated during the break. Graphing dangerous animals and a mathematically accurate love song were a perfect way to end the show.


We'd like to thank Helen, Matt, Steve and all our wonderful guest nerds for an evening of statistically significant silliness.

Please keep an eye out for highlight videos of the Beautiful Science events as they appear on our blog over the coming months...

21 February 2014

Beautiful Science-- Now Open!

Johanna Kieniewicz can finally get some sleep, but not before she puts up this blog post!

With great fanfare and much twittering (#BeautifulScience), our Beautiful Science exhibition opened yesterday. Beautiful Science: Picturing Data, Inspiring Insight looks at the past and present of data visualisation in science, telling stories both of discovery and of the way we think about the information that makes up our world.



 Thus far, the exhibition has received some wonderful coverage. Rather than repeating what others are saying about the things that the exhibition contains, we thought we'd highlight a few informative and interesting posts about the exhibition.


"An august institution, yes of course, our national library, so I suppose I was rather expecting a staid parade of the editiones principes of the great masters, leavened with the odd choice manuscript, and a morning of gentle savouring and genteel pleasure.  Not a bit of it.  The modern BL has fully embraced digital." 

We couldn't agree more. We are both a physical and digital library. Our science collections range from the depths of history to the present day, and we are keen to provide access to digital old things and physical new things (and vice versa of course!).


  • On the Guardian H-Word blog, Rebekah Higgitt asks the very valid question of what makes a science exhibition. After all, science has been embedded in various guises in previous library exhibitions. She picks up on the fact that this is not really a history of science exhibition-- but something that comes from a contemporary perspective, looking back. She notes:

"The British Library is the perfect institution for discussions between science, arts and the humanities to take place. While defined as a “science exhibition”, visitors to the display and participants in the accompanying events programme should be encouraged to see the aesthetic and the historical in it too – just as the science of the Tudor or Georgian eras should be recognised as part of their history."




  • In The Observer, Nicola Davis highlighted how data visualisation has changed life and saved lives. Touching on exhibits, such as Florence Nightingale's Rose Diagram and John Snow's Cholera Map, she highlights the very tangible importance of data visualisation.

"From scientists to consumers, there's no escaping the onward march of big data. But as Beautiful Science shows, if we embrace the power of graphics, fresh insights to modern challenges may be glimpsed. And that could be massive."


  • Writing for Forbes Magazine, Jonathon Keats takes interest in the works by John Snow and William Farr highlighted in the exhibition. He argues that:

"Technological advances clearly distinguish the new visualizations, many of which are interactive and all of which benefit from stores of data that Victorian scientists could scarcely have imagined. Yet the older charts and maps – especially those of William Farr and John Snow – remain pertinent in the age of cloud computing precisely because they are more limited in scope. While Google Flu Trends is vast and ever-changing, we can easily assess what Farr and Snow were doing. Their successes and failings can help inform how we produce and consume contemporary data visualizations."


  • Similarly, on the Nature Of Schemes and Memes blog, Alex Jackson latches on to what I said about the parallels between today's explosion of infographics and something similar that happened in the Victorian era.

Informative pieces featuring some of the fantastic graphics from the exhibition have also been featured in The Independent, The Daily Mail and The Londonist. If you are not UK-based, you can also take a look at this piece that featured on the BBC World Service.

So far, it has been fascinating to see the exhibition through the eyes of others. Those of us involved in the exhibition have been so immersed in it for so long that it sometimes seems as though we can't see the forest for the trees. It is really interesting to see fresh perspectives on the exhibits we've selected that provide new insights on the exhibition as a whole, as well as individual displays. We do hope you come along to Beautiful Science-- and be sure to let us know what you think!


Beautiful Science: Picturing Data, Inspiring Insight is on display in The British Library until 26 May 2014. The exhibition is free, and is sponsored by Winton Capital Management

07 February 2014

Visualising Research – let the competition begin

At a workshop held on 24 January at the British Library people from a variety of backgrounds came to hear more about the Visualising Research competition and to be inspired. My previous post explains the background to the competition.

Our interest in data and visualising it is all coming together in Beautiful Science, an exhibition opening on 20 February in the Folio Society Gallery of the British Library. But this competition gives everyone a chance to visualise research for the public.

At the workshop we had presentations from the sponsors and organisers, which are available from the competition website. It was useful to hear how the Gateway to Research data (the vital data required for a competition entry to be considered) is both supplied by and used by the Research Councils and informs their decision making by providing a picture of the research funding landscape in the UK.

Richard Jones from Cottage Labs gave us the technical lowdown on the Gateway to Research data and how to get it. Anyone deciding to enter will need a fair degree of technical know-how to get at this data and manipulate it in order to reveal the story they want to tell the public. So we encourage all the designers and artists to find themselves a techie partner for this competition!

Tobias Sturt and Adam Frost from the Guardian Digital Agency gave a great joint presentation on how to develop a visualisation. The key messages from them were:

  • Data - everything rests on the data.
  • Story - think about the audience, what do they care about, what will resonate with them?
  • Charting - what is the best way to represent the data? You may need an analyst to explore different ways of representing the data (and not all bar charts are boring!).
  • Design - although everything is about design, once you have the basis of your visualisation, you need to consider how you will use colour, what layout works best, and how to make it beautiful.

And they reminded us again that collaboration is key to successful data visualisation. 

Some inspirational examples were shown:

The True Size of Africa - it’s bigger than you thought!


Bloomberg billionaires - a bit addictive this one.

Nathan Yau's blog and Flowing Data web site were recommended.

And for some examples of dynamic visualisations, the following are worth checking out:

Kepler’s Tally of Planets

If the earth were 100 pixels wide

Then for a view of how data can be used to give reality to the sometimes extraordinary inconsistencies of our world – particularly in the way that money is distributed – we were enlightened by Andrew Steele, physicist and TEDx presenter. Rather than summarising his talk – have a look at him presenting his findings here.

The competition is open now. The closing date is 21 March. You could win £2,000. There are great judges who will see your work. And there will be kudos for the winners. 

Lee-Ann Coleman

