Science blog

Exploring science at the British Library

29 posts categorized "Data"

15 January 2021

zbMATH Open - mathematical database now free online

zbMATH Open - the first resource for mathematics. The logo is a white square containing a small grey square in the upper left corner and a larger red square in the lower right corner

We are very happy to hear that zbMATH, one of the most important bibliographic databases in the field of mathematics, is now freely available to all online. The database is run by FIZ Karlsruhe, the European Mathematical Society and the Heidelberg Academy of Arts and Sciences, and the funding to make it free to all was provided by the Joint Science Conference, the German national government organisation for science research funding and policy.

The database covers mathematics books and scholarly articles comprehensively since 1868, with some items from considerably earlier. It includes material from the paper abstracts journals Jahrbuch über die Fortschritte der Mathematik (1868-1945) and Zentralblatt für Mathematik (1931-2013). It can be searched by author and subject as normal, but also supports searching by mathematical formula and by the subject-specific Mathematics Subject Classification. It includes not just abstracts, but independent reviews of the significance of important articles, although some of these are in German rather than English. It also has both forward and backward citation data. Where possible, links to the online full-text item are provided.

The administrators are currently working on developing an API to allow content from zbMATH to be used in other digital information systems on an open access basis.

Anybody with an interest in mathematics is heartily recommended to try it out.

18 June 2020

Citizen Science and COVID-19

Your experience of the COVID-19 pandemic could be an important contribution to science. Researchers from diverse disciplinary backgrounds are keen to learn about your stories, insights, routines, thoughts and feelings. While some projects would be eager to receive diaries in the narrative style of Samuel Pepys or John Evelyn, others want more specific information in survey format.

Hand-drawn and painted cartoon illustrating various ways people have entertained themselves during lockdown
Illustration: Graham Newby, The British Library: Lockdown Rooms (3rd June 2020)

Citizen science engages self-selected members of the public in academic research that generates new knowledge and provides all participants with benefits. The engagement can vary from data gathering or participatory interpretation to shared research design. Different forms of citizen science can be referred to as public science, public participation in scientific research, community science, crowd-sourced science, distributed engagement with research and knowledge production, or trans-disciplinary research that integrates local, indigenous and academic knowledge.

Contributing to citizen science projects sustains a sense of control, a sense of belonging (empowering feelings in and after isolation) and a sense of being useful, all of which are particularly important in uncertain times. According to the UK Environmental Observation Framework, self-measured evidence is more trusted by people, and organisations that draw on data generated through citizen science are more trusted. Trust is linked to transparency. Better understanding of how scientific knowledge is produced, and having a role and responsibility in shaping the knowledge production process, are likely to enable citizen scientists to re-frame the often-uneasy relationship between society and science.

Scale is a distinctive feature of citizen science. The more people are engaged, the more comprehensive an understanding can be reached about the researched topic. The featured COVID-19 Symptom Study has become the largest public science project in the world in a matter of weeks: 3,881,488 citizen scientists are involved as of 18th June 2020. Big data allowed medics to develop an artificial intelligence diagnostic that can predict the likelihood of having COVID-19 based on the symptoms only: a vital tool indeed when testing is limited.

The citizen science initiatives highlighted here, COVID-19 Symptom Study, COVID-19 and You, and COVID Chronicles, may inspire you to contribute to them or find other projects where you can take an active role in developing better understanding of current and future epidemics.

COVID-19 Symptom Study
Institutions: King's College London, ZOE
Launched: 25th March 2020
Your contribution helps you and researchers understand COVID-19 and the dynamics of the pandemic (UK, USA).
How: Submit your physical health status regularly.

COVID-19 and You
Social sciences
Institutions: The Open University, The Young Foundation
Launched: 7th April 2020
Your contribution helps you and researchers understand how COVID-19 is affecting households and communities across the world.
How: Fill in an online survey with choices and narratives.

In addition to supporting current research, your contribution could add to future inquiries as well. Collecting and archiving short personal stories ensures authentic data will be available when researchers in the future look back to us now with their research questions. Reliable data should be collected now, while we are still living in unprecedented times. It is especially important to record the experiences of people from less privileged backgrounds, in contrast to earlier pandemics where the voices of all but the upper and middle classes, and the political, legal and scholarly elite, have often been lost to history. COVID Chronicles, an archival initiative, is doing just that. COVID Chronicles is a joint project: BBC Radio 4's PM collects and features some of the stories and The British Library archives them all for future academic inquiries.

COVID Chronicles
History, social sciences
Institutions: BBC Radio 4, The British Library
Launched: 30th April 2020
Your contribution helps you and future researchers understand how people experience the COVID-19 pandemic in their daily life, at a personal level.
How: Submit a mini-essay (about 400 words) to BBC Radio 4 PM via e-mail: pm at bbc dot co dot uk. Your essay will be archived by The British Library and made available for future research.

The gradually easing lockdown and the anticipated long journey of national and global recovery generate a growing appetite to record, reflect on and analyse the COVID-19 epidemic's influence on our life. Not all "citizen science" projects observe high standards of privacy and ethical responsibility, however. Before joining in any research with public participation, consider the principles of citizen science suggested by the European Citizen Science Association and the questions below:

Five questions before joining a citizen science initiative

  1. Can you contact the researchers and the institution(s) they belong to with your questions and concerns?
  2. Is the research approach clear to you? In other words, is it clear to you what happens to your contribution, how it shapes the investigation and what new knowledge is expected?
  3. Is your privacy protected? In other words, is the privacy policy clear to you, including how you can opt out at any time and be sure that your data are deleted?
  4. Are you contacted regularly about the progress of the research you are contributing to?
  5. Are you gaining new transferable skills, new knowledge, insights and other benefits by participating in the research?

Further reading:

Bicker, A., Sillitoe, P., Pottier, J. (eds) 2004. Investigating Local Knowledge: New Directions, New Approaches. Aldershot : Ashgate.
BL Shelfmark YC.2009.a.7651, Document Supply m04/38392

Citizen Science Resources related to COVID-19 pandemic (annotated list)
[Accessed 18th June 2020]

Curtis, V. 2018. Online citizen science and the widening of academia: distributed engagement with research and knowledge production. Basingstoke, Hampshire: Palgrave Macmillan.
Available as an ebook in British Library reading rooms.

Open University. 2019. Citizen Science and Global Biodiversity (free online course)
[Accessed 18th June 2020]

Sillitoe, P. (ed). 2007. Local science vs global science: approaches to indigenous knowledge in international development. New York : Berghahn Books.
BL Shelfmark YC.2011.a.631, also available as an ebook in British Library reading rooms.

Written by Andrea Deri, Science Reference Team

Contributions from Polly Russell, Curator, COVID Chronicles, and Phil Hatfield, Head of the Eccles Centre for American Studies, are much appreciated.


14 January 2020


A handwritten letter from Ada Lovelace to Charles Babbage

The British Library is joining in the International Day of Women and Girls in Science, celebrating and raising the voices of women in science with a one day mini festival. Our events and talks will encourage you to laugh, sing and think. Every few days this blog will look in more detail at the participants and their involvement with the event.

From 1pm drop in to our free Entrance Hall sessions, including fun scientific presentations, hands-on activities and a chance to create your own (bio)selfie using the bacteria swabbed from your cheek. There’s something for all ages and levels of science knowledge. See the full list of activities here.
Then join us for an evening of talks to hear from women about their experiences of working in the sciences. This is a ticketed event and tickets can be purchased from our website.

The British Library holds one of the most comprehensive national science collections in the world, ranging from ancient manuscripts grappling to understand different aspects of the world, prior to the development of science as we know it today, to the latest scientific publications deposited at the Library through the electronic legal deposit every day. The British Library preserves the UK scientific record, supports scientific research and enables access to science for all, which includes supporting equality and diversity in science. During 2020 the Library’s exhibition Unfinished Business: The Fight for Women's Rights will be looking into the struggle for women’s rights in all walks of life which includes an ongoing struggle for equality in all areas of science, technology and engineering. The WISE Festival is an opportunity to start our reflection on women’s rights and to celebrate the achievements of women in science in a way that we hope will be fun, inspirational and thought-provoking.

Join us next time to find out more about Sunetra Gupta.

WISE (WOMEN IN SCIENCE EVENTS) Festival, British Library 11 February 2020.

29 August 2017

I4OC: The British Library and open data

In August the British Library joined the Initiative for Open Citations as a stakeholder. The I4OC’s aim of promoting the availability of structured, separable, open citation data fits perfectly with the Library's established strategy for open metadata which has just marked its seventh anniversary.

In August 2010, responding to UK Government calls for increased access to public data to promote transparency, economic growth and research, the British Library launched the strategy by offering over 16m CC0 licensed records from its catalogue and national bibliography datasets. This initiative aimed to remove constraints created by restrictive licensing and library specific standards to enable wider community re-use. In doing so the Library aimed to unlock the value of the data while improving access to information and culture in line with its wider strategic objectives.
The initial release was followed in 2011 by the launch of the Library’s first Linked Open Data (LOD) bibliographic service. The Library believed Linked Open Data to be a logical evolutionary step for the established principle of freedom of access to information, offering trusted knowledge organisations a central role in the new information landscape. The development proved influential among the library community in moving the Linked Data debate from theory to practice.

Over 1,700 organisations in 123 countries now use the Library’s open metadata services with many more taking single files. The value of the Library’s open data work was recognised by the British National Bibliography linked dataset receiving a 5 star rating on the UK Government site and certification from the Open Data Institute (ODI). In 2016 the Library launched the platform in order to offer copies of a range of its datasets available for research and creative purposes. In addition, the BL Labs initiative continues to explore new opportunities for public use of the Library’s digital collections and data in exciting and innovative ways. The British Library therefore remains committed to an open approach to enable the widest possible re-use of its rich metadata and generate the best return on the investment in its creation.

I4OC users by country


As the example of the British Library’s open data work shows, opening up metadata facilitates access to information, creates efficiencies and allows others to enhance existing and develop new services. This is particularly important for researchers and others who do not work for organisations with subscriptions to commercial citation databases. The British Library believes that opening up metadata on research facilitates both improved research information management and original research, and therefore benefits all.

The I4OC’s recent call to arms for its stakeholders is therefore very much in tune with the British Library’s open data work in promoting the many benefits of freely accessible citation data for scholars, publishers and wider communities. Such benefits proved compelling enough to enable the I4OC to secure publisher agreement for nearly half of indexed scholarly data to be made openly accessible. This data is now being used in a range of new projects and services including OpenCitations and Wikidata. It's encouraging to see I4OC spreading the open data ideal so successfully and it is to be hoped that it will also succeed in ensuring open citations become the default in future.

Correction: Image shows users of BL open data services by country, not I4OC

03 February 2017

HPC & Big Data

Matt and Philip attended the HPC & Big Data conference on Wednesday 1st February. This is an annual one-day conference on the uses of high-performance computing and especially on big data. “Big data” is used widely to mean very large collections of data in science, social science, and business.

There were some very interesting presentations over the day. Anthony Lee from our friends the Turing Institute discussed the Institute’s plans for the future and the potential of big data in general. The increasing amounts of data being created in “big science” scientific experiments and the world at large mean that the problems of research have shifted from data collection being the hard part to processing capabilities being overwhelmed by the sheer volume of data.

A presentation from the Earlham Institute and Verne Global revealed that Iceland could become a centre for high-performance computing in the future, thanks to its combination of cheap, green electricity from hydroelectric and geothermal power, high-bandwidth data links to other continents, and a cool climate which reduces the need for active cooling of equipment. HPC worldwide now consumes more energy than the entire airline industry and whole countries of the size and development level of Italy and Spain.

Dave Underwood of the Met Office described the Met Office’s acquisition of the largest HPC computer in Europe. He also pointed out the extreme male-biased demographic of the event, something that both Matt and Philip had noticed (although we admit, one of our female team members could have gone instead of Philip).

Luciano Floridi of Oxford University discussed the ethical issues of Big Data and pointed out that as intangibles become a greater portion of companies’ value, so scandal becomes more damaging to them. Current controversies involving behaviour on the internet suggest that moral principles of security, privacy, and freedom of speech may be increasingly conflicting with one another, leading to difficult questions of how to balance them.

JISC gave a presentation on their actual and planned shared HPC data centres, and invited representatives from our friends and neighbours at the Crick Institute and the Wellcome Trust’s Sanger Institute to speak on their IT plans. Alison Davis from Crick pointed out that an under-rated problem for academic IT departments is individual researchers’ desire to carry huge quantities of digital data with them when they move institutions, causing extra demand on storage and raising difficult issues of ownership.

Finally, Richard Self of the University of Derby gave an illuminating presentation on the potential pitfalls of “big data” in social science and business, such as the fact that the size of a sample does not guarantee that it is representative of the whole population, the probability of finding apparent correlations in a large sample that are created by chance and not causation, and the lack of guaranteed veracity. (For example, in one investigation 14% of geographical locations from mobile phone data were 65km or more out of place.)
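
The point about chance correlations can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions rather than anything shown at the conference: it fills a set of columns with purely random numbers (the sizes 200 and 20, the seed and the helper names are all arbitrary choices) and reports the strongest pairwise correlation found among them.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def strongest_chance_correlation(n_columns=200, n_rows=20, seed=0):
    """Largest absolute correlation between any two columns of pure noise."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    columns = [[rng.gauss(0, 1) for _ in range(n_rows)]
               for _ in range(n_columns)]
    return max(abs(pearson(a, b)) for a, b in combinations(columns, 2))


print(round(strongest_chance_correlation(), 2))
```

Every column here is independent noise, yet with 19,900 pairs to compare, some pairs will look strongly related. That is the effect of searching a large dataset for patterns, not evidence of causation.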

Philip Eagle, Content Expert - STM

05 September 2016

Social Media Data: What’s the use?

Team ScienceBL is pleased to bring you #TheDataDebates - an exciting new partnership with the AHRC, the ESRC and the Alan Turing Institute. In our first event on 21st September we’re discussing social media. Join us!

Every day people around the world post a staggering 400 million tweets, upload 350 million photos to Facebook and view 4 billion videos on YouTube. Analysing this mass of data can help us understand how people think and act but there are also many potential problems. Ahead of the event, we looked into a few interesting applications of social media data.

Politically correct? 

During the 2015 General Election, experts used a technique called sentiment analysis to examine Twitter users’ reactions to the televised leadership debates1. But is this type of analysis actually useful? Some think that tweets are spontaneous and might not represent the more calculated political decision of voters.

On the other side of the pond, Obama’s election strategy in 2012 made use of social media data on an unprecedented scale2. A huge data analytics team looked at social media data for patterns in past voter characteristics and used this information to inform their marketing strategy - e.g. broadcasting TV adverts in specific slots targeted at swing voters and virtually scouring the social media networks of Obama supporters on the hunt for friends who could be persuaded to join the campaign as well. 

Image from Flickr

In this year's US election, both Hillary Clinton and Donald Trump are making the most of social media's huge reach to rally support. The Trump campaign has recently released the America First app which collects personal data and awards points for recruiting friends3. Meanwhile Democrat nominee Clinton is building on the work of Barack Obama's social media team and exploring platforms such as Pinterest and YouTube4. Only time will tell who the eventual winner will be.

Playing the market

You know how Amazon suggests items you might like based on the items you’ve browsed on their site? This is a common marketing technique that allows companies to re-advertise products to users who have shown some interest in the brand but might not have bought anything. Linking browsing history to social media comments has the potential to make this targeted marketing even more sophisticated4.

Credit where credit’s due?

Many ‘new generation’ loan companies don’t use traditional credit checks but instead gather other information on an individual - including social media data - and then decide whether to grant the loan5. Opinion is divided as to whether this new model is a good thing. On the one hand it allows people who might have been rejected by traditional checks to get credit. But critics say that people are being judged on data that they assume is private. And could this be a slippery slope to allowing other industries (e.g. insurance) to gather information in this way? Could this lead to discrimination?

Image from Flickr

What's the problem?

Despite all these applications there’s lots of discussion about the best way to analyse social media data. How can we control for biases and how do we make sure our samples are representative? There are also concerns about privacy and consent. Some social media data (like Twitter) is public and can be seen and used by anyone (subject to terms and conditions). But most Facebook data is only visible to people specified by the user. The problem is: do users always know what they are signing up for?

Image from Pixabay

Lots of big data companies are using anonymised data (where obvious identifiers like name and date of birth are removed) which can be distributed without the user's consent. But there may still be the potential for individuals to be re-identified - especially if multiple datasets are combined - and this is a major problem for many concerned with privacy.

If you are an avid social media user, a big data specialist, a privacy advocate or are simply interested in finding out more, join us on 21st September to discuss further. Tickets are available here.

Katie Howe

15 March 2016

Tunny and Colossus: Donald Michie and Bletchley Park

In honour of British Science Week Jonathan Pledge explores the work of Donald Michie, a code-breaker at Bletchley Park from 1942 to 1945. The Donald Michie papers are held at the British Library.

Donald Michie (1923-2007) was a scientist who made key contributions in the fields of cryptography, mammalian genetics and artificial intelligence (AI).

Copy of a photograph of Donald Michie taken while he was at Bletchley Park (Add MS 89072/1/5). Copyright the estate of Donald Michie/Crown Copyright.

In 1942, Michie began working at Bletchley Park in Buckinghamshire as a code-breaker under Max H. A. Newman. His role was to decrypt the German Lorenz teleprinter cypher - codenamed ‘Tunny’.

The Tunny machine was attached to a teleprinter and encoded messages via a system of two sets of five rotating wheels, named ‘psi’ and ‘chi’ by the code-breakers. The starting position of the wheels, known as a wheel pattern, was decided by a predetermined code before the operator entered the message. The encryption worked by generating an additional letter, derived from the addition of each letter generated by the psi and chi wheels to each letter of the unencrypted message entered by the operator. The addition worked by using a simple rule represented here as dots and crosses:

• + • = •

x + x = •

• + x = x

x + • = x

Therefore, using these rules, M in the teleprinter alphabet, represented as • • x x x, added to N, • • x x •, gives • • • • x, the letter T.
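
In modern terms, the addition rule above is an exclusive-or (XOR) applied to each of the five teleprinter impulses. A minimal Python sketch, using the dot-and-cross notation from this post (the function name `add_letters` is an illustrative choice, not a Bletchley Park term), reproduces the worked example and shows that adding the same letter a second time recovers the original:

```python
DOT, CROSS = "•", "x"


def add_letters(a: str, b: str) -> str:
    """Add two five-impulse teleprinter letters, impulse by impulse.

    Like impulses give a dot and unlike impulses give a cross,
    which is the rule listed above (exclusive-or in modern terms).
    """
    if len(a) != len(b):
        raise ValueError("letters must have the same number of impulses")
    return "".join(DOT if p == q else CROSS for p, q in zip(a, b))


M = "••xxx"  # M in the teleprinter alphabet, as given in the text
N = "••xx•"  # N in the teleprinter alphabet, as given in the text

T = add_letters(M, N)
print(T)                  # ••••x, the letter T
print(add_letters(T, N))  # ••xxx, adding N again gives back M
```

Because the same operation both enciphers and deciphers, anyone who could reconstruct the letters contributed by the wheels could strip them off an intercepted message by simply adding them again.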

Detail of the Lorenz machine showing the encoding wheels. Creative Commons Licence.

In order for messages to be decrypted it was necessary to know the position of the encoding wheels before the message was sent. These were first established by the use of ‘depths’. A depth occurred when the Tunny operator mistakenly repeated the same message with subtle textual differences without first resetting the encoding wheels.

A depth was first intercepted on 30 August 1941 and the encoding text was deciphered by John Tiltman. From this the working details of Tunny were established by the mathematician William Tutte without his ever having seen the machine itself, an astonishing feat. Using Tutte’s deduction the mathematician Alan Turing came up with a system for devising the wheel patterns, known as ‘Turingery’.

Turing, known today for his role in breaking the German navy’s ‘Enigma’ code, was at the time best known for his 1936 paper ‘On Computable Numbers’ in which he had theorised about a ‘Universal Turing Machine’ which today we would recognise as a computer. Turing’s ideas on ‘intelligent machines’, along with his friendship, were to have a lasting effect on Michie and his future career in AI and robotics.

Between July and October 1942, all German Tunny messages were decrypted by hand. However, changes to the way the cypher was generated meant that finding the wheel settings by hand was no longer feasible. It was again William Tutte who came up with a statistical method for finding the wheel settings, and it was the mathematician Max Newman who suggested using a machine for processing the data.

Colossus computer [c 1944]. By the end of the War there were ten such machines at Bletchley. Crown Copyright.

Initially an electronic counter dubbed ‘Heath Robinson’ was used for data processing. However, it was not until the engineer Thomas Flowers designed and built Colossus, the world’s first large scale electronic computer, that wheel patterns and therefore the messages could be decrypted at speed. Michie too, along with Jack Good, played a part, discovering a way of using Colossus to dramatically reduce the processing time for ciphered texts.

The decrypting of Tunny messages was critical in providing the Allies with information on high-level German military planning, in particular for the Battle of Kursk in 1943 and surrounding preparations for the D-Day invasion of 1944.

One of the great ironies is that much of this pioneering and critical work remained a state secret until 1996. It was only through Donald Michie’s tireless campaigning that the General Report on Tunny, written in 1945 by Michie, Jack Good and Geoffrey Timmins, was finally declassified by the British Government, providing proof of the code-breakers’ collective achievements during the War.

Pages from Donald Michie’s copy of the General Report on Tunny. (Add MS 89072/1/6). Crown Copyright.

Donald Michie at the British Library

The Donald Michie Papers at the British Library comprise three separate tranches of material gifted to the Library in 2004 and 2007. They consist of correspondence, notes, notebooks, offprints and photographs and are available to researchers through the British Library’s Explore Archives and Manuscripts catalogue at Add MS 88958, Add MS 88975 and Add MS 89072.


Jonathan Pledge: Curator of Contemporary Archives and Manuscripts, Public and Political Life

Read more about ciphers in the British Library's collections on Untold Lives

13 October 2015

‘Your Puzzle-Mate’: Ada Lovelace and Charles Babbage

On Ada Lovelace Day, Alexandra Ault explores the British Library's collection of correspondence between Ada Lovelace and Charles Babbage.                    

Did you know that the British Library holds an incredible set of letters from Ada Lovelace to Charles Babbage? Dating from 1836 to 1851, the letters from the mathematician and only daughter of Lord Byron to the inventor of the first successful automatic calculator record a working relationship and friendship between two great minds. Despite Lovelace’s young age when she began writing to Babbage, who was twenty-four years her senior, her letters reveal not only an incredible mathematical talent but an organised sensibility.

Letter from Ada Lovelace to Charles Babbage, 10 July 1843, Add MS 37192.

Add MS 37192 contains 29 letters from Lovelace to Babbage which sit with letters to Babbage from other great Victorian inventors, writers and politicians including Charles Dickens, Sir Robert Peel, Michael Faraday and Isambard Kingdom Brunel.

Looking at excerpts from letters written by Lovelace to Babbage in 1843, it is possible to see not just collaboration between the two mathematicians, but a friendship whereby Lovelace chastised and encouraged Babbage:

On 19 (?) July 1843 Lovelace wrote:

“My dear Babbage, It is quite evident to me that you have been looking over the superseded sheet 4, instead of the corrected one.”

And on the 13 July 1843:

“Will you come at mine on Saturday morning and stay as long as we find requisite. I name so early an hour because we shall have much to do I think. And it certainly must not be later than ten o’clock”. 

Ada Lovelace by William Henry Mote, after Alfred Edward Chalon, published 1839, National Portrait Gallery NPG D5123 (CC BY)

In her letters, Lovelace displays both a keen sense of humour and dedication to mathematical investigation. On 10 July 1843 she wrote:

“My Dear Babbage, I am working very hard for you; like the Devil in fact (which perhaps I am). I think you will be pleased. I have made what appears to me some very important exclusions and improvements”.

21 July (?) 1843:

“My Dear Babbage, I am in much dismay at having got into so amazing a quagmire and botheration with these numbers”.

In this letter Lovelace signs herself off as “Your puzzle-mate” showing both the professional and friendly nature of their relationship.

The British Library has featured one of the Lovelace Letters on their Treasures Page:

Alexandra Ault, Curator, Modern Archives and Manuscripts 1601-1850.

07 October 2015

The Ugly Truth

On 28th September the British Library hosted the 10th Annual Sense About Science lecture entitled "The Ugly Truth" and delivered by Sense About Science director Tracey Brown. The British Library's mission is to make our intellectual heritage accessible to everyone for research, inspiration and enjoyment. This key purpose aligns with that of Sense About Science who are making research accessible by equipping people to make sense of science and evidence. In this guest post, Voice of Young Science member Sheena Cowell summarizes the lecture highlights.

Towards the end of my PhD I was often asked by interested friends and family “So, what have you found out then?” I knew this question was innocent enough, but in the complexity of my project and the stress of trying to write up, I would often revert to something along the lines of “we had this nice idea, but in the end it didn’t quite work”. This was not the truth. I was distilling my results, removing the nuances of my research and giving an answer that was simpler, easier. Science rarely has definitive answers. Scientists spend their days finding evidence to support or disprove arguments and hypotheses within their fields. Uncertainty is accepted. Probabilities and error bars are scrutinised alongside results. But, when it comes to explaining a body of scientific work to a wider audience, this uncertainty is often left out. Evidence is simplified. Results and outcomes are over or understated in order to get a point across. But what harm does this do?

On Monday 28th September at the Sense About Science Annual Lecture, Tracey Brown gave a talk exploring just that: the difficulty of telling the whole ‘truth’ or challenging ‘truths’ in the public arena. As scientists or even as advocates of evidence, we can sometimes alter the evidential ‘truth’ in favour of a simplified explanation or an uncomplicated argument. However, in her talk, Tracey argued that evidence should be presented warts and all, including the uncertainty and unknowns that it can expose. “The Ugly Truth” explored the concept that the oversimplification of evidence and the lack of critical scrutiny of established claims can be detrimental to public accountability and to the scientific community itself.

At the beginning of her lecture, Tracey Brown quoted Henning Mankell’s book ‘The White Lioness’:

“The truth is complicated, multi-faceted, contradictory. On the other hand, lies are black and white.”

This quote, to me, sums up the messy nature of scientific ‘truths’. We do not live in a world of black and white, but one of endless shades of grey, where what we know as ‘true’ is constantly changing as science advances and technology evolves.

Tracey explored the many reasons that evidence can be overstated or uncertainties ignored. Often the truth can be difficult. If we look for instance at clinics offering miracle cures for cancer as Tracey did in her talk, we can see that the evidence for these ‘cures’ may be limited. In reality however, it is hard to question these ‘cures’ and destroy the hope they can provide. Other times it may appear in the public’s interest to simplify the evidence to make a point. This is often the case for many public health campaigns. Who cares about the evidence if the outcome is positive? For example the ‘5 A DAY’ campaign, where numbers touted may vary from country to country, but we can all agree that eating more fruit and vegetables is a good thing. And finally, it may be that a ‘fact’ or claim is so well established we don’t even think to question it, or put it under critical scrutiny.

Tracey Brown. Photo: Richard Lakos

While these reasons can be compelling, they can become problematic. If uncertainty and accountability for evidence are not present at every level of public life, how can we introduce them in more nuanced scientific areas? By denying people the opportunity to understand scientific uncertainty, we can become trapped by our oversimplifications. We are left with the fear that uncertainty will be misused by critics and we begin to dread the question “But, are you sure?”

In the end Tracey’s argument comes down to mutual trust. The public needs to be trusted with uncertainty. As a scientific community we must be trustworthy and present the uncertainty that accompanies our work. We need to give the public the tools to ask for and demand evidence and accountability. There will be missteps and misunderstandings along the way. Opinion and motive will always find a way to clash with evidence. But by promoting the true nature of scientific evidence, people will be free to make fully informed decisions in a world where evidence and accountability cannot be ignored.

To listen to Tracey Brown’s talk in full (without any oversimplifications) visit the Guardian website or download the podcast here. To learn more about Sense About Science, or get involved in their Ask for Evidence campaign visit

Sheena Cowell recently completed her PhD at Imperial College London in Medicinal Chemistry and Cancer Imaging. Sheena is a member of Voice of Young Science, a programme to encourage early career researchers to play an active role in public debates about science.

Sense About Science is a charity that works with scientists and members of the public to change public debates and to equip people to make sense of science and evidence.

24 September 2015

A novel use of PhD data: Investigating the state of the Dementia Workforce

Katie Howe explains how data from the British Library’s electronic thesis service EThOS has been used in a report into the state of dementia research in the UK.

EThOS is the British Library’s electronic theses service. By working with universities across the UK EThOS is able to provide records for over 400,000 UK PhD theses going back as far as the 19th century. For 165,000 of these PhD theses it is also possible to access a full text version of the document. A key feature of EThOS is that you don’t have to come to the BL to use it - in fact it is accessible from anywhere in the world.

In previous blog posts we have described how EThOS could be a valuable resource for scientific researchers (see here and here). However, as an extensive source of information on PhDs undertaken in the UK, EThOS data can also be used to look at trends in PhD research over time. A recent report by the Alzheimer’s Society illustrates this approach.

Graph

The Alzheimer’s Society appointed RAND Europe to produce a report on the state of dementia research in the UK. RAND wished to investigate the dementia workforce pipeline - how many researchers are working on dementia and how this is changing over time. As EThOS contains records for a high (and growing) proportion of recent PhD theses, RAND contacted the EThOS team to ask for their help with this investigation. EThOS Metadata Manager Heather Rosie and her colleagues undertook bespoke analysis for RAND and produced a list of theses awarded from 1970 onwards. The graph above shows the results. Dementia-related PhD research has been steadily increasing over the last 30 years. However, cancer-related PhDs have skyrocketed over the same time frame. Now five times more PhD researchers choose to work on cancer than dementia.

RAND were also interested in what proportion of PhD students studying dementia stay in the field. To investigate this they traced about 1500 dementia PhD researchers to find out about their careers since finishing their PhDs. The results show that, of those who do complete a PhD in dementia, retention in the field is poor, with 70% leaving the field within four years. Only 21% are still researching dementia. (The results are summarised on the infographic opposite, a full version of which can be seen here.)

The researchers gave a number of reasons for leaving the field of dementia but amongst the most common was a concern over the increasing competition for senior faculty positions. This is not a problem unique to dementia research but spans all of academia. This is a familiar issue for us in team ScienceBL and a previous series of blog posts outlines some alternative career options for those undertaking biomedical PhDs (here and here).

As well as being a great source of detailed information for scientific researchers, PhD theses accessed through EThOS can be used to find out about individual researchers or to help students structure their own PhD thesis. This report shows another novel use of PhD data enabled by the size and national scope of the EThOS resource. The full report can be seen here.

Katie Howe
