Sound and vision blog

Sound and moving images from the British Library

23 April 2018

The Evolving English collection – what’s in it?

PhD placement student Rowan Campbell writes:

As of 3rd April 2018 – which is, incidentally, seven years after the closing day of the exhibition – the Evolving English VoiceBank has reached 7,914 catalogued items. The last 2,100 of these have been accessioned by Andrew Booth and Rowan Campbell as part of their three-month PhD placements. While there are many records still to be catalogued, today is English Language Day, so it seemed like an opportune moment to sketch out an overview of what we have in the collection and who is represented in it.

Visitors to the Library’s Evolving English exhibition in 2010/11 could record themselves reading the children’s book Mr Tickle (© Hargreaves, 1971) or donating a dialect word or phrase to the WordBank – and we now have 5,471 recordings of Mr Tickle, and 2,796 WordBank contributions catalogued. 1,462 visitors did both; 842 simply gave us their personal information such as location and year of birth; and some recorded themselves multiple times – perhaps they remembered new words, or decided that they did want to read Mr Tickle after all.

Our oldest speaker was born in 1914 and the youngest in 2006 – meaning that the age of participants ranges from 5 to 97! Interestingly, the gender of our contributors is heavily skewed towards female (65%). This may be in line with the gender split of those who are interested in linguistics or who visit British Library exhibitions (for example, the VoiceBank’s @VoicesofEnglish Twitter followers are 61% female), but it is still an unexpectedly large bias.


As would be expected, most participants were from the British Isles – that is, England, Scotland, Wales, the Isle of Man, the Channel Islands, Northern Ireland and Ireland. However, nearly 25% were from outside the British Isles, with 87 other countries represented! The twenty least represented countries had only one speaker each, and include Guyana, an English-speaking country in South America with a population about the size of Leeds.

Top 20 countries

World cities

The United States had the biggest representation, making up 44% of international contributions, but we are sadly lacking voices from five states – Idaho, Maine, Nevada, North Dakota and Wyoming. If you are from one of these places and want to record a contribution for us, please get in touch!

Unsurprisingly, given the locations of the recording booths, England was the most represented region of the British Isles, making up 91% of contributions from the British Isles. Received Pronunciation (RP) speakers (mainly from the British Isles but some from other countries) make up 25% of the collection overall, and are proportionately at their highest in Wales (40%) and lowest in the Republic of Ireland (1%).

Pie chart

In terms of representation within the British Isles, England is very well covered, with speakers from every county except Rutland (the heat map shows no data around the Stockton-on-Tees area due to different regional classifications, but we do have a number of speakers from there). As can be seen, Scotland and Wales have patchier representation, but they also have far fewer contributors in general than England: around 250 and 100 respectively, compared to the 5,400 from England.

Heat map

There are also some surprises in the most-represented cities. The table below shows the top 16 British and Irish cities in the collection, with at least 20 contributions each – numbers in brackets refer to the city’s ranking in terms of population size*.

British and Irish cities

Immediately noticeable is the higher occurrence of Northern cities such as Liverpool, Newcastle, Nottingham, Hull and Derby, and the non-appearance of large cities such as Bristol and Cardiff, the 6th and 11th most populous cities respectively. The first explanation for this is likely to be simply that there are fewer large cities in the South – in fact, only Bristol and Cardiff are in the top 20 by population at all. A second explanation could be that there were recording booths in some other locations outside London – Norfolk, Birmingham, Plymouth, Newcastle and Liverpool.

However, this does not explain the large difference in ranking of the Northern cities that did not have a recording booth. Instead, dialect levelling might be a concept to consider. Due to factors such as geographical proximity, greater mobility and fewer major accent differences between South West England, South East Wales and the South East and Greater London area, we might expect these areas to be more susceptible to dialect levelling towards RP. This has the potential to over-represent RP in these areas and thus obscure the location of contributors: while someone with an RP accent may have been ‘born and bred’ in Devon, their accent would be categorised as RP rather than Devon. Conversely, phonetic, geographical and social factors such as covert prestige and strong regional identity mean that fewer Northerners orientate to the South East and thus to RP – which could help to explain why Northern cities have climbed the rankings in our dataset relative to their actual population.

*It has not always been possible to be consistent regarding whether figures used are for greater metropolitan areas, urban areas, etc., as these are not always comparable, but this ranking has been arrived at based on the distinctions made in the collection categorisation system. This is why we have Greater London and Greater Manchester, but not West Yorkshire (Leeds-Bradford), as this would require merging two cities.
