Digital scholarship blog

16 July 2018

Crowdsourcing comedy: date and genre results from In the Spotlight

Beatrice Ashton-Lelliott is a PhD researcher at the University of Portsmouth studying the presentation of nineteenth-century magicians in biographies, literature, and the popular press. She is currently a research placement student on the British Library’s In the Spotlight project, cleaning and contextualising the crowdsourced playbills data. She can be found on Twitter at @beeashlell and you can join the In the Spotlight project at

In this blog post I discuss the data created so far by In the Spotlight volunteers via crowdsourcing – which has already thrown up quite a few surprises along the way! All of the data I discuss was cleaned using OpenRefine, with some manual intervention by me to group categories such as genre. This first post highlights the most notable results to come out of the date and genre tasks so far, and a second post will present similar findings for play titles and playwrights.


I started off by analysing the dates generated by the projects as, to be honest, it seemed easiest! One of the problems we’ve encountered with the date tasks, however, is that a number of the playbills do not show a full date. This is notable in itself but unsurprising – why would a playbill in the eighteenth or nineteenth century need a full date when it wasn’t expected to last two hundred years into the future? With that in mind, this is by no means an exhaustive data set.

After creating a simple graph of the most popular dates, it became clear that we had a huge spike in the number of performances in 1825. Was something relevant to theatre history happening during this year, or were the sources of the playbill collections just unusually proactive in 1825 after taking some time off? Was the paper stock quality better, so more playbills have lasted? The outside influence of the original collector or owner of these playbills is also something to consider: maybe he was more interested in one type of performance than others, or had more time to collect playbills in certain years or in certain places. A final potential factor is that this data comes only from the volumes added to the site projects so far, and so isn’t indicative of the Library’s playbills as a whole.

Aside from source or collector influence, some other possible explanations do present themselves. Britain in general was growing rapidly, with London in particular becoming one of the biggest cities in the world, and this era also saw the birth of railways and the extravagant influence of figures such as George IV. As this is coming off the back of what seems to be a very slow year in 1824, however, perhaps it is best just to chalk this up to the activity of the collectors. We also have another noticeable spike in 1829, but by no means as dramatic as that of 1825. I’ve spent a bit of time comparing the number of performances seen in the volumes with other online performance date tools, such as UMass's Adelphi calendar and Godwin’s Diary, but would love to hear any further insights into this!

A graph showing the most popular performance dates


The main issue I faced in working with the genre data was the wide variety of descriptors used on the playbills themselves. For instance, I encountered burlesque, burletta and burlesque burletta – which of the first two categories should the last one go under? When I went back to the playbills themselves, it was also clear that many of the ‘genres’ generated were more like comments from theatre managers or just descriptions, e.g. ‘an amusing sketch’. With this in mind, genre was the dataset which I ‘interfered’ with the most from a cleaning point of view.

Some of the calls I made were to group anything cited as ‘dramatic ___’ with drama more widely, unless it had a notable second qualifier, such as pantomime, romance or sketch. I also grouped anything mentioning ‘historical’ together, as from a research point of view this is probably the most prominent aspect; grouped harlequinades with pantomimes (although I know this might be controversial!); and grouped anything which involved a large organisation, such as military, Masonic or national performances, under ‘organisational’. Some were difficult to separate – I did wonder about grouping variety and vaudeville together, but as there were so few of each it seemed better to leave them be.
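For anyone who wants to try similar grouping on their own export of the data, the calls above could be sketched in Python roughly as follows. This is only an illustration and not the process actually used, which was OpenRefine plus manual grouping. The function name `group_genre`, the order in which the rules are checked, the choice to file a ‘dramatic ___’ entry under its second qualifier, and the sample descriptors are all assumptions made up for the example.

```python
from collections import Counter


def group_genre(raw: str) -> str:
    """Map a transcribed genre descriptor to a broader category (hypothetical helper)."""
    # Normalise case and collapse stray whitespace before matching.
    g = " ".join(raw.lower().split())

    # Anything mentioning 'historical' is grouped together.
    if "historical" in g:
        return "historical"
    # Harlequinades are counted with pantomimes.
    if "harlequinade" in g or "pantomime" in g:
        return "pantomime"
    # Military, Masonic or national performances go under 'organisational'.
    if any(word in g for word in ("military", "masonic", "national")):
        return "organisational"
    # 'dramatic ___' joins drama unless it has a notable second qualifier.
    # Assumption: such entries are filed under the qualifier itself.
    if g.startswith("dramatic"):
        for qualifier in ("romance", "sketch"):
            if qualifier in g:
                return qualifier
        return "drama"
    # Everything else (farce, comedy, variety, vaudeville...) is left as it is.
    return g


# Made-up descriptors for illustration only, not real project data.
sample = [
    "Dramatic Romance",
    "dramatic piece",
    "Harlequinade",
    "Historical Drama",
    "Grand Masonic Performance",
    "Farce",
    "farce ",
]
counts = Counter(group_genre(s) for s in sample)
```

Checking ‘historical’ first means a ‘historical drama’ is counted as historical rather than drama, which matches the reasoning above that this is probably the most prominent aspect for researchers. A different rule order would give different totals.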

With these qualifications in mind, by far the most popular genre in the collections was farce, which I kept distinct from comedy, clocking up 537 performances from the projects. This was closely followed by comedy more generally with 527 performances, with drama (197), melodrama (150) and tragedy (135) trailing behind. Once again, it could purely be that the original collectors of these volumes had more of a taste for comedy than drama, but there is such a wide gap in popularity from the volumes so far that it seems fair to conclude that the regional theatre-going public of the late eighteenth and early nineteenth centuries preferred to be cheered rather than saddened by their entertainment.

A graph showing the most popular genres in records transcribed to date

You can contribute to this research

The more contributions we receive, the more accurate the title, genre and date results will be, so whether you’re looking out for your local theatre or interested in the more unusual performances which crop up, get involved with the project today. In the Spotlight is well on the way to hitting 100,000 contributions – make sure that you’re one of them!