The Newsroom blog

31 January 2019

The anatomy of news

“I hear new news every day”, wrote the scholar Robert Burton in 1628, “and those ordinary rumours of war, plagues, fires, inundations, thefts, murders, massacres, meteors, comets, spectrums, prodigies, apparitions, of towns taken, cities besieged in France, Germany, Turkey, Poland, daily musters and preparations, and such like.” For Burton, this firehose of news amounted to a “vast confusion”, though his attitude seems to have been one of wonder rather than fear.

Burton was an Oxford man, but made regular trips to London. There he would have paid a visit to the Exchange, gathering up news and gossip from the merchants crowding the surrounding streets, before moving on to St. Paul’s Churchyard, perhaps stopping to buy a pamphlet from a hawker on the way. In front of the Cathedral he might have picked up some more pamphlets from the many booksellers lining the border of its square, or a copy of Nathaniel Butter and Nicholas Bourne’s new news publication, an innovative weekly format copied from the continent, although, somewhat disappointingly, it wouldn’t have contained any domestic news.

This short walk helps us understand how Burton perceived a world of overwhelming information. But what would he have made of the 21st century? Indeed, what would he have made of the 19th? Had he been writing, say, 250 years later, in 1872, Burton would surely have been overwhelmed by the number of titles available to him on a daily basis.


A late-seventeenth-century London coffee house (Usage terms: Creative Commons Attribution Non Commercial Share Alike licence. Held by © Trustees of the British Museum)

The 19th century is a new world for me, coming from a background of 17th century newspapers. And it is a different world. There’s the name, for one thing: the Oxford English Dictionary records the first use of the word ‘newspaper’, to mean a publication of regular, periodical news, in 1688. My own work is on the first half of the 17th century, when the word ‘news-book’ was most common, as was a host of words and phrases like ‘coranto’, ‘weekly news-sheet’, ‘weekly pamphlet’ and ‘Mercuries’, with overlapping, shifting and slightly different meanings.

This naming change can be useful – it helps us to grasp the real intellectual and material differences between the news world of the 17th century and that of the 19th. Although the change was gradual and not always linear – changes and innovations often moved backwards as well as forwards – the march of progress did eventually pick up pace. 17th century news looked very different, much like a few sheets of A4 paper folded in half, with news in a single column. It was called a news-book because it looked like a small book. The way information was organised was different, too: early 17th century news-books contained a series of paragraphs each from a particular place, recording all the news collected from that place. The invention of the ‘article’, a unit of news based on one particular subject or event, was not to happen for some time.


The evolution from one to eight columns

This categorical divide also continues with the data. I estimate there are 1,000,000 words in Early English Books Online’s entire periodicals collection. The British Library’s collection of 19th century news runs to hundreds of millions of pages (we wrote recently that the collection consists of 60 million issues, 450 million pages... perhaps four trillion words... twenty-six trillion characters…). The other seismic change is that a computer can be taught to read (with varying accuracy) 19th century news. For the 17th, it’s still very difficult.

This Optical Character Recognition is what allows me to load up the British Newspaper Archive and check if my great-great-granddad committed any crimes in 1839 (still can’t find anything), for example, or check Limerick hurling scores from 1887. This difference isn’t just trivial: it represents a complete step-change in the way we approach newspaper history. For one thing, the datasets increase in size, by orders of magnitude. I have created a dataset of about 15,000 rows, manually collected, by reading 17th century news and noting down bits of information in a spreadsheet. 15,000 rows, from about 400 newspaper issues, which took many months to create. Yesterday, in a few hours, I created a dataset of N-Grams (basically combinations of words) from a single issue of one 19th century title. It contained 150,000 rows.
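
For readers curious what such a dataset looks like, here is a minimal sketch in Python of how a table of N-Grams might be built from plain OCR text. It is an illustration only, not the code used for the dataset described above: the function names (`tokenize`, `ngrams`, `ngram_rows`), the simple tokenising rule and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_rows(text, max_n=3):
    """Count all 1- to max_n-grams; each distinct N-Gram is one 'row'."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


# Hypothetical sample text standing in for one issue's OCR output.
sample = "the news of the day is the news of the town"
rows = ngram_rows(sample, max_n=2)

# Print the most frequent N-Grams alongside their counts.
for gram, count in rows.most_common(5):
    print(" ".join(gram), count)
```

Even this toy sentence of eleven words produces more rows than it has words, because every distinct single word and every distinct pair of adjacent words gets a row of its own. That is why a single issue can generate a table far larger than a hand-built spreadsheet.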

150,000 rows of generated data, from one issue. Multiply that by about 250 for a weekday title, then by hundreds of titles, then by 200 years and the potential for ‘big data’ is rather astonishing. Of course, this data is not as rich with information as my humble spreadsheet, nor does it record any kind of fine-grained detail, but it does change the types of processing, computing power and storage needed, and most importantly, the types of intellectual questions that are and are not answerable. My 17th century dataset is like interviewing everyone in a small town, in some detail; the 19th century datasets we’ll be working with on our Heritage Made Digital newspapers project record the cosmos – albeit from far away. We don’t know much, but we know it about an enormous number of things. But the differences extend past volume: there is also a step-change in readership and scope.

The 19th century newspaper was everywhere. Some of the most popular 17th century newsbooks were probably printed in weekly runs of about 2,000; by 1863, the Daily Telegraph had a circulation of 120,000 per day. In 1628 Burton was overwhelmed by information in London and Oxford but elsewhere the firehose could be a drip, or a drought. By the 19th century news surged through the country’s arteries, veins and capillaries: at first everywhere within the reach of the train; eventually everywhere within the reach of the telegraph, information finally travelling at the speed of light, in dots and dashes. It was the most pervasive cultural object of the century.


Newspaper titles held by the British Library, year by year, 1621-1900

Even accounting for the reuse and sharing of copies, this is a fundamentally different type of cultural artefact. If I analyse every page of news in the early 17th century, I have a vast record of events, and the thoughts and feelings of a select group of people. In the 19th century, the newspaper is a reasonable proxy for the way society thinks. To me it seems as though news in the 19th century captures a good proportion of a collective consciousness. It is a reasonable (though problematic) way to infer societal change. Through the newspaper’s great reach we can understand historical forces. The articles and personalities in the 19th century newspaper can tell us about structures of power. Its advertisements identify trends, economic forces and the changing roles within the family. The words themselves and their frequencies can help us understand the use of language, or uncover drifts in sentiments towards political movements, ideologies and so forth. In the 17th century the readership is so small, and news such a small part of the diet of information ingested by both important and ordinary people, that the questions we ask of its remains are different. Not less important, certainly not less interesting, but surely of a different kind.

Yes, the 19th century news world feels like a different one to the 17th. A mostly new world, with some evidence of the ruins of its earlier civilisation: the old towers are fallen, though echoes of their presence remain. The vast confusion had been replaced with one infinitely greater. Our job is to find, research and understand the new techniques that are necessary to make sense of this information overload.

Yann Ryan

Curator, Newspaper Data