Digital scholarship blog

Enabling innovative research with British Library digital collections

30 October 2013

Guess the journal!

Over recent months I’ve been working on-and-off with a collection of metadata relating to articles published since 1995 in journals the library have categorised under the ‘History’ subject heading. That is 382497 rows of data (under CC0) about publication habits in the historical profession, which lend themselves to some interesting analysis and visualisation.

To recap from previous posts on this blog and on another, I started this work by extracting words which frequently occurred within journal article titles. Having filtered out words whose meaning was fuzzy (‘new’, ‘early’, ‘late’, ‘age’) or whose presence was not helpful (‘David’), I was left with this list of topwords (I’ve avoided ‘keywords’, I just don’t like the word at the moment):

africa america archaeology art britain british china chinese cultural culture development empire england europe france historical history identity life making medieval national policy political politics power revolution social society state study women world

Next I created a .csv where each row represented an occurrence of one of these 33 topwords in an article title. This totalled 209210 rows. Although that is fewer than the total number of rows in the original data, many titles contained more than one of these words, so some articles are represented more than once.
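
A rough sketch of how such an occurrence table might be built is below. The column names, the `occurrence_rows` helper and the two sample articles are hypothetical, and it assumes a topword is counted once per title even if it appears twice, which the post does not specify.

```python
import csv
import io
import re

# The 33 topwords listed in the post.
TOPWORDS = set("""
africa america archaeology art britain british china chinese cultural
culture development empire england europe france historical history
identity life making medieval national policy political politics power
revolution social society state study women world
""".split())

def occurrence_rows(articles, topwords=TOPWORDS):
    """Yield one (journal, year, title, topword) row per distinct topword in a title."""
    for journal, year, title in articles:
        words = set(re.findall(r"[a-z]+", title.lower()))
        for word in sorted(words & topwords):
            yield journal, year, title, word

# Invented articles, not rows from the real data.
articles = [
    ("Journal A", 1999, "Women and Power in Medieval England"),
    ("Journal B", 2004, "A Note on Shipbuilding"),
]
rows = list(occurrence_rows(articles))

# Write to an in-memory buffer; a real run would open a .csv file instead.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["journal", "year", "title", "topword"])
writer.writerows(rows)
print(buffer.getvalue())
```

The sample shows both effects at once: the first title produces several rows, while the second contains no topword and drops out entirely.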

Before we get to the fun bit, there are a number of problems with the data that need pointing out:

  • There are some odd gaps and declines in article volume for some journals around 2005. This isn’t due to actual publication trends, so we are working on why the data isn’t accurate – huge thanks to the Metadata Services team (especially Corine Deliot) for their hard work.
  • The volume of English language titles smothers the various non-English titles (Italian and, notably thanks to Zeitschrift für Geschichtswissenschaft, German), leaving us with very Anglophonic data. I’d like to do some translating, but for now I’ll restrict myself to trends in English language articles.
  • The data isn’t smoothed by articles per journal issue (or articles per journal per year), thus ‘power’ journals are created on sheer volume of output alone (and, as we all should know and should hope to be the mantra of future academic publication, less can be more…).
  • The data includes reviews, though this isn’t necessarily a bad thing as it adds book titles to the list of titles mined (hence why ‘David’ is one of the unfiltered topwords).
  • Some words have multiple meanings (china) or are ill-suited to simple text mining (art), but then corpus linguists have known this for years.
  • Some journals in the data are not really history journals, but rather politics and current affairs publications with a sprinkling of historical content. Archaeology is similarly problematic, but I’ve left these journals in for now out of a sense of GLAM solidarity.
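
To illustrate the smoothing caveat in the list above, one possible normalisation is to divide each journal's topword count by the number of articles that journal published. The sketch below is only that: the function name `topword_share`, the journal names and all the figures are invented for the example, and this normalisation has not been applied to the data behind the graph.

```python
from collections import Counter

def topword_share(occurrences, articles_per_journal):
    """Map (journal, topword) to the fraction of that journal's articles using the topword."""
    counts = Counter(occurrences)
    return {
        (journal, word): n / articles_per_journal[journal]
        for (journal, word), n in counts.items()
    }

# Invented figures: a large journal and a small one.
occurrences = [("Journal A", "power")] * 100 + [("Journal B", "power")] * 20
articles_per_journal = {"Journal A": 1000, "Journal B": 50}

shares = topword_share(occurrences, articles_per_journal)
for key, share in sorted(shares.items()):
    print(key, share)
```

On raw counts the large journal dominates (100 against 20), but as a share of output the small journal uses the word four times as often, which is the distortion the caveat describes.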

Despite all of this, I’d like you to play a game of guess the journal from a network graph: one representing data for the 30 highest-ranking English language History journals in terms of article volume published between 1995 and 2013. On the one hand, your doing this will help me validate that my data – and the particular way I’ve chosen to represent it (a force-directed ‘Force Atlas’ graph generated using Gephi) – has some value; Adam Crymble has a nice example of how this can be useful. On the other, it should be a bit of fun.

So, onto that long promised fun bit. Knowing the following:

  • That each number on the network represents a journal name,
  • that each word within square brackets is a topword from an article title,
  • that the thickness of the line between the word and the number represents the occurrence of that topword in the numbered journal,
  • and that the colouring represents the group (or modularity) the numbered journal has been assigned to based on the structure of the network;

can you guess which number each of the following journals is represented by? (Or is this whole thing meaningless?)

  • Antiquity
  • English Historical Review
  • International Journal of African Historical Studies
  • International Journal of Maritime History
  • Journal of American History
  • Journal of Asian Studies
  • Journal of Social History

Bimodal Force Atlas graph for History Journal Articles published 1995-2013. For more detail (and with apologies for the fuzzy compression above, you'll probably need it!), download the PNG or SVG version.
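
For anyone wanting to try something similar, the weighted journal-to-topword links behind a bimodal graph like this can be prepared as a simple edge list. The sketch below is an assumption about one workable approach, not the exact process used here: the journal numbers and counts are invented, and the `Source`, `Target` and `Weight` column names follow the convention I understand Gephi's spreadsheet import to expect, so check them against your Gephi version. The layout and modularity colouring would then be done inside Gephi itself.

```python
import csv
import io
from collections import Counter

def edge_weights(occurrences):
    """Count how often each (journal number, topword) pair occurs."""
    return Counter(occurrences)

def write_edges(weights, handle):
    """Write one weighted edge per journal-topword pair as CSV."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    for (journal_id, word), weight in sorted(weights.items()):
        # Topwords go in square brackets, as on the graph.
        writer.writerow([journal_id, f"[{word}]", weight])

# Invented occurrences: journal 1 uses 'power' three times, and so on.
occurrences = [(1, "power")] * 3 + [(1, "women")] + [(2, "power")] * 2
weights = edge_weights(occurrences)

buffer = io.StringIO()  # a real run would write to a .csv file
write_edges(weights, buffer)
print(buffer.getvalue())
```

Each row becomes one line on the graph, and the weight is what drives the line thickness between a numbered journal and a bracketed topword.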

To start you off, I’ll gift you that American Historical Review is number 34 – right at the heart of the network, which is not surprising given the volume of output. I’ll also give you a little derived data to help you make up your mind.

Answers in the comments please!



Your Dropbox csv file link does not work properly, it seems.

Apologies. Should be fixed now.

John Theibault has guessed three correctly. These are:

68 is International Journal of African Historical Studies
45 is Journal of American History
20 is Antiquity

Of his other guesses, the correct journals are:
82 is French Studies
241 is China Quarterly
23 is Albion
76 is International History Review

So, four more to get. Keep guessing!

My guesses:

22 Antiquity
23 English Historical Review
163 International Journal of African Historical Studies
227 International Journal of Maritime History
83 Journal of American History
241 Journal of Asian Studies
91 Journal of Social History

Adam, thanks for guessing!

So, going through your guesses in turn:

22 Antiquity
Incorrect. 22 is Journal of Archaeological Science. Antiquity is 20 (see comment above).

23 English Historical Review
Incorrect. 23 is Albion (see comment above)

163 International Journal of African Historical Studies
Incorrect. 123 is Historiens et Géographes. International Journal of African Historical Studies is 68 (see comment above).

227 International Journal of Maritime History
Incorrect. 227 is Historian.

83 Journal of American History
Incorrect. 83 is Hispanic American Historical Review. Journal of American History is 45 (see above).

241 Journal of Asian Studies
Incorrect. 241 is China Quarterly (see comment above).

91 Journal of Social History
Incorrect. 91 is one of the four titles yet to be guessed correctly...

So unlucky, but close. I've created a new blob of derived data based on all the guesses so far. So, only four left to guess!
