09 November 2018
C M Taylor on ‘keystroke logging project’ with British Library
A guest blog by Craig Taylor, whose latest novel, Staying On, is published by Duckworth in 2018. In 2014 he began a project with the British Library to document the creative process of writing the book, using key-logging software. You can reach Craig on Twitter at @CMTaylorStory.
Re-entering the academic world after starting work as an Associate Lecturer on the Publishing degree at Oxford Brookes University, I began speculating about writers’ archives. Previous scholars had access to hand-written and typed drafts of works in progress, actual objects showing the shaping of works of art. But with the normalisation of computerized authorship, had these discrete drafts been abolished in the rolling palimpsest of write and digital re-write?
Plus, I was considering a new novel myself, but as I have written elsewhere, emotionally I was daunted by the long-haul loneliness of novel writing, a process I considered in my most despairing moments as like wallpapering a dungeon.
I spoke to my friend Mark about these two things - the lost drafts and the loneliness - and in a flash he had the answer: ‘Put a piece of malware on it.’
He meant that if I put some malware, or spyware, on my computer to note everything I did, it would record all changes made to an evolving manuscript, plus it might offer a weird kind of company for me in my wallpapered dungeon.
It was worth a shot.
I contacted the digital curation team at the British Library in April 2013 and they could not have been more transparent, accessible and curious. We started talking about how digital production intersected with the scholarly recovery of the creation of works of art, and it turned out that my first view of things was off. Forensic curatorial techniques for salvaging the development of a manuscript on a hard drive did exist. It was just that they could not often be used, due to issues of privacy. How could you go into a writer's hard drive if they were writing and receiving email from multiple others from the same computer they were writing on, and writing on topics that might be of a personal sensitivity to one or more of the correspondents? Without complex legal initiatives and sensitive multiple consent, you just couldn’t.
But a simple solution was available. To save us from running into privacy issues, I would just buy a separate machine on which I wrote only the novel. I’m not the world’s richest guy, so I bought a pretty basic reconditioned laptop. After all, I was only going to write prose.
The reconditioned keylogging laptop on my writing desk at home.
We negotiated a contract where (to put it crudely) the data was the British Library’s but the resultant book was mine, and then we looked round for some software. The curation team found a piece of keylogging software called Spector Pro, about which Jonathan Pledge, a curator of contemporary archives at the British Library, has recently written:
"The software used for capturing the writing process on the Craig Taylor project was the keylogging software, Spector Pro produced by SpectorSoft. In 2015 the company was rebranded as Veratio; Spector Pro is no longer part of the product range and is no longer supported. Spector Pro works with Windows variants from Windows XP to Windows 7.
After installation on a host computer, Spector Pro works by running undetected as a background application and cannot be accessed via the normal Windows user interface (it is not visible in the Applications folder). Access to the programme is by a default keyboard combination Control-Alt-Shift which brings up a password dialog box. The password is set by whoever installs the programme.
As keylogging software Spector Pro is not terribly sophisticated and seems to have been specifically designed for low-level company surveillance of employees, potentially without their knowledge. It is possible to run Spector Pro as a visible programme but this would seem to negate its original stated purpose.
Spector Pro can track and record chat conversations (as transcripts), emails (sent and received), websites visited and, most importantly for this project, keystrokes made, not only what has been typed within an application; but mouse and keystroke usage across the whole computer system."
The software was installed on my empty computer and I set to work.
But what had I done? I’d offered myself as a guinea pig, with my every wrong-turn, reappraisal, edit and mistake noted, recoverable and time and date stamped. Not only that but the novel proved punishingly hard to write. It wasn’t just that I was also writing a film script and an app, plus working as an editor of fiction and a university lecturer, and it wasn’t just that one of my young daughters was often to be found perched on my desk asking me questions, it was also the content of the book. I was aiming for a clarity of prose and of story, and for a universal relatability of protagonist, that I had never sought before.
The going was slow, but when I got the chance, and when I had a chunk of work, I would arrange to come in to the British Library to download the data. I visited on eight separate occasions. My first visit was in October 2014, and my last was in March 2018. By the time we had finished we had generated 222GB of data, captured across 108,318 files.
So, what exactly do we have?
We have information on every keystroke typed:
This shot shows the raw data usage as a list. By far the largest number of keystrokes concerns writing/typing as well as work on editing (Find & Replace) with the remainder comprising system activity including backups.
Plus, we have thousands of screenshots, one captured every few seconds whenever activity on the host computer is detected, from the moment the computer is logged into until the moment it is shut down. The screenshots can be output either as still images (.jpg or .bmp) or as black and white video (.avi).
And we have text outputs:
Text output from ‘Keystrokes Typed’ for a single tracked session. As seen from the detail below, the header provides information on the application used, the start of activity and the title of the file being worked on. The greyed text represents the tracked movements, with typed words rendered in bold. Time stamps are given, with green text signalling the start of activity and red the end.
During the writing I had no access to the software on my computer and I had no clear sense of the data being produced. But while I never knew what it was doing, it actually did help me begin again with novel writing, to get over that initial hump in the road. Somehow the writing felt collaborative, not only because the software was recording me, but also because of the digital curation team who were taking the data.
I have been asked if knowing that the work was being recorded made me self-conscious, and, sure, at first I was minding my Ps and Qs a bit, trying to seem like a more competent writer. But that didn’t last. Soon I realised that I quite wanted mistakes to show. It seemed an act of solidarity with the writers I was teaching, to really show them what I had often told them: that writing is born from repetition, that every writer has blind spots – weak theme, two-dimensional characters, flimsy plotting – and that only re-writing cures these ills. It seemed like honesty to uncover the tottering beginnings of what most people would only consume as the solid, finished article.
Not only that. I forgot about the keylogging software recording my every character because of the story itself. I wrote earlier that it was a difficult novel to write, because I aimed to write as simply and truthfully and compassionately as I was able, aims I found to be not as readily available to me as I would have flattered myself to hope. I forgot about the keylogging going on as I wrote because the difficult writing became immersive – as I hope the reading of it will be – because my story and my characters – Tony and Laney, Jo and Nick – absorbed me, and in the end it was their story that cured me of my wallpapered dungeon, the keylogging project being the booster to get the journey started.
And so now, what are we going to do with the data? Well, I’m not going to do anything with it; I don’t have the skills. The data is now placed in the public domain, under a Creative Commons BY license, running free at: https://data.bl.uk/cmtaylorkeylogging/. So, if you are a scholar of digital humanities, or a digital artist or a creative visualizer, be our guest. The data is there to be played with. It would be lovely to know what you do with it.
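For anyone wondering where to start, here is one possible first step: a minimal Python sketch that tallies a downloaded copy of the data by file type, which is a quick way to see how much of it is screenshots, video or text. It assumes only that the files have been saved to a local folder. The function name and the example folder path are made up for illustration, and nothing here relies on how the published dataset is actually organised.

```python
from collections import Counter
from pathlib import Path


def summarise_tree(root):
    """Count files and total bytes per file extension under `root`.

    `root` is a hypothetical local folder holding a downloaded copy
    of the data; the layout inside it is not assumed.
    """
    counts = Counter()  # extension -> number of files
    sizes = Counter()   # extension -> total size in bytes
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Lower-case the extension so .JPG and .jpg are grouped together.
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return counts, sizes


# Example usage (the folder name is an assumption):
# counts, sizes = summarise_tree("cmtaylorkeylogging")
# for ext, n in counts.most_common():
#     print(ext, n, "files", round(sizes[ext] / 1e6, 1), "MB")
```

Counting by extension is deliberately crude, but it gives a rough map of the material before opening any individual file.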