UK Web Archive blog

Information from the team at the UK Web Archive, the Library's premier resource of archived UK websites


2 posts from March 2019

29 March 2019

Collecting Interactive Fiction

Intro
Works of interactive fiction are stories where the reader/player can guide or affect the narrative in some way. This can be through turning to a specific page, as in 'Choose Your Own Adventure', or through clicking a link or typing text in digital works.

Archiving Interactive Fiction
Attempts to archive UK-made interactive fiction began with an exploration of the affordances of a couple of different tools: the British Library’s own ACT (Annotation Curation Tool) and Rhizome’s Webrecorder. ACT is a system which interfaces with the Internet Archive’s Heritrix crawl engine to provide large-scale captures of the UK Web. Webrecorder instead focusses on much smaller-scale, higher-fidelity captures which include video, audio and other multimedia content. All types of interactive fiction (parser, hypertext, choice-based and multimodal) were tested with both ACT and Webrecorder in order to determine which tools were best suited to which types of content. It should be noted that this project is experimental and ongoing and, as a result, all assertions and suggestions made here are provisional and will not necessarily affect or influence Library collection policy or the final collection. As yet, Webrecorder files do not form part of standard Library collections.


For most parser-based works (those made with Inform 7), Webrecorder appears to work best. It is generally more time-consuming to obtain captures in Webrecorder than in ACT, as each page element has to be clicked manually (or at least, the top-level page in each branch must be visited) in order to create a fully replayable record. However, this is not the case with most Inform 7 works. For the vast majority, visiting the title page and pressing the space bar was sufficient to capture the entire work. The works are then fully replayable in the capture, with users able to type any valid commands in any order. ACT failed to capture most parser works, but there were some successes. For example, Elizabeth Smyth’s Inform 7 game 1k Cupid was fully replayable in ACT, while Robin Johnson’s custom-made Aunts and Butlers also retained full functionality. Unfortunately, games made with Quest failed to capture with either tool.

Another form which currently appears to be unarchivable is work which makes use of live data such as location information, maps or other online resources. Matt Bryden’s Poetry Map failed to capture in ACT, and in Webrecorder, although the poems themselves were retained, the background maps were lost. Similarly, Kate Pullinger’s Breathe was recorded successfully with Webrecorder, but naturally only the default text, rather than the adaptive, location-based information, is present. Archiving alternative resources such as blogs describing the works may be necessary for these pieces until another solution is found. However, even where these works don’t capture as intended, running them through ACT may still have benefits. A functional version of J.R. Carpenter’s This Is A Picture of Wind, which makes use of live wind data, could not be captured, but crawling it obtained a sample thumbnail which indicates how the poems display in the live version – something which would not have been possible using Webrecorder alone.

Choice-based works made with Ink generally captured well with ACT, although Isak Grozny’s dripping with the waters of SHEOL required Webrecorder. This could be due to the dynamic menus, the use of JavaScript, or because Autorun has been enabled on itch.io, all of which can prevent ACT from crawling effectively. ChoiceScript games were difficult to capture with either tool for various reasons. Firstly, those which are paywalled could not be captured. Secondly, the manner in which the files are hosted appears to affect capture. When hosted as a folder of individual files rather than as a single compiled HTML file, the works could only be captured with Webrecorder’s Firefox emulator, and even then, the page crashed frequently. Those which had been compiled appeared to capture equally well with either tool.

Twine works generally capture reasonably well with ACT. ACT is probably the best choice for larger Twines in particular, as capturing a large number of branches quickly becomes extremely time-consuming in Webrecorder. Works which rely on images and video to tell their story, such as Chris Godber’s Glitch, however, retain a great deal more of their functionality if recorded in Webrecorder. As the game is somewhat sprawling, a route was planned through it which would give a good idea of the game’s flavour while avoiding excessively long capture times. Webrecorder also contains an emulator of an older version of Firefox which is compatible with older JavaScript functions and Flash. This allowed for archiving of works which would otherwise have failed to capture, such as Emma Winston’s Cat Simulator 3000 and Daniel Goodbrey’s Icarus Needs.

As alluded to above, using the two tools in tandem is probably the best way to ensure these digital works of fiction are not lost. However, creators are advised to archive their own work too, either by nominating web pages to the UKWA, capturing content with Webrecorder, or saving pages with the Internet Archive’s Wayback Machine.

By Lynda Clark, Innovation Placement, The British Library - @notagoth

21 March 2019

Save UK Published Google+ Accounts Now!

The fragility of social media data was highlighted recently when Myspace deleted (by accident) users’ audio and video files without warning. This almost certainly resulted in the loss of many unique and original pieces of work. This is another reminder that online social media platforms should not be seen as archives and that, if things are important to you, they should also be stored elsewhere. The UK Web Archive can play a role in this and we do what we can to preserve websites and selected social media. We do, however, need your help!

Google+
If you have a Google+ account you will have seen the warning that the service is shutting down on 2 April 2019, and that users should download any data they want to save by 31 March 2019.

However, it’s not easy to know how to preserve data from social media accounts, and sometimes this information, without the context of the platform it was hosted on, doesn’t give the full picture. In a previous blog post we outlined the challenges involved in archiving social media. Currently the most popular social media platform in the UK Web Archive is Twitter, followed by Facebook, which we haven’t been able to successfully capture since 2015, and a limited amount of Instagram, Weibo, WeChat and Google+.

Under the 2013 Non-Print Legal Deposit Regulations we can legally only collect digital content published in the UK. As these platforms are hosted outside the UK there is no automated way to identify UK accounts so it requires a person to look through and identify the profiles that are added. In general, these are profiles of politicians, public figures, people renowned in their field of study, campaign groups and institutions.

So far, we only have a handful of Google+ profiles in the UK Web Archive, but we are keen to have more.

How to save your Google+ data
If you have a Google+ profile, or know of other profiles published in the UK that you think should be preserved, fill in our nomination form before March 29th 2019: https://www.webarchive.org.uk/en/ukwa/info/nominate

If the profiles you want to archive are outside the UK, you can use the 'save a website now' function on the Internet Archive website: https://archive.org/web/
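For anyone who would rather request a capture from a script than through the web form, the following is a minimal sketch rather than an official method. It assumes the Wayback Machine's public `https://web.archive.org/save/` URL pattern, which is not mentioned in this post; the function names and the example profile address are hypothetical and are there for illustration only.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

# Assumption: the Wayback Machine's public "Save Page Now" URL pattern.
SAVE_ENDPOINT = "https://web.archive.org/save/"


def build_save_url(page_url: str) -> str:
    """Return the capture-request URL for an absolute http(s) page URL."""
    cleaned = page_url.strip()
    parts = urlsplit(cleaned)
    # Reject anything that is not a full web address, e.g. a bare domain.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {page_url!r}")
    return SAVE_ENDPOINT + cleaned


def request_capture(page_url: str, timeout: float = 60.0) -> int:
    """Ask the Wayback Machine to capture the page; return the HTTP status."""
    req = Request(
        build_save_url(page_url),
        headers={"User-Agent": "save-example/0.1"},  # hypothetical agent name
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical profile URL; only the request URL is printed here,
    # no network call is made.
    print(build_save_url("https://plus.google.com/+ExampleProfile"))
```

Calling `request_capture` with a real profile address would send the request. It is worth checking afterwards on https://archive.org/web/ that the capture replays as expected, since dynamic pages may not be saved completely.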

By Helena Byrne, Web Curator of Web Archiving, The British Library