18 March 2013
TreeCurator, and 3D Visualisation of Computer Directories
(or, the case of the monotomous nodes)
As mentioned in a blog post of 1 March 2013 about the use of phylogenetic software for visualising the arrangement of directories and folders in computer media, the Newick file format is used by tree viewers to construct and present the tree. Usually phylogeneticists obtain their Newick file directly from the software that undertakes the phylogenetic analysis. In order to use phylogenetic tree viewers in another context it is necessary to create the Newick file independently.
The eMSS Lab at the British Library has been writing programs in Python for creating the necessary files in Newick format, and they may be seen as an initial component of a tool to be known as TreeCurator. In the first instance the code has been directed at depicting the arrangement of computer files and folders, but the same approach can also be used to show the arrangement trees of analogue objects, notably the papers in a personal archive of letters, diaries and notebooks. This delivery and presentation may facilitate the integration of analogue and digital entities in a hybrid personal archive, for example.
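To give a flavour of what such a program involves, here is a minimal sketch in Python. It is not the TreeCurator code itself: the function names (quote_label, scan_directory, to_newick, directory_to_newick) are hypothetical, and the quoting rule is a cautious assumption, namely that any name containing characters other than letters, digits, dots or hyphens is wrapped in single quotes, with embedded single quotes doubled. Folders become internal nodes, and files and empty folders become leaves.

```python
import os
import re

# Assumption: names made only of these characters are safe to leave unquoted.
_SAFE_LABEL = re.compile(r"^[A-Za-z0-9.\-]+$")


def quote_label(name):
    """Quote a file or folder name for use as a Newick label when needed."""
    if _SAFE_LABEL.match(name):
        return name
    return "'" + name.replace("'", "''") + "'"


def scan_directory(path):
    """Return a nested dict: folders map to dicts, files map to None."""
    tree = {}
    with os.scandir(path) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if entry.is_dir(follow_symlinks=False):
                tree[entry.name] = scan_directory(entry.path)
            else:
                tree[entry.name] = None
    return tree


def to_newick(name, children):
    """Serialise one node and its subtree, without the closing semicolon."""
    label = quote_label(name)
    if not children:  # a file or an empty folder becomes a leaf
        return label
    inner = ",".join(to_newick(n, c) for n, c in children.items())
    return "(" + inner + ")" + label


def directory_to_newick(path):
    """Build a complete Newick string for the directory at path."""
    root = os.path.basename(os.path.abspath(path)) or "root"
    return to_newick(root, scan_directory(path)) + ";"
```

For a folder called archive holding two letters and a notes/drafts folder with a single file, to_newick would give something like ((a.txt,b.txt)letters,(('draft 1.txt')drafts)notes)archive, to which directory_to_newick appends the final semicolon.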
Although Newick may be seen as a kind of standard, there is in reality quite a bit of diversity in interpretation by software and there are a number of variants such as NHX (New Hampshire Extended) and NEXUS, with their XML derivatives phyloXML and NeXML. (It is worth bearing in mind too that digital curators and preservation practitioners working with scientific archives can expect to encounter these variants in personal archives.)
There are, moreover, some important differences between file trees and phylogenetic trees. For example, computer file trees commonly have folders which contain just one folder, whereas phylogenetic trees typically have bifurcating or multifurcating nodes (a single parent with 2 or more descendants).
Some software such as FigTree seems to be able to handle monotomes (monotomous nodes with not only a single parent but also a single descendant) but other software such as Phylo3D is not able to do so, and it is necessary to adapt the Newick tree file data accordingly.
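One possible way of adapting the data (a sketch only, and not necessarily the approach taken for TreeCurator or expected by Phylo3D) is to merge each monotomous node with its single descendant before the Newick string is written, joining the two names so that the path information is kept. The function name collapse_monotomies and the nested-dictionary representation (folders as dicts, files as None) are assumptions made for this illustration.

```python
def collapse_monotomies(name, children, sep="/"):
    """Return (name, children) with every single-child chain merged into one node.

    children is a dict mapping each child name to its own children dict,
    or None when the node is a leaf (a file).
    """
    # While this node has exactly one descendant, absorb it and join the labels.
    while children is not None and len(children) == 1:
        (child_name, child_children), = children.items()
        name = name + sep + child_name
        children = child_children

    # A leaf (None) or an empty folder ({}) has nothing further to collapse.
    if not children:
        return name, children

    # Otherwise the node has two or more children: process each of them in turn.
    collapsed = {}
    for child_name, child_children in children.items():
        new_name, new_children = collapse_monotomies(child_name, child_children, sep)
        collapsed[new_name] = new_children
    return name, collapsed
```

With this sketch a folder notes containing only a folder drafts becomes a single node labelled notes/drafts, and a folder containing a single file becomes a leaf labelled with both names. Every remaining internal node then has at least two descendants. An alternative would be to leave the labels alone and add a placeholder sibling to each monotomous node, at the cost of introducing leaves that do not exist on the disk.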
One of the approaches towards visualising trees of objects not mentioned in the blog entry for 1 March 2013 is the use of 3D visualisation.
It is still early days in the case of phylogenetic trees and so far the emerging possibilities have had an ambivalent reception but there have been some important efforts. Among the most notable are Paloverde and Phylo3D (which makes it possible to use Walrus).
Three screenshots of the visualisations of a hard drive created using Paloverde: circle, cone, spiral
Walrus requires a special (some might say, esoteric) graph file format known as LibSea. (It is possible to create directory trees directly from a hard drive using the utility called dirgraph, which produces LibSea files, but the aim of TreeCurator and this brief exploration of 3D is to maximise usability by working directly with Newick and its variants.) The tool Phylo3D was developed by Dr Timothy Hughes for converting Newick (and its relatives) to the format necessary for Walrus, and I thank him for confirming that monotomy was the issue that I needed to address in order to use his program.
Although limited in their functionality, pioneering 3D tree visualisation tools do illustrate the potential benefit of interactive 3D trees. In occupying the third dimension, the leaf tips of the tree may be presented more compactly and in a way that suits the viewer. Indeed this is manifested in the way in which living trees occupy space in order to maximise access to sunlight and meet the gaze of the sun as it moves across the sky.
The following pictures show the file tree of a hard drive of John Maynard Smith at a number of angles and proximities using Walrus. These are static images. Active use of Walrus allows the viewer to move the 3D image around for viewing from various directions, as well as zooming in and out.
Three screenshots of the file tree of a hard drive using Walrus: the lower two images are close ups
Annotation is possible but currently limited. No doubt if phylogenetic trees had always been prepared in 3D, an enterprising researcher would have invented 2D trees. In truth both have advantages and disadvantages. (For an example of discussion see the New York Times article "Crunching the Data for the Tree of Life".) Future Digital Scholarship blogs will continue the examination of potentially useful phylogenetic software in the context of computer media and digital curation.
Jeremy Leighton John, @emsscurator