textDNA

Platform/Software

Year initiated:

2009

Record Status:

Description:

TextDNA allows users to explore and analyze word usage across text collections of varying scale. With TextDNA, users can compare word usage between document collections (e.g., across different decades), between individual documents, or between elements within a document (e.g., chapters or acts). Word usage can be explored in raw texts, i.e., text documents that have not been subject to processing. It can also be explored through different metrics, such as how frequently words are used within a document.

TextDNA is based on the Sequence Surveyor genomics analysis system, which provides overview visualizations to elucidate large-scale patterns across multiple genome sequence alignments. Like genomes, texts can be thought of as distinct sequences of data. Just as bacterial strains can be distinguished by their DNA sequences, texts can be distinguished by their sequences of words. TextDNA visualizes sequences of text in parallel, allowing users to detect word usage patterns.

TextDNA displays information about word usage in text sequences by aggregating position and color. Each unique set (or sequence) of words is mapped to a colored row, and each word of the set is in turn mapped to colored blocks within its row. The aggregation of color and position allows words within a row to be ordered and recolored according to different data properties. These variable encodings let users examine their data from multiple angles and scrutinize global trends and outliers.
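The row-and-block encoding described above can be illustrated with a minimal Python sketch. This is not TextDNA's code: the function names (tokenize, build_row, color_bucket), the choice of word frequency and first position as the data properties, and the five-step color scale are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a raw text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_row(text, order_by="frequency"):
    """Turn one text into a row of blocks: (word, count, first_position).

    order_by selects the data property used to arrange blocks in the row
    (hypothetical options: "frequency" or "position").
    """
    tokens = tokenize(text)
    counts = Counter(tokens)
    first_seen = {}
    for index, token in enumerate(tokens):
        first_seen.setdefault(token, index)  # keep only the first occurrence

    blocks = [(word, counts[word], first_seen[word]) for word in counts]
    if order_by == "frequency":
        # Most frequent first; ties broken alphabetically for a stable order.
        blocks.sort(key=lambda block: (-block[1], block[0]))
    elif order_by == "position":
        blocks.sort(key=lambda block: block[2])
    else:
        raise ValueError(f"unknown ordering: {order_by}")
    return blocks


def color_bucket(value, max_value, n_colors=5):
    """Map a numeric property onto a discrete color index in [0, n_colors - 1]."""
    if max_value <= 0:
        return 0
    return value * (n_colors - 1) // max_value


if __name__ == "__main__":
    # Two tiny stand-in "documents"; each becomes one row.
    texts = {
        "doc_a": "the cat sat on the mat the end",
        "doc_b": "a dog and a cat",
    }
    for name, text in texts.items():
        row = build_row(text, order_by="frequency")
        top = row[0][1]  # highest count in this row
        print(name, [(word, color_bucket(count, top)) for word, count, _ in row])
```

Reordering the same row with order_by="position" while keeping the color tied to frequency mimics the idea of examining one dataset under different encodings.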

(source: http://graphics.cs.wisc.edu/Vis/SequenceSurveyor/TextDNA.html)

Works Developed in this Platform:

Work title	Author	Language	Year
JanusNode		English	2012

The permanent URL of this page:

https://elmcip.net/node/11216

Record posted by:

Hannah Ackermans