A version of this paper was presented at the Digital Humanities Symposium: Visualizing the Archive held at Ryerson University April 23, 2010 to mark the official inauguration of The 1890s Online. The name of the project was changed from The 1890s Online to The Yellow Nineties Online in December 2010.
How can we represent a digital object? Is it simply an online version of its print self? Or can we imagine its online incarnation as challenging the way we analyze, interpret, and model the information from a traditional humanities perspective? Julia Flanders has argued that humanities computing should produce unease. For scholars in the humanities, she suggests, this unease registers as “a sense of friction between familiar mental habits and the affordances of the tool, but it is ideally a provocative friction, an irritation that prompts further thought and engagement” (12). The authors of the “Digital Humanities Manifesto 2.0” similarly propose that the practice of the digital humanities provokes “digital estrangement” (10). Making objects, texts, and knowledge “strange” allows for a re-interpretation of their value and meaning. As a humanities computing project, The 1890s Online should aim to do more than reproduce static facsimiles of print objects. It should also work to produce anxiety through digital estrangement, and this necessitates an intervention into the thought processes of traditional humanities scholarship. In relation to the processes of data visualization and text transformation using xslt, for example, The 1890s Online team can aim to deconstruct and then reconstruct the data in a way that informs generative thought.
According to Flanders, there are three major ways that scholarship has changed and continues to change as a result of humanities computing projects. It has shifted and evolved relative to the importance of medium, the cultural habits of the institution, and the ability of scholars to adapt to new scales and models of representation. With regard to the issue of medium, there is a friction between the digital aspect of digital humanities (which is recognized as being invested in progress) and the humanities aspect (which is seen as resisting cumulative concepts of progress). This friction, however, pushes scholarship towards some constructive questions concerning the relationship between knowledge production and dissemination, and the process of scholarship (Flanders 8). Victorian researchers have, in several respects, kept up with the shift from Web 1.0 to Web 2.0, characterized by a move “from publishing to participation, from web content as the outcome of large up-front investment to an ongoing and interactive process, and from content management systems to links based on tagging (folksonomy)” (Flew). The Networked Infrastructure for Nineteenth-Century Electronic Scholarship (NINES) represents the best example of Victorian scholarship firmly invested in the benefits of Web 2.0. For example, users are able to add tags to existing objects (creating tag clouds that give us a real-time understanding of the effect of the network). Users can also create exhibits of their own by collecting objects and re-configuring information to create new meaning. As such, the content of NINES is constantly developing and, therefore, so are the definitions of user and creator. Most other Victorian digital humanities projects do not individually allow this kind of participation but by joining NINES, The 1890s Online will benefit from such powerful, user-directed tools. 
This shift to interactive, invested creation challenges scholars to re-evaluate textual meaning and how it is produced. It also calls into question the mediations of representation and changes the habits of academic institutions, which have traditionally been invested primarily in one-way information dissemination.
Data visualization and text transformations can estrange the user of The 1890s Online in a way that engenders new perceptions of the textual objects offered. Such a project of estrangement hearkens back to Russian Formalist Viktor Sklovsky, who claimed that “in order to restore to us the perception of life, to make the stone stony, there exists that which we call art[. ...] The technique of art is to make objects ‘unfamiliar’” (12). In terms of data visualization, how can we defamiliarize textual objects in order to perceive our data and our users in new ways? In this paper, I discuss four aspects of this inquiry: the purpose of xslt; considerations for transforming xml using xslt for The 1890s Online; possibilities for data visualization in The 1890s Online; and possible visualization applications for the future.
The Purpose of xslt
After having so diligently coded our texts using xml (extensible mark-up language), how do we display our work? Xslt is the missing link. Among many other things, it takes text that, using xml, has been encoded to TEI (Text Encoding Initiative) standards, and transforms it into html for display online, while retaining all of the important data encoded behind the words and images. “Xslt” stands for Extensible Stylesheet Language Transformations. Essentially, without xslt to manipulate the data, one cannot display or visualize the text. Once the xml file has been converted into html, cascading style sheets can be used to change the aesthetics of the document. Xslt is powerful because it allows the transformation of xml documents into any other mark-up language or into plain text, and enables the extraction of resource description framework files (which are necessary for federation within NINES).
In many ways the acronym “xslt” is misleading because xslt is much more than a style sheet; it is a declarative programming language, meaning that it specifies the task to be completed without explicitly listing all the steps needed to complete it. Declarative programming languages are often contrasted with imperative programming languages, which require more explicitly directed steps. Xslt can also be used to perform batch processing. For example, if a portion of the TEI header (the administrative information that appears at the top of every xml document) needs to be changed, an xslt program can be used to insert the updated information in every requested document in a single pass. In addition, a document can be re-structured using xslt. For example, if a portion of text that appears at the end of the xml document needs to appear at the beginning, xslt can reorder the text so that the information is displayed differently. Figures 1 and 2 below show portions of the xslt written for The 1890s Online.
Fig. 1 — A screenshot of the xslt used to transform xml into html.
Fig. 2 — A screenshot of the xslt used to transform xml into html.
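The batch update of header information described in this section can be illustrated with a short sketch. It is written in Python rather than xslt, and it is only an analogue of the idea, not the project's stylesheet. The file names, the simplified TEI structure (shown without the TEI namespace), and the function name `update_publisher` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TEI files (namespace omitted for brevity).
TEI_TEMPLATE = (
    "<TEI><teiHeader><fileDesc><publicationStmt>"
    "<publisher>The 1890s Online</publisher>"
    "</publicationStmt></fileDesc></teiHeader>"
    "<text><body><p>{body}</p></body></text></TEI>"
)
DOCUMENTS = {
    "volume1_cover.xml": TEI_TEMPLATE.format(body="Cover"),
    "volume1_title.xml": TEI_TEMPLATE.format(body="Title page"),
}


def update_publisher(xml_source: str, new_name: str) -> str:
    """Return the document with every <publisher> in the header renamed."""
    root = ET.fromstring(xml_source)
    header = root.find("teiHeader")
    if header is not None:
        for publisher in header.iter("publisher"):
            publisher.text = new_name
    return ET.tostring(root, encoding="unicode")


# Apply the same change to every document in one pass.
UPDATED = {
    name: update_publisher(source, "The Yellow Nineties Online")
    for name, source in DOCUMENTS.items()
}
```

Only the header is searched, so a `<publisher>` element that happened to occur in the body of a text would be left alone.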
Considerations for transforming xml using xslt for The 1890s Online
A number of issues arose in the process of writing the xslt program for The 1890s Online. In terms of visualization, transforming xml into html becomes an issue of display, structure, and aesthetics. Similarly, the editors of the fin-de-siècle magazine The Yellow Book (currently the central project of The 1890s Online) aimed “to depart as far as may be from the bad old traditions of periodical literature, and to provide an Illustrated Magazine which shall be beautiful as a piece of bookmaking, modern and distinguished in its letter-press and its pictures, and withal popular in the better sense of the word” (Prospectus).
The pronouncement emphasizes the importance that the editors, Henry Harland and Aubrey Beardsley, and the publisher, John Lane, placed on the periodical’s visual components. Thus, when creating a digital edition of the magazine, The 1890s Online editorial team recognized the visual and spatial aspects of The Yellow Book as integral to its cultural value and bibliographic expression of meaning. To ensure the preservation of visual elements, The 1890s Online provides pdf (portable document format) versions of all texts, allowing users to view close approximations to the original physical editions.
In writing the xslt and css for the html transformations, the digital team aimed to create an on-screen, readable, and user-friendly version of the materials. When considering visual transformations of TEI-encoded texts, however, the materiality of certain elements contributed to their inclusion or exclusion in the transformation. Much of the encoded meta-data sits behind the displayed html. For example, a rich description of each image has been included in an “alt” attribute, so that these terms are searchable yet do not appear on the screen (see Fig. 3 below). Xslt was used to transform the content of a TEI <figDesc> element into an html “alt” attribute.
Fig. 3 — <figDesc> descriptive metadata is converted into an “invisible” “alt” attribute in html.
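The conversion of <figDesc> content into an “alt” attribute can be sketched as follows. This is a Python illustration of the mapping, not the xslt shown in Fig. 3. The sample markup, the image file name, the description text, and the function name `figure_to_img` are hypothetical, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI fragment; the url and description are invented for the example.
TEI_FIGURE = (
    '<figure><graphic url="volume1_cover.jpg"/>'
    "<figDesc>Hypothetical description\n   of a cover illustration</figDesc>"
    "</figure>"
)


def figure_to_img(tei_figure: str) -> str:
    """Build an html <img> whose alt attribute carries the figDesc text."""
    figure = ET.fromstring(tei_figure)
    graphic = figure.find("graphic")
    source = graphic.get("url", "") if graphic is not None else ""
    description = figure.findtext("figDesc", default="")
    # Collapse line breaks and runs of spaces so the attribute is one clean line.
    alt_text = " ".join(description.split())
    img = ET.Element("img", {"src": source, "alt": alt_text})
    return ET.tostring(img, encoding="unicode", method="html")


IMG_HTML = figure_to_img(TEI_FIGURE)
```

The description travels with the image in the html source, where search tools and screen readers can reach it, but a browser does not render it as visible text.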
In addition, certain elements that were encoded in the TEI version of the text were excluded from the html transformation. Throughout The Yellow Book there are, for instance, onionskin pages — translucent sheets of paper used to protect the process- and line-engravings. These pages are important visually and historically. Without the physical text itself, however, all of these unmarked pages make little sense in html. We therefore encoded their existence into the xml using a page rendering as “onion,” but did not program the xslt to transform an empty page (see Figs. 4 and 5 below).
Fig. 4 — Screenshots of the pdf title page for Volume 1. The onionskin page is shown on the left and the title page is shown on the right.
Fig. 5 — The xml version of the title page for Volume 1, showing the onionskin page rendering.
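The decision to record onionskin pages in the xml but leave them out of the display can be sketched in a few lines. The example below is a Python analogue rather than the project's xslt. It assumes that each page is marked with a `<pb>` element and that onionskin pages carry `rend="onion"`; the attribute values, page numbers, and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: two onionskin pages and two pages with content.
TEI_BODY = (
    "<body>"
    '<pb n="i" rend="onion"/>'
    '<pb n="ii"/><p>Title page text</p>'
    '<pb n="iii" rend="onion"/>'
    '<pb n="iv"/><p>Contents</p>'
    "</body>"
)


def displayed_pages(tei_body: str) -> list:
    """Return the page numbers that should appear in the html view."""
    body = ET.fromstring(tei_body)
    return [
        page.get("n")
        for page in body.iter("pb")
        if page.get("rend") != "onion"  # onionskin pages stay in the xml only
    ]


PAGES = displayed_pages(TEI_BODY)
```

Because the onionskin pages remain in the encoded source, a later transformation could still choose to display or count them.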
We were required to build our xslt program from scratch rather than making use of the extensive, well-built xslt files available through the Text Encoding Initiative (http://www.tei-c.org/Tools/Stylesheets/). The fact that our website houses all transformed materials within a framed header and footer made these configurable xslt files unusable because they include built-in headers and footers.
Our team discovered that visualizing the archive after it was transformed allowed us to interpret our own coding practices in a different light. In other words, seeing the material transformed exposed elements of the project that required reconsideration. For example, when encoding the page numbers, titles, authors’ names, and catchwords, we coded each separately as <fw> types. This is the appropriate element according to TEI standards; fw stands for “forme work” and is used for such items as headers, footers, and catchwords (see Fig. 6). These <fw> tags, however, interrupted the “flow” of an otherwise uninterrupted paragraph. After transforming several files, we noticed that certain paragraphs that carried over from one page to the next were not displayed correctly. The xslt process therefore taught us that we needed to use the closing </p> tag before every instance of <fw> and then begin the next page with an opening <p> tag in order to ensure that text displayed properly (see Fig. 6).
Fig. 6 — An example of <fw> for catchwords, page numbers, and running heads. Also, note the use of </p> to end the paragraph “unnaturally.”
Fig. 7 — The transformed html version of the text in Fig. 6.
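One way to locate paragraphs affected by the problem described above is to search the encoded text for <fw> elements nested inside <p> elements. The sketch below is a Python check written for illustration, not part of the project's workflow. The sample markup, the `type` values on <fw>, and the function name are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the first paragraph runs across a page break.
TEI_BODY = (
    "<body>"
    "<p>The first paragraph runs on "
    '<fw type="catch">and</fw><pb n="2"/><fw type="pageNum">2</fw>'
    "and continues on the next page.</p>"
    "<p>The second paragraph sits on one page.</p>"
    "</body>"
)


def paragraphs_interrupted_by_fw(tei_body: str) -> list:
    """Return the 1-based positions of paragraphs that contain an <fw>."""
    body = ET.fromstring(tei_body)
    return [
        position
        for position, paragraph in enumerate(body.iter("p"), start=1)
        if paragraph.find(".//fw") is not None
    ]


INTERRUPTED = paragraphs_interrupted_by_fw(TEI_BODY)
```

Each position returned marks a paragraph that would need to be closed before the <fw> elements and reopened on the following page.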
Possibilities for data visualization in The 1890s Online
Beyond the practical, text transformations and visualization tools are productive in that they offer alternate avenues for viewing and understanding data. Visualizing how the data will look transformed with xslt can change interpretive methodologies. How can other visualizations offer new interpretative possibilities? For example, where do we find meaning in The Yellow Book? Does it exist in similar form or in radically different form depending on how it is displayed visually? In examining Figures 8 to 13, what differences exist and what can those differences lead to in terms of interpretative possibilities?
Fig. 8 — Reproduced cover of The Yellow Book Volume 1.
Fig. 9 — Xml (TEI) version of the cover image shown in Fig. 8.
Fig. 10 — Html version of Fig. 8 and Fig. 9.
How does the inclusion of embedded meta-data inform or enhance description? Figures 11 to 13 below offer further estrangements of The Yellow Book Volume 1.
Fig. 11 — TokenX word cloud of The Yellow Book Volume 1. (http://jetson.unl.edu:8080/cocoon/tokenx/index.html?file=../xml/base.xml).
Fig. 12 — Word cloud of The Yellow Book Volume 1 with extraneous words, such as prepositions, removed (http://www.wordle.net/create).
Fig. 13 — Word and element statistics for The Yellow Book Volume 1 (generated using Hyperpo – http://hyperpo.org).
Through these various visualization tools, I have easily presented the information included in The Yellow Book Volume 1 in a number of new ways. Through such estrangements, one is able to discern, for example, that “he” is used slightly more often than “his” and “her,” yet all three words occur much more often than “him.” And “love” occurs less frequently than “life” and “literature.” “Mr” is, by far, the most widely used abbreviated title, and “Lucy” is used either many times in one text or used in more than one text, or both. A simple word search would clarify which case is true. By harnessing the power of digital tools, The 1890s Online could allow users to perceive the text in ways that generate interpretations with fresh insight. For example, the less frequent use of “love” in relation to “life” and “literature” suggests that The Yellow Book should perhaps be distinguished from journals publishing stories in the romance and sensation categories. This observation could lead to an argument concerning The Yellow Book’s target audience and marketed reputation as being “popular in the better sense of the word,” as Beardsley and Harland declared in the Prospectus to Volume 1.
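The counts that underlie a word cloud can be approximated with a short frequency tally. The following Python sketch is not how TokenX, Wordle, or Hyperpo work internally; it only illustrates the general idea of counting words after removing common function words. The stopword list and the sample sentence are invented for the example and are not drawn from The Yellow Book.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stopword list.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "on", "at", "for", "with", "by"}


def word_frequencies(text: str, stopwords=frozenset(STOPWORDS)) -> Counter:
    """Count lower-cased words, skipping any that appear in the stopword list."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(word for word in words if word not in stopwords)


# Invented sample text, used only to exercise the function.
SAMPLE = "Life and literature; the life of letters, and love of life."
FREQUENCIES = word_frequencies(SAMPLE)
```

Calling `FREQUENCIES.most_common()` lists the words from most to least frequent, which is the ordering a word cloud renders as relative size.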
Data visualization tools can also be used to compare texts. The word clouds shown in Figures 14 and 15 below, for example, have been generated from Henry James’s “Death of a Lion” and George Egerton’s “A Lost Masterpiece” respectively. Both stories were published in The Yellow Book Volume 1.
Fig. 14 — Wordle visualization of Henry James’s “Death of a Lion.”
Fig. 15 — Wordle visualization of George Egerton’s “A Lost Masterpiece.”
Comparing these two visualizations, one might note that “lady” and “Mrs” are used by James with similar frequency to Egerton’s use of “woman.” Meanwhile, Egerton seems to avoid both “lady” and “Mrs.” This observation could lead to a preliminary thesis concerning Egerton’s daring, “new woman” rejection of traditional monikers denoting dependence and marital status. In contrast, James’s use of “lady” and “Mrs” suggests a more traditional approach to Victorian heterosexual relationships in this story. Further research through visualization tools and more traditional research methods would then confirm or revise the initial speculation.
Data visualizations and estrangements could not only allow our users to generate productive interpretations; by similarly visualizing our users and their search terms and tags, the project’s research team could interpret its own practices and purpose in new ways. If we include certain applications that allow us to visualize the search terms that our site’s users enter, we can gain a sharper understanding of the reasons people are turning to The 1890s Online and how to address their interests. Thus, data visualization tools not only help users see the texts in new ways; the creators of the archive can also use data visualization to visualize the project itself differently. Moreover, this type of visualization could also guide us to adjust our encoding. Learning from NINES’s inclusion of several Web 2.0 technologies, we can observe what sorts of tags users are applying to federated sites. This folksonomy is important because our users would become collaborative members of The 1890s Online research team who, via their searching and tagging, could help us to deliver data in more meaningful ways. If, for example, a user were able to tag Egerton’s story with “new woman,” “rejection of traditional titles,” and “challenging gender,” this would mean the user has contributed to the meta-data associated with that text. Similarly, if The 1890s Online were able to track users’ search terms, the project team could more meaningfully encode key words into relevant files. Through these means, The 1890s Online would ideally merge scholarship, pedagogy, publication, and practice.
Appropriately, this full adaptation of Web 2.0 technologies echoes the original goals of the editors of The Yellow Book, as outlined in their Prospectus to the magazine’s first volume. Like them, we aim to move beyond the “traditions of periodical literature” such as static websites and web-based editions and provide an online environment that is “modern and distinguished in its letter-press and its pictures, and withal popular in the better sense of the word.” Making objects “strange” can lead us out of the practices of the static web and into a research environment distinguished in its visual interpretive realities and collaborations.
© 2011, Ruth Knechtel, University of Manitoba
Ruth Knechtel completed her doctorate at York University in Toronto. She has published in English Literature in Transition and Victorians Institute Journal. In addition, Ruth is in the process of building The New Woman Online, a searchable environment including rare documents related to the concept of nineteenth- and twentieth-century womanhood. She currently teaches at the University of Manitoba.
MLA citation: Knechtel, Ruth. "Digital Estrangement, or Anxieties of the Virtually Visual: Xslt Transformations and The 1890s Online." The Yellow Nineties Online. Ed. Dennis Denisoff and Lorraine Janzen Kooistra. Ryerson University, 2011. Web. [Date of access]. http://1890s.ca/HTML.aspx?s=Digital_Estrangement.html