Markup and Manuscripts

In an early draft of his classic work Walden, Henry David Thoreau wrote, “A word fitly written is the most choice and select of things.”

As we’ve seen, words are one type of thing we find within the thing that is a text. They’re easily recognizable in a printed text by the space that typically separates them — a kind of informal markup, we’ve said.

We tend to value writers who care deeply about their words as things — who want their words to be “fitly written” and who consequently choose their words deliberately. Manuscripts that are drafts of published works often offer the opportunity to see those choices being made.

The manuscript of Walden is a perfect example.

A manuscript is itself a thing that records an author’s choices, a physical object of paper (or some other support, as manuscript scholars call it, such as animal skin or papyrus) and ink, pencil, crayon or some other medium. The words as inscribed on the page are things of a different order from the words as lexical objects (that is, dictionary items with meanings). They’re written in one medium or another, in some individual’s particular hand (handwriting). Each is the product of a unique act of inditing: the word “word” written by the same person multiple times within the same manuscript will look slightly different in each instance. Same word, same meaning, same thing lexically, different things physically.

The words in a manuscript draft may be written in various locations on the page (between lines, say, or in the margins) in a variety of orientations (vertical, horizontal, angled), and surrounded and accompanied by marks of one kind or another, such as carets, bubbles, lines, and arrows. These marks are also things.

One other thing worth mentioning: the act of selection itself. An author’s choices are things, in other words, and if we’re trying to understand how and why they happened during the process of composition, we need some way to describe these non-physical things — decisions to remove or add or relocate words in the flow of the text — and relate them to the physical things (like carets and line-throughs) that may represent them.

Thoreau ultimately decided that the words “fitly written” in his sentence were … not. He crossed them out. The whole sentence, “A word fitly written is the most choice and select of things,” remained a sentence in Walden, and remained in some sense the same sentence, but it also became a different one. Even before he finished this draft, the sentence became, “A written word is the choicest of relics.”

Here’s how it looks on the manuscript page:

Image excerpt from the manuscript of Walden

You may want to see how this sentence appears on the full page, where you’ll find quite a few other marks and acts of revision. (If you’re reading Walden as it was finally published, you’ll find the sentence, appropriately enough, in the chapter titled “Reading.”)

One of the most interesting things about this revision is that Thoreau changed the word “choice” to “choicest” simply by adding “st” to letters he’d already put on the page. How should we describe what he did? Did he delete “choice” and replace it with “choicest”? Even though he didn’t physically cross out or erase “choice,” that might be one way to describe his action. But perhaps it would be better to describe the action as just “inserting” the letters “st”? Also: should his deletion of “most” be described as part of the same act of revision (“most” becomes redundant once “choice” becomes “choicest”) or a separate one?
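
One way to make these questions concrete is to write the competing descriptions down as data. The sketch below is only an illustration in Python, not the markup language discussed in this tutorial, and every name in it (`Revision`, `apply`, the `act` labels) is a hypothetical invented for the example. It covers only the phrase “the most choice,” and it assumes that an insertion means letters appended to a word already on the page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    kind: str      # "delete", "insert", or "substitute"
    index: int     # position of the affected word in the draft phrase
    old: str = ""  # text removed (empty for an insertion)
    new: str = ""  # text added (empty for a deletion)
    act: str = ""  # hypothetical label grouping revisions into one decision


def apply(words, revisions):
    """Return the revised word list; the draft itself is left untouched."""
    result = list(words)
    # Work from right to left so deleting a word does not shift later indexes.
    for rev in sorted(revisions, key=lambda r: r.index, reverse=True):
        if rev.kind == "delete":
            del result[rev.index]
        elif rev.kind == "substitute":
            result[rev.index] = rev.new
        elif rev.kind == "insert":
            # Assumption: an insertion appends letters to an existing word.
            result[rev.index] = result[rev.index] + rev.new
        else:
            raise ValueError(f"unknown revision kind: {rev.kind}")
    return result


draft = ["the", "most", "choice"]

# Description 1: "choice" is replaced by "choicest", and the deletion of
# "most" belongs to the same act of revision.
as_substitution = [
    Revision("delete", 1, old="most", act="a1"),
    Revision("substitute", 2, old="choice", new="choicest", act="a1"),
]

# Description 2: only "st" is inserted, and the deletion of "most" is
# treated as a separate act.
as_insertion = [
    Revision("delete", 1, old="most", act="a1"),
    Revision("insert", 2, new="st", act="a2"),
]

print(apply(draft, as_substitution))  # ['the', 'choicest']
print(apply(draft, as_insertion))     # ['the', 'choicest']
```

Both descriptions produce the same final reading. What differs is the story each one tells about what Thoreau did: one act or two, a replacement or an addition. The page alone cannot settle that choice; it depends on what we want our model to capture.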

By now it should be clear why we said earlier that a manuscript is a complex textual object: it contains data (aka things) that there may be no obvious or simple way to describe. Modeling this data is a much more complicated and difficult task than modeling the basic structure of Blake’s poem.

But the process is no different: all we need is a markup language adequate to the task and a clear sense of what we want to do with it. There isn’t a single right way to model a manuscript or any other textual object, and there’s probably no way to model everything about any object worth modeling. So before we dive into modeling with any kind of markup, we need to take a good look at the object and decide what there is about it — and in it — that we want to describe.

Before we look at some ways we might model the Walden manuscript, then, let’s learn a little bit about its history.