|
MONK : Proposal about main text and paratext (10-17-2007)
This page last changed on Feb 23, 2008 by martinmueller@northwestern.edu.
For a variety of purposes, many users will find it helpful to filter out certain kinds of 'paratext' from their searches and analyses. They prefer to see the movie without the trailers and credits. Main text consists of what is clearly part of the author's words from the naive reader's perspective. Paratext consists of what is ambiguously or clearly not part of the author's words. It also includes information in lists and tables that is not easily parsed as sentences.
The distinction is easier to maintain in some genres than in others. In plays before the late nineteenth century, for instance, the main text consists of all the words intended to be spoken by actors on the stage. Everything else is paratext. In the plays of Ibsen or Shaw, it is harder to decide whether or how stage directions are a form of paratext.
The distinction between main text and paratext is established at the point of a text's ingestion into MONK and becomes part of its SIP, or submission information package. In principle it is possible to change the distinction on a text-by-text basis. In practice, one will do it on a batch basis, using as the criterion a genre (plays) or a particular collection of texts. Since MONK texts will overwhelmingly come in some version of TEI, the distinction between main text and paratext can be expressed in terms of elements that will count as one or the other. Paratext will always include the content of <front> and <back> elements. It will also include some elements that occur inside the <body> element, in particular the following elements from these TEI modules:
Where count objects are precomputed, separate counts are kept for main text and paratext. The latter is by its nature a hodgepodge and unlikely to be an object of attention in itself. But users must have the option of performing their operations on main text, paratext, or "all text." Users will typically have the relevant knowledge to determine whether they want to filter out or include paratext.
The initial selection of paratext elements will be a curatorial decision and will always be based on local knowledge of a particular collection or set of texts. Sir Walter Scott, for instance, wrote voluminous notes for his historical novels. If you know this, you are likely to include the notes in the main text as "clearly part of the author's words from the naive reader's perspective." Similarly, you might decide that the stage directions of some authors are really part of the main text. Or you might still classify them as paratext, since users can always ignore the distinction. |
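
The element-based rule and the separate counts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MONK's actual ingestion code: the function names (`count_words`, `select`), the crude tokenizer, and the sample document are hypothetical, and the elements `note`, `stage`, and `speaker` are stand-ins for whatever a curator selects inside <body>. Only <front> and <back> are taken from the proposal itself. The sketch starts at the TEI <text> element and does not handle the header.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# <front> and <back> come from the proposal. The body-level elements are
# hypothetical stand-ins for a curator's per-collection choice.
ALWAYS_PARATEXT = {"front", "back"}
BODY_PARATEXT = {"note", "stage", "speaker"}  # assumption, set per collection
DEFAULT_PARATEXT = frozenset(ALWAYS_PARATEXT | BODY_PARATEXT)


def local_name(tag):
    """Strip an XML namespace prefix such as {http://www.tei-c.org/ns/1.0}."""
    return tag.rsplit("}", 1)[-1]


def tokenize(text):
    """Very crude word splitter, for illustration only."""
    return re.findall(r"[a-z']+", (text or "").lower())


def count_words(root, paratext_tags=DEFAULT_PARATEXT):
    """Return separate word counts for main text and paratext."""
    counts = {"main": Counter(), "paratext": Counter()}

    def walk(elem, in_paratext):
        # Once inside a paratext element, everything below it is paratext.
        in_paratext = in_paratext or local_name(elem.tag) in paratext_tags
        bucket = counts["paratext" if in_paratext else "main"]
        bucket.update(tokenize(elem.text))
        for child in elem:
            walk(child, in_paratext)
            # Text after a child's closing tag belongs to the parent element.
            bucket.update(tokenize(child.tail))

    walk(root, False)
    return counts


def select(counts, scope="main"):
    """Pick the counts for 'main', 'paratext', or 'all' text."""
    if scope == "all":
        return counts["main"] + counts["paratext"]
    return counts[scope]


# Hypothetical sample, starting at the TEI <text> element.
SAMPLE = """<text>
  <front><titlePage>The Tragedy of Example</titlePage></front>
  <body>
    <sp><speaker>KING</speaker>
      <l>Speak the speech <stage>aside</stage> I pray you</l>
    </sp>
    <note>An editorial note</note>
  </body>
  <back><p>Finis</p></back>
</text>"""

sample_counts = count_words(ET.fromstring(SAMPLE))
for scope in ("main", "paratext", "all"):
    print(scope, sum(select(sample_counts, scope).values()))
```

A curatorial override, such as treating Scott's notes as main text, would amount to passing a different `paratext_tags` set for that collection, for example one that omits `note`. Keeping the two counters separate and combining them only on request is one way to offer users the choice of main text, paratext, or all text without recounting.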
| Document generated by Confluence on Apr 19, 2009 15:04 |