This page last changed on Jun 12, 2007 by tclement@wam.umd.edu.

The Stein Experiment

Function Wishlist

Mining for patterns of repetition in The Making of Americans (1925) by Gertrude Stein.

Problem/Question:
With The Making of Americans, Gertrude Stein's goal was to record "a history of every kind of repeating there is in living" (para 848), a goal she furthers at the beginning of the novel through descriptive analysis. As the text progresses, however, her methodology changes. While at first the novel is meant to include every type of man and woman through description, to be an exact correspondence to human identity through a systematic description of behaviors, of actions and relationships, in the course of the novel, the narrator begins to doubt her ability to achieve this goal and her "narrative" becomes engulfed in seemingly indiscernible rounds of long repeating phrases and sentences. In the course of her project, the narrator fears that she has discovered the limits of representation by simultaneously tapping into the powers of language to express the notion that identity is expression itself and never static or complete.

Lauded by some critics who thought Stein accomplished what T.S. Eliot demanded of all writers (to make art, literature, and language "new"), she was also criticized by others, like Malcolm Cowley, who said Stein's "experiments in grammar" made her novel "one of the hardest books to read from beginning to end that has ever been published" (148). More recent scholars have attempted to aid interpretation by charting the correspondence between structures of repetition and the novel's discussion of identity and representation. Yet the use of repetition in The Making of Americans is far more complicated than manual practices or traditional word-analysis programs (such as those that make concordances or measure word-frequency occurrence) could indicate. Further, the girth of this complicated text, comprising almost 900 pages and 3183 paragraphs, makes the novel as unwieldy as a large collection of texts.

Both its complicated patterns of repetition and its "manageable" size make it a useful case study for MONK. While this text comprises 3183 paragraphs, collections of hundreds (or thousands) of texts can be analyzed using tactics tested on this text. In particular, what is needed for this text is a process that a large collection will need: one of reading by "not reading." The particular difficulties engendered by the complicated patterns of repetition in The Making of Americans make it almost impossible to "read" the text in a traditional, linear manner, a problem that will be mirrored for users unfamiliar with all the texts in a collection. However, by visualizing certain patterns in this text (and, it follows, in larger collections in general), by looking at the text "from a distance" through text mining and visualization, one can "read" the novel in meaningful ways that were hitherto impossible.

There are three main areas of research to be conducted on this novel of 3183 paragraphs (517207 words with 5329 unique words): (1) an investigation into the rigorous patterns of repetition with variation employed in the text would forward the idea that Stein (accused both in her time and in ours of being nonsensical, "obtuse," and, in some cases, insane) was employing systematic and meaningful patterns of repetition that forwarded her philosophical goals; (2) an examination of the prosody or rhythm of the text in comparison to rhythmic patterns in non-fiction narratives and oral histories (within other MONK collections) could engage discussions in critical theory about the influence African-American speech patterns had on Stein's style of writing; and (3) a look at how character development is achieved through relationships to other characters could engage discussions grounded in feminist and queer theory that argue that Stein's experimental project is not only on the grammatical level (in that her pre-Oedipal structures throw into question "patriarchal" norms of linguistic structure), but also on the level of narrative in that character development is based on "non-normative" (or what Stein would call "queer") relationships.

Current practice: Discussion for each part:

  1. We've already made some headway into patterns of repetition with variation by using D2K to detect the presence of n-grams of various sizes (here you can see a pretty miraculous output for 9- to 36-grams). I used the 36-gram fragments to find these two paragraphs of text (Sample) that are remarkably similar but could not have been found by simple search mechanisms, primarily because I would have had no way of knowing what strings to look for in a text in which so many strings are repeated so frequently. In addition, I've given two papers on attempts to visualize what I like to think of as "this mess of data." Please see DHCS2006 and, more recently, MLA2006 (please use Explorer, as they don't look as good in Firefox, and feel free to skip to the pictures).
  2. As for prosody and rhythm, Carla Peterson, a scholar at UMD (who has not been approached yet), has written about the influence on Stein's work of the "coon" songs and the blues that were prevalent in early-twentieth-century Baltimore. There are many Baltimore-based oral narratives in the Documenting the American South collection, and she might be interested in further investigating this topic with us. Others have done some limited work on "hand coding" rhythmic and intonation patterns in The Making of Americans. While some of it is discussion, one scholar I've found, Steven Meyer, actually diagrammed the first paragraph of the novel, like this: diagram sample.
  3. The relationships between the characters have not been mapped yet by anyone. (Sometimes I think no one actually reads the whole book; it's that hard to read.) But I have a list of the characters, and Xin has mentioned that he could put it into his social network map.
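To make the n-gram idea in item 1 concrete, here is a minimal Python sketch of how long word sequences shared between paragraphs might be found. It is not the D2K workflow we actually ran; the function names (tokenize, ngrams, shared_ngrams) and the toy paragraphs are invented for illustration, and the tokenization is deliberately crude.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'()[]") for w in text.lower().split())
    return [w for w in words if w]


def ngrams(tokens, n):
    """Return the set of word n-grams (as tuples) found in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngrams(paragraphs, n):
    """Map each n-gram to the indices of the paragraphs containing it,
    keeping only n-grams that occur in at least two paragraphs."""
    index = defaultdict(set)
    for i, para in enumerate(paragraphs):
        for gram in ngrams(tokenize(para), n):
            index[gram].add(i)
    return {gram: sorted(ids) for gram, ids in index.items() if len(ids) > 1}


# Invented toy paragraphs (hypothetical data, not quotations from the novel).
paragraphs = [
    "The old man was walking slowly down the road.",
    "She said the old man was walking slowly then.",
    "This is a different paragraph altogether.",
]

for gram, ids in sorted(shared_ngrams(paragraphs, 4).items()):
    print(" ".join(gram), ids)
```

With a larger n (say 36), the same index would only surface paragraph pairs that share very long runs of words, which is the kind of pairing that simple string search cannot find without knowing the string in advance.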

Texts needed in the collection:

  • The Making of Americans : I have the text in standard TEI XML.
  • In your dreams: Anything in Documenting the American South that is "oral" and is from the Baltimore area.
  • Are there multiple versions of your documents you would need to see in parallel or combined? Are there foreign-language or unusual characters? No.

Generality: As explained above, questions about patterns of repetition can be generalized to all documents. Similarly, looking at prosody (rhythm) and character development based on relationships to other characters could be useful in both fiction and non-fiction collections.

Granularity: Document, paragraph, and sentence.

Characteristics: POS, n-grams, and Soundex would all be useful in determining repetition as it occurs syntactically and semantically.
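For readers unfamiliar with Soundex, the following is a small Python sketch of the standard American Soundex coding, which reduces a word to a letter plus three digits so that similar-sounding words receive the same key. This is my own illustrative implementation, not a MONK component; the names CODES and soundex are assumptions made for the example.

```python
# Digit assigned to each consonant group in American Soundex.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex key for a word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # vowels reset prev to "", so a repeated code counts again
    # Pad with zeros and truncate to one letter plus three digits.
    return (first.upper() + "".join(digits) + "000")[:4]


print(soundex("living"), soundex("loving"))
```

Here "living" and "loving" both map to L152, which hints at how a sound-based key could group near-repetitions that differ by a vowel.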

Patterns: Again, see Sample for an example of two paragraphs that are closely related by repetition with variation. Also look at Steven Meyer's example of diagramming intonation and rhythm (diagram sample).

Morphology: Morphology is very useful in this study. One aspect of Stein's work that is very important both to her and to scholars studying her work is her notion of a "continuous present," which is constructed by many things (beginning again, for instance) and by changes in verb tense. Morphology can help track some of these changes.

Tags:

  • In comparing MoA to oral histories/narratives in Documenting the American South, Date and Place tags will help locate documents that are from the appropriate time (circa 1900) and place (Baltimore).
  • Character names (which I already have); pronoun referents would be awesome too.

Classification: Primarily, I think this project is more about clustering, but classification could be very useful in both the prosody and the character studies. That is, if it were possible to "capture" a certain pattern in MoA that had been identified as "speech-like" and/or a pattern or particular network of relationships, it would be interesting to see if we could employ a "find more like these" strategy.
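One way to picture a "find more like these" strategy is a simple similarity ranking: take a passage already identified as interesting, turn it and every candidate passage into word-bigram counts, and rank candidates by cosine similarity. The Python sketch below is only that, a nearest-neighbour ranking rather than a trained classifier, and the names bigram_counts, cosine, and more_like_this, as well as the sample strings, are hypothetical.

```python
import math
from collections import Counter


def bigram_counts(text):
    """Count adjacent word pairs after crude lowercasing and punctuation stripping."""
    tokens = [w.strip(".,;:!?\"'()") for w in text.lower().split()]
    tokens = [t for t in tokens if t]
    return Counter(zip(tokens, tokens[1:]))


def cosine(a, b):
    """Cosine similarity between two Counters (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def more_like_this(seed, candidates, top=3):
    """Return up to `top` (score, candidate) pairs, most similar first."""
    seed_vec = bigram_counts(seed)
    scored = [(cosine(seed_vec, bigram_counts(c)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top]


# Invented sample strings (hypothetical data, not drawn from any collection).
seed = "he was telling it and telling it again"
candidates = [
    "the weather report for tuesday",
    "she was telling it and telling it then",
]
for score, text in more_like_this(seed, candidates):
    print(round(score, 3), text)
```

The same ranking could be run over POS-tag sequences or Soundex keys instead of word bigrams, which may be closer to what a "speech-like" pattern would need.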

Comparisons: As described above with oral histories/narratives in the Documenting the American South collection.

Topic extraction: Hmmm . . . need to think more about this.

Lexicon, counts of words, most common occurrences, concordance: As described above, I don't think this is necessarily important, although it is useful to have concordance-like abilities to see phrases (n-grams) in context.

Annotation: I need to think more about this.

Collaboration: Again, I need to think more about this.

Document generated by Confluence on Apr 19, 2009 15:05