|
MONK : Adorned and Unadorned Texts
This page last changed on Nov 20, 2008 by amitku.
The following corpora are all located at Northwestern.

Update from Pib
This share is available as \\ariadne.northwestern.edu\monk and from off campus via WebDAV as https://ariadne.northwestern.edu:8443/ariadne/. You can enter username "xxxx" and password "xxxxx" when prompted,
When we decide a batch of texts are "good enough" for ingestion into
The monk share is read-only. You cannot add, delete, or modify files. The WebDAV access is a lightweight shell over the Windows monk share
I have tested access to the monk share using Windows web folders and
I have not been able to get Konqueror on Linux to connect, but this
The "monk" share on ariadne now also contains adorned versions of the
There are five EEBO texts for which there is no enhanced a04632_02.xml
There are no bibadorned versions of these files yet, nor do these
For the moment I am leaving both the original adorned texts and the
The adorned and bibadorned texts do not pass XML validation with the
– Phil "Pib" Burns

NCF texts
The 250 Chadwyck-Healey nineteenth-century British novels are stored in /contents/monk/collections/ncf. Both the adorned and the unadorned texts are available. Both sets of texts should validate against the teisimple DTD. Last updated 2007/08/28. Northwestern University is the keeper of these texts.

MOA Stein texts
/contents/monk/collections/moa
Tanya Clement is the person responsible for getting the texts.

MorphAdorned Stein texts
MorphAdorned versions of the two Gertrude Stein texts, The Making of Americans and Three Lives, are stored in /contents/monk/stein. Also available are unadorned and adorned versions of the texts with named entities added by MorphAdorner's version of the Gate named entity extractor. The "entitified" texts demonstrate the limitations of the Gate named entity extractor for literary purposes. These texts do not yet validate against the teisimple DTD. Last updated 2007/08/03. Northwestern University is the keeper of the adorned versions of the texts.
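
Several collections on this page are described as validating, or not yet validating, against the teisimple DTD. As a rough first pass before DTD validation, a local copy of a collection can be checked for XML well-formedness. The following is a minimal sketch using only Python's standard library. It does not perform DTD validation, which needs a validating parser, and the function name `find_malformed` and the local directory layout are assumptions for illustration, not part of the MONK tooling.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed(directory):
    """Return (file name, error message) pairs for *.xml files that fail to parse.

    This checks well-formedness only. It does NOT validate against the
    teisimple DTD. The parser does not load external DTDs, so files that
    rely on entities declared there may also be reported here.
    """
    problems = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as err:
            problems.append((path.name, str(err)))
    return problems


if __name__ == "__main__":
    # Hypothetical local copy of one collection; adjust to wherever the
    # read-only share has been copied.
    for name, message in find_malformed("ncf"):
        print(f"{name}: {message}")
```

Files that pass this check can then be handed to a validating parser together with the DTDs stored alongside the collections.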

Early American Fiction
Adorned and unadorned versions of The Scarlet Letter are stored in /contents/monk/collections/eaf. The adorned text should validate against the teisimple DTD. Last updated 2007/08/03. Northwestern University is the keeper of this text.

Wright Fiction Archive
Adorned and unadorned versions of Moby Dick and Uncle Tom's Cabin are stored in /contents/monk/collections/wright. The adorned version of Uncle Tom's Cabin demonstrates how the lack of training data for dialectal language adversely affects morphological adornment. Last updated 2007/09/17.
University of Nebraska is the keeper of these texts. Last provided by Steve Ramsay.

DTDs
The XML DTDs used by various Monk corpora are stored in /contents/monk/collections/dtds. Last updated 2007/08/01. Northwestern University is the keeper of the DTDs.

Training data
The MorphAdorner training data for nineteenth-century British fiction is stored in /contents/monk/collections/ncf/monk/ncf/trainingdata. Last updated 2007/08/01. Northwestern University is the keeper of the training data. |
| Document generated by Confluence on Apr 19, 2009 15:04 |