|
MONK : April 14 conference call
This page last changed on Apr 14, 2008 by unsworth.
Present: Stan, Martin, Catherine, John

TCP texts (about 700 of them) are TEI-A'd and ready to be morphadorned. Early American Fiction, DocSouth, and Wright are also ready. The next step is for someone to morphadorn them (Brian? Amit? Phil?); the step after that is to get things into the datastore (preferably by way of Prior, rather than in a handmade way).

Workflow as a whole will tie together TEI-A stylesheet transformation, an opportunity for curatorial intervention, morphadorning (including the opportunity to choose training sets), and Prior. Workflow should also store each state of the document in Fedora as it goes through this process. Eventually, we imagine workflow being made available to an end user who is uploading his or her own texts.

As part of workflow, we will need some interactive process that allows curators to identify, for example, the gender and nationality of authors. This could be eased by using name authority control, for example the Library of Congress's (http://www.loc.gov/catdir/pcc/naco/naco.html), if that's available as a web service. Otherwise, we accumulate information and offer the curator a chance to accept a previous identification.

Amit's been working with a Java applet that handles the output or input for supervised classification; the applet produces a decision tree that shows you the features identified by the system and how many works they appear in. It does not offer browse access to the document at present, but if this gets embedded in the MONK interface, we could presumably link to the browse interface already there.

Matt's working on an information glyph to show you things like Dunning's log likelihood for a particular feature, compared to the rest of the collection. Profuse has "docuburst" that does more or less the same thing.

Sara and Kirsten have use cases for the decision-tree stuff, and Kirsten has a paper to deliver in June on this, so it would be good if we could move it along. Martin will talk to John N. about at least running the witchcraft texts through Prior, or in some other way getting them into the MONK datastore.

Amit, Steve, Brian, Phil, John might need a workflow hackfest: Chicago? May? |
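For reference, the Dunning's log likelihood mentioned above for the information glyph can be sketched roughly as follows. This is only an illustration of the general statistic (G2 over a 2x2 table of feature counts in one work versus the rest of the collection), not Matt's implementation; the function name, parameter names, and the counts in the usage lines are all hypothetical.

```python
import math


def dunning_log_likelihood(count_in_work, work_size, count_in_rest, rest_size):
    """Return Dunning's G2 for one feature: a work vs. the rest of the collection.

    count_in_work / count_in_rest: occurrences of the feature in each part.
    work_size / rest_size: total tokens in each part.
    (All names here are hypothetical, chosen for illustration.)
    """
    if count_in_work > work_size or count_in_rest > rest_size:
        raise ValueError("a feature count cannot exceed the size of its part")

    total = work_size + rest_size
    # Observed 2x2 table: rows = feature / everything else, columns = work / rest.
    observed = [
        [count_in_work, count_in_rest],
        [work_size - count_in_work, rest_size - count_in_rest],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [work_size, rest_size]

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the feature were equally frequent in both parts.
            expected = row_totals[i] * col_totals[j] / total
            # By convention a cell with zero observations contributes nothing.
            if observed[i][j] > 0:
                g2 += observed[i][j] * math.log(observed[i][j] / expected)
    return 2.0 * g2


# Hypothetical counts: a word seen 30 times in a 10,000-token work
# and 200 times in the remaining 1,000,000 tokens of the collection.
score = dunning_log_likelihood(30, 10_000, 200, 1_000_000)
print(f"G2 = {score:.2f}")
```

A higher score means the feature's frequency in the work differs more from the rest of the collection; the score alone does not say whether the feature is overused or underused, so a glyph would probably also need the direction of the difference.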
| Document generated by Confluence on Apr 19, 2009 15:05 |