|
MONK : Meeting Minutes for 4-9-2007
This page last changed on Apr 09, 2007 by amitku.
Present: John, Amit, Catherine, Stan, Steve, Bill, Martin, Matt
Some doubt was expressed about the usefulness of the roles in all cases; perhaps a chair should make sure that all of these things are covered, at least where they make sense.
Uses and Users report: use cases are up on the wiki, following a template with questions and based on one-on-one conversations with scholars. Tanya, Sara, Steve, Kirsten, and Martin have prepared or will prepare cases. Martha will prepare one also, maybe two (the old one and a new one). The use case for the curator is next: a page has been created with some notes by Catherine, but it would be good to have feedback from an example user. A description of the MONK user categories is up, too; perhaps a non-specialist user. Martin and Steve raised some questions about whether all of these cases require the software we are building; Kirsten's, for example, might be accomplished using EEBO. A question for the template might be "what MONK facilities does this need, beyond searching/browsing?" The witchcraft example (Kirsten's) might or might not need them: extra tagging might be problematic; patterns of vocabulary or style might be MONK-worthy, and then the question is the tagging of examples that could be shared with other people. A question for the collaboration group? Brian P-Z will look at the curator use case; we could ask Mark Olsen as well.
Data cell: conversations so far have raised more questions than they have answered.
Six milestones, not in any particular order: we need a data model; we need an API for interfaces to address; we need to look at use cases in order to inform the data model; we have a collection of texts from 1690 (including public domain), and the texts are in SVN (password-protected; license terms may include the ability for licensee institutions to give access to particular individuals); Bill and Martin have been working on a domain model (what are the characteristics of the texts, what are users adding to or doing with those texts) and a DTD (modified tei-tite) for MONK. A list of the text collections is already posted to the wiki; please check whether your institution licenses these texts, so we know whether we can all work with all of them. Individual groups need to make sure that their members can log into the MONK server, as well.
NW can transform TCP texts into TEI-Lite, and is working on CH fiction to get it to the same tagging. Martin thinks it would be desirable to add British and American fiction after 1865, to experiment with up-tagging these. JMU is profoundly nervous about up-tagging, and wants to know what functionality requires it. MM feels that without some assumptions about tagging, analytical capabilities are severely limited. BP thinks mapping (as in the nora chunk file) and transformation (as in up-tagging) are really the same thing; what matters is how extensive and/or customized they are. Use cases might provide some insight into the functionality that should drive these mappings; explication of nora chunk files could also help to explain. Chunking, for example, is important, and the granularity of chunking is important to discuss; dates are important, and place-names, but do we identify these based on tagging, or based on named-entity extraction? And what could we get at by treating tags as a special class of tokens?
Analytics: good meeting. The role is to broker the relationship between the uses/users cell and the data cell.
Use cases should drive analytics, even if you're thinking of something general or generalizable. The group went over all the use cases posted so far and is thinking about how we can do these and who wants to be in charge of each one. Do we choose a subset of the use cases and focus on them, or do we try to pursue many use cases in parallel? Our group is not uniform in skills/background, so parceling these cases out per person doesn't always make sense. Floating analytics people, users who aren't developers. Can we group these use cases in some way, from an analytics point of view? Some kind of fungible granularity is implied in all of the use cases; getting a handle on this kind of thing is the challenge. Latch onto what's generalizable from the data perspective. Pitch into the ones that are being driven by users. What's inside and what's outside of MONK, and why, is also important for this group to articulate.
Interface, Collaboration: start the next call; Milestones... |
| Document generated by Confluence on Apr 19, 2009 15:05 |