|
MONK : Aug 18 conference call
This page last changed on Aug 18, 2008 by stan_ruecker.
Present: Stan Ruecker, John Unsworth, Martin Mueller, Amit Kumar, Steve Ramsay, Matt Kirschenbaum

Interface: Report on the faceted browser tool that Alejandro has worked up, which allows you to select and group. There are still some visibility and usability issues, and feedback on this tool would be good. Available facets should be anything supported (as a classifier) in the datastore: date, gender of author, genre, title, author, etc. The challenge for this tool will be to make it approachable in the first couple of stages of use; it becomes more useful the further in you get, but people may be baffled by the first couple of screens. Headwords rather than first words would be useful for the tiles, too. Alternatively, you could come into the tool as the result of a search, so that your first screen is a simple search screen (or a search by classifier), and faceting could begin after that. Or you could come in with some preset selections that people could manipulate and change. This strategy is also being considered for the Mandala browser.

Progress bars at various points would also be useful: Amit suggests that this should be a generalized API for the workbench, an Ajax call to the server to see how things are going.

A back button is also missing in all of our tools, and the browser back button doesn't work (it takes you out of the workbench altogether). This is also something that we should do at the tool level, but support as an API at the workbench level. Mike Plouffe did some earlier work on a JavaScript history widget that stored history in the datastore; that was connected to login and sessions, so you could log in again and get the earlier state. But we could do something client-side and session-specific, using the browser cache. Stan will ask Mike to look into that. Mike also needs Amit's help on servlets for Google charts; Stan will ask Mike to contact Amit.

Data: Wright texts have been Morphadorned and are ready to go into Prior. John N. has worked with a sample of 30 TCP texts, ready for the 600 EEBO texts from TCP: these are in Abbot at the moment (in Nebraska). There are problems with superscripts and gap tags, but these are very nearly complete. NCF needs to be re-run as well. We need to drop Chaucer: monkdb on monk.lis.uiuc.edu is pre-Prior. We need to start fresh and create a new datastore (2.0) with all the materials going through current versions of Abbot, Morphadorner, and Prior, dropping Chaucer (and possibly Stein) at this point. Raw texts also need to go into Fedora, and all of this workflow needs to go into SEASR, presumably. But: do we add more texts, or spend time getting SEASR more into the flow of MONK? We don't have substantial additional collections in view, though demonstrating robust ingest would be a good deliverable. Further out, OCA and Gutenberg would be the next collections, but that's probably out of scope for the end of this project. Let's make the datastore stable, document it, and allow people to finish building interfaces for use-cases. Document the datastore and workflow in the remaining time, and get it into SEASR.

Each new collection requires some new hand-made scripts to get the texts out of their junk state (non-XML, non-validating XML, etc.) before you can get them into Abbot. Documenting that would be useful, so that one could develop a checklist of the types of things that typically arise and that need to be handled with such scripts. Some smaller and more generalizable tweaking of Abbot itself is necessary for each new collection as well. There must be a way to write an XML Tidy, though that is also out of scope for this project. But documenting the typical problems would be in scope, and would be useful as a set of requirements for such a (future) development effort.

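The checklist idea in the Data notes could start from a simple triage pass over incoming files. What follows is a minimal sketch in Python, using only the standard library. The function names `triage` and `summarize` and the three category labels are hypothetical illustrations, not part of Abbot or any MONK tool. The check covers well-formedness only; validating against a DTD or schema would need an external library and is not attempted here.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Mapping


def triage(text: str) -> str:
    """Return a coarse (hypothetical) category for one input document."""
    # Drop a leading byte-order mark and whitespace before inspecting the text.
    stripped = text.lstrip("\ufeff \t\r\n")
    if not stripped.startswith("<"):
        # Does not even begin like markup: plain text or some other format.
        return "non-xml"
    try:
        ET.fromstring(stripped)
    except ET.ParseError:
        # Looks like markup but fails to parse: unclosed tags,
        # undefined entities, stray ampersands, and so on.
        return "not-well-formed"
    return "well-formed"


def summarize(docs: Mapping[str, str]) -> Counter:
    """Count how many documents fall into each triage category."""
    return Counter(triage(body) for body in docs.values())
```

Running `summarize` over a new collection gives a rough count of how many files need hand-made repair scripts before ingest, which is one way to seed the checklist of typical problems.
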
Collaboration: Future calls could include Lev Manovich (Big Humanities and cultural analytics); Dan Cohen and Zotero; another round with Fluid (Jess Mitchell is the new project manager, and they would like people to look through their proposed component list and make suggestions); and another round with Fedora, focused on workflow. SEASR does talk to Fedora now, via a SOAP interface. Embedding workflow inside Fedora might be out of scope for now, but checkpointing transformations in Fedora is still in scope. Of course, since SEASR doesn't have an object model of its own, we end up using the Fedora one; having more moving parts means more failure points.

Documentation: Is documenting interface elements an interface problem? If Matt has people or time for this, it would be very useful. Martin has already written a bunch of excellent documentation, and we should make a deliberate attempt to coordinate documentation at different levels with respect to vocabulary, etc. MONK manual: October should be documentation month. |
| Document generated by Confluence on Apr 19, 2009 15:05 |