MONK : November 26 conference call
This page last changed on Nov 26, 2007 by unsworth.
Present: Catherine, Martin, Bill, Stan, John

I. Planning for all hands meeting

Not planning to be present: Stefan (James, Andrew planning to attend? JU to ask)
Catherine will do a wiki page for arrival times, lodging place, etc.
Day and a half of meeting time (Friday/Saturday): coffee in the morning, sandwiches brought in.
Specify deliverable and time line (half-way through): working analytics and interface with WordHoard data store?
Inventory of what's been accomplished to date
Highlights and deliverable for the annual report (February)
At the all hands meeting, reviewing and versioning proxy calls should be a priority (Amit and James might put a draft out for review in advance; JU will ask).
User interface for data mining should be ready to be beat on, with a plan for revision.
Integration of things like feature-lens, WordHoard
Integration with SEASR
Use cases and support for same: Kirsten, Sara, Tanya are all waiting for data.

Bill P: Data cell has a call scheduled for tomorrow. John N. has released a first draft of the data access layer with documentation, checked into SVN. Amit's used it to pull data for Sara's use case. Next step will be to build a MONK database with ingest routines. Fedora has been set up on the monk server, and NCF has been ingested (adorned and unadorned), with OAI harvesting enabled. Amit's had some success doing analytics for Sara. Morphadorner's been updated, and the log likelihood code has been pulled out of WordHoard and given to NCSA for inclusion in D2K.

Martin: will call Steve and Bryan this morning. They've done a run on Wright to convert from P4 to TEI Analytics, ready to be morphadorned. Bryan should now be working on conversion of NCF. TEI Tite and TEI Analytics should be compatible, in case we want to catch stuff coming back from keyboarding, for MONK conversion, before it gets further tagging. Short term (February goals) could be to have Wright and TCP ready for use cases.
The next step might be an intermediate one: putting things into the WordHoard database, with a MONK proxy layer, and trying to meet up with the interface/workbench stuff.

Status report from each of the cells, minuted in conference calls between now and all-hands (JU will ask for same)

When do we grapple with scale?
Where do we think the bottlenecks will be?
What are the tradeoffs with granularity?
What can we do to test this?
How do we decide on the granularity tradeoff from the point of view of the use cases?
How can we adapt the user interface to accommodate large, slow analytics ("Mail me when it's done"?)
Document generated by Confluence on Apr 19, 2009 15:05