This page last changed on Jul 07, 2008 by unsworth.

Present: Martin, John, Stefan, Catherine, Amit, Matt

Agenda:

  • anyone have minutes from conference calls between early April and now?
  • Second Life project

Matt's been investigating 3D visualization of MONK results in SL. There's an SL community interested in scientific visualization and in collective/collaborative investigation of data visualizations. As far as MONK goes, the first step would be to do a simple export (a la ManyEyes) to SL. Once the data is there, various visualization widgets could be applied to it. Do we need to use only public domain data? Not necessarily: you can set permissions on objects. Social network visualization widgets could be useful to apply. Publicly available collections: Wright, for example (more limited in scope and span), and NCF (broader and more interesting in some ways). Exporting mechanisms need to be established; data can be uploaded from a local hard drive (via the client). A CSV export facility needs to be added to the interface, to export a saved result set to the local hard drive.
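As a rough illustration of the CSV export discussed here, a saved result set could be serialized along the following lines. This is only a sketch under assumptions: the function name and the columns (work_id, title, hits) are hypothetical and do not reflect the actual MONK result-set schema or workbench code.

```python
import csv
import io


def result_set_to_csv(rows, fieldnames):
    """Serialize a result set (a list of dicts) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical saved result set; these columns are illustrative only.
sample_rows = [
    {"work_id": "w1", "title": "Example Work, Part One", "hits": 12},
    {"work_id": "w2", "title": "Another Work", "hits": 3},
]

csv_text = result_set_to_csv(sample_rows, ["work_id", "title", "hits"])
```

Writing csv_text to a file on the local hard drive would then give something that can be uploaded through the SL client or opened in a spreadsheet. Values containing commas are quoted by the csv module, so titles survive the round trip.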

  • Workbench update

Problems with flows that run for more than a few minutes. The collection tree browser is unusably slow when you are working with a large data set. This is a high priority problem, but browsing and selecting thousands of items is a tricky problem in any case. We may need a different solution for very large sets. Options: upload/import Excel files; search and add results to a set; use a plain HTML page with checkboxes; use a lighter-weight library for the current collection-tree browser; expand only selected items.

For analytics, we've developed a mailback function; it will be integrated into the workbench soon (beginning of next week).

Add import/export CSV functionality (next week)

For the use cases, it's really been necessary to pass the data around by hand--the users are not actually able to do their work in the workbench.

A stable version of the workbench would be nice--but it could be published by hand, with some release notes on known issues. This is probably not worth it, all in all, unless we have planned demos or users outside the group.

  • Prior update

Still looking for Prior to be complete in July, at which point all the data that we plan to work with will be ready to be imported by it. At some point later in the fall we may re-run the whole lot of collections through again, based on some clean-up exploration by Martin, but for now it is workable. Morphadorner, by the way, seems to be doing very well with some fairly difficult early texts. Across 200 million words of English, from the early 16th century up until the late 19th century, the texts are interoperable.

  • Interface update

Wed. meetings have been happening regularly; the recent issues have been the tree browser, use cases (Tanya), and working with Amit and Roman. These are separate experiments from the workbench, but they at least give workable results for the users. Roman's work is on complex filtering of texts around named entities. Can Roman's stand-alone applet be adapted? Maybe. Can the data come from the middleware? Perhaps. At the moment, it is hard-wired to a separate MySQL database (albeit using SEASR calls).

  • Data layer update:

Workflow for ingestion is one of the things Amit has been working with. The workflow, which uses Meandre as the workflow framework, sets up a Fedora collection as soon as Prior is finished. First you set up some metadata and provide a zip file or a directory as a starting point. The workflow validates the XML as well-formed, creates a collection-level object in Fedora, and creates a work object for each work within that collection. As it does this, it creates a Dublin Core record for Fedora, extracts the TEI header, and stores the document as the basis for future versions. Next to do: store workpart information (divs or other structural elements) and store relations of workparts to work, so that you can get various kinds of full-text search access. Also to be added to the workflow: the TEI-A processing pipeline (would happen after the initial deposit in Fedora); Morphadorning and Prior also need to be added in.
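The per-document steps described above (well-formedness check, TEI header extraction, Dublin Core record) can be sketched roughly as follows. This is not the Meandre workflow or any Fedora API call; it uses only the Python standard library, and the function names, the returned dictionary layout, and the choice of title and author as the Dublin Core fields are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace; older TEI files may carry no namespace, so both are tried.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def parse_if_well_formed(xml_text):
    """Return the root element, or None if the text is not well-formed XML."""
    try:
        return ET.fromstring(xml_text)
    except ET.ParseError:
        return None


def find_first(element, local_name):
    """Find the first descendant with this name, namespaced or not."""
    for tag in (TEI_NS + local_name, local_name):
        found = element.find(".//" + tag)
        if found is not None:
            return found
    return None


def text_of(element):
    """Collect an element's text content, or None if the element is missing."""
    if element is None:
        return None
    return "".join(element.itertext()).strip()


def ingest(xml_text):
    """Hypothetical ingestion step: validate, extract header, build a DC record."""
    root = parse_if_well_formed(xml_text)
    if root is None:
        return None  # rejected: not well-formed
    header = find_first(root, "teiHeader")
    if header is None:
        return None  # rejected: no TEI header to extract
    return {
        "dc": {
            "title": text_of(find_first(header, "title")),
            "creator": text_of(find_first(header, "author")),
        },
        "header_xml": ET.tostring(header, encoding="unicode"),
        "source": xml_text,  # kept as the basis for future versions
    }


SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc><titleStmt>'
    "<title>Example Title</title><author>Example Author</author>"
    "</titleStmt></fileDesc></teiHeader>"
    "<text><body><p>Text.</p></body></text></TEI>"
)

record = ingest(SAMPLE)
```

In the real workflow the resulting record would be deposited as a work object inside the collection-level object; that deposit step is deliberately left out here because it depends on the Fedora setup.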

Functionality added to workbench: save and retrieve result sets, other small improvements.

Next time, the call will be devoted to:

  • Now to end of grant: remaining deadlines/milestones
  • Plan for the future
Document generated by Confluence on Apr 19, 2009 15:05