This page last changed on Feb 23, 2008 by martinmueller@northwestern.edu.

Present: Bill Parod (chair and secretary), Phil Burns, Joe Paris, Brian Pytlik Zillig, Amit Kumar, and Loretta Auvil

Agenda:

1) Abbot update

Brian will send some sample files and schema to John Norstad.
Shell scripts are coming together. Earlier scripts performed the necessary transformations. Current work has been in generalizing the invoking scripts.
If we decide that Fedora should become a significant part of Monk, Steve and Brian will need to explore that.

Amit: What do your scripts use?
Brian: Saxon, for its XSLT 2.0 support.
Amit: Shell scripts are fine for prototyping, but in the end an ingestion mechanism should be built on a workflow system. Shell doesn't offer stop/pause/redo and fork/join features.
Brian: What is an example environment?
Amit: I have used jBPM from JBoss. Code is written in Java and workflows are composed in a graphical user interface.
Loretta: Can these workflows be done in SEASR?
Amit: Yes, but there are things required (batch processing and staging) that aren't yet in place in SEASR.
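
To make Amit's point about fork/join and redo concrete, here is a minimal sketch in plain Python. It is not jBPM, SEASR, or anything Abbot currently uses; the function name `run_workflow` and the step names in the usage note are hypothetical and only illustrate the features a shell pipeline lacks.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workflow(items, steps, completed=None):
    """Apply each (name, function) step to every item, in order.

    `items` maps an item id to its current value. `completed` is a set of
    (step name, item id) pairs that have already succeeded; passing it back
    in after a failure resumes the run without repeating finished work.
    """
    completed = set() if completed is None else completed
    for name, func in steps:
        pending = [key for key in items if (name, key) not in completed]
        with ThreadPoolExecutor(max_workers=4) as pool:
            # Fork: submit one task per pending item.
            futures = {key: pool.submit(func, items[key]) for key in pending}
            # Join: wait for every task before moving to the next step.
            for key, future in futures.items():
                items[key] = future.result()
                completed.add((name, key))
    return items, completed
```

A caller would pass something like `[("transform", transform), ("validate", validate)]` as `steps` (both names are placeholders). If a step raises, the caller still holds `items` and `completed` and can call `run_workflow` again with them to redo only the unfinished work.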

Bill: Which collections have been converted to TEI-A?
Brian: Currently Wright, and we plan to turn attention to EEBO soon. Wright is done except for soft hyphenation. We're exploring the use of <orig reg=>. Wright sometimes splits words at the end of a page. MorphAdorner attempts to regularize the use of dashes and word splitting. Will check with Martin about possible overlap between Abbot and MorphAdorner on this issue.
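
As an illustration of the <orig reg=> idea only, the sketch below wraps a split word so that the element content keeps the text as printed and the reg attribute carries the rejoined form. This is an assumption about how the markup might be applied, not Abbot's actual transformation (which is done in XSLT); the function name `mark_split_word` is hypothetical, and the sketch does not try to decide whether a hyphen is really soft or to handle an intervening page break.

```python
from xml.sax.saxutils import escape, quoteattr


def mark_split_word(first_half, second_half):
    """Wrap a word split by a line- or page-end hyphen in <orig reg="...">.

    The element content keeps the hyphenated text as printed; the reg
    attribute holds the rejoined form. The hyphen is assumed to be soft.
    """
    printed = first_half + "-" + second_half
    regularized = first_half + second_half
    # quoteattr adds the surrounding quotes and escapes the attribute value.
    return "<orig reg=%s>%s</orig>" % (quoteattr(regularized), escape(printed))
```

For example, `mark_split_word("some", "thing")` yields `<orig reg="something">some-thing</orig>`.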

2) Meandre update - What does Meandre imply for Monk architecture? - What are the respective roles for Data Stor API, Proxy, Fedora, and Meandre?

Loretta: As much as possible should be expressed as Meandre components, in order to enable reuse. John Unsworth didn't want to bring proxy functionality into Meandre at this time, but it could be done. Writing a Meandre component is not difficult: you must define its inputs, outputs, and parameters.

For example, a flow can pull text out of Fedora, create an NB model, and send back the top 10 words of each class. The user interface can go through the proxy.
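
The modeling step of such a flow can be sketched as follows, assuming "NB" means naive Bayes. This is plain Python, not a Meandre component, and it reads labeled texts from memory rather than from Fedora; the function name `top_words_per_class` and the ranking rule (log-likelihood of a word in the class minus its log-likelihood in all other classes, with add-one smoothing) are illustrative choices, not something decided in the meeting.

```python
import math
from collections import Counter, defaultdict


def top_words_per_class(labeled_docs, n=10):
    """Return the n most class-indicative words for each class.

    `labeled_docs` is an iterable of (class label, text) pairs. Word
    likelihoods use multinomial naive Bayes with add-one smoothing; words
    are ranked by log P(word | class) - log P(word | other classes).
    """
    counts = defaultdict(Counter)
    for label, text in labeled_docs:
        counts[label].update(text.lower().split())

    total = Counter()
    for counter in counts.values():
        total.update(counter)
    vocabulary = set(total)

    result = {}
    for label, counter in counts.items():
        rest = total - counter  # word counts in all other classes
        in_size = sum(counter.values()) + len(vocabulary)
        out_size = sum(rest.values()) + len(vocabulary)

        def score(word):
            inside = math.log((counter[word] + 1) / in_size)
            outside = math.log((rest[word] + 1) / out_size)
            return inside - outside

        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(vocabulary, key=lambda w: (-score(w), w))
        result[label] = ranked[:n]
    return result
```

The default `n=10` matches the "top 10 words" in the example above; a real flow would also need tokenization and stop-word handling suited to the collection.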

Meandre developers have wrapped Fedora API-A and API-M.
Amit created Monk servlets that query Monk Data Stor API.
Meandre can perform filtering and sorting of search results.

Component architecture experiments underway in MONK/SEASR may have implications for Monk architecture. Amit can discuss more fully in a week or two.

We discussed organizing a face-to-face meeting in a few weeks.

3) Fedora content models - what do we want to store in Fedora and how do we want to access it?

NCSA has created plugins that are Fedora API-A and API-M wrappers.

Generation and management of identifiers for repository objects and their available components is an important aspect of repository content modeling. We will develop requirements for identifiers in discussion on the monk list.

Document generated by Confluence on Apr 19, 2009 15:04