March 2005 Developers meeting

Project wide meeting held in Baltimore March 18-20.

Overview

The core project collaborators (i.e. groups that contribute code) met from March 18-20 in Baltimore. Attendees came from Baltimore (http://www.openmicroscopy.org/site/about/development-teams/ilya), Boston (http://www.openmicroscopy.org/site/about/development-teams/peter), Dundee (http://www.openmicroscopy.org/site/about/development-teams/jason), and Madison (http://www.openmicroscopy.org/site/about/development-teams/kevin).

Schedule

  • March 18-20, NIA, NIH, Baltimore, USA

Some Rules:

  • Sorry, but we will have to be a bit strict on mealtimes etc-- 2 hr lunches etc are out. We have a lot to do, and a lot of people with limited time. Ilya says there are places close by, some that do carry out, etc.

  • We have a large amount to get through, so we will need to keep discussion focussed. Note we have time to discuss most issues throughout the 3 days. But long diatribes, and discussing plans for the future when we are doing current status (for instance), must be suppressed.

  • Sorry, no laptops running during talks (except as below). I know this is painful, but we have many presentations to get through. Checking email, pages, etc, can wait. Two very egregious things that happened last time: audience members conversing about ongoing presentation via IM, and doing personal emails and coding work during presentations and discussions. I know this is pretty drastic, and am happy to hear other feelings. Indeed, it is useful to be able to look up things, check things, etc. But this has to be minimized, in favor of what the group is doing, and discussing points openly. Sorry to be (:twisted:).

  • Notes: PDFs etc of presentations, notes from talks will be taken by one person (changes each session ???), placed on Tiki.

  • Presentations MUST be kept to time, so will require planning, practice etc. Where PPTs are used, plan less than half the allotted time for presentation, rest for discussion.

  • Where possible, having general visions of priorities for each group (what each group will do, and what that will depend on) would be great to have ahead of meeting-- on Tiki.

March 18

Talks scheduled for 20 min, allowing ~20min for discussion.

Theme for Morning: Status.

Goal here is to review what we have, what each element does, what we have tested, and what we need to consider in the future, but only as bullet points. Focus on major tools and functions that can be of interest to others in group. Full discussion of future to occur later.

           9-9:30 Arrive, meet, etc.
           9:30  Overview, any necessary points (JRS, IGG)
           9:45 - 10:30 Server (IGG)
           10:30 - 11:30 Java/Shoola  (AF/JMB/HSH)
           11:30 - 12:15 Java/VisBio (KE/CR)
           12:15 - 1:15 Lunch
   

Theme for Afternoon: Status, then Outline of Future Plans

           1:15 - 2:00 Remote Client Infrastructure Profiling (CA/HSH)
           2:00 - 2:45 Classification and Automated Analysis (IGG)
           2:45 - 3:45 Future Server Designs/Plans  I (IGG?JJ?)
           3:45 - 4:45  Future Server Designs/Plans II (Josh)
           4:45 - 5:15 Coffee
           5:30 - 6:15 Remote Client Infrastructure Design & Plans (CA)
           6:15 - 7:00 Nailing down the Issues for the Future (an explicit list of issues we'll resolve in next 2 days)
           7 - ? Dinner
   

March 19

Theme: APIs: What we have, Where we go

           9-9:30 Arrive, meet, etc.
           9:30  Overview, any necessary points (JRS, IGG)
           9:45 - 1:30 APIs.... (IGG/AF/HSH to lead?)
           1:30 - 2:30 Lunch
           2:30 - 5:30 APIs Designs and Implementations
           5:30 - 6:30 Breakouts  I. OME XML-- Status of OME   II. Clients
   

March 20

           TBD
   

Notes and Workgroups (below)

  • Briefings - Friday was spent giving presentations. Most presentations were status reports on existing components. Other presentations summarized research and experiments with new technologies (e.g. ICE, OWL).

  • Future Goals - Each group outlined their OME-related goals for the coming year. These goals include: extend the OMEIS rendering model, finish a hierarchy viewer, make the classifier usable, extend the Web UI, offer more support for Annotations, and expand the user base.

  • OME-TIFF - Kevin & Curtis from the LOCI group at UW Madison proposed an OME-TIFF file specification. OME-TIFF is an OME-XML file variant that stuffs OME-XML into the header of a multipage TIFF. The binary pixel data is stored in TIFF pages and is indexed by OME-XML.

  • Updates to OME-XML - Minor fixes to the OME-XML spec.

  • Housekeeping - Mailing list procedures, switching from cvs to subversion, procedures for committing code.

  • Miscellaneous Tasks - These were written in the notes, but don't fall under any of the defined categories, or are indecipherable.

  • New Tech Research - OWL, Hibernate, and ICE are being considered for future use.

  • Client API - We need a high level client API to simplify the process of writing clients and to allow optimization of common data queries. This is roughly divided into two sections which are listed below.

  • Named Queries - A set of high level remote procedure calls to represent common queries. These will ease the burden on client developers and allow us to optimize access to our current data store.

  • Client Analysis and Annotation - Strategies for storing and retrieving results of interactive client side analysis. The culmination of this will be an API and sandbox example.


Briefings

We spent the first day of the meeting giving presentations. These ranged from status reports on existing components to summaries of forays into new technologies.

Server briefing: Ilya

Ilya described the [new expanded search criteria] [old link: http://www.openmicroscopy.org/APIdocs/OME/Factory.html#search_criteria] and urged people to use it when developing new code.

There may be some concerns relating to access controls. The underlying system is ok, but there may be some permissions problems. Instances of semantic types don't have ownership - ownership is inherited from containers. But, module executions have separate ownership, leading to potential problems. The solution is that attributes will belong to whoever ran the module. Ideally, the client will hide these details from users.

WebUI API: Josiah

The WebUI is an underused rapid development environment. It is the only client that can generate UIs for display, search, select, and creation of types that were unknown at time of code generation. Additionally, it has an easily extensible API that supports display, search, selection, creation of arbitrary objects, and building custom applications.

Shoola ROI tools: Andrea & Jean-Marie

Users can manually outline a Region of Interest (ROI), then run statistical analysis on it. This is used extensively in the Swedlow lab for FRAP experiments.

Shoola: Harry

The chainbuilder and zoomable browser are stable in the Shoola release. Feedback from others is needed to inform design revisions and changes.

Work has been done to augment the back-end to support requests for data history information. This will provide the basis for an implementation of a history browser. Other future plans include development of a general data browser.

Several wish list items for future work were suggested, including multi-threaded data retrieval, batching of multiple OMEDS requests, greater expressivity and extensibility of the data management server, a centralized data manager, a persistent local cache for static data, and a plug-in architecture.

VisBio: Curtis & Kevin

Remote Client Infrastructure Profiling: Harry

Harry presented the results of some tests designed to identify gross bottlenecks in the retrieval of data from the back-end to OME-JAVA. Based on instrumentation of OME-JAVA and the associated Perl facades on the back-end, it appears that the construction of the DTOs is the largest bottleneck. This is because of the approach taken to DTO building: for each object in a result set, a separate query is implicitly run for each foreign key.

Alternative approaches to generation of complex queries might provide improved performance. For example, replacing the per-foreign-key queries with a single query that returns a wide table based on multiple joins might be more efficient. However, this would not be a trivial project.
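
To make the bottleneck concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table and column names (images, datasets, dataset_id) are invented for illustration and are not the OME schema, and the real DTO code lives in Perl and Java. The sketch only contrasts the two access patterns: one extra query per row to follow a foreign key, versus a single join that returns one wide row per image.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory database (hypothetical schema, not OME's)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE datasets (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT,
                             dataset_id INTEGER REFERENCES datasets(id));
    """)
    db.executemany("INSERT INTO datasets VALUES (?, ?)",
                   [(1, "FRAP"), (2, "Screen")])
    db.executemany("INSERT INTO images VALUES (?, ?, ?)",
                   [(1, "a.tif", 1), (2, "b.tif", 1), (3, "c.tif", 2)])
    return db


def load_per_foreign_key(db):
    """One query for the result set, then one more query per row."""
    queries = 1
    result = []
    rows = db.execute(
        "SELECT id, name, dataset_id FROM images ORDER BY id").fetchall()
    for image_id, name, dataset_id in rows:
        # Follow the foreign key with a separate query for every row.
        (dataset_name,) = db.execute(
            "SELECT name FROM datasets WHERE id = ?", (dataset_id,)).fetchone()
        queries += 1
        result.append({"id": image_id, "name": name, "dataset": dataset_name})
    return result, queries


def load_with_join(db):
    """A single joined query returning one wide row per image."""
    rows = db.execute("""
        SELECT i.id, i.name, d.name
        FROM images AS i JOIN datasets AS d ON d.id = i.dataset_id
        ORDER BY i.id
    """).fetchall()
    result = [{"id": i, "name": n, "dataset": d} for i, n, d in rows]
    return result, 1


db = make_db()
slow, slow_queries = load_per_foreign_key(db)
fast, fast_queries = load_with_join(db)
```

Both functions build the same list of records. The first issues one query plus one per image, so its cost grows with the size of the result set, while the second always issues a single query.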

Remote Client Infrastructure Profiling: Chris

Classification & Automated Analysis: Ilya

A [general and robust method of automated image classification using supervised learning][old-link http://trac.openmicroscopy.org.uk/ome/wiki/Generalized%2BImage%2Bclassification] has been developed over the past year and a half by folks in Ilya's lab. It has been tested on many different types of training sets. It exists in MATLAB, and will soon be swallowed by OME. A paper describing it should be coming out soon.

Future server plans: Josiah

Short term goals:

  • Establish conventions & APIs to represent edits using data history (i.e. MEXs).
  • Allow modules to have multiple outputs of the same ST.
  • Allow variable or untyped inputs to modules.
  • Implement full support for global outputs.

Long term goals:

  • Unify package defined DBObjects and SemanticTypes.
  • Replace loosey-goosey features with more specific STs.
  • Tie constant values to free inputs of chains.
  • Clarify input arity in the data history, and allow a fine grained representation of data history.
  • Execute chains on things other than datasets (e.g. Micro-Array data).
  • By hook or crook, achieve OWL interoperability.

OmniGraffle presentation

Research & experiments with an OWL server: Josh

In the five weeks preceding the meeting, Josh implemented a prototype OWL data store that handled the OME ontology. It was written in Java using Hibernate, PostgreSQL, Tomcat, JBoss, and some other stuff. Its current form is expected to have low performance, but it is an important proof of concept.

Remote Client Infrastructure Plans: Chris

ICE is a high performance middleware with bindings to almost every language except Perl. It would buy us a lot, but it requires us to first shift to a different data server.

Future Goals

These goals were outlined during the third and final day of the meeting. Goals that did not receive staffing are listed as "Ambitions" at the bottom of the page.


Client API

The meeting was themed around developing a high level client API. This API should simplify the process of writing clients and allow optimization of common data queries. This is roughly divided into two sections which are listed below.

Named Queries

A set of high level remote procedure calls to represent common queries. These will ease the burden on client developers and allow us to optimize access to our current data store.

Persons responsible: Harry, Andrea, Jean-Marie, and Tony
TIME: 3 months

Client Analysis and Annotation

Strategies for storing and retrieving results of interactive client side analysis or annotation. The culmination of this will be an API and a sandbox example.

Persons responsible: Josiah
TIME: 3 months

OME-TIFF

OME-TIFF is an OME-XML file variant that stuffs OME-XML into the header of a multipage TIFF. The binary pixel data is stored in TIFF pages and is indexed by OME-XML.

Persons responsible: Kevin & Curtis
TIME: 3 months?

New Tech Research: Investigate OWL

Josh spent the 5 weeks leading up to the meeting prototyping an OWL data server (OWL-DS) and loading it with OME's ontology. The goal was to model what our backend would look like if we decide to migrate to OWL. In the months ahead, he will be profiling his DS and evaluating the feasibility of migration.

Persons responsible: Josh
TIME: 6 months?

New Tech Research: Speed up data access (long term)

Shoola is slow to load a large list of images. More generally, client access is slow for large structured piles of data. We are trying to cast a wide net for solutions. One team is going to investigate the long term solution of using ICE (middleware) on top of Hibernate (relational mapping layer), and migrating our data server to Java.

Persons responsible: Harry, Josh, and Chris
TIME: Indeterminate

Image Rendering

Port the rendering model developed in Shoola to OMEIS.

        Persons responsible: Andrea, Jean-Marie, Chris
        TIME: 3 months
   

Hierarchy viewer

Displays a tree view of data that can be arranged in a hierarchy. Immediate uses are Project-Dataset-Images, Screen-Plate-Well-Images, and CategoryGroup-Category-Images.

        Persons responsible: Andrea, Jean-Marie, Tony, and Harry
        TIME: 1 month
   

[Generalized Image classification] [old-link http://trac.openmicroscopy.org.uk/ome/wiki/Generalized%2BImage%2Bclassification]

Finish integrating the classifier into OME, and expose it to end users through the Web UI.

       Persons responsible: Josiah, Tomasz
       TIME: 3 months
   

WebClient

Make it easy for sys-admins or power users to extend the annotation forms available in Web UI for needs of individual labs. Write a simple tutorial that includes sandbox examples of how to write custom applications with the Web UI.

        Persons responsible: Josiah
        TIME: 2 months
   

Expand the User base

Ilya's group has plans to expand the local user base into labs within the institute (NIA/IRP). Happy users in a few labs means we could start a road show, and aim for a broader base in Bethesda. The hope is that users would attract developers and sys admins. We need more developers to maintain the existing code base.

Other plans to expand the user base include putting together an [OME-knoppix demo] [old-link http://nacho.mit.edu/ome_knoppix/ ], organizing an OME user day, pointing people to ome/getting-started.ppt, setting up a demo server, and using vnc2flash to record a demo movie.

        Persons responsible: Josiah & Ilya (expanding NIH user base), Tony (knoppix), Jason (ome/getting-started.ppt)
        TIME: ???
   

Ambitions.

These goals got no staffing and no timeline. They are ambitions that may not be achieved.

Stripped down installation

Package a base level of OME intended solely for data management.

Importer

Tony needs a better understanding of the Import process. The solution, IMHO, is to compile a how-to guide for writing importers. The closest thing we have currently is a paragraph in the [documentation of importing files into OME] [old-link http://www.openmicroscopy.org/system-admin/import.html]

OME-TIFF

During the meeting, Kevin & Curtis from the LOCI group at UW Madison proposed an OME-TIFF file specification. OME-TIFF is an OME-XML file variant that stuffs OME-XML into the header of a multipage TIFF. The binary pixel data is stored in TIFF pages and is indexed by OME-XML.

They have a lot of users who demand stand alone files that are compatible with existing programs. Storing metadata is crucial, but most of their existing programs won't read OME-XML. The solution is to stuff OME-XML into the header of a multipage TIFF.

Nuts & Bolts

  • LSIDs: IDs in an OME-XML document come in two flavors: unique within the document or properly formed globally unique LSIDs. The latter flavor is very useful if multiple documents need to refer to the same physical microscope. The former flavor is sufficient for referencing a Pixel set that will never be referred to outside of the document.

  • Linking to Binary PixelData: We need to make a new tag "TIFF-5d" or "TIFF-IFD" that can be used to point to pixel planes. This would be used instead of "BinData". They will discuss these things with Ilya.

  • Help with backend Importer: Ilya will help them with this.

  • A disk space optimization was suggested that involves an extension to OMEIS to read pixel data directly from OME-TIFF files.
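
The two ID flavors described in the first bullet can be told apart by syntax alone. The following Python sketch assumes the general LSID layout urn:lsid:authority:namespace:object with an optional revision; the example identifiers and the function names are made up for illustration, and the check is purely syntactic (it does not resolve the LSID or verify that it is actually unique).

```python
import re

# General LSID layout: urn:lsid:<authority>:<namespace>:<object>[:<revision>]
LSID_PATTERN = re.compile(
    r"^urn:lsid:(?P<authority>[^:\s]+):(?P<namespace>[^:\s]+)"
    r":(?P<object>[^:\s]+)(?::(?P<revision>[^:\s]+))?$",
    re.IGNORECASE,
)


def parse_lsid(identifier):
    """Return the LSID parts as a dict, or None for a document-local ID."""
    match = LSID_PATTERN.match(identifier)
    return match.groupdict() if match else None


def is_global(identifier):
    """True when the ID is a well-formed LSID, False when it is document-local."""
    return parse_lsid(identifier) is not None


# Hypothetical IDs: a shared microscope and a document-local Pixels set.
shared = parse_lsid("urn:lsid:example.org:Instrument:42")
local = parse_lsid("Pixels:0")
```

An importer could use such a check to decide whether an ID may be matched against objects from other documents (the shared microscope case) or must be treated as private to the file.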

Updates to OME-XML

Minor fixes to the OME-XML spec

  • add a Pixels annotation
  • add PlaneInfo
  • Dimensions (Image ST): add a Pixels reference. nuke PixelSizeC and PixelSizeT

Who is responsible???

Housekeeping

Mailing list procedures, switching from cvs to subversion, procedures for committing code.

Subversion

An alternative to cvs that we are considering switching to.

  • Repository wide revision control makes rollbacks easier.
  • Creating, managing, and merging branches is easier.
  • The scripts that generate an RSS feed from code submits will need to be rewritten, and developers will have to learn a new set of commands.
  • Will there still be public access via the web? e.g. http://cvs.openmicroscopy.org.uk/horde/chora/
  • Subversion is supported in BBEdit from version 8.1: http://www.barebones.com/support/bbedit/updates.shtml

Mailing lists

  • new mailing list:
    ome-users

  • The intended usage: "...a new list ("ome-users") has been created for user support requests and generic questions. Messages that were sent to "ome-devel@lists.openmicroscopy.org.uk" should now be sent to "ome-users@lists.openmicroscopy.org.uk" unless you have re-subscribed to "ome-devel" and that message is related to OME development."

  • Also: "Many of you have noticed the dearth of traffic on this list. In fact, the OME core developers have used another list, OME Nitpick. This is a closed list, which doesn't help the larger group see the work going on on the project. We will therefore be moving all of our technical discussions to this list, the OME Developers list. Note that you will see an increase in traffic volume. If you want to adjust the volume of mail you see, get digests etc, just follow the link at the bottom of this email."
    ome-devel

Procedures for committing code

  • Don't commit on Friday night.
  • Write unit tests. For real.
  • Branch for experiments. Warn the group before merging into HEAD.
  • We should set up more smoke tests on more boxes.

Communication policies:

  • Develop proposals in working groups, then report back to the larger group.

Miscellaneous Tasks

These tasks either didn't fall into any of the big categories or were indecipherable.

  • Publicly document the RSS feed of code commits. Ilya
  • OME-XML validator: Tabled for a grant to be written by Jason
  • Possible use cases for sharing data between OME installations. Jason
  • Rationalizing/normalizing types around annotations. Jean-Marie, Andrea, Tony
  • Annotation: Is ROI annotation in 80/20? Tony, Eric

Madison list

  • start VisBio from the WebUI: Curtis & Josiah
  • ImageJ plugin: Harry will cooperate on bugs.
  • Discussion of File server etc. for import. Chris

New Tech Research

This working group came out of the March 2005 Developer Meeting. It is tasked with evaluating OWL, Hibernate (http://www.hibernate.org/), and ICE for future use. Everyone would like OME to interoperate with OWL. If an OWL data server is up to the job, we would also like to migrate our current OME-DS to an OWL-DS (in the long run).

Hibernate (a relational mapping layer) and ICE (high performance binary middleware) are being considered for use on a Java DS. They could be used with or without OWL.

OWL

Rather than fully migrating the current data store to an OWL data server, develop an OWL-DS in parallel and provide some means of interoperability. This would provide a short term payoff (e.g. storing experimental metadata) for developing an OWL-DS. Once performance and stability goals are met by the OWL-DS, OME-DS would be migrated.

  • Plan A: Two way communication between the servers.
  • Plan B: One way communication from OME-DS to OWL-DS. The justification is that the core code of OME has little use for experimental ontologies.

Open Questions:

  • Would a unified API be presented to clients (i.e. one that hides the OME-DS & OWL-DS split)?
  • Is a unified API even necessary if we develop communication within a suite of clients?

Supporting Technologies:

  • D2RQ: Could give OME-DS an RDF interface. There would be a one-to-one mapping from an ST to a D2RQ XML mapping file.
  • Annotea: provides a limited view of RDF from XML or HTML.

Josh will be profiling his OWL prototype, and doing general OWL investigation.

Hibernate and ICE

ICE would require us to port the backend to another language. The most likely candidate is Java. Below is a possible layering of the technologies.

        RDB < --- > (Hibernate w/ OME Logic) < --- > ICE < --- > Client
   

Harry, Joshua, and Chris will investigate Hibernate and ICE.

Named Queries

The Named Queries working group is tasked with developing a set of high level remote procedure calls to represent common queries. These will ease the burden on client developers and allow us to optimize access to our current data store. This is a section of the larger Client API.

Example: getUserPDIgraph() would return the ubiquitous Project-Dataset-Image graph consisting of those objects owned by the logged in user.

Benefit: Because the query mechanisms will be hidden behind the API, client developers will be able to hack the system without having to grok the data model. This will also allow query optimization, and should ease any hypothetical backend migrations.
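
As a rough illustration of what such a call could look like behind the API, here is a Python sketch of getUserPDIgraph() backed by one SQL query. Everything in it is an assumption made for the example: the schema is invented and simplified (each dataset belongs to exactly one project, each image to exactly one dataset), the owner is a plain string, and the real implementation would sit on the Perl back-end rather than SQLite.

```python
import sqlite3

# Hypothetical, simplified schema; the real OME tables differ.
SCHEMA = """
CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT, owner TEXT);
CREATE TABLE datasets (id INTEGER PRIMARY KEY, name TEXT, project_id INTEGER);
CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT, dataset_id INTEGER);
"""

# The query is hidden behind the API call, so it can be tuned or replaced
# without touching client code.
PDI_QUERY = """
SELECT p.name, d.name, i.name
FROM projects AS p
LEFT JOIN datasets AS d ON d.project_id = p.id
LEFT JOIN images AS i ON i.dataset_id = d.id
WHERE p.owner = ?
ORDER BY p.id, d.id, i.id
"""


def get_user_pdi_graph(db, user):
    """Return {project: {dataset: [image, ...]}} for one user in a single query."""
    graph = {}
    for project, dataset, image in db.execute(PDI_QUERY, (user,)):
        datasets = graph.setdefault(project, {})
        if dataset is None:
            continue  # project without datasets
        images = datasets.setdefault(dataset, [])
        if image is not None:
            images.append(image)
    return graph


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
db.executemany("INSERT INTO projects VALUES (?, ?, ?)",
               [(1, "Mitosis", "harry"), (2, "Empty", "harry"), (3, "Other", "tony")])
db.executemany("INSERT INTO datasets VALUES (?, ?, ?)",
               [(1, "Day 1", 1), (2, "Day 2", 1)])
db.executemany("INSERT INTO images VALUES (?, ?, ?)",
               [(1, "a.tif", 1), (2, "b.tif", 1)])
graph = get_user_pdi_graph(db, "harry")
```

The client only sees the function name, its parameter, and the shape of the returned graph, which is the documentation this working group is asked to write down for each common query.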

This effort needs explicit documentation of common queries, their parameters, and the graphs returned. This list will be fleshed out by client developers in the coming months. If you have something to add, but don't have tiki access, post your request to ome-devel.

Named Queries: Inventory of Common Queries

The implementation strategy outlined at the meeting is to develop a custom SQL query for each API call, then translate the results of the query directly into xml-rpc, bypassing object instantiation and SOAP::Lite. The initial goal is to establish a proof of concept such as retrieval of a Project-Dataset-Image grouping. Since the meeting, folks have been discussing a more general solution for SQL generation and parsing. As that conversation develops, updates should be posted here.

Named Queries: Implementation Strategies
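
The strategy of going from SQL rows straight to xml-rpc can be sketched as follows. This is a Python stand-in, not the planned Perl code: it uses the standard library's xmlrpc.client marshaller and an invented images table, and the function name named_query_response is hypothetical. The point it illustrates is that rows are turned into plain structs of primitives and serialized, with no DTO or database object built in between.

```python
import sqlite3
import xmlrpc.client


def named_query_response(db, sql, params=()):
    """Run one SQL query and marshal the rows directly into an XML-RPC response."""
    cursor = db.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    # Plain dicts of primitives stand in for the rows; no DTO classes are built.
    rows = [dict(zip(columns, row)) for row in cursor]
    return xmlrpc.client.dumps((rows,), methodresponse=True)


# Hypothetical table and data for the demonstration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO images VALUES (?, ?)", [(1, "a.tif"), (2, "b.tif")])

payload = named_query_response(db, "SELECT id, name FROM images ORDER BY id")

# A client would decode the payload back into lists and structs.
decoded, _ = xmlrpc.client.loads(payload)
```

Note that this marshaller rejects NULL values unless allow_none is enabled, so a real implementation would need a convention for missing attributes.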

The truly important task is to compare the performance of an optimized interface with our minimal target performance. We really need to establish whether it's possible to live with Perl.

Named Queries: Optimization results

High throughput quantitative analysis needs lots of numbers without objects. We need specific requirements and use cases before we can get very far.

Named Queries: High performance Use Cases

This working group is currently made up of Harry, Andrea, Jean-Marie, and Tony.

Client Analysis

Strategies for storing and retrieving results of interactive client side analysis. The culmination of this will be an API and a sandbox example.

Representing Edits

Linked together through MEXs. Josiah needs to flesh out this description.

Three tiered API for client analysis or annotation

Integration of client analysis or annotation comes in three tiers. The first tier requires the client developer to be familiar with the data types they will use for storage. The second tier additionally requires some basic knowledge of modules. The third tier would require knowledge of chains, but will not be developed at this point.

Tier 1: Basic annotation

The results of the client will be stored as the output of the "Annotation" Module. The client developer must write XML definitions for any new or custom Semantic Types.

To save data, the developer would do something like:

annotationMEX = IndependentAnalysisFacade.startAnnotationMEX( );
attributes = annotationMEX.writeOutputs( ST_data );
annotationMEX.finalize();

annotationMEX also has a cancel() method.

To retrieve data, the developer would do something like:

attributes = IndependentAnalysisFacade.getImageAnnotations( image, STs_wanted );

This would retrieve the most recently stored data of those types.

Tier 2: Storing results of a particular module

In addition to defining the SemanticTypes, the client developer would write a module specification to represent their client. The specification includes what types will be saved. The API calls are much the same, but require a moduleLSID.

To save data, the developer would do something like:

myMEX = IndependentAnalysisFacade.startMEX( moduleLSID );
attributes = myMEX.writeOutputs( ST_data );
myMEX.finalize();

myMEX also has a cancel() method.

To retrieve data, the developer would do something like:

attributes = IndependentAnalysisFacade.getMEXresults( moduleLSID, image );

This would retrieve the results of the most recent save. Other methods of interest:

       myFinishedMEX = IndependentAnalysisFacade.getLastMEX( moduleLSID );
       myListOfFinishedMEXs = IndependentAnalysisFacade.getListOfMEX( moduleLSID );
       attributes = myFinishedMEX.getOutputs();
   

I (Josiah) am still working out some of the details, but this should give you the gist.
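
For client developers who want to experiment before the server side API exists, the life cycle described above (start a MEX, write outputs, then finalize or cancel, and later read back the most recent results) can be mimicked with an in-memory mock. The Python sketch below is only that: the class bodies, the snake_case method names (start_mex for startMEX, get_last_mex for getLastMEX, and so on), the storage in a list, and the omission of the image argument are all assumptions made for illustration, not the design being worked out.

```python
class MEX:
    """A hypothetical in-memory module execution (MEX) record."""

    def __init__(self, finished, module_lsid):
        self._finished = finished
        self.module_lsid = module_lsid
        self.outputs = []
        self.state = "open"

    def _require_open(self):
        if self.state != "open":
            raise RuntimeError("MEX is already " + self.state)

    def write_outputs(self, st_data):
        self._require_open()
        self.outputs.extend(st_data)
        return list(self.outputs)

    def finalize(self):
        # Only finalized MEXs become visible to readers.
        self._require_open()
        self.state = "finalized"
        self._finished.append(self)

    def cancel(self):
        # A cancelled MEX discards its outputs and is never stored.
        self._require_open()
        self.state = "cancelled"
        self.outputs = []

    def get_outputs(self):
        return list(self.outputs)


class IndependentAnalysisFacade:
    """Mock facade: keeps finalized MEXs in a list, oldest first."""

    def __init__(self):
        self._finished = []

    def start_mex(self, module_lsid):
        return MEX(self._finished, module_lsid)

    def get_list_of_mex(self, module_lsid):
        return [m for m in self._finished if m.module_lsid == module_lsid]

    def get_last_mex(self, module_lsid):
        finished = self.get_list_of_mex(module_lsid)
        return finished[-1] if finished else None


facade = IndependentAnalysisFacade()
lsid = "urn:lsid:example.org:Module:1"  # made-up module LSID

first = facade.start_mex(lsid)
first.write_outputs([{"ST": "Threshold", "value": 10}])
first.finalize()

dropped = facade.start_mex(lsid)
dropped.write_outputs([{"ST": "Threshold", "value": 20}])
dropped.cancel()

second = facade.start_mex(lsid)
second.write_outputs([{"ST": "Threshold", "value": 30}])
second.finalize()

latest = facade.get_last_mex(lsid).get_outputs()
```

In this mock, the cancelled execution leaves no trace, and reading back returns the outputs of the most recent finalized save, which matches the retrieval behaviour described for both tiers.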

Josiah Johnston and Ilya Goldberg are tasked with developing this server side API. Andrea Falconi and Jean-Marie Burel are tasked with specifying their needs as client developers.

Document Actions