2013-11-06 OMERO.features Google Hangout (15:00 GMT / 10:00 EST)

Chris Coletta, Lee Kamentsky, Simon Li, Ivan Cao-Berg (15:30)

S: Last time we more or less agreed on the idea of storing features in a 2D array with metadata in the form of key-value annotations.

L: HDF5 should work as a storage mechanism, supports key-value attributes on arrays.

C: Could we store non-metadata as attachments, e.g. classification results, probabilities, etc?

L: Add a column to the table.
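As a rough illustration of the layout discussed so far (a 2D feature array, table-level key-value metadata, and classification results added as an extra column), here is a minimal sketch in plain Python. The `FeatureTable` class, its fields, and the example keys are hypothetical and are not part of OMERO or HDF5.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureTable:
    """Hypothetical in-memory stand-in for a stored feature table."""
    columns: list                                 # one name per column
    rows: list                                    # 2D array: one list of values per ROI
    metadata: dict = field(default_factory=dict)  # table-level key-value pairs

    def add_column(self, name, values):
        """Append a column, e.g. one classification result per row."""
        if len(values) != len(self.rows):
            raise ValueError("need exactly one value per row")
        self.columns.append(name)
        for row, value in zip(self.rows, values):
            row.append(value)


table = FeatureTable(
    columns=["mean_intensity", "area"],
    rows=[[0.42, 120.0], [0.57, 98.0]],
    metadata={"algorithm": "example", "version": "1"},
)
# Classification probabilities become just another column of the table.
table.add_column("class_probability", [0.9, 0.2])
```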

C: Scalability considerations: minimise I/O. Do we have a separate table file per image or per ROI, or do we group the features for all ROIs in one table? SciDB?

S: Two issues to consider, implementation in OMERO, and how we transfer the features between (non-OMERO) systems.

C: Can tables be related to each other, e.g. multiple versions of features for the same image/ROI? If so, is it up to the user to work out which one to use?

L: Have a relational-style DB: a glob of related tables fitting specified criteria.

L: It should be the user's responsibility to deal with the results. If they make a query that has 15 matching tables, all should be returned.

S: OMERO can handle the querying logic, but in what format should the results be returned? They need to include sufficient metadata to distinguish between the 15 tables.
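The idea of returning every matching table together with enough metadata to tell them apart could look roughly like this. The `find_tables` function, the dictionary layout, and the `roi_id`/`version` keys are assumptions for illustration only, not an OMERO API.

```python
def find_tables(tables, **criteria):
    """Return every table whose metadata matches all given key-value pairs."""
    return [
        t for t in tables
        if all(t["metadata"].get(k) == v for k, v in criteria.items())
    ]


# Hypothetical stored tables: two feature versions exist for ROI 7.
stored = [
    {"metadata": {"roi_id": 7, "version": "1"}, "rows": [[0.42, 120.0]]},
    {"metadata": {"roi_id": 7, "version": "2"}, "rows": [[0.40, 121.5]]},
    {"metadata": {"roi_id": 8, "version": "1"}, "rows": [[0.57, 98.0]]},
]

# All matches are returned; choosing between them is left to the caller.
matches = find_tables(stored, roi_id=7)
# Each result keeps its metadata so the versions can be told apart.
versions = [m["metadata"]["version"] for m in matches]
```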

L: Hibernate style object graph? Blob wrapping HDF5?

C: A Numpy matrix. Or Blaze, the next-generation Numpy, which allows multiple non-fixed dimensions.

L: HDF5 is a backing store for Numpy, arranges chunks of data optimally, supports sparseness.

C: So OMERO.features needs to return multiple objects: key-value map(s) and a Numpy matrix. Related: will be attending the PyData conference in New York; others might wish to go.

L: Likes the idea of using a ROI ID as an identifier and keeping the details of the ROI/image/etc separate from the features. Current state of the ROI-specification work? Would it cope with HCS data?

S: On hold pending grant. In the meantime there’s scope for us to provide requirements or suggest changes.

L: Consider CellH5. Will talk to Christoph Sommer.

S: Are we going to define some key-value pairs? Are they attached to only tables, or also rows and columns?

L: HDF5 only supports per-table attributes. Row annotations could be done with another column; column annotations would require a separate table where each row is the metadata for a column.
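A minimal sketch of that workaround, using plain Python lists rather than HDF5: row annotations live in an extra column, and column annotations live in a second table with one row per column. All column names, units, and the `annotation_for` helper are hypothetical.

```python
# Feature table: the last column holds a per-row annotation.
columns = ["mean_intensity", "area", "row_note"]
rows = [
    [0.42, 120.0, "touches image border"],
    [0.57, 98.0, ""],
]

# Separate table: one row of metadata per column of the feature table.
column_annotations = [
    {"column": "mean_intensity", "units": "a.u.", "channel": 0},
    {"column": "area", "units": "pixels", "channel": None},
    {"column": "row_note", "units": None, "channel": None},
]


def annotation_for(column_name):
    """Look up the metadata row that describes a given column."""
    for entry in column_annotations:
        if entry["column"] == column_name:
            return entry
    raise KeyError(column_name)
```

The cost of this layout is that the two tables must be kept in step: adding a feature column also means adding a row to the annotation table.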

S: OMERO should be able to handle returning multiple keys/values/tables.

I: Any restrictions on keys? Everyone has different requirements.

L: Metadata in OMERO is good, but at the feature level most people don’t care. Attempting to standardise feature names is hard, e.g. mean intensity could be normalised before or after calculations, and areas could be interpolated at boundaries. Maybe have some restrictions, e.g. image channel, ROI ID.

C: Have a standard set of base-level keys. If metadata is stored alongside the features in the same table (e.g. ROI ID), we need to know which columns are features and which are metadata.

L: Describe using a key-value pair with a standard name, or define a datatype for the column.
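The first option, a key-value pair with a standard name that marks which columns are metadata, might be sketched as follows. The key name `metadata_columns`, the table layout, and the `split_columns` helper are made up for this example; no such standard had been agreed at this point.

```python
# Hypothetical standard key naming the columns that hold metadata.
METADATA_COLUMNS_KEY = "metadata_columns"

table = {
    "columns": ["roi_id", "mean_intensity", "area"],
    "rows": [[7, 0.42, 120.0], [8, 0.57, 98.0]],
    "metadata": {METADATA_COLUMNS_KEY: ["roi_id"]},
}


def split_columns(table):
    """Return (metadata column names, feature column names)."""
    meta = set(table["metadata"].get(METADATA_COLUMNS_KEY, []))
    metadata_cols = [c for c in table["columns"] if c in meta]
    feature_cols = [c for c in table["columns"] if c not in meta]
    return metadata_cols, feature_cols


metadata_cols, feature_cols = split_columns(table)
```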

S: Something like location could be both metadata and a feature. Spend the next week or two thinking about the standard key-value pairs and column-types we need.

I: Can you store an array of features in a single cell?

L: 2D array would be nice.

S: Discussion to be continued on email list.

Previous: 2013-10-30 OMERO.features Google Hangout (14:00 GMT / 10:00 EDT)