
Feature Storage in OMERO

This is a proposal for storing image features in OMERO, following on from the work done in OMERO.features.

Initial use-case: Storage of WND-CHARM features in the IDR.

Summary

Initially:

In future:

Main issues:

OMERO.tables format

Metadata columns

A column containing information about the features, for example object IDs or parameters.

These must be of scalar type. The currently supported types are:

OMERO ID column types:

Numeric types:

String type:

There must be at least one ID column type to allow a feature row to be linked to an OMERO object. Multiple ID columns are allowed, for example to redundantly link a feature to both an ROI and an Image. Empty ID columns should be set to -1.
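As a minimal sketch of the -1 convention, the helper below picks out the OMERO objects a row is actually linked to. The row is modelled as a plain dictionary and the function name `linked_objects` is hypothetical; neither is part of the OMERO.tables API.

```python
EMPTY_ID = -1  # sentinel for an ID column with no linked object


def linked_objects(row, id_columns):
    """Return {column name: object ID} for the ID columns that are set."""
    return {name: row[name] for name in id_columns if row[name] != EMPTY_ID}


# A row linked to an ROI only; the ImageID column is empty.
row = {"RoiID": 5637, "ImageID": -1, "C": 0}
print(linked_objects(row, ["RoiID", "ImageID"]))
```

A client can use the resulting dictionary to decide which object a feature row belongs to without treating -1 as a real ID.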

Feature columns

A column containing a feature value or vector.

Feature columns can be of any type, including array types:

Array columns are useful when there are a large number of features, since the performance of PyTables decreases as the number of columns grows. A disadvantage of array columns is that they cannot be queried.

Column names

Column names beginning with _ are reserved for internal use. It is strongly recommended that names match the regular expression [[:alpha:]][[:word:]]*, i.e. any letter followed by zero or more alphanumeric characters or underscores. This is equivalent to [^\W\d_]\w* in the Python 2 re module, or [A-Za-z][A-Za-z0-9_]* if plain ASCII is used.
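The naming rules can be checked with a few lines of Python. This is an illustrative sketch written for Python 3, where `str` patterns are Unicode-aware by default; the function name `check_column_name` and its three return labels are hypothetical.

```python
import re

# [^\W\d_] matches a word character that is neither a digit nor an
# underscore, i.e. a letter; \w* then allows letters, digits and underscores.
RECOMMENDED_NAME = re.compile(r"[^\W\d_]\w*")


def check_column_name(name):
    """Classify a column name as 'reserved', 'recommended' or 'discouraged'."""
    if name.startswith("_"):
        return "reserved"
    if RECOMMENDED_NAME.fullmatch(name):
        return "recommended"
    return "discouraged"


for name in ["ImageID", "_internal", "Feature-A", "2nd"]:
    print(name, check_column_name(name))
```

Names that fall outside the pattern, such as those containing a hyphen, are not forbidden by this proposal; the sketch only flags them as departing from the recommendation.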

For ease of parsing, OMERO ID columns should be named in the form <Type><ID>; for example, an ImageColumn should be named ImageID. If there are multiple ID columns, the first one should be the most relevant where possible, so that users unfamiliar with the data can analyse the features more easily.
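A client that only has column names to go on could recover the object type from this convention roughly as follows. The helper `id_column_type` is hypothetical, and it assumes the suffix is the literal string "ID"; the column's actual type remains the authoritative source, since an unrelated feature name could also end in "ID".

```python
def id_column_type(name):
    """Return the object type for a name of the form <Type>ID, else None."""
    if name.endswith("ID") and len(name) > 2:
        return name[:-2]
    return None


columns = ["RoiID", "ImageID", "C", "Feature-A"]

# Keep the original column order, so the first entry is the most relevant ID.
id_columns = [(c, id_column_type(c)) for c in columns if id_column_type(c)]
print(id_columns)
```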

If the metadata includes the channel, Z-index or T-index the following names are recommended:

Zero-based indexing should be used for C/Z/T.

Column descriptions

Columns have an optional description field. This should be either empty or contain a JSON object, including the opening and closing braces ({ }). Top-level keys beginning with _ are reserved for internal use. All other keys will be ignored and can be used for client purposes.
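One way a client might read a description field is sketched below: an empty description yields no keys, and a JSON object is split into reserved (underscore-prefixed) and client keys. The function name `parse_description` and the `units` key are hypothetical; `_metadata` is taken from the example table in this proposal.

```python
import json


def parse_description(description):
    """Split a column description into (internal keys, client keys)."""
    if not description.strip():
        return {}, {}
    obj = json.loads(description)
    if not isinstance(obj, dict):
        raise ValueError("description must be empty or a JSON object")
    internal = {k: v for k, v in obj.items() if k.startswith("_")}
    client = {k: v for k, v in obj.items() if not k.startswith("_")}
    return internal, client


internal, client = parse_description('{"_metadata": true, "units": "px"}')
print(internal)  # reserved keys
print(client)    # keys free for client use
```

Only top-level keys are inspected, matching the rule that the reservation applies to top-level keys.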

Currently the following internal keys are defined:

Example

This example has:

| Column type | Roi | Image | Long | Double | DoubleArray[3] |
|---|---|---|---|---|---|
| Name | RoiID | ImageID | C | Feature-A | Feature-B |
| Description | {_metadata:true} | {_metadata:true} | {_metadata:true} | | |
| Example data | 101 | 23 | 1 | 10.54 | [0.23, 3.1, 2.6] |
| Example data | 5637 | -1 | 0 | -764567.889 | [-9.0, 12.1, 0.2] |
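The example table can be modelled in plain Python to show how a client might separate metadata columns from feature columns and look up the features for one ROI. This sketch makes two assumptions: that descriptions are written as strict JSON with quoted keys, and that `"_metadata": true` is what marks a metadata column, as the example suggests. The tuples, `is_metadata` and `features_for_roi` are hypothetical and do not use the OMERO.tables API.

```python
import json

# (column type, name, description) for each column in the example table.
columns = [
    ("Roi", "RoiID", '{"_metadata": true}'),
    ("Image", "ImageID", '{"_metadata": true}'),
    ("Long", "C", '{"_metadata": true}'),
    ("Double", "Feature-A", ""),
    ("DoubleArray[3]", "Feature-B", ""),
]

# The two example data rows, in column order.
rows = [
    (101, 23, 1, 10.54, [0.23, 3.1, 2.6]),
    (5637, -1, 0, -764567.889, [-9.0, 12.1, 0.2]),
]


def is_metadata(description):
    """Assumed rule: a column is metadata if its description sets _metadata to true."""
    return bool(description) and json.loads(description).get("_metadata") is True


names = [name for _, name, _ in columns]
metadata_names = [name for _, name, desc in columns if is_metadata(desc)]
feature_names = [name for _, name, desc in columns if not is_metadata(desc)]


def features_for_roi(roi_id):
    """Return the feature values of the first row linked to roi_id, or None."""
    for row in rows:
        record = dict(zip(names, row))
        if record["RoiID"] == roi_id:
            return {name: record[name] for name in feature_names}
    return None


print(metadata_names)
print(features_for_roi(5637))
```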

Versions

This is version 0.2 since an earlier proposal was called 0.1. OMERO annotations should use the namespace openmicroscopy.org/omero/features/0.2 for feature files.

Resources