

Introduction to Analysis Modules

In OME, the analysis module is the sole mechanism by which new attributes are created. The act of creating new data is encoded as a transformation that generates new information from existing data. Whilst analysis modules usually represent computational algorithms, modules can also describe transformations that have already happened, e.g. importing images, annotation, user administration, and install-time configuration.

A module receives input attributes for each of its formal inputs, performs its computation, and generates new output attributes for each of its formal outputs. The OME Analysis Engine executes the module's underlying algorithm implementation according to the module's execution instructions, using implementation-specific handlers. Currently there are handlers for command-line binaries, MATLAB scripts, and Perl scripts.

More information describing the theory of OME analysis modules can be found here.

Figure 1: Analysis Module Schematic

Defining Analysis Modules in XML


Analysis Modules are defined in XML according to the Analysis Module Library schemas. Figure 2 is the illustrative XML for the Analysis Module diagrammed in Figure 1. The complete XML for this module is in src/xml/OME/Analysis/HaralickTextures.ome.

Module XML definitions have three parts: (1) the Description, (2) the Declaration, and (3) the Execution Instructions.

          ModuleName="Haralick Features"
          The gray-level co-occurrence matrix is the two dimensional matrix of joint probabilities P_{d,r}(i,j) ...
          <FormalInput Name="Pixels Plane Slice" SemanticTypeName="PixelsPlaneSlice"         Count="!"/>
          <FormalInput Name="Texture Direction"  SemanticTypeName="HaralickTextureDirection" Count="!"/>
          <FormalInput Name="Texture Distance"   SemanticTypeName="HaralickTextureDistance"  Count="!"/>
          <FormalOutput Name="Angular Second Moment"           SemanticTypeName="CoOcMat_AngularSecondMoment"           Count="!"/>
          <FormalOutput Name="Contrast"                        SemanticTypeName="CoOcMat_Contrast"                      Count="!"/>
          <FormalOutput Name="Correlation"                     SemanticTypeName="CoOcMat_Correlation"                   Count="!"/>
          <FormalOutput Name="Variance"                        SemanticTypeName="CoOcMat_Variance"                      Count="!"/>
          <FormalOutput Name="Inverse Difference Moment"       SemanticTypeName="CoOcMat_InverseDifferenceMoment"       Count="!"/>
          <FormalOutput Name="Sum Average"                     SemanticTypeName="CoOcMat_SumAverage"                    Count="!"/>
          <FormalOutput Name="Sum Variance"                    SemanticTypeName="CoOcMat_SumVariance"                   Count="!"/>
          <FormalOutput Name="Sum Entropy"                     SemanticTypeName="CoOcMat_SumEntropy"                    Count="!"/>
          <FormalOutput Name="Entropy"                         SemanticTypeName="CoOcMat_Entropy"                       Count="!"/>
          <FormalOutput Name="Difference Entropy"              SemanticTypeName="CoOcMat_DifferenceEntropy"             Count="!"/>
          <FormalOutput Name="Difference Variance"             SemanticTypeName="CoOcMat_DifferenceVariance"            Count="!"/>
          <FormalOutput Name="First Measure of Correlation"    SemanticTypeName="CoOcMat_FirstMeasureOfCorrelation"     Count="!"/>
          <FormalOutput Name="Second Measure of Correlation"   SemanticTypeName="CoOcMat_SecondMeasureOfCorrelation"    Count="!"/>
          <FormalOutput Name="Maximal Correlation Coefficient" SemanticTypeName="CoOcMat_MaximalCorrelationCoefficient" Count="!"/>
          <ExecutionInstructions ExecutionGranularity="I" xmlns="" ... >
          <!-- SNIP !-->
Figure 2: Illustrative XML


Description

ModuleName="Haralick Features"
        The gray-level co-occurrence matrix is the two dimensional matrix of joint probabilities P_{d,r}(i,j) ...

The Module's Description is a series of one-liners that specify miscellaneous information about the module. Their individual meanings are summarised in Table 1, although the Module's ID deserves special mention.

The Module's ID is a Life Science Identifier (LSID). The LSID is a specification for uniquely naming data resources and is the fruition of an effort led by the Interoperable Informatics Infrastructure Consortium (I3C) to enable interoperability between informatics applications. Consider an example LSID of the form urn:lsid:<domain>:Module:7702. It contains four parameters separated by colons. Taking each element in turn: urn:lsid is a mandatory preface for LSID data; <domain> is the Internet domain of the organization that assigned the LSID to the data; Module is the name of the data resource; and the last parameter, 7702, is the name of the data element. There is another possible LSID parameter, the version number of the specific data element, which we do not use.

You will want to modify the LSID for modules you write by using your institution's fully qualified domain name (FQDN) as the domain parameter. You may choose whatever number you like for the name of your data element, provided it is unique. If two different modules have the same LSID, they cannot be imported into the same OME distribution.
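
As a rough illustration of the LSID layout described above, the following Python sketch splits an identifier into its parts. The names parse_lsid and LSID, and the domain example.org, are assumptions made for this example only; they are not part of OME.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LSID:
    authority: str           # Internet domain of the assigning organization
    namespace: str           # name of the data resource, e.g. "Module"
    object_id: str           # name of the data element, e.g. "7702"
    revision: Optional[str]  # optional version number; unused for OME modules


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:<domain>:<resource>:<element>[:<version>] into fields."""
    parts = text.split(":")
    has_preface = [p.lower() for p in parts[:2]] == ["urn", "lsid"]
    if len(parts) not in (5, 6) or not has_preface:
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"LSID has an empty field: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# example.org is a placeholder domain, not a real OME authority.
lsid = parse_lsid("urn:lsid:example.org:Module:7702")
print(lsid.authority, lsid.namespace, lsid.object_id)
```

A check of this kind could be run over a set of module definitions before import to spot two modules that share the same LSID.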

ModuleName Gives your module a name. Although the ModuleName serves mostly for user benefit, it must be unique because it will be used to refer to modules when defining nodes in an Analysis Chain.
Category A string that places the module in a hierarchical structure of categories. Because similar modules are grouped together, specific modules are easier to find, e.g. in the Chain Builder. A period (.) separates the levels of the hierarchy.
ModuleType The Perl package used to execute this module. This can be OME::Analysis::Handlers::CLIHandler for the command-line handler, OME::Analysis::Handlers::MatlabHandler for the MATLAB handler, or another Perl package if the module is implemented directly on the back-end.
ProgramID The meaning of ProgramID is specific to the ModuleType.
ID This string specifies the Module's Life Science Identifier (LSID) which is globally unique.
Description This string is a free-text description of the module's algorithm and implementation intended to inform potential users.
Table 1: Module Description XML
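
To make the Category convention concrete, here is a minimal Python sketch that splits a Category string into its hierarchy levels. The function name category_levels and the category value "Statistics.Texture" are hypothetical and chosen only for illustration.

```python
from typing import List


def category_levels(category: str) -> List[str]:
    """Return the hierarchy levels of a period-separated Category string."""
    levels = category.split(".")
    if any(level.strip() == "" for level in levels):
        raise ValueError(f"empty level in category: {category!r}")
    return levels


# "Statistics.Texture" is a made-up category, not one defined by OME.
print(category_levels("Statistics.Texture"))  # ['Statistics', 'Texture']
```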


Declaration

<FormalInput Name="Pixels Plane Slice" SemanticTypeName="PixelsPlaneSlice" Count="!"/>
<FormalOutput Name="Angular Second Moment" SemanticTypeName="CoOcMat_AngularSecondMoment" Count="!"/>

The Module Declaration defines the names, semantic types, and arity of a module's Formal Inputs and Outputs. Name is free text that describes the input or output, SemanticTypeName must be the name of an existing semantic type (ST) that was defined elsewhere, and Count is a character (see Table 2) that specifies how many attributes should be expected.

! Exactly one
+ One or more
? Zero or one
* Zero or more
Table 2: Permissible values for Count
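
The Count symbols in Table 2 can be read as lower and upper bounds on the number of attributes. The Python sketch below encodes them that way and reads the two declaration lines shown above. The <Declaration> wrapper element and the helper names count_allows and read_formals are assumptions for this example, not confirmed parts of the OME schema or API.

```python
import xml.etree.ElementTree as ET

# (minimum, maximum) attributes allowed per Count symbol; None means unbounded.
COUNT_BOUNDS = {
    "!": (1, 1),     # exactly one
    "+": (1, None),  # one or more
    "?": (0, 1),     # zero or one
    "*": (0, None),  # zero or more
}


def count_allows(symbol: str, n: int) -> bool:
    """Return True if n attributes satisfy the given Count symbol."""
    low, high = COUNT_BOUNDS[symbol]
    return n >= low and (high is None or n <= high)


# The wrapper element is assumed here so that the fragment parses on its own.
DECLARATION = """
<Declaration>
  <FormalInput Name="Pixels Plane Slice" SemanticTypeName="PixelsPlaneSlice" Count="!"/>
  <FormalOutput Name="Angular Second Moment" SemanticTypeName="CoOcMat_AngularSecondMoment" Count="!"/>
</Declaration>
"""


def read_formals(xml_text: str):
    """List (tag, Name, SemanticTypeName, Count) for each formal input or output."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag, el.get("Name"), el.get("SemanticTypeName"), el.get("Count"))
        for el in root
        if el.tag in ("FormalInput", "FormalOutput")
    ]


for tag, name, semantic_type, count in read_formals(DECLARATION):
    # Each formal here is declared with "!", so exactly one attribute is valid.
    print(tag, name, semantic_type, count_allows(count, 1))
```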

Execution Instructions

<ExecutionInstructions> define how to convert OME semantic type attributes into a format that the Module's algorithm implementation understands. Execution instructions are specific to an implementation. Go here to read about execution instructions for algorithms implemented as MATLAB scripts.

Using Analysis Modules

Analysis modules and their semantic types must be imported into OME before they can be used. Modules can be viewed with the WebUI and linked to form analysis chains using the Shoola Chain Builder.
