Introduction
Introduction to Analysis Modules
In OME, the analysis module is the sole mechanism by which new attributes are created. The act of creating new data is encoded as a transformation that generates new information from existing data. Although analysis modules usually represent computational algorithms, they can also describe transformations that have already happened, e.g. importing images, annotation, user administration, and install-time configuration.
A module receives input attributes for each of its formal inputs, performs its computation, and generates new output attributes for each of its formal outputs. The OME Analysis Engine executes the module's underlying algorithm implementation according to the module's execution instructions, using implementation-specific handlers. Currently there are handlers for command-line binaries, MATLAB scripts, and Perl scripts.
More information describing the theory of OME analysis modules can be found here.
Defining Analysis Modules in XML
Overview
Analysis Modules are defined in XML according to the Analysis Module Library schemas. Figure 2 shows illustrative XML for the Analysis Module diagrammed in Figure 1. The complete XML for this module is in src/xml/OME/Analysis/HaralickTextures.ome.
Module XML definitions have three parts: (1) the Description, (2) the Declaration, and (3) the Execution Instructions.
<AnalysisModule ModuleName="Haralick Features" Category="Texture"
                ModuleType="OME::Analysis::Handlers::MatlabHandler"
                ProgramID="mb_texture"
                ID="urn:lsid:openmicroscopy.org:Module:7702">
  <Description>
    The gray-level co-occurrence matrix is the two dimensional matrix of joint probabilities P_{d,r}(i,j) ...
  </Description>
  <Declaration>
    <FormalInput Name="Pixels Plane Slice" SemanticTypeName="PixelsPlaneSlice" Count="!"/>
    <FormalInput Name="Texture Direction" SemanticTypeName="HaralickTextureDirection" Count="!"/>
    <FormalInput Name="Texture Distance" SemanticTypeName="HaralickTextureDistance" Count="!"/>
    <FormalOutput Name="Angular Second Moment" SemanticTypeName="CoOcMat_AngularSecondMoment" Count="!"/>
    <FormalOutput Name="Contrast" SemanticTypeName="CoOcMat_Contrast" Count="!"/>
    <FormalOutput Name="Correlation" SemanticTypeName="CoOcMat_Correlation" Count="!"/>
    <FormalOutput Name="Variance" SemanticTypeName="CoOcMat_Variance" Count="!"/>
    <FormalOutput Name="Inverse Difference Moment" SemanticTypeName="CoOcMat_InverseDifferenceMoment" Count="!"/>
    <FormalOutput Name="Sum Average" SemanticTypeName="CoOcMat_SumAverage" Count="!"/>
    <FormalOutput Name="Sum Variance" SemanticTypeName="CoOcMat_SumVariance" Count="!"/>
    <FormalOutput Name="Sum Entropy" SemanticTypeName="CoOcMat_SumEntropy" Count="!"/>
    <FormalOutput Name="Entropy" SemanticTypeName="CoOcMat_Entropy" Count="!"/>
    <FormalOutput Name="Difference Entropy" SemanticTypeName="CoOcMat_DifferenceEntropy" Count="!"/>
    <FormalOutput Name="Difference Variance" SemanticTypeName="CoOcMat_DifferenceVariance" Count="!"/>
    <FormalOutput Name="First Measure of Correlation" SemanticTypeName="CoOcMat_FirstMeasureOfCorrelation" Count="!"/>
    <FormalOutput Name="Second Measure of Correlation" SemanticTypeName="CoOcMat_SecondMeasureOfCorrelation" Count="!"/>
    <FormalOutput Name="Maximal Correlation Coefficient" SemanticTypeName="CoOcMat_MaximalCorrelationCoefficient" Count="!"/>
  </Declaration>
  <ExecutionInstructions ExecutionGranularity="I" xmlns="http://www.openmicroscopy.org/XMLschemas/MLI/IR2/MLI.xsd" ... >
    <!-- SNIP !-->
  </ExecutionInstructions>
</AnalysisModule>
Description
<AnalysisModule ModuleName="Haralick Features" Category="Texture"
                ModuleType="OME::Analysis::Handlers::MatlabHandler"
                ProgramID="mb_texture"
                ID="urn:lsid:openmicroscopy.org:Module:7702">
  <Description>
    The gray-level co-occurrence matrix is the two dimensional matrix of joint probabilities P_{d,r}(i,j) ...
  </Description>
The Module's Description is a series of one-liners that specify miscellaneous information about the module. Their individual meanings are summarised in Figure 3, although the Module's ID deserves special mention.
The Module's ID is a Life Science Identifier (LSID). The LSID is a specification for uniquely naming data resources and is the fruition of an effort led by the Interoperable Informatics Infrastructure Consortium (I3C) to enable interoperability between informatics applications. Consider an example LSID such as urn:lsid:openmicroscopy.org:Module:7702. It contains four parameters separated by colons. Taking each element in turn, urn:lsid is a mandatory preface for LSID data; openmicroscopy.org is the Internet domain of the organization that assigned an LSID to the data; Module is the name of the data resource; and the last parameter, 7702, is the name of the data element. There is another possible LSID parameter, the version number of the specific data element, which we aren't using.
When you write your own modules, modify the LSID by substituting your institution's FQDN for openmicroscopy.org. You may choose whatever number you like for the name of your data element, provided it is unique. If two different modules have the same LSID, they cannot be imported into the same OME distribution.
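As a minimal sketch of the LSID change described above, the fragment below gives a module an ID under a different domain. The module name, ProgramID, domain (lab.example.edu), and number (1001) are all hypothetical placeholders, not values from the OME distribution.

```xml
<!-- Hypothetical module: every attribute value here is a placeholder. -->
<AnalysisModule
    ModuleName="Example Texture Features"
    Category="Texture"
    ModuleType="OME::Analysis::Handlers::MatlabHandler"
    ProgramID="example_texture"
    ID="urn:lsid:lab.example.edu:Module:1001">
  <!-- lab.example.edu stands in for your institution's FQDN;
       1001 is an arbitrary number that must not clash with another module. -->
  <Description>Placeholder description of the algorithm.</Description>
  <!-- Declaration and ExecutionInstructions omitted for brevity. -->
</AnalysisModule>
```

Only the domain and the final number differ in structure from the Haralick example; the urn:lsid preface and the Module resource name stay the same.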
ModuleName | Gives your module a name. Although the ModuleName serves mostly for user benefit, it must be unique because it will be used to refer to modules when defining nodes in an Analysis Chain. |
Category | Category is a string. Modules belong to a hierarchical structure of categories, which makes specific modules easier to find (e.g. in the Chain Builder) because similar modules are grouped together. A period (.) separates the levels of the hierarchy. |
ModuleType | Refers to the Perl package that will be used to execute this module. This can be OME::Analysis::Handlers::CLIHandler to use the command-line handler, OME::Analysis::Handlers::MatlabHandler for the MATLAB handler, or another Perl package if this module is directly implemented on the back-end. |
ProgramID | The meaning of ProgramID is specific to the ModuleType. |
ID | This string specifies the Module's Life Science Identifier (LSID) which is globally unique. |
Description | This string is a free-text description of the module's algorithm and implementation intended to inform potential users. |
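To illustrate the period-separated Category described in the table above, here is a hedged sketch of a module placed in a sub-category. The sub-category name "Haralick", the module name, ProgramID, and ID are assumptions made up for this example; the real Haralick module uses the single-level category "Texture".

```xml
<!-- Hypothetical: "Texture.Haralick" places the module in a "Haralick"
     sub-category nested under the top-level "Texture" category. -->
<AnalysisModule
    ModuleName="Example Nested Category Module"
    Category="Texture.Haralick"
    ModuleType="OME::Analysis::Handlers::MatlabHandler"
    ProgramID="example_program"
    ID="urn:lsid:lab.example.edu:Module:1002">
  <Description>Placeholder description.</Description>
  <!-- Declaration and ExecutionInstructions omitted for brevity. -->
</AnalysisModule>
```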
Declaration
<FormalInput Name="Pixels Plane Slice" SemanticTypeName="PixelsPlaneSlice" Count="!"/>
<FormalOutput Name="Angular Second Moment" SemanticTypeName="CoOcMat_AngularSecondMoment" Count="!"/>
The Module Declaration defines the names, semantic types, and arity of a module's Formal Inputs and Outputs. Name is free text that describes the input or output, SemanticTypeName must be the name of an existing ST that was defined elsewhere, and Count is a character (see Table 2) that specifies how many attributes should be expected.
! | Exactly one |
+ | One or more |
? | Zero or one |
* | Zero or more |
Table 2: Count characters and their meanings.
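The four Count characters can be sketched in one Declaration as follows. Only the first input is taken from the Haralick module; the other names and semantic types are hypothetical and would have to exist as STs before such a module could be imported.

```xml
<Declaration>
  <!-- "!" : exactly one attribute is expected (from the Haralick module). -->
  <FormalInput Name="Pixels Plane Slice" SemanticTypeName="PixelsPlaneSlice" Count="!"/>
  <!-- "?" : zero or one, i.e. an optional input (hypothetical ST). -->
  <FormalInput Name="Example Mask" SemanticTypeName="ExampleMask" Count="?"/>
  <!-- "+" : one or more attributes (hypothetical ST). -->
  <FormalInput Name="Example Offsets" SemanticTypeName="ExampleOffset" Count="+"/>
  <!-- "*" : zero or more attributes (hypothetical ST). -->
  <FormalOutput Name="Example Features" SemanticTypeName="ExampleFeature" Count="*"/>
</Declaration>
```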
Execution Instructions
<ExecutionInstructions> define how to convert from OME semantic type attributes to a format that the module's algorithm implementation understands. Execution instructions are specific to an implementation. Execution instructions for algorithms implemented as MATLAB scripts are described here.
Using Analysis Modules
Analysis modules and their semantic types must be imported into OME before they can be used. Modules can be viewed with the WebUI and linked to form analysis chains using the Shoola Chain Builder.