Semantic Types and Friends

Semantic types, semantic elements, data tables, and data columns

The concept of "semantic type" plays a key role in OME. A semantic type is essentially a struct or class data element as implemented in the database. The SEMANTIC_TYPES table contains the structures that are needed to define much of the data in the database. Semantic types are linked to semantic elements, which are the components that the semantic types are composed of. The two tables are linked by the SEMANTIC_TYPE_ID key.

Semantic types have four levels of granularity:

Global (G): Applies to the entire database
Dataset (D): Specific to a given dataset
Image (I): Specific to a given image
Feature (F): Specific to a feature
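
The one-letter codes above are what appear in the GRANULARITY columns in the examples that follow. As a minimal sketch, they could be modelled in client code like this; the class and function names are illustrative assumptions, not part of OME:

```python
from enum import Enum


class Granularity(Enum):
    """The four granularity codes listed above (illustrative name, not an OME class)."""
    GLOBAL = "G"    # applies to the entire database
    DATASET = "D"   # specific to a given dataset
    IMAGE = "I"     # specific to a given image
    FEATURE = "F"   # specific to a feature


def parse_granularity(code: str) -> Granularity:
    """Map a one-letter code from a GRANULARITY column to an enum member."""
    return Granularity(code)  # raises ValueError for an unknown code
```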

An example might help make this abstract discussion clearer.

If we look at semantic_type #8, we get the following output:

ome=# select * from semantic_types where semantic_type_id=8;
 semantic_type_id |    name    | granularity | description 
------------------+------------+-------------+-------------
                8 | Experiment | G           | 
(1 row)

This tells us that Experiment is a global semantic type, with id #8. Looking at the elements associated with Experiment, we see the following:

ome=# select * from semantic_elements where semantic_type_id=8;
 semantic_element_id | semantic_type_id |     name     | data_column_id | description 
---------------------+------------------+--------------+----------------+-------------
                  23 |                8 | Experimenter |             23 | 
                  22 |                8 | Description  |             22 | 
                  21 |                8 | Type         |             21 | 
(3 rows)

Thus, the Experiment type has three elements: Experimenter, Description, and Type.
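
To make the "struct or class" analogy concrete, the Experiment type can be pictured as a small record with one field per semantic element. This is only a sketch: the Python class is hypothetical, and the field types are assumptions based on the sql_type values that appear in the DATA_COLUMNS listing for this type.

```python
from dataclasses import dataclass, fields


@dataclass
class Experiment:
    """Illustrative stand-in for the Experiment semantic type (granularity G)."""
    experimenter: int   # reference to an Experimenter, shown as an integer id (assumption)
    description: str    # assumed to be a string
    type: str           # assumed to be a string


# The field names play the role of the semantic elements.
element_names = [f.name for f in fields(Experiment)]
```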

The DATA_COLUMN_ID column in the SEMANTIC_ELEMENTS table plays a key role in the translation between the abstract definitions of these semantic types and their implementation in the database. Specifically, DATA_COLUMN_ID is a reference to an entry in the DATA_COLUMNS table. This entry describes the instantiation of the element in the database.

Following up on our example, the entry for data_column_id = 23 is as follows:

ome=# select * from data_columns where data_column_id=23;
 data_column_id | data_table_id | column_name  | description | sql_type  | reference_type 
----------------+---------------+--------------+-------------+-----------+----------------
             23 |             8 | EXPERIMENTER |             | reference | Experimenter
(1 row)

Thus, in the database, the semantic element with ID 23 is implemented as a reference to the table Experimenter, that is, as a foreign key.
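
The two lookups above can be combined into a single join on DATA_COLUMN_ID. The sketch below uses an in-memory SQLite database as a stand-in for the PostgreSQL instance used in this document. It loads only the one row from each table shown above and omits the empty description columns, so it illustrates the relationship rather than reproducing the real OME schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the "ome" PostgreSQL database
conn.executescript("""
CREATE TABLE semantic_elements (
    semantic_element_id INTEGER PRIMARY KEY,
    semantic_type_id    INTEGER,
    name                TEXT,
    data_column_id      INTEGER
);
CREATE TABLE data_columns (
    data_column_id INTEGER PRIMARY KEY,
    data_table_id  INTEGER,
    column_name    TEXT,
    sql_type       TEXT,
    reference_type TEXT
);
-- Rows copied from the query output shown above.
INSERT INTO semantic_elements VALUES (23, 8, 'Experimenter', 23);
INSERT INTO data_columns VALUES (23, 8, 'EXPERIMENTER', 'reference', 'Experimenter');
""")

# Follow DATA_COLUMN_ID from the abstract element to its concrete column.
row = conn.execute("""
    SELECT se.name, dc.column_name, dc.sql_type, dc.reference_type
    FROM semantic_elements AS se
    JOIN data_columns AS dc ON dc.data_column_id = se.data_column_id
    WHERE se.semantic_element_id = 23
""").fetchone()
print(row)
```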

Note the DATA_TABLE_ID field in the DATA_COLUMNS table. This field points to the entry in the DATA_TABLES table that implements the semantic type. Thus, if we find all of the columns in DATA_COLUMNS with data_table_id = 8, we see the three semantic elements for this semantic type:

ome=# select * from data_columns where data_table_id=8;
 data_column_id | data_table_id | column_name  | description | sql_type  | reference_type 
----------------+---------------+--------------+-------------+-----------+----------------
             23 |             8 | EXPERIMENTER |             | reference | Experimenter
             22 |             8 | DESCRIPTION  |             | string    | 
             21 |             8 | TYPE         |             | string    | 
(3 rows)

Finally, if we look at the entry in DATA_TABLES where data_table_id = 8, we see:

ome=# select * from data_tables where data_table_id=8;
 data_table_id | granularity | table_name  | description 
---------------+-------------+-------------+-------------
             8 | G           | EXPERIMENTS | 
(1 row)

The entries in DATA_COLUMNS and DATA_TABLES are sufficient to generate the tables that store the actual data:

ome=# select * from experiments;
 attribute_id | module_execution_id | type | description | experimenter 
--------------+---------------------+------+-------------+--------------
(0 rows)

Each table has an ATTRIBUTE_ID field (essentially, a unique id # for each entry in that table), a MODULE_EXECUTION_ID (more about that later), and all of the fields defined in the associated data columns.
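
As a rough illustration of how the metadata can drive table creation, the sketch below builds a CREATE TABLE statement from the DATA_TABLES and DATA_COLUMNS rows shown above. It is not OME's actual generator. It runs against an in-memory SQLite database rather than PostgreSQL, and the mapping from sql_type names to column types, the helper name create_table_sql, and the use of NULL for the empty reference_type values are all assumptions made for the example.

```python
import sqlite3

# Assumed mapping from the sql_type names in DATA_COLUMNS to SQLite column types.
SQL_TYPES = {"string": "TEXT", "reference": "INTEGER"}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data_tables (data_table_id INTEGER PRIMARY KEY,
                          granularity TEXT, table_name TEXT);
CREATE TABLE data_columns (data_column_id INTEGER PRIMARY KEY, data_table_id INTEGER,
                           column_name TEXT, sql_type TEXT, reference_type TEXT);
-- Rows copied from the query output shown above.
INSERT INTO data_tables VALUES (8, 'G', 'EXPERIMENTS');
INSERT INTO data_columns VALUES
    (21, 8, 'TYPE', 'string', NULL),
    (22, 8, 'DESCRIPTION', 'string', NULL),
    (23, 8, 'EXPERIMENTER', 'reference', 'Experimenter');
""")


def create_table_sql(conn, data_table_id):
    """Build a CREATE TABLE statement for one DATA_TABLES entry (hypothetical helper)."""
    (table_name,) = conn.execute(
        "SELECT table_name FROM data_tables WHERE data_table_id = ?",
        (data_table_id,),
    ).fetchone()
    # Every generated table starts with the two bookkeeping fields.
    parts = ["attribute_id INTEGER PRIMARY KEY", "module_execution_id INTEGER"]
    columns = conn.execute(
        "SELECT column_name, sql_type FROM data_columns "
        "WHERE data_table_id = ? ORDER BY data_column_id",
        (data_table_id,),
    )
    for column_name, sql_type in columns:
        parts.append(f"{column_name.lower()} {SQL_TYPES[sql_type]}")
    return f"CREATE TABLE {table_name.lower()} ({', '.join(parts)})"


ddl = create_table_sql(conn, 8)
conn.execute(ddl)
# List the column names of the table that was just created.
generated_columns = [row[1] for row in conn.execute("PRAGMA table_info(experiments)")]
print(generated_columns)
```

The generated table has the same five fields as the experiments table shown above; a real implementation would also need to turn reference columns into proper foreign keys.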

This is a fairly powerful and flexible structure: these tables provide all of the information needed to construct the database schema that is used to store the "real" OME data. This is what we mean when we say that the database is very "meta".

A closer look at this structure reveals some potentially troubling parallels: SEMANTIC_TYPES and DATA_TABLES look very similar, as do SEMANTIC_ELEMENTS and DATA_COLUMNS. Is this duplication necessary?

This question can be answered by noting that the parallels are not exact. Semantic types and semantic elements define the abstract model of the data, while the data columns and tables define one particular realization of those abstract types. The DATA_COLUMN_ID in the SEMANTIC_ELEMENTS table defines the link between this abstract model and its concrete realization.

This separation allows us to move away from a strictly one-to-one mapping between semantic types and data tables. For example, the semantic types PlaneMean and PlaneGeometricMean both refer back to entries in the table PLANE_STATISTICS.
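
The many-to-one case can be sketched by following the chain from semantic type to semantic element to data column to data table. The example again uses an in-memory SQLite database. Only the names PlaneMean, PlaneGeometricMean, and PLANE_STATISTICS come from the text above; every id, element name, and column name is hypothetical, invented purely to make the query runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE semantic_types (semantic_type_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE semantic_elements (semantic_element_id INTEGER PRIMARY KEY,
                                semantic_type_id INTEGER, name TEXT,
                                data_column_id INTEGER);
CREATE TABLE data_columns (data_column_id INTEGER PRIMARY KEY,
                           data_table_id INTEGER, column_name TEXT);
CREATE TABLE data_tables (data_table_id INTEGER PRIMARY KEY, table_name TEXT);

-- Hypothetical ids, element names, and column names.
INSERT INTO data_tables VALUES (100, 'PLANE_STATISTICS');
INSERT INTO data_columns VALUES (200, 100, 'MEAN'), (201, 100, 'GEOMEAN');
INSERT INTO semantic_types VALUES (300, 'PlaneMean'), (301, 'PlaneGeometricMean');
INSERT INTO semantic_elements VALUES
    (400, 300, 'Mean', 200),
    (401, 301, 'GeometricMean', 201);
""")

# Resolve each semantic type to the data table that stores its elements.
rows = conn.execute("""
    SELECT DISTINCT st.name, dt.table_name
    FROM semantic_types AS st
    JOIN semantic_elements AS se ON se.semantic_type_id = st.semantic_type_id
    JOIN data_columns AS dc ON dc.data_column_id = se.data_column_id
    JOIN data_tables AS dt ON dt.data_table_id = dc.data_table_id
    ORDER BY st.name
""").fetchall()
print(rows)
```

Both semantic types resolve to the same table name, which is the situation a one-to-one design could not express.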
