Nate Moore edited this page Apr 22, 2019 · 3 revisions

Welcome to the almanac-browser wiki!

The following tables describe the structure of the database underlying the Molecular Almanac Browser. See the Vertabelo website for an editable graphical representation of the database. Vertabelo also generates the db_scripts/db_create.sql and db_scripts/db_drop.sql files, which create_db.sh in turn uses to create the Almanac SQLite database and fill it with spreadsheet data.

Table of Contents

  1. Assertion
  2. Assertion_To_Source
  3. Source
  4. Feature_Set
  5. Feature
  6. Feature_Attribute
  7. Feature_Definition
  8. Feature_Attribute_Definition
  9. Version

Assertion

Holds all information related to a specific assertion made by a Source document.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| assertion_id | integer | PK | |
| created_on | text | | Date this assertion was created, as a UNIX timestamp. |
| last_updated | text | | Date this assertion was last modified, as a UNIX timestamp. |
| disease | text | N | Name of the disease this assertion relates to, as specified by the user adding the assertion. |
| oncotree_term | text | N | Standardized OncoTree descriptor of the referenced disease state. |
| oncotree_code | text | N | Short format of standardized OncoTree descriptor of the referenced disease state. |
| stage | integer | N | Specific stage number of the disease this assertion references, if applicable. |
| therapy_name | text | N | Name of therapy referenced by this assertion. |
| therapy_sensitivity | boolean | N | If True, this assertion describes a scenario in which the patient's disease state may be responsive to the referenced therapy. |
| therapy_resistance | boolean | N | If True, this assertion describes a scenario in which the patient's disease state may be resistant to the referenced therapy. |
| predictive_implication | text | N | "Confidence level" of this assertion; must be one of Inferential, Preclinical, Clinical, Guideline, FDA-Approved. |
| favorable_prognosis | boolean | N | If True, this assertion describes a scenario in which the patient is predicted to experience a less severe disease (somatic variants) or be less likely to acquire the disease (germline variants). If False, this assertion describes a scenario in which the patient is predicted to experience a more severe disease (somatic variants) or to be more likely to acquire the disease (germline variants). |
| description | text | N | User-provided text describing the assertion in a human-readable format; often explains how the linked source(s) relate to the assertion. |
| validated | boolean | | Describes whether a user-submitted assertion has been validated by an admin. If True, the assertion is validated and should appear in the Almanac. If False, the assertion has not yet been validated and should be hidden. |
| submitted_by | text | | Email address of the user who submitted the assertion. |
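
The validated flag and the timestamp columns can be illustrated with a short query. This is a minimal sketch, not the Browser's own code: it assumes SQLite stores the boolean columns as 0/1 integers and that the timestamp text holds seconds since the UNIX epoch. The in-memory table is a trimmed, hypothetical subset of Assertion, and the sample rows and the visible_assertions helper are invented for illustration.

```python
import sqlite3
from datetime import datetime, timezone

# Allowed values for predictive_implication, as listed in the table above.
PREDICTIVE_IMPLICATIONS = (
    "Inferential", "Preclinical", "Clinical", "Guideline", "FDA-Approved",
)

conn = sqlite3.connect(":memory:")
# Trimmed, hypothetical subset of the Assertion columns (constraints are assumptions).
conn.execute(
    """
    CREATE TABLE Assertion (
        assertion_id INTEGER PRIMARY KEY,
        created_on TEXT NOT NULL,
        therapy_name TEXT,
        predictive_implication TEXT,
        validated BOOLEAN NOT NULL
    )
    """
)
# Invented sample rows: one validated assertion, one still awaiting review.
conn.executemany(
    "INSERT INTO Assertion VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1555891200", "Example therapy A", "Clinical", 1),
        (2, "1555977600", "Example therapy B", "Preclinical", 0),
    ],
)


def visible_assertions(connection):
    """Return validated assertions only, with created_on parsed as a UTC datetime."""
    rows = connection.execute(
        "SELECT assertion_id, created_on, therapy_name, predictive_implication "
        "FROM Assertion WHERE validated = 1 ORDER BY assertion_id"
    ).fetchall()
    result = []
    for assertion_id, created_on, therapy_name, implication in rows:
        # predictive_implication is nullable, so only non-NULL values are checked.
        if implication is not None and implication not in PREDICTIVE_IMPLICATIONS:
            raise ValueError(f"unexpected predictive_implication: {implication!r}")
        created = datetime.fromtimestamp(int(created_on), tz=timezone.utc)
        result.append((assertion_id, created, therapy_name, implication))
    return result


visible = visible_assertions(conn)
```

Here only assertion 1 is returned, because assertion 2 has not been validated and should stay hidden.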

Assertion_To_Source

Association table linking Assertion and Source in a many-to-many relationship.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| ats_id | integer | PK | ID for this Assertion_To_Source relationship. |
| assertion_id | integer | FK | Foreign key referencing Assertion.assertion_id. |
| source_id | integer | FK | Foreign key referencing Source.source_id. |
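
To show how the association table resolves the many-to-many relationship in both directions, here is a minimal sketch using Python's sqlite3 module. The tables are trimmed, hypothetical versions of Assertion, Source, and Assertion_To_Source; the sample rows and the helper names sources_for and assertions_citing are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Trimmed, hypothetical versions of the three tables.
    CREATE TABLE Assertion (assertion_id INTEGER PRIMARY KEY, therapy_name TEXT);
    CREATE TABLE Source (
        source_id INTEGER PRIMARY KEY,
        source_type TEXT NOT NULL,
        cite_text TEXT NOT NULL
    );
    CREATE TABLE Assertion_To_Source (
        ats_id INTEGER PRIMARY KEY,
        assertion_id INTEGER NOT NULL REFERENCES Assertion(assertion_id),
        source_id INTEGER NOT NULL REFERENCES Source(source_id)
    );
    INSERT INTO Assertion VALUES (1, 'Example therapy A'), (2, 'Example therapy B');
    INSERT INTO Source VALUES
        (10, 'Journal', 'Placeholder citation X'),
        (11, 'Guideline', 'Placeholder citation Y');
    -- Assertion 1 cites both sources; source 10 also backs assertion 2.
    INSERT INTO Assertion_To_Source VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
    """
)


def sources_for(connection, assertion_id):
    """Return (source_id, source_type, cite_text) for every source linked to an assertion."""
    return connection.execute(
        """
        SELECT s.source_id, s.source_type, s.cite_text
        FROM Assertion_To_Source AS ats
        JOIN Source AS s ON s.source_id = ats.source_id
        WHERE ats.assertion_id = ?
        ORDER BY s.source_id
        """,
        (assertion_id,),
    ).fetchall()


def assertions_citing(connection, source_id):
    """Return the IDs of every assertion linked to a source."""
    rows = connection.execute(
        "SELECT assertion_id FROM Assertion_To_Source "
        "WHERE source_id = ? ORDER BY assertion_id",
        (source_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

In this sample data, sources_for(conn, 1) yields two sources, while assertions_citing(conn, 10) yields both assertions.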

Source

Describes a source document that supports the information in one or more Assertions.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| source_id | integer | PK | ID for this source. |
| source_type | text | | String describing the type of source (e.g., Journal, Guideline, etc.). |
| cite_text | text | | Formal citation string for this source (ideally in AMA format). |
| doi | text | N | DOI for this source (do not use DOI URLs - only the DOI itself). |
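
Because the doi column expects the bare DOI rather than a DOI URL, a small normalization step before insertion can help. The helper below is a hypothetical sketch, not part of the Almanac code; the set of prefixes it strips is an assumption, and 10.1000/xyz123 is a placeholder DOI.

```python
import re

# Resolver and label prefixes to strip. This list is an assumption for
# illustration, not taken from the Almanac code.
_DOI_PREFIX = re.compile(r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)", re.IGNORECASE)


def normalize_doi(value):
    """Return the bare DOI, or None when the input is missing or empty."""
    if value is None:
        return None
    bare = _DOI_PREFIX.sub("", value.strip())
    # The doi column is nullable, so an empty string is mapped to None.
    return bare or None
```

For example, normalize_doi("https://doi.org/10.1000/xyz123") and normalize_doi("doi:10.1000/xyz123") both return "10.1000/xyz123".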

Feature_Set

Association table linking Assertion and Feature in a one-to-many relationship (one assertion may have many features grouped together as a feature set).

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| feature_set_id | integer | PK | ID for this feature set. |
| assertion_id | integer | FK | Foreign key referencing Assertion.assertion_id. |

Feature

Describes a single molecular feature associated with a given assertion (e.g., Somatic Variant or Mutational Burden). Instantiates one of the feature definitions given in the Feature_Definition table.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| feature_id | integer | PK | ID for this feature. |
| feature_set_id | integer | FK | Foreign key referencing Feature_Set.feature_set_id for the feature set this feature belongs to. |
| feature_def_id | integer | FK | Foreign key referencing Feature_Definition.feature_def_id for the feature definition this feature instantiates. |

Feature_Attribute

Describes a single attribute associated with a molecular feature. E.g., the Somatic Variant feature has the variant_type, gene, and protein_change attributes, each of which is described by a separate Feature_Attribute row for the given Assertion. Instantiates one of the attribute definitions given in the Feature_Attribute_Definition table.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| attribute_id | integer | PK | ID for this attribute. |
| feature_id | integer | FK | Foreign key referencing Feature.feature_id for the feature of which this attribute is a member. |
| attribute_def_id | integer | FK | Foreign key referencing Feature_Attribute_Definition.attribute_def_id for the attribute definition this attribute instantiates. |
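
The chain from an assertion to its feature sets, features, and attributes, together with the definition tables described in the following sections, can be walked with a single join. This is a minimal sketch under stated assumptions: the tables are trimmed, hypothetical versions without constraints, the sample rows (including assertion ID 42, the readable names, and the type values) are invented, and describe_features is a hypothetical helper. It resolves names only, since the column list above does not include a value column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Trimmed, hypothetical versions of the feature tables.
    CREATE TABLE Feature_Definition (
        feature_def_id INTEGER PRIMARY KEY, name TEXT, readable_name TEXT, is_germline BOOLEAN
    );
    CREATE TABLE Feature_Attribute_Definition (
        attribute_def_id INTEGER PRIMARY KEY, feature_def_id INTEGER,
        name TEXT, readable_name TEXT, type TEXT
    );
    CREATE TABLE Feature_Set (feature_set_id INTEGER PRIMARY KEY, assertion_id INTEGER);
    CREATE TABLE Feature (
        feature_id INTEGER PRIMARY KEY, feature_set_id INTEGER, feature_def_id INTEGER
    );
    CREATE TABLE Feature_Attribute (
        attribute_id INTEGER PRIMARY KEY, feature_id INTEGER, attribute_def_id INTEGER
    );

    -- Invented sample data: one Somatic Variant feature with three attributes.
    INSERT INTO Feature_Definition VALUES (1, 'somatic_variant', 'Somatic Variant', 0);
    INSERT INTO Feature_Attribute_Definition VALUES
        (1, 1, 'variant_type', 'Variant Type', 'text'),
        (2, 1, 'gene', 'Gene', 'gene'),
        (3, 1, 'protein_change', 'Protein Change', 'text');
    INSERT INTO Feature_Set VALUES (1, 42);
    INSERT INTO Feature VALUES (1, 1, 1);
    INSERT INTO Feature_Attribute VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3);
    """
)


def describe_features(connection, assertion_id):
    """Map (feature_id, feature name) to the list of attribute names for one assertion."""
    rows = connection.execute(
        """
        SELECT f.feature_id, fd.name, fad.name
        FROM Feature_Set AS fs
        JOIN Feature AS f ON f.feature_set_id = fs.feature_set_id
        JOIN Feature_Definition AS fd ON fd.feature_def_id = f.feature_def_id
        JOIN Feature_Attribute AS fa ON fa.feature_id = f.feature_id
        JOIN Feature_Attribute_Definition AS fad
            ON fad.attribute_def_id = fa.attribute_def_id
        WHERE fs.assertion_id = ?
        ORDER BY f.feature_id, fa.attribute_id
        """,
        (assertion_id,),
    ).fetchall()
    features = {}
    for feature_id, feature_name, attribute_name in rows:
        features.setdefault((feature_id, feature_name), []).append(attribute_name)
    return features
```

Grouping by feature_id rather than by feature name keeps two features of the same definition (for example, two somatic variants in one feature set) separate.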

Feature_Definition

Provides a user-defined definition for a given feature (e.g., Somatic Variant or Mutational Burden). Every feature linked to an assertion must reference exactly one feature definition described in this table.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| feature_def_id | integer | PK | ID for this feature definition. |
| name | text | | "Programmatic" name of this feature (e.g., "somatic_variant"). |
| readable_name | text | | Human-readable name for display to users (e.g., "Somatic Variant"). |
| is_germline | boolean | | If True, this feature applies to germline data; otherwise applies to somatic data. |

Feature_Attribute_Definition

Provides a user-defined definition of a single attribute (e.g., the gene or protein_change attributes within the Somatic Variant feature). Every attribute linked to a feature must reference exactly one attribute definition given in this table.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| attribute_def_id | integer | PK | ID for this attribute definition. |
| feature_def_id | integer | FK | Foreign key referencing Feature_Definition.feature_def_id for the feature definition this attribute definition is defined under. |
| name | text | | "Programmatic" name of this attribute (e.g., "protein_change"). |
| readable_name | text | | Human-readable name for display to users (e.g., "Protein Change"). |
| type | text | | Type of data that is stored in this attribute; may be used by the Browser to classify certain attributes or decide how to display them. May hold any value, but the Browser explicitly recognizes text, integer, and gene at this time. |

Version

Contains the version number of the underlying Almanac database from which the Browser is pulling information. Versioning follows the Semantic Versioning guidelines.

| Column | Type | Key/Nullable | Description |
| --- | --- | --- | --- |
| major | integer | PK | Incremented when major changes are made to the database such that the Almanac Browser itself must be modified to continue reading data from the database (e.g., changes in the structure of the database). |
| minor | integer | | Incremented when data is added to the database without requiring modification of the Almanac Browser to use the database (e.g., adding new assertions). |
| patch | integer | | Incremented after bug fixing (e.g., fixing a typo in an entered assertion). |
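
Since only a major version change requires modifying the Browser, a client could check compatibility by comparing the major number alone. The following is a minimal sketch of that idea, not the Browser's actual logic: SUPPORTED_MAJOR, read_version, is_compatible, and the sample version 1.4.2 are all hypothetical.

```python
import sqlite3

# Hypothetical: the database major version this client was written against.
SUPPORTED_MAJOR = 1


def read_version(connection):
    """Return (major, minor, patch) from the Version table."""
    row = connection.execute(
        "SELECT major, minor, patch FROM Version ORDER BY major DESC LIMIT 1"
    ).fetchone()
    if row is None:
        raise LookupError("Version table is empty")
    return tuple(row)


def is_compatible(version, supported_major=SUPPORTED_MAJOR):
    """Minor and patch changes are treated as readable; only major must match."""
    return version[0] == supported_major


conn = sqlite3.connect(":memory:")
# Trimmed stand-in for the Version table; the NOT NULL constraints are assumptions.
conn.execute(
    "CREATE TABLE Version ("
    "major INTEGER PRIMARY KEY, minor INTEGER NOT NULL, patch INTEGER NOT NULL)"
)
conn.execute("INSERT INTO Version VALUES (1, 4, 2)")  # invented sample version

version = read_version(conn)
```

With this sample row, is_compatible(version) is True, while a database reporting (2, 0, 0) would be rejected.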