Explanation on ERD #80

Closed
anuradhawick opened this issue May 8, 2023 · 2 comments

Greetings,

In my understanding, we should be able to define cohorts under different criteria, such as study, as described in Beacon, or user-defined (I guess this is done using ontology terms).

In these scenarios, it is inevitable that a particular individual will be referenced across several cohorts. However, in the ERD, the cohort-individual relationship appears to have cardinality 1..n (one-to-many). Could you kindly elaborate on this design aspect? I have attached the ERD for reference.

Thanks
[Attached image: ERD]

mbaudis commented May 8, 2023

@anuradhawick I'd say this is just ill-defined. Datasets and cohorts are two types of "collections": datasets are "physical" groups (e.g. with common access, DAC, storage ... based on the variants), while cohorts can be flexible collections based on groupings by e.g. phenotype, disease, study...

We do not model all entity relationships for such parameters. For example, a cohort may be assembled from members of multiple datasets, though such definitions are left to cohort definitions outside the Beacon model.

I'll adjust the cohorts<-->individuals relation to many-to-many:

classDiagram

    analyses <-- genomicVariations : 1..n
    runs <-- analyses : 1..n
    biosamples <-- runs : 1..n
    individuals <-- biosamples : 1..n

    runs <.. genomicVariations : 1..n
    biosamples <.. genomicVariations : 1..n
    individuals <.. genomicVariations : 1..n
    biosamples <.. analyses : 1..n
    individuals <.. analyses : 1..n
    individuals <.. runs : 1..n

    cohorts o-- individuals : m..n
    datasets o-- genomicVariations : 1..n

    class genomicVariations{
        analysisId
        runId
        biosampleId
        individualId
        variation
        clinicalInterpretations
        caseLevelData
        ...
    }
    class analyses{
        id
        runId
        biosampleId
        individualId
        analysisDate
        pipelineName
        aligner
        ...
    }
    class biosamples{
        id
        individualId
        biosampleStatus
        sampleOriginType
        histologicalDiagnosis
        collectionDate
        ...
    }
    class individuals{
        id
        sex
        diseases
        phenotypicFeatures
        ethnicity
        pedigrees
        ...
    }
    class runs{
        id
        biosampleId
        individualId
        runDate
        librarySource
        libraryStrategy
        platform
        ...
    }
    class datasets{
        id
        name
        description
        dataUseCondition
        info
        updateDateTime
        ...
    }
    class cohorts{
        id
        name
        cohortType
        cohortSize
        cohortDataTypes
        cohortDesign
        ...
    }
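
To make the `cohorts o-- individuals : m..n` line concrete, here is a minimal Python sketch of a many-to-many membership index. It is only an illustration: the class `CohortMembership`, its methods, and the IDs are hypothetical and are not part of the Beacon model or any Beacon API. The idea is that neither side "owns" the other, so an individual can appear in any number of cohorts.

```python
from collections import defaultdict


class CohortMembership:
    """Hypothetical link index for a many-to-many cohorts<->individuals relation."""

    def __init__(self):
        # Two lookup directions over the same set of (cohort, individual) links.
        self._by_cohort = defaultdict(set)
        self._by_individual = defaultdict(set)

    def add(self, cohort_id, individual_id):
        # Record one membership link; adding the same link twice is a no-op.
        self._by_cohort[cohort_id].add(individual_id)
        self._by_individual[individual_id].add(cohort_id)

    def individuals_in(self, cohort_id):
        # Return a copy so callers cannot mutate the index.
        return set(self._by_cohort.get(cohort_id, set()))

    def cohorts_of(self, individual_id):
        return set(self._by_individual.get(individual_id, set()))


# Illustrative IDs only (assumptions, not real Beacon records).
membership = CohortMembership()
membership.add("cohort-study-A", "ind-001")
membership.add("cohort-study-A", "ind-002")
membership.add("cohort-phenotype-X", "ind-001")  # same individual, second cohort
```

With a one-to-many relation, `ind-001` could belong to only one of the two cohorts; with the many-to-many form, `membership.cohorts_of("ind-001")` returns both.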

mbaudis added a commit that referenced this issue May 8, 2023
Documentation: Additional contribution/development pages together with a reorganization of the navigation structure.

Also, clarification of the cohorts relation (#80).
@anuradhawick

Thanks for the prompt response and explanation. I will close this issue now.
