Support co annotation analysis #9

ubyndr · 2023-06-16T14:56:34Z

fixes #7

I'm not clear on how to choose the predicates mentioned in the ticket:

set(X) cluster_overlaps set(Y)
set(X) cluster_matches set(Y)
set(X) subcluster_of set(Y)

I've added the table without the predicates.

dosumis · 2023-06-21T11:14:17Z

pandasaurus_cxg/anndata_analyzer.py

+                    lambda row: Predicate.CLUSTER_MATCHES.value
+                    if row[text] in predicate_dict.get(row["cell_type"], [])
+                    and len(predicate_dict.get(row["cell_type"], [])) == 1
+                    else (


Bit opaque. I think this works, but it looks like this only tests co-annotation against the cell_type field. It should be looking at all fields.

All fields, as in tissue, disease, organism, etc., or just all the other free-text cell type fields?

Just cell type fields

All combinations of pairs of free-text and ontology cell type fields.
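
A minimal sketch of what "all combinations of pairs" could look like, using only the standard library. The column names in `cell_type_fields` are hypothetical placeholders; the real free-text and ontology cell type fields depend on the dataset, and this is not the code that was merged.

```python
from itertools import combinations

# Hypothetical column names, for illustration only.
cell_type_fields = ["cell_type", "subclass.l1", "subclass.full"]


def field_pairs(fields):
    """Return every unordered pair of distinct cell type fields."""
    return list(combinations(fields, 2))


pairs = field_pairs(cell_type_fields)
for left, right in pairs:
    # Each pair would get its own co-annotation comparison.
    print(f"compare {left} <-> {right}")
```

With n fields this yields n * (n - 1) / 2 pairs, so the report compares every field against every other field instead of only against `cell_type`.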

dosumis · 2023-06-21T11:23:38Z

pandasaurus_cxg/anndata_analyzer.py

+                    and len(predicate_dict.get(row["cell_type"], [])) == 1
+                    else (
+                        Predicate.SUBCLUSTER_OF.value
+                        if row[text] in predicate_dict.get(row["cell_type"], [])


Based on the if/else logic, it looks like this adds SUBCLUSTER_OF if there are multiple rows with "cell_type" and text. But that could be a subcluster_of relationship in either direction, or an overlap.

Let me try to explain my reasoning behind this.
Let's say we have the following structure in predicate_dict:

{
  'endothelial cell': ['Descending Vasa Recta Endothelial Cell', 'Ascending Vasa Recta Endothelial Cell', 'Afferent / Efferent Arteriole Endothelial Cell', 'Peritubular Capilary Endothelial Cell ', 'Glomerular Capillary Endothelial Cell', 'Degenerative Peritubular Capilary Endothelial Cell', 'Cycling Endothelial Cell', 'Lymphatic Endothelial Cell', 'Degenerative Endothelial Cell'],
  'podocyte': ['Podocyte', 'Degenerative Podocyte'],
  'leukocyte': ['Natural Killer Cell / Natural Killer T Cell', 'M2 Macrophage', 'Neutrophil', 'Monocyte-derived Cell', 'T Cell', 'Plasma Cell', 'Cycling Mononuclear Phagocyte', 'Non-classical Monocyte', 'Classical Dendritic Cell', 'Mast Cell', 'B Cell', 'Plasmacytoid Dendritic Cell', 'Cycling Natural Killer Cell / Natural Killer T Cell']
}

I iterate through the df. Let's say the first row[text] is 'Descending Vasa Recta Endothelial Cell' and it corresponds to 'endothelial cell' in the cell type field. I check whether 'Descending Vasa Recta Endothelial Cell' is in the list stored under 'endothelial cell' in predicate_dict. It is, and since the length of that list is not 1, I infer that 'Descending Vasa Recta Endothelial Cell' is subcluster_of 'endothelial cell'.

Is there any way to determine the direction of this relationship with the tabular data?

I assumed that everything other than cluster_matches and subcluster_of should be cluster_overlaps, but I'm not 100% sure.
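
On the direction question, one possible approach is to build the co-annotation mapping in both directions, since a single cell_type-to-text dictionary only shows one side of the relationship. The sketch below is an illustration under that assumption, not the implementation in this PR; `co_annotation_maps` and the toy `rows` are hypothetical, and the label pairs could equally come from zipping two DataFrame columns.

```python
from collections import defaultdict


def co_annotation_maps(pairs):
    """Build both directions of the co-annotation mapping from (left, right) label pairs."""
    left_to_right = defaultdict(set)
    right_to_left = defaultdict(set)
    for left, right in pairs:
        left_to_right[left].add(right)
        right_to_left[right].add(left)
    return dict(left_to_right), dict(right_to_left)


# Toy rows of (free-text label, cell_type label); duplicates are collapsed by the sets.
rows = [
    ("Podocyte", "podocyte"),
    ("Degenerative Podocyte", "podocyte"),
    ("Podocyte", "podocyte"),
]
text_to_type, type_to_text = co_annotation_maps(rows)
```

Here 'Podocyte' maps to a single cell_type while 'podocyte' maps back to two text labels, which is the asymmetry that distinguishes the two directions.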

dosumis · 2023-06-21T11:24:25Z

pandasaurus_cxg/anndata_analyzer.py

+                        if row[text] in predicate_dict.get(row["cell_type"], [])
+                        else Predicate.CLUSTER_OVERLAPS.value
+                    ),  # All the other cases should be marked with 'cluster_overlaps', right?
+                    axis=1,


I think this is wrong. I will add a cluster_overlaps example to the ticket.

ubyndr · 2023-06-22T09:01:00Z

Hi @dosumis, examples for each predicate group are below:

cluster_matches

Value of field_name_1: dPT
Value of field_name_1_dict: ['Degenerative Proximal Tubule Epithelial Cell']
Value of field_name_2: Degenerative Proximal Tubule Epithelial Cell
Value of field_name_2_dict: ['dPT']

subcluster_of

Value of field_name_1: aTAL1
Value of field_name_1_dict: ['Adaptive / Maladaptive / Repairing Thick Ascending Limb Cell']
Value of field_name_2: Adaptive / Maladaptive / Repairing Thick Ascending Limb Cell
Value of field_name_2_dict: ['aTAL1', 'aTAL2']

Value of field_name_1: Cortical Collecting Duct Intercalated Cell Type A
Value of field_name_1_dict: ['C-IC-A']
Value of field_name_2: C-IC-A
Value of field_name_2_dict: ['Cortical Collecting Duct Intercalated Cell Type A', 'Connecting Tubule Intercalated Cell Type A']

Value of field_name_1: Connecting Tubule Principal Cell
Value of field_name_1_dict: ['CNT']
Value of field_name_2: CNT
Value of field_name_2_dict: ['Connecting Tubule Principal Cell', 'Connecting Tubule Cell']

supercluster_of

Value of field_name_1: stroma cells
Value of field_name_1_dict: ['kidney interstitial fibroblast', 'renal interstitial pericyte']
Value of field_name_2: renal interstitial pericyte
Value of field_name_2_dict: ['stroma cells']

Value of field_name_1: Adaptive / Maladaptive / Repairing Thick Ascending Limb Cell
Value of field_name_1_dict: ['aTAL1', 'aTAL2']
Value of field_name_2: aTAL2
Value of field_name_2_dict: ['Adaptive / Maladaptive / Repairing Thick Ascending Limb Cell']

Value of field_name_1: PT
Value of field_name_1_dict: ['dPT', 'aPT', 'cycPT', 'PT-S1/2', 'PT-S3']
Value of field_name_2: dPT
Value of field_name_2_dict: ['PT']

cluster_overlaps

Value of field_name_1: degenerative
Value of field_name_1_dict: ['PT', 'TAL', 'PC', 'FIB', 'EC', 'VSM/P', 'ATL', 'IC', 'CNT', 'POD', 'DTL', 'DCT']
Value of field_name_2: PT
Value of field_name_2_dict: ['degenerative', 'adaptive - epi', 'cycling', 'reference']

Value of field_name_1: reference
Value of field_name_1_dict: ['FIB', 'TAL', 'IMM', 'EC', 'IC', 'DTL', 'POD', 'ATL', 'PC', 'PT', 'CNT', 'DCT', 'VSM/P', 'NEU', 'PEC', 'PapE']
Value of field_name_2: FIB
Value of field_name_2_dict: ['reference', 'degenerative', 'adaptive - str', 'cycling']

Value of field_name_1: cycling
Value of field_name_1_dict: ['PT', 'IMM', 'EC', 'CNT', 'DCT', 'FIB']
Value of field_name_2: PT
Value of field_name_2_dict: ['degenerative', 'adaptive - epi', 'cycling', 'reference']
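
The examples above suggest a rule based on the sizes of the two lists. The sketch below is one reading of that pattern, not necessarily the merged logic: `infer_predicate` is a hypothetical helper, the predicate names are written as plain strings rather than the `Predicate` enum, and both lists are assumed to be non-empty.

```python
def infer_predicate(field_1_dict, field_2_dict):
    """Infer the predicate for field_name_1 relative to field_name_2.

    field_1_dict: values of the other field that co-occur with field_name_1's value.
    field_2_dict: values of the other field that co-occur with field_name_2's value.
    Assumes both lists are non-empty.
    """
    one_to_right = len(field_1_dict) == 1  # value 1 maps to exactly one partner
    one_to_left = len(field_2_dict) == 1   # value 2 maps to exactly one partner
    if one_to_right and one_to_left:
        return "cluster_matches"
    if one_to_right:
        return "subcluster_of"
    if one_to_left:
        return "supercluster_of"
    return "cluster_overlaps"


# Inputs copied from the examples in this comment.
print(infer_predicate(["Degenerative Proximal Tubule Epithelial Cell"], ["dPT"]))
print(infer_predicate(["C-IC-A"], ["Cortical Collecting Duct Intercalated Cell Type A",
                                   "Connecting Tubule Intercalated Cell Type A"]))
```

Checking both lists is what makes the direction recoverable: a one-to-many mapping on only one side gives subcluster_of or supercluster_of, and many-to-many on both sides gives cluster_overlaps.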

ubyndr · 2023-06-22T20:12:02Z

@dosumis @hkir-dev
Can I merge this now with the recent changes?

ubyndr added 4 commits June 16, 2023 15:52

Added TODO

06b783d

Added temporary schema and loader

a741502

Added temporary schema and loader

51dc3b7

Added AnndataAnalyzer and co_annotation_report

9817b00

ubyndr requested review from dosumis and hkir-dev June 16, 2023 14:56

Added predicate logic to co_annotation_report

44115a1

ubyndr marked this pull request as ready for review June 20, 2023 14:18

dosumis requested changes Jun 21, 2023

ubyndr added 4 commits June 21, 2023 15:58

Increased max-line-length

622b69f

Updated schema.json

6dd3670

Updated pandasaurus version

0fa06e5

Fixed the predicate logic and added supercluster_of predicate

63cfddd

ubyndr requested a review from dosumis June 21, 2023 15:01

Refactored the lambda operation

5fe7820

ubyndr added 2 commits June 22, 2023 14:46

Removed duplicates from co_annotation_report

1c3c114

Updated pandasaurus version to 0.2.2

86b945c

dosumis approved these changes Jun 23, 2023

ubyndr merged commit 7911370 into main Jun 23, 2023

ubyndr deleted the 7-support-co-annotation-analysis branch June 23, 2023 08:44
