Commit

update reference (#76)
rcannood authored Oct 15, 2024
1 parent dc7d140 commit 70f7d28
Showing 3 changed files with 23 additions and 5 deletions.
9 changes: 9 additions & 0 deletions _viash.yaml
@@ -52,6 +52,15 @@ links:
issue_tracker: https://github.com/openproblems-bio/task_perturbation_prediction/issues
repository: https://github.com/openproblems-bio/task_perturbation_prediction
docker_registry: ghcr.io
+references:
+bibtex: |
+@article{slazata2024benchmark,
+title = {A benchmark for prediction of transcriptomic responses to chemical perturbations across cell types},
+author = {Artur Szałata and Andrew Benz and Robrecht Cannoodt and Mauricio Cortes and Jason Fong and Sunil Kuppasani and Richard Lieberman and Tianyu Liu and Javier A. Mas-Rosario and Rico Meinl and Jalil Nourisa and Jared Tumiel and Tin M. Tunjic and Mengbo Wang and Noah Weber and Hongyu Zhao and Benedict Anchang and Fabian J Theis and Malte D Luecken and Daniel B Burkhardt},
+booktitle = {The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
+year = {2024},
+url = {https://openreview.net/forum?id=WTI4RJYSVm}
+}
authors:
- name: Artur Szałata
2 changes: 1 addition & 1 deletion common
17 changes: 13 additions & 4 deletions scripts/datasets/neurips-2023-data.sh
@@ -7,13 +7,22 @@ OUT=resources/datasets/neurips-2023-data

[[ ! -d $IN ]] && mkdir -p $IN

-if [[ ! -f "$IN/sc_counts.h5ad" ]]; then
-echo ">> Downloading 'sc_counts.h5ad'"
+if [[ ! -f "$IN/sc_counts_reannotated_with_counts.h5ad" ]]; then
+echo ">> Downloading 'sc_counts_reannotated_with_counts.h5ad'"
aws s3 cp --no-sign-request \
s3://openproblems-bio/public/neurips-2023-competition/sc_counts_reannotated_with_counts.h5ad \
"$IN/sc_counts_reannotated_with_counts.h5ad"
fi

+# multiline string
+ref="@article{slazata2024benchmark,
+title = {A benchmark for prediction of transcriptomic responses to chemical perturbations across cell types},
+author = {Artur Szałata and Andrew Benz and Robrecht Cannoodt and Mauricio Cortes and Jason Fong and Sunil Kuppasani and Richard Lieberman and Tianyu Liu and Javier A. Mas-Rosario and Rico Meinl and Jalil Nourisa and Jared Tumiel and Tin M. Tunjic and Mengbo Wang and Noah Weber and Hongyu Zhao and Benedict Anchang and Fabian J Theis and Malte D Luecken and Daniel B Burkhardt},
+booktitle = {The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
+year = {2024},
+url = {https://openreview.net/forum?id=WTI4RJYSVm}
+}"

echo ">> Running 'process_dataset' workflow"
nextflow run \
target/nextflow/workflows/process_dataset/main.nf \
@@ -23,8 +32,8 @@ nextflow run \
--sc_counts "$IN/sc_counts_reannotated_with_counts.h5ad" \
--dataset_id "neurips-2023-data" \
--dataset_name "NeurIPS2023 scPerturb DGE" \
---dataset_url "TBD" \
---dataset_reference "TBD" \
+--dataset_url "https://trace.ncbi.nlm.nih.gov/Traces/?view=study&acc=SRP527159" \
+--dataset_reference "$ref" \
--dataset_summary "Differential gene expression sign(logFC) * -log10(p-value) values after 24 hours of treatment with 144 compounds in human PBMCs" \
--dataset_description "For this competition, we designed and generated a novel single-cell perturbational dataset in human peripheral blood mononuclear cells (PBMCs). We selected 144 compounds from the Library of Integrated Network-Based Cellular Signatures (LINCS) Connectivity Map dataset (PMID: 29195078) and measured single-cell gene expression profiles after 24 hours of treatment. The experiment was repeated in three healthy human donors, and the compounds were selected based on diverse transcriptional signatures observed in CD34+ hematopoietic stem cells (data not released). We performed this experiment in human PBMCs because the cells are commercially available with pre-obtained consent for public release and PBMCs are a primary, disease-relevant tissue that contains multiple mature cell types (including T-cells, B-cells, myeloid cells, and NK cells) with established markers for annotation of cell types. To supplement this dataset, we also measured cells from each donor at baseline with joint scRNA and single-cell chromatin accessibility measurements using the 10x Multiome assay. We hope that the addition of rich multi-omic data for each donor and cell type at baseline will help establish biological priors that explain the susceptibility of particular genes to exhibit perturbation responses in different biological contexts." \
--dataset_organism "homo_sapiens" \
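
Note on the `--dataset_reference "$ref"` change above: the BibTeX entry spans several lines and contains spaces, so the double quotes around `$ref` are what keep it a single argument. The following is a minimal sketch of that behaviour, not part of the commit; the shortened BibTeX entry and the `count_args` helper are hypothetical and exist only for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical, shortened stand-in for the multi-line BibTeX entry.
ref="@article{example2024,
  title = {An example title},
  year = {2024}
}"

# Hypothetical helper: prints the number of arguments it received.
count_args() {
  echo "$#"
}

# Quoted: the flag plus one multi-line value.
quoted=$(count_args --dataset_reference "$ref")

# Unquoted: the shell splits the value on spaces and newlines,
# so the command receives many separate words instead of one value.
unquoted=$(count_args --dataset_reference $ref)

echo "quoted: $quoted"      # prints "quoted: 2"
echo "unquoted: $unquoted"  # a larger number, one per whitespace-separated word
```

With the quotes in place, the command receives the reference intact, newlines included; without them it would receive only fragments of the entry as separate arguments.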
