data_curation

The scripts here are used to import datasets into a PostgreSQL database for the SPT application or for analysis, and to organize or curate those datasets into a form acceptable for import. The work falls into two stages:

  1. Curation or pre-processing
  2. Import / upload into the database

Curation or pre-processing

Datasets are stored in subdirectories of datasets/. To prepare a new dataset, follow the full example here. The example includes files pre-generated in a format ready for import into the database, but you can also re-generate them yourself.
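
For reference, the expected layout looks roughly like this (a minimal sketch; the moldoveanu dataset name and the generated_artifacts/ subdirectory are taken from the example configuration further below, and the files inside vary by dataset):

datasets/
  moldoveanu/
    generated_artifacts/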

Extraction scripts tend to be dataset-specific, but some tasks are common across datasets, such as quantification over segments in images and formulation of standardized representations of channel or clinical metadata.
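
For example, re-generating a dataset's import-ready files might look roughly like the following; the script name extract.py is purely hypothetical, and the actual extraction scripts differ per dataset:

cd datasets/moldoveanu
python extract.py   # hypothetical dataset-specific extraction script; writes import-ready files into generated_artifacts/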

Import / upload into the database

The recommended import method is to use spt db interactive-uploader.
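
For example, invoked from the repository root (an assumption; run it wherever the datasets/ directory and your database configuration file are accessible):

spt db interactive-uploader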

It will take care of creating a run directory for Nextflow and configuring it with a workflow.config file like:

[general]
db_config_file = /Users/username/.spt_db.config.local

[database visitor]
study_name = Melanoma CyTOF ICI

[tabular import]
input_path = datasets/moldoveanu/generated_artifacts

or, in the case of S3 source files, with:

...
[tabular import]
input_path = s3://bucketname/moldoveanu

In the S3 case, you will have to make sure that AWS credentials are available. For session-specific credentials, Nextflow currently requires a "profile" in ~/.aws/credentials, usually the profile named default.
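
For session-specific credentials, that profile would look roughly like the following (placeholder values; the aws_session_token entry is only needed for temporary session credentials):

[default]
aws_access_key_id = <access-key-id>
aws_secret_access_key = <secret-access-key>
aws_session_token = <session-token>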

Note

To put the generated_artifacts/ files into an S3 bucket, you can use:

aws s3 cp generated_artifacts s3://bucket-name/dataset-name --recursive
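
To confirm the upload afterwards, you can list the objects (same placeholder bucket and dataset names as above):

aws s3 ls s3://bucket-name/dataset-name --recursive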