
Detailed dataset (yaml) file example

Christopher Small edited this page Dec 26, 2017 · 1 revision

See the README for a basic introduction to the setup of the dataset (yaml/json) spec.

A more complete example follows:

id: laura-mb-v14
samples:
  # some particular sample id in our dataset
  Hs-LN-D-5RACE-IgG:
    meta: {isotype: g, species: human, subject: BF520, timepoint: M9}
    locus: igh
    parameter-dir: /path/to/Hs-LN-D-5RACE-IgG/parameter-dir
    partition-file: /path/to/Hs-LN-D-5RACE-IgG/partition.csv
    cluster-annotation-file: /path/to/Hs-LN-D-5RACE-IgG/partition-cluster-annotations.csv
    per-sequence-meta-file: /path/to/Hs-LN-D-5RACE-IgG/seqmeta.csv
    seeds:
      # unique seed id
      BF520.1-igh:
        partition-file: /path/to/Hs-LN-D-5RACE-IgG/seeds/BF520.1-igh/partition.csv
        cluster-annotation-file: /path/to/Hs-LN-D-5RACE-IgG/seeds/BF520.1-igh/partition-cluster-annotations.csv
      BF520.2-igh:
        partition-file: /path/to/Hs-LN-D-5RACE-IgG/seeds/BF520.2-igh/partition.csv
        cluster-annotation-file: /path/to/Hs-LN-D-5RACE-IgG/seeds/BF520.2-igh/partition-cluster-annotations.csv
      # other seed partitions...

    # other-partitions can hold any additional partis runs on this sample, or on a subsample of its data.
    # You can also create separate sample entries for these if you'd rather.
    other-partitions:
      subset-1:
        partition-file: /path/to/Hs-LN-D-5RACE-IgG/other-subsets/subset-1/partition.csv
        cluster-annotation-file: /path/to/Hs-LN-D-5RACE-IgG/other-subsets/subset-1/partition-cluster-annotations.csv
      subset-2:
        partition-file: /path/to/Hs-LN-D-5RACE-IgG/other-subsets/subset-2/partition.csv
        cluster-annotation-file: /path/to/Hs-LN-D-5RACE-IgG/other-subsets/subset-2/partition-cluster-annotations.csv
      # other subsets...

  # some other sample in our dataset...
  Hs-LN-D-5RACE-IgK:
    meta: {isotype: k, species: human, subject: BF520, timepoint: M9}
    locus: igk
    parameter-dir: /path/to/Hs-LN-D-5RACE-IgK/parameter-dir
    partition-file: /path/to/Hs-LN-D-5RACE-IgK/partition.csv
    cluster-annotation-file: /path/to/Hs-LN-D-5RACE-IgK/partition-cluster-annotations.csv
    per-sequence-meta-file: /path/to/Hs-LN-D-5RACE-IgK/seqmeta.csv
    seeds:
      # etc...
    other-partitions:
      # etc...

  # etc. for other samples...
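
The nesting above (samples, then optional `seeds` and `other-partitions` under each sample) can be walked programmatically, for instance to list every file a spec refers to before running anything on it. The following is a minimal sketch, not part of the tool itself: `iter_paths` and `PATH_KEYS` are hypothetical names, and the input is assumed to be the plain dictionary a YAML parser (for example PyYAML's `yaml.safe_load`) would return for a spec shaped like the one above.

```python
from typing import Any, Dict, Iterator, Tuple

# Keys that hold file-system paths in the example spec above.
PATH_KEYS = (
    "parameter-dir",
    "partition-file",
    "cluster-annotation-file",
    "per-sequence-meta-file",
)


def iter_paths(dataset: Dict[str, Any]) -> Iterator[Tuple[str, str, str]]:
    """Yield (owner, key, path) for every path-valued key in the spec."""
    for sample_id, sample in (dataset.get("samples") or {}).items():
        sample = sample or {}
        # Paths attached directly to the sample.
        for key in PATH_KEYS:
            if key in sample:
                yield sample_id, key, sample[key]
        # Paths attached to seed partitions and other partitions.
        for group in ("seeds", "other-partitions"):
            # A group that contains only comments parses as None, hence `or {}`.
            for name, entry in (sample.get(group) or {}).items():
                for key in PATH_KEYS:
                    if key in (entry or {}):
                        yield f"{sample_id}/{group}/{name}", key, entry[key]


# A trimmed-down version of the spec above, already parsed into a dict.
example = {
    "id": "laura-mb-v14",
    "samples": {
        "Hs-LN-D-5RACE-IgG": {
            "locus": "igh",
            "partition-file": "/path/to/Hs-LN-D-5RACE-IgG/partition.csv",
            "seeds": {
                "BF520.1-igh": {
                    "partition-file": "/path/to/Hs-LN-D-5RACE-IgG/seeds/BF520.1-igh/partition.csv",
                },
            },
            "other-partitions": None,
        },
    },
}

for owner, key, path in iter_paths(example):
    print(f"{owner}: {key} = {path}")
```

Each yielded path could then be passed to `os.path.exists` to catch typos or copy-paste mistakes in a spec early.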

We may eventually support per-subject metadata as well. If you would rather write these specs in JSON than YAML, an option for that would be fairly easy to add. If you are interested in either, please submit an issue.
