Formatting tool for user friendly converting to spaceTX format #1318

Closed
3 tasks
shanaxel42 opened this issue May 7, 2019 · 8 comments · Fixed by #1421
Labels: feature New work

Comments

@shanaxel42
Collaborator

  • Define file name convention

  • Have tile fetcher example that works with above convention

  • Document in readthedocs

@ttung
Collaborator

ttung commented May 17, 2019

Proposal

files are to be named:

<image_type>-f<fov_number>-r<round_number>-c<ch_number>-z<zplane_number>.<image_extension>

image_type is like primary, dots, etc.
*_number: self explanatory
image_extension: has to be one of the supported image extensions, case insensitive
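
A minimal sketch of how the proposed naming convention could be parsed, assuming nothing beyond the pattern above. `parse_tile_filename`, `TileName`, and `ASSUMED_EXTENSIONS` are hypothetical names, and the extension set is a placeholder since the proposal does not list the supported extensions.

```python
import re
from typing import NamedTuple

# Placeholder set: the proposal only says "one of the supported image
# extensions" and does not enumerate them.
ASSUMED_EXTENSIONS = {"tiff", "tif", "png", "npy"}

# <image_type>-f<fov_number>-r<round_number>-c<ch_number>-z<zplane_number>.<image_extension>
FILENAME_RE = re.compile(
    r"^(?P<image_type>.+)-f(?P<fov>\d+)-r(?P<round>\d+)"
    r"-c(?P<ch>\d+)-z(?P<zplane>\d+)\.(?P<ext>[^.]+)$"
)


class TileName(NamedTuple):
    image_type: str
    fov: int
    round: int
    ch: int
    zplane: int
    extension: str


def parse_tile_filename(filename: str) -> TileName:
    """Split a filename into its components, or raise ValueError."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"filename does not follow the convention: {filename!r}")
    extension = match["ext"].lower()  # extensions are case insensitive
    if extension not in ASSUMED_EXTENSIONS:
        raise ValueError(f"unsupported image extension: {extension!r}")
    return TileName(
        image_type=match["image_type"],
        fov=int(match["fov"]),
        round=int(match["round"]),
        ch=int(match["ch"]),
        zplane=int(match["zplane"]),
        extension=extension,
    )
```

For example, `parse_tile_filename("primary-f001-r2-c3-z0.TIFF")` yields fov 1, round 2, channel 3, z-plane 0, with the extension normalized to lowercase.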

coordinates are to be read from a specified csv file, where the columns are filename, xmin, xmax, ymin, ymax, zmin, zmax.

images where num_zplanes==1 may use any value for zplane_number in the filename, and should assign NaN to the z coordinates.
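
A rough sketch of reading such a coordinates csv with the standard library, assuming a header row with exactly the column names listed above. `read_coordinates` and the sample row are made up for illustration.

```python
import csv
import io
import math

COORD_COLUMNS = ("xmin", "xmax", "ymin", "ymax", "zmin", "zmax")


def read_coordinates(csv_text: str) -> dict:
    """Map each filename to a dict of its physical coordinates."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    coordinates = {}
    for row in reader:
        # float() accepts "NaN", so single-z-plane images need no special case.
        coordinates[row["filename"]] = {
            column: float(row[column]) for column in COORD_COLUMNS
        }
    return coordinates


# Illustrative data only: one tile from a single-z-plane experiment.
example = """filename,xmin,xmax,ymin,ymax,zmin,zmax
primary-f0-r0-c0-z0.tiff,0.0,10.0,0.0,10.0,NaN,NaN
"""
coords = read_coordinates(example)
tile = coords["primary-f0-r0-c0-z0.tiff"]
print(tile["xmax"], math.isnan(tile["zmin"]))
```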

ttung pushed a commit that referenced this issue May 24, 2019
Need this for the generalized experiment formatter (#1318), and @njmei needs it for #1322.

Test plan: `pytest -v -n4 starfish/core/experiment/builder/test/`
@ttung
Collaborator

ttung commented Jun 6, 2019

codebook is still the hardest part of this, IMO...

@ambrosejcarr
Member

ambrosejcarr commented Jun 6, 2019

There are a few formalisms for codebooks that we might want to support conversion from:

csv where each row is <gene>,ACCGTC where the position of each nucleotide is taken to be a sequential round, and a mapping is provided from nucleotide to channel.

csv where each row is <gene>,0010134 where the position of each number represents a sequential round, and the numbers are taken to be the channels.
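
Both formalisms reduce to "one channel per round", so a converter could be sketched roughly as below. All function names are hypothetical, and the nucleotide-to-channel mapping in the usage note is an arbitrary example, not one taken from any assay. The digit form as written can only express up to ten channels.

```python
def sequence_to_channels(sequence, nucleotide_to_channel):
    """'ACCGTC' style: position is the round, nucleotide maps to a channel."""
    return [nucleotide_to_channel[nucleotide] for nucleotide in sequence]


def digits_to_channels(digits):
    """'0010134' style: position is the round, digit is the channel."""
    return [int(digit) for digit in digits]


def one_hot(channels, num_channels):
    """Expand a per-round channel list into a (round x channel) 0/1 matrix."""
    matrix = [[0] * num_channels for _ in channels]
    for round_index, channel in enumerate(channels):
        if not 0 <= channel < num_channels:
            raise ValueError(f"channel {channel} out of range in round {round_index}")
        matrix[round_index][channel] = 1
    return matrix
```

With an assumed mapping `{"A": 0, "C": 1, "G": 2, "T": 3}`, `one_hot(sequence_to_channels("ACG", mapping), 4)` gives one row per round with a single 1 in the column of that round's channel.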

@ambrosejcarr
Member

I don't have a good solution for codebooks that aren't 1-hot. Fortunately 1-hot appears to be very common.

@ttung
Copy link
Collaborator

ttung commented Jun 6, 2019

I would prefer just doing:

<target>,r0_c0, r0_c1, ...,  rn_cn
SCUBE2,0,1, ..., 0
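
A possible reader for this wide layout, assuming column names of the form `r<round>_c<channel>` and the target in the first column. `read_wide_codebook` is a hypothetical name and the sample values are invented to keep the example complete.

```python
import csv
import io
import re

COLUMN_RE = re.compile(r"^r(\d+)_c(\d+)$")


def read_wide_codebook(csv_text):
    """Return {target: {(round, channel): value}} from a wide csv."""
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = next(reader)
    axes = []
    for name in header[1:]:  # first column holds the target name
        match = COLUMN_RE.match(name.strip())
        if match is None:
            raise ValueError(f"unexpected column name: {name!r}")
        axes.append((int(match.group(1)), int(match.group(2))))
    codebook = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        target, values = row[0], row[1:]
        if len(values) != len(axes):
            raise ValueError(f"wrong number of values for target {target!r}")
        codebook[target] = {axis: float(v) for axis, v in zip(axes, values)}
    return codebook


# Invented values, two rounds by two channels.
example = """target,r0_c0, r0_c1, r1_c0, r1_c1
SCUBE2,0,1,1,0
"""
wide = read_wide_codebook(example)
```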

@neuromusic
Collaborator

neuromusic commented Jun 6, 2019

@ttung so each column defines a coordinate?

I would find the tidy version more intuitive, where each row corresponds to a single value in the codebook...

target,round,channel,value
SCUBE2,0,0,1
SCUBE2,0,1,1
BRCA,0,0,1
BRCA,1,1,1
ACTB,0,1,1
ACTB,1,0,1

the xarray to_dataframe() method would return the data formatted something like this (except with zero values, as well). see http://xarray.pydata.org/en/stable/pandas.html#dataset-and-dataframe
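
For comparison, a standard-library sketch of going from the tidy rows back to a dense (round x channel) matrix per target, filling omitted entries with zero. `tidy_to_dense` is a hypothetical helper and it infers the number of rounds and channels from the largest indices present, which is an assumption.

```python
import csv
import io


def tidy_to_dense(csv_text):
    """Return {target: matrix[round][channel]} from tidy codebook rows."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    num_rounds = max(int(row["round"]) for row in rows) + 1
    num_channels = max(int(row["channel"]) for row in rows) + 1
    dense = {}
    for row in rows:
        if row["target"] not in dense:
            # Entries missing from the csv stay at zero.
            dense[row["target"]] = [[0.0] * num_channels for _ in range(num_rounds)]
        dense[row["target"]][int(row["round"])][int(row["channel"])] = float(row["value"])
    return dense


# The rows from the example above.
tidy = """target,round,channel,value
SCUBE2,0,0,1
SCUBE2,0,1,1
BRCA,0,0,1
BRCA,1,1,1
ACTB,0,1,1
ACTB,1,0,1
"""
dense = tidy_to_dense(tidy)
```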

@neuromusic
Collaborator

example roundtrip from codebook to csv and back (technically to xarray on the return) https://gist.github.com/neuromusic/87267d7e20279585517c8cd46a0c5601

@ttung
Copy link
Collaborator

ttung commented Jun 6, 2019

> so each column defines a coordinate?

correct. for one-hot encodings, it's very easy to sanity check (sums across rows or columns should always be 1)
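
One reading of that sanity check, sketched under the assumption that the codebook entry for a target is held as a (round x channel) matrix and that "one-hot" means exactly one channel is on in each round. `is_one_hot` is a hypothetical helper.

```python
def is_one_hot(matrix):
    """True if every round has exactly one channel set to 1 and the rest 0."""
    return all(
        sum(round_values) == 1 and all(value in (0, 1) for value in round_values)
        for round_values in matrix
    )


print(is_one_hot([[0, 1], [1, 0]]))
print(is_one_hot([[1, 1], [0, 0]]))
```

The first call prints `True` and the second prints `False`, since its first round has two channels on and its second round has none.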

> I would find the tidy version more intuitive, where each row corresponds to a single value in the codebook...

that's a lot more "rows". riskier to get it wrong, potentially? not sure.
