-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Composite codebook support #1626
Conversation
shanaxel42
commented
Oct 24, 2019
- Added a slicing method in codebook that allows a user to easily get a portion of the data.
- Added a new SimpleLookupDecoder for non multiplexed data that just makes a table of r/ch keys to target value and decodes.
- Added a few tests to demonstrate workflows with composite experiment data
Codecov Report
@@ Coverage Diff @@
## master #1626 +/- ##
==========================================
+ Coverage 90.06% 90.33% +0.27%
==========================================
Files 223 235 +12
Lines 8563 9323 +760
==========================================
+ Hits 7712 8422 +710
- Misses 851 901 +50
Continue to review full report at Codecov.
|
2c1be1a
to
00e8ed7
Compare
starfish/core/codebook/codebook.py
Outdated
Parameters | ||
---------- | ||
indexers : Mapping[Axes, Union[int, tuple]] | ||
A dictionary of dim:index where index is the value or range to index the dimension |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the tuple case is not super clear to me. is it like a 2-tuple with a start and end? what if you want a random eclectic set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to support lists of arbitrary values as well
lookup_table: Dict[Tuple, str] = {} | ||
for target in self.codebook[Features.TARGET]: | ||
for ch_label in self.codebook[Axes.CH.value]: | ||
for round_label in self.codebook[Axes.ROUND.value]: | ||
if self.codebook.loc[target, round_label, ch_label]: | ||
lookup_table[(int(round_label), int(ch_label))] = str(target.values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems slow, but the number of targets is probably not horrendous.
it wouldn't surprise me, however, if there were a faster way to do this.
Also, if you are going to do it the slow way, you might as well detect scenarios where multiple bit patterns decode to the same target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same code we use in Codebook.to_json for saving out the codebook. I'm not sure I understand this "you might as well detect scenarios where multiple bit patterns decode to the same target." how would that make sense for a non multiplexed assay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If one passes in a multiplexed codebook into this decoder, it will happily decode it in probably strange and unexpected ways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yea but the description says it should only be used with non multiplexed data similarly to how the decode_per_round description says it should only be used for one hot encoded data
Produce an Imagestack representing a composite experiment where the first 2 rounds are | ||
multiplexed data and the last round is sequential data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have a test where we exercise something like what @berl has (multiplexed data and sequential data mixed into one round)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I closed the comment thread about perf, but the rest I think still apply.
77e9cd3
to
450bb81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comments pls.
starfish/core/codebook/codebook.py
Outdated
|
||
Parameters | ||
---------- | ||
indexers : Mapping[Axes, Union[int, tuple]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indexers : Mapping[Axes, Union[int, tuple]] | |
indexers : Mapping[Axes, Union[int, Sequence]] |
(sequence encompasses both tuple and list)
starfish/core/codebook/codebook.py
Outdated
@@ -389,6 +389,20 @@ def open_json( | |||
|
|||
return cls.from_code_array(codebook_doc[DocumentKeys.MAPPINGS_KEY], n_round, n_channel) | |||
|
|||
def get_partial(self, indexers: Mapping[Axes, Union[int, tuple, list]]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def get_partial(self, indexers: Mapping[Axes, Union[int, tuple, list]]): | |
def get_partial(self, indexers: Mapping[Axes, Union[int, Sequence]]): |
indexers : Mapping[Axes, Union[int, tuple]] | ||
A dictionary of dim:index where index is the value or range to index the dimension | ||
indexers : Mapping[Axes, Union[int, tuple, list]] | ||
A dictionary of dim:index where index is the value, values or range to index the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A dictionary of dim:index where index is the value, values or range to index the | |
A dictionary of dim:index where index is the value, values, or range to index the |
"""Given a dictionary mapping the index name to either a value or a range represented as a | ||
tuple, return an Imagestack with each dimension indexed by position accordingly | ||
|
||
Parameters | ||
---------- | ||
indexers : Dict[Axes, (int/tuple)] | ||
A dictionary of dim:index where index is the value or range to index the dimension | ||
A dictionary of dim:index where index is the value, values or range to index the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A dictionary of dim:index where index is the value, values or range to index the | |
A dictionary of dim:index where index is the value, values, or range to index the |
"""Given a dictionary mapping the index name to either a value or a range represented as a | ||
tuple, return an Imagestack with each dimension indexed accordingly | ||
|
||
Parameters | ||
---------- | ||
indexers : Mapping[Axes, Union[int, tuple]] | ||
A dictionary of dim:index where index is the value or range to index the dimension | ||
indexers : Mapping[Axes, Union[int, tuple, list]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indexers : Mapping[Axes, Union[int, tuple, list]] | |
indexers : Mapping[Axes, Union[int, Sequence]] |
@@ -8,7 +8,8 @@ | |||
|
|||
|
|||
def convert_to_selector( | |||
indexers: Mapping[Axes, Union[int, slice, tuple]]) -> Mapping[Hashable, Union[int, slice]]: | |||
indexers: Mapping[Axes, Union[int, slice, tuple, list]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indexers: Mapping[Axes, Union[int, slice, tuple, list]] | |
indexers: Mapping[Axes, Union[int, slice, Sequence]] |
@@ -8,7 +8,8 @@ | |||
|
|||
|
|||
def convert_to_selector( | |||
indexers: Mapping[Axes, Union[int, slice, tuple]]) -> Mapping[Hashable, Union[int, slice]]: | |||
indexers: Mapping[Axes, Union[int, slice, tuple, list]] | |||
) -> Mapping[Hashable, Union[int, slice, list]]: | |||
"""Converts a mapping of Axis to int, slice, or tuple to a mapping of str to int or slice. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Converts a mapping of Axis to int, slice, or tuple to a mapping of str to int or slice. The | |
"""Converts a mapping of Axis to int, slice, or Sequence to a mapping of str to int, slice, or list. The |
return_dict: MutableMapping[Hashable, Union[int, slice]] = { | ||
ind.value: slice(None, None) for ind in Axes} | ||
return_dict: MutableMapping[Hashable, Union[int, slice, list]] = { | ||
ind.value: ind.value if type(ind.value) is list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ind.value: ind.value if type(ind.value) is list | |
ind.value: ind.value if isinstance(ind.value, list) |
assert np.array_equal(decoded_sequential[Features.TARGET].values, ['GENE_C', 'GENE_D']) | ||
|
||
|
||
def test_composite_codebook_decoding_seperate_sized_stacks(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good if this had a short description like test_composite_codebook_decoding
.
['GENE_C', 'GENE_D', 'GENE_E']) | ||
|
||
|
||
def test_composite_codebook_mixed_round(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
450bb81
to
0643359
Compare