Better averaging strategy for heatmap tiles #135

Open
1 of 2 tasks
JobLeonard opened this issue Nov 8, 2017 · 1 comment

JobLeonard commented Nov 8, 2017

Because of issues #114 and #134 I'm looking at this code again, and I've done a quick investigation into strategies for generating zoomed-out tiles from our data.

Currently, we pick the top-left corner out of four data-points. This allows for enormous systematic biases: half of the genes are removed for each zoom level! I think we can do better.

For comparison, here is a pixel-perfect zoomed-in view of the cortex.loom dataset:

screenshot-2017-11-8 published cortex loom 12

Ideally, we want to maintain similar brightness, some sense of noise profile, and visible structures. In practice we will need to compromise on something that does well, but not perfectly, on all three.

top-left pick (current strategy)

screenshot-2017-11-8 published cortex loom 8
screenshot-2017-11-8 published cortex loom 7
screenshot-2017-11-8 published cortex loom 6

This happens to work decently enough on this dataset, presumably because the distribution in the data is random enough to counter the systematic bias. On other datasets the zoomed-out view is almost completely blue, despite having non-blue rows, hiding interesting spots.

Also, structures present in zoomed-in views (rows and columns that have expression levels from top to bottom) are almost completely gone when zooming out.
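
To make the bias of the top-left pick concrete, here is a minimal sketch of that strategy. The function name, the toy data, and the assumption that tile dimensions are even are mine, not taken from the loom code.

```python
import numpy as np

def downsample_top_left(tile):
    """Halve both dimensions by keeping only the top-left value of each 2x2 block.

    Hypothetical helper for illustration; assumes both dimensions are even.
    """
    rows, cols = tile.shape
    # Axes after reshape: (block_row, row_in_block, block_col, col_in_block)
    blocks = tile.reshape(rows // 2, 2, cols // 2, 2)
    return blocks[:, 0, :, 0]

# Made-up data: only the odd rows carry any signal.
tile = np.zeros((4, 4), dtype="float32")
tile[1::2, :] = 5.0

zoomed_out = downsample_top_left(tile)
# Every odd row is discarded, so the signal disappears entirely.
```

With this input the zoomed-out tile contains only zeros, which is the "almost completely blue" failure mode in miniature.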

Average

screenshot-2017-11-8 published cortex loom 11
screenshot-2017-11-8 published cortex loom 10
screenshot-2017-11-8 published cortex loom 9

Too smooth, and because the value distribution is not uniform it introduces a bias of its own by dragging the high values down. It does preserve structure better.
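
A minimal sketch of plain averaging over 2x2 blocks, showing how a single high value gets dragged down. The function name and the toy values are hypothetical, and even tile dimensions are assumed.

```python
import numpy as np

def downsample_mean(tile):
    """Halve both dimensions by averaging each 2x2 block.

    Hypothetical helper for illustration; assumes both dimensions are even.
    """
    rows, cols = tile.shape
    # Axes after reshape: (block_row, row_in_block, block_col, col_in_block)
    blocks = tile.reshape(rows // 2, 2, cols // 2, 2)
    return blocks.mean(axis=(1, 3))

# Made-up data: one bright value surrounded by zeros.
tile = np.array([[8.0, 0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0, 0.0]], dtype="float32")

zoomed_out = downsample_mean(tile)
# The bright 8.0 is averaged with three zeros and becomes 2.0.
```

In sparse data most neighbours of a bright value are zero, so each zoom level cuts isolated peaks to roughly a quarter.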

Max value

screenshot-2017-11-8 published cortex loom

Yeah... moving on...

Max value per column, average per row

screenshot-2017-11-8 published cortex loom 5
screenshot-2017-11-8 published cortex loom 4
screenshot-2017-11-8 published cortex loom 3
screenshot-2017-11-8 published cortex loom 2
screenshot-2017-11-8 published cortex loom 1

Now we're getting somewhere! While still biased too much towards the maximum values, resulting in higher values every time we zoom out, this maintains the structure visible when looking at the zoomed-in tiles.
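
A rough sketch of this strategy under one reading of the wording: within each 2x2 block, take the max over the two horizontally adjacent columns in each row, then average the two resulting row values. That reading, the function name, and the toy data are assumptions for illustration.

```python
import numpy as np

def downsample_max_then_mean(tile):
    """Max across the two columns of each 2x2 block, then mean across its two rows.

    Hypothetical helper for illustration; assumes both dimensions are even.
    """
    rows, cols = tile.shape
    # Axes after reshape: (block_row, row_in_block, block_col, col_in_block)
    blocks = tile.reshape(rows // 2, 2, cols // 2, 2)
    per_row_max = blocks.max(axis=3)   # collapse the column pair with max
    return per_row_max.mean(axis=1)    # average the two rows of each block

# Made-up data: the left 2x2 block holds 8, 0 / 0, 2.
tile = np.array([[8.0, 0.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0, 0.0]], dtype="float32")

zoomed_out = downsample_max_then_mean(tile)
# Left block: row maxima are 8 and 2, whose mean is 5 (a plain average would give 2.5).
```

The result sits above the plain average, which matches the observation that values creep upward with every zoom level.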

Max-biased weighted average per column, average per row

screenshot-2017-11-8 published cortex loom
screenshot-2017-11-8 published cortex loom 20
screenshot-2017-11-8 published cortex loom 19
screenshot-2017-11-8 published cortex loom 18

We take the weighted average per column, weighting the max and min values 3:1. Then we take the plain average between rows.
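
A minimal sketch of the 3:1 weighting, assuming "per column" means combining the two horizontally adjacent values in each row of a 2x2 block and "between rows" means averaging the two row results. The function name, its parameters, and the toy data are hypothetical.

```python
import numpy as np

def downsample_max_biased(tile, max_weight=3.0, min_weight=1.0):
    """Weighted max/min average across each column pair, then mean across the two rows.

    Hypothetical helper for illustration; assumes both dimensions are even.
    """
    rows, cols = tile.shape
    # Axes after reshape: (block_row, row_in_block, block_col, col_in_block)
    blocks = tile.reshape(rows // 2, 2, cols // 2, 2)
    weighted = (blocks.max(axis=3) * max_weight + blocks.min(axis=3) * min_weight)
    weighted = weighted / (max_weight + min_weight)
    return weighted.mean(axis=1)  # plain average between the two rows

# Made-up data: the left 2x2 block holds 8, 0 / 0, 2.
tile = np.array([[8.0, 0.0, 0.0, 0.0],
                 [0.0, 2.0, 0.0, 0.0]], dtype="float32")

zoomed_out = downsample_max_biased(tile)
# Left block: rows give (3*8 + 0)/4 = 6 and (3*2 + 0)/4 = 1.5, whose mean is 3.75.
```

For this block the result (3.75) lands between the plain average (2.5) and the max-then-average value (5). Changing `max_weight` and `min_weight` is one way to experiment with the brightness drift.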

While brightness still slowly increases as we zoom out (this might be tweakable with a different weight, but it also depends on the underlying values so I don't think there is a "generic" way of doing this), it is not that pronounced, and it maintains the aforementioned benefits.

I think the last strategy is a good replacement for our current one. Also, we're using numpy methods, so this does not create a significant slowdown.

  • settle on new strategy for merging the data
  • re-tile all files on the server
@JobLeonard JobLeonard self-assigned this Nov 8, 2017
JobLeonard added a commit that referenced this issue Nov 8, 2017
- use numpy methods to calculate min/max faster
- improve CLI feedback for user when calculating min/max
- change from "pick top-left datapoint" to
 "max-biased weighed average". Preserves structure
 a LOT better when zooming out.

See issue #135 on github for more details
@JobLeonard

Old vs New:

https://www.youtube.com/watch?v=IjYZybeB4N4

https://www.youtube.com/watch?v=AB86fNJuzOU

(also, the private server is a few versions behind in terms of the loom-viewer. @pl-ki, can you show me tomorrow how it was set up and how I can update it?)
