[WIP] Progressive loading for 2D and 3D data #5561

Open
Tracked by #5942
kephale opened this issue Feb 13, 2023 · 27 comments
@kephale
Contributor

kephale commented Feb 13, 2023

🧰 Task

A little background on the text: the text that I am initially adding here was developed for an internal process from Dec 2022 - Feb 2023, but is being shared out to make the effort 100% public. This is my preference as well. This document is closely related to code in this repository: https://github.com/kephale/napari-multiscale-rendering-prototype. After this document was written fun things happened and we now have a great community effort with @jni @alisterburt @GenevieveBuckley @ppwadhwa. Please take this document lightly because we've changed quite a few things since the original writing (e.g. starting on 3D first instead of 2D) :)

Progressive Loading for napari

napari users need to be able to visualize large datasets efficiently, especially for out-of-core data (data that is larger than system or GPU memory). While there are numerous approaches to visualizing out-of-core datasets, there are some underlying motifs in the design of these approaches. This document presents these motifs as individual requirements, and the corresponding engineering efforts cover the exploration, design, and implementation of algorithms that meet each requirement.

Large array data (e.g. dask, zarr) uses different representations than small array data (e.g. numpy), where the array is subdivided into chunks. This facilitates data access and, in the case of zarr, can improve writing performance. The use of chunks also makes it easier to perform tiled rendering, where a subset of tiles/chunks can be rendered at any given time. Tiled rendering is the primary strategy for incrementally updating a display with subsets of data.
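As a rough illustration of why chunking helps tiled rendering, the sketch below computes which chunks of a 2D array overlap the current viewport, so that only those need to be fetched. The function name `visible_chunks` and its arguments are hypothetical and are not part of napari, dask, or zarr.

```python
def visible_chunks(view_min, view_max, chunk_shape, array_shape):
    """Return (row, col) indices of chunks that overlap a 2D viewport.

    view_min is inclusive and view_max is exclusive, both given as
    (y, x) in data (pixel) coordinates.
    """
    ranges = []
    for lo, hi, chunk, size in zip(view_min, view_max, chunk_shape, array_shape):
        # Clip the viewport to the extent of the array.
        lo = max(0, lo)
        hi = min(size, hi)
        if hi <= lo:
            return []  # viewport lies entirely outside the array
        first = lo // chunk
        last = (hi - 1) // chunk
        ranges.append(range(first, last + 1))
    return [(r, c) for r in ranges[0] for c in ranges[1]]


# A 4096 x 4096 array stored in 512 x 512 chunks.
chunks = visible_chunks((100, 900), (600, 1500), (512, 512), (4096, 4096))
```

Only 4 of the 64 chunks are touched for this viewport, which is the property that lets a renderer update the display incrementally.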

Tiled rendering is not sufficient on its own. If we only perform tiled rendering, then the user will often see partially empty images due to tiles that have yet to be loaded/rendered. To smooth this experience for the user we use multiscale rendering, where downsampled versions of the image are quickly loaded first and repainted with higher resolution data. Multiscale rendering minimizes the amount of time the user sees an empty canvas.
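One way to picture the coarse-to-fine repainting is sketched below. It assumes a pyramid with a downsampling factor of 2 between levels, where level 0 is full resolution; `choose_scale_level` and `loading_order` are hypothetical helper names, not napari API.

```python
from math import floor, log2


def choose_scale_level(data_px_per_screen_px, num_levels):
    """Pick the coarsest level whose pixels are no larger than a screen pixel.

    Assumes each level halves the resolution of the previous one.
    """
    if data_px_per_screen_px <= 1:
        return 0  # zoomed in past native resolution: use full resolution
    return min(num_levels - 1, floor(log2(data_px_per_screen_px)))


def loading_order(target_level, num_levels):
    """Levels to paint, from the coarsest down to the target level."""
    return list(range(num_levels - 1, target_level - 1, -1))


target = choose_scale_level(5.5, num_levels=5)
order = loading_order(target, num_levels=5)
```

The coarsest level is small enough to arrive quickly, so the canvas is filled almost immediately and then refined as finer levels load.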

Caching is an important aspect of loading large data, because data is often loaded from a remote data source. Any implementation of Progressive Loading should include a caching strategy, and the caching strategy should accommodate multiple large data sources and data types. The implementation of the caching strategy may require 2 components: a disk cache and a GPU memory cache, where the disk cache is important for CPU-based Progressive Loading and the GPU cache is important for 3D volume rendering.

Tunable quality and memory constraints are desirable. In particular, being able to guarantee an upper bound on memory usage makes it easier to ensure that the rendering method can run on performance-constrained hardware. Similar concerns are relevant to the management of cache size when multiple Progressive Loading layers are being used.
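A minimal sketch of a cache with a guaranteed upper bound on memory is shown below. It uses least-recently-used eviction as a stand-in policy; the class name `ByteBudgetCache` is hypothetical and this is not the caching strategy used in the prototype.

```python
from collections import OrderedDict


class ByteBudgetCache:
    """LRU cache that never holds more than max_bytes of chunk data."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._items = OrderedDict()  # key -> (value, nbytes)

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key][0]

    def put(self, key, value, nbytes):
        if nbytes > self.max_bytes:
            return False  # this item can never fit in the budget
        if key in self._items:
            self.used_bytes -= self._items.pop(key)[1]
        # Evict least recently used entries until the new item fits.
        while self.used_bytes + nbytes > self.max_bytes:
            _, (_, old_nbytes) = self._items.popitem(last=False)
            self.used_bytes -= old_nbytes
        self._items[key] = (value, nbytes)
        self.used_bytes += nbytes
        return True


cache = ByteBudgetCache(max_bytes=100)
cache.put("a", "chunk-a", 60)
cache.put("b", "chunk-b", 30)
cache.get("a")                 # touch "a" so that "b" becomes the oldest
cache.put("c", "chunk-c", 40)  # evicts "b" to stay within 100 bytes
```

Sharing one such budget across several layers is one possible way to keep total cache size bounded when multiple Progressive Loading layers are open.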

Ultimately, the Progressive Loading effort should introduce support for 2D and 3D visualization of Image, Label, and Point datasets for datasets that exceed system memory. Within the scope of the effort addressed by this RFC, prototypes will be developed that provide this support. These prototypes will go through user testing and will be used to determine the best way to achieve a polished product.

Approach

Our approach to providing Progressive Loading involves developing Python-based large data rendering prototype plugins, as well as exploring the prospect of developing a napari plugin around an existing tool that supports large data rendering, BigDataViewer.

Key points:

  • These prototype plugins will be evaluated by documenting the experiences of select members of the community with exemplary use cases (cc @LCObus @dgmccart @chili-chiu ).
  • Python-based prototypes will leverage dask and zarr to display out-of-core images in napari.
  • The Python-based large data renderer will require improved support for dask delayed, and a custom caching strategy within the napari ecosystem.
  • The Python prototype of a large image volume renderer may require exploration of additional rendering libraries, such as the CZI EOSS funded project vispy2.
  • Alternative renderers, like BigDataViewer, require a different approach that is being addressed in parallel (DM @kephale; will update here as things are posted).

Communication

This RFC describes a learning effort that is focused on developing multiple prototypes for new technologies. Scheduling deadlines for these prototypes can lead to unnecessary re-planning because there is significant community interest and activity related to large image data and progressive loading; some of the work will have to be prioritized opportunistically. However, to support transparency and observability, activity on the project will be actively shared on an ongoing basis (see the existing Zulip stream for example), including a monthly shareout at community meetings.

Some places for tracking updates about this project include:

Core Design

For simplicity, the basic architecture will be developed with a focus on 2D; however, 3D prototypes will simultaneously be explored to ensure generalizability beyond 2D. The initial version has been prototyped to include simplified tiled rendering, basic caching, and a multiscale rendering strategy. However, there are some key limitations in the existing prototype:

  • the vispy/GPU component of the renderer is currently a hack that will break with extremely large 2D data (e.g. when the 2D canvas array exceeds available GPU memory)
  • only the x and y axis work (Z is hard coded); the user can only see/access a 2D slice of the image
  • there is no UI for selecting groups/hierarchical data
  • chunk data fetching is not thread-safe
  • chunk data should be fetched with a Dask executor instead of the threading module
  • a more advanced cache should be used to support multiple large image layers

These efforts can be partitioned into 3 focus areas: rendering, UI, and data fetching/caching. Finally, some tooling work to support testing data size and type limits and visually debugging the resulting prototypes is also necessary.

Development is based upon the latest version of the napari repository (and other plugins), and may potentially leverage unreleased features and private variables. When the prototypes are adapted for production, version pinning will take place.

Prioritization of effort is key for the development of progressive loading because there are a number of potential bottlenecks (data fetching, caching, data/chunk prioritization, texture updates, etc.). Ultimately we would like napari to have the highest quality rendering possible, including the progressive loading aspect. However, given that the current vispy will be replaced by vispy2, there is strong motivation to focus as much effort as possible on implementing progressive loading with minimal usage of vispy and/or GPU-specific code. Achieving ideal rendering quality will eventually require this type of more focused computer graphics effort, but within the scope of this effort, the goal is to develop a progressive loading implementation for 2D and 3D that will be robust to future code changes.

Rendering

The rendering efforts here refer to the implementation of tiled rendering and GPU memory management.

The current 2D prototype assumes that it is possible to pre-allocate a large 2D plane that represents a cross-section of the dataset. This works in many cases, but will fail in cases with a large 2D extent. Furthermore, it is not efficient and leads to overallocation of GPU memory. The 2D memory allocation should be revised to be determined based upon napari's current display size.
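To make the difference concrete, the sketch below estimates the texture size needed when allocation follows the canvas rather than the dataset. It assumes the scale level is chosen so that one texture pixel maps to roughly one canvas pixel; `tiles_needed` and `texture_bytes` are hypothetical names.

```python
from math import ceil


def tiles_needed(canvas_px, tile_px):
    """Tiles per axis needed to cover the canvas at any pan offset.

    A span of c pixels can overlap at most ceil(c / t) + 1 tiles of
    size t when it is not aligned to the tile grid.
    """
    return tuple(ceil(c / t) + 1 for c, t in zip(canvas_px, tile_px))


def texture_bytes(canvas_px, tile_px, bytes_per_pixel=1):
    """Bytes required for a texture that holds every visible tile."""
    n_rows, n_cols = tiles_needed(canvas_px, tile_px)
    return n_rows * tile_px[0] * n_cols * tile_px[1] * bytes_per_pixel


# A 1080 x 1920 canvas with 512 x 512 tiles of 8-bit data.
grid = tiles_needed((1080, 1920), (512, 512))
needed = texture_bytes((1080, 1920), (512, 512))
```

The allocation here is about 5 MB and is independent of the dataset extent, whereas pre-allocating a full cross-section grows with the square of the image side length.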

In principle, the 3D implementation for tiled, multiscale rendering may follow the same design as 2D. The key differences are: rendering 3D data to a canvas requires more data than 2D, and rendering 3D data benefits from GPU power, but this requires GPU memory management. These differences are significant because the amount of additional data that volume rendering requires is large enough that it is often not possible to render the image data at the highest resolution. This necessitates rendering and caching strategies that efficiently move data between remote, local, and GPU storage in a way that leverages multiscale representations.

A special consideration in the case of the rendering effort is "vispy2." vispy2 is funded by CZI's EOSS program, and the vispy2 library is designed to replace vispy, which is napari's current rendering backend. Not only does vispy2 propose numerous modern features, but vispy itself is built on top of OpenGL, which is deprecated on macOS. After discussions with the vispy2 team, it is clear that the progressive loading effort should not expect to build on top of vispy2 within the current timeframe, but we are enthusiastic about the opportunity to stay engaged with vispy2 during their development process.

User Interface

UI efforts will focus on multiple needs:

  • A feature to select different renderers (existing Image renderer and large data renderer) to support user testing (this feature may not be needed in production),
  • UI controls need to work with the large data layer (e.g. channel, Z, time sliders), and
  • a new ability to interactively open/select datasets from HDF5/zarr hierarchical files.

UI efforts will leverage napari-hierarchical, which is a plugin that allows users to explore HDF5 and zarr datasets (note that there is other related work planned on savable viewer states in napari that may leverage napari-hierarchical). For example, menu items for selecting renderers will be exposed as context menus in napari-hierarchical. Note that these UI choices are about the convenience of prototyping, and UX feedback will be recorded during user testing reports.

Work packages are described in this GitHub issue.

The user experience testing tool may have a basic data collection component that would allow users to export properties of their data that will be informative for development, such as: data dimensionality, image size, and file size.

Data Fetching and Caching

Data fetching and caching is critical for large data handling. Datasets are generally fetched from remote sources and often exceed local resources (disk, system memory, and GPU memory). As a consequence it is important to carefully manage data retrieval and caching.

Data retrieval can be accelerated by using multiple threads, which is the case in the current 2D multiscale tiled rendering prototype. However, the current implementation uses Python’s threading module with a naive batch parallelization strategy, where N threads start fetching chunks at the same time but every thread must wait until all threads in the batch are complete. Dask executors are an abstraction for parallel processing that work in local and remote environments. Large data processing in napari is already heavily reliant on Dask array data structures. Dask provides support for monitoring threads and controlling Python’s threading module, making it a better option than threading when something more than simple threads is needed. On the other hand, other tools that support parallel processing, such as Ray, Joblib, etc., either would require new dependencies to be added to napari or do not have native support for large arrays.
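The difference from batch waiting can be sketched with Python's standard `concurrent.futures` module, used here only as a dependency-free stand-in for a Dask executor (which offers a comparable futures-style interface). `fetch_chunk` is a hypothetical placeholder for a remote read, with `delay` simulating network latency.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_chunk(key, delay):
    """Hypothetical stand-in for reading one chunk from a remote store."""
    time.sleep(delay)
    return key, f"data-{key}"


def fetch_progressively(requests, on_chunk, max_workers=4):
    """Hand each chunk to on_chunk as soon as it arrives.

    Unlike a batch strategy, a slow chunk does not hold back chunks
    that have already finished loading.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_chunk, key, delay) for key, delay in requests]
        for future in as_completed(futures):
            on_chunk(*future.result())


arrival_order = []
fetch_progressively(
    [("slow", 0.3), ("fast", 0.01)],
    lambda key, data: arrival_order.append(key),
)
```

In a renderer, `on_chunk` would be the point where a tile is uploaded and the canvas repainted, so fast chunks become visible without waiting for slow ones.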

Although naive caching strategies can be easily implemented, it is important to consider multiscale-aware caching strategies, where particular scales can be cached to ensure a natural user interaction. These caching strategies are important in the context of support for multiple layers. Cache resources must be shared evenly across layers; otherwise users may observe uneven performance when working with multiple layers simultaneously.

Some attention will be invested into considering the possibility of a persistent cache. There are many cases where a user will work with a particular large image dataset across multiple napari sessions. A persistent cache would speed up the user's experience on subsequent napari sessions, after an initial session that loads the data.

While the prototype leveraged the native Python functools-based cache, a Dask cache (based on cachey) will need to be developed. This will require exposing the cache contents to enable more intelligent methods for cache clearing. The approach will require adapting element prioritization within the cache.
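The sketch below illustrates the two ideas in this paragraph: exposing cache contents and evicting by priority. It does not use cachey or Dask, and the class `PriorityChunkCache`, the key layout `(layer, level, chunk_index)`, and the rule that coarser levels get higher priority are all assumptions made for illustration.

```python
class PriorityChunkCache:
    """Cache that evicts the lowest-priority entry when it is full."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._entries = {}  # key -> (priority, value)

    def put(self, key, value, priority):
        self._entries[key] = (priority, value)
        while len(self._entries) > self.max_items:
            victim = min(self._entries, key=lambda k: self._entries[k][0])
            del self._entries[victim]

    def get(self, key):
        entry = self._entries.get(key)
        return None if entry is None else entry[1]

    def contents(self):
        """Expose (key, priority) pairs, lowest priority first."""
        pairs = [(key, prio) for key, (prio, _) in self._entries.items()]
        return sorted(pairs, key=lambda pair: pair[1])


def scale_priority(level):
    """Assumed rule: coarser levels are kept longer as a cheap fallback."""
    return float(level)


cache = PriorityChunkCache(max_items=2)
for level in (0, 3, 1):
    key = ("image", level, (0, 0))  # (layer, level, chunk_index)
    cache.put(key, f"chunk-L{level}", scale_priority(level))
```

Keeping coarse levels resident means there is always something to draw while finer chunks are refetched, which is one reading of "multiscale-aware" caching.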

Finally, the design of caches will need to be aligned with the Dask cache, the GPU cache (in the case of volume rendering), and the potential persistent cache.

Tooling Work

Overall the goal of the proposed tooling work is to support the Progressive Loading development process, but these tools are not direct goals of the effort. The Large Data Generator tools have the goal of accelerating the development cycle by reducing reliance on remote downloads of 100s of GB of data during feature testing, and increasing the availability of different data sizes and types. The Visual Debugger is important for confirming that the Progressive Loading effort has been successful by inspecting the resolutions of image data as they are loaded. The amount of effort required is expected to be minimal (approximately 1 FTE week).

Large Data Generators

There are a number of parameters of each dataset we consider, such as number of scales, pixel type, layer type, dimensionality, and size. Not all combinations of these parameters are present in our example test data. To support testing across wide combinations of parameters, we will leverage simple synthetic data generators.

The use of synthetic data generators to support napari development is discussed and explored in a separate CZI-internal document (DM @kephale, happy to share out and plan to do so regardless of whether anyone asks). In this work we consider 2 simple data generators: checkerboards and fractals (see Figure FractalExample). Both of these patterns can be easily generated on-the-fly, and do not require on-disk storage that can otherwise be prohibitive for storing large test datasets. Our current large datasets for testing are specific precomputed real-world datasets, which means they cannot be easily/efficiently adjusted to have different features like size and shape. In particular, the synthetic data generators will produce multiscale datasets with tunable numbers of scales, dimensions, and image size.


Figure FractalExample. An example of a generative fractal based on the Julia Set displayed in napari.
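A minimal, pure-Python sketch of both generator types is given below. Each tile is computed on demand from its indices, so nothing is stored on disk. The function names, the Julia constant `c`, the tile size, and the convention that level 0 is the finest scale are illustrative assumptions, not the generators used in the prototype.

```python
def checkerboard_tile(tile_row, tile_col, tile_size=8, block=4):
    """Compute one tile of an unbounded checkerboard on demand."""
    y0, x0 = tile_row * tile_size, tile_col * tile_size
    return [
        [((y0 + i) // block + (x0 + j) // block) % 2 for j in range(tile_size)]
        for i in range(tile_size)
    ]


def julia_tile(level, tile_row, tile_col, num_levels=4, tile_size=8,
               c=complex(-0.8, 0.156), max_iter=32):
    """Compute one tile of a multiscale Julia-set image on demand.

    Level 0 is the finest scale; each coarser level halves the image
    width, so the coarsest level is a single tile.
    """
    width = tile_size * 2 ** (num_levels - 1 - level)
    tile = []
    for i in range(tile_size):
        row = []
        for j in range(tile_size):
            y = tile_row * tile_size + i
            x = tile_col * tile_size + j
            # Map the pixel center into the square [-1.5, 1.5] x [-1.5, 1.5].
            z = complex(-1.5 + 3.0 * (x + 0.5) / width,
                        -1.5 + 3.0 * (y + 0.5) / width)
            count = 0
            while abs(z) <= 2.0 and count < max_iter:
                z = z * z + c
                count += 1
            row.append(count)  # escape time is used as pixel intensity
        tile.append(row)
    return tile


coarse = julia_tile(level=3, tile_row=0, tile_col=0)
fine = julia_tile(level=0, tile_row=7, tile_col=7)
board = checkerboard_tile(0, 0)
```

Changing `num_levels` or `tile_size` changes the virtual image size without any precomputation, which is what makes such generators convenient for testing size and type limits.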

Visual Debugger

The development process for multiscale and tiled renderers is vulnerable to some typical bugs. Half-pixel offset bugs are common in multiscale rendering, where the downsampling method that was used to create a given multiscale dataset affects how the multiscale rendering should be performed. Loading and positioning errors with tiles are also common, where tiles may be loaded from incorrect scales, in the wrong order, or put into the wrong location. While a test suite will ultimately prevent these bugs from being introduced into the final product, during intermediate states of development visual debugging is often critical.
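The half-pixel issue can be illustrated with a small coordinate calculation. The sketch assumes pixel centers sit at integer coordinates, a downsampling factor of 2 per level, and two common ways of building a pyramid: block averaging ("mean") and taking every n-th pixel ("stride"). The function names are hypothetical.

```python
def level_transform(level, method="mean", base_factor=2):
    """Return (scale, translate) mapping level indices to level-0 coordinates."""
    factor = base_factor ** level
    if method == "mean":
        # A coarse pixel averages `factor` fine pixels; its center lies at
        # the middle of that block, not at the block's first pixel.
        offset = (factor - 1) / 2
    elif method == "stride":
        # A coarse pixel is a copy of the first fine pixel in each block.
        offset = 0.0
    else:
        raise ValueError(f"unknown downsampling method: {method}")
    return factor, offset


def to_level0(index, level, method="mean"):
    """Level-0 coordinate of the center of pixel `index` at `level`."""
    scale, translate = level_transform(level, method)
    return index * scale + translate


mean_center = to_level0(3, level=2, method="mean")      # 3 * 4 + 1.5
stride_center = to_level0(3, level=2, method="stride")  # 3 * 4 + 0
```

Rendering a block-averaged pyramid with the "stride" transform shifts each level by a different sub-pixel amount, which shows up as the image jumping slightly whenever the displayed scale changes.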

To address these issues, a visual debugging tool will be developed that provides the developer with visual overlays to indicate scale and other source information about chunks/tiles. See Figure VisualDebugger for a 2D mockup, and this video for a demo that shows a visual debugger overlay of resolutions for 3D.


Figure VisualDebugger: An overlay view will help developers debug multiscale rendering issues by using color overlays to indicate the active scale level.

Notes

Q1: How to handle napari’s current strategy for arbitrary slicing of volumes? (See Lorenzo’s message) Can the slicing plane be used for chunk prioritization similar to the current implementation with camera?

Decision1: (Kyle has a proposed solution in Lorenzo’s thread)
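As one possible reading of Q1 (and not necessarily the solution proposed in the thread), chunks could be ordered by their distance to the slicing plane in the same way they are ordered by distance to the camera. The helper names below are hypothetical.

```python
from math import sqrt


def plane_distance(point, plane_point, plane_normal):
    """Unsigned distance from a point to a plane given by point and normal."""
    norm = sqrt(sum(n * n for n in plane_normal))
    dot = sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal))
    return abs(dot) / norm


def prioritize_chunks(chunk_centers, plane_point, plane_normal):
    """Order chunk centers so those nearest the slicing plane load first."""
    return sorted(
        chunk_centers,
        key=lambda center: plane_distance(center, plane_point, plane_normal),
    )


# Slicing plane at z = 10 with its normal along the first (z) axis.
centers = [(0, 5, 5), (9, 5, 5), (30, 5, 5)]
ordered = prioritize_chunks(centers, plane_point=(10, 0, 0), plane_normal=(1, 0, 0))
```

Because the normal can point in any direction, the same ordering works for arbitrary (non-axis-aligned) slicing planes.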

@kephale kephale added the task Tasks for contributors and maintainers label Feb 13, 2023
@kephale kephale self-assigned this Feb 13, 2023
@alisterburt
Contributor

Really enjoyed reading this @kephale, thanks for sharing out! Clarified my thinking on the relationship between multiscale and progressive rendering. Super excited to see how this work pans out!

@royerloic
Member

Very cool Kyle! I like the way you are thinking about this.
Looking forward to seeing what will happen here!

@kephale
Contributor Author

kephale commented Mar 21, 2023

Hi Folks!

I recorded a walkthrough of the current version of the code from https://github.com/kephale/napari/tree/poor-mans-octree which demonstrates the end-to-end test for multiscale volume rendering. This is not the full volume renderer!

You can see a quick demo of what the code does here:
https://drive.google.com/file/d/10W5z-z9QAzKVL4WGNXjjO5NyAbUOyU6K/view?usp=sharing

The walkthrough is here: https://drive.google.com/file/d/117Tnh2acUUq1yFhlxrPXS5DNNV766RT2/view?usp=sharing. More ramble-y than I'd hoped but it is better to work on the follow up work than re-record.

Cheers,
Kyle

@kephale
Contributor Author

kephale commented Mar 21, 2023

My current plan is to use lessons learned from the code in the current volume rendering demo to update my prototype for 2D multiscale tiled rendering. The reasoning is that the 2D version is not strongly impacted by the 2 major next steps for volume rendering:

  • smarter data fetching (we only need enough data to draw a 2D plane), and
  • improved rendering (2D rendering is easier and doesn't require new shaders).

The 2D tiled renderer is actually not sooo far off 🤞 (old video of my 2D prototype: https://drive.google.com/file/d/1MFotcXLs5WCUjNU_Y8LeT-HNP6YOrIbz/view?usp=sharing). The hope is to clean up any code that will be shared between the 2D and 3D tiled rendering, particularly for data fetching/caching, and finish prototyping the 2D tiled renderer (incl compatibility with @andy-sweet's async improvements).

@jni
Member

jni commented Mar 21, 2023

Thanks @kephale! My hope is that we can share as much code between the two paths as possible, using functional abstractions to separate them, e.g. should_fetch_chunk() -> bool would have 2D and 3D paths, chunk_priority() -> float same, but the render loop (has the camera moved, if so grab chunks asynchronously until the scene is how we want it) should be the same for both. Does that make sense? Are there reasons why that can't work?
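A rough sketch of the separation suggested here is shown below: the planning step is shared, and only the two small functions differ between 2D and 3D. The `Chunk` and `Camera` types and the specific fetch and priority rules are hypothetical illustrations, not the napari implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Chunk:
    level: int                 # 0 is the finest scale
    center: Tuple[float, ...]  # chunk center in world coordinates


@dataclass(frozen=True)
class Camera:
    center: Tuple[float, ...]
    view_radius: float


def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def should_fetch_chunk_2d(chunk, camera) -> bool:
    # 2D rule (assumed): fetch chunks inside the visible region.
    return distance(chunk.center, camera.center) <= camera.view_radius


def should_fetch_chunk_3d(chunk, camera) -> bool:
    # 3D rule (assumed): coarser chunks are acceptable farther away.
    return distance(chunk.center, camera.center) <= camera.view_radius * 2 ** chunk.level


def chunk_priority(chunk, camera) -> float:
    # Smaller values are fetched first; nearest chunks win.
    return distance(chunk.center, camera.center)


def plan_fetches(chunks, camera, should_fetch, priority):
    """Shared step of the render loop: filter chunks, then order them."""
    wanted = [c for c in chunks if should_fetch(c, camera)]
    return sorted(wanted, key=lambda c: priority(c, camera))


plan_2d = plan_fetches(
    [Chunk(0, (3, 4)), Chunk(0, (30, 40)), Chunk(0, (1, 0))],
    Camera((0, 0), 10),
    should_fetch_chunk_2d,
    chunk_priority,
)
plan_3d = plan_fetches(
    [Chunk(0, (0, 0, 15)), Chunk(1, (0, 0, 15))],
    Camera((0, 0, 0), 10),
    should_fetch_chunk_3d,
    chunk_priority,
)
```

With this split, the loop that reacts to camera movement and fetches asynchronously would call `plan_fetches` unchanged for both display modes.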

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/napari-viewing-3d-image-of-large-tif-stack-cropping-image-w-general-shape/55500/5

@kephale
Contributor Author

kephale commented May 19, 2023

Hello Friends!

Cool demo today:

I've derived an example from an awesome Mandelbrot demo made by the Vizarr team (see original here: https://colab.research.google.com/github/hms-dbmi/vizarr/blob/main/example/mandelbrot.ipynb).

Here you can see 2D tiled loading in napari of a 134,217,728 x 134,217,728, 8-bit image with 18 scales (representing ~18 petabytes). There are still plenty of things to improve, and yes the Mandelbrot set is rotated from its normal orientation.
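The size quoted above can be checked with a little arithmetic. The sketch assumes one byte per pixel (8-bit data) and, as an assumption not stated in the comment, a downsampling factor of 2 between consecutive scales.

```python
side = 134_217_728           # pixels along each axis, equal to 2**27
total_bytes = side * side    # one byte per 8-bit pixel
petabytes = total_bytes / 1e15

# Side length of each of the 18 scales, assuming a factor of 2 per scale.
level_sides = [side >> level for level in range(18)]
```

Under these assumptions the full-resolution plane is 2**54 bytes, or about 18 PB, and the coarsest of the 18 scales is a 1024 x 1024 image that loads almost instantly.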

This isn't the visual experience that will be delivered at the end of the day. The main point and excitement of this demo is the size of the image being visualized in napari with tiled loading that does not block user input.

napari_2D_tiledrendering_mandelbrot_18scales.mp4

@kevinyamauchi
Contributor

Wow, this is super exciting, @kephale ! @jluethi might be interested in this.

@alisterburt
Contributor

so so so sick!!!

@andy-sweet
Member

Very nice! Is the mandelbrot example reusable as an independent napari script? Or is it tightly coupled to the implementation right now? I'd be curious to try it out with #5816 - I'm pretty confident it will be quite a bit worse (maybe even a non-starter), but it's hard to avoid the comparison.

@kephale
Contributor Author

kephale commented May 19, 2023

> Very nice! Is the mandelbrot example reusable as an independent napari script? Or is it tightly coupled to the implementation right now? I'd be curious to try it out with #5816 - I'm pretty confident it will be quite a bit worse (maybe even a non-starter), but it's hard to avoid the comparison.

Reusable! I'll clean the code up next week, some of the changes that made this demo possible leaked into the Mandelbrot part of the code.

@kephale
Contributor Author

kephale commented May 19, 2023

@kcpevey has been helping out recently as well and was a great help when puzzling through some of the recent bugs!

@rabernat

Very cool demo! 🤯

> I've derived an example from an awesome Mandelbrot demo made by the Vizarr team (see original here: https://colab.research.google.com/github/hms-dbmi/vizarr/blob/main/example/mandelbrot.ipynb).

I tried running this and found I needed to make the following change for it to work

- store = MandlebrotStore(levels=50, tilesize=512, compressor=Blosc())
+ store = zarr.storage.KVStore(MandlebrotStore(levels=50, tilesize=512, compressor=Blosc()))

Without this, you get the error

ValueError: Starting with Zarr 2.11.0, stores must be subclasses of BaseStore, if your store exposes the MutableMapping interface wrap it in Zarr.storage.KVStore. Got <__main__.MandlebrotStore object at 0x7f0bf0bc7010>

kephale added a commit to kephale/napari that referenced this issue May 25, 2023
@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/creating-an-ome-zarr-dynamically-from-tiles-stored-as-a-series-of-images-list-of-centre-positions-using-python/81657/6

@kephale
Contributor Author

kephale commented Jun 20, 2023

Howdy Friends,

I am thrilled to share this demo: this is a significantly more extensive example of multiscale volume rendering in napari.

Let me introduce you to the Mandelbulb, if you are not already acquainted. It is a 3D fractal based upon the Mandelbrot set. This is one particular variant of the Mandelbulb, there are others to investigate if you are interested.

In this demo I show a 3D multiscale image with a size of (1_048_576, 1_048_576, 1_048_576), which at one byte per voxel is the equivalent of 1152921504606846976 bytes (2^60), also known as 1 exbibyte (roughly 1.15 exabytes). We do not access/generate all of this data, but as you can see we can successfully browse the data as a volume render.

The video is quite long because it takes some time to navigate these fractals and I thought I would just share out the whole session. I suggest speeding up the video, or skipping to the point in the video when I start zooming out.

Here is a video that can fit GH's file size requirements (downscaled and some frame dropping):

napari_multiscale_volumerender_003_mandelbulb_trimmed.mp4

Full resolution and no frame dropping: https://drive.google.com/file/d/11hkd64r7ZlZSBCcJXczQjWsl156_wEcp/view?usp=sharing

@alisterburt
Contributor

@kephale my god it's beautiful

@melonora
Contributor

@kephale This is so awesome. Love it!

@kephale
Contributor Author

kephale commented Jun 20, 2023

Many thanks to @psobolewskiPhD for helping with the scale bar behavior! Otherwise it can get confusing when you're deep in the Mandelbulb.

@kephale
Contributor Author

kephale commented Jun 20, 2023

A day late for you @royerloic, but the plan is for Zebrahub to be the next dataset we focus on for this project.

@dpshepherd

This is really exciting!!!! We will put it to use too!

@kevinyamauchi
Contributor

kevinyamauchi commented Jun 21, 2023

Ahhh so exciting! (why doesn't GitHub have a party parrot emoji?)

@kephale
Contributor Author

kephale commented Jul 5, 2023

Tracking some next steps:

@kephale
Contributor Author

kephale commented Aug 11, 2023

Tracking this comment from one of @andy-sweet's PR reviews: #6043 (comment) about tile2data

@kephale
Contributor Author

kephale commented Sep 28, 2023

Here is a super early demo of visualizing images from Zebrahub (@royerloic @JoOkuma and friends). The new thing here is visualizing data with a dimensionality other than the viewing mode (2D or 3D), in this case the data is 4D. In the video I move through a few timesteps and a few scales.

This is still a bit hacky and there are things to improve, so I look forward to doing some cleaning at the hackathon next week.

zebrahub_progloading_early_001.mp4
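The idea of viewing data whose dimensionality exceeds the display mode can be sketched as building an index that keeps the displayed axes whole and fixes the remaining axes at the current slider position. `displayed_slice` is a hypothetical helper, and the axis order (t, z, y, x) is an assumption.

```python
def displayed_slice(ndim, displayed_axes, current_step):
    """Index tuple selecting the displayed axes at the current step.

    Displayed axes are kept whole; every other axis is fixed at the
    position given by current_step (for example a time slider).
    """
    return tuple(
        slice(None) if axis in displayed_axes else current_step[axis]
        for axis in range(ndim)
    )


# 4D data ordered (t, z, y, x), viewed in 3D at timestep 5.
index_3d = displayed_slice(4, displayed_axes=(1, 2, 3), current_step=(5, 0, 0, 0))

# The same data viewed in 2D at timestep 5 and z = 12.
index_2d = displayed_slice(4, displayed_axes=(2, 3), current_step=(5, 12, 0, 0))
```

The resulting tuple can be applied to any array-like object that supports NumPy-style indexing, so chunk selection then only has to reason about the displayed axes.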

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/using-naparis-new-not-yet-released-async-functionality-to-browse-large-ome-zarr-hcs-plates/86984/1

@kephale
Contributor Author

kephale commented Oct 4, 2023

Paired with @kevinyamauchi to make a very satisfying debugging visualization.

Screen.Recording.2023-10-03.at.8.23.40.PM.mp4

@imagesc-bot

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/using-naparis-new-not-yet-released-async-functionality-to-browse-large-ome-zarr-hcs-plates/86984/2
