[WIP] Progressive loading for 2D and 3D data #5561
Comments
Really enjoyed reading this @kephale, thanks for sharing out! Clarified my thinking on the relationship between multiscale and progressive rendering. Super excited to see how this work pans out!
Very cool Kyle! I like the way you are thinking about this.
Hi Folks! I recorded a walkthrough of the current version of the code. You can see a quick demo of what the code does here: The walkthrough is here: https://drive.google.com/file/d/117Tnh2acUUq1yFhlxrPXS5DNNV766RT2/view?usp=sharing. More ramble-y than I'd hoped, but it is better to work on the follow-up work than re-record. Cheers,
My current plan is to use lessons learned from the code in the current volume rendering demo to update my prototype for 2D multiscale tiled rendering. The reasoning is that the 2D version is not strongly impacted by the 2 major next steps for volume rendering:
The 2D tiled renderer is actually not sooo far off 🤞 (old video of my 2D prototype: https://drive.google.com/file/d/1MFotcXLs5WCUjNU_Y8LeT-HNP6YOrIbz/view?usp=sharing). The hope is to clean up any code that will be shared between the 2D and 3D tiled rendering, particularly for data fetching/caching, and finish prototyping the 2D tiled renderer (including compatibility with @andy-sweet's async improvements).
Thanks @kephale! My hope is that we can share as much code between the two paths as possible, using functional abstractions to separate them, e.g.
This issue has been mentioned on Image.sc Forum. There might be relevant details there: |
Hello Friends! Cool demo today: I've derived an example from an awesome Mandelbrot demo made by the Vizarr team (see original here: https://colab.research.google.com/github/hms-dbmi/vizarr/blob/main/example/mandelbrot.ipynb). Here you can see 2D tiled loading in napari of a 134,217,728 x 134,217,728, 8-bit image with 18 scales (representing ~18 petabytes). There are still plenty of things to improve, and yes, the Mandelbrot set is rotated from its normal orientation; this isn't the visual experience that will be delivered at the end of the day. The main point and excitement of this demo is the size of the image being visualized in napari with tiled loading that does not block user input. napari_2D_tiledrendering_mandelbrot_18scales.mp4
so so so sick!!!
Very nice! Is the Mandelbrot example reusable as an independent napari script? Or is it tightly coupled to the implementation right now? I'd be curious to try it out with #5816 - I'm pretty confident it will be quite a bit worse (maybe even a non-starter), but it's hard to avoid the comparison.
Reusable! I'll clean the code up next week; some of the changes that made this demo possible leaked into the Mandelbrot part of the code.
@kcpevey has been helping out recently as well and was a great help when puzzling through some of the recent bugs!
Very cool demo! 🤯
I tried running this and found I needed to make the following change for it to work:

```diff
- store = MandlebrotStore(levels=50, tilesize=512, compressor=Blosc())
+ store = zarr.storage.KVStore(MandlebrotStore(levels=50, tilesize=512, compressor=Blosc()))
```

Without this, you get the error:
Fix mentioned here napari#5561 (comment)
This issue has been mentioned on Image.sc Forum. There might be relevant details there: |
Howdy Friends, I am thrilled to share this demo: this is a significantly more extensive example of multiscale volume rendering in napari. Let me introduce you to the Mandelbulb, if you are not already acquainted. It is a 3D fractal based upon the Mandelbrot set. This is one particular variant of the Mandelbulb; there are others to investigate if you are interested. In this demo I show a 3D multiscale image with a size of (1_048_576, 1_048_576, 1_048_576), which is the equivalent of 1,152,921,504,606,846,976 bytes, roughly 1 exabyte. We do not access/generate all of this data, but as you can see we can successfully browse the data as a volume render. The video is quite long because it takes some time to navigate these fractals, and I thought I would just share out the whole session. I suggest speeding up the video, or skipping to the point in the video when I start zooming out. Here is a video that can fit GH's file size requirements (downscaled and with some frame dropping): napari_multiscale_volumerender_003_mandelbulb_trimmed.mp4 Full resolution and no frame dropping: https://drive.google.com/file/d/11hkd64r7ZlZSBCcJXczQjWsl156_wEcp/view?usp=sharing
@kephale my god it's beautiful
@kephale This is so awesome. Love it!
Many thanks to @psobolewskiPhD for helping with the scale bar behavior! Otherwise it can get confusing when you're deep in the Mandelbulb.
A day late for you @royerloic, but the plan is for Zebrahub to be the next dataset we focus on for this project.
This is really exciting!!!! We will put it to use too!
Ahhh so exciting! (why doesn't GitHub have a party parrot emoji?)
Tracking some next steps:
Tracking this comment from one of @andy-sweet's PR reviews: #6043 (comment) about tile2data
Here is a super early demo of visualizing images from Zebrahub (@royerloic @JoOkuma and friends). The new thing here is visualizing data with a dimensionality other than the viewing mode (2D or 3D); in this case the data is 4D. In the video I move through a few timesteps and a few scales. This is still a bit hacky and there are things to improve, so I look forward to doing some cleaning at the hackathon next week. zebrahub_progloading_early_001.mp4
This issue has been mentioned on Image.sc Forum. There might be relevant details there: |
Paired with @kevinyamauchi to make a very satisfying debugging visualization. Screen.Recording.2023-10-03.at.8.23.40.PM.mp4
This issue has been mentioned on Image.sc Forum. There might be relevant details there: |
🧰 Task
A little background on the text: the text that I am initially adding here was developed for an internal process from Dec 2022 - Feb 2023, but is being shared out to make the effort 100% public. This is my preference as well. This document is closely related to code in this repository: https://github.com/kephale/napari-multiscale-rendering-prototype. After this document was written, fun things happened, and we now have a great community effort with @jni @alisterburt @GenevieveBuckley @ppwadhwa. Please take this document lightly because we've changed quite a few things since the original writing (e.g. starting on 3D first instead of 2D) :)
Progressive Loading for napari
napari users need to be able to visualize large datasets efficiently, especially out-of-core data (data that is larger than system or GPU memory). While there are numerous approaches to visualizing out-of-core datasets, there are some underlying motifs in the design of these approaches. This document presents these motifs as individual requirements, and the corresponding engineering efforts cover the exploration, design, and implementation of algorithms that address each requirement.
Large array data (e.g. dask, zarr) uses a different representation than small array data (e.g. numpy): the array is subdivided into chunks. This facilitates data access and, in the case of zarr, can improve writing performance. Chunking also makes it easier to perform tiled rendering, where only a subset of tiles/chunks needs to be rendered at any given time. Tiled rendering is the primary strategy for incrementally updating the display with subsets of data.
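As a concrete illustration (a minimal sketch with made-up sizes, not the prototype's code), a chunked zarr array only touches the chunks that overlap a requested region, which is exactly what makes tile-by-tile loading possible:

```python
import zarr
import dask.array as da

# Create a chunked on-disk array; only chunks that are written or read ever touch storage.
z = zarr.open("example.zarr", mode="w", shape=(16384, 16384),
              chunks=(512, 512), dtype="uint8")

# Reading a small region only loads the chunks that overlap it.
region = z[1000:1500, 2000:2500]

# Wrapping the same store as a dask array keeps every chunk lazy until computed.
lazy = da.from_zarr("example.zarr")
print(lazy.chunksize)  # (512, 512)
```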
Tiled rendering is not sufficient on its own. If we only perform tiled rendering, the user will often see partially empty images due to tiles that have yet to be loaded/rendered. To smooth this experience we use multiscale rendering, where downsampled versions of the image are loaded quickly first and then repainted with higher-resolution data as it arrives. Multiscale rendering minimizes the amount of time the user sees an empty canvas.
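For context, napari already accepts a multiscale pyramid as a list of arrays ordered from highest to lowest resolution; a minimal sketch with synthetic data (this is the existing API the Progressive Loading work builds on, not the new renderer itself):

```python
import numpy as np
import napari

# A toy 3-level pyramid: each level is a 2x downsampled copy of the previous one.
base = np.random.randint(0, 255, (4096, 4096), dtype=np.uint8)
pyramid = [base, base[::2, ::2], base[::4, ::4]]

viewer = napari.Viewer()
viewer.add_image(pyramid, multiscale=True)
napari.run()
```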
Caching is an important aspect of loading large data, because data is often loaded from a remote data source. Any implementation of Progressive Loading should include a caching strategy, and the caching strategy should accommodate multiple large data sources and data types. The implementation of the caching strategy may require 2 components: a disk cache and a GPU memory cache, where the disk cache is important for CPU-based Progressive Loading and the GPU cache is important for 3D volume rendering.
Tunable quality and memory constraints are desirable. In particular, being able to guarantee an upper-bound on the memory usage is a feature that makes it easier to ensure that the rendering method can run on performance-constrained hardware. Similar concerns are relevant to the management of cache size when multiple Progressive Loading layers are being used.
Ultimately, the Progressive Loading effort should introduce support for 2D and 3D visualization of Image, Label, and Point data for datasets that exceed system memory. Within the scope of the effort addressed by this RFC, prototypes will be developed that provide this support. These prototypes will go through user testing and will be used to determine the best way to achieve a polished product.
Approach
Our approach to providing Progressive Loading involves developing Python-based large-data rendering prototype plugins, as well as exploring the prospect of developing a napari plugin around an existing tool that supports large data rendering, BigDataViewer.
Key points:
Communication
This RFC describes a learning effort that is focused on developing multiple prototypes for new technologies. Scheduling deadlines for these prototypes can lead to unnecessary re-planning because there is significant community interest and activity related to large image data and progressive loading, so some of the work will have to be prioritized opportunistically. However, to support transparency and observability, activity on the project will be actively shared on an ongoing basis (see the existing Zulip stream, for example), including a monthly shareout at community meetings.
Some places for tracking updates about this project include:
Core Design
For simplicity, the basic architecture will be developed with a focus on 2D; however, 3D prototypes will simultaneously be explored to ensure generalizability beyond 2D. The initial version has been prototyped to include simplified tiled rendering, basic caching, and a multiscale rendering strategy. However, there are some key limitations in the existing prototype:
These efforts can be partitioned into 3 focus areas: rendering, UI, and data fetching/caching. Finally, some tooling work to support testing data size and type limits and visually debugging the resulting prototypes is also necessary.
Development is based upon the latest version of the napari repository (and other plugins), and may leverage unreleased features and private variables. When the prototypes are adapted for production, version pinning will take place.
Prioritization of effort is key for the development of progressive loading because there are a number of potential bottlenecks (data fetching, caching, data/chunk prioritization, texture updates, etc.). Ultimately we would like napari to have the highest quality rendering possible, including the progressive loading aspect. However, given that the current vispy will be replaced by vispy2, there is strong motivation to focus as much effort as possible on implementing progressive loading with minimal usage of vispy and/or GPU-specific code. Achieving ideal rendering quality will eventually require that kind of focused computer graphics effort, but within the scope of this effort the goal is to develop a progressive loading implementation for 2D and 3D that will be robust to future code changes.
Rendering
The rendering efforts here refer to the implementation of tiled rendering and GPU memory management.
The current 2D prototype assumes that it is possible to pre-allocate a large 2D plane that represents a cross-section of the dataset. This works in many cases, but will fail for data with a large 2D extent. Furthermore, it is not efficient and leads to overallocation of GPU memory. The 2D memory allocation should instead be determined by napari's current display size.
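A rough sketch of the idea (pure arithmetic with assumed canvas size and zoom values, not napari internals): the texture to allocate can be sized from the visible field of view instead of the full dataset extent.

```python
import math

def visible_texture_shape(canvas_px, zoom, chunk=512):
    """Estimate the texture size (in data pixels) needed to cover the canvas.

    canvas_px: (height, width) of the canvas in screen pixels (assumed known)
    zoom: screen pixels per data pixel at the currently displayed scale level
    chunk: round up to whole chunks so tiles align with the data's chunking
    """
    h, w = canvas_px
    data_h = math.ceil(h / zoom / chunk) * chunk
    data_w = math.ceil(w / zoom / chunk) * chunk
    return data_h, data_w

# e.g. a 1080p canvas at zoom 0.25 needs roughly a (4608, 7680) texture,
# far smaller than a pre-allocated full cross-section of a terapixel image.
print(visible_texture_shape((1080, 1920), zoom=0.25))
```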
In principle, the 3D implementation for tiled, multiscale rendering may follow the same design as 2D. The key differences are: rendering 3D data to a canvas requires more data than 2D, and rendering 3D data benefits from GPU power, but this requires GPU memory management. These differences are significant because the amount of additional data that volume rendering requires is large enough that it is often not possible to render the image data at the highest resolution. This necessitates rendering and caching strategies that efficiently move data between remote, local, and GPU storage in a way that leverages multiscale representations.
A special consideration in the case of the rendering effort is vispy2. vispy2 is funded by CZI's EOSS program and is designed to replace vispy, napari's current rendering backend. Not only does vispy2 propose numerous modern features, but vispy itself is built on top of OpenGL, which is deprecated on macOS. After discussions with the vispy2 team, it is clear that the progressive loading effort should not expect to build on top of vispy2 within the current timeframe, but we are enthusiastic about the opportunity to stay engaged with the vispy2 team during their development process.
User Interface
UI efforts will focus on multiple needs:
UI efforts will leverage napari-hierarchical, a plugin that allows users to explore HDF5 and zarr datasets (note that there is other related work planned on savable viewer states in napari that may leverage napari-hierarchical). For example, menu items for selecting renderers will be exposed as context menus in napari-hierarchical. Note that these UI choices are about the convenience of prototyping, and UX feedback will be recorded during user testing reports. Work packages are described in this GitHub issue.
The user experience testing tool may have a basic data collection component that would allow users to export properties of their data that will be informative for development, such as: data dimensionality, image size, and file size.
Data Fetching and Caching
Data fetching and caching is critical for large data handling. Datasets are generally fetched from remote sources and often exceed local resources (disk, system memory, and GPU memory). As a consequence it is important to carefully manage data retrieval and caching.
Data retrieval can be accelerated by using multiple threads, as is the case in the current 2D multiscale tiled rendering prototype. However, the current implementation uses Python's threading module with a naive batch parallelization strategy: N threads start fetching chunks at the same time, and all threads must wait until every thread in the batch has completed. Dask executors are an abstraction for parallel processing that works in both local and remote environments. Large data processing in napari is already heavily reliant on Dask array data structures, and Dask provides support for monitoring and controlling its threads, making it a better option than the raw threading module when something more than simple threads is needed. Other tools that support parallel processing, such as Ray and Joblib, either would require new dependencies to be added to napari or lack native support for large arrays.
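To illustrate the difference (a sketch using the standard library and a hypothetical `fetch_chunk` function, not the prototype's code): consuming results as they complete lets the canvas update per chunk instead of waiting on a whole batch.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_chunk(key):
    """Hypothetical: load one chunk (e.g. from a zarr store) given its key."""
    ...

def fetch_progressively(chunk_keys, on_chunk_ready, max_workers=8):
    # Submit all requests, but hand each chunk to the renderer as soon as it
    # arrives, rather than blocking until every request in the batch is done.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch_chunk, key): key for key in chunk_keys}
        for future in as_completed(futures):
            on_chunk_ready(futures[future], future.result())
```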
Although naive caching strategies can be easily implemented, it is important to consider multiscale-aware caching strategies, where particular scales can be cached to ensure natural user interaction. These caching strategies are especially important in the context of support for multiple layers: cache resources must be shared evenly across layers, otherwise users may observe uneven performance when working with multiple layers simultaneously.
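A minimal sketch of a chunk cache keyed by layer, scale, and chunk index with a shared byte budget (the names and LRU eviction policy here are illustrative assumptions, not the planned design):

```python
from collections import OrderedDict

class ChunkCache:
    """LRU cache for decoded chunks, shared across layers and scale levels."""

    def __init__(self, max_bytes=2 * 1024**3):
        self.max_bytes = max_bytes
        self.current_bytes = 0
        self._store = OrderedDict()  # (layer_id, scale, chunk_index) -> ndarray

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as recently used
            return self._store[key]
        return None

    def put(self, key, chunk):
        if key in self._store:
            self.current_bytes -= self._store[key].nbytes
        self._store[key] = chunk
        self._store.move_to_end(key)
        self.current_bytes += chunk.nbytes
        # Evict least-recently-used chunks once the shared budget is exceeded.
        while self.current_bytes > self.max_bytes and len(self._store) > 1:
            _, evicted = self._store.popitem(last=False)
            self.current_bytes -= evicted.nbytes
```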
Some attention will be invested into considering the possibility of a persistent cache. There are many cases where a user will work with a particular large image dataset across multiple napari sessions. A persistent cache would speed up the user's experience on subsequent napari sessions, after an initial session that loads the data.
While the prototype leveraged the native Python functools-based cache, a Dask cache (based on cachey) will need to be developed. This will require exposing the cache contents to enable more intelligent methods for cache clearing. The approach will require adapting element prioritization within the cache.
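For reference, Dask already ships an opportunistic, cachey-backed cache that can be registered globally; a minimal sketch of the likely starting point (the RFC proposes extending this, e.g. exposing cache contents for smarter clearing):

```python
import dask.array as da
from dask.cache import Cache  # requires the cachey package

# Register a ~2 GB opportunistic cache; chunks that are expensive to compute
# and frequently reused are kept in memory according to cachey's priorities.
cache = Cache(2e9)
cache.register()

arr = da.random.random((20000, 20000), chunks=(1000, 1000))
arr[:5000, :5000].mean().compute()  # chunks touched here may be served from cache later
```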
Finally, the design of caches will need to be aligned with the Dask cache, the GPU cache (in the case of volume rendering), and the potential persistent cache.
Tooling Work
Overall, the goal of the proposed tooling work is to support the Progressive Loading development process, but these tools are not direct goals of the effort. The Large Data Generator tools have the goal of accelerating the development cycle by reducing reliance on remote downloads of hundreds of GB of data during feature testing, and increasing the availability of different data sizes and types. The Visual Debugger is important for confirming that the Progressive Loading effort has been successful by inspecting the resolutions of image data as they are loaded. The amount of effort required is expected to be minimal (approximately one FTE-week).
Large Data Generators
There are a number of parameters of each dataset we consider, such as the number of scales, pixel type, layer type, dimensionality, and size. Not all combinations of these parameters are present in our example test data. To support testing across a wide range of parameter combinations, we will leverage simple synthetic data generators.
The use of synthetic data generators to support napari development is discussed and explored in a separate CZI-internal document (DM @kephale; happy to share it out, and I plan to do so regardless of whether anyone asks). In this work we consider 2 simple data generators: checkerboards and fractals (see Figure FractalExample). Both of these patterns can be generated on the fly and do not require on-disk storage, which can otherwise be prohibitive for large test datasets. Our current large test datasets are specific precomputed real-world datasets, which means they cannot be easily or efficiently adjusted to have different features like size and shape. In particular, the generators will produce multiscale datasets with tunable numbers of scales, dimensions, and image sizes.
Figure FractalExample. An example of a generative fractal based on the Julia Set displayed in napari.
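As an illustration of the on-the-fly approach (a minimal sketch; the tile addressing and parameter names are assumptions, not the planned generator API), each tile can be computed directly from its pyramid address, so nothing is ever stored on disk:

```python
import numpy as np

def checkerboard_tile(level, ty, tx, tile=512, square=4096, dtype=np.uint8):
    """Generate one (tile x tile) chunk of level `level` of a multiscale checkerboard.

    Level 0 is full resolution; level L is downsampled by 2**L. The checker
    squares are `square` pixels wide in level-0 (world) coordinates, so every
    level shows the same underlying image at a different resolution.
    """
    scale = 2 ** level
    y0, x0 = ty * tile, tx * tile  # tile origin in level-L pixel coordinates
    yy, xx = np.meshgrid(np.arange(y0, y0 + tile),
                         np.arange(x0, x0 + tile), indexing="ij")
    # Map level-L pixels back to world coordinates before applying the pattern.
    return ((((yy * scale) // square + (xx * scale) // square) % 2) * 255).astype(dtype)

# Tile (0, 0) of the full-resolution level and of a 4x-downsampled level.
hi_res = checkerboard_tile(level=0, ty=0, tx=0)
lo_res = checkerboard_tile(level=2, ty=0, tx=0)
```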
Visual Debugger
The development process for multiscale and tiled renderers is vulnerable to some typical bugs. Half-pixel offset bugs are common in multiscale rendering, where the downsampling method that was used to create a given multiscale dataset affects how the multiscale rendering should be performed. Loading and positioning errors with tiles are also common, where tiles may be loaded from incorrect scales, in the wrong order, or put into the wrong location. While a test suite will ultimately prevent these bugs from being introduced into the final product, during intermediate states of development visual debugging is often critical.
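To make the half-pixel issue concrete, here is the coordinate mapping for one common case (assuming averaging-based downsampling by a factor of 2 per level; other downsampling methods shift these offsets, which is exactly where the bugs creep in):

```python
def level_pixel_center_in_world(i, level):
    """World coordinate (in level-0 pixel units) of the center of pixel i at `level`.

    With mean-downsampling, coarse pixel i covers fine pixels
    [i * 2**level, (i + 1) * 2**level - 1], so its center is (i + 0.5) * 2**level - 0.5.
    """
    f = 2 ** level
    return (i + 0.5) * f - 0.5

# Dropping the +/- 0.5 terms (i.e. using i * 2**level) produces the classic
# half-pixel drift between scale levels when tiles are overlaid.
```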
To address these issues, a visual debugging tool will be developed that provides the developer with visual overlays to indicate scale and other source information about chunks/tiles, see (Figure VisualDebugger) for a 2D mockup and this video for a demo that shows a visual debugger overlay of resolutions for 3D.
Figure VisualDebugger: An overlay view will help developers debug multiscale rendering issues by using color overlays to indicate the active scale level.
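A rough sketch of how such an overlay could be drawn with napari's existing Shapes layer (the chunk extents, colors, and function name are assumptions for illustration, not the planned tool's API):

```python
import numpy as np
import napari

def add_chunk_overlay(viewer, chunk_slices, scale_level,
                      colors=("red", "yellow", "lime", "cyan")):
    """Outline each loaded chunk, color-coded by the scale level it came from.

    chunk_slices: list of ((y0, y1), (x0, x1)) extents in world coordinates.
    """
    rectangles = [
        np.array([[y0, x0], [y0, x1], [y1, x1], [y1, x0]])
        for (y0, y1), (x0, x1) in chunk_slices
    ]
    viewer.add_shapes(
        rectangles,
        shape_type="rectangle",
        face_color="transparent",
        edge_color=colors[scale_level % len(colors)],
        edge_width=2,
        name=f"debug: scale {scale_level}",
    )
```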
Notes
Q1: How to handle napari’s current strategy for arbitrary slicing of volumes? (See Lorenzo’s message) Can the slicing plane be used for chunk prioritization similar to the current implementation with camera?
Decision1: (Kyle has a proposed solution in Lorenzo’s thread)