This is a Python implementation of the method and experiments from the paper "Tensor Decompositions for Integral Histogram Compression and Look-Up" (Rafael Ballester-Ripoll and Renato Pajarola), IEEE Transactions on Visualization and Computer Graphics, 2018.
Computing histograms of large multidimensional data is time-consuming as it may require millions of random accesses. The integral histogram trades space for speed: it needs only a few hundred accesses per query, but takes up orders of magnitude more storage.
TT-histograms compress integral histograms to make them manageable and offer fast and approximate look-up over medium to large regions. As a bonus, we can also query non-rectangular regions (separable regions are especially friendly) and compute histogram fields as well. For small regions, brute-force histogram computation is still the better option, since it is error-free and sufficiently fast.
The main class is TTHistogram
as contained in tthistogram.py
. It features:
- A constructor that builds a TT-compressed integral histogram representation out of an ND array with an arbitrary number of bins;
- The function
box()
, which looks up a histogram over any given box region; - The function
separable()
, which looks up a histogram over a separable region (e.g. Gaussian); - The function
nonseparable()
, which looks up a histogram over a non-separable region; - The function
box_field()
, which reconstructs a histogram field made up of box regions; - The function
separable_field()
, which reconstructs a histogram field made up of separable regions.
Note: for benchmarking, all look-ups are also implemented using brute-force traversal in both NumPy (file bruteforce.py
) and CuPy (file bruteforce_cupy.py
). The remaining files simply reproduce the experiments in the paper.
-
The ttpy toolbox for several elementary operations with compressed TT tensors.
-
The ttrecipes toolbox, which contains some extra homemade utility functions for working with the TT format, as well as the
sum_and_compress()
function for incremental TT decomposition that we use to compress huge integral histograms. -
From the Python ecosystem: matplotlib, SciPy
-
Optionally (for benchmarking only): pandas, CuPy (a NumPy-compatible matrix library accelerated by CUDA)
This work was partially supported by the UZH Forschungskredit "Candoc", grant number FK-16-012.