Fix exclusive upper bound for line aggregation #343

jbcrail · 2017-05-04T14:27:51Z

The previous adjustment for the exclusive upper bound failed to account properly for bounds with float values. This caused more pixels than necessary to be clipped.

Instead of performing the upper bound check when we clipped a line segment to the bounding box, we now check at a lower level when we translate a line segment to the actual pixels. This gives us greater control without dealing with float coordinates; at that level we are working with integer coordinates.

One test was updated to match the expected result.

Fix #342

philippjfr · 2017-05-04T20:46:25Z

datashader/glyphs.py

+        xmin = int(x_mapper(xmin) * sx + tx)
+        xmax = int(x_mapper(xmax) * sx + tx)
+        ymin = int(y_mapper(ymin) * sy + ty)
+        ymax = int(y_mapper(ymax) * sy + ty)


Maybe I'm not reading the code correctly but could this be done once outside the aggregation function somehow? Seems like doing it repeatedly for each line vertex will add some overhead.

After a brief look, could Line._build_extend do this and then pass the mapped_bounds in as an argument?

That was my question as well; seems like this change has the potential to slow things down, which we should be careful to avoid.

I agree that this could be a performance issue, so as @philippjfr suggested, I pre-calculated the mapped bounds and passed them down to draw_line.

I refactored the pixel mapping into a separate method similar to draw_line and extend_line. This made the code more readable and prevented me from hard-coding a mapped bounds value into the tests.
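As an illustration of the pre-calculation discussed above, here is a minimal sketch, not the actual datashader code. The names scale_and_translate, map_onto_pixel and mapped_bounds are hypothetical, the mappers are plain Python callables, and the scale/translate formula (s = n / (end - start), t = -start * s) is an assumption about how a data range is mapped onto n pixels.

```python
def scale_and_translate(lo, hi, n, mapper=float):
    """Scale and translation taking mapper(lo)..mapper(hi) onto 0..n (assumed formula)."""
    start, end = mapper(lo), mapper(hi)
    s = n / (end - start)
    t = -start * s
    return s, t


def map_onto_pixel(x, y, x_mapper, y_mapper, sx, tx, sy, ty):
    """Translate one data-space vertex into integer pixel coordinates."""
    return int(x_mapper(x) * sx + tx), int(y_mapper(y) * sy + ty)


def mapped_bounds(x_range, y_range, width, height, x_mapper=float, y_mapper=float):
    """Map the bounding box once, so per-vertex code can reuse the result."""
    sx, tx = scale_and_translate(x_range[0], x_range[1], width, x_mapper)
    sy, ty = scale_and_translate(y_range[0], y_range[1], height, y_mapper)
    xmin, ymin = map_onto_pixel(x_range[0], y_range[0], x_mapper, y_mapper, sx, tx, sy, ty)
    xmax, ymax = map_onto_pixel(x_range[1], y_range[1], x_mapper, y_mapper, sx, tx, sy, ty)
    return (sx, tx, sy, ty), (xmin, xmax, ymin, ymax)


# Computed once per aggregation, then passed down to the per-segment code.
vt, bounds = mapped_bounds((0.0, 1.0), (0.0, 2.0), width=5, height=4)
```

On a 5x4 grid the valid pixel indices are 0..4 and 0..3, so the mapped upper bounds in this sketch (5 and 4) lie one past the last pixel, which is why they have to be treated as exclusive.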

jbednar

Turned out to be more complicated than I expected, but it does look like a better approach. Have you tested with logarithmic axes?

jbcrail · 2017-05-05T14:19:55Z

I haven't. Is there a relevant notebook example?

jbednar · 2017-05-05T14:55:04Z

Not that I know of, but that's a good point -- we should have one. And I guess we need one both for lines and for points, since they are using different code here. There are a couple of open issues about logarithmic axes...

jbcrail · 2017-05-05T16:16:06Z

I checked the test cases and found tests for logarithmic axes but only for points. I'll add tests for lines.

philippjfr · 2017-05-05T17:02:06Z

Thanks for the fix @jbcrail. My understanding of this code is still lacking so my feedback may not be as helpful as it could be. The main worry I have is that these new conditionals you've had to introduce in the core aggregation function will add overhead which may not be necessary. I have the feeling that your original approach of adjusting the bounds would have worked okay but should simply have been adjusting them by 1 or 1/2 a pixel width rather than always subtracting 1. I'm not very confident about that though, so if you can explain to me and Jim why these checks have to happen in the core aggregation and not before, we'll trust you. I'll try to work through it on paper now to convince myself.

jbcrail · 2017-05-05T19:17:13Z

My understanding of the code is still improving also, so any feedback or double-checking is appreciated. I'll describe what I've learned and why I chose the current approach. Maybe this will help us find a better way.

Initial approach: exclusive bounding box in extend_line

In extend_line, we process a set of segments, where each segment has a start and end point and the points are typically defined using float coordinates. For each segment, we clip the line segment to the bounding box if one point falls outside (when both points are outside, we skip the line segment). Then we pass the segments to draw_line to calculate the pixels.

If we made the bounding box exclusive, any point landing on either maximum bound would have to be adjusted, causing the point to be inside the box. Then we could pass this clipped line segment to draw_line, which wouldn't have to be updated.

However, the clipping algorithm assumes an inclusive bounding box and updating the algorithm accordingly turned out to be error prone. I considered using NumPy's float epsilon to decrement the respective point's coordinate so that the point would be just inside but I determined it was a bad idea to change data.
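To make the rejected epsilon idea concrete, here is a small sketch using only the standard library (sys.float_info.epsilon has the same value as NumPy's float64 epsilon, and math.nextafter requires Python 3.9+). The helper name just_inside is hypothetical. It shows that subtracting a fixed epsilon does nothing for large values, while stepping to the adjacent float always moves the value; in both cases the vertex would no longer equal the original data.

```python
import math
import sys

EPS = sys.float_info.epsilon  # spacing between 1.0 and the next float above it


def just_inside(upper):
    """Largest float strictly below `upper` (hypothetical helper)."""
    return math.nextafter(upper, -math.inf)


small, large = 1.0, 1e6
shifted_small = small - EPS   # representable, so this moves below 1.0
shifted_large = large - EPS   # too small relative to 1e6, rounds back to 1e6
```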

Current approach: exclusive bounding box in draw_line

We leave extend_line unchanged (aside from passing pre-calculated parameters).

At this level, the coordinates of the line segment and bounding box are scaled and transformed to integer coordinates. There are four cases to consider: 1) one segment point is on a maximum bound, 2) each segment point is on a separate maximum bound, 3) both segment points are on the same maximum bound (i.e. vertical or horizontal line), and 4) both segment points are inside or on minimum bounds.

Since draw_line doesn't distinguish between the four cases, for generality I had to add the conditional before appending any pixel.
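The per-pixel check described above can be sketched as follows. This is a plain Bresenham loop written for illustration, not the datashader implementation; the function name and the list-of-pixels return value are assumptions (the real code updates an aggregate rather than returning pixels).

```python
def draw_line_exclusive(x0, y0, x1, y1, xmax, ymax):
    """Walk the segment with Bresenham's algorithm, skipping pixels on the upper bounds."""
    pixels = []
    dx = abs(x1 - x0)
    dy = abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        # The upper bounds are exclusive, so every pixel is checked before it is kept.
        if x < xmax and y < ymax:
            pixels.append((x, y))
        if x == x1 and y == y1:
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += step_x
        if e2 < dx:
            err += dx
            y += step_y
    return pixels


# Diagonal ending on both maximum bounds (case 2): the last pixel is dropped.
diagonal = draw_line_exclusive(0, 0, 5, 5, xmax=5, ymax=5)
# Horizontal line lying on the y maximum (case 3): nothing is drawn.
on_edge = draw_line_exclusive(0, 5, 5, 5, xmax=5, ymax=5)
```

Because the same check runs for every pixel regardless of which case applies, it is general but adds a conditional to the innermost loop, which is the overhead raised earlier in the review.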

Alternative

In extend_line, the clipping algorithm is currently run against the float-coordinate bounding box. This is where the complexity and edge cases have been. But if we scale/transform the coordinates for the line segment and the bounding box prior to clipping, then the problem is simpler.

Based on @philippjfr's feedback, we could adjust the integer-coordinate bounding box down by one on each side. Then we would run the clipping algorithm unaltered and pass the new line segment coordinates to draw_line. We would improve the performance and further simplify draw_line. It would add some messiness to the tests but it's a good tradeoff.

This removes the bound checks in draw_line. It also makes the bounding box clipping integer-based, along with draw_line.
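One way to read this alternative, as a sketch under assumptions rather than the merged code: map each already-clipped vertex to an integer pixel index and clamp it to the last valid pixel, so that the drawing routine needs no per-pixel bounds check. The name to_pixel and the scale formula are hypothetical.

```python
def to_pixel(v, lo, hi, n):
    """Map a clipped data value v in [lo, hi] to a pixel index in [0, n - 1]."""
    s = n / (hi - lo)
    t = -lo * s
    p = int(v * s + t)
    # v == hi maps to n, one past the last pixel; pull it back onto the grid.
    return min(p, n - 1)


# A diagonal across the whole (0, 1) x (0, 1) range on a 5x5 grid.
start = (to_pixel(0.0, 0.0, 1.0, 5), to_pixel(0.0, 0.0, 1.0, 5))
end = (to_pixel(1.0, 0.0, 1.0, 5), to_pixel(1.0, 0.0, 1.0, 5))
```

With the endpoints already inside the grid, an unmodified integer line-drawing loop between start and end only produces valid pixels.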

philippjfr · 2017-05-07T16:18:36Z

Thanks so much for the detailed summary. The new approach seems like the correct and most optimal approach so I'd be very happy to see this merged.

jbcrail · 2017-05-07T16:23:50Z

Thanks, @philippjfr. I also benchmarked master against this PR using a 1e6 x 1e6 diagonal. No major difference in performance.

Branch: master
--------------------------------------------------------------------------------- benchmark: 2 tests --------------------------------------------------------------------------------
Name (time in ms)            Min                   Max                  Mean              StdDev                Median                 IQR            Outliers(*)  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_draw_line           78.4571 (1.0)        258.6826 (1.0)         99.5152 (1.0)       55.9732 (1.0)         82.1140 (1.0)        4.5016 (1.0)              1;1      10           1
test_extend_line      3,513.6678 (44.78)    3,945.5268 (15.25)    3,616.2169 (36.34)    129.4731 (2.31)     3,578.8234 (43.58)    102.4231 (22.75)            1;1      10           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Branch: fix-exclusive-upper-bound
-------------------------------------------------------------------------------- benchmark: 2 tests --------------------------------------------------------------------------------
Name (time in ms)            Min                   Max                  Mean              StdDev                Median                IQR            Outliers(*)  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_draw_line           77.9279 (1.0)        179.8982 (1.0)         93.9135 (1.0)       33.3825 (1.0)         79.6688 (1.0)       2.6067 (1.0)              1;2      10           1
test_extend_line      3,405.3419 (43.70)    3,964.4963 (22.04)    3,584.1758 (38.16)    155.1264 (4.65)     3,563.3096 (44.73)    19.2078 (7.37)             3;4      10           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

This reverts to exclusive ranges both manual and auto for all glyphs (points/line) without introducing regressions of holoviz#318, holoviz#330, and holoviz#343. I refactored several tests to make xarray coordinate indices easier to read and more explicit.

jbcrail added the in progress label May 4, 2017

jbcrail mentioned this pull request May 4, 2017

Bug in line aggregation for float ranges #342

Closed

jbcrail requested a review from jbednar May 4, 2017 15:09

philippjfr reviewed May 4, 2017

jbcrail added 4 commits May 4, 2017 20:37

Move mapping onto pixel grid to its own method

58b2175

Pre-calculate mapped bounds for efficiency

622904d

Update docstring

c0a85b4

Remove useless assignment

c34c74f

jbednar reviewed May 5, 2017

Add line aggregation tests for logarithmic axes

bb19174

Convert line segment vertices to integer coords

542ebf7

This removes the bound checks in draw_line. It also makes the bounding box clipping integer-based, along with draw_line.

jbcrail removed the in progress label May 7, 2017

jbednar merged commit 56f37f8 into holoviz:master May 7, 2017

jbcrail deleted the fix-exclusive-upper-bound branch May 7, 2017 20:12
