Fix exclusive upper bound for line aggregation #343

Merged: 7 commits, May 7, 2017

Contributor

@jbcrail jbcrail commented May 4, 2017

The previous adjustment for the exclusive upper bound failed to account properly for bounds with float values. This caused more pixels than necessary to be clipped.

Instead of performing the upper bound check when we clip a line segment to the bounding box, we now check at a lower level, when we translate a line segment into actual pixels. This gives us greater control without dealing with float coordinates, because at that level we are working with integer coordinates.

One test was updated to match the expected result.

Fix #342
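
As a rough illustration of the distinction described above, the sketch below maps a data coordinate to an integer pixel index using the same scale-and-translate form as the `int(x_mapper(x) * sx + tx)` lines in this PR. The function name `to_pixel` and the way `sx` and `tx` are derived here are assumptions for illustration, not datashader's code. It shows that a point sitting exactly on the upper bound lands one index past the last valid pixel, which is the case an exclusive upper bound has to drop.

```python
def to_pixel(x, xmin, xmax, width):
    """Map a data coordinate onto an integer pixel column.

    Hypothetical helper: sx and tx are derived so that xmin maps to 0
    and xmax maps to `width`. Valid pixel columns are 0 .. width - 1.
    """
    sx = width / (xmax - xmin)   # scale: pixels per data unit
    tx = -xmin * sx              # translate so that xmin lands on column 0
    return int(x * sx + tx)


width = 10
lower = to_pixel(0.0, 0.0, 1.0, width)   # column for the lower bound
upper = to_pixel(1.0, 0.0, 1.0, width)   # column for the upper bound

# With integer columns, the exclusive upper bound is a plain comparison.
lower_is_drawn = 0 <= lower < width
upper_is_drawn = 0 <= upper < width
```

In integer space the test is a single `<` comparison, whereas adjusting a float bound by a fixed amount depends on how wide a pixel is in data units.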

xmin = int(x_mapper(xmin) * sx + tx)
xmax = int(x_mapper(xmax) * sx + tx)
ymin = int(y_mapper(ymin) * sy + ty)
ymax = int(y_mapper(ymax) * sy + ty)
Member

Maybe I'm not reading the code correctly but could this be done once outside the aggregation function somehow? Seems like doing it repeatedly for each line vertex will add some overhead.

Member

After a brief look, could Line._build_extend do this and then pass the mapped_bounds in as an argument?

Member

That was my question as well; seems like this change has the potential to slow things down, which we should be careful to avoid.

Contributor Author

I agree that this could be a performance issue, so as @philippjfr suggested, I pre-calculated the mapped bounds and passed them down to draw_line.

I refactored the pixel mapping into a separate method similar to draw_line and extend_line. This made the code more readable and prevented me from hard-coding a mapped bounds value into the tests.
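
A minimal sketch of the idea, assuming hypothetical names (`map_bounds`, `process_segments`, and the `draw` callback are illustrative and not datashader's API): the four bound conversions from the diff are computed once, and the resulting integer tuple is handed to the per-segment drawing routine rather than being recomputed for each vertex.

```python
def map_bounds(bounds, x_mapper, y_mapper, sx, tx, sy, ty):
    """Convert float data bounds to integer pixel bounds, once per aggregation."""
    xmin, xmax, ymin, ymax = bounds
    return (int(x_mapper(xmin) * sx + tx),
            int(x_mapper(xmax) * sx + tx),
            int(y_mapper(ymin) * sy + ty),
            int(y_mapper(ymax) * sy + ty))


def process_segments(segments, mapped_bounds, draw):
    """Pass the precomputed bounds down instead of remapping per vertex."""
    for segment in segments:
        draw(segment, mapped_bounds)


def identity(value):
    """Stand-in for a linear axis mapper (an assumption for this sketch)."""
    return value


# The mapping runs once here, outside the per-segment loop.
mapped = map_bounds((0.0, 1.0, 0.0, 2.0), identity, identity, 10, 0, 5, 0)

seen = []
process_segments([(0, 0, 1, 1), (1, 1, 2, 2)], mapped,
                 lambda segment, bounds: seen.append((segment, bounds)))
```

Keeping the mapping in its own function also lets a test compute the expected bounds with the same helper instead of hard-coding them.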

Member

@jbednar jbednar left a comment

Turned out to be more complicated than I expected, but it does look like a better approach. Have you tested with logarithmic axes?

@jbcrail
Contributor Author

jbcrail commented May 5, 2017

I haven't. Is there a relevant notebook example?

@jbednar
Member

jbednar commented May 5, 2017

Not that I know of, but that's a good point -- we should have one. And I guess we need one both for lines and for points, since they are using different code here. There are a couple of open issues about logarithmic axes...

@jbcrail
Contributor Author

jbcrail commented May 5, 2017

I checked the test cases and found tests for logarithmic axes but only for points. I'll add tests for lines.

@philippjfr
Member

Thanks for the fix @jbcrail. My understanding of this code is still lacking, so my feedback may not be as helpful as it could be. My main worry is that the new conditionals you've had to introduce in the core aggregation function will add overhead that may not be necessary. I have the feeling that your original approach of adjusting the bounds would have worked okay, but should simply have adjusted them by 1 or 1/2 a pixel width rather than always subtracting 1. I'm not very confident about that, though, so if you can explain to me and Jim why these checks have to happen in the core aggregation and not before, we'll trust you. I'll try to work through it on paper now to convince myself.

@jbcrail
Contributor Author

jbcrail commented May 5, 2017

My understanding of the code is still improving as well, so any feedback or double-checking is appreciated. I'll describe what I've learned and why I chose the current approach. Maybe this will help us find a better way.

Initial approach: exclusive bounding box in extend_line

In extend_line, we process a set of segments, where each segment has a start and end point and the points are typically defined using float coordinates. For each segment, we clip the line segment to the bounding box if one point falls outside (when both points are outside, we skip the line segment). Then we pass the segments to draw_line to calculate the pixels.

If we made the bounding box exclusive, any point landing on either maximum bound would have to be adjusted, causing the point to be inside the box. Then we could pass this clipped line segment to draw_line, which wouldn't have to be updated.

However, the clipping algorithm assumes an inclusive bounding box, and updating the algorithm accordingly turned out to be error-prone. I considered using NumPy's float epsilon to decrement the respective point's coordinate so that the point would be just inside, but I decided it was a bad idea to change the data.
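
To make the "inclusive bounding box" assumption concrete, here is a sketch of a parametric segment clipper. The thread does not name the clipping algorithm used in extend_line, so Liang-Barsky is used here purely as a stand-in, and `clip_segment` is a hypothetical name. Note that an endpoint lying exactly on a maximum bound survives clipping unchanged, which is the behavior that makes an exclusive upper bound awkward to enforce at this stage.

```python
def clip_segment(x0, y0, x1, y1, xmin, xmax, ymin, ymax):
    """Clip a segment to an inclusive box (Liang-Barsky, used as a stand-in).

    Returns the clipped (x0, y0, x1, y1), or None if nothing is inside.
    """
    t0, t1 = 0.0, 1.0
    dx, dy = x1 - x0, y1 - y0
    # Each (p, q) pair describes one box edge: left, right, bottom, top.
    edges = ((-dx, x0 - xmin), (dx, xmax - x0),
             (-dy, y0 - ymin), (dy, ymax - y0))
    for p, q in edges:
        if p == 0:
            if q < 0:
                return None      # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0:                # segment enters the box across this edge
            if t > t1:
                return None
            t0 = max(t0, t)
        else:                    # segment leaves the box across this edge
            if t < t0:
                return None
            t1 = min(t1, t)
    return (x0 + t0 * dx, y0 + t0 * dy, x0 + t1 * dx, y0 + t1 * dy)


# Endpoint exactly on the maximum bounds: kept as-is by inclusive clipping.
on_bound = clip_segment(0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0)
# Endpoint beyond the box: pulled back onto the maximum bounds.
clipped = clip_segment(0.0, 0.0, 2.0, 2.0, 0.0, 1.0, 0.0, 1.0)
# Entirely outside the box: skipped.
skipped = clip_segment(2.0, 2.0, 3.0, 3.0, 0.0, 1.0, 0.0, 1.0)
```

Both of the first two cases return a segment ending on the maximum bound, so a later stage still has to decide what to do with that boundary pixel.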

Current approach: exclusive bounding box in draw_line

We leave extend_line unchanged (aside from passing pre-calculated parameters).

At this level, the coordinates of the line segment and bounding box are scaled and transformed to integer coordinates. There are four cases to consider: 1) one segment point is on a maximum bound, 2) each segment point is on a separate maximum bound, 3) both segment points are on the same maximum bound (i.e. vertical or horizontal line), and 4) both segment points are inside or on minimum bounds.

Since draw_line doesn't distinguish between the four cases, for generality I had to add the conditional before appending any pixel.
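
The per-pixel conditional can be sketched as follows. This is a generic integer Bresenham walk, not datashader's draw_line, and `draw_line_checked` is a hypothetical name. Every candidate pixel is tested against the exclusive maximum bounds before it is recorded, which handles all four cases uniformly at the cost of one comparison per pixel.

```python
def draw_line_checked(x0, y0, x1, y1, xmax, ymax):
    """Walk a line with integer Bresenham, skipping pixels on a maximum bound.

    Generic sketch: xmax and ymax are exclusive upper bounds in pixel space.
    """
    pixels = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        # The conditional under discussion: checked before every append.
        if x0 < xmax and y0 < ymax:
            pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:
            err += dy
            x0 += step_x
        if e2 <= dx:
            err += dx
            y0 += step_y
    return pixels


# Case 2: the end point sits on both maximum bounds and is dropped.
diagonal = draw_line_checked(0, 0, 3, 3, 3, 3)
# Case 3: a horizontal line lying on the maximum y bound yields no pixels.
on_edge = draw_line_checked(0, 3, 3, 3, 3, 3)
```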

Alternative

In extend_line, the clipping algorithm is currently run against the float-coordinate bounding box. This is where the complexity and edge cases have been. But if we scale/transform the coordinates for the line segment and the bounding box prior to clipping, then the problem is simpler.

Based on @philippjfr's feedback, we could adjust the integer-coordinate bounding box down by one on each side. Then we would run the clipping algorithm unaltered and pass the new line segment coordinates to draw_line. We would improve the performance and further simplify draw_line. It would add some messiness to the tests but it's a good tradeoff.
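
A small sketch of why this works in integer space, reading "down by one" as decrementing each maximum bound (that reading, and the names `shrink_upper_bounds` and `inside_inclusive`, are assumptions for illustration). For integer coordinates, an inclusive test against the shrunken box accepts exactly the pixels that an exclusive test against the original box would, so an inclusive clipping algorithm can be reused unchanged.

```python
def shrink_upper_bounds(mapped_bounds):
    """Turn exclusive integer upper bounds into inclusive ones."""
    xmin, xmax, ymin, ymax = mapped_bounds
    return xmin, xmax - 1, ymin, ymax - 1


def inside_inclusive(x, y, bounds):
    """Inclusive containment test, as an inclusive clipper would assume."""
    xmin, xmax, ymin, ymax = bounds
    return xmin <= x <= xmax and ymin <= y <= ymax


original = (0, 5, 0, 5)                  # upper bounds are exclusive
shrunk = shrink_upper_bounds(original)   # upper bounds are now inclusive

# Compare both tests on every integer point in and around the box.
agree = all(
    inside_inclusive(x, y, shrunk) == (0 <= x < 5 and 0 <= y < 5)
    for x in range(-1, 7)
    for y in range(-1, 7)
)
```

The equivalence relies on the coordinates being integers; with float coordinates there is no single amount to subtract that excludes only the boundary.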

This removes the bound checks in draw_line. It also makes the bounding
box clipping integer-based, along with draw_line.
@philippjfr
Member

Thanks so much for the detailed summary. The new approach seems like the correct and most optimal approach so I'd be very happy to see this merged.

@jbcrail
Contributor Author

jbcrail commented May 7, 2017

Thanks, @philippjfr. I also benchmarked master against this PR using a 1e6 x 1e6 diagonal. No major difference in performance.

Branch: master
--------------------------------------------------------------------------------- benchmark: 2 tests --------------------------------------------------------------------------------
Name (time in ms)            Min                   Max                  Mean              StdDev                Median                 IQR            Outliers(*)  Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_draw_line           78.4571 (1.0)        258.6826 (1.0)         99.5152 (1.0)       55.9732 (1.0)         82.1140 (1.0)        4.5016 (1.0)              1;1      10           1
test_extend_line      3,513.6678 (44.78)    3,945.5268 (15.25)    3,616.2169 (36.34)    129.4731 (2.31)     3,578.8234 (43.58)    102.4231 (22.75)            1;1      10           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Branch: fix-exclusive-upper-bound
-------------------------------------------------------------------------------- benchmark: 2 tests --------------------------------------------------------------------------------
Name (time in ms)            Min                   Max                  Mean              StdDev                Median                IQR            Outliers(*)  Rounds  Iterations
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_draw_line           77.9279 (1.0)        179.8982 (1.0)         93.9135 (1.0)       33.3825 (1.0)         79.6688 (1.0)       2.6067 (1.0)              1;2      10           1
test_extend_line      3,405.3419 (43.70)    3,964.4963 (22.04)    3,584.1758 (38.16)    155.1264 (4.65)     3,563.3096 (44.73)    19.2078 (7.37)             3;4      10           1
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

@jbednar jbednar merged commit 56f37f8 into holoviz:master May 7, 2017
@jbcrail jbcrail deleted the fix-exclusive-upper-bound branch May 7, 2017 20:12
jbcrail added a commit to jbcrail/datashader that referenced this pull request Sep 27, 2017
This reverts to exclusive ranges both manual and auto for all glyphs
(points/line) without introducing regressions of holoviz#318, holoviz#330, and holoviz#343.

I refactored several tests to make xarray coordinate indices easier to
read and more explicit.