Fix upper-bound bin error for auto-ranged data #459

Merged

Commits (22):
- bab1044 Fix upper-bound bin error for auto-ranged data (jbcrail)
- 2131de4 Clean up docs (jbcrail)
- 98f8358 Merge branch 'master' into issue-457 (jbcrail)
- 80829e9 Add more comprehensive tests (jbcrail)
- a0bab32 Refactor bounds expansion into separate method (jbcrail)
- 1d5b94d Switch to exclusive range for all use cases (jbcrail)
- af57cbc Revert whitespace updates (jbcrail)
- b323b12 Add auto range tests for line glyph (jbcrail)
- 3ba7ce9 Add tests for uniform distribution of points (jbcrail)
- 245ae5b Revert exclusive range adjustments (jbcrail)
- eb30517 Update point glyph to use fully inclusive range (jbcrail)
- 5e36915 Update line glyph to use fully inclusive range (jbcrail)
- 6051cd2 Update docs (jbcrail)
- 6ac43a6 Update tests (jbcrail)
- 54ac722 Update comment (jbcrail)
- 343953c Fix glyph tests (jbcrail)
- 025ab07 Remove unused module (jbcrail)
- 8df6599 Fixed typo (jbednar)
- 90aa4cd Add more tests for uniform points (jbcrail)
- 8c8b9f6 Simplify mapping to pixels (jbcrail)
- 72f6851 Added comment (jbednar)
- 43bf9bc Fixed typo in comment (jbednar)
```diff
@@ -92,14 +92,21 @@ def _build_extend(self, x_mapper, y_mapper, info, append):
     def _extend(vt, bounds, xs, ys, *aggs_and_cols):
         sx, tx, sy, ty = vt
         xmin, xmax, ymin, ymax = bounds

+        def map_onto_pixel(x, y):
+            xx, yy = x_mapper(x) * sx + tx, y_mapper(y) * sy + ty
+            if x == xmax:
+                xx -= np.spacing(xx)
+            if y == ymax:
+                yy -= np.spacing(yy)
+            return int(xx), int(yy)
+
         for i in range(xs.shape[0]):
             x = xs[i]
             y = ys[i]
-            if (xmin <= x < xmax) and (ymin <= y < ymax):
-                append(i,
-                       int(x_mapper(x) * sx + tx),
-                       int(y_mapper(y) * sy + ty),
-                       *aggs_and_cols)
+            if (xmin <= x <= xmax) and (ymin <= y <= ymax):
+                xi, yi = map_onto_pixel(x, y)
+                append(i, xi, yi, *aggs_and_cols)

     def extend(aggs, df, vt, bounds):
         xs = df[x_name].values
```
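The behavioral change in this hunk can be demonstrated standalone (the data and bounds below are invented for illustration, not taken from the PR): under the old exclusive test, a point lying exactly at the auto-ranged maximum was silently dropped; the new inclusive test keeps it.

```python
def count_points_old(xs, xmin, xmax):
    # Old behavior: exclusive upper bound drops points at xmax
    return sum(1 for x in xs if xmin <= x < xmax)

def count_points_new(xs, xmin, xmax):
    # New behavior: inclusive upper bound keeps points at xmax
    return sum(1 for x in xs if xmin <= x <= xmax)

# With auto-ranging, the bounds are (min(xs), max(xs)), so some point
# always sits exactly on the upper bound.
xs = [0.0, 0.5, 1.0]
print(count_points_old(xs, 0.0, 1.0))  # -> 2 (the point at 1.0 is lost)
print(count_points_new(xs, 0.0, 1.0))  # -> 3
```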
```diff
@@ -127,16 +134,10 @@ def _build_extend(self, x_mapper, y_mapper, info, append):
         y_name = self.y

         def extend(aggs, df, vt, bounds, plot_start=True):
-            # Scale/transform float bounds to integer space and adjust for
-            # exclusive upper bounds
-            xmin, xmax, ymin, ymax = map_onto_pixel(vt, *bounds)
-            mapped_bounds = (xmin, xmax - 1, ymin, ymax - 1)
-
             xs = df[x_name].values
             ys = df[y_name].values
             cols = aggs + info(df)

-            extend_line(vt, bounds, mapped_bounds, xs, ys, plot_start, *cols)
+            extend_line(vt, bounds, xs, ys, plot_start, *cols)

         return extend
```
```diff
@@ -169,13 +170,16 @@ def _compute_outcode(x, y, xmin, xmax, ymin, ymax):

 def _build_map_onto_pixel(x_mapper, y_mapper):
     @ngjit
-    def map_onto_pixel(vt, x0, x1, y0, y1):
+    def map_onto_pixel(vt, bounds, x, y):
         """Map points onto pixel grid"""
         sx, tx, sy, ty = vt
-        return (int(x_mapper(x0) * sx + tx),
-                int(x_mapper(x1) * sx + tx),
-                int(y_mapper(y0) * sy + ty),
-                int(y_mapper(y1) * sy + ty))
+        _, xmax, _, ymax = bounds
+        xx, yy = x_mapper(x) * sx + tx, y_mapper(y) * sy + ty
+        if x == xmax:
+            xx -= np.spacing(xx)
+        if y == ymax:
+            yy -= np.spacing(yy)
+        return int(xx), int(yy)

     return map_onto_pixel
```

Inline review comments on this hunk:

Same comment as above regarding np.spacing vs. -1.

Same as above.
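The effect of the `np.spacing` adjustment can be sketched in isolation (the scale/translate values below are invented for illustration, not taken from the PR): a point sitting exactly on the inclusive upper bound is nudged down by one ULP so that truncation places it in the last pixel bin rather than one past the end of the grid.

```python
import numpy as np

def map_onto_pixel(x, xmax, sx, tx):
    # Map a data coordinate to a pixel index; a point exactly at the
    # upper bound is backed off by one ULP so int() truncates it into
    # the last bin instead of one past it.
    xx = x * sx + tx
    if x == xmax:
        xx -= np.spacing(xx)
    return int(xx)

# 10 pixels covering data range [0, 1]: sx = 10, tx = 0 (illustrative)
sx, tx, xmax = 10.0, 0.0, 1.0
print(map_onto_pixel(0.5, xmax, sx, tx))  # interior point -> bin 5
print(map_onto_pixel(1.0, xmax, sx, tx))  # upper bound -> bin 9, not 10
```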
```diff
@@ -234,9 +238,9 @@ def draw_line(x0i, y0i, x1i, y1i, i, plot_start, clipped, *aggs_and_cols):

 def _build_extend_line(draw_line, map_onto_pixel):
     @ngjit
-    def extend_line(vt, bounds, mapped_bounds, xs, ys, plot_start, *aggs_and_cols):
+    def extend_line(vt, bounds, xs, ys, plot_start, *aggs_and_cols):
         """Aggregate along a line formed by ``xs`` and ``ys``"""
-        xmin, xmax, ymin, ymax = mapped_bounds
+        xmin, xmax, ymin, ymax = bounds
         nrows = xs.shape[0]
         i = 0
         while i < nrows - 1:
```
```diff
@@ -252,11 +256,9 @@ def extend_line(vt, bounds, mapped_bounds, xs, ys, plot_start, *aggs_and_cols):
                 i += 1
                 continue

-            x0i, x1i, y0i, y1i = map_onto_pixel(vt, x0, x1, y0, y1)
-
             # Use Cohen-Sutherland to clip the segment to a bounding box
-            outcode0 = _compute_outcode(x0i, y0i, xmin, xmax, ymin, ymax)
-            outcode1 = _compute_outcode(x1i, y1i, xmin, xmax, ymin, ymax)
+            outcode0 = _compute_outcode(x0, y0, xmin, xmax, ymin, ymax)
+            outcode1 = _compute_outcode(x1, y1, xmin, xmax, ymin, ymax)

             accept = False
             clipped = False
```
```diff
@@ -268,34 +270,34 @@ def extend_line(vt, bounds, mapped_bounds, xs, ys, plot_start, *aggs_and_cols):
                 elif outcode0 & outcode1:
                     plot_start = True
                     break

-                clipped = True
-                outcode_out = outcode0 if outcode0 else outcode1
-                if outcode_out & TOP:
-                    x = x0i + int((x1i - x0i) * (ymax - y0i) / (y1i - y0i))
-                    y = ymax
-                elif outcode_out & BOTTOM:
-                    x = x0i + int((x1i - x0i) * (ymin - y0i) / (y1i - y0i))
-                    y = ymin
-                elif outcode_out & RIGHT:
-                    y = y0i + int((y1i - y0i) * (xmax - x0i) / (x1i - x0i))
-                    x = xmax
-                elif outcode_out & LEFT:
-                    y = y0i + int((y1i - y0i) * (xmin - x0i) / (x1i - x0i))
-                    x = xmin
-
-                if outcode_out == outcode0:
-                    x0i, y0i = x, y
-                    outcode0 = _compute_outcode(x0i, y0i, xmin, xmax,
-                                                ymin, ymax)
-                    # If x0i is clipped, we need to plot the new start
-                    plot_start = True
-                else:
-                    x1i, y1i = x, y
-                    outcode1 = _compute_outcode(x1i, y1i, xmin, xmax,
-                                                ymin, ymax)
+                clipped = True
+                outcode_out = outcode0 if outcode0 else outcode1
+                if outcode_out & TOP:
+                    x = x0 + (x1 - x0) * (ymax - y0) / (y1 - y0)
+                    y = ymax
+                elif outcode_out & BOTTOM:
+                    x = x0 + (x1 - x0) * (ymin - y0) / (y1 - y0)
+                    y = ymin
+                elif outcode_out & RIGHT:
+                    y = y0 + (y1 - y0) * (xmax - x0) / (x1 - x0)
+                    x = xmax
+                elif outcode_out & LEFT:
+                    y = y0 + (y1 - y0) * (xmin - x0) / (x1 - x0)
+                    x = xmin
+
+                if outcode_out == outcode0:
+                    x0, y0 = x, y
+                    outcode0 = _compute_outcode(x0, y0, xmin, xmax, ymin, ymax)
+                    # If x0 is clipped, we need to plot the new start
+                    plot_start = True
+                else:
+                    x1, y1 = x, y
+                    outcode1 = _compute_outcode(x1, y1, xmin, xmax, ymin, ymax)

             if accept:
+                x0i, y0i = map_onto_pixel(vt, bounds, x0, y0)
+                x1i, y1i = map_onto_pixel(vt, bounds, x1, y1)
                 draw_line(x0i, y0i, x1i, y1i, i, plot_start, clipped, *aggs_and_cols)
                 plot_start = False
             i += 1
```
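For context, the hunks above move Cohen-Sutherland clipping from mapped pixel coordinates into data space, mapping onto pixels only for accepted segments. A minimal sketch of the outcode test they rely on (constants and helper re-derived here for illustration, not copied from the PR source):

```python
# Region bits for Cohen-Sutherland clipping (illustrative values)
INSIDE, LEFT, RIGHT, BOTTOM, TOP = 0b0000, 0b0001, 0b0010, 0b0100, 0b1000

def compute_outcode(x, y, xmin, xmax, ymin, ymax):
    # Classify a point against the clip rectangle: one bit per side
    # the point falls outside of.
    code = INSIDE
    if x < xmin:
        code |= LEFT
    elif x > xmax:
        code |= RIGHT
    if y < ymin:
        code |= BOTTOM
    elif y > ymax:
        code |= TOP
    return code

# A segment is trivially accepted when both outcodes are 0, and
# trivially rejected when the bitwise AND of the outcodes is nonzero.
print(compute_outcode(0.5, 0.5, 0, 1, 0, 1))   # -> 0 (inside)
print(compute_outcode(2.0, -1.0, 0, 1, 0, 1))  # RIGHT | BOTTOM -> 6
```

Because the float intersection formulas (`x0 + (x1 - x0) * (ymax - y0) / (y1 - y0)`, etc.) now run before pixel mapping, no precision is lost to the intermediate `int()` casts that the removed code applied.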
I have not tested this, but would it be better to fix it up in the integer domain, instead of relying on np.spacing to be sufficient to bump it back by an integer?
Adjusting in the integer domain instead fails compared to the original. For instance, given the point (1.0, 1.0), the original algorithm returns (0, 0) while the above returns (-1, -1).
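The exact transform behind that example isn't shown, but the disagreement is easy to reproduce whenever the bound maps exactly onto pixel coordinate 0.0 (the value below is invented for illustration): `int()` truncates toward zero, so the one-ULP nudge still lands in bin 0, while subtracting 1 after truncation yields -1.

```python
import numpy as np

def nudge_float(xx):
    # Original approach: back off by one ULP, then truncate.
    return int(xx - np.spacing(xx))

def nudge_int(xx):
    # Integer-domain approach: truncate first, then subtract one.
    return int(xx) - 1

xx = 0.0  # a bound whose mapped coordinate lands exactly on 0.0
print(nudge_float(xx))  # int(-5e-324) truncates toward zero -> 0
print(nudge_int(xx))    # -> -1
```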
I used this script to compare implementations:
For that script, all the cases where the two methods differ are when `f == bound`, which makes sense given that both versions will otherwise always return `int(x * s + t)`. But what I can't tell is which answer (if either) is correct when the two do differ. When they differ, is that meant to be a valid actual input, given that the points are being cropped against the specified bounds already?
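The comparison script itself isn't reproduced above, but the observation can be sketched (with invented scale/translate values, not those from the script): away from the bound, the nudged variant and the plain `int(x * s + t)` agree; exactly at the bound they differ by one bin.

```python
import numpy as np

def impl_a(f, bound, s, t):
    # Variant with the one-ULP nudge applied at the bound
    xx = f * s + t
    if f == bound:
        xx -= np.spacing(xx)
    return int(xx)

def impl_b(f, bound, s, t):
    # Plain variant with no boundary handling
    return int(f * s + t)

s, t, bound = 7.0, 3.0, 2.0  # arbitrary illustrative values
for f in (0.0, 0.5, 1.99, bound):
    a, b = impl_a(f, bound, s, t), impl_b(f, bound, s, t)
    print(f, a, b, a == b)  # the two disagree only when f == bound
```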
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The value of the first implementation, `a`, is assumed to be the correct value, since it's the current implementation used by all of the passing tests. The script was meant to verify that the second implementation would be a drop-in replacement.

I just verified that the newer implementation passes the tests, which makes me believe more investigation may be needed to improve the test cases.
I'm not sure that's true; the test cases could well be fine. It's true that the above script finds differences, but I can't tell whether it's finding them in cases that would ever be able to be encountered in practice, given that s and t are not arbitrary, but are presumably derived from the bounds (and the aggregate array size), and given that the points are already being cropped against the bounds. What would be helpful would be to take a case where the values are observed to differ, and then figure out whether that case could ever happen, given how s and t are actually calculated. My guess is that it couldn't ever happen, but I have not worked it through myself. And given that both approaches pass the tests, I am not at all confident that the existing approach in the PR is safer or more general.
I agree with your assessment. I took my modified script and created additional uniformity test cases. Both implementations produced the same results with the test suite, so I replaced the mapping with the simpler implementation.