-
Notifications
You must be signed in to change notification settings - Fork 353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GridGeoSampler may never reach end of image #431
Comments
As a result of #376, GridGeoSampler will also skip areas smaller than the patch size entirely. I think the issue you reported is easy to fix, we just have to be careful not to go out of bounds of the image. So the minx/miny will need to be adjusted with a smaller stride for the last sample per row/col. |
Hadn't though of the "smaller stride at the end of image" solution. I guess this ensures the chip_size is respected and could definitely work. The only drawback is that this changes the expected behaviour of the sampler since the stride may not always be the size that was initially set (in my dummy example, the last stride would be 88 rather than the expected 256). If a user (like me :) ) is stitching inference chips into a large array assuming the stride is constant (let's say he's using pixel units from #279 ), the result of the "smaller stride at the end of image" solution may be surprising. If this solution is chosen, I guess the user will have to make sure he/she is not naively assuming the stride will always be constant and make sure to rely on the bbox information returned by the sampler either directly from the __iter__ function of the sampler of from a __getitem__ as in RasterDataset. An alternative that I'd consider is to keep a constant stride, but return a sample that is truncated (in my example: a 88 x 512 sample for the last column and vice versa for last row), leaving it to the user to use a pad_to function as is done in test_transforms on the Chesapeake dataset. To synthesize: In both cases, I'd issue a warning since either (1) the stride's value or (2) the chip size's value as inputted by the user will be forced to change at the limits of an image. EDIT: To help illustrate, I'll borrow from @adamjstewart's artwork :). Given two chips with a one pixel overlap, ie stride = chip_size - 1, where L: last, SL: second to last and 0: padding. Suppose these are being sampled from CRS units where maxx is on the left (pixel wise) of minx.
|
Also see #30. Part of the reason that GeoDatasets return the CRS and bbox is so you can stitch things back together easily even if the stride changes for the last image. Of course, we still need to actually write stitching utilities to use this information. |
Thanks a ton for the bug report @remtav! Here's a notebook that gives a self-contained example of:
https://gist.github.com/calebrob6/7b226eb73877187f85fb5e1621bb7971 There is definitely something wrong here, but I haven't parsed it yet. |
add first draft of datasets/ccmeo.py and datamodules/ccmeo.py chesapeake_cvpr.yaml: add trainer parameters for testing segmentation.py: (WIP) add a BinarySemanticSegmentationTask tested with Spacenet single.py: add GridGeoSamplerPlus with temporary solution to microsoft#431
In ran into this issue after running multiple tests and double-checking why my inference script using GridGeoSampler never reached the end (right and bottom) of an image.
I'm using GridGeoSampler with an image of size 1591 x 2187 pixels, a chip size of 512 and stride of 256. For the sake of clarity, let's just say my image is 600 x 600 (pixel units or crs units doesn't matter here -> just assume minimum bounds in x and y are 0 if you're thinking crs units):
I want GridGeoSampler to sample my 600 x 600 image in chips of 512x512, with a stride of 256.
The total number of row and columns as calculated in the __init__ and __iter_ of GridGeoSampler will be:
rows = columns = (600 - 512) // 256 +1 = 1
Therefore, the __iter__ function will use two for loops with range(rows) and range (columns). Since those two variables are 1, only 1 512x512 sample will be returned by the sampler, thus ignoring 88 (600-512) pixels of width and height where valid values existed in my image.
To adapt the current implementation, I believe rows and columns should be calculated this way:
( image width - chip size + stride size ) // stride size + 1
or in the code, for rows:
rows = int((bounds.maxy - bounds.miny - self.size[0] + self.stride[0]) // self.stride[0]) + 1
The text was updated successfully, but these errors were encountered: