v.rast.stats: note about vector overlap limitation #1730

Merged
merged 3 commits into from
Jan 13, 2022

Conversation

jfbourdon
Contributor

Short note warning the user that if the vector map contains overlapping vectors, only one category will be picked during the rasterization process.

I'm suggesting this note because it recently took me a while to understand why I was not always getting the stats I expected.

@neteler
Member

neteler commented Oct 10, 2021

@metzm: may we merge this documentation change?

@neteler added the backport_needed, bug, HTML, manual, and vector labels Dec 9, 2021
@neteler added this to the 8.0.0 milestone Dec 9, 2021
@neteler
Member

neteler commented Jan 12, 2022

Closing and reopening to wake up the flake8 test

@neteler closed this Jan 12, 2022
@neteler reopened this Jan 12, 2022
rasterization process. Statistics for these vectors will thus be partial.
<p>If an area has several categories in the selected layer (equivalent
to overlapping polygons in Simple Features), only one category will be
kept during the rasterization process. Statistics for the skipped
Contributor

Shouldn't "skipped" there be replaced by "kept"? The skipped categories will be skipped, no? So the resulting stats for the kept categories will be partial.

Contributor

In the case of two polygons partially overlapping, you have three topological areas where one area has two categories, shared with the other two areas:

  • area 1: category 1
  • area 2: category 2
  • area 3: categories 1, 2

Let's assume that category 1 has been used for area 3: statistics for category 1 are then complete, while statistics for category 2 are incomplete because category 2 has been skipped for area 3. Thus the resulting stats for skipped categories are partial.
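
The three-area case above can be reproduced with a small toy model. This is only a sketch in plain Python, not GRASS code: the `cells` layout, the cell values, and the rule that the lowest category is kept in the overlap (`pick=min`) are all assumptions made for illustration, and the category actually kept during rasterization may differ.

```python
# Toy model, not GRASS code: each cell stores a raster value and the
# categories of the topological area that covers it.
cells = [
    {"value": 10.0, "cats": [1]},     # area 1: category 1
    {"value": 20.0, "cats": [1]},
    {"value": 30.0, "cats": [1, 2]},  # area 3: categories 1, 2 (overlap)
    {"value": 40.0, "cats": [1, 2]},
    {"value": 50.0, "cats": [2]},     # area 2: category 2
    {"value": 60.0, "cats": [2]},
]


def sums_one_category_per_cell(cells, pick):
    """Sum values per category when each cell keeps a single category."""
    sums = {}
    for cell in cells:
        cat = pick(cell["cats"])  # hypothetical selection rule
        sums[cat] = sums.get(cat, 0.0) + cell["value"]
    return sums


def sums_all_categories(cells):
    """Sum values per category when overlap cells count for every category."""
    sums = {}
    for cell in cells:
        for cat in cell["cats"]:
            sums[cat] = sums.get(cat, 0.0) + cell["value"]
    return sums


# Assumption: the lowest category is the one kept for overlap cells.
rasterized = sums_one_category_per_cell(cells, pick=min)
complete = sums_all_categories(cells)

for cat in sorted(complete):
    kept = rasterized.get(cat, 0.0)
    status = "complete" if kept == complete[cat] else "partial"
    print(f"category {cat}: {kept} of {complete[cat]} ({status})")
```

With these made-up values, category 1 keeps all four of its cells, while category 2 loses the two overlap cells, so its sum is computed from area 2 only.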

Contributor

Thanks for the clarification, Markus. I re-created the plot on paper here. So, you mean it's partial because category 2 will then be considered only in area 2 in this case?

Contributor

I have added a more detailed explanation. OK, @veroandreo?

Contributor

Yes, thanks Markus :)

@metzm
Contributor

metzm commented Jan 12, 2022

An example that creates a vector map in nc_spm_08 where some areas have multiple categories, and then counts the areas that have multiple categories (requires PR #2085):

# create a vector where some areas have multiple categories 
v.buffer input=hospitals output=hospitals_circled type=point distance=1000 -t
# add unique category values to a new layer
v.category in=hospitals_circled out=hospitals_circled_l2 type=centroid op=add layer=2
# add a table to layer 2 with a column to hold counts of category values in layer 1
v.db.addtable map=hospitals_circled_l2 layer=2 columns="ncats_l1 integer"
# check range of category values in both layers
# range is larger in layer 2 because of unique category values 
v.category in=hospitals_circled_l2 op=report
# load number of different category values in layer 1 to layer 2 
v.to.db map=hospitals_circled_l2 layer=2 op=query query_layer=1 query_column="count(cat)" column=ncats_l1
# print summary of multiple category values in layer 1
v.db.select map=hospitals_circled_l2 layer=2 col="ncats_l1,count(ncats_l1)" group=ncats_l1
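
To make the last two steps easier to follow, here is a rough plain-Python analogue of what they compute. It is only a sketch and does not call GRASS: the `area_cats` mapping and its values are hypothetical stand-ins for the layer 2 areas and their layer 1 categories, and counting distinct values is an assumption about the intent of the `count(cat)` query.

```python
from collections import Counter

# Hypothetical data: layer 2 category (unique per area) mapped to the
# layer 1 categories attached to that area.
area_cats = {
    1: [1],
    2: [2],
    3: [1, 2],  # overlap of the buffers with categories 1 and 2
    4: [3],
    5: [2, 3],  # overlap of the buffers with categories 2 and 3
}

# Rough analogue of the v.to.db step: number of layer 1 categories per area.
ncats_l1 = {area: len(set(cats)) for area, cats in area_cats.items()}

# Rough analogue of the grouped v.db.select step: number of areas for each
# category count.
summary = Counter(ncats_l1.values())

# Areas whose statistics could be affected by the one-category-per-area rule.
multi_cat_areas = sorted(area for area, n in ncats_l1.items() if n > 1)

for ncats, nareas in sorted(summary.items()):
    print(f"{nareas} area(s) with {ncats} categor{'y' if ncats == 1 else 'ies'}")
```

Any area with a count above 1 is one where only a single category would be kept during rasterization.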

@neteler added the database label Jan 12, 2022
@metzm merged commit 06e2c6e into OSGeo:main Jan 13, 2022
neteler pushed a commit that referenced this pull request Jan 13, 2022
* explain behaviour and incomplete stats if areas have multiple categories in the selected layer

Co-authored-by: Markus Metz <33666869+metzm@users.noreply.github.com>
ninsbl pushed a commit to ninsbl/grass that referenced this pull request Oct 26, 2022
* explain behaviour and incomplete stats if areas have multiple categories in the selected layer

Co-authored-by: Markus Metz <33666869+metzm@users.noreply.github.com>
ninsbl pushed a commit to ninsbl/grass that referenced this pull request Feb 17, 2023
* explain behaviour and incomplete stats if areas have multiple categories in the selected layer

Co-authored-by: Markus Metz <33666869+metzm@users.noreply.github.com>
neteler pushed a commit to nilason/grass that referenced this pull request Nov 7, 2023
* explain behaviour and incomplete stats if areas have multiple categories in the selected layer

Co-authored-by: Markus Metz <33666869+metzm@users.noreply.github.com>
Labels
bug (Something isn't working), database (Related to database management), HTML (Related code is in HTML), manual (Documentation related issues), vector (Related to vector data processing)
Projects
None yet
4 participants