Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

r.info: Add JSON output #3744

Merged
merged 9 commits into from
Jun 16, 2024
Merged

Conversation

kritibirda26
Copy link
Contributor

Use parson to add json output format support to the r.info module. The module has various flags to control the fields being output in case of plain format. The current prototype adheres to the flags for JSON output as well.

Use parson to add json output format support to the r.info module. The
module has various flags to control the fields being output in case of plain
format. The current prototype adheres to the flags for JSON output as well.
@github-actions github-actions bot added raster Related to raster data processing Python Related code is in Python C Related code is in C module tests Related to Test Suite labels May 27, 2024
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@wenzeslaus
Copy link
Member

I'm curious what others think about making parts of the JSON dependent on the flags versus always outputting everything.

Compared to our previous discussion, I see that respecting flags is actually easier than not respecting them given the current code. On the other hand, adding || format == JSON is not that bad.

@cwhite911
Copy link
Contributor

I'm curious what others think about making parts of the JSON dependent on the flags versus always outputting everything.

Compared to our previous discussion, I see that respecting flags is actually easier than not respecting them given the current code. On the other hand, adding || format == JSON is not that bad.

I think we should start with outputting everything.

@kritibirda26 will you add the JSON schema to your original PR comment.

@kritibirda26
Copy link
Contributor Author

@cwhite911 Makes sense, I will update the code to output everything. By JSON schema, do you mean a valid https://json-schema.org/ or just a rough outline of the intended JSON output?

kritibirda26 and others added 2 commits May 28, 2024 20:54
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@ninsbl
Copy link
Member

ninsbl commented May 30, 2024

Does JSON output actually solve: #491 ?

@wenzeslaus
Copy link
Member

Does JSON output actually solve: #491 ?

It should in a way. I think issues like these is one of the important reasons to introduce JSON. ...what to do for plain text output with -e that's a discussion for #491.

raster/r.info/main.c Outdated Show resolved Hide resolved
raster/r.info/main.c Show resolved Hide resolved
raster/r.info/main.c Show resolved Hide resolved
raster/r.info/main.c Show resolved Hide resolved
raster/r.info/main.c Show resolved Hide resolved
raster/r.info/testsuite/test_r_info.py Outdated Show resolved Hide resolved
@cwhite911
Copy link
Contributor

Does JSON output actually solve: #491 ?

It should in a way. I think issues like these is one of the important reasons to introduce JSON. ...what to do for plain text output with -e that's a discussion for #491.

I agree the text returned as json should not contain line breaks. @kritibirda26

@kritibirda26
Copy link
Contributor Author

@cwhite911 AFAIK, it is not possible to set integers selectively using parson. My understanding is that all numbers should be integers or floats. See kgabis/parson#71

@echoix
Copy link
Member

echoix commented May 31, 2024

Does JSON output actually solve: #491 ?

It should in a way. I think issues like these is one of the important reasons to introduce JSON. ...what to do for plain text output with -e that's a discussion for #491.

I agree the text returned as json should not contain line breaks. @kritibirda26

Strings should probably be escaped

@cwhite911
Copy link
Contributor

cwhite911 commented May 31, 2024

@cwhite911 AFAIK, it is not possible to set integers selectively using parson. My understanding is that all numbers should be integers or floats. See kgabis/parson#71

@kritibirda26 Will you look into some possible solutions.

@kritibirda26
Copy link
Contributor Author

@cwhite911 I had looked into finding a solution before but couldn't fine one. The best I could come up with is setting json_set_number_serialization_function to something like this:

static int format_json_number(double num, char* buf) {
    long truncated = (long) num;
    if (truncated == num) {
        return sprintf(buf, "%ld", truncated);
    } else {
        return sprintf(buf, "%lf", num);
    }
}

We still cannot control individual number field's datatypes but it will format numbers as integers if possible. (The downside is some floating point fields might be formatted as numbers if their values are integers without any fractional part). Can you please tell if this solution is fine or if you have a better solution in mind?

@wenzeslaus
Copy link
Member

...We still cannot control individual number field's datatypes but it will format numbers as integers if possible. (The downside is some floating point fields might be formatted as numbers if their values are integers without any fractional part)...

I don't have any suggestion at this point, but I remember I struggled with this when reading JSON into Python, sometimes the values for the same key came as ints and sometimes as floats depending on how they are stored in JSON. Considering the following example in Python:

>>> type(json.loads('{"a": 7}')["a"])
<class 'int'>
>>> type(json.loads('{"a": 7.1}')["a"])
<class 'float'>

This is expected, but it depends on the tool which produces the JSON whether or not it makes the distinction and would export 7.0 as 7.0 or 7. In Python, mixing floats and ints gives the expected float results and even division of ints now gives float, so it is less of an issue. If one or the other type is needed consistently, the only two ways I know about (without invoking things JSON schema reading) are knowing what type that should be (you know that for resolution you should use float) or exporting the types in JSON (makes more sense in cases when these are dynamic as in v.db.select).

Just for comparison, Python JSON dump is preserving float vs int:

>>> import json
>>> json.dumps({"a": 7})
'{"a": 7}'
>>> json.dumps({"a": 7.0})
'{"a": 7.0}'
>>> json.dumps({"a": int(7)})
'{"a": 7}'
>>> json.dumps({"a": float(7)})
'{"a": 7.0}'
>>> json.dumps({"a": 7.000})
'{"a": 7.0}'

raster/r.info/main.c Outdated Show resolved Hide resolved
@wenzeslaus
Copy link
Member

All versus selected:

$ # -g
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -g >/dev/null; done

real	0m21.433s
user	0m15.746s
sys	0m6.246s

$ # -gr
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -gr >/dev/null; done

real	0m22.289s
user	0m16.450s
sys	0m6.419s

$ # -grs
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -grs >/dev/null; done

real	0m34.711s
user	0m28.881s
sys	0m6.326s

$ # -gs
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -gs >/dev/null; done

real	0m34.914s
user	0m29.283s
sys	0m6.151s

$ # -s
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -s >/dev/null; done

real	0m34.896s
user	0m29.112s
sys	0m6.316s

$ # -gre
$ time for i in {1..100}; do grass-dev --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -gre >/dev/null; done

real	0m22.039s
user	0m16.297s
sys	0m6.315s

$ # -grse
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -grse >/dev/null; done

real	0m35.337s
user	0m29.894s
sys	0m5.983s

@echoix
Copy link
Member

echoix commented Jun 5, 2024

All versus selected:

$ # -g
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -g >/dev/null; done

real	0m21.433s
user	0m15.746s
sys	0m6.246s


$ # -gr
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -gr >/dev/null; done


real	0m22.289s
user	0m16.450s
sys	0m6.419s



$ # -grs
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -grs >/dev/null; done


real	0m34.711s
user	0m28.881s
sys	0m6.326s


$ # -gs
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -gs >/dev/null; done


real	0m34.914s
user	0m29.283s
sys	0m6.151s


$ # -s
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -s >/dev/null; done


real	0m34.896s
user	0m29.112s
sys	0m6.316s



$ # -gre
$ time for i in {1..100}; do grass-dev --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -gre >/dev/null; done


real	0m22.039s
user	0m16.297s
sys	0m6.315s


$ # -grse
$ time for i in {1..100}; do grass --tmp-mapset ~/grassdata/nc_spm_08_grass7/ --exec r.info elevation -grse >/dev/null; done


real	0m35.337s
user	0m29.894s
sys	0m5.983s

So by reduction, the -s flag adds about 12 seconds for 100 invocations, while other flag combinations lie in the 1-2 second range.

@github-actions github-actions bot added HTML Related code is in HTML docs labels Jun 7, 2024
@wenzeslaus
Copy link
Member

...the -s flag adds about 12 seconds for 100 invocations, while other flag combinations lie in the 1-2 second range.

@kritibirda26 @cwhite911 Do you want to reevaluate the enabled flags? Perhaps it should give the same output as r.info map=name without any flags. This includes min,max but does not include stats.

@kritibirda26
Copy link
Contributor Author

@wenzeslaus Can you please tell what we is to be reevaluated again and what the aim is?

petrasovaa added a commit that referenced this pull request Jun 7, 2024
Given the [discussion](#3744 (comment)) about what should be part of JSON output, I changed the behavior to include the distance whether or not the -l flag is used. There is no computational reason not to.
@wenzeslaus
Copy link
Member

old new
r.info elevation r.info elevation or r.info elevation format=plain
r.info elevation -gre r.info elevation format=shell
- r.info elevation format=json

There is -s which is more costly, so -s should work like this:

command result
r.info elevation -s shell-style output like now
r.info elevation -gres shell-style output like now
r.info elevation -s format=shell same as -gres above
r.info elevation -s format=json JSON with statistics

The -h flag is not entirely clear to me without reading the code. However, it creates a human-readable output and is exclusive with -gres. It seems to me that the output of -h is already included in JSON and shell-style.

Not now, but for version 9, we may want to do:

command result
r.info elevation -gre none of -gre is supported (replaced by format=shell)
r.info elevation format=shell shell-style (same as v8.5)
r.info elevation -s human-readable for statistics (only statistics or current plus statistics)
r.info elevation -s format=shell shell-style with statistics (same as v8.5)
r.info elevation -s format=json JSON with statistics (same as v8.5)
r.info elevation -h human-readable for history (same as v8.3/v8.4)

@cwhite911
Copy link
Contributor

@wenzeslaus I agree, let's have the JSON format respect the '-s' flag.

@kritibirda26 kritibirda26 changed the title Add JSON support to r.info module r.info: add JSON support Jun 14, 2024
@kritibirda26
Copy link
Contributor Author

Hi! I have updated the PR to respect the s-flag. IIUC, no other changes are pending and the PR should be ready for merging.

Copy link
Member

@wenzeslaus wenzeslaus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. Let's merge it. The code looks good and the tests, too.

We can always get back to this. (We have the branch for the 8.4 release series, so changes on main can be adjusted even in terms of API until branching for 8.5.)

@wenzeslaus wenzeslaus changed the title r.info: add JSON support r.info: Add JSON support Jun 16, 2024
@wenzeslaus wenzeslaus changed the title r.info: Add JSON support r.info: Add JSON output Jun 16, 2024
@wenzeslaus wenzeslaus merged commit 5fbf526 into OSGeo:main Jun 16, 2024
26 checks passed
@neteler neteler added this to the 8.5.0 milestone Jun 16, 2024
a0x8o pushed a commit to a0x8o/grass that referenced this pull request Jun 17, 2024
Given the [discussion](OSGeo#3744 (comment)) about what should be part of JSON output, I changed the behavior to include the distance whether or not the -l flag is used. There is no computational reason not to.
a0x8o pushed a commit to a0x8o/grass that referenced this pull request Jun 17, 2024
Uses parson to add JSON output format support to the r.info module. The tool has various flags to control the fields being output in case of the shell script style format (aka plain in the code). The JSON output always contains all fields even without the flags being specified, except for statistics. The statistics must be set with a flag because the statistics are only optionally stored in the raster data and computing them on the fly takes a lot of time.

The option value for format is plain (shell/key-value) and json and internally, the plain, human output is treated separately from the machine readable formats which are PLAIN and JSON.
@kritibirda26 kritibirda26 deleted the add-json-to-r-info branch June 19, 2024 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C Related code is in C docs HTML Related code is in HTML module Python Related code is in Python raster Related to raster data processing tests Related to Test Suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants