change edit_api to reflect server #941

sahithyaravi · 2020-08-19T08:08:03Z

Reference Issue

What does this PR implement/fix? Explain your changes.

The server has changed the edit API; see openml/OpenML#1058.
Summary of the changes:
Edit API - Everyone can edit non-critical data fields. Critical fields can be edited only by the owner, and only if the dataset has no tasks. If the owner has already created a task, the server throws an error.

Fork API - Clones the dataset row as is, changing only the dataset ID and the uploader ID.
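
The rules summarized above can be sketched as a small, purely local model. This illustrates the decision logic only and is not the server's implementation: `Dataset`, `can_edit`, and `fork` are hypothetical names, and the membership of the two field sets is an assumption that follows the arguments of openml-python's `edit_dataset` (which documents `default_target_attribute`, `row_id_attribute`, and `ignore_attribute` as critical).

```python
import dataclasses
from dataclasses import dataclass

# Assumed grouping of editable fields, for illustration only.
CRITICAL_FIELDS = {"default_target_attribute", "row_id_attribute", "ignore_attribute"}
NON_CRITICAL_FIELDS = {
    "description", "creator", "contributor", "collection_date",
    "language", "citation", "original_data_url", "paper_url",
}


@dataclass(frozen=True)
class Dataset:
    """Hypothetical stand-in for a dataset row on the server."""
    data_id: int
    uploader_id: int
    description: str = ""


def can_edit(dataset: Dataset, user_id: int, field: str, has_tasks: bool) -> bool:
    """Return True if `user_id` may edit `field` under the rules above."""
    if field in NON_CRITICAL_FIELDS:
        # Non-critical fields are editable by every (authenticated) user.
        return True
    if field in CRITICAL_FIELDS:
        # Critical fields: owner only, and only while no task uses the dataset.
        return user_id == dataset.uploader_id and not has_tasks
    raise ValueError(f"unknown or non-editable field: {field}")


def fork(dataset: Dataset, new_id: int, user_id: int) -> Dataset:
    """Clone the row as is; only the dataset ID and the uploader ID change."""
    return dataclasses.replace(dataset, data_id=new_id, uploader_id=user_id)
```

In this model, a user who is blocked from editing a critical field can fork the dataset and then edit the fork as its new owner, which is the workflow the two APIs are meant to support together.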

How should this PR be tested?

Any other comments?

Fork API changes will be made in a new PR. This PR is more urgent because the edit API change is breaking the openml-python build.
Note that changes to the data itself are not supported. If the data requires changes, the user has to create a new dataset. We will not support this through the edit API or the fork API, because it changes the ARFF file.
In the future we want to support versioning for descriptions. (For now we allow direct changes without any version history.)

PGijsbers

Do we want to allow everyone to edit the data? Is this temporary? I can't imagine this is healthy long term, someone will come along and be destructive.

Of course we do still need a way for other users to at least suggest changes to be approved by the original uploader or someone from openml (hopefully mostly the former). Are there plans for this?

tests/test_datasets/test_dataset_functions.py

codecov-commenter · 2020-08-27T09:12:10Z

Codecov Report

Merging #941 into develop will decrease coverage by 0.28%.
The diff coverage is 81.77%.

@@             Coverage Diff             @@
##           develop     #941      +/-   ##
===========================================
- Coverage    88.06%   87.77%   -0.29%     
===========================================
  Files           37       37              
  Lines         4364     4427      +63     
===========================================
+ Hits          3843     3886      +43     
- Misses         521      541      +20
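
As a worked check of the numbers above, the coverage percentages can be recomputed from the hit and miss counts. This is a minimal sketch; `coverage_percent` is a hypothetical helper, not part of Codecov. The results come out at roughly 88.06% and 87.78%, a drop of about 0.28 points. The report's figures look truncated rather than rounded to two decimals (an assumption, not something the report states), which would account for the 87.77% shown here and for the summary line saying 0.28% while the table shows -0.29%.

```python
def coverage_percent(hits: int, misses: int) -> float:
    """Line coverage as a percentage: covered lines over all tracked lines."""
    lines = hits + misses
    return 100.0 * hits / lines


# Counts taken from the coverage diff above.
before = coverage_percent(hits=3843, misses=521)  # develop
after = coverage_percent(hits=3886, misses=541)   # with #941 merged
delta = after - before

print(f"before={before:.4f}% after={after:.4f}% delta={delta:+.4f}")
```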

| Impacted Files | Coverage Δ |
|---|---|
| openml/datasets/__init__.py | `100.00% <ø> (ø)` |
| openml/exceptions.py | `93.54% <0.00%> (ø)` |
| openml/extensions/__init__.py | `100.00% <ø> (ø)` |
| openml/flows/__init__.py | `100.00% <ø> (ø)` |
| openml/runs/__init__.py | `100.00% <ø> (ø)` |
| openml/study/__init__.py | `100.00% <ø> (ø)` |
| openml/tasks/__init__.py | `100.00% <ø> (ø)` |
| openml/setups/setup.py | `44.00% <9.09%> (ø)` |
| openml/evaluations/evaluation.py | `66.66% <20.00%> (ø)` |
| openml/datasets/data_feature.py | `68.00% <33.33%> (ø)` |

... and 28 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5d2e0ce...cfa2513.

mfeurer · 2020-08-27T11:34:05Z

> Do we want to allow everyone to edit the data? Is this temporary?

Reading this, I'm also a bit worried about this.

sahithyaravi · 2020-08-27T11:53:05Z

> > Do we want to allow everyone to edit the data? Is this temporary?
>
> Reading this, I'm also a bit worried about this.

It is temporary that everyone can edit non-critical fields.
This avoids unnecessary forks (even with the same file ID) for changes to a non-critical field such as citation or paper_url.
For now, description also falls under the non-critical category.
However, in the near future, we plan to:

- Give the owner of the dataset the choice to allow or disallow non-critical edits from the community. If the owner chooses not to allow this, then only the owner can perform edits.
- Support versioning for dataset descriptions. This way we don't need different dataset versions for a change in description.

mfeurer · 2020-08-27T11:54:30Z

> It is temporary that everyone can edit non-critical fields.

Wouldn't it make sense to at least restrict this to registered users?

sahithyaravi · 2020-08-27T11:56:16Z

> > It is temporary that everyone can edit non-critical fields.
>
> Wouldn't it make sense to at least restrict this to registered users?

Yes, only registered users can do this; otherwise the server throws an "Authentication failed" error. I checked this with POST requests to the edit and fork APIs, but I think it is true for all requests that are not GET.
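
The behaviour described in this reply can be illustrated with a minimal client-side guard. It is not the server's code: `check_request` and `AuthenticationError` are hypothetical names, and treating every non-GET method as requiring a key is an assumption taken from the observation above, which was only verified for POST requests to the edit and fork APIs.

```python
from typing import Optional


class AuthenticationError(Exception):
    """Hypothetical error mirroring the server's "Authentication failed" reply."""


def check_request(method: str, api_key: Optional[str]) -> None:
    """Raise if a non-GET request is attempted without an API key."""
    if method.upper() == "GET":
        # Read-only requests are assumed to work without authentication.
        return
    if not api_key:
        # Edit and fork are POST requests, so they end up here without a key.
        raise AuthenticationError("Authentication failed")
```

A guard like this would let a client fail early with a clear message instead of waiting for the server's response.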

mfeurer

This looks good to me, although I would prefer a bit more text in the tutorial explaining what critical and non-critical fields are.

examples/30_extended/datasets_tutorial.py

PGijsbers

Looks good to me. We might want to refer to the documentation on API key configuration (https://openml.github.io/openml-python/master/examples/20_basic/introduction_tutorial.html#sphx-glr-examples-20-basic-introduction-tutorial-py) rather than leaving the commented-out code in. Either way works for me. I'll leave the merge for @mfeurer since he left feedback on the PR.

@mfeurer

* Create first section: Creating Custom Flow
* Add Section: Using the Flow
  It is incomplete: while trying to explain how to format the predictions, I realized a utility function is required.
* Allow run description text to be custom
  Previously the description text that accompanies the prediction file was auto-generated with the assumption that the corresponding flow had an extension. To support custom flows (with no extension), this behavior had to be changed. The description can now be passed on initialization. The description stating that it was auto-generated from run_task is now correctly added only if the run was generated through run_flow_on_task.
* Draft for Custom Flow tutorial
* Add minimal docstring to OpenMLRun
  I am not sure what the specifications are for each field.
* Process code review feedback
  In particular:
  - text changes
  - fetch true labels from the dataset instead
* Use the format utility function in automatic runs
  To format the predictions.
* Process @mfeurer feedback
* Rename arguments of list_evaluations (#933)
* list evals name change
* list evals - update
* adding config file to user guide (#931)
* adding config file to user guide
* finished requested changes
* Edit api (#935)
* version1
* minor fixes
* tests
* reformat code
* check new version
* remove get data
* code format
* review comments
* fix duplicate
* type annotate
* example
* tests for exceptions
* fix pep8
* black format
* Adding support for scikit-learn > 0.22 (#936)
* Preliminary changes
* Updating unit tests for sklearn 0.22 and above
* Triggering sklearn tests + fixes
* Refactoring to inspect.signature in extensions
* Add flake8-print in pre-commit (#939)
* Add flake8-print in pre-commit config
* Replace print statements with logging
* Fix edit api (#940)
* fix edit api
* Update subflow paragraph
* Check the ClassificationTask has class label set
* Test task is of supported type
* Add tests for format_prediction
* Adding Python 3.8 support (#916)
* Adding Python 3.8 support
* Fixing indentation
* Execute test cases for 3.8
* Testing
* Making install script fail
* Process feedback Neeratyoy
* Test Exception with Regex
  Also throw NotImplementedError instead of TypeError for unsupported task types. Added links in the example.
* change edit_api to reflect server (#941)
* change edit_api to reflect server
* change test and example to reflect rest API changes
* tutorial comments
* Update datasets_tutorial.py

Co-authored-by: Bilgecelik <38037323+Bilgecelik@users.noreply.github.com>
Co-authored-by: marcoslbueno <38478211+marcoslbueno@users.noreply.github.com>
Co-authored-by: Sahithya Ravi <44670788+sahithyaravi1493@users.noreply.github.com>
Co-authored-by: Neeratyoy Mallik <neeratyoy@gmail.com>
Co-authored-by: zikun <33176974+zikun@users.noreply.github.com>

change edit_api to reflect server

12b67f1

sahithyaravi requested a review from PGijsbers August 19, 2020 08:09

PGijsbers reviewed Aug 25, 2020

tests/test_datasets/test_dataset_functions.py (outdated, resolved)

change test and example to reflect rest API changes

d1147b6

sahithyaravi requested a review from PGijsbers August 27, 2020 09:37

mfeurer reviewed Aug 28, 2020

examples/30_extended/datasets_tutorial.py (outdated, resolved)

sahithyaravi and others added 2 commits August 28, 2020 12:03

tutorial comments

17fb46b

Update datasets_tutorial.py

cfa2513

PGijsbers approved these changes Aug 28, 2020

sahithyaravi requested a review from mfeurer August 31, 2020 10:04

mfeurer approved these changes Aug 31, 2020

mfeurer merged commit f70c720 into develop Aug 31, 2020

mfeurer deleted the edit_api_changes branch August 31, 2020 18:27
