
Edit api #935

Merged: 14 commits merged into develop from edit_api on Jul 23, 2020

Conversation

@sahithyaravi (Member) commented Jul 22, 2020:

Reference Issue

Edit api #929

What does this PR implement/fix? Explain your changes.

As discussed, this API can edit certain meta-features of a dataset. There are two cases, depending on which fields are to be edited:

Case 1:
The following meta-features can be edited/updated in place via the data_edit REST API call. This modifies the existing version; only the uploader or an admin can do this (see the sketch after the list):

  1. description
  2. creator
  3. contributor
  4. collection_date
  5. language
  6. citation
  7. original_data_url
  8. paper_url
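
A minimal sketch of a Case 1 edit, assuming the Python entry point is exposed as openml.datasets.edit_dataset (the name follows this PR, but the exact signature may differ):

    import openml

    # Hypothetical sketch: update Case 1 meta-features of dataset 128 in place.
    # The dataset id and field values are placeholders.
    data_id = openml.datasets.edit_dataset(
        128,
        description="A corrected, more detailed description.",
        creator="R. A. Fisher",
        citation="Fisher (1936)",
    )
    print(data_id)  # same id: the existing version is modified, no new version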

Case 2:
Editing any of the following fields instead calls the create_dataset API: a new version is created by cloning the old version and changing only the specified fields (see the sketch below):

  1. attributes
  2. data - the data itself
  3. default_target_attribute
  4. ignore_attribute
  5. row_id_attribute

If fields from both Case 1 and Case 2 are specified, Case 2 applies and a new version is created.
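
A sketch of a Case 2 edit, under the same assumption about the entry point; here the returned id would belong to a newly created version:

    import openml

    # Hypothetical sketch: editing a Case 2 field clones the dataset and
    # uploads a new version; the field value is a placeholder.
    new_id = openml.datasets.edit_dataset(
        128,
        default_target_attribute="class",
    )
    print(new_id)  # id of the new version, different from 128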

How should this PR be tested?

Any other comments?

  • This has not been added to the docs yet; I will do so after feedback/review.
  • We are also planning a "fork" API, which forks a dataset so that a user can edit their own copy.

@codecov-commenter commented Jul 22, 2020:

Codecov Report

Merging #935 into develop will decrease coverage by 0.01%.
The diff coverage is 96.87%.


@@             Coverage Diff             @@
##           develop     #935      +/-   ##
===========================================
- Coverage    87.76%   87.76%   -0.01%     
===========================================
  Files           37       37              
  Lines         4397     4429      +32     
===========================================
+ Hits          3859     3887      +28     
- Misses         538      542       +4     
Impacted Files                  Coverage   Δ
openml/datasets/functions.py    94.11%     <96.87%> (+0.27%) ⬆️
openml/_api_calls.py            87.93%     <0.00%> (-2.59%) ⬇️


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@PGijsbers (Collaborator) left a comment:

Looks good! Have a few minor questions/comments.

(Resolved review thread on tests/test_datasets/test_dataset_functions.py)
@PGijsbers (Collaborator) left a comment:

Sorry to keep this up, but a few minor changes for the documentation. After that it's good to merge.

@PGijsbers merged commit 9c93f5b into develop on Jul 23, 2020
@PGijsbers deleted the edit_api branch on Jul 23, 2020 at 11:08
mfeurer pushed a commit that referenced this pull request on Sep 2, 2020:
* Create first section: Creating Custom Flow

* Add Section: Using the Flow

It is incomplete: while trying to explain how to format the
predictions, I realized a utility function is required.

* Allow run description text to be custom

Previously the description text that accompanies the prediction file was
auto-generated with the assumption that the corresponding flow had an
extension. To support custom flows (with no extension), this behavior
had to be changed. The description can now be passed on initialization.
The description noting that it was auto-generated from run_task is now
correctly added only if the run was generated through run_flow_on_task.
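
For illustration, a hedged sketch of what passing a custom description might look like; the parameter name description_text and the constructor fields are assumptions based on this commit message, not a confirmed signature:

    from openml.runs import OpenMLRun

    # Hypothetical sketch: create a run for a custom flow (no extension)
    # with a hand-written description. All ids are placeholders.
    run = OpenMLRun(
        task_id=1,
        flow_id=1234,
        dataset_id=1,
        description_text="Predictions generated manually for my custom flow.",
    )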

* Draft for Custom Flow tutorial

* Add minimal docstring to OpenMLRun

I am not sure what the specifications are for each field.

* Process code review feedback

In particular:
 - text changes
 - fetch true labels from the dataset instead

* Use the format utility function in automatic runs

To format the predictions.
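
As context for this commit, a hedged sketch of such a prediction-formatting utility in use; the name format_prediction comes from the commit titles below, and the import path and parameters shown are assumptions:

    import openml
    from openml.runs import format_prediction  # import path assumed

    # Hypothetical sketch: build a single prediction row for a
    # classification task; ids, labels, and probabilities are placeholders.
    task = openml.tasks.get_task(32)
    row = format_prediction(
        task=task,
        repeat=0,
        fold=0,
        index=0,
        prediction="positive",
        truth="negative",
        proba={"positive": 0.7, "negative": 0.3},
    )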

* Process @mfeurer feedback

* Rename arguments of list_evaluations (#933)

* list evals name change

* list evals - update

* adding config file to user guide (#931)

* adding config file to user guide

* finished requested changes

* Edit api (#935)

* version1

* minor fixes

* tests

* reformat code

* check new version

* remove get data

* code format

* review comments

* fix duplicate

* type annotate

* example

* tests for exceptions

* fix pep8

* black format

* Adding support for scikit-learn > 0.22 (#936)

* Preliminary changes

* Updating unit tests for sklearn 0.22 and above

* Triggering sklearn tests + fixes

* Refactoring to inspect.signature in extensions

* Add flake8-print in pre-commit (#939)

* Add flake8-print in pre-commit config

* Replace print statements with logging

* Fix edit api (#940)

* fix edit api

* Update subflow paragraph

* Check the ClassificationTask has class label set

* Test task is of supported type

* Add tests for format_prediction

* Adding Python 3.8 support (#916)

* Adding Python 3.8 support

* Fixing indentation

* Execute test cases for 3.8

* Testing

* Making install script fail

* Process feedback Neeratyoy

* Test Exception with Regex

Also throw NotImplementedError instead of TypeError for unsupported task
types. Added links in the example.
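
As a generic illustration of the pattern named in this commit (not the project's actual test), pytest can assert both the exception type and a regex match on its message:

    import pytest

    def test_unsupported_task_type_raises():
        # Expect a NotImplementedError whose message matches the regex.
        with pytest.raises(NotImplementedError, match=r"not supported"):
            raise NotImplementedError("Clustering tasks are not supported.")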

* change edit_api to reflect server (#941)

* change edit_api to reflect server

* change test and example to reflect rest API changes

* tutorial comments

* Update datasets_tutorial.py


Co-authored-by: Bilgecelik <38037323+Bilgecelik@users.noreply.github.com>
Co-authored-by: marcoslbueno <38478211+marcoslbueno@users.noreply.github.com>
Co-authored-by: Sahithya Ravi <44670788+sahithyaravi1493@users.noreply.github.com>
Co-authored-by: Neeratyoy Mallik <neeratyoy@gmail.com>
Co-authored-by: zikun <33176974+zikun@users.noreply.github.com>