Prepare 11.0 release #969

PGijsbers · 2020-10-24T20:40:08Z

No description provided.

* Preliminary addition of license to source files * Adding license to almost every source file

* add task_type to list_runs * length of run change * changelog * changes in progress rst

* do not populate server base URL on startup * update changelog * fix pep8

* Ask users to cite us * improve reference * Remove linebreak from bibtex block.

* Adding option to print logs during an api call * Adding timing to log and changing string interpolation * Improving logging and timing of api calls * PEP8 * PEP8

* improve sdist handling * fix changelog * fix pytest installation * install test dependencies extra * fix sdist

* add better error message for too-long URI * improve error handling * improve data download function, fix bugs * stricter API, more private methods * incorporate Pieter's feedback

* Initial changes to handle reproducible example from the issue * Making tentative changes; Need to test deserialization * Fixing deserialization when empty steps in sklearn model * Fixing flake issues, failing test cases * Fixing test cases * Dropping support for 'None' as sklearn estimator * Adding test case for None estimator

* Add support for using run_model_on_task simply * Add unit test * fix mypy error

* Changes proposed in #885. Don't register handlers by default. * Delay file creation until log emit. Correctly read from config. * Remove loading/storing log level references. * _create_log_handlers now returns early if called a second time * Fix type errors. * Update changelog. * Test remove register file log handler to see if CI works. * Undo last change. test server ssl works again. * Bump scikit-learn version to 0.22 * Scikit-learn 0.22 does not install properly. * Install scikit-learn through pip instead.
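The two logging changes above (delaying file creation until the first emit, and returning early on a second call) can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the function name `create_log_handlers`, the logger name and the size limits are assumptions made for the example.

```python
import logging
import logging.handlers


def create_log_handlers(logger: logging.Logger, log_path: str) -> None:
    """Attach one rotating file handler to `logger` (hypothetical helper)."""
    already_attached = any(
        isinstance(h, logging.handlers.RotatingFileHandler) for h in logger.handlers
    )
    if already_attached:
        # A second call returns early instead of registering a duplicate handler.
        return
    handler = logging.handlers.RotatingFileHandler(
        log_path,
        maxBytes=1_000_000,  # illustrative limit
        backupCount=1,
        delay=True,  # the file is only opened when the first record is emitted
    )
    logger.addHandler(handler)
```

With `delay=True`, merely importing or configuring the package does not leave an empty log file behind; the file appears only once something is actually logged.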

* init feather implementation * sparse matrix * test notebook * feather pickle compare * test arrow vs feather * add columns condition * Testing * get_dataset add cache format * add pyarrow * sparse matrix check * pep8 and remove files * return type * fix type annotation * value check * change feather condition * fixes and test * fix errors * testing file * feather new file for attributes * change feather attribute file path * delete testing file * testing changes * delete pkls * fixes * fixes * add comments * change default caching * pip version * review comment fixes * newline * fix if condition * Update install.sh * pandas version due to sparse data * review #2 * Update appveyor.yml * Update appveyor.yml * rename cache dir

* remove __version__ from __all__ in init * Add comment for flake8 test

* Removing support for pandas SparseDataFrame * Fixing rebase loss * Reiterating with Matthias' changes * Rolling back setup * Fixing PEP8 * Changing check to detect sparse dataframes * Fixing edge case to handle server side arff issue * Removing stray comment * Failing test case fix * Removing stray comment

* Fixing typos * Rewording

* Sphinx issue fix * Removing comment

I ran into issues when the openml server config is not exactly 'https://www.openml.org/api/v1/xml', e.g. I had 'https://www.openml.org/api/v1'. I only noticed when getting a bad dataset url. This edit makes the API more robust against how exactly the server URL is set in the config.
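One way to make URL construction independent of the exact path in the configured server URL is to reduce it to scheme and host before building other links. The sketch below only illustrates that idea under stated assumptions: the helper names `site_root` and `dataset_url` and the `/d/<id>` layout are hypothetical and are not taken from this PR.

```python
from urllib.parse import urlsplit


def site_root(server_url: str) -> str:
    """Return scheme and host of `server_url`, ignoring any API path suffix."""
    parts = urlsplit(server_url)
    return f"{parts.scheme}://{parts.netloc}"


def dataset_url(server_url: str, dataset_id: int) -> str:
    # Assumed URL layout, used only to show that the result no longer
    # depends on whether the config ends in /api/v1 or /api/v1/xml.
    return f"{site_root(server_url)}/d/{dataset_id}"
```

Both `https://www.openml.org/api/v1/xml` and `https://www.openml.org/api/v1` then yield the same dataset link.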

* Add Flake8 configuration. Uses the configuration from ci_scripts.
* Add mypy configuration file. Based on the ci_scripts parameters.
* Pre-commit mypy flake8, add flake8 excludes. Any venv folder does not need flake8. The example directory got flake8 warnings so I assumed it should be excluded.
* Add Black to pre-commit. Add ignore E203, as Black observes PEP 8's specification for whitespace around a colon when it is next to an expression.
* Set max line length to 100
* Blacken code. There are a few places where big indentation is introduced that may warrant refactoring so it looks better. I did not refactor anything yet, but did exclude three (?) lists (of ids) from formatting.
* Add unit tests to flake8 and mypy pre-commit
* Use pre-commit for flake8, mypy and black checks. This ensures it runs with the same versions and settings as developers.
* Update docs, add 'test' dependencies. Add two other developer dependencies not strictly required for unit tests, but required for development. I think the overlap between people who want to execute unit tests and perform commits is (close to) 100% anyway.
* Uninstall pytest-cov on appveyor ci. It seems to cause an error on import due to a missing sqlite3 dll. As we don't check coverage anyway, hopefully just uninstalling is sufficient.
* Add -y to uninstall
* Sphinx issue fix (#923)
  * Sphinx issue fix
  * Removing comment
* More robust handling of openml_url (#921). I ran into issues when the openml server config is not exactly 'https://www.openml.org/api/v1/xml', e.g. I had 'https://www.openml.org/api/v1'. I only noticed when getting a bad dataset url. This edit makes the API more robust against how exactly the server URL is set in the config.
* format for black artifacts

Co-authored-by: Neeratyoy Mallik <neeratyoy@gmail.com>
Co-authored-by: Joaquin Vanschoren <joaquin.vanschoren@gmail.com>

* MAINT 918: improve error handling and error message * incorporate feedback from Pieter

* Increase unit test stability by waiting longer for the server to process run traces, and by querying the server less frequently for new run traces. * Make test stricter actually, we only wait for evaluations to ensure that the trace is processed by the server. Therefore, we can also simply wait for the trace being available instead of relying on the proxy indicator of evaluations being available. * fix stricter test
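The waiting strategy described above (wait for the trace itself, poll less often, allow more time overall) amounts to a polling loop with an interval and a deadline. The following is a generic sketch under assumptions: `wait_for` and its parameters are hypothetical names, and `fetch` stands in for whatever call asks the server for the run trace.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(
    fetch: Callable[[], Optional[T]],
    timeout: float,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` every `interval` seconds until it returns a value or `timeout` passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if result is not None:
            # The resource itself is available, no proxy indicator needed.
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing available after {timeout} seconds")
        sleep(interval)
```

A longer `interval` reduces load on the server, while a larger `timeout` gives it more time to finish processing before the test fails.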

* Mention the initialization of pre-commit * Restructure the two contribution guidelines The rst file will now have general contribution information, for contributions that are related to openml-python, but not actually to the openml-python repository. Information for making a contribution to the openml-python repository is in the contributing markdown file.

* improve error message for dataset upload * fix unit test

* list evals name change * list evals - update

* adding config file to user guide * finished requested changes

* version1 * minor fixes * tests * reformat code * check new version * remove get data * code format * review comments * fix duplicate * type annotate * example * tests for exceptions * fix pep8 * black format

* Preliminary changes * Updating unit tests for sklearn 0.22 and above * Triggering sklearn tests + fixes * Refactoring to inspect.signature in extensions

* Add flake8-print in pre-commit config * Replace print statements with logging

* fix edit api

* Adding Python 3.8 support * Fixing indentation * Execute test cases for 3.8 * Testing * Making install script fail

* change edit_api to reflect server * change test and example to reflect rest API changes * tutorial comments * Update datasets_tutorial.py

@mfeurer

* Create first section: Creating Custom Flow
* Add Section: Using the Flow. It is incomplete, as while trying to explain how to format the predictions, I realized a utility function is required.
* Allow run description text to be custom. Previously the description text that accompanies the prediction file was auto-generated with the assumption that the corresponding flow had an extension. To support custom flows (with no extension), this behavior had to be changed. The description can now be passed on initialization. The description stating it was auto-generated from run_task is now correctly only added if the run was generated through run_flow_on_task.
* Draft for Custom Flow tutorial
* Add minimal docstring to OpenMLRun. I am not sure for each field what the specifications are.
* Process code review feedback. In particular: text changes; fetch true labels from the dataset instead.
* Use the format utility function in automatic runs, to format the predictions.
* Process @mfeurer feedback
* Rename arguments of list_evaluations (#933)
  * list evals name change
  * list evals - update
* adding config file to user guide (#931)
  * adding config file to user guide
  * finished requested changes
* Edit api (#935)
  * version1
  * minor fixes
  * tests
  * reformat code
  * check new version
  * remove get data
  * code format
  * review comments
  * fix duplicate
  * type annotate
  * example
  * tests for exceptions
  * fix pep8
  * black format
* Adding support for scikit-learn > 0.22 (#936)
  * Preliminary changes
  * Updating unit tests for sklearn 0.22 and above
  * Triggering sklearn tests + fixes
  * Refactoring to inspect.signature in extensions
* Add flake8-print in pre-commit (#939)
  * Add flake8-print in pre-commit config
  * Replace print statements with logging
* Fix edit api (#940)
  * fix edit api
* Update subflow paragraph
* Check the ClassificationTask has class label set
* Test task is of supported type
* Add tests for format_prediction
* Adding Python 3.8 support (#916)
  * Adding Python 3.8 support
  * Fixing indentation
  * Execute test cases for 3.8
  * Testing
  * Making install script fail
* Process feedback Neeratyoy
* Test Exception with Regex. Also throw NotImplementedError instead of TypeError for unsupported task types. Added links in the example.
* change edit_api to reflect server (#941)
  * change edit_api to reflect server
  * change test and example to reflect rest API changes
  * tutorial comments
  * Update datasets_tutorial.py

Co-authored-by: Bilgecelik <38037323+Bilgecelik@users.noreply.github.com>
Co-authored-by: marcoslbueno <38478211+marcoslbueno@users.noreply.github.com>
Co-authored-by: Sahithya Ravi <44670788+sahithyaravi1493@users.noreply.github.com>
Co-authored-by: Neeratyoy Mallik <neeratyoy@gmail.com>
Co-authored-by: zikun <33176974+zikun@users.noreply.github.com>

* support passthrough and drop in sklearn extension when serialized to xml dict * make test work with sklearn==0.21 * improve PR * Add additional unit tests * fix test * incorporate feedback and generalize unit tests

* Added PEP 561 compliance (#945) * FIX: mypy test dependency * FIX: mypy test dependency (#945) * FIX: Added mypy to CI list of test packages

* convert TaskTypeEnum class to TaskType enum * update docstrings for TaskType * fix bug in examples, import TaskType directly * use task_type instead of task_type_id
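To show what moving from a class of integer constants to a real enum buys, here is a small sketch. The member names and values below are illustrative assumptions for the example and are not copied from the library; only the general `enum.Enum` behaviour is standard.

```python
from enum import Enum


class TaskType(Enum):
    # Illustrative members only; consult the library for the actual list.
    SUPERVISED_CLASSIFICATION = 1
    SUPERVISED_REGRESSION = 2


def describe(task_type: TaskType) -> str:
    """Accept an enum member rather than a bare integer id."""
    if not isinstance(task_type, TaskType):
        raise TypeError("task_type must be a TaskType member")
    return f"{task_type.name} (id {task_type.value})"
```

An integer id coming from the server can still be converted with `TaskType(1)`, while passing an arbitrary integer where a `TaskType` is expected now fails loudly instead of being silently accepted.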

* Updating contribution to aid debugging * More explicit instructions

Remove a faulty entry in the argument list of datasets.

* Improved documentation of example * Update examples/30_extended/create_upload_tutorial.py Co-authored-by: PGijsbers <p.gijsbers@tue.nl> Co-authored-by: Matthias Feurer <feurerm@informatik.uni-freiburg.de> Co-authored-by: PGijsbers <p.gijsbers@tue.nl>

@PGijsbers

* run on tasks allows dataframes * don't force third subcomponent part to be list * Making DataFrame default behaviour for runs; Fixing test cases for the same * Fixing PEP8 + Adding docstring to CustomImputer() * run on tasks allows dataframes * Attempting rebase * Fixing test cases * Trying test case fixes * run on tasks allows dataframes * don't force third subcomponent part to be list * Making DataFrame default behaviour for runs; Fixing test cases for the same * Fixing PEP8 + Adding docstring to CustomImputer() * Attempting rebase * Fixing test cases * Trying test case fixes * Allowing functions in subcomponents * Fixing test cases * Adding dataset output param to run * Fixing test cases * Changes suggested by mfeurer * Editing predict_proba function * Test case fix * Test case fix * Edit unit test to bypass server issue * Fixing unit test * Reiterating with @PGijsbers comments * Minor fixes to test cases * Adding unit test and suggestions from @mfeurer * Fixing test case for all sklearn versions * Testing changes * Fixing import in example * Triggering unit tests * Debugging failed example script * Adding unit tests * Push for debugging * Push for @mfeurer to debug * Resetting to debug * Updating branch * pre-commit fixes * Handling failing examples * Reiteration with clean ups and minor fixes * Closing comments * Black fixes * feedback from @mfeurer * Minor fix * suggestions from @PGijsbers Co-authored-by: neeratyoy <neeratyoy@gmail.com> Co-authored-by: neeratyoy <de4nas@gmail.com>

* fork api * improve docs (+1 squashed commits) Squashed commits: [ec5c0d10] import changes * minor change (+1 squashed commits) Squashed commits: [1822c99] improve docs (+1 squashed commits) Squashed commits: [ec5c0d10] import changes * docs update * clarify example * Update doc/progress.rst * Fix whitespaces for docstring * fix error * Use id 999999 for unknown dataset Co-authored-by: PGijsbers <p.gijsbers@tue.nl>

* Change default size for list_evaluations to 10000 * Suggestions from code review

Co-authored-by: PGijsbers <p.gijsbers@tue.nl>

PGijsbers · 2020-10-24T20:43:08Z

I might be able to look at the merge conflicts in the morning. I am a bit confused about why there are merge conflicts in the first place, though: the last master commit was the 10.2 release.

Dev master merged

Neeratyoy and others added 30 commits November 6, 2019 09:40

Fixing broken links (#864)

12f1455

Adding license to each source file (#862)

e489f41

* Preliminary addition of license to source files * Adding license to almost every source file

add task_type to list_runs (#857)

33bf643

* add task_type to list_runs * length of run change * changelog * changes in progress rst

Prepare new release (#868)

e5e3858

start version 0.11.0dev (#872)

46df529

Do not populate server base URL on startup (#873)

fb1c1d9

* do not populate server base URL on startup * update changelog * fix pep8

Add cite me (#874)

c02096b

* Ask users to cite us * improve reference * Remove linebreak from bibtex block.

Adding option to print logs during an api call (#833)

a1cfd6e

* Adding option to print logs during an api call * Adding timing to log and changing string interpolation * Improving logging and timing of api calls * PEP8 * PEP8

improve sdist handling (#877)

a1e2c34

* improve sdist handling * fix changelog * fix pytest installation * install test dependencies extra * fix sdist

add support for MLP HP layer_sizes (#879)

69d443f

add better error message for too-long URI (#881)

d79a98c

* add better error message for too-long URI * improve error handling * improve data download function, fix bugs * stricter API, more private methods * incorporate Pieter's feedback

Add support for using run_model_on_task simply (#888)

d5e46fe

* Add support for using run_model_on_task simply * Add unit test * fix mypy error

Fix typo, use log10 as specified in axis labels. (#890)

371911f

Remove __version__ from __all__ in openml\__init__.py (#903)

4b9b873

* remove __version__ from __all__ in init * Add comment for flake8 test

Fixing documentation typo (#914)

df864c2

* Fixing typos * Rewording

Sphinx issue fix (#923)

5a31f8e

* Sphinx issue fix * Removing comment

Improve error handling and error message when loading datasets (#925)

368700e

* MAINT 918: improve error handling and error message * incorporate feedback from Pieter

improve error message for dataset upload (#927)

525e8a6

* improve error message for dataset upload * fix unit test

FIX #912: add create_task to API doc (#924)

4256834

Rename arguments of list_evaluations (#933)

e5dcaf0

* list evals name change * list evals - update

adding config file to user guide (#931)

1670050

* adding config file to user guide * finished requested changes

Edit api (#935)

9c93f5b

* version1 * minor fixes * tests * reformat code * check new version * remove get data * code format * review comments * fix duplicate * type annotate * example * tests for exceptions * fix pep8 * black format

Neeratyoy and others added 17 commits August 3, 2020 11:01

Adding support for scikit-learn > 0.22 (#936)

666ca68

* Preliminary changes * Updating unit tests for sklearn 0.22 and above * Triggering sklearn tests + fixes * Refactoring to inspect.signature in extensions

Add flake8-print in pre-commit (#939)

5d9c69c

* Add flake8-print in pre-commit config * Replace print statements with logging

Fix edit api (#940)

7d51a76

* fix edit api

Adding Python 3.8 support (#916)

5d2e0ce

* Adding Python 3.8 support * Fixing indentation * Execute test cases for 3.8 * Testing * Making install script fail

change edit_api to reflect server (#941)

f70c720

* change edit_api to reflect server * change test and example to reflect rest API changes * tutorial comments * Update datasets_tutorial.py

Better support for passthrough and drop in sklearn extension (#943)

3d85fa7

* support passthrough and drop in sklearn extension when serialized to xml dict * make test work with sklearn==0.21 * improve PR * Add additional unit tests * fix test * incorporate feedback and generalize unit tests

Added PEP 561 compliance (#945) (#946)

d303ced

* Added PEP 561 compliance (#945) * FIX: mypy test dependency * FIX: mypy test dependency (#945) * FIX: Added mypy to CI list of test packages

Remove todo list and fix broken link (#954)

5641828

Class to enum (#958)

0def226

* convert TaskTypeEnum class to TaskType enum * update docstrings for TaskType * fix bug in examples, import TaskType directly * use task_type instead of task_type_id

Updating contribution to aid debugging (#961)

dde5662

* Updating contribution to aid debugging * More explicit instructions

MAINT #660 (#962)

d48f108

Remove a faulty entry in the argument list of datasets.

Change default size for list_evaluations (#965)

f464a2b

* Change default size for list_evaluations to 10000 * Suggestions from code review

prepare release of 0.11.0 (#966)

7a3e69f

Co-authored-by: PGijsbers <p.gijsbers@tue.nl>

PGijsbers and others added 3 commits October 25, 2020 09:45

Merge branch 'master' into develop

052550b

Update conftest.py

ec34b5c

Merge pull request #970 from openml/dev-master-merged

79a6705

Dev master merged

mfeurer approved these changes Oct 25, 2020

mfeurer merged commit bc87333 into master Oct 25, 2020
