Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modernize package infrastructure #172

Merged
merged 4 commits into from
Mar 30, 2023
Merged

Modernize package infrastructure #172

merged 4 commits into from
Mar 30, 2023

Conversation

PicoCentauri
Copy link
Collaborator

@PicoCentauri PicoCentauri commented Mar 27, 2023

This PR updates a number of the infrastructure of the package:

  1. Going from a tox multi-line string to an independent tox.ini file. The latter one is easier to maintain as a huge multi-line string and also consistent with the other projects.
  2. Adding tox sections for linting, building the documentation and formatting the code
  3. Move skmatter to src/skmatter
  4. Rename docs/source -> docs/src (same is in other lab-cosmo projects)
  5. Translate from setup.py, setup.cfg into a single pyproject.toml file
  6. Add a documentation link action
  7. Add Python 3.11 to the CI test matrix
  8. Consistent use of Python 3.10 for docs, lint and ReadTheDocs in the CI
  9. Add tests for windows and macos
  10. Update all gh-action dependencies to the lates version
  11. Update contributors.txt
  12. Update Rose's mail address
  13. Use isort and flake8 settings of the other lab-cosmo projects
  14. Following 7 use 88 characters for all files including docs and comments
  15. Remove numpy as an explict dependency. The whole package depends on scikit-learn which depends on numpy. I think we do not need to list numpy again in the dependencies if we stick with the same version as they do (which we should).

@PicoCentauri PicoCentauri force-pushed the tox branch 6 times, most recently from 6b00bbc to 98773cf Compare March 27, 2023 14:59
@PicoCentauri PicoCentauri force-pushed the tox branch 4 times, most recently from fd944ec to e12ee3a Compare March 28, 2023 08:31
Copy link
Collaborator

@Luthaf Luthaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The infrastructure changes looks good to me, I only skimmed over the documentation changes.

tox -e docs and tox -e format should be documented somewhere (in Contributing maybe?)

.github/workflows/lint.yml Outdated Show resolved Hide resolved
.github/workflows/tests.yml Outdated Show resolved Hide resolved
.github/workflows/tests.yml Show resolved Hide resolved
docs/src/conf.py Outdated Show resolved Hide resolved
docs/src/contributing.rst Outdated Show resolved Hide resolved
@PicoCentauri
Copy link
Collaborator Author

Thanks for your review @Luthaf! I implemented the useful suggestions. If you are happy I would also like to have @rosecers opinion before we apply these changes.

Copy link
Collaborator

@rosecers rosecers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going through, all looks generally fine, but I'm worried that the new linting doesn't look as nice. Can we set it up with the same rules as before?

# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/source/conf.py
configuration: docs/src/conf.py

# Optionally build your docs in additional formats such as PDF
formats:
- pdf

# Optionally set the version of Python and requirements required to build your docs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is no longer relevant

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

recursive-include src/skmatter/datasets/data/ *
recursive-include src/skmatter/datasets/descr/ *

prune tests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this do?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not include the test files in shipped package via PYPI or conda. We can remove this line and include the tests again... But, if people run the tests they will use the github repo and not the one from a package manager.

docs/Makefile Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file no longer needed? Why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tox -e docs will build the docs in an isolated environment and takes care of the OS specific stuff. So no, we do not need this anymore.

docs/make.bat Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this file no longer needed? Why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

docs/src/installation.rst Show resolved Hide resolved
Comment on lines +53 to +54
n_to_select is chosen. Otherwise will stop when the score falls below the
threshold. Stored in :py:attr:`self.score_threshold`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
n_to_select is chosen. Otherwise will stop when the score falls below the
threshold. Stored in :py:attr:`self.score_threshold`.
n_to_select is chosen. Otherwise will stop when the score falls below the threshold.
Stored in :py:attr:`self.score_threshold`.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here 88 characters per line.

@@ -84,7 +78,8 @@ class GreedySelector(SelectorMixin, MetaEstimatorMixin, BaseEstimator):
X_selected_ : ndarray,
Matrix containing the selected samples or features, for use in fitting
y_selected_ : ndarray,
In sample selection, the matrix containing the selected targets, for use in fitting
In sample selection, the matrix containing the selected targets, for
use in fitting
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are small, unnecessary formatting changes throughout here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why unnecessary? There is a line width of 88 characters. This should be obeyed to please the linter.

Comment on lines +197 to +198
"Cannot fit with warm_start=True without having been previously"
" initialized."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

88 characters per line.

Comment on lines +489 to +490
The negative loss is returned for easier use in sklearn pipelines, e.g., a
grid search, where methods named 'score' are meant to be maximized.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this render correctly?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setup.py Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this file no longer needed? I'm not seeing how removing it is beneficial.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our code is fully compatible with PEP517 and PEP660 🥳. We do not have to do any legacy builds which require a setup.py file: https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html

Also removing this file prevents user from doing python setup.py install which you should not do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the setup.py was also super weird for me the first time I saw it. But, it actually works and is the way one should do it if one can.

@PicoCentauri
Copy link
Collaborator Author

Going through, all looks generally fine, but I'm worried that the new linting doesn't look as nice. Can we set it up with the same rules as before?

Thanks. Black is exactly the same. The only thing which is slightly changed is how isort works. Packages are sorted according to: FUTURE, STDLIB, THIRDPARTY, FIRSTPARTY, LOCALFOLDER, which is the default. I think this order is much easier to read; especially the splitting between THIRDPARTY (numpy, sklearn) and FIRSTPARTY (skmatter).

@rosecers
Copy link
Collaborator

So I get the 88 char/line thing from a code-formatting standpoint, but as a code reader, I find it very frustrating. Industry-standard is 80 -- any arguments for/against doing that instead?

@PicoCentauri
Copy link
Collaborator Author

I would stick with 88 characters. 88 chars/line is the current way of skmatter since we use black as the formatter. It was only not enabled for the docstring and not applied (and still is not) for the documentation pages. We all have widescreen monitors. Additional eight characters per line lead to fewer line breaks and more readable code.

Also, I am not sure if 80 is still the industry standard. sklearn uses 88 and even the greatLinus Torvald is saying that the time of 80 chars/line are over.

@ceriottm
Copy link
Collaborator

ceriottm commented Mar 29, 2023 via email

@rosecers
Copy link
Collaborator

rosecers commented Mar 29, 2023 via email

@PicoCentauri
Copy link
Collaborator Author

Okay, since this might be a longer and intensive discussion I suggest we keep the 88 chars/line in within this PR and open an issue about how we handle this in the future. One reason is that I would like to touch more files of the repo but this PR is already huge and complex.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants