Modernize package infrastructure #172

PicoCentauri · 2023-03-27T09:15:28Z · edited by Luthaf

This PR updates several parts of the package infrastructure:

Going from a tox multi-line string to an independent tox.ini file. The latter is easier to maintain than a huge multi-line string and is also consistent with the other projects.
Adding tox sections for linting, building the documentation and formatting the code
Move skmatter to src/skmatter
Rename docs/source -> docs/src (same as in other lab-cosmo projects)
Translate from setup.py, setup.cfg into a single pyproject.toml file
Add a documentation link action
Add Python 3.11 to the CI test matrix
Consistent use of Python 3.10 for docs, lint and ReadTheDocs in the CI
Add tests for Windows and macOS
Update all gh-action dependencies to the latest version
Update contributors.txt
Update Rose's mail address
Use isort and flake8 settings of the other lab-cosmo projects
Following 7 use 88 characters for all files including docs and comments
Remove numpy as an explicit dependency. The whole package depends on scikit-learn, which depends on numpy. I think we do not need to list numpy again in the dependencies if we stick with the same version as they do (which we should).

Luthaf

The infrastructure changes look good to me; I only skimmed over the documentation changes.

tox -e docs and tox -e format should be documented somewhere (in Contributing maybe?)

.github/workflows/lint.yml

.github/workflows/tests.yml

docs/src/conf.py

docs/src/contributing.rst

PicoCentauri · 2023-03-28T11:22:39Z

Thanks for your review @Luthaf! I implemented the useful suggestions. If you are happy, I would also like to have @rosecers' opinion before we apply these changes.

rosecers

Going through, all looks generally fine, but I'm worried that the new linting doesn't look as nice. Can we set it up with the same rules as before?

rosecers · 2023-03-28T19:47:56Z

.readthedocs.yaml

 # Build documentation in the docs/ directory with Sphinx
 sphinx:
-  configuration: docs/source/conf.py
+  configuration: docs/src/conf.py

 # Optionally build your docs in additional formats such as PDF
 formats:
  - pdf

 # Optionally set the version of Python and requirements required to build your docs


This comment is no longer relevant

rosecers · 2023-03-28T19:54:13Z

MANIFEST.in

+recursive-include src/skmatter/datasets/data/ *
+recursive-include src/skmatter/datasets/descr/ *
+
+prune tests


What does this do?

It excludes the test files from the package shipped via PyPI or conda. We can remove this line and include the tests again... But if people run the tests, they will use the GitHub repo and not the one from a package manager.

rosecers · 2023-03-28T20:02:10Z

docs/Makefile

Is this file no longer needed? Why?

tox -e docs will build the docs in an isolated environment and takes care of the OS specific stuff. So no, we do not need this anymore.

rosecers · 2023-03-28T20:02:15Z

docs/make.bat

Is this file no longer needed? Why?

Same as above.

docs/src/installation.rst

rosecers · 2023-03-28T20:04:39Z

src/skmatter/_selection.py

+        n_to_select is chosen. Otherwise will stop when the score falls below the
+        threshold. Stored in :py:attr:`self.score_threshold`.


Suggested change

-        n_to_select is chosen. Otherwise will stop when the score falls below the
-        threshold. Stored in :py:attr:`self.score_threshold`.
+        n_to_select is chosen. Otherwise will stop when the score falls below the threshold.
+        Stored in :py:attr:`self.score_threshold`.

Same here 88 characters per line.

rosecers · 2023-03-28T20:05:23Z

src/skmatter/_selection.py

@@ -84,7 +78,8 @@ class GreedySelector(SelectorMixin, MetaEstimatorMixin, BaseEstimator):
    X_selected_ : ndarray,
                  Matrix containing the selected samples or features, for use in fitting
    y_selected_ : ndarray,
-                  In sample selection, the matrix containing the selected targets, for use in fitting
+                  In sample selection, the matrix containing the selected targets, for
+                  use in fitting


There are small, unnecessary formatting changes throughout here.

Why unnecessary? There is a line width of 88 characters. This should be obeyed to please the linter.
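
For illustration, the effect of the 88-character limit on the docstring line in the diff above can be reproduced with a few lines of Python. This is only a sketch: `find_long_lines` and `MAX_LINE_LENGTH` are hypothetical names that are not part of skmatter, the 18-space indentation is an assumption read off the diff, and in the PR itself the check is done by the linter.

```python
MAX_LINE_LENGTH = 88  # line width discussed in this thread


def find_long_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) for every line longer than ``limit``."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


indent = " " * 18  # assumed docstring indentation, taken from the diff
text = "In sample selection, the matrix containing the selected targets, for"

before = f"{indent}{text} use in fitting"  # the removed line
after = f"{indent}{text}\n{indent}use in fitting"  # the two added lines

print(find_long_lines(before))  # [(1, 101)]
print(find_long_lines(after))  # []
```

Under these assumptions the removed line is 101 characters long, while both replacement lines fit within the limit.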

rosecers · 2023-03-28T20:05:42Z

src/skmatter/_selection.py

+                    "Cannot fit with warm_start=True without having been previously"
+                    " initialized."


88 characters per line.
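
As a side note, wrapping the message as in the diff above does not change the string that is raised: Python joins adjacent string literals when the source is parsed. A minimal sketch, where the variable name `message` is only illustrative:

```python
# Adjacent string literals are concatenated by Python, so splitting a long
# message over two source lines keeps the resulting string identical.
message = (
    "Cannot fit with warm_start=True without having been previously"
    " initialized."
)

# The message contains no line break; only the source code is wrapped.
assert "\n" not in message
assert message.endswith("having been previously initialized.")
```

Note the leading space in the second literal: without it the words "previously" and "initialized" would run together.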

rosecers · 2023-03-28T20:07:39Z

src/skmatter/decomposition/_kernel_pcovr.py

+        The negative loss is returned for easier use in sklearn pipelines, e.g., a
+        grid search, where methods named 'score' are meant to be maximized.


Will this render correctly?

Looks good to me.

rosecers · 2023-03-28T20:12:13Z

setup.py

Why is this file no longer needed? I'm not seeing how removing it is beneficial.

Our code is fully compatible with PEP 517 and PEP 660 🥳. We do not have to do any legacy builds, which would require a setup.py file: https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html

Removing this file also prevents users from running python setup.py install, which you should not do.

Removing the setup.py was also super weird for me the first time I saw it. But, it actually works and is the way one should do it if one can.

PicoCentauri · 2023-03-28T20:47:15Z

Going through, all looks generally fine, but I'm worried that the new linting doesn't look as nice. Can we set it up with the same rules as before?

Thanks. Black is exactly the same. The only thing which is slightly changed is how isort works. Packages are sorted according to: FUTURE, STDLIB, THIRDPARTY, FIRSTPARTY, LOCALFOLDER, which is the default. I think this order is much easier to read; especially the splitting between THIRDPARTY (numpy, sklearn) and FIRSTPARTY (skmatter).
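
To make the section order concrete, here is a small sketch that mimics it. It is not how isort is implemented: the `KNOWN` table is a hand-written, hypothetical classification, `section_of` and `sort_modules` are illustrative names, the fallback to THIRDPARTY is a choice made for this sketch, and isort's finer rules for ordering within a section are ignored.

```python
# Section order as quoted in the comment above.
SECTIONS = ("FUTURE", "STDLIB", "THIRDPARTY", "FIRSTPARTY", "LOCALFOLDER")

# Hypothetical, hand-written classification used only for this example.
KNOWN = {
    "__future__": "FUTURE",
    "os": "STDLIB",
    "numpy": "THIRDPARTY",
    "sklearn": "THIRDPARTY",
    "skmatter": "FIRSTPARTY",
}


def section_of(module):
    """Return the section for a module name; relative imports count as local."""
    if module.startswith("."):
        return "LOCALFOLDER"
    # In this sketch, anything not listed in KNOWN is treated as third party.
    return KNOWN.get(module.split(".")[0], "THIRDPARTY")


def sort_modules(modules):
    """Order module names by section first, then alphabetically."""
    return sorted(modules, key=lambda m: (SECTIONS.index(section_of(m)), m))


unsorted = ["skmatter.utils", "numpy", "._selection", "os", "sklearn.base", "__future__"]
print(sort_modules(unsorted))
```

With this input, the third-party modules (numpy, sklearn) end up in their own group ahead of the first-party skmatter import, which is the split described above.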

rosecers · 2023-03-29T16:56:05Z

So I get the 88 char/line thing from a code-formatting standpoint, but as a code reader, I find it very frustrating. Industry-standard is 80 -- any arguments for/against doing that instead?

PicoCentauri · 2023-03-29T17:57:19Z

I would stick with 88 characters. 88 chars/line is already the skmatter convention since we use black as the formatter. It just was not enforced for docstrings, and it was not (and still is not) applied to the documentation pages. We all have widescreen monitors. An additional eight characters per line leads to fewer line breaks and more readable code.

Also, I am not sure if 80 is still the industry standard. sklearn uses 88 (https://github.com/scikit-learn/scikit-learn/blob/70c489f1273a5ff877e61750b3a69590bc002b6f/pyproject.toml#L23) and even the great Linus Torvalds says that the time of 80 chars/line is over (https://lkml.org/lkml/2020/5/29/1038).

ceriottm · 2023-03-29T19:28:55Z

Ah, the good ole' Linus' rant. FWIW I'm all for longer lines.

rosecers · 2023-03-29T19:51:26Z

I’d advocate for even longer, ~100char!

PicoCentauri · 2023-03-29T20:15:12Z

Okay, since this might be a long and intense discussion, I suggest we keep 88 chars/line within this PR and open an issue about how we handle this in the future. One reason is that I would like to touch more files of the repo, but this PR is already huge and complex.

PicoCentauri force-pushed the tox branch 6 times, most recently from 6b00bbc to 98773cf Compare March 27, 2023 14:59

PicoCentauri mentioned this pull request Mar 28, 2023

Port examples to sphinx_gallery #170

Merged

PicoCentauri force-pushed the tox branch 4 times, most recently from fd944ec to e12ee3a Compare March 28, 2023 08:31

Modernize package structure

dc9c32b

PicoCentauri force-pushed the tox branch from e12ee3a to dc9c32b Compare March 28, 2023 08:38

PicoCentauri requested a review from Luthaf March 28, 2023 08:59

Luthaf reviewed Mar 28, 2023

Update with reviewers comments

77b209d

rosecers reviewed Mar 28, 2023

PicoCentauri added 2 commits March 28, 2023 23:06

Fix docs build and remove a superflous comment

cf7b87f

Update source installation instructions

7a2fc61

rosecers approved these changes Mar 30, 2023

PicoCentauri merged commit 16de441 into main Mar 30, 2023

PicoCentauri deleted the tox branch March 30, 2023 19:22

PicoCentauri mentioned this pull request Mar 30, 2023

What should be number of characters/line #177

Closed
