Merge development for v1.1.0 #28

Merged
merged 98 commits into from
Feb 18, 2024

Conversation

derb12
Owner

@derb12 derb12 commented Feb 18, 2024

Description

Merge the development branch for version 1.1.0, containing the new Baseline2d class and new algorithms, deprecations, and doc improvements.

Type of Pull Request

  • Bug Fix
  • New Feature
  • Miscellaneous Changes (refactor, code improvements, etc.)
  • Documentation or Example Programs

Pull Request Checklist

  • New code and/or documentation is valid for use with the BSD 3-clause license.
  • New code is fully documented with docstrings that follow Numpy style,
    if applicable.
  • New code follows PEP 8 standards as closely as possible, if applicable.
  • Added/updated tests and ensured they pass locally, if applicable.
  • Verified that documentation builds locally, if applicable.

derb12 added 30 commits April 1, 2023 20:41
Will require some fine tuning, but this is a good skeleton.
The penalized spline version of the MPLS algorithm.
Foolish error from switching branches without thinking.
This allows recreating the spline through scipy's BSpline. Mostly a convenience, but this allows a more public interface for it.
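As a rough illustration of what "recreating the spline through scipy's BSpline" can look like, the sketch below rebuilds a spline from its knots, coefficients, and degree. It uses only scipy (`splrep`, `splev`, `BSpline`); the data and variable names are made up for the example, and how pybaselines exposes these three pieces is not stated in this commit message.

```python
import numpy as np
from scipy.interpolate import BSpline, splev, splrep

# Hypothetical data, used only for illustration.
x = np.linspace(0, 10, 200)
y = np.sin(x) + 0.1 * x

# splrep returns the knots, coefficients, and degree that define the spline.
knots, coefficients, degree = splrep(x, y, k=3, s=0.5)

# Rebuild the same spline as a BSpline object from those three pieces.
spline = BSpline(knots, coefficients, degree)
rebuilt = spline(x)

# Reference evaluation through the older FITPACK-style interface.
reference = splev(x, (knots, coefficients, degree))
```

Both evaluations should agree, which is the sense in which the stored pieces are enough to recreate the fit.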
Added a version of the rubberband baseline that allows fitting in segments to better handle concave data.
Changed `data` and `x_data` to `y_fit` and `x_fit` to make their usage clearer.
Was causing an unknown issue within GSAS, so removed the file. The pentapy solver can now be set using the Baseline attribute `pentapy_solver`, which is more useful anyway.
By calling setup_whittaker, the whittaker_system does not need to be set up each time if doing repeated function calls.
Scipy is going to deprecate its wavelet support in version 1.12, so added their cwt and ricker functions, with the appropriate attributions.
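For readers unfamiliar with the Ricker ("Mexican hat") wavelet, here is a minimal sketch of the standard formula. It is written from the textbook definition and is not the code that was copied into the project; the function and argument names are chosen for this example only.

```python
import numpy as np


def ricker_sketch(points, a):
    """Return a Ricker wavelet with `points` samples and width parameter `a`."""
    # Normalization constant of the standard Ricker wavelet.
    amplitude = 2 / (np.sqrt(3 * a) * np.pi**0.25)
    # Sample positions centered on zero.
    t = np.arange(points) - (points - 1) / 2
    return amplitude * (1 - (t / a) ** 2) * np.exp(-(t**2) / (2 * a**2))


wavelet = ricker_sketch(101, 4.0)
```

The wavelet is symmetric about its center, where it reaches its maximum, and dips below zero on either side.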
Also updated copyright year, changed status to stable, and fixed a typo.
Avoids having to use numpy's Polynomial to convert coefficients. The hope is to extend/modify this to work for 2d polynomials. Also about 2 times as fast as the previous implementation, although the absolute times are small regardless.
One warning was from the sphinx extension autosection label documenting all section headers which caused CHANGELOG and changes to have the same headers. The second warning was from documenting pentapy_solver as both an attribute and method for Baseline. Also updated scipy's doc reference url.
All algorithms stopped using it in version 1.0.0, but the internal code was not removed before.
The function will be deprecated in numpy in version 2.0.0, so use scipy's implementation instead.
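A minimal sketch of this kind of swap, assuming the function in question is `numpy.trapz` (the commit message above does not name it; a later commit in this pull request mentions `numpy.trapz`). `scipy.integrate.trapezoid` is the scipy equivalent; the data here is made up.

```python
import numpy as np
from scipy.integrate import trapezoid

# Hypothetical data: integrate y = x**2 over [0, 2] on a coarse grid.
x = np.linspace(0.0, 2.0, 5)
y = x**2

# Trapezoidal-rule integration through scipy instead of numpy.
area = trapezoid(y, x=x)
```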
Also added a test that compares PSplines and Whittaker systems in the special case where they should be equal.
Makes adding weights for Whittaker smoothing much simpler, similar to what was done in 2D.
The nopython keyword is always specified within the actual code, so the warning was only being raised from the test cases and can be safely ignored.
Also remove the ability to input a whittaker or pspline object into their respective smoothers in utils since it would just confuse users.
Now most optimizers do not do two unnecessary sorts.
Currently implemented versions are mor, imor, rolling_ball, and tophat. Note that this is all experimental at this point. Design decision: no functional interface will be provided for the 2D versions.
Implemented the poly, modpoly, imodpoly, penalized_poly, and goldindec algorithms (didn't have to actually do anything outside of the polynomial setup). Had to skip several validations, so need to add that back in later.
Someone should give Paul Eilers the Nobel prize, dude is the GOAT. On a more serious note, the internals of the PSpline2D class will most likely change, but the external calls within the baseline algorithms should remain the same. Implemented the 2D versions of irsqr, pspline_asls, pspline_airpls, pspline_arpls, pspline_iarpls, and pspline_psalsa.
Using eigendecomposition to solve 2D Whittaker baselines reduces the computation time significantly, and the computation time scales roughly linearly with data size, since the number of eigenvalues depends only on baseline curvature and does not increase with data size. Still need to add tests, plus explanations of the eigendecomposition in the docstrings and the main docs. Also renamed solve_pspline to just solve.
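To give a feel for the idea, here is a simplified sketch in one dimension with unit weights; it is not the 2D implementation from this commit. It shows that the Whittaker system (I + lam * D'D) z = y can be solved through the eigendecomposition of the penalty D'D, and that keeping only the eigenvectors with the smallest eigenvalues gives a reduced, approximate solution. All names and values are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)
lam = 10.0

# Second-order difference matrix D with shape (n - 2, n), and penalty D'D.
D = np.diff(np.eye(n), n=2, axis=0)
penalty = D.T @ D

# Direct solve of (I + lam * D'D) z = y.
z_direct = np.linalg.solve(np.eye(n) + lam * penalty, y)

# Same solve through the eigendecomposition D'D = U diag(s) U'.
# eigh returns the eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(penalty)
z_eigen = eigenvectors @ ((eigenvectors.T @ y) / (1 + lam * eigenvalues))

# Reduced version: keep only the smoothest eigenvectors (an approximation).
num_eigens = 10
U = eigenvectors[:, :num_eigens]
z_reduced = U @ ((U.T @ y) / (1 + lam * eigenvalues[:num_eigens]))
```

With all eigenvectors the two solutions match; the reduced version trades a small approximation error for a system whose size is set by `num_eigens` rather than by the number of data points.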
Makes the sparse solver just a tad faster to better represent it. Also mention that the sparse solution could be sped up with CHOLMOD in case others are interested. Also re-enable autosectionlabel since the branch rebase must have undone that.
Also added a sanity check test for the 1D case to ensure the banded multiplication is the same as the matrix multiplication.
Will use scipy's sparse arrays if the installed scipy version is 1.12 or newer. Thank goodness for unit tests catching the change in matrix multiplication.
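A minimal sketch of version-gated construction, under the assumption that a simple major/minor check is enough; the helper name `make_csr` is hypothetical and not taken from the project. The relevant behavior change is that `*` means matrix multiplication for scipy's sparse matrices but elementwise multiplication for sparse arrays, so the sketch uses `@`, which means matrix multiplication for both.

```python
import numpy as np
import scipy
from scipy import sparse

# Compare only the major and minor version numbers.
major, minor = (int(part) for part in scipy.__version__.split(".")[:2])
USE_SPARSE_ARRAYS = (major, minor) >= (1, 12)


def make_csr(dense):
    """Build a CSR sparse array on newer scipy, else a CSR sparse matrix."""
    if USE_SPARSE_ARRAYS:
        return sparse.csr_array(dense)
    return sparse.csr_matrix(dense)


A = make_csr(np.array([[1.0, 2.0], [0.0, 3.0]]))
B = make_csr(np.array([[4.0, 0.0], [1.0, 5.0]]))

# `@` is matrix multiplication for both sparse arrays and sparse matrices.
product = (A @ B).toarray()
```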
Put all metadata into pyproject.toml, so setup.py and setup.cfg can be removed. Switched from flake8 to ruff and from bump2version to bump-my-version. Will need to update pinned requirements once ready to release. Made a separate CI job for linting so that linting failures now show up instead of being ignored.
Ensures that any new changes in numpy or scipy will be caught early. Fixed one last place where numpy.trapz was used. Side note: pybaselines works with numpy 2.0, which is a huge relief.
Simplifies the checking of 2d variables. Also added checks to ensure polynomial orders are never negative.
No longer mention editable installs in the contribution guide since it requires passing additional options to setuptools in order to work, which could be confusing for new contributors. Update min setuptools for building to allow editable installs, and update ruff settings.
Also updated pinned dependencies.
Also updated the logo_plot helper script to use the Baseline class.
Increased min numpy version from 1.18 to 1.20 to allow using dtype within numpy.concatenate.
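For reference, a small sketch of the `dtype` keyword of `numpy.concatenate`, which is available from numpy 1.20 onward; the arrays are made up for the example.

```python
import numpy as np

ints = np.array([1, 2, 3])
floats = np.array([0.5, 1.5], dtype=np.float32)

# Request the output dtype directly instead of casting after concatenating.
combined = np.concatenate((ints, floats), dtype=np.float64)
```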
ENH: Add 2D versions of several baseline algorithms
Also updated the min python to 3.8 in a few missed places
Improves convergence, removes the dependence of the results on the number of bins, and speeds up the calculation by ~2-10x. Deprecated the num_bins kwarg.
This way the full build is reproducible and will not cause issues when rebuilding docs in a year.
Also allow returning the effective degrees of freedom in the params dictionary.
No longer show the functional interface in order to not encourage its usage.
Also finalized remaining bits in the docs.
Also updated ruff's rules.
@derb12 derb12 merged commit a70bedd into main Feb 18, 2024
8 checks passed