Merge pull request #3 from derb12/development
Merge Development for v0.3.0
derb12 committed Apr 29, 2021
2 parents 7b89e5c + 306b9ff commit 9377e3b
Showing 40 changed files with 2,037 additions and 525 deletions.
2 changes: 1 addition & 1 deletion readthedocs.yaml → .readthedocs.yaml
@@ -15,6 +15,6 @@ formats:

# Additional requirements for the documentation
python:
version: 3.7
version: 3.8
install:
- requirements: requirements/requirements-documentation.txt
71 changes: 71 additions & 0 deletions CHANGELOG.rst
@@ -2,9 +2,80 @@
Changelog
=========

Version 0.3.0 (2021-04-29)
--------------------------

This is a minor version with new features, bug fixes, deprecations,
and documentation improvements.

New Features
~~~~~~~~~~~~

* Added the small-window moving average (swima) baseline to pybaselines.window,
which iteratively smooths the data with a moving average to eliminate peaks
and obtain the baseline.
* Added the rolling_ball function to pybaselines.morphological, which applies
a minimum and then maximum moving window, and subsequently smooths the result,
giving a baseline that resembles rolling a ball across the data. Also allows
giving an array of half-window values to allow the ball to change size as it
moves across the data.
* Added the adaptive_minmax algorithm to pybaselines.optimizers, which uses the
modpoly or imodpoly functions and performs polynomial fits with two different
orders and two different weighting schemes and then uses the maximum values of
all the baselines.
* Added the Peaked Signal's Asymmetric Least Squares Algorithm (psalsa)
function to pybaselines.whittaker, which uses exponentially decaying weighting
to better fit noisy data.
* The imodpoly and loess functions in pybaselines.polynomial now use `num_std`
to specify the number of standard deviations to use when thresholding.
* The pybaselines.polynomial.penalized_poly function now allows weights to be used.
Also made the default threshold value scale with the data better.
* Added higher order filters for pybaselines.window.snip to allow for more
complicated baselines. Also allow inputting a sequence of ints for
`max_half_window` to better fit asymmetric peaks.
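
The rolling_ball entry above (a minimum window, then a maximum window, then smoothing) can be sketched with NumPy alone. This is an illustrative sketch and not the pybaselines implementation: the helper names ``moving_window`` and ``rolling_ball_sketch``, the edge padding, and the moving-average smoother are assumptions made for the example, and ``sliding_window_view`` requires NumPy 1.20 or later.

```python
import numpy as np


def moving_window(values, half_window, func):
    """Apply func over a centered window of 2 * half_window + 1 points."""
    # Pad with the edge values so the output has the same length as the input.
    padded = np.pad(values, half_window, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, 2 * half_window + 1
    )
    return func(windows, axis=1)


def rolling_ball_sketch(y, half_window, smooth_half_window):
    """Moving minimum, then moving maximum, then moving-average smoothing."""
    y = np.asarray(y, dtype=float)
    eroded = moving_window(y, half_window, np.min)       # minimum window
    opened = moving_window(eroded, half_window, np.max)  # maximum window
    if smooth_half_window > 0:
        opened = moving_window(opened, smooth_half_window, np.mean)
    return opened


# Hypothetical data: a linear baseline plus one Gaussian peak.
x = np.linspace(0, 100, 501)
y = 2 + 0.01 * x + 5 * np.exp(-0.5 * ((x - 50) / 2) ** 2)
baseline = rolling_ball_sketch(y, half_window=40, smooth_half_window=10)
```

Both half windows are given in numbers of points rather than in the units of ``x``, and the minimum/maximum window must be wider than the peaks for them to be removed.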

Bug Fixes
~~~~~~~~~

* Fixed a bug that disallowed even values for morphological half windows;
the restriction is only needed for the full windows, not the half windows.
* Fixed the thresholding for pybaselines.polynomial.imodpoly, which was incorrectly
not adding the standard deviation to the baseline when thresholding.
* Fixed weighting for pybaselines.whittaker.airpls so that weights no longer
get values greater than 1.
* Removed the append and prepend keywords for np.diff in the
pybaselines.morphological.mpls function, since the keywords
were not added until numpy version 1.16, which is higher than
the minimum stated version for pybaselines.
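
For readers who need the same compatibility with NumPy versions older than 1.16, the effect of the ``prepend`` and ``append`` keywords of ``np.diff`` can be reproduced for 1D data by extending the array first. This is a minimal sketch of that general workaround; the helper name ``diff_with_edges`` is hypothetical, and the sketch does not claim to show how the mpls function itself was changed.

```python
import numpy as np


def diff_with_edges(values, prepend, append):
    """Mimic np.diff(values, prepend=..., append=...) for 1D input."""
    # Extend the data manually instead of using the keywords
    # that were only added in NumPy 1.16.
    extended = np.concatenate(([prepend], values, [append]))
    return np.diff(extended)


y = np.array([1.0, 4.0, 9.0, 16.0])
# Repeating the edge values gives zero differences at both ends.
result = diff_with_edges(y, prepend=y[0], append=y[-1])
# result -> array([0., 3., 5., 7., 0.])
```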

Other Changes
~~~~~~~~~~~~~

* Allow utils.pad_edges to work with a pad_length of 0 (no padding).
* Added a 'min_half_window' parameter for pybaselines.morphological.optimize_window
so that small window sizes can be skipped to speed up the calculation.
* Changed the default method from 'aspls' to 'asls' for optimizers.optimize_extended_range.

Deprecations/Breaking Changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Removed the 'smooth' keyword argument for pybaselines.window.snip. Smoothing is
now performed if the given smooth half window is greater than 0.
* pybaselines.polynomial.loess no longer has an `include_stdev` keyword argument.
Equivalent behavior can be obtained by setting `num_std` to 0.

Documentation/Examples
~~~~~~~~~~~~~~~~~~~~~~

* Updated the documentation to include simple explanations for some techniques.


Version 0.2.0 (2021-04-02)
--------------------------

This is a minor version with new features, bug fixes, deprecations,
and documentation improvements.

New Features
~~~~~~~~~~~~

69 changes: 69 additions & 0 deletions LICENSES_bundled.txt
@@ -0,0 +1,69 @@
The pybaselines repository and source distributions contain code adapted from compatibly licensed sources,
which are listed below.


Source: ported MATLAB code from https://www.mathworks.com/matlabcentral/fileexchange/27429-background-correction
(accessed March 18, 2021)
Function: pybaselines.polynomial.penalized_poly
License: 2-clause BSD

Copyright (c) 2012, Vincent Mazet
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in
the documentation and/or other materials provided with the distribution

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.


Source: https://gist.github.com/agramfort/850437 (accessed March 25, 2021)
Function: pybaselines.polynomial.loess
License: 3-clause BSD

# Authors: Alexandre Gramfort <alexandre.gramfort@telecom-paristech.fr>
#
# License: BSD (3-clause)
Copyright (c) 2015, Alexandre Gramfort
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -1,5 +1,6 @@
include LICENSE.txt
include README.rst
include LICENSES_bundled.txt

exclude readthedocs.yaml
exclude .gitattributes
65 changes: 42 additions & 23 deletions README.rst
@@ -34,10 +34,14 @@ pybaselines is a collection of baseline algorithms for fitting experimental data
Introduction
------------

pybaselines provides different techniques for fitting baselines to experimental data.
pybaselines provides many different algorithms for fitting baselines to data from
experimental techniques such as Raman, FTIR, NMR, XRD, PIXE, etc. The aim of
the project is to provide a semi-unified API to allow quickly testing and comparing
multiple baseline algorithms to find the best one for a set of data.

Baseline fitting techniques are grouped accordingly (note: when a method
is labelled as 'improved', that is the method's name, not editorialization):
pybaselines has 25+ baseline algorithms. Baseline fitting techniques are grouped
accordingly (note: when a method is labelled as 'improved', that is the method's
name, not editorialization):

a) Polynomial (pybaselines.polynomial)

@@ -51,11 +55,12 @@ b) Whittaker-smoothing-based techniques (pybaselines.whittaker)

1) asls (Asymmetric Least Squares)
2) iasls (Improved Asymmetric Least Squares)
3) airpls (Adaptive iteratively reweighted penalized least squares)
4) arpls (Asymmetrically reweighted penalized least squares)
5) drpls (Doubly reweighted penalized least squares)
6) iarpls (Improved Asymmetrically reweighted penalized least squares)
7) aspls (Adaptive smoothness penalized least squares)
3) airpls (Adaptive Iteratively Reweighted Penalized Least Squares)
4) arpls (Asymmetrically Reweighted Penalized Least Squares)
5) drpls (Doubly Reweighted Penalized Least Squares)
6) iarpls (Improved Asymmetrically Reweighted Penalized Least Squares)
7) aspls (Adaptive Smoothness Penalized Least Squares)
8) psalsa (Peaked Signal's Asymmetric Least Squares Algorithm)

c) Morphological (pybaselines.morphological)

@@ -64,16 +69,19 @@ c) Morphological (pybaselines.morphological)
3) imor (Improved Morphological)
4) mormol (Morphological and Mollified Baseline)
5) amormol (Averaging Morphological and Mollified Baseline)
6) rolling_ball (Rolling Ball Baseline)

d) Window-based (pybaselines.window)

1) noise_median (Noise Median method)
2) snip (Statistics-sensitive Non-linear Iterative Peak-clipping)
3) swima (Small-Window Moving Average)

e) Optimizers (pybaselines.optimizers)

1) collab_pls (Collaborative Penalized Least Squares)
2) optimize_extended_range
3) adaptive_minmax (Adaptive MinMax)

f) Manual methods (pybaselines.manual)

@@ -86,14 +94,15 @@ Installation
Dependencies
~~~~~~~~~~~~

pybaselines requires `Python <https://python.org>`_ version 3.6 or later and the following libraries:
pybaselines requires `Python <https://python.org>`_ version 3.6 or later
and the following libraries:

* `NumPy <https://numpy.org>`_ (>= 1.9)
* `SciPy <https://www.scipy.org/scipylib/index.html>`_


All of the required libraries should be automatically installed when installing pybaselines
using either of the two installation methods below.
All of the required libraries should be automatically installed when
installing pybaselines using either of the two installation methods below.


Stable Release
@@ -138,8 +147,14 @@ Quick Start

To use the various functions in pybaselines, simply input the measured
data and any required parameters. All baseline functions in pybaselines
will output two items: the calculated baseline and a dictionary of parameters
that can be helpful for reusing the functions.
will output two items: a numpy array of the calculated baseline and a
dictionary of parameters that can be helpful for reusing the functions.

For more details on each baseline algorithm, refer to the `algorithms section`_ of
pybaselines's documentation.

.. _algorithms section: https://pybaselines.readthedocs.io/en/latest/algorithms/index.html


A simple example is shown below.

@@ -149,7 +164,7 @@ A simple example is shown below.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(100, 4200, 2000)
x = np.linspace(100, 4200, 1000)
# a measured signal containing several Gaussian peaks
signal = (
pybaselines.utils.gaussian(x, 2, 700, 50)
@@ -158,20 +173,23 @@ A simple example is shown below.
+ pybaselines.utils.gaussian(x, 4, 2500, 50)
+ pybaselines.utils.gaussian(x, 7, 3300, 100)
)
# baseline is a polynomial plus a broad gaussian
true_baseline = (
10 + 0.001 * x # polynomial baseline
+ pybaselines.utils.gaussian(x, 6, 2000, 2000) # gaussian baseline
10 + 0.001 * x
+ pybaselines.utils.gaussian(x, 6, 2000, 2000)
)
noise = np.random.default_rng(0).normal(0, 0.2, x.size)
noise = np.random.default_rng(1).normal(0, 0.2, x.size)
y = signal + true_baseline + noise
bkg_1 = pybaselines.polynomial.modpoly(y, x, poly_order=3)[0]
bkg_2 = pybaselines.whittaker.asls(y, lam=1e8, p=0.01)[0]
bkg_3 = pybaselines.morphological.imor(y, half_window=50)[0]
bkg_4 = pybaselines.window.snip(y, max_half_window=70, decreasing=True, smooth=True)[0]
bkg_2 = pybaselines.whittaker.asls(y, lam=1e7, p=0.01)[0]
bkg_3 = pybaselines.morphological.imor(y, half_window=25)[0]
bkg_4 = pybaselines.window.snip(
y, max_half_window=40, decreasing=True, smooth_half_window=1
)[0]
plt.plot(x, y, label='raw data')
plt.plot(x, y, label='raw data', lw=1.5)
plt.plot(x, true_baseline, lw=3, label='true baseline')
plt.plot(x, bkg_1, '--', label='modpoly')
plt.plot(x, bkg_2, '--', label='asls')
@@ -192,8 +210,9 @@ The above code will produce the image shown below.
Contributing
------------

Contributions are welcomed and greatly appreciated. For information on submitting bug reports,
pull requests, or general feedback, please refer to the `contributing guide`_.
Contributions are welcomed and greatly appreciated. For information on
submitting bug reports, pull requests, or general feedback, please refer
to the `contributing guide`_.

.. _contributing guide: https://github.com/derb12/pybaselines/tree/main/docs/contributing.rst

3 changes: 3 additions & 0 deletions docs/algorithms/index.rst
@@ -26,6 +26,9 @@ are not fully optimized for the data).
:align: center
:alt: window baselines

.. image:: ../images/optimizers.jpg
:align: center
:alt: optimizer baselines

.. toctree::
:maxdepth: 2
9 changes: 9 additions & 0 deletions docs/algorithms/morphological.rst
@@ -13,6 +13,12 @@ include dilation, erosion, opening, and closing. Similar to the algorithms in
:mod:`pybaselines.window`, morphological operators use moving windows and compute
the maximum, minimum, or a combination of the two within each window.

.. note::
All morphological algorithms use a `half_window` parameter to define the size
of the window used for the morphological operators. `half_window` is index-based,
rather than based on the units of the data, so proper conversions must be done
by the user to get the desired window size.
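
As a rough illustration of the conversion mentioned in the note, the sketch below turns a window width given in the units of the x-data into an index-based `half_window`. The helper name ``half_window_from_width`` is hypothetical and not part of pybaselines, and the calculation assumes the x-values are sorted and evenly spaced.

```python
import numpy as np


def half_window_from_width(x, width):
    """Convert a full window width in x units to an index-based half window."""
    # Assumes evenly spaced, sorted x-values.
    spacing = (x[-1] - x[0]) / (len(x) - 1)
    # A full window of 2 * half_window + 1 points spans
    # 2 * half_window * spacing in x units.
    return max(1, int(round(width / (2 * spacing))))


x = np.linspace(100, 4200, 1000)
# A window roughly 200 x-units wide, expressed as a number of points.
half_window = half_window_from_width(x, width=200)
```

The resulting integer could then be passed as the `half_window` argument of a morphological function.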


Algorithms
----------
@@ -31,3 +37,6 @@ mormol (Morphological and Mollified Baseline)

amormol (Averaging Morphological and Mollified Baseline)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

rolling_ball (Rolling Ball)
~~~~~~~~~~~~~~~~~~~~~~~~~~~
2 changes: 2 additions & 0 deletions docs/algorithms/optimizers.rst
@@ -21,4 +21,6 @@ Penalized Least Squares (erPLS) method, but extends its usage to all
Whittaker-smoothing-based algorithms and polynomial algorithms.


adaptive_minmax (Adaptive MinMax)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

13 changes: 7 additions & 6 deletions docs/algorithms/polynomial.rst
@@ -28,7 +28,7 @@ However, since only the baseline of the data is desired, the least-squares
approach must be modified. For polynomial-based algorithms, this is done
by 1) only fitting the data in regions where there is only baseline (termed
selective masking), 2) modifying the y-values being fit each iteration, termed
thresholding, or 3) applying custom weights.
thresholding, or 3) penalizing outliers.

Selective Masking
~~~~~~~~~~~~~~~~~
@@ -56,12 +56,13 @@ Thresholding
The algorithms in pybaselines that use thresholding are :func:`.modpoly`,
:func:`.imodpoly`, and :func:`.loess` (if `use_threshold` is True).
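
The general idea of thresholding, modifying the y-values being fit each iteration, can be sketched as follows. This is a simplified illustration in the spirit of a modpoly-style fit and not the pybaselines code: the function name ``threshold_polyfit_sketch``, its parameters, and the stopping rule are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def threshold_polyfit_sketch(x, y, poly_order=2, max_iter=100, tol=1e-3):
    """Iteratively fit a polynomial, clipping the data to each new fit."""
    y_fit = np.asarray(y, dtype=float).copy()
    baseline = np.zeros_like(y_fit)
    for _ in range(max_iter):
        coefs = P.polyfit(x, y_fit, poly_order)
        new_baseline = P.polyval(x, coefs)
        # Thresholding: points above the fit (peaks) are replaced by the fit,
        # so they pull the next fit upward less and less.
        y_fit = np.minimum(y_fit, new_baseline)
        change = np.linalg.norm(new_baseline - baseline) / max(
            np.linalg.norm(new_baseline), 1e-12
        )
        baseline = new_baseline
        if change < tol:
            break
    return baseline


# Hypothetical data: a quadratic baseline plus one narrow Gaussian peak.
x = np.linspace(-1, 1, 500)
true_baseline = 1 + 0.5 * x + 0.3 * x**2
y = true_baseline + 4 * np.exp(-0.5 * ((x - 0.2) / 0.03) ** 2)
baseline = threshold_polyfit_sketch(x, y, poly_order=2)
```

Compared with a single least-squares fit of the raw data, the clipped fit is pulled upward much less by the peak.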

Custom Weighting
~~~~~~~~~~~~~~~~
Penalizing Outliers
~~~~~~~~~~~~~~~~~~~


The algorithms in pybaselines that use custom weighting are
:func:`.penalized_poly`, and :func:`.loess` (if `use_threshold` is False).
The algorithms in pybaselines that penalize outliers are
:func:`.penalized_poly`, which incorporates the penalty directly into the
minimized cost function, and :func:`.loess` (if `use_threshold` is False),
which incorporates penalties by applying lower weights to outliers.


Algorithms