r-hyperspec/pyperspec

Project Status: WIP – Initial development is in progress, but there has not yet been a stable release.

Python Package pyperspec

pyperspec is a Python package that simplifies the analysis and manipulation of hyperspectral datasets. It takes an object-oriented approach and offers an interface that feels familiar to Python users, i.e. close to libraries such as numpy, pandas, and scikit-learn.

It is heavily inspired by the R package hyperSpec and is part of r-hyperspec. The goal is to make working with hyperspectral data sets (i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each spectrum) more comfortable. The spectra can come from XRF, UV/VIS, fluorescence, AES, NIR, IR, Raman, NMR, MS, or other spectroscopy measurements.
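
To make "spectra with associated information" concrete, here is a minimal sketch of the unfolded layout using only numpy and pandas. It is not pyspc code: the map size, the random data, and the names `cube`, `spc`, and `meta` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical hyperspectral map: 4 x 3 pixels, 20 spectral points per pixel
n_x, n_y, n_wl = 4, 3, 20
rng = np.random.default_rng(0)
cube = rng.random((n_x, n_y, n_wl))
wl = np.linspace(1000, 2000, n_wl)  # shared wavelength/wavenumber axis

# Unfold: every pixel becomes one row, giving a (n_x * n_y, n_wl) matrix
spc = cube.reshape(n_x * n_y, n_wl)

# Per-spectrum information: pixel coordinates in the same row order as spc
xx, yy = np.meshgrid(np.arange(n_x), np.arange(n_y), indexing="ij")
meta = pd.DataFrame({"x": xx.ravel(), "y": yy.ravel()})

# Row i of spc is the spectrum measured at pixel (meta.x[i], meta.y[i])
i = 7
assert np.array_equal(spc[i], cube[meta.x[i], meta.y[i]])
```

The three pieces (a spectra matrix, one spectral axis, and one metadata row per spectrum) are the same three inputs used in the Quick Demo.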

NOTE: The main focus is not on algorithms, since many good packages already implement them, e.g. numpy, scipy, and pybaselines; more can be found in the FOSS For Spectroscopy list.

Rather, it provides a convenient interface to those algorithms and to other routine tasks.

For detailed information and documentation, please visit PyPerSpec Documentation.

Documentation

See the PyPerSpec Documentation.

Installation

Currently available only from GitHub:

pip install git+https://github.com/r-hyperspec/pyperspec.git

Quick Demo

import pyspc
import numpy as np
import pandas as pd

spc = np.random.rand(10, 20)  # Your spectra in unfolded structure (one spectrum per row)
wl = np.linspace(1000, 2000, 20)  # Array of wavelengths/wavenumbers
meta_data = pd.DataFrame({"group": ..., "date": ...,})  # Additional meta-data

# Create the object
sf = pyspc.SpectraFrame(spc, wl=wl, data=meta_data)

# Easy meta-data manipulation
sf.A
sf["A"]
sf["E"] = ...

# Easy data slicing/filtering, similar to hyperSpec
sf[:,:,500:1000] # Cut wavelength range to [500, 1000]
sf[:5,:,:5, True] # Use iloc style to get only first five spectra and first five wavenumbers
sf.query("group == 'Control'") # Get only 'Control' group

# Simple aggregation even with custom methods
sf[:,:,500:1000].mean(groupby=["group", "date"])
sf.query("group == 'Control'").apply(lambda x: np.sum(x**2), axis=0)

# Chaining methods
sf_processed = (
    sf.query("group == 'Control'")
    .mean(groupby="date")
    .smooth("savgol", window_length=7, polyorder=2)
    .sbaseline("rubberband")
    .normalize("area")
)

# Select 3 random spectra and plot them colored by "date"
sf.sample(3).plot(colors="date")

# Export to wide pandas DataFrame
sf.to_pandas()
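
For comparison, the wavelength cut, the group filter, and the group-wise mean from the demo can be sketched with plain numpy and pandas. This is only an illustration of the bookkeeping involved, not how pyspc implements these operations; the group labels, the cut limits, and all variable names are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
spc = rng.random((10, 20))        # 10 spectra, 20 points each
wl = np.linspace(1000, 2000, 20)  # wavelength/wavenumber axis
# Hypothetical labels, alternating between two groups
meta = pd.DataFrame({"group": ["Control", "Treated"] * 5})

# Cut the spectral range by value: a boolean mask on the axis,
# applied to both the matrix columns and the axis itself
keep = (wl >= 1200) & (wl <= 1600)
spc_cut, wl_cut = spc[:, keep], wl[keep]

# Filter spectra by metadata: a boolean mask on the rows
is_control = (meta["group"] == "Control").to_numpy()
control = spc_cut[is_control]

# Group-wise mean spectrum: one row per group, one column per axis value
wide = pd.DataFrame(spc_cut, columns=wl_cut)
group_means = wide.groupby(meta["group"]).mean()
```

Every step has to keep the matrix, the axis, and the metadata aligned by hand, which is the kind of routine work a single spectra object is meant to take over.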

Acknowledgments

Emanuel Institute of Biochemical Physics, RAS

Horizon 2020

IMAGE-IN

Chemometrix GmbH

Leibniz-IPHT

BMD Software