This repository has been archived by the owner on Nov 10, 2021. It is now read-only.

Commit

Whoop
KeironO committed Jul 17, 2019
1 parent 649ec7e commit 0c1f736
Showing 2 changed files with 49 additions and 2 deletions.
8 changes: 8 additions & 0 deletions README.md
@@ -12,6 +12,7 @@ This work is very much inspired by the methods detailed in [High-throughput, non
- Loading mass spectrometry files from mzML.
- Support for polarity switching.
- MAD-estimated infusion profiling.
- Assay-wide outlier spectrum detection.
- Spurious peak elimination.
- Spectrum export for direct dissemination using Metaboanalyst.
- Spectral binning.
@@ -117,6 +118,13 @@ If you're only using this pipeline to extract mass spectrum for Metaboanalyst,

That being said, this pipeline contains many of the preprocessing methods found in Metaboanalyst - so it may be easier for you to just use ours.

As a diagnostic measure, the TIC can provide an estimate of factors that may adversely affect the overall intensity count of a run. As a rule, it is common to remove spectra whose TIC deviates from the median TIC by more than 2 or 3 times the median absolute deviation (MAD). We can do this by calling the ```detect_outliers``` method:

```python
>>> speclist.detect_outliers(threshold=2, verbose=True)
Detected Outliers: outlier_one;outlier_two
```
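
The check itself can be illustrated without the library. The following is a minimal standalone sketch, not the pipeline's own code: the function name `mad_outlier_mask` and the TIC values are made up for illustration, and it assumes each spectrum's TIC is scored by a modified z-score (0.6745 times its absolute deviation from the median, divided by the MAD).

```python
from statistics import median


def mad_outlier_mask(tics, threshold=2.0):
    """Return True for each TIC to keep and False for each outlier.

    Hypothetical helper for illustration; it is not part of dimepy.
    """
    med = median(tics)
    deviations = [abs(t - med) for t in tics]
    mad = median(deviations)

    if mad == 0:
        # No spread around the median, so nothing can be scored: keep all.
        return [True] * len(tics)

    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [0.6745 * d / mad <= threshold for d in deviations]


# Made-up TIC values: four similar runs and one with a much higher TIC.
tics = [1.00e6, 1.05e6, 0.98e6, 1.02e6, 3.50e6]
print(mad_outlier_mask(tics, threshold=2.0))
# → [True, True, True, True, False]
```

With these values the median TIC is 1.02e6 and the MAD is 0.03e6, so only the last run scores above the threshold and is flagged.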

A common first step in the analysis of mass-spectrometry data is to bin the data to a given mass-to-charge (m/z) width. To do this for every ```Spectrum``` held within our ```SpectrumList``` object, simply apply the ```bin``` method:

```python
43 changes: 41 additions & 2 deletions dimepy/spectrumList.py
@@ -48,9 +48,48 @@ def append(self, spectrum: Spectrum):
        else:
            raise ValueError("SpectrumList only accepts Spectrum objects.")

    def detect_outliers(self, threshold: float = 1, verbose: bool = False):
        """
        Method to locate and remove outlier spectra using the MAD of the TIC.

        Arguments:
            threshold (float): Modified z-score above which a spectrum
                is treated as an outlier.
            verbose (bool): Print the identifiers of the removed spectra.
        """

        def _get_tics() -> np.ndarray:
            # The TIC of a spectrum is taken as the sum of its intensities.
            return np.array([np.sum(spec.intensities) for spec in self._list])

        def _calculate_mad(tics: np.ndarray) -> float:
            return np.median(np.abs(tics - np.median(tics)))

        def _get_mask(tics: np.ndarray, mad: float) -> np.ndarray:
            if mad == 0:
                # A zero MAD would divide by zero; keep every spectrum.
                return np.ones(len(tics), dtype=bool)

            diff = np.abs(tics - np.median(tics))

            # 0.6745 rescales the MAD so the score is comparable to a
            # z-score for normally distributed data.
            modified_z_score = 0.6745 * diff / mad

            return modified_z_score <= threshold

        tics = _get_tics()
        mad = _calculate_mad(tics)
        to_keep = _get_mask(tics, mad)

        if verbose:
            print("Detected Outliers: %s" % ";".join(
                spec.identifier
                for spec, keep in zip(self._list, to_keep) if not keep))

        # The stub `def outlier_detection(self): pass` is deleted by this commit.
        self._list = [spec for spec, keep in zip(self._list, to_keep) if keep]

    def bin(self, bin_width: float = 0.5, statistic: str = "mean"):
        """
