Baseline and noise correction
The method doBaselinePrf in the Finnee class corrects a full dataset, recorded in profile mode, for baseline drift and background noise. Working on profile-mode data avoids potential errors introduced by centroid algorithms and removes the need to reconstruct profiles from extracted ion profiles or single ion profiles. The corrected data are 10 to 100 times smaller.
Assuming a Finnee object called myFinnee, the method is called with
myFinnee = myFinnee.doBaselinePrf(dts, par4bas)
where dts is the index of the dataset of interest (usually 1) and par4bas is a structure that contains all the necessary information. Several steps are needed to obtain the par4bas structure.
With a profile dataset, a profile is obtained simply by reading the intensity at a given m/z value in each scan. This approach is less error-prone than using a centroid dataset because the data are used as recorded by the instrument (no centroid algorithms, binning or other processing). However, the amount of information is also far larger. For example, with the [Urine CE-TOFMS dataset](https://data.mendeley.com/datasets/cb4hv9cp2c/2), the m/z axis contains 60512 values, which is the number of profiles that would have to be corrected. While this is possible, it is also lengthy, so a procedure for selecting the profiles to be corrected is valuable. With Finnee2016, the class Options4bslCorPrf is used to select the optimal parameters for the doBaselinePrf method.
An Options4bslCorPrf object is created with
myO4B = Options4bslCorPrf(dtsIn);
where dtsIn is a dataset, for example
myO4B = Options4bslCorPrf(myFinnee.Datasets{1})
creates the object from the first dataset of the object myFinnee. After pressing Enter, the frequency profile is displayed.
![frequency profile](https://github.com/glerny/img4wiki/blob/master/profilePlot.gif)
This figure is built using the frequency ion spectrum ([FIS](https://github.com/glerny/Finnee2016/wiki/Definition-of-terms)), that is, the number of nonnull values recorded at each m/z value during the whole separation. The previous figure is a histogram representation of the FIS that allows selecting (i) profiles at m/z values with very few nonnull values, which are assumed to be noise, and (ii) profiles at m/z values with very few null values, which are assumed to be, or to contain, background ions. After (i) and (ii) are chosen, a new figure is created to check the relevance of the choice.
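As a rough illustration of how an FIS can be computed, the following MATLAB sketch counts the nonnull values in each m/z channel of a small matrix. The matrix `X`, its layout (one row per m/z channel, one column per scan), the expression of the FIS as a percentage of scans and all variable names are assumptions made for this example; they do not reflect how Finnee stores the data or builds its histogram internally.

```matlab
% Hypothetical intensity matrix: rows are m/z channels, columns are scans.
% This layout is an assumption for illustration only.
X = [0 0 0 0 5 0 0 0;    % mostly null values: probably noise
     3 4 3 5 4 3 4 3;    % never null: probably a background ion
     0 0 6 9 7 0 0 0;    % peak-like profile
     0 2 0 0 0 0 0 0];   % mostly null values: probably noise

% Frequency ion spectrum: number of nonnull values in each m/z channel.
fis = sum(X ~= 0, 2);               % [1; 8; 3; 1]

% Same quantity expressed as a percentage of the number of scans.
fisPct = 100 * fis / size(X, 2);    % [12.5; 100; 37.5; 12.5]

% Histogram of the FIS in 10 % wide bins (counts only, no figure).
counts = histcounts(fisPct, 0:10:100);

disp([fis fisPct]);
disp(counts);
```

Channels falling in the leftmost bins are candidates for threshold (i), and channels in the rightmost bins are candidates for threshold (ii).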
![Selecting profiles](https://github.com/glerny/img4wiki/blob/master/selectingprofile.gif)
The top panel of this figure is the noise as estimated using the m/z channels below the threshold defined in (i), the middle panel is the BPP calculated using only the m/z channels above the threshold defined in (ii), and the bottom panel is the BPP calculated using the remaining channels. If the thresholds are well chosen, the baseline in the bottom panel should be roughly equal to the background level. The number of profiles that will have to be corrected is indicated in the second panel.
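The three groups of channels described above can be sketched as follows. This is a minimal illustration, not Finnee's implementation: it assumes that BPP denotes the base peak profile (the maximum intensity in each scan), that the thresholds are percentages of scans with inclusive limits, and that a per-scan maximum is an acceptable stand-in for the noise trace of the top panel. The matrix `X` and the values of `Fmin` and `Fmax` are hypothetical.

```matlab
% Hypothetical intensity matrix: rows are m/z channels, columns are scans.
X = [0 0 0 0 5 0 0 0;
     3 4 3 5 4 3 4 3;
     0 0 6 9 7 0 0 0;
     0 2 0 0 0 0 0 0];

% Percentage of nonnull values in each m/z channel.
fisPct = 100 * sum(X ~= 0, 2) / size(X, 2);

Fmin = 20;   % hypothetical threshold (i): at or below, treated as noise
Fmax = 90;   % hypothetical threshold (ii): at or above, treated as background

isNoise      = fisPct <= Fmin;
isBackground = fisPct >= Fmax;
isRemaining  = ~isNoise & ~isBackground;

% Per-scan maximum over each group of channels.
noiseTrace    = max(X(isNoise, :), [], 1);       % stand-in for the top panel
bppBackground = max(X(isBackground, :), [], 1);  % middle panel
bppRemaining  = max(X(isRemaining, :), [], 1);   % bottom panel

% Number of channels in each group.
groupSizes = [nnz(isNoise) nnz(isBackground) nnz(isRemaining)];
disp(groupSizes);
```

With well-chosen thresholds, `bppRemaining` should sit close to the background level outside the peaks, which is what the bottom panel is used to check.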
The baseline function, its parameters and the parameters for background noise removal can be optimised using the setBslParameters method. Entering
myO4B.setBslParameters
will launch a graphical user interface (GUI) to select the baseline correction method:
![setBslParameters](https://github.com/glerny/img4wiki/blob/master/select%20baseline.gif)
The top panel is dedicated to the baseline function and its parameters, the bottom panel to the noise removal parameters. Profiles are selected randomly; it is recommended to verify that the selected parameters correctly handle at least ten different profiles. Different baseline functions can be tested (None, PF (polynomial fitting), arPLS, arPLS2 (modified arPLS)...). Noise removal is performed using a moving window whose size is determined by the WDZ parameter. For each point, if all intensities within a square of side 2*WDZ + 1 centred on the point of interest are lower than 3 times the noise threshold, the intensity of that point is arbitrarily set to zero. The initial value of the noise threshold is estimated using the m/z channels below threshold (i) but can be modified. On exiting this figure, the par4bas structure is created in the workspace; par4bas summarises all the required parameters.
The following command performs the baseline correction (depending on the size of the file, the baseline function and the number of profiles to correct, this can take time):
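The moving-window noise removal rule described above can be sketched in a few lines of MATLAB. This is an illustration under stated assumptions rather than the code used by doBaselinePrf: the matrix `X`, the values of `noise` and `wdz`, and the truncation of the window at the matrix edges are all choices made for this example. The sketch relies on `movmax`, which is available from MATLAB R2016a.

```matlab
% Hypothetical intensity matrix (m/z channels x scans) with one 2 x 2 peak
% and one isolated intense point.
X = [1  2  1  0  1  2;
     2 40 35  1  0  1;
     1 38 42  2  1  0;
     0  1  2  1 33  1;
     1  0  1  2  1  0];

noise = 10;           % hypothetical noise threshold
wdz   = 1;            % hypothetical half-width of the window
win   = 2 * wdz + 1;  % side of the square window

% Maximum over the square neighbourhood of each point, computed as two
% successive 1-D moving maxima. movmax truncates the window at the edges.
localMax = movmax(movmax(X, win, 1), win, 2);

% A point is zeroed when every intensity in its window is below 3 * noise,
% that is, when the local maximum is below that limit.
Xclean = X;
Xclean(localMax < 3 * noise) = 0;

disp(Xclean);
```

Points close to an intense signal are kept even when their own intensity is low, so peak edges are preserved while regions containing only low-level noise are set to zero.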
myFinnee = myFinnee.doBaselinePrf(1, par4bas);
Replace 1 with the index of the dataset to correct (for example, 2 for dataset 2). Upon completion, a new dataset is created within the Finnee object.
length(myFinnee.Datasets)
ans = 2
The process can be repeated with different baseline parameters; in that case, a new dataset is created for each trial. As an example, using the [Urine CE-TOFMS dataset](https://data.mendeley.com/datasets/cb4hv9cp2c/2) in profile mode with the following parameters for par4bas:
.type = 'arPLS2'
.parameter = 1000000
.obj
...
.Fmin = 10
.Fmax = 70
...
.noise = 10
.wdz = 3
The full correction took 1.5 min. While the original profile data file was 217 MB, the corrected file is 10 MB, roughly 20 times smaller. Finnee2016 makes it easy to compare the results:
myFinnee.Datasets{1}.BPP.plot
![BPP1](https://github.com/glerny/img4wiki/blob/master/BPP1.gif)
myFinnee.Datasets{2}.BPP.plot
![BPP2](https://github.com/glerny/img4wiki/blob/master/BPP2.gif)
myFinnee.Datasets{1}.getSpectra([13.77 14.03]).plot
![spectrum1](https://github.com/glerny/img4wiki/blob/master/spectr1.gif)
myFinnee.Datasets{2}.getSpectra([13.77 14.03]).plot
![spectrum2](https://github.com/glerny/img4wiki/blob/master/spectr2.gif)