Releases: HegemanLab/w4mclassfilter_galaxy_wrapper
W4M Data Subset tool for Galaxy
Description
The W4M Data Subset tool selects subsets of samples, features, or data values and conditions the data for further analysis.
- The tool takes as input the dataMatrix, sampleMetadata, and variableMetadata datasets produced by W4M's XCMS and CAMERA [Kuhl et al., 2012] tools.
- The tool produces the same trio of output datasets, modified as described below.
This tool can perform several operations to reduce the number of samples or features to be analyzed (although this should be done only in a statistically sound manner consistent with the nature of the experiment):
- Sample filtering: Samples may be selected by designating a "sample class" column in sampleMetadata and specifying criteria to include or exclude samples based on the contents of this column.
- Feature filtering: Features may be selected by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Intensity filtering: To exclude minimal features from consideration, a lower bound may be specified for the maximum intensity for a feature across all samples (i.e., for a row in dataMatrix).
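As a minimal sketch of the intensity filter described above (not the tool's actual implementation), the following Python keeps only features whose maximum intensity across all samples reaches a lower bound. The function name, the dictionary layout, the feature names, and the choice of an inclusive bound are all assumptions made for illustration.

```python
def filter_by_row_maximum(data_matrix, lower_bound):
    """Keep features whose maximum intensity across samples is >= lower_bound.

    data_matrix maps a feature name to its list of intensities (one per sample).
    None marks a missing value and is ignored when taking the maximum.
    The inclusive comparison is an assumption, not documented tool behavior.
    """
    kept = {}
    for feature, intensities in data_matrix.items():
        observed = [x for x in intensities if x is not None]
        if observed and max(observed) >= lower_bound:
            kept[feature] = intensities
    return kept


# Hypothetical data: three features measured in three samples.
data_matrix = {
    "M100T50": [1200.0, 950.0, 1100.0],
    "M200T75": [12.0, None, 8.0],
    "M300T90": [40.0, 5000.0, 35.0],
}
filtered = filter_by_row_maximum(data_matrix, lower_bound=1000.0)
print(sorted(filtered))
```

In this toy data, M200T75 never exceeds 12.0 in any sample, so it is the only feature dropped.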
This tool also conditions data for statistical analysis:
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features are ordered consistently in variableMetadata, sampleMetadata, and dataMatrix. (The columns for sorting variableMetadata or sampleMetadata may be specified.)
- The names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
- If desired, the values in the dataMatrix may be log-transformed.
- Negative intensities become missing values (before missing-value replacement is performed).
- If desired, each missing value in dataMatrix may be replaced with zero or the median value observed for the corresponding feature.
- If desired, a "center" for each treatment can be computed in lieu of the samples for that treatment.
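The order of two of the conditioning steps above (negative intensities become missing values first, and missing values are replaced afterwards) can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the tool's code: the function name, the use of `None` for a missing value, and the `replacement` argument are hypothetical.

```python
from statistics import median


def condition_row(intensities, replacement="median"):
    """Condition the intensities of one feature (one row of dataMatrix).

    Step 1: negative intensities become missing values (None).
    Step 2: each missing value is replaced with zero or with the median
            of the values observed for this feature.
    """
    cleaned = [None if (x is not None and x < 0) else x for x in intensities]
    observed = [x for x in cleaned if x is not None]
    if replacement == "zero":
        fill = 0.0
    elif replacement == "median":
        # If nothing was observed, there is no median; leave values missing.
        fill = median(observed) if observed else None
    else:
        raise ValueError("replacement must be 'zero' or 'median'")
    return [fill if x is None else x for x in cleaned]


row = [10.0, -3.0, None, 30.0, 20.0]
print(condition_row(row, "median"))
print(condition_row(row, "zero"))
```

Because the negative value is set to missing before the median is taken, it does not pull the replacement value downward.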
This tool may be applied several times sequentially, which may be useful for:
- analyzing subsets of samples for progressively smaller sets of treatment levels, or
- choosing subsets of samples or features based on criteria in columns of sampleMetadata or variableMetadata, respectively.
Changes in version 0.98.19
This version in Galaxy toolshed
https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/aae9fa9a7d4d
New features
- Bug fix: "medoid computation aborts when trt is numeric or some trts have one replicate" HegemanLab/w4mclassfilter#8.
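To illustrate the two situations named in this bug report, here is a generic medoid sketch in Python. It is not the w4mclassfilter implementation: the function name and data layout are hypothetical, and Euclidean distance is an assumed metric. Treatment labels are converted to strings so that numeric labels group like text labels, and a treatment with a single replicate simply returns that replicate.

```python
import math
from collections import defaultdict


def medoid_per_treatment(samples, treatments):
    """Return, for each treatment, the name of its medoid sample.

    samples:    dict of sample name -> list of intensities
    treatments: dict of sample name -> treatment label (text or numeric)
    The medoid is the sample with the smallest summed distance to the
    other samples of the same treatment (Euclidean distance assumed).
    """
    groups = defaultdict(list)
    for name in samples:
        # str() lets numeric treatment labels be handled like text labels.
        groups[str(treatments[name])].append(name)

    centers = {}
    for label, names in groups.items():
        if len(names) == 1:
            # A single replicate is trivially its own medoid.
            centers[label] = names[0]
            continue
        centers[label] = min(
            names,
            key=lambda a: sum(math.dist(samples[a], samples[b]) for b in names),
        )
    return centers


# Hypothetical data: treatment 1 has three replicates, treatment 2 has one.
samples = {"s1": [1.0, 1.0], "s2": [2.0, 2.0], "s3": [9.0, 9.0], "s4": [5.0, 5.0]}
treatments = {"s1": 1, "s2": 1, "s3": 1, "s4": 2}
print(medoid_per_treatment(samples, treatments))
```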
Internal modifications
- Use v0.98.19 of the w4mclassfilter bioconda package which was built with R 4.0.3.
- Use v4.0.3 of the r-base conda-forge package.
Changes in version 0.98.18
This version in Galaxy toolshed
https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/87ec0d3c2266
New features
- Enhancement: Added option "compute center for each treatment" HegemanLab/w4mclassfilter#6.
- Enhancement: Added option "enable sorting on multiple columns of metadata" HegemanLab/w4mclassfilter#7.
- Enhancement: Added option "always treat negative intensities as missing values" #7.
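Sorting on multiple metadata columns can be pictured as sorting on a tuple of column values. The Python below is a minimal sketch under that assumption; the function name, the `treatment` and `batch` columns, and the sample names are hypothetical and are not taken from the tool.

```python
def sort_metadata(rows, sort_columns):
    """Sort metadata rows by several columns, in the order given.

    rows:         list of dicts, one per sample (or per feature)
    sort_columns: column names; earlier names take precedence
    """
    return sorted(rows, key=lambda row: tuple(row[c] for c in sort_columns))


sample_metadata = [
    {"sampleMetadata": "s3", "treatment": "ctrl", "batch": 2},
    {"sampleMetadata": "s1", "treatment": "drug", "batch": 1},
    {"sampleMetadata": "s2", "treatment": "ctrl", "batch": 1},
]
ordered = sort_metadata(sample_metadata, ["treatment", "batch"])
sample_order = [row["sampleMetadata"] for row in ordered]
print(sample_order)
```

The resulting sample order would then be applied to the corresponding rows or columns of dataMatrix so that all three datasets stay consistent.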
Internal modifications
- Use v0.98.18 of the w4mclassfilter bioconda package.
W4M Data Subset tool for Galaxy
Description
The W4M Data Subset tool selects subsets of samples, features, or data values for further analysis.
- The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by the XCMS [Smith et al., 2006, http://dx.doi.org/10.1021/ac051437y] and CAMERA [Kuhl et al., 2012, http://dx.doi.org/10.1021/ac202450g] tools of Workflow4metabolomics (W4m), http://workflow4metabolomics.org [Giacomoni et al., 2014, https://doi.org/10.1021%2Fac051437y].
- The tool produces as output the same trio of datasets, modified as described below.
This tool performs several operations to address data issues that may impede downstream statistical analysis:
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features have consistent order in variableMetadata, sampleMetadata, and dataMatrix. (The column for sorting variableMetadata or sampleMetadata may be specified.)
- By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
- Negative intensities are replaced by zeros.
- If desired, the values in the dataMatrix may be log-transformed.
- If desired, each missing value in dataMatrix is replaced with zero or the median value observed for the corresponding feature.
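The zero-variance step in the list above can be sketched as follows. This is an illustration only: the function name, the single pass (features first, then samples), and the assumption that no values are missing at this point are choices made for the example rather than documented behavior of the tool.

```python
from statistics import pvariance


def drop_zero_variance(data_matrix, sample_names):
    """Drop features, then samples, whose intensities do not vary.

    data_matrix:  dict of feature name -> list of intensities, one per sample
    sample_names: sample names in the same order as the intensity lists
    Assumes no missing values remain in data_matrix.
    """
    # Features (rows) with zero variance across samples are removed first.
    rows = {f: v for f, v in data_matrix.items() if pvariance(v) > 0}
    # Samples (columns) with zero variance across the remaining features.
    keep = [
        j
        for j in range(len(sample_names))
        if rows and pvariance([v[j] for v in rows.values()]) > 0
    ]
    kept_samples = [sample_names[j] for j in keep]
    kept_rows = {f: [v[j] for j in keep] for f, v in rows.items()}
    return kept_samples, kept_rows


data_matrix = {
    "f1": [5.0, 5.0, 5.0],  # constant across samples
    "f2": [1.0, 2.0, 3.0],
    "f3": [4.0, 2.0, 6.0],
}
kept_samples, kept_rows = drop_zero_variance(data_matrix, ["s1", "s2", "s3"])
print(kept_samples, kept_rows)
```

Here f1 is constant and is dropped; once it is gone, sample s2 has the same value (2.0) for every remaining feature and is dropped as well.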
This tool also can perform several operations to reduce the number of samples or features to be analyzed:
- Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
- Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).
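A range filter on a variableMetadata column might look like the Python sketch below. The function name, the `mz` and `rt` column names, the feature names, and the inclusive bounds are assumptions for illustration; the tool's own filter syntax is not shown here.

```python
def filter_features_by_range(variable_metadata, column, minimum=None, maximum=None):
    """Keep features whose value in `column` lies within [minimum, maximum].

    Either bound may be omitted (None) to leave that side unbounded.
    Inclusive bounds are an assumption made for this sketch.
    """
    kept = []
    for row in variable_metadata:
        value = row[column]
        if minimum is not None and value < minimum:
            continue
        if maximum is not None and value > maximum:
            continue
        kept.append(row)
    return kept


variable_metadata = [
    {"variableMetadata": "M100T50", "mz": 100.05, "rt": 50.2},
    {"variableMetadata": "M200T75", "mz": 200.10, "rt": 75.9},
    {"variableMetadata": "M300T90", "mz": 300.20, "rt": 90.4},
]
by_rt = filter_features_by_range(variable_metadata, "rt", minimum=60.0, maximum=100.0)
by_mz = filter_features_by_range(variable_metadata, "mz", maximum=250.0)
print([r["variableMetadata"] for r in by_rt])
print([r["variableMetadata"] for r in by_mz])
```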
The W4M Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.
Changes in version 0.98.14
This version in Galaxy toolshed
https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/c18040b6e8b9
New features
- Enhancement #6 - "Provide sort options for features and samples".
Internal modifications
- Use v0.98.14 of the w4mclassfilter bioconda package.
W4M Data Subset tool for Galaxy
Description
The W4M Data Subset tool selects subsets of samples, features, or data values for further analysis.
- The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by the XCMS [Smith et al., 2006, http://dx.doi.org/10.1021/ac051437y] and CAMERA [Kuhl et al., 2012, http://dx.doi.org/10.1021/ac202450g] tools of Workflow4metabolomics (W4m), http://workflow4metabolomics.org [Giacomoni et al., 2014, https://doi.org/10.1021%2Fac051437y].
- The tool produces as output the same trio of datasets, modified as described below.
This tool performs several operations to address data issues that may impede downstream statistical analysis:
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
- By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
- Negative intensities are replaced by zeros.
- If desired, the values in the dataMatrix may be log-transformed.
- If desired, each missing value in dataMatrix is replaced with zero or the median value observed for the corresponding feature.
This tool also can perform several operations to reduce the number of samples or features to be analyzed:
- Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
- Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).
The W4M Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.
Changes in version 0.98.13
(Note that version number 0.98.12 was skipped)
This version in Galaxy toolshed
https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/38f509903a0b
New features
- Support enhancement HegemanLab/w4mclassfilter#4 - "add and test no-imputation and centering-imputation functions":
  - Support no imputation.
  - Support imputing missing feature-intensities as median intensity for the corresponding feature.
Internal modifications
- Use v0.98.13 of the w4mclassfilter bioconda package.
W4m Data Subset tool for Galaxy
Description
The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.
- The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by the XCMS [Smith et al., 2006, http://dx.doi.org/10.1021/ac051437y] and CAMERA [Kuhl et al., 2012, http://dx.doi.org/10.1021/ac202450g] tools of Workflow4metabolomics (W4m), http://workflow4metabolomics.org [Giacomoni et al., 2014, https://doi.org/10.1021%2Fac051437y].
- The tool produces as output the same trio of datasets, modified as described below.
This tool performs several operations to address data issues that may impede downstream statistical analysis:
- Missing values in dataMatrix are imputed to zero.
- The dataMatrix values may be log-transformed if desired.
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
- By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
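The elimination of unmatched samples and features, followed by alphabetical sorting, can be sketched as a set intersection and a sort. The Python below is only an illustration: the function name and the nested-dictionary layout are hypothetical and do not reflect how the tool stores its data.

```python
def align_datasets(sample_metadata, variable_metadata, data_matrix):
    """Keep only samples and features present in both metadata and dataMatrix.

    sample_metadata:   dict of sample name -> metadata dict
    variable_metadata: dict of feature name -> metadata dict
    data_matrix:       dict of feature name -> dict of sample name -> intensity
    Returns alphabetically sorted sample and feature names plus the
    data matrix restricted to them.
    """
    matrix_samples = set()
    for row in data_matrix.values():
        matrix_samples.update(row)
    samples = sorted(set(sample_metadata) & matrix_samples)
    features = sorted(set(variable_metadata) & set(data_matrix))
    aligned = {f: {s: data_matrix[f].get(s) for s in samples} for f in features}
    return samples, features, aligned


# Hypothetical data: s9 has no intensities, s3 and f7 have no metadata.
sample_metadata = {"s2": {}, "s1": {}, "s9": {}}
variable_metadata = {"f2": {}, "f1": {}}
data_matrix = {
    "f1": {"s1": 1.0, "s2": 2.0, "s3": 3.0},
    "f2": {"s1": 4.0, "s2": 5.0, "s3": 6.0},
    "f7": {"s1": 7.0, "s2": 8.0, "s3": 9.0},
}
samples, features, aligned = align_datasets(sample_metadata, variable_metadata, data_matrix)
print(samples, features)
```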
This tool also can perform several operations to reduce the number of samples or features to be analyzed:
- Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
- Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).
The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.
Changes in version 0.98.11
This version in Galaxy toolshed
https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/9f5c0e23c205
New features
- none
Internal modifications
- Use v0.98.8 of the w4mclassfilter bioconda package.
W4m Data Subset tool for Galaxy
Description
The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.
- The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by the XCMS [Smith et al., 2006, http://dx.doi.org/10.1021/ac051437y] and CAMERA [Kuhl et al., 2012, http://dx.doi.org/10.1021/ac202450g] tools of Workflow4metabolomics (W4m), http://workflow4metabolomics.org [Giacomoni et al., 2014, https://doi.org/10.1021%2Fac051437y].
- The tool produces as output the same trio of datasets, modified as follows:
This tool performs several operations to address data issues that may impede downstream statistical analysis:
- Missing values in dataMatrix are imputed to zero.
- The dataMatrix values may be log-transformed if desired.
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
- By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
This tool also can perform several operations to reduce the number of samples or features to be analyzed:
- Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
- Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).
The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.
Changes in version 0.98.10
This version in Galaxy toolshed
https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/649cb1bafd3e
New features
- None
Internal modifications
- Quality-assurance improvements - Changes to repository layout for IUC conformance and automated Planemo testing on Travis CI.
W4m Data Subset tool for Galaxy
Uploaded to Galaxy toolshed at https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/1ced8b5dfa3e
Description
The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.
- The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by the XCMS [Smith et al., 2006, http://dx.doi.org/10.1021/ac051437y] and CAMERA [Kuhl et al., 2012, http://dx.doi.org/10.1021/ac202450g] tools of Workflow4metabolomics (W4m), http://workflow4metabolomics.org [Giacomoni et al., 2014, https://doi.org/10.1021%2Fac051437y].
- The tool produces as output the same trio of datasets, modified as follows:
This tool performs several operations to address data issues that may impede downstream statistical analysis:
- Missing values in dataMatrix are imputed to zero.
- The dataMatrix values may be log-transformed if desired.
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
- By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
This tool also can perform several operations to reduce the number of samples or features to be analyzed:
- Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
- Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).
The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.
Changes in version 0.98.9
New features
- None
Internal modifications
- Added missing support for hyphen character in regular expressions.
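As an illustration of why hyphens matter when sample classes are matched with regular expressions, the Python sketch below filters samples whose class names contain hyphens. It is not the tool's matching code: the function name, the `include` flag, the class names, and the use of full-string matching are assumptions.

```python
import re


def filter_samples_by_class(sample_classes, patterns, include=False):
    """Select samples by matching their class against regular expressions.

    sample_classes: dict of sample name -> class label
    patterns:       regular expressions; a class matches if any pattern
                    matches the whole label (full-match is an assumption)
    include:        True keeps matching samples, False removes them
    """
    compiled = [re.compile(p) for p in patterns]
    selected = []
    for sample, label in sample_classes.items():
        matched = any(rx.fullmatch(label) for rx in compiled)
        if matched == include:
            selected.append(sample)
    return selected


sample_classes = {"s1": "wild-type", "s2": "knock-out", "s3": "blank", "s4": "wild-type"}
without_blanks = filter_samples_by_class(sample_classes, ["blank"], include=False)
only_wild_type = filter_samples_by_class(sample_classes, [r"wild-.*"], include=True)
print(without_blanks)
print(only_wild_type)
```

Outside a character class, a hyphen in a Python regular expression is a literal character, so `wild-.*` matches "wild-type" without any escaping.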
W4m Data Subset tool for Galaxy
Uploaded to Galaxy toolshed at https://toolshed.g2.bx.psu.edu/repository?repository_id=5f24951d82ab40fa&changeset_revision=d5cf23369d12
Description
The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.
- The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by W4m's XCMS [Smith et al., 2006] and CAMERA [Kuhl et al., 2012] tools.
- The tool produces as output the same trio of datasets, modified as follows:
This tool performs several operations to address data issues that may impede downstream statistical analysis:
- Missing values in dataMatrix are imputed to zero.
- The dataMatrix values may be log-transformed if desired.
- Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
- Features that are missing from either variableMetadata or dataMatrix are eliminated.
- Features and samples that have zero variance are eliminated.
- Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
- By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
This tool also can perform several operations to reduce the number of samples or features to be analyzed:
- Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
- Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
- Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).
The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.
Changes in version 0.98.8
New features
- The tool now appears in Galaxy with a new, more representative name: "W4m Data Subset". (Earlier versions of this tool appeared in Galaxy with the name "Sample Subset".)
- Option was added to log-transform data matrix values.
- Output datasets are named in conformance with the W4m convention of appending the name of each preprocessing tool to the input dataset name.
- Superfluous "Column that names the sample" input parameter was eliminated.
- Some documentation was updated or clarified.
Internal modifications
- None
w4mclassfilter galaxy wrapper
Uploaded to Galaxy toolshed at
https://toolshed.g2.bx.psu.edu/repository?repository_id=5f24951d82ab40fa&changeset_revision=582a8a42a93b
CHANGES IN VERSION 0.98.7
NEW FEATURES
- First column of output variableMetadata (that has feature names) now is always named variableMetadata
- First column of output sampleMetadata (that has sample names) now is always named sampleMetadata
INTERNAL MODIFICATIONS
- Now uses w4mclassfilter R package v0.98.7.
w4mclassfilter_galaxy_wrapper release v0.98.6
This release is what was uploaded to the toolshed at https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/38ccf6722d54
CHANGES IN VERSION 0.98.6
NEW FEATURES
- Added support for filtering out features whose attributes fall outside specified ranges. For more detail, see "Variable-range filters" above.
INTERNAL MODIFICATIONS
- Now uses w4mclassfilter R package v0.98.6.
- Now sorts sample names and feature names in output files because some statistical tools expect the same order in dataMatrix row and column names as in the corresponding metadata files.