
Releases: HegemanLab/w4mclassfilter_galaxy_wrapper

W4M Data Subset tool for Galaxy

11 Mar 22:40

Description

The W4M Data Subset tool selects subsets of samples, features, or data values and conditions the data for further analysis.

  • The tool takes as input the dataMatrix, sampleMetadata, and variableMetadata datasets produced by W4M's XCMS and CAMERA [Kuhl et al., 2012] tools.
  • The tool produces the same trio of output datasets, modified as described below.

This tool can perform several operations to reduce the number of samples or features to be analyzed (although this should be done only in a statistically sound manner consistent with the nature of the experiment):

  • Sample filtering: Samples may be selected by designating a "sample class" column in sampleMetadata and specifying criteria to include or exclude samples based on the contents of this column.
  • Feature filtering: Features may be selected by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Intensity filtering: To exclude minimal features from consideration, a lower bound may be specified for the maximum intensity for a feature across all samples (i.e., for a row in dataMatrix).
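
The three filters above can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the toy data, the function names, the `retention_time` column, the use of full-match regular expressions for class names, the inclusive bounds, and the choice to take each row maximum over the retained samples are all assumptions made for the example.

```python
import re

# Hypothetical toy data. Rows of data_matrix are features; columns follow `samples`.
samples = ["s1", "s2", "s3", "s4"]
sample_class = {"s1": "ctrl", "s2": "ctrl", "s3": "trt-A", "s4": "blank"}
features = ["f1", "f2", "f3"]
retention_time = {"f1": 30.0, "f2": 250.0, "f3": 900.0}  # one variableMetadata column
data_matrix = {
    "f1": [10.0, 12.0, 900.0, 0.0],
    "f2": [5.0, 7.0, 6.0, 1.0],
    "f3": [1500.0, 1400.0, 2000.0, 3.0],
}


def filter_samples(names, classes, patterns, include):
    """Keep (include=True) or drop (include=False) samples whose class matches a pattern."""
    def matches(name):
        return any(re.fullmatch(p, classes[name]) for p in patterns)
    return [n for n in names if matches(n) == include]


def filter_by_range(names, column, lo=None, hi=None):
    """Keep features whose metadata value lies within [lo, hi]; either bound may be omitted."""
    return [n for n in names
            if (lo is None or column[n] >= lo) and (hi is None or column[n] <= hi)]


def filter_by_max_intensity(names, matrix, sample_idx, lower_bound):
    """Keep features whose maximum intensity over the given samples reaches lower_bound."""
    return [n for n in names
            if max(matrix[n][i] for i in sample_idx) >= lower_bound]


# Sample filtering: exclude the "blank" class.
kept_samples = filter_samples(samples, sample_class, ["blank"], include=False)
kept_idx = [samples.index(s) for s in kept_samples]
# Feature filtering on a metadata column, then intensity filtering on row maxima.
kept_features = filter_by_range(features, retention_time, lo=20.0, hi=600.0)
kept_features = filter_by_max_intensity(kept_features, data_matrix, kept_idx, 100.0)
print(kept_samples, kept_features)
```

In this toy run, sample s4 is dropped as a blank, f3 falls outside the metadata range, and f2 never reaches the intensity bound, so only f1 survives.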

This tool also conditions data for statistical analysis:

  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are ordered consistently in variableMetadata, sampleMetadata, and dataMatrix. (The columns for sorting variableMetadata or sampleMetadata may be specified.)
  • The names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
  • If desired, the values in the dataMatrix may be log-transformed.
  • Negative intensities become missing values (before missing-value replacement is performed).
  • If desired, each missing value in dataMatrix may be replaced with zero or the median value observed for the corresponding feature.
  • If desired, a "center" for each treatment can be computed in lieu of the samples for that treatment.
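
As a rough sketch of a few of the conditioning steps above (negative intensities becoming missing values, optional replacement of missing values, and removal of zero-variance features), consider the following Python. The function names and toy rows are hypothetical, only features (not samples) are checked for variance, and applying the variance check after imputation is an assumption of this example rather than a statement about the tool's internal order.

```python
import math
from statistics import median


def condition_row(values, impute="median"):
    """Turn negatives into missing values, then optionally fill missing values."""
    # Negative intensities become missing (NaN) before any replacement happens.
    cleaned = [math.nan if v < 0 else v for v in values]
    if impute == "none":
        return cleaned
    observed = [v for v in cleaned if not math.isnan(v)]
    if impute == "zero":
        fill = 0.0
    else:
        # Median of the observed values for this feature; stays missing if none exist.
        fill = median(observed) if observed else math.nan
    return [fill if math.isnan(v) else v for v in cleaned]


def has_variance(values):
    """True when the non-missing values are not all identical."""
    finite = [v for v in values if not math.isnan(v)]
    return len(set(finite)) > 1


# Hypothetical dataMatrix rows (one list of sample intensities per feature).
rows = {
    "f1": [4.0, -1.0, 8.0],
    "f2": [3.0, 3.0, math.nan],
    "f3": [1.0, 2.0, 3.0],
}
conditioned = {name: condition_row(row) for name, row in rows.items()}
conditioned = {name: row for name, row in conditioned.items() if has_variance(row)}
print(conditioned)
```

Here the negative value in f1 is treated as missing and filled with the median of 4.0 and 8.0, while f2 becomes constant after imputation and is dropped.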

This tool may be applied several times sequentially, which may be useful for:

  • analyzing subsets of samples for progressively smaller sets of treatment levels, or
  • choosing subsets of samples or features based on criteria in columns of sampleMetadata or variableMetadata, respectively.

Changes in version 0.98.19

This version in Galaxy toolshed

https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/aae9fa9a7d4d

New features

Internal modifications

W4M Data Subset tool for Galaxy

03 Jan 16:21

Description

The W4M Data Subset tool selects subsets of samples, features, or data values and conditions the data for further analysis.

  • The tool takes as input the dataMatrix, sampleMetadata, and variableMetadata datasets produced by W4M's XCMS and CAMERA [Kuhl et al., 2012] tools.
  • The tool produces the same trio of output datasets, modified as described below.

This tool can perform several operations to reduce the number of samples or features to be analyzed (although this should be done only in a statistically sound manner consistent with the nature of the experiment):

  • Sample filtering: Samples may be selected by designating a "sample class" column in sampleMetadata and specifying criteria to include or exclude samples based on the contents of this column.
  • Feature filtering: Features may be selected by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Intensity filtering: To exclude minimal features from consideration, a lower bound may be specified for the maximum intensity for a feature across all samples (i.e., for a row in dataMatrix).

This tool also conditions data for statistical analysis:

  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are ordered consistently in variableMetadata, sampleMetadata, and dataMatrix. (The columns for sorting variableMetadata or sampleMetadata may be specified.)
  • The names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
  • If desired, the values in the dataMatrix may be log-transformed.
  • Negative intensities become missing values (before missing-value replacement is performed).
  • If desired, each missing value in dataMatrix may be replaced with zero or the median value observed for the corresponding feature.
  • If desired, a "center" for each treatment can be computed in lieu of the samples for that treatment.

This tool may be applied several times sequentially, which may be useful for:

  • analyzing subsets of samples for progressively smaller sets of treatment levels, or
  • choosing subsets of samples or features based on criteria in columns of sampleMetadata or variableMetadata, respectively.

Changes in version 0.98.18

This version in Galaxy toolshed

https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/87ec0d3c2266

New features

  • Enhancement: Added option "compute center for each treatment" HegemanLab/w4mclassfilter#6.
  • Enhancement: Added option "enable sorting on multiple columns of metadata" HegemanLab/w4mclassfilter#7.
  • Enhancement: Added option "always treat negative intensities as missing values" #7.
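
To illustrate what computing a "center" for each treatment might look like, the Python sketch below replaces the sample columns of each treatment with a single per-feature summary. The release notes do not say how the center is defined, so the per-feature median used here is an assumption, as are the function name and the toy data.

```python
from statistics import median


def treatment_centers(matrix, sample_names, classes):
    """Return treatment labels and, per feature, one median value per treatment."""
    labels = sorted(set(classes[s] for s in sample_names))
    centers = {}
    for feature, row in matrix.items():
        by_sample = dict(zip(sample_names, row))
        centers[feature] = [
            median(by_sample[s] for s in sample_names if classes[s] == label)
            for label in labels
        ]
    return labels, centers


# Hypothetical data: rows are features, columns follow `sample_names`.
sample_names = ["s1", "s2", "s3", "s4"]
classes = {"s1": "ctrl", "s2": "ctrl", "s3": "trt", "s4": "trt"}
matrix = {
    "f1": [1.0, 3.0, 10.0, 20.0],
    "f2": [5.0, 5.0, 7.0, 9.0],
}
labels, centers = treatment_centers(matrix, sample_names, classes)
print(labels, centers)
```

The output has one column per treatment instead of one per sample, which is the sense in which a center stands in lieu of the samples for that treatment.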

Internal modifications

W4M Data Subset tool for Galaxy

24 Oct 14:31

Description

The W4M Data Subset tool selects subsets of samples, features, or data values for further analysis.

This tool performs several operations to address data issues that may impede downstream statistical analysis:

  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features have consistent order in variableMetadata, sampleMetadata, and dataMatrix.
    • (The column for sorting variableMetadata or sampleMetadata may be specified.)
  • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
  • Negative intensities are replaced by zeros.
  • If desired, the values in the dataMatrix may be log-transformed.
  • If desired, each missing value in dataMatrix is replaced with zero or the median value observed for the corresponding feature.

This tool also can perform several operations to reduce the number of samples or features to be analyzed:

  • Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
  • Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).

The W4M Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

Changes in version 0.98.14

This version in Galaxy toolshed

https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/c18040b6e8b9

New features

  • Enhancement #6 - "Provide sort options for features and samples".

Internal modifications

W4M Data Subset tool for Galaxy

01 Oct 23:04

Description

The W4M Data Subset tool selects subsets of samples, features, or data values for further analysis.

This tool performs several operations to address data issues that may impede downstream statistical analysis:

  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
  • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".
  • Negative intensities are replaced by zeros.
  • If desired, the values in the dataMatrix may be log-transformed.
  • If desired, each missing value in dataMatrix is replaced with zero or the median value observed for the corresponding feature.

This tool also can perform several operations to reduce the number of samples or features to be analyzed:

  • Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
  • Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).

The W4M Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

Changes in version 0.98.13

(Note that version number 0.98.12 was skipped)

This version in Galaxy toolshed

https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/38f509903a0b

New features

  • Support enhancement HegemanLab/w4mclassfilter#4 - "add and test no-imputation and centering-imputation functions":
    • Support no imputation.
    • Support imputing missing feature-intensities as median intensity for the corresponding feature.

Internal modifications

W4m Data Subset tool for Galaxy

04 Sep 02:57

Description

The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.

This tool performs several operations to address data issues that may impede downstream statistical analysis:

  • Missing values in dataMatrix are imputed to zero.
  • The dataMatrix values may be log-transformed if desired.
  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
  • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".

This tool also can perform several operations to reduce the number of samples or features to be analyzed:

  • Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
  • Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).

The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

Changes in version 0.98.11

This version in Galaxy toolshed

https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/9f5c0e23c205

New features

  • none

Internal modifications

W4m Data Subset tool for Galaxy

09 Aug 17:45

Description

The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.

This tool performs several operations to address data issues that may impede downstream statistical analysis:

  • Missing values in dataMatrix are imputed to zero.
  • The dataMatrix values may be log-transformed if desired.
  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
  • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".

This tool also can perform several operations to reduce the number of samples or features to be analyzed:

  • Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
  • Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).

The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

Changes in version 0.98.10

This version in Galaxy toolshed

https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/649cb1bafd3e

New features

  • None

Internal modifications

  • Quality-assurance improvements - Changes to repository layout for IUC conformance and automated Planemo testing on Travis CI.

W4m Data Subset tool for Galaxy

28 Mar 16:32

Uploaded to Galaxy toolshed at https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/1ced8b5dfa3e

Description

The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.

This tool performs several operations to address data issues that may impede downstream statistical analysis:

  • Missing values in dataMatrix are imputed to zero.
  • The dataMatrix values may be log-transformed if desired.
  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
  • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".

This tool also can perform several operations to reduce the number of samples or features to be analyzed:

  • Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
  • Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).

The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

Changes in version 0.98.9

New features

  • None

Internal modifications

  • Added missing support for hyphen character in regular expressions.

W4m Data Subset tool for Galaxy

04 Mar 04:09

Uploaded to Galaxy toolshed at https://toolshed.g2.bx.psu.edu/repository?repository_id=5f24951d82ab40fa&changeset_revision=d5cf23369d12

Description

The W4m Data Subset tool selects subsets of samples, features, or data values for further analysis.

  • The tool takes as input the data matrix, sample metadata, and variable metadata datasets produced by W4m's XCMS [Smith et al., 2006] and CAMERA [Kuhl et al., 2012] tools.
  • The tool produces as output the same trio of datasets, modified as follows:

This tool performs several operations to address data issues that may impede downstream statistical analysis:

  • Missing values in dataMatrix are imputed to zero.
  • The dataMatrix values may be log-transformed if desired.
  • Samples that are missing from either sampleMetadata or dataMatrix are eliminated.
  • Features that are missing from either variableMetadata or dataMatrix are eliminated.
  • Features and samples that have zero variance are eliminated.
  • Samples and features are sorted alphabetically in rows and columns of variableMetadata, sampleMetadata, and dataMatrix.
  • By default, the names of the first columns of variableMetadata and sampleMetadata are set respectively to "variableMetadata" and "sampleMetadata".

This tool also can perform several operations to reduce the number of samples or features to be analyzed:

  • Samples may be eliminated by filtering on a designated “sample class” column in sampleMetadata.
  • Features may be eliminated by specifying minimum or maximum value (or both) allowable in columns of variableMetadata.
  • Features may be eliminated by specifying minimum or maximum intensity (or both) allowable in columns of dataMatrix for at least one sample for each feature (“range of row-maximum for each feature”).

The W4m Data Subset tool may be applied several times sequentially; for example, this may be useful for viewing clusters of progressively smaller subsets of samples.

Changes in version 0.98.8

New features

  • The tool now appears in Galaxy with a new, more representative name: "W4m Data Subset". (Earlier versions of this tool appeared in Galaxy with the name "Sample Subset".)
  • Option was added to log-transform data matrix values.
  • Output datasets are named in conformance with the W4m convention of appending the name of each preprocessing tool to the input dataset name.
  • Superfluous "Column that names the sample" input parameter was eliminated.
  • Some documentation was updated or clarified.

Internal modifications

  • None

w4mclassfilter galaxy wrapper

30 Jan 02:22

Uploaded to Galaxy toolshed at
https://toolshed.g2.bx.psu.edu/repository?repository_id=5f24951d82ab40fa&changeset_revision=582a8a42a93b

CHANGES IN VERSION 0.98.7

NEW FEATURES

  • First column of output variableMetadata (that has feature names) now is always named variableMetadata
  • First column of output sampleMetadata (that has sample names) now is always named sampleMetadata

INTERNAL MODIFICATIONS

  • Now uses w4mclassfilter R package v0.98.7.

w4mclassfilter_galaxy_wrapper release v0.98.6

15 Jan 19:37

Release is what was updated to the toolshed at https://toolshed.g2.bx.psu.edu/view/eschen42/w4mclassfilter/38ccf6722d54

CHANGES IN VERSION 0.98.6

NEW FEATURES

  • Added support for filtering out features whose attributes fall outside specified ranges. For more detail, see "Variable-range filters" above.

INTERNAL MODIFICATIONS

  • Now uses w4mclassfilter R package v0.98.6.
  • Now sorts sample names and feature names in output files because some statistical tools expect the same order in dataMatrix row and column names as in the corresponding metadata files.
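
The point about consistent ordering can be illustrated with a small Python sketch: feature names and sample names are sorted, and the rows and columns of the data matrix are permuted to match, so that the matrix lines up with metadata sorted the same way. The function name and toy values are hypothetical, and plain alphabetical sorting is assumed.

```python
def sort_consistently(feature_names, sample_names, matrix):
    """Sort names alphabetically and reorder matrix rows (features) and columns (samples) to match."""
    f_order = sorted(range(len(feature_names)), key=lambda i: feature_names[i])
    s_order = sorted(range(len(sample_names)), key=lambda j: sample_names[j])
    sorted_features = [feature_names[i] for i in f_order]
    sorted_samples = [sample_names[j] for j in s_order]
    sorted_matrix = [[matrix[i][j] for j in s_order] for i in f_order]
    return sorted_features, sorted_samples, sorted_matrix


# Hypothetical input in arbitrary order; matrix[i][j] is feature i in sample j.
feature_names = ["M200", "M100"]
sample_names = ["b", "a", "c"]
matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
sorted_features, sorted_samples, sorted_matrix = sort_consistently(
    feature_names, sample_names, matrix
)
print(sorted_features, sorted_samples, sorted_matrix)
```

Sorting the metadata tables by the same names then yields the matching row order that downstream statistical tools expect.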