Composition · MLJBase.jl
Datasets · MLJBase.jl

Load it with DelimitedFiles and Tables:

using DelimitedFiles, Tables

data_raw, data_header = readdlm(fpath, ',', header=true)
data_table = Tables.table(data_raw; header=Symbol.(vec(data_header)))

Retrieve the conversions by printing the scientific type of each column:

using ScientificTypes

for (n, c) in pairs(Tables.columntable(data_table))
    println(":$n=>$(scitype_union(c)),")
end

Copy and paste the result into a coerce call:

data_table = coerce(data_table, ...)
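
For example, if the loop printed :Crim=>Continuous, and :Chas=>Count, and the Chas column should really be treated as categorical, the completed call might read as follows (the column names here are illustrative, not actual output):

data_table = coerce(data_table, :Crim => Continuous, :Chas => Multiclass)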
MLJBase.load_dataset (Method)

load_dataset(fpath, coercions)

Load one of the standard datasets (Boston, etc.), assuming the file is a comma-separated file with a header.

source
MLJBase.load_sunspots (Method)

Load a well-known sunspot time series (table with one column): https://www.sws.bom.gov.au/Educational/2/3/6

source
MLJBase.@load_ames (Macro)

Load the full version of the well-known Ames Housing task.

source
MLJBase.@load_boston (Macro)

Load a well-known public regression dataset with Continuous features.

source
MLJBase.@load_crabs (Macro)

Load a well-known crab classification dataset with nominal features.

source
MLJBase.@load_iris (Macro)

Load a well-known public classification task with nominal features.

source
MLJBase.@load_reduced_ames (Macro)

Load a reduced version of the well-known Ames Housing task.

source
MLJBase.@load_smarket (Macro)

Load the S&P Stock Market dataset, as used in An Introduction to Statistical Learning with Applications in R (https://rdrr.io/cran/ISLR/man/Smarket.html), by Witten et al. (2013), Springer-Verlag, New York.

source
MLJBase.@load_sunspots (Macro)

Load a well-known sunspot time series (single table with one column).

source

Synthetic datasets

MLJBase.x (Constant)
finalize_Xy(X, y, shuffle, as_table, eltype, rng; clf)

Internal function to finalize the make_* functions.

source
MLJBase.augment_X (Method)
augment_X(X, fit_intercept)

Given a matrix X, append a column of ones if fit_intercept is true. See make_regression.

source
MLJBase.make_blobs (Function)
X, y = make_blobs(n=100, p=2; kwargs...)

Generate Gaussian blobs for clustering and classification problems.

Return value

By default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • centers=3: either a number of centers or a c x p matrix with c pre-determined centers,

  • cluster_std=1.0: the standard deviation(s) of each blob,

  • center_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])
source
MLJBase.make_circles (Function)
X, y = make_circles(n=100; kwargs...)

Generate n labeled points close to two concentric circles for classification and clustering models.

Return value

By default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • noise=0: standard deviation of the Gaussian noise added to the data,

  • factor=0.8: ratio of the smaller radius over the larger one,

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_circles(100; noise=0.5, factor=0.3)
source
MLJBase.make_moons (Function)
    make_moons(n::Int=100; kwargs...)

Generates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.

Return value

By default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.

Keyword arguments

  • shuffle=true: whether to shuffle the resulting points,

  • noise=0.1: standard deviation of the Gaussian noise added to the data,

  • xshift=1.0: horizontal translation of the second center with respect to the first one.

  • yshift=0.3: vertical translation of the second center with respect to the first one.

  • eltype=Float64: machine type of points (any subtype of AbstractFloat).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

  • as_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type.

Example

X, y = make_moons(100; noise=0.5)
source
MLJBase.make_regression (Function)
make_regression(n, p; kwargs...)

Generate Gaussian input features and a linear response with Gaussian noise, for use with regression models.

Return value

By default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.

Keywords

  • intercept=true: Whether to generate data from a model with intercept.

  • n_targets=1: Number of columns in the target.

  • sparse=0: Proportion of the generating weight vector that is sparse.

  • noise=0.1: Standard deviation of the Gaussian noise added to the response (target).

  • outliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)

  • as_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.

  • eltype=Float64: Element type for X and y. Must subtype AbstractFloat.

  • binary=false: Whether the target should be binarized (via a sigmoid).

  • rng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).

Example

X, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)
source
MLJBase.outlify! (Method)

Add outliers to portion s of the given vector.

source
MLJBase.runif_ab (Method)
runif_ab(rng, n, p, a, b)

Internal function to generate n points in [a, b]ᵖ uniformly at random.

source
MLJBase.sigmoid (Method)

sigmoid(x)

Return the sigmoid computed in a numerically stable way:

σ(x) = 1/(1 + exp(-x))

source
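
For illustration only, a numerically stable definition typically branches on the sign of x so that exp is only ever called on non-positive arguments (a sketch, not necessarily MLJBase's exact implementation):

stable_sigmoid(x) = x >= 0 ? 1 / (1 + exp(-x)) : exp(x) / (1 + exp(x))

stable_sigmoid(-1000.0)  # 0.0, with exp never called on a large positive argument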
MLJBase.sparsify! (Method)
sparsify!(rng, θ, s)

Make portion s of vector θ exactly 0.

source

Utility functions

MLJBase.complement (Method)

complement(folds, i)

The complement of the ith fold of folds in the concatenation of all elements of folds. Here folds is a vector or tuple of integer vectors, typically representing row indices into a vector, matrix or table.

complement(([1, 2], [3,], [4, 5]), 2) # [1, 2, 4, 5]
source
MLJBase.corestrict (Method)
corestrict(X, folds, i)

The restriction of X, a vector, matrix or table, to the complement of the ith fold of folds, where folds is a tuple of vectors of row indices.

The method is curried, so that corestrict(folds, i) is the operator on data defined by corestrict(folds, i)(X) = corestrict(X, folds, i).

Example

folds = ([1, 2], [3, 4, 5],  [6,])
corestrict([:x1, :x2, :x3, :x4, :x5, :x6], folds, 2) # [:x1, :x2, :x6]
source
MLJBase.partition (Method)
partition(X, fractions...;
           shuffle=nothing,
           rng=Random.GLOBAL_RNG,
           stratify=nothing,
           multi=false)

 X, y = make_blobs() # a table and vector
 Xtrain, Xtest = partition(X, 0.8, stratify=y)
 
(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)

Keywords

  • shuffle=nothing: if set to true, shuffles the rows before taking fractions.

  • rng=Random.GLOBAL_RNG: specifies the random number generator to be used; can be an integer seed. If rng is specified and shuffle === nothing, then shuffle is interpreted as true.

  • stratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.

  • multi=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).

source
MLJBase.restrict (Method)
restrict(X, folds, i)

The restriction of X, a vector, matrix or table, to the ith fold of folds, where folds is a tuple of vectors of row indices.

The method is curried, so that restrict(folds, i) is the operator on data defined by restrict(folds, i)(X) = restrict(X, folds, i).

Example

folds = ([1, 2], [3, 4, 5],  [6,])
restrict([:x1, :x2, :x3, :x4, :x5, :x6], folds, 2) # [:x3, :x4, :x5]

See also corestrict

source
MLJBase.skipinvalid (Method)
skipinvalid(itr)

Return an iterator over the elements in itr skipping missing and NaN values. Behaviour is similar to skipmissing.

skipinvalid(A, B)

For vectors A and B of the same length, return a tuple of vectors (A[mask], B[mask]) where mask[i] is true if and only if A[i] and B[i] are both valid (non-missing and non-NaN). Can also be called on other iterators of matching length, such as arrays, but always returns a vector. Does not remove Missing from the element types if present in the original iterators.

source
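
A minimal sketch of both methods (expected results shown in comments):

A = [1.0, missing, 3.0, NaN]
B = [4.0, 5.0, missing, 7.0]

collect(MLJBase.skipinvalid(A))  # [1.0, 3.0]
MLJBase.skipinvalid(A, B)        # ([1.0], [4.0]); element types may still allow Missing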
MLJBase.unpack (Method)
unpack(table, f1, f2, ... fk;
        wrap_singles=false,
        shuffle=false,
        rng::Union{AbstractRNG,Int,Nothing}=nothing,
        coerce_options...)

 julia> W  # the column(s) left over
 2-element Vector{String}:
  "A"
- "B"

Whenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.

If coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.

If shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.

source
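
A self-contained sketch of the pattern behind the truncated example above (the column names are illustrative):

table = (x=[1, 2], y=['a', 'b'], z=["A", "B"])
X, Y, W = unpack(table, ==(:x), ==(:y))
# X == [1, 2], Y == ['a', 'b'], and W == ["A", "B"] (the leftover column)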
+ "B"

Whenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.

If coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.

If shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.

source diff --git a/dev/distributions/index.html b/dev/distributions/index.html index 043f898b..eef8bf22 100644 --- a/dev/distributions/index.html +++ b/dev/distributions/index.html @@ -26,6 +26,6 @@ [5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221 [5.5, 6.0) ┤ 0 [6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89 - └ ┘source
MLJBase.iterator (Method)

iterator([rng, ], r::NominalRange, [, n])
iterator([rng, ], r::NumericRange, n)

Return an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:

(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:

r.lower    r.upper    L                    U
finite     finite     r.lower              r.upper
-Inf       finite     r.upper - 2r.unit    r.upper
finite     Inf        r.lower              r.lower + 2r.unit
-Inf       Inf        r.origin - r.unit    r.origin + r.unit

(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)

(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exactly n of them).

(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.

source
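
For example (a sketch; the exact values depend on the scale):

r = range(Int, :n; lower=1, upper=100, scale=:log10)
iterator(r, 5)  # up to five log-spaced integers in [1, 100] (duplicates removed after rounding)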
MLJBase.scale (Method)
scale(r::ParamRange)

Return the scale associated with a ParamRange object r. The possible return values are: :none (for a NominalRange), :linear, :log, :log10, :log2, or :custom (if r.scale is a callable object).

source
StatsAPI.fit (Method)
Distributions.fit(D, r::MLJBase.NumericRange)

Fit and return a distribution d of type D to the one-dimensional range r.

Only types D in the table below are supported.

The distribution d is constructed in two stages. First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.

Distribution type D                                                               Characterization of d0
Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight    minimum(d) = r.lower, maximum(d) = r.upper
Normal, Gamma, InverseGaussian, Logistic, LogNormal                               mean(d) = r.origin, std(d) = r.unit
Cauchy, Gumbel, Laplace, (Normal)                                                 Dist.location(d) = r.origin, Dist.scale(d) = r.unit
Poisson                                                                           Dist.mean(d) = r.unit

Here Dist = Distributions.

source
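
For example (a sketch, using a bounded range, for which r.origin is the midpoint and r.unit the half-width):

using MLJBase, Distributions

r = range(Float64, :alpha; lower=0.0, upper=1.0)
d = Distributions.fit(Normal, r)  # Normal(0.5, 0.5), truncated to [0.0, 1.0]
rand(d, 3)                        # three samples from d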
Base.range (Method)
r = range(model, :hyper; values=nothing)

Define a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.

A nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.

r = range(model, :hyper; upper=nothing, lower=nothing,
          scale=nothing, values=nothing)

Assuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. Note that r is not directly iterable but iterator(r, n) is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear, :log, :logminus, :log10, :log10minus, :log2, or a callable object.

Note that r is not directly iterable, but iterator(r, n) is, for given resolution (length) n.

By default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.

A nested hyperparameter is specified using dot notation (see above).

If scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.

If values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).

See also: iterator, sampler

source
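
A short sketch combining the two forms:

r1 = range(Float64, :lambda; lower=0.001, upper=1000.0, scale=:log10)
iterator(r1, 3)  # three log10-spaced values, approximately [0.001, 1.0, 1000.0]

r2 = range(Char, :letter; values=['a', 'b', 'c'])  # a NominalRange
iterator(r2)     # ['a', 'b', 'c']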

Utility functions

Home · MLJBase.jl

MLJBase.jl

These docs are bare-bones and auto-generated. Complete MLJ documentation is here.

For MLJBase-specific developer information, see also the README.md file.

Measures · MLJBase.jl

Measures

Helper functions

Continuous loss functions

Confusion matrix

MLJBase.ConfusionMatrixObject (Type)
ConfusionMatrixObject{C}

Confusion matrix with C ≥ 2 classes. Rows correspond to predicted values and columns to the ground truth.

source
MLJBase._confmat (Method)

_confmat(ŷ, y; rev=false)

A private method. General users should use confmat or other instances of the measure type ConfusionMatrix.

Computes the confusion matrix given predictions ŷ with categorical elements and the actual y. Rows are the predicted class, columns the ground truth. The ordering follows that of levels(y).

Keywords

  • rev=false: in the binary case, this keyword allows one to swap the ordering of the two classes.
  • perm=[]: in the general case, this keyword allows one to specify a permutation re-ordering the classes.
  • warn=true: whether to show a warning in case y does not have scientific type OrderedFactor{2} (see note below).

Note

To decrease the risk of unexpected errors, if y does not have scientific type OrderedFactor{2} (and so does not have a "natural ordering" negative-positive), a warning is shown indicating the current order, unless the user explicitly specifies either rev or perm, in which case it is assumed the user is aware of the class ordering.

The confusion_matrix is a measure (although neither a score nor a loss) and so may be specified as such in calls to evaluate and evaluate!, although not in TunedModels. In this case, however, there is no way to specify an ordering different from levels(y), where y is the target.

source
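
A minimal sketch using the public measure (assuming an ordered two-class target, so that no ordering warning is issued):

using MLJBase

y = coerce(["-", "+", "+", "-"], OrderedFactor)
ŷ = coerce(["+", "+", "-", "-"], OrderedFactor)
confusion_matrix(ŷ, y)  # 2 x 2; rows are predictions, columns the ground truth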

Finite loss functions

MLJBase.MulticlassFScore (Type)
MulticlassFScore(; β=1.0, average=macro_avg, return_type=LittleDict)

One-parameter generalization, F_β, of the F-measure or balanced F-score for multiclass observations.

MulticlassFScore()(ŷ, y)
MulticlassFScore()(ŷ, y, class_w)

Evaluate the default score on multiclass observations, ŷ, given ground truth values, y. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassFScore).

source
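
For example (a sketch):

using MLJBase

y = coerce(["a", "b", "a", "c"], Multiclass)
ŷ = coerce(["a", "b", "b", "c"], Multiclass)
MulticlassFScore()(ŷ, y)                    # macro-averaged F1
MulticlassFScore(average=micro_avg)(ŷ, y)   # micro-averaged F1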
MLJBase.MulticlassFalseDiscoveryRate (Type)
MulticlassFalseDiscoveryRate(; average=macro_avg, return_type=LittleDict)

multiclass false discovery rate; aliases: multiclass_false_discovery_rate, multiclass_falsediscovery_rate, multiclass_fdr.

MulticlassFalseDiscoveryRate()(ŷ, y)
MulticlassFalseDiscoveryRate()(ŷ, y, class_w)

False discovery rate for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassFalseDiscoveryRate).

source
MLJBase.MulticlassFalseNegativeRate (Type)
MulticlassFalseNegativeRate(; average=macro_avg, return_type=LittleDict)

multiclass false negative rate; aliases: multiclass_false_negative_rate, multiclass_fnr, multiclass_miss_rate, multiclass_falsenegative_rate.

MulticlassFalseNegativeRate()(ŷ, y)
MulticlassFalseNegativeRate()(ŷ, y, class_w)

False negative rate for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassFalseNegativeRate).

source
MLJBase.MulticlassFalsePositiveRate (Type)

MulticlassFalsePositiveRate(; average=macro_avg, return_type=LittleDict)

multiclass false positive rate; aliases: multiclass_false_positive_rate, multiclass_fpr, multiclass_fallout, multiclass_falsepositive_rate.

MulticlassFalsePositiveRate()(ŷ, y)
MulticlassFalsePositiveRate()(ŷ, y, class_w)

False positive rate for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassFalsePositiveRate).

source
MLJBase.MulticlassNegativePredictiveValue (Type)
MulticlassNegativePredictiveValue(; average=macro_avg, return_type=LittleDict)

multiclass negative predictive value; aliases: multiclass_negative_predictive_value, multiclass_negativepredictive_value, multiclass_npv.

MulticlassNegativePredictiveValue()(ŷ, y)
MulticlassNegativePredictiveValue()(ŷ, y, class_w)

Negative predictive value for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassNegativePredictiveValue).

source
MLJBase.MulticlassPrecision (Type)
MulticlassPrecision(; average=macro_avg, return_type=LittleDict)

multiclass positive predictive value (aka precision); aliases: multiclass_positive_predictive_value, multiclass_ppv, multiclass_positivepredictive_value, multiclass_precision.

MulticlassPrecision()(ŷ, y)
MulticlassPrecision()(ŷ, y, class_w)

Precision for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassPrecision).

source
MLJBase.MulticlassTrueNegative (Type)
MulticlassTrueNegative(; return_type=LittleDict)

Number of true negatives; aliases: multiclass_true_negative, multiclass_truenegative.

MulticlassTrueNegative()(ŷ, y)

Number of true negatives for multiclass observations and ground truth y, using default return type. Options for return_type are: LittleDict (default) or Vector.

For more information, run info(MulticlassTrueNegative).

source
MLJBase.MulticlassTrueNegativeRate (Type)

MulticlassTrueNegativeRate(; average=macro_avg, return_type=LittleDict)

multiclass true negative rate; aliases: multiclass_true_negative_rate, multiclass_tnr, multiclass_specificity, multiclass_selectivity, multiclass_truenegative_rate.

MulticlassTrueNegativeRate()(ŷ, y)
MulticlassTrueNegativeRate()(ŷ, y, class_w)

True negative rate for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassTrueNegativeRate).

source
MLJBase.MulticlassTruePositive (Type)
MulticlassTruePositive(; return_type=LittleDict)

Number of true positives; aliases: multiclass_true_positive, multiclass_truepositive.

MulticlassTruePositive()(ŷ, y)

Number of true positives for multiclass observations and ground truth y, using default return type. Options for return_type are: LittleDict (default) or Vector.

For more information, run info(MulticlassTruePositive).

source
MLJBase.MulticlassTruePositiveRate (Type)

MulticlassTruePositiveRate(; average=macro_avg, return_type=LittleDict)

multiclass true positive rate; aliases: multiclass_true_positive_rate, multiclass_tpr, multiclass_sensitivity, multiclass_recall, multiclass_hit_rate, multiclass_truepositive_rate.

MulticlassTruePositiveRate()(ŷ, y)
MulticlassTruePositiveRate()(ŷ, y, class_w)

True positive rate (a.k.a. sensitivity, recall, hit rate) for multiclass observations and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.

For more information, run info(MulticlassTruePositiveRate).

source
MLJBase.MulticlassFalseNegative (Function)
MulticlassFalseNegative(; return_type=LittleDict)

Number of false negatives; aliases: multiclass_false_negative, multiclass_falsenegative.

MulticlassFalseNegative()(ŷ, y)

Number of false negatives for multiclass observations and ground truth y, using default return type. Options for return_type are: LittleDict (default) or Vector.

For more information, run info(MulticlassFalseNegative).

source
MLJBase.MulticlassFalsePositive (Function)
MulticlassFalsePositive(; return_type=LittleDict)

Number of false positives; aliases: multiclass_false_positive, multiclass_falsepositive.

MulticlassFalsePositive()(ŷ, y)

Number of false positives for multiclass observations and ground truth y, using default return type. Options for return_type are: LittleDict (default) or Vector.

For more information, run info(MulticlassFalsePositive).

source
Resampling · MLJBase.jl

Resampling

MLJBase.CV (Type)
cv = CV(; nfolds=6,  shuffle=nothing, rng=nothing)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning.

train_test_pairs(cv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
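
For example (a sketch, with no pre-shuffling):

cv = CV(nfolds=3)
MLJBase.train_test_pairs(cv, 1:10)
# three (train, test) pairs; the test folds have lengths 4, 3 and 3, and concatenate to 1:10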
MLJBase.Holdout (Type)
holdout = Holdout(; fraction_train=0.7,
                      shuffle=nothing,
                     rng=nothing)

Holdout resampling strategy, for use in evaluate!, evaluate and in tuning.

train_test_pairs(holdout, rows)

Returns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.

source
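
For example (a sketch; with rng given, the rows are pre-shuffled):

holdout = Holdout(fraction_train=0.7, rng=123)
train, test = first(MLJBase.train_test_pairs(holdout, 1:10))
# length(train) == 7 and length(test) == 3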
MLJBase.PerformanceEvaluation (Type)
PerformanceEvaluation

Type of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model.

When evaluate/evaluate! is called, a number of train/test pairs ("folds") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.

When displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).

Fields

These fields are part of the public API of the PerformanceEvaluation struct.

  • model: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.

  • measure: vector of measures (metrics) used to evaluate performance

  • measurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())

  • operation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.

  • per_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.

  • per_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == false. If e has been computed with the per_observation=false option, then e.per_observation is a vector of missings.

  • fitted_params_per_fold: a vector containing fitted params(mach) for each machine mach trained during resampling - one machine per train/test pair. Use this to extract the learned parameters for each individual training event.

  • report_per_fold: a vector containing report(mach) for each machine mach trained during resampling - one machine per train/test pair.

  • train_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.

  • resampling: the resampling strategy used to generate the train/test pairs.

  • repeats: the number of times the resampling strategy was repeated.

source
MLJBase.Resampler (Type)
resampler = Resampler(
     model=ConstantRegressor(),
     resampling=CV(),
     measure=nothing,
     repeats = 1,
     acceleration=default_resource(),
     check_measure=true,
    per_observation=true,
    logger=nothing,
)

Resampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See evaluate! for options. Not intended for use by the general user, who will ordinarily use evaluate! directly.

Given a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).

On subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).

If there is a single train/test pair, then warm-restart behaviour of the wrapped model resampler.model will extend to warm-restart behaviour of the wrapper resampler, with respect to mutations of the wrapped model.

The sample weights are passed to the specified performance measures that support weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

The sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.

source
MLJBase.StratifiedCV (Type)
stratified_cv = StratifiedCV(; nfolds=6,
                                shuffle=false,
                               rng=Random.GLOBAL_RNG)

Stratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).

train_test_pairs(stratified_cv, rows, y)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.

Unlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.

The stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.

Pre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.

If rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.

source
MLJBase.TimeSeriesCV (Type)
tscv = TimeSeriesCV(; nfolds=4)

Cross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.

train_test_pairs(tscv, rows)

Returns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.

The first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.

Examples

julia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)
 3-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:
  (1:4, 5:6)
  (1:6, 7:8)
  (1:8, 9:10)

Here n, r = divrem(10, 4) = (2, 2), so the first train set (the first partition) has length n + r = 4 and each test fold has length n = 2.
source
MLJBase.evaluate!Method
evaluate!(mach,
          resampling=CV(),
          measure=nothing,
          rows=nothing,
          weights=nothing,
          class_weights=nothing,
          operation=nothing,
          repeats=1,
          acceleration=default_resource(),
          force=false,
          verbosity=1,
          check_measure=true,
          logger=nothing)

Estimate the performance of a machine mach binding a supervised model to data, using the specified resampling strategy (defaulting to 6-fold cross-validation) and measure, which can be a single measure or a vector of measures.

Do subtypes(MLJ.ResamplingStrategy) to obtain a list of available resampling strategies. If resampling is not an object of type MLJ.ResamplingStrategy, then a vector of tuples of the form (train_rows, test_rows) is expected. For example, setting

resampling = [((1:100), (101:200)),
              ((101:200), (1:100))]

gives two-fold cross-validation using the first 200 rows of data.
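Such explicit pairs are passed like any other resampling specification; for example (a sketch, assuming mach is a machine whose bound data has at least 200 rows and rms is an appropriate measure):

resampling = [((1:100), (101:200)),
              ((101:200), (1:100))]
evaluate!(mach, resampling=resampling, measure=rms)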

The type of operation (predict, predict_mode, etc) to be associated with measure is automatically inferred from measure traits where possible. For example, predict_mode will be used for a Multiclass target, if model is probabilistic but measure is deterministic. The operations applied can be inspected from the operation field of the object returned. Alternatively, operations can be explicitly specified using operation=.... If measure is a vector, then operation must be a single operation, which will be associated with all measures, or a vector of the same length as measure.
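For example, to pair a probabilistic measure with predict and a deterministic one with predict_mode (a sketch; log_loss and accuracy are measure instances provided by MLJBase, and mach is assumed to wrap a probabilistic classifier):

evaluate!(mach,
          resampling=CV(nfolds=6),
          measure=[log_loss, accuracy],
          operation=[predict, predict_mode])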

The resampling strategy is applied repeatedly (Monte Carlo resampling) if repeats > 1. For example, if repeats = 10 and resampling = CV(nfolds=5, shuffle=true), then a total of 50 (train, test) pairs are generated for evaluation and subsequent aggregation.
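A sketch of such repeated resampling (the seed is an invented assumption; mach is a machine as above):

# 10 repeats × 5 folds = 50 (train, test) pairs:
evaluate!(mach,
          resampling=CV(nfolds=5, shuffle=true, rng=42),
          repeats=10,
          measure=rms)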

If resampling isa MLJ.ResamplingStrategy then one may optionally restrict the data used in evaluation by specifying rows.

An optional weights vector may be passed for measures that support sample weights (MLJ.supports_weights(measure) == true), which is ignored by those that don't. These weights are not to be confused with any weights w bound to mach (as in mach = machine(model, X, y, w)). To pass these to the performance evaluation measures you must explicitly specify weights=w in the evaluate! call.

Additionally, an optional class_weights dictionary may be passed for measures that support class weights (MLJ.supports_class_weights(measure) == true), which is ignored by those that don't. These weights are not to be confused with any weights class_w bound to mach (as in mach = machine(model, X, y, class_w)). To pass these to the performance evaluation measures you must explicitly specify class_weights=class_w in the evaluate! call.
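Hedged sketches of both kinds of weighting (w, the levels shown and the choice of measures are assumptions; misclassification_rate supports per-sample weights and MulticlassFScore supports class weights):

# per-sample weights for the measure (distinct from any training weights):
evaluate!(mach, measure=misclassification_rate, weights=w)

# per-class weights, keyed on levels(y):
evaluate!(mach,
          measure=MulticlassFScore(),
          class_weights=Dict("a" => 2.0, "b" => 1.0))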

User-defined measures are supported; see the manual for details.

If no measure is specified, then default_measure(mach.model) is used, unless this default is nothing, in which case an error is thrown.

The acceleration keyword argument is used to specify the compute resource (a subtype of ComputationalResources.AbstractResource) that will be used to accelerate/parallelize the resampling operation.

Although evaluate! is mutating, mach.model and mach.args are untouched.

Summary of keyword arguments

  • resampling - resampling strategy (default is CV(nfolds=6))

  • measure/measures - measure or vector of measures (losses, scores, etc)

  • rows - vector of observation indices from which both train and test folds are constructed (default is all observations)

  • weights - per-sample weights for measures that support them (not to be confused with weights used in training)

  • class_weights - dictionary of per-class weights for use with measures that support these, in classification problems (not to be confused with per-sample weights or with class weights used in training)

  • operation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.

  • repeats - default is 1; set to a higher value for repeated (Monte Carlo) resampling

  • acceleration - parallelization option; currently supported options are instances of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation), and CPUProcesses (multi-process computation); default is default_resource().

  • force - default is false; set to true to force cold-restart of each training event

  • verbosity - verbosity level, an integer defaulting to 1.

  • check_measure - default is true

  • logger - a logger object (see MLJBase.log_evaluation)

Return value

A PerformanceEvaluation object. See PerformanceEvaluation for details.
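Putting the pieces together, a complete evaluation might look like this (a sketch, assuming MLJ and the DecisionTree interface package are installed; the seed and choice of model are illustrative):

using MLJ

X, y = @load_boston
Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
mach = machine(Tree(), X, y)

e = evaluate!(mach,
              resampling=CV(nfolds=6, shuffle=true, rng=1234),
              measure=[rms, mae],
              verbosity=0)
e.measurement  # aggregated result, one entry per measure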

source
MLJBase.log_evaluationMethod
log_evaluation(logger, performance_evaluation)

Log a performance evaluation to logger, an object specific to some logging platform, such as mlflow. If logger=nothing then no logging is performed. The method is called at the end of every call to evaluate/evaluate! using the logger provided by the logger keyword argument.

Implementations for new logging platforms

Julia interfaces to workflow logging platforms, such as mlflow (provided by the MLFlowClient.jl interface) should overload log_evaluation(logger::LoggerType, performance_evaluation), where LoggerType is a platform-specific type for logger objects. For an example, see the implementation provided by the MLJFlow.jl package.
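A minimal sketch of such an overload, for a hypothetical file-based logger (FileLogger and its field are inventions for illustration):

import MLJBase

struct FileLogger
    path::String
end

# Called automatically at the end of evaluate/evaluate! whenever
# logger=FileLogger(...) is passed:
function MLJBase.log_evaluation(logger::FileLogger, performance_evaluation)
    open(logger.path, "a") do io
        println(io, performance_evaluation.measurement)
    end
end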

source
MLJModelInterface.evaluateMethod
evaluate(model, data...; cache=true, kw_options...)

Equivalent to evaluate!(machine(model, data..., cache=cache); kw_options...). See the machine version evaluate! for the complete list of options.
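For example (a sketch; model, X and y are assumed to be a supervised model and compatible data):

evaluate(model, X, y; resampling=Holdout(fraction_train=0.7), measure=rms)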

source
      diff --git a/dev/search_index.js b/dev/search_index.js index e2595735..532d5026 100644 --- a/dev/search_index.js +++ b/dev/search_index.js @@ -1,3 +1,3 @@ var documenterSearchIndex = {"docs": -[{"location":"measures/#Measures","page":"Measures","title":"Measures","text":"","category":"section"},{"location":"measures/#Helper-functions","page":"Measures","title":"Helper functions","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/registry.jl\", \"measures/measures.jl\"]","category":"page"},{"location":"measures/#Continuous-loss-functions","page":"Measures","title":"Continuous loss functions","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/continuous.jl\"]","category":"page"},{"location":"measures/#Confusion-matrix","page":"Measures","title":"Confusion matrix","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/confusion_matrix.jl\"]","category":"page"},{"location":"measures/#MLJBase.ConfusionMatrixObject","page":"Measures","title":"MLJBase.ConfusionMatrixObject","text":"ConfusionMatrixObject{C}\n\nConfusion matrix with C ≥ 2 classes. Rows correspond to predicted values and columns to the ground truth.\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.ConfusionMatrixObject-Tuple{Matrix{Int64}, Vector{String}}","page":"Measures","title":"MLJBase.ConfusionMatrixObject","text":"ConfusionMatrixObject(m, labels)\n\nInstantiates a confusion matrix out of a square integer matrix m. Rows are the predicted class, columns the ground truth. See also the wikipedia article.\n\n\n\n\n\n","category":"method"},{"location":"measures/#MLJBase._confmat-Union{Tuple{N}, Tuple{V2}, Tuple{V1}, Tuple{V}, Tuple{Union{AbstractArray{V1, N}, CategoricalArrays.CategoricalArray{V1, N}}, Union{AbstractArray{V2, N}, CategoricalArrays.CategoricalArray{V2, N}}}} where {V, V1<:Union{Missing, V}, V2<:Union{Missing, V}, N}","page":"Measures","title":"MLJBase._confmat","text":"_confmat(ŷ, y; rev=false)\n\nA private method. General users should use confmat or other instances of the measure type ConfusionMatrix.\n\nComputes the confusion matrix given a predicted ŷ with categorical elements and the actual y. Rows are the predicted class, columns the ground truth. The ordering follows that of levels(y).\n\nKeywords\n\nrev=false: in the binary case, this keyword allows to swap the ordering of classes.\nperm=[]: in the general case, this keyword allows to specify a permutation re-ordering the classes.\nwarn=true: whether to show a warning in case y does not have scientific type OrderedFactor{2} (see note below).\n\nNote\n\nTo decrease the risk of unexpected errors, if y does not have scientific type OrderedFactor{2} (and so does not have a \"natural ordering\" negative-positive), a warning is shown indicating the current order unless the user explicitly specifies either rev or perm in which case it's assumed the user is aware of the class ordering.\n\nThe confusion_matrix is a measure (although neither a score nor a loss) and so may be specified as such in calls to evaluate, evaluate!, although not in TunedModels. 
In this case, however, there no way to specify an ordering different from levels(y), where y is the target.\n\n\n\n\n\n","category":"method"},{"location":"measures/#Finite-loss-functions","page":"Measures","title":"Finite loss functions","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/finite.jl\"]","category":"page"},{"location":"measures/#MLJBase.MulticlassFScore","page":"Measures","title":"MLJBase.MulticlassFScore","text":"MulticlassFScore(; β=1.0, average=macro_avg, return_type=LittleDict)\n\nOne-parameter generalization, F_β, of the F-measure or balanced F-score for multiclass observations.\n\nMulticlassFScore()(ŷ, y)\nMulticlassFScore()(ŷ, y, class_w)\n\nEvaluate the default score on multiclass observations, ŷ, given ground truth values, y. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassFScore).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassFalseDiscoveryRate","page":"Measures","title":"MLJBase.MulticlassFalseDiscoveryRate","text":"MulticlassFalseDiscoveryRate(; average=macro_avg, return_type=LittleDict)\n\nmulticlass false discovery rate; aliases: multiclass_false_discovery_rate, multiclass_falsediscovery_rate, multiclass_fdr.\n\nMulticlassFalseDiscoveryRate()(ŷ, y)\nMulticlassFalseDiscoveryRate()(ŷ, y, class_w)\n\nFalse discovery rate for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassFalseDiscoveryRate).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassFalseNegativeRate","page":"Measures","title":"MLJBase.MulticlassFalseNegativeRate","text":"MulticlassFalseNegativeRate(; average=macro_avg, return_type=LittleDict)\n\nmulticlass false negative rate; aliases: multiclass_false_negative_rate, multiclass_fnr, multiclass_miss_rate, multiclass_falsenegative_rate.\n\nMulticlassFalseNegativeRate()(ŷ, y)\nMulticlassFalseNegativeRate()(ŷ, y, class_w)\n\nFalse negative rate for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. 
It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassFalseNegativeRate).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassFalsePositiveRate","page":"Measures","title":"MLJBase.MulticlassFalsePositiveRate","text":"MulticlassFalsePositiveRate(; average=macro_avg, return_type=LittleDict)\n\nmulticlass false positive rate; aliases: multiclass_false_positive_rate, multiclass_fpr multiclass_fallout, multiclass_falsepositive_rate.\n\nMulticlassFalsePositiveRate()(ŷ, y)\nMulticlassFalsePositiveRate()(ŷ, y, class_w)\n\nFalse positive rate for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassFalsePositiveRate).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassNegativePredictiveValue","page":"Measures","title":"MLJBase.MulticlassNegativePredictiveValue","text":"MulticlassNegativePredictiveValue(; average=macro_avg, return_type=LittleDict)\n\nmulticlass negative predictive value; aliases: multiclass_negative_predictive_value, multiclass_negativepredictive_value, multiclass_npv.\n\nMulticlassNegativePredictiveValue()(ŷ, y)\nMulticlassNegativePredictiveValue()(ŷ, y, class_w)\n\nNegative predictive value for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassNegativePredictiveValue).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassPrecision","page":"Measures","title":"MLJBase.MulticlassPrecision","text":"MulticlassPrecision(; average=macro_avg, return_type=LittleDict)\n\nmulticlass positive predictive value (aka precision); aliases: multiclass_positive_predictive_value, multiclass_ppv, multiclass_positivepredictive_value, multiclass_precision.\n\nMulticlassPrecision()(ŷ, y)\nMulticlassPrecision()(ŷ, y, class_w)\n\nPrecision for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassPrecision).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassTrueNegative","page":"Measures","title":"MLJBase.MulticlassTrueNegative","text":"MulticlassTrueNegative(; return_type=LittleDict)\n\nNumber of true negatives; aliases: multiclass_true_negative, multiclass_truenegative.\n\nMulticlassTrueNegative()(ŷ, y)\n\nNumber of true negatives for multiclass observations ŷ and ground truth y, using default return type. Options for return_type are: LittleDict(default) or Vector. 
\n\nFor more information, run info(MulticlassTrueNegative).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassTrueNegativeRate","page":"Measures","title":"MLJBase.MulticlassTrueNegativeRate","text":"MulticlassTrueNegativeRate(; average=macro_avg, return_type=LittleDict)\n\nmulticlass true negative rate; aliases: multiclass_true_negative_rate, multiclass_tnr multiclass_specificity, multiclass_selectivity, multiclass_truenegative_rate.\n\nMulticlassTrueNegativeRate()(ŷ, y)\nMulticlassTrueNegativeRate()(ŷ, y, class_w)\n\nTrue negative rate for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassTrueNegativeRate).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassTruePositive","page":"Measures","title":"MLJBase.MulticlassTruePositive","text":"MulticlassTruePositive(; return_type=LittleDict)\n\nNumber of true positives; aliases: multiclass_true_positive, multiclass_truepositive.\n\nMulticlassTruePositive()(ŷ, y)\n\nNumber of true positives for multiclass observations ŷ and ground truth y, using default return type. Options for return_type are: LittleDict(default) or Vector. \n\nFor more information, run info(MulticlassTruePositive).\n\n\n\n\n\n","category":"type"},{"location":"measures/#MLJBase.MulticlassTruePositiveRate","page":"Measures","title":"MLJBase.MulticlassTruePositiveRate","text":"MulticlassTruePositiveRate(; average=macro_avg, return_type=LittleDict)\n\nmulticlass true positive rate; aliases: multiclass_true_positive_rate, multiclass_tpr, multiclass_sensitivity, multiclass_recall, multiclass_hit_rate, multiclass_truepositive_rate, \n\nMulticlassTruePositiveRate(ŷ, y)\nMulticlassTruePositiveRate(ŷ, y, class_w)\n\nTrue positive rate (a.k.a. sensitivity, recall, hit rate) for multiclass observations ŷ and ground truth y, using default averaging and return type. Options for average are: no_avg, macro_avg (default) and micro_avg. Options for return_type, applying in the no_avg case, are: LittleDict (default) or Vector. An optional AbstractDict, denoted class_w above, keyed on levels(y), specifies class weights. It applies if average=macro_avg or average=no_avg.\n\nFor more information, run info(MulticlassTruePositiveRate).\n\n\n\n\n\n","category":"type"},{"location":"measures/#Base.instances-Tuple{Type{<:FalseDiscoveryRate}}","page":"Measures","title":"Base.instances","text":".\n\n\n\n\n\n","category":"method"},{"location":"measures/#Base.instances-Tuple{Type{<:FalseNegativeRate}}","page":"Measures","title":"Base.instances","text":".\n\n\n\n\n\n","category":"method"},{"location":"measures/#MLJBase.MulticlassNegative","page":"Measures","title":"MLJBase.MulticlassNegative","text":"MulticlassFalseNegative(; return_type=LittleDict)\n\nNumber of false negatives; aliases: multiclass_false_negative, multiclass_falsenegative.\n\nMulticlassFalseNegative()(ŷ, y)\n\nNumber of false negatives for multiclass observations ŷ and ground truth y, using default return type. Options for return_type are: LittleDict(default) or Vector. 
\n\nFor more information, run info(MulticlassFalseNegative).\n\n\n\n\n\n","category":"function"},{"location":"measures/#MLJBase.MulticlassPositive","page":"Measures","title":"MLJBase.MulticlassPositive","text":"MulticlassFalsePositive(; return_type=LittleDict)\n\nNumber of false positives; aliases: multiclass_false_positive, multiclass_falsepositive.\n\nMulticlassFalsePositive()(ŷ, y)\n\nNumber of false positives for multiclass observations ŷ and ground truth y, using default return type. Options for return_type are: LittleDict(default) or Vector. \n\nFor more information, run info(MulticlassFalsePositive).\n\n\n\n\n\n","category":"function"},{"location":"distributions/#Distributions","page":"Distributions","title":"Distributions","text":"","category":"section"},{"location":"distributions/#Univariate-Finite-Distribution","page":"Distributions","title":"Univariate Finite Distribution","text":"","category":"section"},{"location":"distributions/","page":"Distributions","title":"Distributions","text":"Modules = [MLJBase]\nPages = [\"interface/univariate_finite.jl\"]","category":"page"},{"location":"distributions/#hyperparameters","page":"Distributions","title":"hyperparameters","text":"","category":"section"},{"location":"distributions/","page":"Distributions","title":"Distributions","text":"Modules = [MLJBase]\nPages = [\"hyperparam/one_dimensional_range_methods.jl\", \"hyperparam/one_dimensional_ranges.jl\"]","category":"page"},{"location":"distributions/#Distributions.sampler-Union{Tuple{T}, Tuple{NumericRange{T}, Distributions.UnivariateDistribution}} where T","page":"Distributions","title":"Distributions.sampler","text":"sampler(r::NominalRange, probs::AbstractVector{<:Real})\nsampler(r::NominalRange)\nsampler(r::NumericRange{T}, d)\n\nConstruct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:\n\nrand(s) # for one sample\nrand(s, n) # for n samples\nrand(rng, s [, n]) # to specify an RNG\n\nThe argument probs can be any probability vector with the same length as r.values. The second sampler method above calls the first with a uniform probs vector.\n\nThe argument d can be either an arbitrary instance of UnivariateDistribution from the Distributions.jl package, or one of a Distributions.jl types for which fit(d, ::NumericRange) is defined. These include: Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight, Normal, Gamma, InverseGaussian, Logistic, LogNormal, Cauchy, Gumbel, Laplace, and Poisson; but see the doc-string for Distributions.fit for an up-to-date list.\n\nIf d is an instance, then sampling is from a truncated form of the supplied distribution d, the truncation bounds being r.lower and r.upper (the attributes r.origin and r.unit attributes are ignored). For discrete numeric ranges (T <: Integer) the samples are rounded.\n\nIf d is a type then a suitably truncated distribution is automatically generated using Distributions.fit(d, r).\n\nImportant. Values are generated with no regard to r.scale, except in the special case r.scale is a callable object f. 
In that case, f is applied to all values generated by rand as described above (prior to rounding, in the case of discrete numeric ranges).\n\nExamples\n\nr = range(Char, :letter, values=collect(\"abc\"))\ns = sampler(r, [0.1, 0.2, 0.7])\nsamples = rand(s, 1000);\nStatsBase.countmap(samples)\nDict{Char,Int64} with 3 entries:\n 'a' => 107\n 'b' => 205\n 'c' => 688\n\nr = range(Int, :k, lower=2, upper=6) # numeric but discrete\ns = sampler(r, Normal)\nsamples = rand(s, 1000);\nUnicodePlots.histogram(samples)\n ┌ ┐\n[2.0, 2.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 119\n[2.5, 3.0) ┤ 0\n[3.0, 3.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 296\n[3.5, 4.0) ┤ 0\n[4.0, 4.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 275\n[4.5, 5.0) ┤ 0\n[5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221\n[5.5, 6.0) ┤ 0\n[6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89\n └ ┘\n\n\n\n\n\n","category":"method"},{"location":"distributions/#MLJBase.iterator-Tuple{Random.AbstractRNG, ParamRange, Vararg{Any}}","page":"Distributions","title":"MLJBase.iterator","text":"iterator([rng, ], r::NominalRange, [,n])\niterator([rng, ], r::NumericRange, n)\n\nReturn an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:\n\n(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:\n\nr.lower r.upper L U\nfinite finite r.lower r.upper\n-Inf finite r.upper - 2r.unit r.upper\nfinite Inf r.lower r.lower + 2r.unit\n-Inf Inf r.origin - r.unit r.origin + r.unit\n\n(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)\n\n(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exacltly n of them).\n\n(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.\n\n\n\n\n\n","category":"method"},{"location":"distributions/#MLJBase.scale-Tuple{NominalRange}","page":"Distributions","title":"MLJBase.scale","text":"scale(r::ParamRange)\n\nReturn the scale associated with a ParamRange object r. The possible return values are: :none (for a NominalRange), :linear, :log, :log10, :log2, or :custom (if r.scale is a callable object).\n\n\n\n\n\n","category":"method"},{"location":"distributions/#StatsAPI.fit-Union{Tuple{D}, Tuple{Type{D}, NumericRange}} where D<:Distributions.Distribution","page":"Distributions","title":"StatsAPI.fit","text":"Distributions.fit(D, r::MLJBase.NumericRange)\n\nFit and return a distribution d of type D to the one-dimensional range r.\n\nOnly types D in the table below are supported.\n\nThe distribution d is constructed in two stages. First, a distributon d0, characterized by the conditions in the second column of the table, is fit to r. 
Then d0 is truncated between r.lower and r.upper to obtain d.\n\nDistribution type D Characterization of d0\nArcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight minimum(d) = r.lower, maximum(d) = r.upper\nNormal, Gamma, InverseGaussian, Logistic, LogNormal mean(d) = r.origin, std(d) = r.unit\nCauchy, Gumbel, Laplace, (Normal) Dist.location(d) = r.origin, Dist.scale(d) = r.unit\nPoisson Dist.mean(d) = r.unit\n\nHere Dist = Distributions.\n\n\n\n\n\n","category":"method"},{"location":"distributions/#Base.range-Union{Tuple{D}, Tuple{Union{Model, Type}, Union{Expr, Symbol}}} where D","page":"Distributions","title":"Base.range","text":"r = range(model, :hyper; values=nothing)\n\nDefine a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.\n\nA nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.\n\nr = range(model, :hyper; upper=nothing, lower=nothing,\n scale=nothing, values=nothing)\n\nAssuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. Note that r is not directly iteratable but iterator(r, n)is an iterator of length n. To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear,:log, :logminus, :log10, :log10minus, :log2, or a callable object.\n\nNote that r is not directly iterable, but iterator(r, n) is, for given resolution (length) n.\n\nBy default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.\n\nA nested hyperparameter is specified using dot notation (see above).\n\nIf scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.\n\nIf values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).\n\nSee also: iterator, sampler\n\n\n\n\n\n","category":"method"},{"location":"distributions/#Utility-functions","page":"Distributions","title":"Utility functions","text":"","category":"section"},{"location":"distributions/","page":"Distributions","title":"Distributions","text":"Modules = [MLJBase]\nPages = [\"distributions.jl\"]","category":"page"},{"location":"utilities/#Utilities","page":"Utilities","title":"Utilities","text":"","category":"section"},{"location":"utilities/#Machines","page":"Utilities","title":"Machines","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"machines.jl\"]","category":"page"},{"location":"utilities/#Base.replace-Union{Tuple{C}, Tuple{Machine{<:Any, C}, Vararg{Pair}}} where C","page":"Utilities","title":"Base.replace","text":"replace(mach::Machine, field1 => value1, field2 => value2, ...)\n\nPrivate method.\n\nReturn a shallow copy of the machine mach with the specified field replacements. Undefined field values are preserved. 
Unspecified fields have identically equal values, with the exception of mach.fit_okay, which is always a new instance Channel{Bool}(1).\n\nThe following example returns a machine with no traces of training data (but also removes any upstream dependencies in a learning network):\n\n```julia replace(mach, :args => (), :data => (), :dataresampleddata => (), :cache => nothing)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.age-Tuple{Machine}","page":"Utilities","title":"MLJBase.age","text":"age(mach::Machine)\n\nReturn an integer representing the number of times mach has been trained or updated. For more detail, see the discussion of training logic at fit_only!.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.ancestors-Tuple{Machine}","page":"Utilities","title":"MLJBase.ancestors","text":"ancestors(mach::Machine; self=false)\n\nAll ancestors of mach, including mach if self=true.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.default_scitype_check_level","page":"Utilities","title":"MLJBase.default_scitype_check_level","text":"default_scitype_check_level()\n\nReturn the current global default value for scientific type checking when constructing machines.\n\ndefault_scitype_check_level(i::Integer)\n\nSet the global default value for scientific type checking to i.\n\nThe effect of the scitype_check_level option in calls of the form machine(model, data, scitype_check_level=...) is summarized below:\n\nscitype_check_level Inspect scitypes? If Unknown in scitypes If other scitype mismatch\n0 × \n1 (value at startup) ✓ warning\n2 ✓ warning warning\n3 ✓ warning error\n4 ✓ error error\n\nSee also machine\n\n\n\n\n\n","category":"function"},{"location":"utilities/#MLJBase.fit_only!-Union{Tuple{Machine{<:Any, cache_data}}, Tuple{cache_data}} where cache_data","page":"Utilities","title":"MLJBase.fit_only!","text":"MLJBase.fit_only!(\n mach::Machine;\n rows=nothing,\n verbosity=1,\n force=false,\n composite=nothing,\n)\n\nWithout mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:\n\nAb initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.\nTraining update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.\nNo-operation. Leave existing learned parameters untouched. Do not increment mach.state.\n\nIf the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). 
See also machine.\n\nTraining action logic\n\nFor the action to be a no-operation, either mach.frozen == true or or none of the following apply:\n\n(i) mach has never been trained (mach.state == 0).\n(ii) force == true.\n(iii) The state of some other machine on which mach depends has changed since the last time mach was trained (ie, the last time mach.state was last incremented).\n(iv) The specified rows have changed since the last retraining and mach.model does not have Static type.\n(v) mach.model is a model and different from the last model used for training, but has the same type.\n(vi) mach.model is a model but has a type different from the last model used for training.\n(vii) mach.model is a symbol and (composite, mach.model) is different from the last model used for training, but has the same type.\n(viii) mach.model is a symbol and (composite, mach.model) has a different type from the last model used for training.\n\nIn any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.\n\nTo freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).\n\nImplementation details\n\nThe data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the \"Adding models for general use\" section of the MLJ documentation for more on these lower-level training methods.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.freeze!-Tuple{Machine}","page":"Utilities","title":"MLJBase.freeze!","text":"freeze!(mach)\n\nFreeze the machine mach so that it will never be retrained (unless thawed).\n\nSee also thaw!.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.glb-Tuple{Machine{<:Union{Surrogate, Composite}}}","page":"Utilities","title":"MLJBase.glb","text":"N = glb(mach::Machine{<:Union{Composite,Surrogate}})\n\nA greatest lower bound for the nodes appearing in the learning network interface of mach.\n\nA learning network interface is a named tuple declaring certain interface points in a learning network, to be used when \"exporting\" the network as a new stand-alone model type. Examples are\n\n (predict=yhat,)\n (transform=Xsmall, acceleration=CPUThreads())\n (predict=yhat, transform=W, report=(loss=loss_node,))\n\nHere yhat, Xsmall, W and loss_node are nodes in the network.\n\nThe keys of the learning network interface always one of the following:\n\nThe name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See \"Operation keys\" below.\n:report, for exposing results of calling a node with no arguments in the composite model report. See \"Including report nodes\" below.\n:fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See \"Including fitted parameter nodes\" below.\n:acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.\n\nOperation keys\n\nIf the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). 
The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).\n\nIncluding report nodes\n\nIf the key is :report, then the corresponding value must be a named tuple\n\n (k1=n1, k2=n2, ...)\n\nwhose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediatately after training or updating the network. For examples, refer to the \"Learning Networks\" section of the MLJ manual.\n\nIncluding fitted parameter nodes\n\nIf the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.\n\nPrivate method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.last_model-Tuple{Any}","page":"Utilities","title":"MLJBase.last_model","text":"last_model(mach::Machine)\n\nReturn the last model used to train the machine mach. This is a bona fide model, even if mach.model is a symbol.\n\nReturns nothing if mach has not been trained.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.machine","page":"Utilities","title":"MLJBase.machine","text":"machine(model, args...; cache=true, scitype_check_level=1)\n\nConstruct a Machine object binding a model, storing hyper-parameters of some machine learning algorithm, to some data, args. Calling fit! on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned paramters) and report(mach) (other outcomes). This in turn enables generalization to new data using operations such as predict or transform:\n\nusing MLJModels\nX, y = make_regression()\n\nPCA = @load PCA pkg=MultivariateStats\nmodel = PCA()\nmach = machine(model, X)\nfit!(mach, rows=1:50)\ntransform(mach, selectrows(X, 51:100)) # or transform(mach, rows=51:100)\n\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nmodel = DecisionTreeRegressor()\nmach = machine(model, X, y)\nfit!(mach, rows=1:50)\npredict(mach, selectrows(X, 51:100)) # or predict(mach, rows=51:100)\n\nSpecify cache=false to prioritize memory management over speed.\n\nWhen building a learning network, Node objects can be substituted for the concrete data but no type or dimension checks are applied.\n\nChecks on the types of training data\n\nA model articulates its data requirements using scientific types, i.e., using the scitype function instead of the typeof function.\n\nIf scitype_check_level > 0 then the scitype of each arg in args is computed, and this is compared with the scitypes expected by the model, unless args contains Unknown scitypes and scitype_check_level < 4, in which case no further action is taken. Whether warnings are issued or errors thrown depends the level. For details, see default_scitype_check_level, a method to inspect or change the default level (1 at startup).\n\nMachines with model placeholders\n\nA symbol can be substituted for a model in machine constructors to act as a placeholder for a model specified at training time. 
The symbol must be the field name for a struct whose corresponding value is a model, as shown in the following example:\n\nmutable struct MyComposite\n transformer\n classifier\nend\n\nmy_composite = MyComposite(Standardizer(), ConstantClassifier)\n\nX, y = make_blobs()\nmach = machine(:classifier, X, y)\nfit!(mach, composite=my_composite)\n\nThe last two lines are equivalent to\n\nmach = machine(ConstantClassifier(), X, y)\nfit!(mach)\n\nDelaying model specification is used when exporting learning networks as new stand-alone model types. See prefit and the MLJ documentation on learning networks.\n\nSee also fit!, default_scitype_check_level, MLJBase.save, serializable.\n\n\n\n\n\n","category":"function"},{"location":"utilities/#MLJBase.machine-Tuple{Union{IO, String}}","page":"Utilities","title":"MLJBase.machine","text":"machine(file::Union{String, IO})\n\nRebuild from a file a machine that has been serialized using the default Serialization module.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.model_supertype-Tuple{Any}","page":"Utilities","title":"MLJBase.model_supertype","text":"model_supertype(interface)\n\nReturn, if this can be inferred, which of Deterministic, Probabilistic and Unsupervised is the appropriate supertype for a composite model obtained by exporting a learning network with the specified learning network interface.\n\nA learning network interface is a named tuple declaring certain interface points in a learning network, to be used when \"exporting\" the network as a new stand-alone model type. Examples are\n\n (predict=yhat,)\n (transform=Xsmall, acceleration=CPUThreads())\n (predict=yhat, transform=W, report=(loss=loss_node,))\n\nHere yhat, Xsmall, W and loss_node are nodes in the network.\n\nThe keys of the learning network interface always one of the following:\n\nThe name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See \"Operation keys\" below.\n:report, for exposing results of calling a node with no arguments in the composite model report. See \"Including report nodes\" below.\n:fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See \"Including fitted parameter nodes\" below.\n:acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.\n\nOperation keys\n\nIf the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).\n\nIncluding report nodes\n\nIf the key is :report, then the corresponding value must be a named tuple\n\n (k1=n1, k2=n2, ...)\n\nwhose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediatately after training or updating the network. 
For examples, refer to the \"Learning Networks\" section of the MLJ manual.\n\nIncluding fitted parameter nodes\n\nIf the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.\n\nIf a supertype cannot be inferred, nothing is returned.\n\nIf the network with given signature is not exportable, this method will not error but it will not a give meaningful return value either.\n\nPrivate method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.report-Tuple{MLJBase.CompositeFitresult}","page":"Utilities","title":"MLJBase.report","text":"report(fitresult::CompositeFitresult)\n\nReturn a tuple combining the report from fitresult.glb (a Node report) with the additions coming from nodes declared as report nodes in fitresult.signature, but without merging the two.\n\nA learning network interface is a named tuple declaring certain interface points in a learning network, to be used when \"exporting\" the network as a new stand-alone model type. Examples are\n\n (predict=yhat,)\n (transform=Xsmall, acceleration=CPUThreads())\n (predict=yhat, transform=W, report=(loss=loss_node,))\n\nHere yhat, Xsmall, W and loss_node are nodes in the network.\n\nThe keys of the learning network interface always one of the following:\n\nThe name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See \"Operation keys\" below.\n:report, for exposing results of calling a node with no arguments in the composite model report. See \"Including report nodes\" below.\n:fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See \"Including fitted parameter nodes\" below.\n:acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.\n\nOperation keys\n\nIf the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).\n\nIncluding report nodes\n\nIf the key is :report, then the corresponding value must be a named tuple\n\n (k1=n1, k2=n2, ...)\n\nwhose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediatately after training or updating the network. For examples, refer to the \"Learning Networks\" section of the MLJ manual.\n\nIncluding fitted parameter nodes\n\nIf the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.\n\nPrivate method\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.report-Tuple{Machine}","page":"Utilities","title":"MLJBase.report","text":"report(mach)\n\nReturn the report for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using @pipeline, then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. 
(If multiple machines share the same model, then the value is a vector.)\n\nusing MLJ\n@load LinearBinaryClassifier pkg=GLM\nX, y = @load_crabs;\npipe = @pipeline Standardizer LinearBinaryClassifier\nmach = machine(pipe, X, y) |> fit!\n\njulia> report(mach).linear_binary_classifier\n(deviance = 3.8893386087844543e-7,\n dof_residual = 195.0,\n stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],\n vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)\n\n\nAdditional keys, machines and report_given_machine, give a list of all machines in the underlying network, and a dictionary of reports keyed on those machines.\n\nSee also fitted_params\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.report_given_method-Tuple{Machine}","page":"Utilities","title":"MLJBase.report_given_method","text":"report_given_method(mach::Machine)\n\nSame as report(mach) but broken down by the method (fit, predict, etc) that contributed the report.\n\nA specialized method intended for learning network applications.\n\nThe return value is a dictionary keyed on the symbol representing the method (:fit, :predict, etc) and the values report contributed by that method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.restore!-Tuple{Machine}","page":"Utilities","title":"MLJBase.restore!","text":"restore!(mach::Machine)\n\nRestore the state of a machine that is currently serializable but which may not be otherwise usable. For such a machine, mach, one has mach.state=1. Intended for restoring deserialized machine objects to a useable form.\n\nFor an example see serializable.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.return!-Tuple{Machine{<:Surrogate}, Union{Nothing, Model}, Any}","page":"Utilities","title":"MLJBase.return!","text":"return!(mach::Machine{<:Surrogate}, model, verbosity; acceleration=CPU1())\n\nThe last call in custom code defining the MLJBase.fit method for a new composite model type. Here model is the instance of the new type appearing in the MLJBase.fit signature, while mach is a learning network machine constructed using model. Not relevant when defining composite models using @pipeline (deprecated) or @from_network.\n\nFor usage, see the example given below. 
Specifically, the call does the following:\n\nDetermines which hyper-parameters of model point to model instances in the learning network wrapped by mach, for recording in an object called cache, for passing onto the MLJ logic that handles smart updating (namely, an MLJBase.update fallback for composite models).\nCalls fit!(mach, verbosity=verbosity, acceleration=acceleration).\nRecords (among other things) a copy of model in a variable called cache\nReturns cache and outcomes of training in an appropriate form (specifically, (mach.fitresult, cache, mach.report); see Adding Models for General Use for technical details.)\n\nExample\n\nThe following code defines, \"by hand\", a new model type MyComposite for composing standardization (whitening) with a deterministic regressor:\n\nmutable struct MyComposite <: DeterministicComposite\n regressor\nend\n\nfunction MLJBase.fit(model::MyComposite, verbosity, X, y)\n Xs = source(X)\n ys = source(y)\n\n mach1 = machine(Standardizer(), Xs)\n Xwhite = transform(mach1, Xs)\n\n mach2 = machine(model.regressor, Xwhite, ys)\n yhat = predict(mach2, Xwhite)\n\n mach = machine(Deterministic(), Xs, ys; predict=yhat)\n return!(mach, model, verbosity)\nend\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.serializable-Union{Tuple{Machine{<:Any, C}}, Tuple{C}} where C","page":"Utilities","title":"MLJBase.serializable","text":"serializable(mach::Machine)\n\nReturns a shallow copy of the machine to make it serializable. In particular, all training data is removed and, if necessary, learned parameters are replaced with persistent representations.\n\nAny general purpose Julia serializer may be applied to the output of serializable (eg, JLSO, BSON, JLD) but you must call restore!(mach) on the deserialised object mach before using it. See the example below.\n\nIf using Julia's standard Serialization library, a shorter workflow is available using the MLJBase.save (or MLJ.save) method.\n\nA machine returned by serializable is characterized by the property mach.state == -1.\n\nExample using JLSO\n\nusing MLJ\nusing JLSO\nTree = @load DecisionTreeClassifier\ntree = Tree()\nX, y = @load_iris\nmach = fit!(machine(tree, X, y))\n\n# This machine can now be serialized\nsmach = serializable(mach)\nJLSO.save(\"machine.jlso\", :machine => smach)\n\n# Deserialize and restore learned parameters to useable form:\nloaded_mach = JLSO.load(\"machine.jlso\")[:machine]\nrestore!(loaded_mach)\n\npredict(loaded_mach, X)\npredict(mach, X)\n\nSee also restore!, MLJBase.save.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.thaw!-Tuple{Machine}","page":"Utilities","title":"MLJBase.thaw!","text":"thaw!(mach)\n\nUnfreeze the machine mach so that it can be retrained.\n\nSee also freeze!.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJModelInterface.feature_importances-Tuple{Machine}","page":"Utilities","title":"MLJModelInterface.feature_importances","text":"feature_importances(mach::Machine)\n\nReturn a list of feature => importance pairs for a fitted machine, mach, for supported models. 
Otherwise return nothing.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJModelInterface.fitted_params-Tuple{Machine}","page":"Utilities","title":"MLJModelInterface.fitted_params","text":"fitted_params(mach)\n\nReturn the learned parameters for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using @pipeline, then the returned named tuple has the composite type's field names as keys. The corresponding value is the fitted parameters for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)\n\nusing MLJ\n@load LogisticClassifier pkg=MLJLinearModels\nX, y = @load_crabs;\npipe = @pipeline Standardizer LogisticClassifier\nmach = machine(pipe, X, y) |> fit!\n\njulia> fitted_params(mach).logistic_classifier\n(classes = CategoricalArrays.CategoricalValue{String,UInt32}[\"B\", \"O\"],\n coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471],\n intercept = 0.0883301599726305,)\n\nAdditional keys, machines and fitted_params_given_machine, give a list of all machines in the underlying network, and a dictionary of fitted parameters keyed on those machines.\n\nSee also report\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJModelInterface.save-Tuple{Union{IO, String}, Machine}","page":"Utilities","title":"MLJModelInterface.save","text":"MLJ.save(filename, mach::Machine)\nMLJ.save(io, mach::Machine)\n\nMLJBase.save(filename, mach::Machine)\nMLJBase.save(io, mach::Machine)\n\nSerialize the machine mach to a file with path filename, or to an input/output stream io (at least IOBuffer instances are supported) using the Serialization module.\n\nTo serialise using a different format, see serializable.\n\nMachines are deserialized using the machine constructor as shown in the example below.\n\nThe implementation of save for machines changed in MLJ 0.18 (MLJBase 0.20). You can only restore a machine saved using older versions of MLJ using an older version.\n\nExample\n\nusing MLJ\nTree = @load DecisionTreeClassifier\nX, y = @load_iris\nmach = fit!(machine(Tree(), X, y))\n\nMLJ.save(\"tree.jls\", mach)\nmach_predict_only = machine(\"tree.jls\")\npredict(mach_predict_only, X)\n\n# using a buffer:\nio = IOBuffer()\nMLJ.save(io, mach)\nseekstart(io)\npredict_only_mach = machine(io)\npredict(predict_only_mach, X)\n\nwarning: Only load files from trusted sources\nMaliciously constructed JLS files, like pickles, and most other general purpose serialization formats, can allow for arbitrary code execution during loading. This means it is possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.\n\nSee also serializable, machine.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#StatsAPI.fit!-Tuple{Machine{<:Surrogate}}","page":"Utilities","title":"StatsAPI.fit!","text":"fit!(mach::Machine{<:Surrogate};\n rows=nothing,\n acceleration=CPU1(),\n verbosity=1,\n force=false))\n\nTrain the complete learning network wrapped by the machine mach.\n\nMore precisely, if s is the learning network signature used to construct mach, then call fit!(N), where N is a greatest lower bound of the nodes appearing in the signature (values in the signature that are not AbstractNode are ignored). 
For example, if s = (predict=yhat, transform=W), then call fit!(glb(yhat, W)).\n\nSee also machine\n\n\n\n\n\n","category":"method"},{"location":"utilities/#StatsAPI.fit!-Tuple{Machine}","page":"Utilities","title":"StatsAPI.fit!","text":"fit!(mach::Machine, rows=nothing, verbosity=1, force=false, composite=nothing)\n\nFit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.\n\nTo attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the the internal logic of fitting see fit_only!\n\n\n\n\n\n","category":"method"},{"location":"utilities/#Parameter-Inspection","page":"Utilities","title":"Parameter Inspection","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"parameter_inspection.jl\"]","category":"page"},{"location":"utilities/#Show","page":"Utilities","title":"Show","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"show.jl\"]","category":"page"},{"location":"utilities/#MLJBase._recursive_show-Tuple{IO, MLJType, Any, Any}","page":"Utilities","title":"MLJBase._recursive_show","text":"_recursive_show(stream, object, current_depth, depth)\n\nGenerate a table of the properties of the MLJType object, dislaying each property value by calling the method _show on it. The behaviour of _show(stream, f) is as follows:\n\nIf f is itself a MLJType object, then its short form is shown\n\nand _recursive_show generates as separate table for each of its properties (and so on, up to a depth of argument depth).\n\nOtherwise f is displayed as \"(omitted T)\" where T = typeof(f),\n\nunless istoobig(f) is false (the istoobig fall-back for arbitrary types being true). In the latter case, the long (ie, MIME\"plain/text\") form of f is shown. 
To override this behaviour, overload the _show method for the type in question.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.abbreviated-Tuple{Any}","page":"Utilities","title":"MLJBase.abbreviated","text":"Display abbreviated versions of integers.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.color_off-Tuple{}","page":"Utilities","title":"MLJBase.color_off","text":"color_off()\n\nSuppress color and bold output at the REPL for displaying MLJ objects.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.color_on-Tuple{}","page":"Utilities","title":"MLJBase.color_on","text":"color_on()\n\nEnable color and bold output at the REPL, for enhanced display of MLJ objects.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.handle-Tuple{Any}","page":"Utilities","title":"MLJBase.handle","text":"Return the abbreviated object id (as a string), or its registered handle (as a string) if one exists.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.@constant-Tuple{Any}","page":"Utilities","title":"MLJBase.@constant","text":"@constant x = value\n\nPrivate method (used in testing).\n\nEquivalent to const x = value but registers the binding thus:\n\nMLJBase.HANDLE_GIVEN_ID[objectid(value)] = :x\n\nA registered object gets displayed using the variable name to which it was bound in calls to show(x), etc.\n\nWARNING: As with any const declaration, binding x to a new value of the same type is not prevented and the registration will not be updated.\n\n\n\n\n\n","category":"macro"},{"location":"utilities/#MLJBase.@more-Tuple{}","page":"Utilities","title":"MLJBase.@more","text":"@more\n\nEntered at the REPL, equivalent to show(ans, 100). Use to get a recursive description of all properties of the last REPL value.\n\n\n\n\n\n","category":"macro"},{"location":"utilities/#Utility-functions","page":"Utilities","title":"Utility functions","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"utilities.jl\"]","category":"page"},{"location":"utilities/#MLJBase.Accuracy","page":"Utilities","title":"MLJBase.Accuracy","text":"MLJBase.Accuracy\n\nA measure type for accuracy, which includes the instance(s): accuracy.\n\nAccuracy()(ŷ, y)\nAccuracy()(ŷ, y, w)\n\nEvaluate the accuracy on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nAccuracy is the proportion of predictions ŷ[i] that match the ground truth observations y[i]. This metric is invariant to class reordering.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(Accuracy). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.AreaUnderCurve","page":"Utilities","title":"MLJBase.AreaUnderCurve","text":"MLJBase.AreaUnderCurve\n\nA measure type for area under the ROC, which includes the instance(s): area_under_curve, auc.\n\nAreaUnderCurve()(ŷ, y)\n\nEvaluate the area under the ROC on predictions ŷ, given ground truth observations y. 
\n\nReturns the area under the ROC (receiver operating characteristic) curve.\n\nIf missing or NaN values are present, use auc(skipinvalid(yhat, y)...).\n\nThis metric is invariant to class reordering.\n\nRequires scitype(y) to be a subtype of Union{AbstractArray{<:Union{Missing, ScientificTypesBase.Multiclass{2}}}, AbstractArray{<:Union{Missing, ScientificTypesBase.OrderedFactor{2}}}}; ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(AreaUnderCurve). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.BalancedAccuracy","page":"Utilities","title":"MLJBase.BalancedAccuracy","text":"MLJBase.BalancedAccuracy\n\nA measure type for balanced accuracy, which includes the instance(s): balanced_accuracy, bacc, bac.\n\nBalancedAccuracy()(ŷ, y)\nBalancedAccuracy()(ŷ, y, w)\n\nEvaluate the default instance of BalancedAccuracy on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nBalanced accuracy compensates standard Accuracy for class imbalance. See https://en.wikipedia.org/wiki/Precision_and_recall#Imbalanced_data. \n\nSetting adjusted=true rescales the score in the way prescribed in L. Mosley (2013): A balanced approach to the multi-class imbalance problem. PhD thesis, Iowa State University. In the binary case, the adjusted balanced accuracy is also known as Youden’s J statistic, or informedness.\n\nThis metric is invariant to class reordering.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(BalancedAccuracy). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.BrierLoss","page":"Utilities","title":"MLJBase.BrierLoss","text":"MLJBase.BrierLoss\n\nA measure type for Brier loss (a.k.a. quadratic loss), which includes the instance(s): brier_loss.\n\nBrierLoss()(ŷ, y)\nBrierLoss()(ŷ, y, w)\n\nEvaluate the Brier loss (a.k.a. quadratic loss) on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor details, see BrierScore, which differs only by a sign.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ), or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(BrierLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.BrierScore","page":"Utilities","title":"MLJBase.BrierScore","text":"MLJBase.BrierScore\n\nA measure type for Brier score (a.k.a. quadratic score), which includes the instance(s): brier_score.\n\nBrierScore()(ŷ, y)\nBrierScore()(ŷ, y, w)\n\nEvaluate the Brier score (a.k.a. quadratic score) on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nConvention as in Gneiting and Raftery (2007), \"Strictly Proper Scoring Rules, Prediction, and Estimation\".\n\nFinite case. If p is the predicted probability mass function for a single observation η, and C is the set of all possible classes, then the corresponding score for that observation is given by\n\n2p(η) - ∑_{c ∈ C} p(c)² - 1\n\nWarning. BrierScore() is a \"score\" in the sense that bigger is better (with 0 optimal, and all other values negative). In Brier's original 1950 paper, and many other places, it has the opposite sign, despite the name. 
Moreover, the present implementation does not treat the binary case as special, so that the score may differ in the binary case by a factor of two from usage elsewhere.\n\nInfinite case. Replacing the sum above with an integral does not lead to the formula adopted here in the case of Continuous or Count target y. Rather, the convention in the paper cited above is adopted, which means returning a score of\n\n2p(η) - ∫ p(t)² dt\n\nin the Continuous case (p the probability density function) or\n\n2p(η) - ∑ₜ p(t)²\n\nin the Count case (p the probability mass function).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ), or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(BrierScore). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.ConfusionMatrix","page":"Utilities","title":"MLJBase.ConfusionMatrix","text":"MLJBase.ConfusionMatrix\n\nA measure type for confusion matrix, which includes the instance(s): confusion_matrix, confmat.\n\nConfusionMatrix()(ŷ, y)\n\nEvaluate the default instance of ConfusionMatrix on predictions ŷ, given ground truth observations y. \n\nIf r is the return value, then the raw confusion matrix is r.mat, whose rows correspond to predictions, and columns to ground truth. The ordering follows that of levels(y).\n\nUse ConfusionMatrix(perm=[2, 1]) to reverse the class order for binary data. For more than two classes, specify an appropriate permutation, as in ConfusionMatrix(perm=[2, 3, 1]).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(ConfusionMatrix). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.DWDMarginLoss","page":"Utilities","title":"MLJBase.DWDMarginLoss","text":"MLJBase.DWDMarginLoss\n\nA measure type for distance weighted discrimination loss, which includes the instance(s): dwd_margin_loss.\n\nDWDMarginLoss()(ŷ, y)\nDWDMarginLoss()(ŷ, y, w)\n\nEvaluate the default instance of DWDMarginLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nConstructor signature: DWDMarginLoss(; q=1.0)\n\nFor more information, run info(DWDMarginLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.ExpLoss","page":"Utilities","title":"MLJBase.ExpLoss","text":"MLJBase.ExpLoss\n\nA measure type for exp loss, which includes the instance(s): exp_loss.\n\nExpLoss()(ŷ, y)\nExpLoss()(ŷ, y, w)\n\nEvaluate the default instance of ExpLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. 
To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(ExpLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.FScore","page":"Utilities","title":"MLJBase.FScore","text":"MLJBase.FScore\n\nA measure type for F-Score, which includes the instance(s): f1score.\n\nFScore()(ŷ, y)\n\nEvaluate the default instance of FScore on predictions ŷ, given ground truth observations y. \n\nThis is the one-parameter generalization, F_β, of the F-measure or balanced F-score.\n\nhttps://en.wikipedia.org/wiki/F1_score\n\nConstructor signature: FScore(; β=1.0, rev=false).\n\nBy default, the second element of levels(y) is designated as true. To reverse roles, specify rev=true.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(FScore). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.FalseDiscoveryRate","page":"Utilities","title":"MLJBase.FalseDiscoveryRate","text":"MLJBase.FalseDiscoveryRate\n\nA measure type for false discovery rate, which includes the instance(s): false_discovery_rate, falsediscovery_rate, fdr.\n\nFalseDiscoveryRate()(ŷ, y)\n\nEvaluate the default instance of FalseDiscoveryRate on predictions ŷ, given ground truth observations y. \n\nAssigns false to the first element of levels(y). To reverse roles, use FalseDiscoveryRate(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(FalseDiscoveryRate). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.FalseNegative","page":"Utilities","title":"MLJBase.FalseNegative","text":"MLJBase.FalseNegative\n\nA measure type for number of false negatives, which includes the instance(s): false_negative, falsenegative.\n\nFalseNegative()(ŷ, y)\n\nEvaluate the default instance of FalseNegative on predictions ŷ, given ground truth observations y. \n\nAssigns false to the first element of levels(y). To reverse roles, use FalseNegative(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(FalseNegative). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.FalseNegativeRate","page":"Utilities","title":"MLJBase.FalseNegativeRate","text":"MLJBase.FalseNegativeRate\n\nA measure type for false negative rate, which includes the instance(s): false_negative_rate, falsenegative_rate, fnr, miss_rate.\n\nFalseNegativeRate()(ŷ, y)\n\nEvaluate the default instance of FalseNegativeRate on predictions ŷ, given ground truth observations y. \n\nAssigns false to the first element of levels(y). To reverse roles, use FalseNegativeRate(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. 
\n\nFor more information, run info(FalseNegativeRate). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.FalsePositive","page":"Utilities","title":"MLJBase.FalsePositive","text":"MLJBase.FalsePositive\n\nA measure type for number of false positives, which includes the instance(s): false_positive, falsepositive.\n\nFalsePositive()(ŷ, y)\n\nEvaluate the default instance of FalsePositive on predictions ŷ, given ground truth observations y. \n\nAssigns false to the first element of levels(y). To reverse roles, use FalsePositive(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(FalsePositive). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.FalsePositiveRate","page":"Utilities","title":"MLJBase.FalsePositiveRate","text":"MLJBase.FalsePositiveRate\n\nA measure type for false positive rate, which includes the instance(s): false_positive_rate, falsepositive_rate, fpr, fallout.\n\nFalsePositiveRate()(ŷ, y)\n\nEvaluate the default instance of FalsePositiveRate on predictions ŷ, given ground truth observations y. \n\nAssigns false to the first element of levels(y). To reverse roles, use FalsePositiveRate(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" affects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(FalsePositiveRate). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.HuberLoss","page":"Utilities","title":"MLJBase.HuberLoss","text":"MLJBase.HuberLoss\n\nA measure type for huber loss, which includes the instance(s): huber_loss.\n\nHuberLoss()(ŷ, y)\nHuberLoss()(ŷ, y, w)\n\nEvaluate the default instance of HuberLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nConstructor signature: HuberLoss(; d=1.0)\n\nFor more information, run info(HuberLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.Kappa","page":"Utilities","title":"MLJBase.Kappa","text":"MLJBase.Kappa\n\nA measure type for kappa, which includes the instance(s): kappa.\n\nKappa()(ŷ, y)\n\nEvaluate the kappa on predictions ŷ, given ground truth observations y. \n\nA metric to measure agreement between predicted labels and the ground truth. See https://en.wikipedia.org/wiki/Cohen%27s_kappa\n\nThis metric is invariant to class reordering.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(Kappa). 
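For instance (a minimal sketch, assuming MLJBase and CategoricalArrays are loaded; the data values are illustrative only):

using MLJBase, CategoricalArrays

y = categorical(["a", "b", "a", "a", "b", "a"])  # ground truth
ŷ = categorical(["a", "b", "b", "a", "b", "a"])  # deterministic predictions

accuracy(ŷ, y)  # 5/6 of predictions match
kappa(ŷ, y)     # chance-corrected agreement for the same predictions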
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.L1EpsilonInsLoss","page":"Utilities","title":"MLJBase.L1EpsilonInsLoss","text":"MLJBase.L1EpsilonInsLoss\n\nA measure type for l1 ϵ-insensitive loss, which includes the instance(s): l1_epsilon_ins_loss.\n\nL1EpsilonInsLoss()(ŷ, y)\nL1EpsilonInsLoss()(ŷ, y, w)\n\nEvaluate the default instance of L1EpsilonInsLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nConstructor signature: L1EpsilonInsLoss(; ε=1.0)\n\nFor more information, run info(L1EpsilonInsLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.L1HingeLoss","page":"Utilities","title":"MLJBase.L1HingeLoss","text":"MLJBase.L1HingeLoss\n\nA measure type for l1 hinge loss, which includes the instance(s): l1_hinge_loss.\n\nL1HingeLoss()(ŷ, y)\nL1HingeLoss()(ŷ, y, w)\n\nEvaluate the default instance of L1HingeLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(L1HingeLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.L2EpsilonInsLoss","page":"Utilities","title":"MLJBase.L2EpsilonInsLoss","text":"MLJBase.L2EpsilonInsLoss\n\nA measure type for l2 ϵ-insensitive loss, which includes the instance(s): l2_epsilon_ins_loss.\n\nL2EpsilonInsLoss()(ŷ, y)\nL2EpsilonInsLoss()(ŷ, y, w)\n\nEvaluate the default instance of L2EpsilonInsLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nConstructor signature: L2EpsilonInsLoss(; ε=1.0)\n\nFor more information, run info(L2EpsilonInsLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.L2HingeLoss","page":"Utilities","title":"MLJBase.L2HingeLoss","text":"MLJBase.L2HingeLoss\n\nA measure type for l2 hinge loss, which includes the instance(s): l2_hinge_loss.\n\nL2HingeLoss()(ŷ, y)\nL2HingeLoss()(ŷ, y, w)\n\nEvaluate the default instance of L2HingeLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. 
To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(L2HingeLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.L2MarginLoss","page":"Utilities","title":"MLJBase.L2MarginLoss","text":"MLJBase.L2MarginLoss\n\nA measure type for l2 margin loss, which includes the instance(s): l2_margin_loss.\n\nL2MarginLoss()(ŷ, y)\nL2MarginLoss()(ŷ, y, w)\n\nEvaluate the default instance of L2MarginLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(L2MarginLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LPDistLoss","page":"Utilities","title":"MLJBase.LPDistLoss","text":"MLJBase.LPDistLoss\n\nA measure type for lp dist loss, which includes the instance(s): lp_dist_loss.\n\nLPDistLoss()(ŷ, y)\nLPDistLoss()(ŷ, y, w)\n\nEvaluate the default instance of LPDistLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nConstructor signature: LPDistLoss(; P=2)\n\nFor more information, run info(LPDistLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LPLoss","page":"Utilities","title":"MLJBase.LPLoss","text":"MLJBase.LPLoss\n\nA measure type for lp loss, which includes the instance(s): l1, l2.\n\nLPLoss()(ŷ, y)\nLPLoss()(ŷ, y, w)\n\nEvaluate the default instance of LPLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nConstructor signature: LPLoss(p=2). Reports |ŷ[i] - y[i]|^p for every index i.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(LPLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LogCoshLoss","page":"Utilities","title":"MLJBase.LogCoshLoss","text":"MLJBase.LogCoshLoss\n\nA measure type for log cosh loss, which includes the instance(s): log_cosh, log_cosh_loss.\n\nLogCoshLoss()(ŷ, y)\nLogCoshLoss()(ŷ, y, w)\n\nEvaluate the log cosh loss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nReports log(cosh(ŷᵢ - yᵢ)) for each index i. \n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(LogCoshLoss). 
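As a quick illustration of the per-observation regression losses above (a minimal sketch, assuming MLJBase and Statistics are loaded; the values are made up):

using MLJBase, Statistics

y = [1.0, 2.0, 3.0]
ŷ = [1.5, 2.0, 2.0]

l2(ŷ, y)        # per-observation squared errors: [0.25, 0.0, 1.0]
mean(l2(ŷ, y))  # aggregate to a mean squared error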
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LogLoss","page":"Utilities","title":"MLJBase.LogLoss","text":"MLJBase.LogLoss\n\nA measure type for log loss, which includes the instance(s): log_loss, cross_entropy.\n\nLogLoss()(ŷ, y)\nLogLoss()(ŷ, y, w)\n\nEvaluate the default instance of LogLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor details, see LogScore, which differs only by a sign.\n\nRequires scitype(y) to be a subtype of AbtractArray{<:Union{Missing,T} where T is Continuous or Count (for respectively continuous or discrete Distribution.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(LogLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LogScore","page":"Utilities","title":"MLJBase.LogScore","text":"MLJBase.LogScore\n\nA measure type for log score, which includes the instance(s): log_score.\n\nLogScore()(ŷ, y)\nLogScore()(ŷ, y, w)\n\nEvaluate the default instance of LogScore on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nSince the score is undefined in the case that the true observation is predicted to occur with probability zero, probablities are clamped between tol and 1-tol, where tol is a constructor key-word argument.\n\nIf p is the predicted probability mass or density function corresponding to a single ground truth observation η, then the score for that example is\n\nlog(clamp(p(η), tol), 1 - tol)\n\nFor example, for a binary target with \"yes\"/\"no\" labels, and predicted probability of \"yes\" equal to 0.8, an observation of \"no\" scores log(0.2).\n\nThe predictions ŷ should be an array of UnivariateFinite distributions in the case of Finite target y, and otherwise a supported Distributions.UnivariateDistribution such as Normal or Poisson.\n\nSee also LogLoss, which differs only in sign.\n\nRequires scitype(y) to be a subtype of AbtractArray{<:Union{Missing,T} where T is Continuous or Count (for respectively continuous or discrete Distribution.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(LogScore). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LogitDistLoss","page":"Utilities","title":"MLJBase.LogitDistLoss","text":"MLJBase.LogitDistLoss\n\nA measure type for logit dist loss, which includes the instance(s): logit_dist_loss.\n\nLogitDistLoss()(ŷ, y)\nLogitDistLoss()(ŷ, y, w)\n\nEvaluate the default instance of LogitDistLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(LogitDistLoss). 
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.LogitMarginLoss","page":"Utilities","title":"MLJBase.LogitMarginLoss","text":"MLJBase.LogitMarginLoss\n\nA measure type for logit margin loss, which includes the instance(s): logit_margin_loss.\n\nLogitMarginLoss()(ŷ, y)\nLogitMarginLoss()(ŷ, y, w)\n\nEvaluate the default instance of LogitMarginLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(LogitMarginLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.MatthewsCorrelation","page":"Utilities","title":"MLJBase.MatthewsCorrelation","text":"MLJBase.MatthewsCorrelation\n\nA measure type for matthews correlation, which includes the instance(s): matthews_correlation, mcc.\n\nMatthewsCorrelation()(ŷ, y)\n\nEvaluate the matthews correlation on predictions ŷ, given ground truth observations y. \n\nhttps://en.wikipedia.org/wiki/Matthewscorrelationcoefficient This metric is invariant to class reordering.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(MatthewsCorrelation). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.MeanAbsoluteError","page":"Utilities","title":"MLJBase.MeanAbsoluteError","text":"MLJBase.MeanAbsoluteError\n\nA measure type for mean absolute error, which includes the instance(s): mae, mav, mean_absolute_error, mean_absolute_value.\n\nMeanAbsoluteError()(ŷ, y)\nMeanAbsoluteError()(ŷ, y, w)\n\nEvaluate the mean absolute error on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\ntextmean absolute error = n^-1ᵢyᵢ-ŷᵢ or textmean absolute error = n^-1ᵢwᵢyᵢ-ŷᵢ\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(MeanAbsoluteError). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.MeanAbsoluteProportionalError","page":"Utilities","title":"MLJBase.MeanAbsoluteProportionalError","text":"MLJBase.MeanAbsoluteProportionalError\n\nA measure type for mean absolute proportional error, which includes the instance(s): mape.\n\nMeanAbsoluteProportionalError()(ŷ, y)\nMeanAbsoluteProportionalError()(ŷ, y, w)\n\nEvaluate the default instance of MeanAbsoluteProportionalError on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nConstructor key-word arguments: tol (default = eps()).\n\ntextmean absolute proportional error = m^-1ᵢ(yᵢ-yᵢ) over yᵢ\n\nwhere the sum is over indices such that abs(yᵢ) > tol and m is the number of such indices.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(MeanAbsoluteProportionalError). 
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.MisclassificationRate","page":"Utilities","title":"MLJBase.MisclassificationRate","text":"MLJBase.MisclassificationRate\n\nA measure type for misclassification rate, which includes the instance(s): misclassification_rate, mcr.\n\nMisclassificationRate()(ŷ, y)\nMisclassificationRate()(ŷ, y, w)\n\nEvaluate the misclassification rate on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nA confusion matrix can also be passed as argument. This metric is invariant to class reordering.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing} (multiclass classification); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(MisclassificationRate). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.ModifiedHuberLoss","page":"Utilities","title":"MLJBase.ModifiedHuberLoss","text":"MLJBase.ModifiedHuberLoss\n\nA measure type for modified huber loss, which includes the instance(s): modified_huber_loss.\n\nModifiedHuberLoss()(ŷ, y)\nModifiedHuberLoss()(ŷ, y, w)\n\nEvaluate the default instance of ModifiedHuberLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(ModifiedHuberLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.NegativePredictiveValue","page":"Utilities","title":"MLJBase.NegativePredictiveValue","text":"MLJBase.NegativePredictiveValue\n\nA measure type for negative predictive value, which includes the instance(s): negative_predictive_value, negativepredictive_value, npv.\n\nNegativePredictiveValue()(ŷ, y)\n\nEvaluate the default instance of NegativePredictiveValue on predictions ŷ, given ground truth observations y. \n\nAssigns false to first element of levels(y). To reverse roles, use NegativePredictiveValue(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" effects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(NegativePredictiveValue). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.PerceptronLoss","page":"Utilities","title":"MLJBase.PerceptronLoss","text":"MLJBase.PerceptronLoss\n\nA measure type for perceptron loss, which includes the instance(s): perceptron_loss.\n\nPerceptronLoss()(ŷ, y)\nPerceptronLoss()(ŷ, y, w)\n\nEvaluate the default instance of PerceptronLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(PerceptronLoss). 
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.PeriodicLoss","page":"Utilities","title":"MLJBase.PeriodicLoss","text":"MLJBase.PeriodicLoss\n\nA measure type for periodic loss, which includes the instance(s): periodic_loss.\n\nPeriodicLoss()(ŷ, y)\nPeriodicLoss()(ŷ, y, w)\n\nEvaluate the default instance of PeriodicLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(PeriodicLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.Precision","page":"Utilities","title":"MLJBase.Precision","text":"MLJBase.Precision\n\nA measure type for precision (a.k.a. positive predictive value), which includes the instance(s): positive_predictive_value, ppv, positivepredictive_value, precision.\n\nPrecision()(ŷ, y)\n\nEvaluate the default instance of Precision on predictions ŷ, given ground truth observations y. \n\nAssigns false to first element of levels(y). To reverse roles, use Precision(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" effects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(Precision). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.QuantileLoss","page":"Utilities","title":"MLJBase.QuantileLoss","text":"MLJBase.QuantileLoss\n\nA measure type for quantile loss, which includes the instance(s): quantile_loss.\n\nQuantileLoss()(ŷ, y)\nQuantileLoss()(ŷ, y, w)\n\nEvaluate the default instance of QuantileLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y)).\n\nRequires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions. \n\nConstructor signature: QuantileLoss(; τ=0.7)\n\nFor more information, run info(QuantileLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.RSquared","page":"Utilities","title":"MLJBase.RSquared","text":"MLJBase.RSquared\n\nA measure type for r squared, which includes the instance(s): rsq, rsquared.\n\nRSquared()(ŷ, y)\n\nEvaluate the r squared on predictions ŷ, given ground truth observations y. \n\nThe R² (also known as R-squared or coefficient of determination) is suitable for interpreting linear regression analysis (Chicco et al., 2021).\n\nLet overliney denote the mean of y, then\n\ntextR^2 = 1 - frac (haty - y)^2 overliney - y)^2\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(RSquared). 
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.RootMeanSquaredError","page":"Utilities","title":"MLJBase.RootMeanSquaredError","text":"MLJBase.RootMeanSquaredError\n\nA measure type for root mean squared error, which includes the instance(s): rms, rmse, root_mean_squared_error.\n\nRootMeanSquaredError()(ŷ, y)\nRootMeanSquaredError()(ŷ, y, w)\n\nEvaluate the root mean squared error on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\ntextroot mean squared error = sqrtn^-1ᵢyᵢ-yᵢ^2 or textroot mean squared error = sqrtfracᵢwᵢyᵢ-yᵢ^2ᵢwᵢ\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nFor more information, run info(RootMeanSquaredError). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.RootMeanSquaredLogError","page":"Utilities","title":"MLJBase.RootMeanSquaredLogError","text":"MLJBase.RootMeanSquaredLogError\n\nA measure type for root mean squared log error, which includes the instance(s): rmsl, rmsle, root_mean_squared_log_error.\n\nRootMeanSquaredLogError()(ŷ, y)\nRootMeanSquaredLogError()(ŷ, y, w)\n\nEvaluate the root mean squared log error on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\ntextroot mean squared log error = sqrtn^-1ᵢlogleft(yᵢ over yᵢright)^2\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nSee also rmslp1.\n\nFor more information, run info(RootMeanSquaredLogError). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.RootMeanSquaredLogProportionalError","page":"Utilities","title":"MLJBase.RootMeanSquaredLogProportionalError","text":"MLJBase.RootMeanSquaredLogProportionalError\n\nA measure type for root mean squared log proportional error, which includes the instance(s): rmslp1.\n\nRootMeanSquaredLogProportionalError()(ŷ, y)\nRootMeanSquaredLogProportionalError()(ŷ, y, w)\n\nEvaluate the default instance of RootMeanSquaredLogProportionalError on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nConstructor signature: RootMeanSquaredLogProportionalError(; offset = 1.0).\n\ntextroot mean squared log proportional error = sqrtn^-1ᵢlogleft(yᵢ + textoffset over yᵢ + textoffsetright)\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. \n\nSee also rmsl. \n\nFor more information, run info(RootMeanSquaredLogProportionalError). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.RootMeanSquaredProportionalError","page":"Utilities","title":"MLJBase.RootMeanSquaredProportionalError","text":"MLJBase.RootMeanSquaredProportionalError\n\nA measure type for root mean squared proportional error, which includes the instance(s): rmsp.\n\nRootMeanSquaredProportionalError()(ŷ, y)\nRootMeanSquaredProportionalError()(ŷ, y, w)\n\nEvaluate the default instance of RootMeanSquaredProportionalError on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nConstructor keyword arguments: tol (default = eps()).\n\ntextroot mean squared proportional error = sqrtm^-1ᵢ left(yᵢ-yᵢ over yᵢright)^2\n\nwhere the sum is over indices such that abs(yᵢ) > tol and m is the number of such indices.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions. 
\n\nFor more information, run info(RootMeanSquaredProportionalError). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.SigmoidLoss","page":"Utilities","title":"MLJBase.SigmoidLoss","text":"MLJBase.SigmoidLoss\n\nA measure type for sigmoid loss, which includes the instance(s): sigmoid_loss.\n\nSigmoidLoss()(ŷ, y)\nSigmoidLoss()(ŷ, y, w)\n\nEvaluate the default instance of SigmoidLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(SigmoidLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.SmoothedL1HingeLoss","page":"Utilities","title":"MLJBase.SmoothedL1HingeLoss","text":"MLJBase.SmoothedL1HingeLoss\n\nA measure type for smoothed l1 hinge loss, which includes the instance(s): smoothed_l1_hinge_loss.\n\nSmoothedL1HingeLoss()(ŷ, y)\nSmoothedL1HingeLoss()(ŷ, y, w)\n\nEvaluate the default instance of SmoothedL1HingeLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nConstructor signature: SmoothedL1HingeLoss(; gamma=1.0)\n\nFor more information, run info(SmoothedL1HingeLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.SphericalScore","page":"Utilities","title":"MLJBase.SphericalScore","text":"MLJBase.SphericalScore\n\nA measure type for Spherical score, which includes the instance(s): spherical_score.\n\nSphericalScore()(ŷ, y)\nSphericalScore()(ŷ, y, w)\n\nEvaluate the default instance of SphericalScore on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w. \n\nConvention as in Gneiting and Raftery (2007), \"Strictly Proper Scoring Rules, Prediction, and Estimation\": If the target η takes values in a finite set of classes C and p(η) is the predicted probability for a single observation η, then the corresponding score for that observation is given by\n\np(y)^(α-1) / (∑_{η ∈ C} p(η)^α)^((α-1)/α)\n\nwhere α is the measure parameter alpha.\n\nIn the case the predictions ŷ are continuous probability distributions, such as Distributions.Normal, replace the above sum with an integral, and interpret p as the probability density function. In the case of discrete distributions over the integers, such as Distributions.Poisson, sum over all integers instead of C.\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ), or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(SphericalScore). 
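For instance (a minimal sketch, assuming MLJBase and CategoricalArrays are loaded; the probabilities are made up):

using MLJBase, CategoricalArrays

y = categorical(["no", "yes"])
ŷ = UnivariateFinite(levels(y), [0.3 0.7; 0.6 0.4], pool=y)

spherical_score(ŷ, y)  # per-observation scores (bigger is better)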
\n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.TrueNegative","page":"Utilities","title":"MLJBase.TrueNegative","text":"MLJBase.TrueNegative\n\nA measure type for number of true negatives, which includes the instance(s): true_negative, truenegative.\n\nTrueNegative()(ŷ, y)\n\nEvaluate the default instance of TrueNegative on predictions ŷ, given ground truth observations y. \n\nAssigns false to first element of levels(y). To reverse roles, use TrueNegative(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" effects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(TrueNegative). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.TrueNegativeRate","page":"Utilities","title":"MLJBase.TrueNegativeRate","text":"MLJBase.TrueNegativeRate\n\nA measure type for true negative rate, which includes the instance(s): true_negative_rate, truenegative_rate, tnr, specificity, selectivity.\n\nTrueNegativeRate()(ŷ, y)\n\nEvaluate the default instance of TrueNegativeRate on predictions ŷ, given ground truth observations y. \n\nAssigns false to first element of levels(y). To reverse roles, use TrueNegativeRate(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" effects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(TrueNegativeRate). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.TruePositive","page":"Utilities","title":"MLJBase.TruePositive","text":"MLJBase.TruePositive\n\nA measure type for number of true positives, which includes the instance(s): true_positive, truepositive.\n\nTruePositive()(ŷ, y)\n\nEvaluate the default instance of TruePositive on predictions ŷ, given ground truth observations y. \n\nAssigns false to first element of levels(y). To reverse roles, use TruePositive(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" effects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(TruePositive). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.TruePositiveRate","page":"Utilities","title":"MLJBase.TruePositiveRate","text":"MLJBase.TruePositiveRate\n\nA measure type for true positive rate (a.k.a recall), which includes the instance(s): true_positive_rate, truepositive_rate, tpr, sensitivity, recall, hit_rate.\n\nTruePositiveRate()(ŷ, y)\n\nEvaluate the default instance of TruePositiveRate on predictions ŷ, given ground truth observations y. \n\nAssigns false to first element of levels(y). To reverse roles, use TruePositiveRate(rev=true).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where choice of \"true\" effects the measure); ŷ must be an array of deterministic predictions. \n\nFor more information, run info(TruePositiveRate). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase.ZeroOneLoss","page":"Utilities","title":"MLJBase.ZeroOneLoss","text":"MLJBase.ZeroOneLoss\n\nA measure type for zero one loss, which includes the instance(s): zero_one_loss.\n\nZeroOneLoss()(ŷ, y)\nZeroOneLoss()(ŷ, y, w)\n\nEvaluate the default instance of ZeroOneLoss on predictions ŷ, given ground truth observations y. 
Optionally specify per-sample weights, w. \n\nFor more detail, see the original LossFunctions.jl documentation but note differences in the signature.\n\nLosses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).\n\nRequires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions. \n\nFor more information, run info(ZeroOneLoss). \n\n\n\n\n\n","category":"type"},{"location":"utilities/#MLJBase._permute_rows-Tuple{AbstractVecOrMat, Vector{Int64}}","page":"Utilities","title":"MLJBase._permute_rows","text":"_permute_rows(obj, perm)\n\nInternal function to return a vector or matrix with permuted rows given the permutation perm.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.available_name-Tuple{Any, Any}","page":"Utilities","title":"MLJBase.available_name","text":"available_name(modl::Module, name::Symbol)\n\nFunction to replace, if necessary, a given name with a modified one that ensures it is not the name of any existing object in the global scope of modl. Modifications are created with numerical suffixes.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.check_dimensions-Tuple{Any, Any}","page":"Utilities","title":"MLJBase.check_dimensions","text":"check_dimensions(X, Y)\n\nInternal function to check two arrays have the same shape.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.check_same_nrows-Tuple{Any, Any}","page":"Utilities","title":"MLJBase.check_same_nrows","text":"check_same_nrows(X, Y)\n\nInternal function to check two objects, each a vector or a matrix, have the same number of rows.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.chunks-Tuple{AbstractRange, Integer}","page":"Utilities","title":"MLJBase.chunks","text":"chunks(range, n)\n\nSplit an AbstractRange into n subranges of approximately equal length.\n\nExample\n\njulia> collect(chunks(1:5, 2))\n2-element Array{UnitRange{Int64},1}:\n 1:3\n 4:5\n\nPrivate method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.flat_values-Tuple{NamedTuple}","page":"Utilities","title":"MLJBase.flat_values","text":"flat_values(t::NamedTuple)\n\nView a nested named tuple t as a tree and return, as a tuple, the values at the leaves, in the order they appear in the original tuple.\n\njulia> t = (X = (x = 1, y = 2), Y = 3)\njulia> flat_values(t)\n(1, 2, 3)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.generate_name!-Tuple{DataType, Any}","page":"Utilities","title":"MLJBase.generate_name!","text":"generate_name!(M, existing_names; only=Union{Function,Type}, substitute=:f)\n\nGiven a type M (e.g., MyEvenInteger{N}) return a symbolic, snake-case representation of the type name (such as my_even_integer). The symbol is pushed to existing_names, which must be an AbstractVector to which a Symbol can be pushed.\n\nIf the snake-case representation already exists in existing_names, a suitable integer is appended to the name.\n\nIf only is specified, then the operation is restricted to those M for which M isa only. 
In all other cases the symbolic name is generated using substitute as the base symbol.\n\njulia> existing_names = [];\n\njulia> generate_name!(Vector{Int}, existing_names)\n:vector\n\njulia> generate_name!(Vector{Int}, existing_names)\n:vector2\n\njulia> generate_name!(AbstractFloat, existing_names)\n:abstract_float\n\njulia> generate_name!(Int, existing_names, only=Array, substitute=:not_array)\n:not_array\n\njulia> generate_name!(Int, existing_names, only=Array, substitute=:not_array)\n:not_array2\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.init_rng-Tuple{Any}","page":"Utilities","title":"MLJBase.init_rng","text":"init_rng(rng)\n\nCreate an AbstractRNG from rng. If rng is a non-negative Integer, it returns a MersenneTwister random number generator seeded with rng; if rng is an AbstractRNG object, it returns rng; otherwise it throws an error.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.measures_for_export-Tuple{}","page":"Utilities","title":"MLJBase.measures_for_export","text":"measures_for_export()\n\nReturn a list of the symbolic representation of all:\n\nmeasure types (subtypes of Aggregated and Unaggregated)\nmeasure type aliases (as defined by the constant MLJBase.MEASURE_TYPE_ALIASES)\nall built-in measure instances (as declared by the instances trait)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.metadata_measure-Tuple{Any}","page":"Utilities","title":"MLJBase.metadata_measure","text":"metadata_measure(T; kw...)\n\nHelper function to write the metadata (trait definitions) for a single measure.\n\nCompulsory keyword arguments\n\ntarget_scitype: The allowed scientific type of y in measure(ŷ, y, ...). This is typically some abstract array. E.g., in single-target regression this is typically AbstractArray{<:Union{Missing,Continuous}}. For a binary classification metric insensitive to class order, this would typically be Union{AbstractArray{<:Union{Missing,Multiclass{2}}}, AbstractArray{<:Union{Missing,OrderedFactor{2}}}}, which has the alias FiniteArrMissing.\norientation: Orientation of the measure. Use :loss when lower is better and :score when higher is better. For example, set :loss for root mean square and :score for area under the ROC curve.\nprediction_type: Refers to ŷ in measure(ŷ, y, ...) and should be one of: :deterministic (ŷ has same type as y), :probabilistic or :interval.\n\nOptional keyword arguments\n\nThe following have meaningful defaults but may still require overloading:\n\ninstances: A vector of strings naming the built-in instances of the measurement type provided by the implementation, which are usually just common aliases for the default instance. E.g., RSquared has instances = [\"rsq\", \"rsquared\"], both defined as RSquared() in the implementation. MulticlassFScore has instances = [\"macro_f1score\", \"micro_f1score\", \"multiclass_f1score\"], where micro_f1score = MulticlassFScore(average=micro_avg), etc. Default is String[].\naggregation: Aggregation method for measurements, typically Mean() (for, e.g., mean absolute error) or Sum() (for number of true positives). Default is Mean(). Must subtype StatisticalTraits.AggregationMode. It is used to:\naggregate measurements in resampling (e.g., cross-validation)\naggregate per-observation measurements returned by single in the fallback definition of call for Unaggregated measures (such as area under the ROC curve).\nsupports_weights: Whether the measure can be called with per-observation weights w, as in l2(ŷ, y, w). 
Default is true.\nsupports_class_weights: Whether the measure can be called with a class weight dictionary w, as in micro_f1score(ŷ, y, w). Default is false.\nhuman_name: Ordinary name of measure. Used in the full auto-generated docstring, which begins \"A measure type for human_name ...\". E.g., the human_name for TruePositive is \"number of true positives\". Default is the snake-case version of the type name, with underscores replaced by spaces; so MeanAbsoluteError becomes \"mean absolute error\".\ndocstring: An abbreviated docstring, displayed by info(measure). Fallback uses human_name and lists the instances.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.prepend-Tuple{Symbol, Nothing}","page":"Utilities","title":"MLJBase.prepend","text":"MLJBase.prepend(::Symbol, ::Union{Symbol,Expr,Nothing})\n\nFor prepending symbols in expressions like :(y.w) and :(x1.x2.x3).\n\njulia> prepend(:x, :y)\n:(x.y)\n\njulia> prepend(:x, :(y.z))\n:(x.y.z)\n\njulia> prepend(:w, ans)\n:(w.x.y.z)\n\nIf the second argument is nothing, then nothing is returned.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.recursive_getproperty-Tuple{Any, Symbol}","page":"Utilities","title":"MLJBase.recursive_getproperty","text":"recursive_getproperty(object, nested_name::Expr)\n\nCall getproperty recursively on object to extract the value of some nested property, as in the following example:\n\njulia> object = (X = (x = 1, y = 2), Y = 3)\njulia> recursive_getproperty(object, :(X.y))\n2\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.recursive_setproperty!-Tuple{Any, Symbol, Any}","page":"Utilities","title":"MLJBase.recursive_setproperty!","text":"recursive_setproperty!(object, nested_name::Expr, value)\n\nSet a nested property of an object to value, as in the following example:\n\njulia> mutable struct Foo\n X\n Y\n end\n\njulia> mutable struct Bar\n x\n y\n end\n\njulia> object = Foo(Bar(1, 2), 3)\nFoo(Bar(1, 2), 3)\n\njulia> recursive_setproperty!(object, :(X.y), 42)\n42\n\njulia> object\nFoo(Bar(1, 42), 3)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.sequence_string-Union{Tuple{Itr}, Tuple{Itr, Any}} where Itr","page":"Utilities","title":"MLJBase.sequence_string","text":"sequence_string(itr, n=3)\n\nReturn a \"sequence\" string from the first n elements generated by itr.\n\njulia> MLJBase.sequence_string(1:10, 4)\n\"1, 2, 3, 4, ...\"\n\nPrivate method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.shuffle_rows-Tuple{AbstractVecOrMat, AbstractVecOrMat}","page":"Utilities","title":"MLJBase.shuffle_rows","text":"shuffle_rows(X::AbstractVecOrMat,\n Y::AbstractVecOrMat;\n rng::AbstractRNG=Random.GLOBAL_RNG)\n\nReturn row-shuffled versions of X and Y, using a common random permutation of their rows. An optional random number generator can be specified using the rng argument.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.unwind-Tuple","page":"Utilities","title":"MLJBase.unwind","text":"unwind(iterators...)\n\nRepresent all possible combinations of values generated by iterators as rows of a matrix A. In more detail, A has one column for each iterator in iterators and one row for each distinct possible combination of values taken on by the iterators. 
Elements in the first column cycle fastest, those in the last column slowest.\n\nExample\n\njulia> iterators = ([1, 2], [\"a\",\"b\"], [\"x\", \"y\", \"z\"]);\njulia> MLJBase.unwind(iterators...)\n12×3 Array{Any,2}:\n 1 \"a\" \"x\"\n 2 \"a\" \"x\"\n 1 \"b\" \"x\"\n 2 \"b\" \"x\"\n 1 \"a\" \"y\"\n 2 \"a\" \"y\"\n 1 \"b\" \"y\"\n 2 \"b\" \"y\"\n 1 \"a\" \"z\"\n 2 \"a\" \"z\"\n 1 \"b\" \"z\"\n 2 \"b\" \"z\"\n\n\n\n\n\n","category":"method"},{"location":"resampling/#Resampling","page":"Resampling","title":"Resampling","text":"","category":"section"},{"location":"resampling/","page":"Resampling","title":"Resampling","text":"Modules = [MLJBase]\nPages = [\"resampling.jl\"]","category":"page"},{"location":"resampling/#MLJBase.CV","page":"Resampling","title":"MLJBase.CV","text":"cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning.\n\ntrain_test_pairs(cv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.Holdout","page":"Resampling","title":"MLJBase.Holdout","text":"holdout = Holdout(; fraction_train=0.7,\n shuffle=nothing,\n rng=nothing)\n\nHoldout resampling strategy, for use in evaluate!, evaluate and in tuning.\n\ntrain_test_pairs(holdout, rows)\n\nReturns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximately equal to fraction_train.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.PerformanceEvaluation","page":"Resampling","title":"MLJBase.PerformanceEvaluation","text":"PerformanceEvaluation\n\nType of object returned by evaluate (for models plus data) or evaluate! (for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model.\n\nWhen evaluate/evaluate! is called, a number of train/test pairs (\"folds\") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. 
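For illustration, a minimal sketch (assuming a supervised model model, data X, y, and the built-in measure l2 are already in scope):\n\ne = evaluate(model, X, y, resampling=CV(nfolds=5), measure=l2)\ne isa PerformanceEvaluation # true\n\n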
The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.\n\nWhen displayed, a PerformanceEvaluation object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).\n\nFields\n\nThese fields are part of the public API of the PerformanceEvaluation struct.\n\nmodel: model used to create the performance evaluation. In the case of a tuning model, this is the best model found.\nmeasure: vector of measures (metrics) used to evaluate performance\nmeasurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is aggregation(m) (commonly Mean or Sum)\noperation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.\nper_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.\nper_observation: a vector of vectors of individual observation evaluations of those measures for which reports_each_observation(measure) is true, and which is otherwise reported missing. Useful for some forms of hyper-parameter optimization.\nfitted_params_per_fold: a vector containing fitted_params(mach) for each machine mach trained during resampling - one machine per train/test pair. Use this to extract the learned parameters for each individual training event.\nreport_per_fold: a vector containing report(mach) for each machine mach trained during resampling - one machine per train/test pair.\ntrain_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.\nresampling: the resampling strategy used to generate the train/test pairs.\nrepeats: the number of times the resampling strategy was repeated.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.Resampler","page":"Resampling","title":"MLJBase.Resampler","text":"resampler = Resampler(\n model=ConstantRegressor(),\n resampling=CV(),\n measure=nothing,\n weights=nothing,\n class_weights=nothing,\n operation=predict,\n repeats = 1,\n acceleration=default_resource(),\n check_measure=true,\n logger=nothing\n)\n\nResampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See evaluate! for options. Not intended for general use.\n\nGiven a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).\n\nOn subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. 
The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).\n\nIf there is a single train/test pair, then the warm-restart behaviour of the wrapped model resampler.model will extend to warm-restart behaviour of the wrapper resampler, with respect to mutations of the wrapped model.\n\nThe sample weights are passed to the specified performance measures that support weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.\n\nThe sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.StratifiedCV","page":"Resampling","title":"MLJBase.StratifiedCV","text":"stratified_cv = StratifiedCV(; nfolds=6,\n shuffle=false,\n rng=Random.GLOBAL_RNG)\n\nStratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).\n\ntrain_test_pairs(stratified_cv, rows, y)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.\n\nUnlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.\n\nThe stratified train_test_pairs algorithm is invariant to label renaming. For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifiedCV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.TimeSeriesCV","page":"Resampling","title":"MLJBase.TimeSeriesCV","text":"tscv = TimeSeriesCV(; nfolds=4)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.\n\ntrain_test_pairs(tscv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. 
The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.\n\nThe first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.\n\nExamples\n\njulia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)\n3-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:\n (1:4, 5:6)\n (1:6, 7:8)\n (1:8, 9:10)\n\njulia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)();\n\njulia> data = @load_sunspots;\n\njulia> X = (lag1 = data.sunspot_number[2:end-1],\n lag2 = data.sunspot_number[1:end-2]);\n\njulia> y = data.sunspot_number[3:end];\n\njulia> tscv = TimeSeriesCV(nfolds=3);\n\njulia> evaluate(model, X, y, resampling=tscv, measure=rmse, verbosity=0)\n┌───────────────────────────┬───────────────┬────────────────────┐\n│ _.measure │ _.measurement │ _.per_fold │\n├───────────────────────────┼───────────────┼────────────────────┤\n│ RootMeanSquaredError @753 │ 21.7 │ [25.4, 16.3, 22.4] │\n└───────────────────────────┴───────────────┴────────────────────┘\n_.per_observation = [missing]\n_.fitted_params_per_fold = [ … ]\n_.report_per_fold = [ … ]\n_.train_test_rows = [ … ]\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.evaluate!-Tuple{Machine{<:Union{Annotator, Supervised}}}","page":"Resampling","title":"MLJBase.evaluate!","text":"evaluate!(mach,\n resampling=CV(),\n measure=nothing,\n rows=nothing,\n weights=nothing,\n class_weights=nothing,\n operation=nothing,\n repeats=1,\n acceleration=default_resource(),\n force=false,\n verbosity=1,\n check_measure=true,\n logger=nothing)\n\nEstimate the performance of a machine mach wrapping a supervised model in data, using the specified resampling strategy (defaulting to 6-fold cross-validation) and measure, which can be a single measure or vector.\n\nDo subtypes(MLJ.ResamplingStrategy) to obtain a list of available resampling strategies. If resampling is not an object of type MLJ.ResamplingStrategy, then a vector of tuples (of the form (train_rows, test_rows)) is expected. For example, setting\n\nresampling = [((1:100), (101:200)),\n ((101:200), (1:100))]\n\ngives two-fold cross-validation using the first 200 rows of data.\n\nThe type of operation (predict, predict_mode, etc) to be associated with measure is automatically inferred from measure traits where possible. For example, predict_mode will be used for a Multiclass target, if model is probabilistic but measure is deterministic. The operations applied can be inspected from the operation field of the object returned. Alternatively, operations can be explicitly specified using operation=.... If measure is a vector, then operation must be a single operation, which will be associated with all measures, or a vector of the same length as measure.\n\nThe resampling strategy is applied repeatedly (Monte Carlo resampling) if repeats > 1. For example, if repeats = 10 and resampling = CV(nfolds=5, shuffle=true), then a total of 50 (train, test) pairs are generated for evaluation and subsequent aggregation.\n\nIf resampling isa MLJ.ResamplingStrategy then one may optionally restrict the data used in evaluation by specifying rows.\n\nAn optional weights vector may be passed for measures that support sample weights (MLJ.supports_weights(measure) == true), which is ignored by those that don't. 
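For example (a sketch, assuming mach wraps a supervised model bound to data with n rows, and w is a length-n vector of Real weights):\n\nevaluate!(mach, resampling=CV(nfolds=3), measure=l2, weights=w)\n\n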
These weights are not to be confused with any weights w bound to mach (as in mach = machine(model, X, y, w)). To pass these to the performance evaluation measures you must explicitly specify weights=w in the evaluate! call.\n\nAdditionally, an optional class_weights dictionary may be passed for measures that support class weights (MLJ.supports_class_weights(measure) == true), which is ignored by those that don't. These weights are not to be confused with any weights class_w bound to mach (as in mach = machine(model, X, y, class_w)). To pass these to the performance evaluation measures you must explicitly specify class_weights=class_w in the evaluate! call.\n\nUser-defined measures are supported; see the manual for details.\n\nIf no measure is specified, then default_measure(mach.model) is used, unless this default is nothing, in which case an error is thrown.\n\nThe acceleration keyword argument is used to specify the compute resource (a subtype of ComputationalResources.AbstractResource) that will be used to accelerate/parallelize the resampling operation.\n\nAlthough evaluate! is mutating, mach.model and mach.args are untouched.\n\nSummary of key-word arguments\n\nresampling - resampling strategy (default is CV(nfolds=6))\nmeasure/measures - measure or vector of measures (losses, scores, etc)\nrows - vector of observation indices from which both train and test folds are constructed (default is all observations)\nweights - per-sample weights for measures that support them (not to be confused with weights used in training)\nclass_weights - dictionary of per-class weights for use with measures that support these, in classification problems (not to be confused with per-sample weights or with class weights used in training)\noperation/operations - One of predict, predict_mean, predict_mode, predict_median, or predict_joint, or a vector of these of the same length as measure/measures. Automatically inferred if left unspecified.\nrepeats - default is 1; set to a higher value for repeated (Monte Carlo) resampling\nacceleration - parallelization option; currently supported options are instances of CPU1 (single-threaded computation), CPUThreads (multi-threaded computation) and CPUProcesses (multi-process computation); default is default_resource().\nforce - default is false; set to true to force a cold restart of each training event\nverbosity - verbosity level; an integer defaulting to 1\ncheck_measure - default is true\nlogger - a logger object (see MLJBase.log_evaluation)\n\nReturn value\n\nA PerformanceEvaluation object. See PerformanceEvaluation for details.\n\n\n\n\n\n","category":"method"},{"location":"resampling/#MLJBase.log_evaluation-Tuple{Any, Any}","page":"Resampling","title":"MLJBase.log_evaluation","text":"log_evaluation(logger, performance_evaluation)\n\nLog a performance evaluation to logger, an object specific to some logging platform, such as mlflow. If logger=nothing then no logging is performed. The method is called at the end of every call to evaluate/evaluate! using the logger provided by the logger keyword argument.\n\nImplementations for new logging platforms\n\n\n\nJulia interfaces to workflow logging platforms, such as mlflow (provided by the MLFlowClient.jl interface) should overload log_evaluation(logger::LoggerType, performance_evaluation), where LoggerType is a platform-specific type for logger objects. 
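A minimal sketch of such an overload (MyLogger is a hypothetical logger type, not part of MLJBase):\n\nstruct MyLogger\n io::IO\nend\n\nfunction MLJBase.log_evaluation(logger::MyLogger, performance_evaluation)\n # for example, record only the aggregated measurements\n println(logger.io, \"measurements: \", performance_evaluation.measurement)\nend\n\n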
For an example, see the implementation provided by the MLJFlow.jl package.\n\n\n\n\n\n","category":"method"},{"location":"resampling/#MLJModelInterface.evaluate-Tuple{Union{Annotator, Supervised}, Vararg{Any}}","page":"Resampling","title":"MLJModelInterface.evaluate","text":"evaluate(model, data...; cache=true, kw_options...)\n\nEquivalent to evaluate!(machine(model, data..., cache=cache); kw_options...). See the machine version evaluate! for the complete list of options.\n\n\n\n\n\n","category":"method"},{"location":"composition/#Composition","page":"Composition","title":"Composition","text":"","category":"section"},{"location":"composition/#Composites","page":"Composition","title":"Composites","text":"","category":"section"},{"location":"composition/","page":"Composition","title":"Composition","text":"Modules = [MLJBase]\nPages = [\"composition/composites.jl\"]","category":"page"},{"location":"composition/#Networks","page":"Composition","title":"Networks","text":"","category":"section"},{"location":"composition/","page":"Composition","title":"Composition","text":"Modules = [MLJBase]\nPages = [\"composition/networks.jl\"]","category":"page"},{"location":"composition/#Pipelines","page":"Composition","title":"Pipelines","text":"","category":"section"},{"location":"composition/","page":"Composition","title":"Composition","text":"Modules = [MLJBase]\nPages = [\"composition/pipeline_static.jl\", \"composition/pipelines.jl\"]","category":"page"},{"location":"#MLJBase.jl","page":"Home","title":"MLJBase.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"These docs are bare-bones and auto-generated. Complete MLJ documentation is here. ","category":"page"},{"location":"","page":"Home","title":"Home","text":"For MLJBase-specific developer information, see also the README.md file.","category":"page"},{"location":"datasets/#Datasets","page":"Datasets","title":"Datasets","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Pages = [\"data/datasets_synthetic.jl\"]","category":"page"},{"location":"datasets/#Standard-datasets","page":"Datasets","title":"Standard datasets","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"To add a new dataset, assuming it has a header and is at path data/newdataset.csv:","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Start by loading it with CSV:","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"fpath = joinpath(\"datadir\", \"newdataset.csv\")\ndata = CSV.read(fpath, copycols=true,\n categorical=true)","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Load it with DelimitedFiles and Tables","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"data_raw, data_header = readdlm(fpath, ',', header=true)\ndata_table = Tables.table(data_raw; header=Symbol.(vec(data_header)))","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Retrieve the conversions:","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"for (n, st) in zip(names(data), scitype_union.(eachcol(data)))\n println(\":$n=>$st,\")\nend","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Copy and paste the result into a coerce call:","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"data_table = 
coerce(data_table, ...)","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Modules = [MLJBase]\nPages = [\"data/datasets.jl\"]","category":"page"},{"location":"datasets/#MLJBase.load_dataset-Tuple{String, Tuple}","page":"Datasets","title":"MLJBase.load_dataset","text":"load_dataset(fpath, coercions)\n\nLoad one of the standard datasets, such as Boston, assuming the file is a comma-separated file with a header.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.load_sunspots-Tuple{}","page":"Datasets","title":"MLJBase.load_sunspots","text":"Load a well-known sunspot time series (table with one column). https://www.sws.bom.gov.au/Educational/2/3/6\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.@load_ames-Tuple{}","page":"Datasets","title":"MLJBase.@load_ames","text":"Load the full version of the well-known Ames Housing task.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_boston-Tuple{}","page":"Datasets","title":"MLJBase.@load_boston","text":"Load a well-known public regression dataset with Continuous features.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_crabs-Tuple{}","page":"Datasets","title":"MLJBase.@load_crabs","text":"Load a well-known crab classification dataset with nominal features.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_iris-Tuple{}","page":"Datasets","title":"MLJBase.@load_iris","text":"Load a well-known public classification task with nominal features.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_reduced_ames-Tuple{}","page":"Datasets","title":"MLJBase.@load_reduced_ames","text":"Load a reduced version of the well-known Ames Housing task.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_smarket-Tuple{}","page":"Datasets","title":"MLJBase.@load_smarket","text":"Load S&P Stock Market dataset, as used in An Introduction to Statistical Learning with applications in R (https://rdrr.io/cran/ISLR/man/Smarket.html), by Witten et al (2013), Springer-Verlag, New York.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_sunspots-Tuple{}","page":"Datasets","title":"MLJBase.@load_sunspots","text":"Load a well-known sunspot time series (single table with one column).\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#Synthetic-datasets","page":"Datasets","title":"Synthetic datasets","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Modules = [MLJBase]\nPages = [\"data/datasets_synthetic.jl\"]","category":"page"},{"location":"datasets/#MLJBase.x","page":"Datasets","title":"MLJBase.x","text":"finalize_Xy(X, y, shuffle, as_table, eltype, rng; clf)\n\nInternal function to finalize the make_* functions.\n\n\n\n\n\n","category":"constant"},{"location":"datasets/#MLJBase.augment_X-Tuple{Matrix{<:Real}, Bool}","page":"Datasets","title":"MLJBase.augment_X","text":"augment_X(X, fit_intercept)\n\nGiven a matrix X, append a column of ones if fit_intercept is true. 
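For instance (a sketch):\n\nX = rand(4, 2)\nMLJBase.augment_X(X, true) # 4×3 matrix with an added column of ones\nMLJBase.augment_X(X, false) # returns X unchanged\n\n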
See make_regression.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.make_blobs","page":"Datasets","title":"MLJBase.make_blobs","text":"X, y = make_blobs(n=100, p=2; kwargs...)\n\nGenerate Gaussian blobs for clustering and classification problems.\n\nReturn value\n\nBy default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\ncenters=3: either a number of centers or a c x p matrix with c pre-determined centers,\ncluster_std=1.0: the standard deviation(s) of each blob,\ncenter_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. \n\nExample\n\nX, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.make_circles","page":"Datasets","title":"MLJBase.make_circles","text":"X, y = make_circles(n=100; kwargs...)\n\nGenerate n labeled points close to two concentric circles for classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0: standard deviation of the Gaussian noise added to the data,\nfactor=0.8: ratio of the smaller radius over the larger one,\n\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. \n\nExample\n\nX, y = make_circles(100; noise=0.5, factor=0.3)\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.make_moons","page":"Datasets","title":"MLJBase.make_moons","text":" make_moons(n::Int=100; kwargs...)\n\nGenerates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0.1: standard deviation of the Gaussian noise added to the data,\nxshift=1.0: horizontal translation of the second center with respect to the first one.\nyshift=0.3: vertical translation of the second center with respect to the first one. \neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_moons(100; noise=0.5)\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.make_regression","page":"Datasets","title":"MLJBase.make_regression","text":"make_regression(n, p; kwargs...)\n\nGenerate Gaussian input features and a linear response with Gaussian noise, for use with regression models.\n\nReturn value\n\nBy default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.\n\nKeywords\n\nintercept=true: Whether to generate data from a model with intercept.\nn_targets=1: Number of columns in the target.\nsparse=0: Proportion of the generating weight vector that is sparse.\nnoise=0.1: Standard deviation of the Gaussian noise added to the response (target).\noutliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)\nas_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.\neltype=Float64: Element type for X and y. Must subtype AbstractFloat.\nbinary=false: Whether the target should be binarized (via a sigmoid).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\n\nExample\n\nX, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.outlify!-Tuple{Any, Any, Any}","page":"Datasets","title":"MLJBase.outlify!","text":"Add outliers to a portion s of a vector.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.runif_ab-NTuple{5, Any}","page":"Datasets","title":"MLJBase.runif_ab","text":"runif_ab(rng, n, p, a, b)\n\nInternal function to generate n points in [a, b]ᵖ uniformly at random.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.sigmoid-Tuple{Float64}","page":"Datasets","title":"MLJBase.sigmoid","text":"sigmoid(x)\n\nReturn the sigmoid computed in a numerically stable way:\n\nσ(x) = 1/(1 + exp(-x))\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.sparsify!-Tuple{Any, Any, Any}","page":"Datasets","title":"MLJBase.sparsify!","text":"sparsify!(rng, θ, s)\n\nMake portion s of vector θ exactly 0.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#Utility-functions","page":"Datasets","title":"Utility functions","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Modules = [MLJBase]\nPages = [\"data/data.jl\"]","category":"page"},{"location":"datasets/#MLJBase.complement-Tuple{Any, Any}","page":"Datasets","title":"MLJBase.complement","text":"complement(folds, i)\n\nThe complement of the ith fold of folds in the concatenation of all elements of folds. 
Here folds is a vector or tuple of integer vectors, typically representing row indices of a vector, matrix or table.\n\ncomplement(([1,2], [3,], [4, 5]), 2) # [1, 2, 4, 5]\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.corestrict-Union{Tuple{N}, Tuple{Tuple{Vararg{T, N}} where T, Any}} where N","page":"Datasets","title":"MLJBase.corestrict","text":"corestrict(X, folds, i)\n\nThe restriction of X, a vector, matrix or table, to the complement of the ith fold of folds, where folds is a tuple of vectors of row indices.\n\nThe method is curried, so that corestrict(folds, i) is the operator on data defined by corestrict(folds, i)(X) = corestrict(X, folds, i).\n\nExample\n\nfolds = ([1, 2], [3, 4, 5], [6,])\ncorestrict([:x1, :x2, :x3, :x4, :x5, :x6], folds, 2) # [:x1, :x2, :x6]\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.partition-Tuple{Any, Vararg{Real}}","page":"Datasets","title":"MLJBase.partition","text":"partition(X, fractions...;\n shuffle=nothing,\n rng=Random.GLOBAL_RNG,\n stratify=nothing,\n multi=false)\n\nSplits the vector, matrix or table X into a tuple of objects of the same type, whose vertical concatenation is X. The number of rows in each component of the return value is determined by the corresponding fractions of nrows(X), where valid fractions are floats between 0 and 1 whose sum is less than one. The last fraction is not provided, as it is inferred from the preceding ones.\n\nFor \"synchronized\" partitioning of multiple objects, use the multi=true option described below.\n\njulia> partition(1:1000, 0.8)\n([1,...,800], [801,...,1000])\n\njulia> partition(1:1000, 0.2, 0.7)\n([1,...,200], [201,...,900], [901,...,1000])\n\njulia> partition(reshape(1:10, 5, 2), 0.2, 0.4)\n([1 6], [2 7; 3 8], [4 9; 5 10])\n\nX, y = make_blobs() # a table and vector\nXtrain, Xtest = partition(X, 0.8, stratify=y)\n\n(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)\n\nKeywords\n\nshuffle=nothing: if set to true, shuffles the rows before taking fractions.\nrng=Random.GLOBAL_RNG: specifies the random number generator to be used; can be an integer seed. If rng is specified and shuffle === nothing, then shuffle is interpreted as true.\nstratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.\nmulti=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.restrict-Union{Tuple{N}, Tuple{Tuple{Vararg{T, N}} where T, Any}} where N","page":"Datasets","title":"MLJBase.restrict","text":"restrict(X, folds, i)\n\nThe restriction of X, a vector, matrix or table, to the ith fold of folds, where folds is a tuple of vectors of row indices.\n\nThe method is curried, so that restrict(folds, i) is the operator on data defined by restrict(folds, i)(X) = restrict(X, folds, i).\n\nExample\n\nfolds = ([1, 2], [3, 4, 5], [6,])\nrestrict([:x1, :x2, :x3, :x4, :x5, :x6], folds, 2) # [:x3, :x4, :x5]\n\nSee also corestrict\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.skipinvalid-Tuple{Any}","page":"Datasets","title":"MLJBase.skipinvalid","text":"skipinvalid(itr)\n\nReturn an iterator over the elements in itr skipping missing and NaN values. 
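For example, collect(skipinvalid([1.0, missing, NaN, 2.0])) should return [1.0, 2.0]. 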
Behaviour is similar to skipmissing.\n\nskipinvalid(A, B)\n\nFor vectors A and B of the same length, return a tuple of vectors (A[mask], B[mask]) where mask[i] is true if and only if A[i] and B[i] are both valid (non-missing and non-NaN). Can also be called on other iterators of matching length, such as arrays, but always returns a vector. Does not remove Missing from the element types if present in the original iterators.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.unpack-Tuple{Any, Vararg{Any}}","page":"Datasets","title":"MLJBase.unpack","text":"unpack(table, f1, f2, ... fk;\n wrap_singles=false,\n shuffle=false,\n rng::Union{AbstractRNG,Int,Nothing}=nothing,\n coerce_options...)\n\nHorizontally split any Tables.jl compatible table into smaller tables or vectors by making column selections determined by the predicates f1, f2, ..., fk. Selection from the column names is without replacement. A predicate is any object f such that f(name) is true or false for each column name::Symbol of table.\n\nReturns a tuple of tables/vectors with length one greater than the number of supplied predicates, with the last component including all previously unselected columns.\n\njulia> table = DataFrame(x=[1,2], y=['a', 'b'], z=[10.0, 20.0], w=[\"A\", \"B\"])\n2×4 DataFrame\n Row │ x y z w\n │ Int64 Char Float64 String\n─────┼──────────────────────────────\n 1 │ 1 a 10.0 A\n 2 │ 2 b 20.0 B\n\nZ, XY, W = unpack(table, ==(:z), !=(:w))\njulia> Z\n2-element Vector{Float64}:\n 10.0\n 20.0\n\njulia> XY\n2×2 DataFrame\n Row │ x y\n │ Int64 Char\n─────┼─────────────\n 1 │ 1 a\n 2 │ 2 b\n\njulia> W # the column(s) left over\n2-element Vector{String}:\n \"A\"\n \"B\"\n\nWhenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.\n\nIf coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.\n\nIf shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated MersenneTwister. 
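For example, in unpack(table, ==(:z); rng=123) the rows of table are first shuffled using MersenneTwister(123). 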
If rng is specified then shuffle=true is implicit.\n\n\n\n\n\n","category":"method"}] +[{"location":"measures/#Measures","page":"Measures","title":"Measures","text":"","category":"section"},{"location":"measures/#Helper-functions","page":"Measures","title":"Helper functions","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/registry.jl\", \"measures/measures.jl\"]","category":"page"},{"location":"measures/#Continuous-loss-functions","page":"Measures","title":"Continuous loss functions","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/continuous.jl\"]","category":"page"},{"location":"measures/#Confusion-matrix","page":"Measures","title":"Confusion matrix","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/confusion_matrix.jl\"]","category":"page"},{"location":"measures/#Finite-loss-functions","page":"Measures","title":"Finite loss functions","text":"","category":"section"},{"location":"measures/","page":"Measures","title":"Measures","text":"Modules = [MLJBase]\nPages = [\"measures/finite.jl\"]","category":"page"},{"location":"distributions/#Distributions","page":"Distributions","title":"Distributions","text":"","category":"section"},{"location":"distributions/#Univariate-Finite-Distribution","page":"Distributions","title":"Univariate Finite Distribution","text":"","category":"section"},{"location":"distributions/","page":"Distributions","title":"Distributions","text":"Modules = [MLJBase]\nPages = [\"interface/univariate_finite.jl\"]","category":"page"},{"location":"distributions/#hyperparameters","page":"Distributions","title":"hyperparameters","text":"","category":"section"},{"location":"distributions/","page":"Distributions","title":"Distributions","text":"Modules = [MLJBase]\nPages = [\"hyperparam/one_dimensional_range_methods.jl\", \"hyperparam/one_dimensional_ranges.jl\"]","category":"page"},{"location":"distributions/#Distributions.sampler-Union{Tuple{T}, Tuple{NumericRange{T}, Distributions.UnivariateDistribution}} where T","page":"Distributions","title":"Distributions.sampler","text":"sampler(r::NominalRange, probs::AbstractVector{<:Real})\nsampler(r::NominalRange)\nsampler(r::NumericRange{T}, d)\n\nConstruct an object s which can be used to generate random samples from a ParamRange object r (a one-dimensional range) using one of the following calls:\n\nrand(s) # for one sample\nrand(s, n) # for n samples\nrand(rng, s [, n]) # to specify an RNG\n\nThe argument probs can be any probability vector with the same length as r.values. The second sampler method above calls the first with a uniform probs vector.\n\nThe argument d can be either an arbitrary instance of UnivariateDistribution from the Distributions.jl package, or one of the Distributions.jl types for which fit(d, ::NumericRange) is defined. These include: Arcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight, Normal, Gamma, InverseGaussian, Logistic, LogNormal, Cauchy, Gumbel, Laplace, and Poisson; but see the doc-string for Distributions.fit for an up-to-date list.\n\nIf d is an instance, then sampling is from a truncated form of the supplied distribution d, the truncation bounds being r.lower and r.upper (the r.origin and r.unit attributes are ignored). 
For discrete numeric ranges (T <: Integer) the samples are rounded.\n\nIf d is a type then a suitably truncated distribution is automatically generated using Distributions.fit(d, r).\n\nImportant. Values are generated with no regard to r.scale, except in the special case r.scale is a callable object f. In that case, f is applied to all values generated by rand as described above (prior to rounding, in the case of discrete numeric ranges).\n\nExamples\n\nr = range(Char, :letter, values=collect(\"abc\"))\ns = sampler(r, [0.1, 0.2, 0.7])\nsamples = rand(s, 1000);\nStatsBase.countmap(samples)\nDict{Char,Int64} with 3 entries:\n 'a' => 107\n 'b' => 205\n 'c' => 688\n\nr = range(Int, :k, lower=2, upper=6) # numeric but discrete\ns = sampler(r, Normal)\nsamples = rand(s, 1000);\nUnicodePlots.histogram(samples)\n ┌ ┐\n[2.0, 2.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 119\n[2.5, 3.0) ┤ 0\n[3.0, 3.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 296\n[3.5, 4.0) ┤ 0\n[4.0, 4.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 275\n[4.5, 5.0) ┤ 0\n[5.0, 5.5) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 221\n[5.5, 6.0) ┤ 0\n[6.0, 6.5) ┤▇▇▇▇▇▇▇▇▇▇▇ 89\n └ ┘\n\n\n\n\n\n","category":"method"},{"location":"distributions/#MLJBase.iterator-Tuple{Random.AbstractRNG, ParamRange, Vararg{Any}}","page":"Distributions","title":"MLJBase.iterator","text":"iterator([rng, ], r::NominalRange, [,n])\niterator([rng, ], r::NumericRange, n)\n\nReturn an iterator (currently a vector) for a ParamRange object r. In the first case iteration is over all values stored in the range (or just the first n, if n is specified). In the second case, the iteration is over approximately n ordered values, generated as follows:\n\n(i) First, exactly n values are generated between U and L, with a spacing determined by r.scale (uniform if scale=:linear) where U and L are given by the following table:\n\nr.lower r.upper L U\nfinite finite r.lower r.upper\n-Inf finite r.upper - 2r.unit r.upper\nfinite Inf r.lower r.lower + 2r.unit\n-Inf Inf r.origin - r.unit r.origin + r.unit\n\n(ii) If a callable f is provided as scale, then a uniform spacing is always applied in (i) but f is broadcast over the results. (Unlike ordinary scales, this alters the effective range of values generated, instead of just altering the spacing.)\n\n(iii) If r is a discrete numeric range (r isa NumericRange{<:Integer}) then the values are additionally rounded, with any duplicate values removed. Otherwise all the values are used (and there are exactly n of them).\n\n(iv) Finally, if a random number generator rng is specified, then the values are returned in random order (sampling without replacement), and otherwise they are returned in numeric order, or in the order provided to the range constructor, in the case of a NominalRange.\n\n\n\n\n\n","category":"method"},{"location":"distributions/#MLJBase.scale-Tuple{NominalRange}","page":"Distributions","title":"MLJBase.scale","text":"scale(r::ParamRange)\n\nReturn the scale associated with a ParamRange object r. The possible return values are: :none (for a NominalRange), :linear, :log, :log10, :log2, or :custom (if r.scale is a callable object).\n\n\n\n\n\n","category":"method"},{"location":"distributions/#StatsAPI.fit-Union{Tuple{D}, Tuple{Type{D}, NumericRange}} where D<:Distributions.Distribution","page":"Distributions","title":"StatsAPI.fit","text":"Distributions.fit(D, r::MLJBase.NumericRange)\n\nFit and return a distribution d of type D to the one-dimensional range r.\n\nOnly types D in the table below are supported.\n\nThe distribution d is constructed in two stages. 
First, a distribution d0, characterized by the conditions in the second column of the table, is fit to r. Then d0 is truncated between r.lower and r.upper to obtain d.\n\nDistribution type D Characterization of d0\nArcsine, Uniform, Biweight, Cosine, Epanechnikov, SymTriangularDist, Triweight minimum(d) = r.lower, maximum(d) = r.upper\nNormal, Gamma, InverseGaussian, Logistic, LogNormal mean(d) = r.origin, std(d) = r.unit\nCauchy, Gumbel, Laplace, (Normal) Dist.location(d) = r.origin, Dist.scale(d) = r.unit\nPoisson Dist.mean(d) = r.unit\n\nHere Dist = Distributions.\n\n\n\n\n\n","category":"method"},{"location":"distributions/#Base.range-Union{Tuple{D}, Tuple{Union{Model, Type}, Union{Expr, Symbol}}} where D","page":"Distributions","title":"Base.range","text":"r = range(model, :hyper; values=nothing)\n\nDefine a one-dimensional NominalRange object for a field hyper of model. Note that r is not directly iterable but iterator(r) is.\n\nA nested hyperparameter is specified using dot notation. For example, :(atom.max_depth) specifies the max_depth hyperparameter of the submodel model.atom.\n\nr = range(model, :hyper; upper=nothing, lower=nothing,\n scale=nothing, values=nothing)\n\nAssuming values is not specified, define a one-dimensional NumericRange object for a Real field hyper of model. Note that r is not directly iterable, but iterator(r, n) is an iterator of length n (for given resolution n). To generate random elements from r, instead apply rand methods to sampler(r). The supported scales are :linear, :log, :logminus, :log10, :log10minus, :log2, or a callable object.\n\nBy default, the behaviour of the constructed object depends on the type of the value of the hyperparameter :hyper at model at the time of construction. To override this behaviour (for instance if model is not available) specify a type in place of model so the behaviour is determined by the value of the specified type.\n\nA nested hyperparameter is specified using dot notation (see above).\n\nIf scale is unspecified, it is set to :linear, :log, :log10minus, or :linear, according to whether the interval (lower, upper) is bounded, right-unbounded, left-unbounded, or doubly unbounded, respectively. Note upper=Inf and lower=-Inf are allowed.\n\nIf values is specified, the other keyword arguments are ignored and a NominalRange object is returned (see above).\n\nSee also: iterator, sampler\n\n\n\n\n\n","category":"method"},{"location":"distributions/#Utility-functions","page":"Distributions","title":"Utility functions","text":"","category":"section"},{"location":"distributions/","page":"Distributions","title":"Distributions","text":"Modules = [MLJBase]\nPages = [\"distributions.jl\"]","category":"page"},{"location":"utilities/#Utilities","page":"Utilities","title":"Utilities","text":"","category":"section"},{"location":"utilities/#Machines","page":"Utilities","title":"Machines","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"machines.jl\"]","category":"page"},{"location":"utilities/#Base.replace-Union{Tuple{C}, Tuple{Machine{<:Any, C}, Vararg{Pair}}} where C","page":"Utilities","title":"Base.replace","text":"replace(mach::Machine, field1 => value1, field2 => value2, ...)\n\nPrivate method.\n\nReturn a shallow copy of the machine mach with the specified field replacements. Undefined field values are preserved. 
Unspecified fields have identically equal values, with the exception of mach.fit_okay, which is always a new instance Channel{Bool}(1).\n\nThe following example returns a machine with no traces of training data (but also removes any upstream dependencies in a learning network):\n\nreplace(mach, :args => (), :data => (), :resampled_data => (), :cache => nothing)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.age-Tuple{Machine}","page":"Utilities","title":"MLJBase.age","text":"age(mach::Machine)\n\nReturn an integer representing the number of times mach has been trained or updated. For more detail, see the discussion of training logic at fit_only!.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.ancestors-Tuple{Machine}","page":"Utilities","title":"MLJBase.ancestors","text":"ancestors(mach::Machine; self=false)\n\nAll ancestors of mach, including mach if self=true.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.default_scitype_check_level","page":"Utilities","title":"MLJBase.default_scitype_check_level","text":"default_scitype_check_level()\n\nReturn the current global default value for scientific type checking when constructing machines.\n\ndefault_scitype_check_level(i::Integer)\n\nSet the global default value for scientific type checking to i.\n\nThe effect of the scitype_check_level option in calls of the form machine(model, data, scitype_check_level=...) is summarized below:\n\nscitype_check_level Inspect scitypes? If Unknown in scitypes If other scitype mismatch\n0 × \n1 (value at startup) ✓ warning\n2 ✓ warning warning\n3 ✓ warning error\n4 ✓ error error\n\nSee also machine\n\n\n\n\n\n","category":"function"},{"location":"utilities/#MLJBase.fit_only!-Union{Tuple{Machine{<:Any, cache_data}}, Tuple{cache_data}} where cache_data","page":"Utilities","title":"MLJBase.fit_only!","text":"MLJBase.fit_only!(\n mach::Machine;\n rows=nothing,\n verbosity=1,\n force=false,\n composite=nothing,\n)\n\nWithout mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:\n\nAb initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.\nTraining update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.\nNo-operation. Leave existing learned parameters untouched. Do not increment mach.state.\n\nIf the model bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). 
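For example (a sketch, assuming mach = machine(:classifier, X, y) and composite.classifier holds a bona fide classifier):\n\nMLJBase.fit_only!(mach, composite=composite)\n\n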
See also machine.\n\nTraining action logic\n\nFor the action to be a no-operation, either mach.frozen == true or none of the following apply:\n\n(i) mach has never been trained (mach.state == 0).\n(ii) force == true.\n(iii) The state of some other machine on which mach depends has changed since the last time mach was trained (i.e., since mach.state was last incremented).\n(iv) The specified rows have changed since the last retraining and mach.model does not have Static type.\n(v) mach.model is a model and different from the last model used for training, but has the same type.\n(vi) mach.model is a model but has a type different from the last model used for training.\n(vii) mach.model is a symbol and getproperty(composite, mach.model) is different from the last model used for training, but has the same type.\n(viii) mach.model is a symbol and getproperty(composite, mach.model) has a different type from the last model used for training.\n\nIn any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.\n\nTo freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).\n\nImplementation details\n\nThe data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the \"Adding models for general use\" section of the MLJ documentation for more on these lower-level training methods.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.freeze!-Tuple{Machine}","page":"Utilities","title":"MLJBase.freeze!","text":"freeze!(mach)\n\nFreeze the machine mach so that it will never be retrained (unless thawed).\n\nSee also thaw!.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.last_model-Tuple{Any}","page":"Utilities","title":"MLJBase.last_model","text":"last_model(mach::Machine)\n\nReturn the last model used to train the machine mach. This is a bona fide model, even if mach.model is a symbol.\n\nReturns nothing if mach has not been trained.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.machine","page":"Utilities","title":"MLJBase.machine","text":"machine(model, args...; cache=true, scitype_check_level=1)\n\nConstruct a Machine object binding a model, storing hyper-parameters of some machine learning algorithm, to some data, args. Calling fit! on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned parameters) and report(mach) (other outcomes). 
This in turn enables generalization to new data using operations such as predict or transform:\n\nusing MLJModels\nX, y = make_regression()\n\nPCA = @load PCA pkg=MultivariateStats\nmodel = PCA()\nmach = machine(model, X)\nfit!(mach, rows=1:50)\ntransform(mach, selectrows(X, 51:100)) # or transform(mach, rows=51:100)\n\nDecisionTreeRegressor = @load DecisionTreeRegressor pkg=DecisionTree\nmodel = DecisionTreeRegressor()\nmach = machine(model, X, y)\nfit!(mach, rows=1:50)\npredict(mach, selectrows(X, 51:100)) # or predict(mach, rows=51:100)\n\nSpecify cache=false to prioritize memory management over speed.\n\nWhen building a learning network, Node objects can be substituted for the concrete data but no type or dimension checks are applied.\n\nChecks on the types of training data\n\nA model articulates its data requirements using scientific types, i.e., using the scitype function instead of the typeof function.\n\nIf scitype_check_level > 0 then the scitype of each arg in args is computed, and this is compared with the scitypes expected by the model, unless args contains Unknown scitypes and scitype_check_level < 4, in which case no further action is taken. Whether warnings are issued or errors thrown depends on the level. For details, see default_scitype_check_level, a method to inspect or change the default level (1 at startup).\n\nMachines with model placeholders\n\nA symbol can be substituted for a model in machine constructors to act as a placeholder for a model specified at training time. The symbol must be the field name for a struct whose corresponding value is a model, as shown in the following example:\n\nmutable struct MyComposite\n transformer\n classifier\nend\n\nmy_composite = MyComposite(Standardizer(), ConstantClassifier())\n\nX, y = make_blobs()\nmach = machine(:classifier, X, y)\nfit!(mach, composite=my_composite)\n\nThe last two lines are equivalent to\n\nmach = machine(ConstantClassifier(), X, y)\nfit!(mach)\n\nDelaying model specification is used when exporting learning networks as new stand-alone model types. See prefit and the MLJ documentation on learning networks.\n\nSee also fit!, default_scitype_check_level, MLJBase.save, serializable.\n\n\n\n\n\n","category":"function"},{"location":"utilities/#MLJBase.machine-Tuple{Union{IO, String}}","page":"Utilities","title":"MLJBase.machine","text":"machine(file::Union{String, IO})\n\nRebuild from a file a machine that has been serialized using the default Serialization module.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.report-Tuple{Machine}","page":"Utilities","title":"MLJBase.report","text":"report(mach)\n\nReturn the report for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using @pipeline, then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. 
(If multiple machines share the same model, then the value is a vector.)\n\nusing MLJ\n@load LinearBinaryClassifier pkg=GLM\nX, y = @load_crabs;\npipe = @pipeline Standardizer LinearBinaryClassifier\nmach = machine(pipe, X, y) |> fit!\n\njulia> report(mach).linear_binary_classifier\n(deviance = 3.8893386087844543e-7,\n dof_residual = 195.0,\n stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],\n vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)\n\n\nAdditional keys, machines and report_given_machine, give a list of all machines in the underlying network, and a dictionary of reports keyed on those machines.\n\nSee also fitted_params\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.report_given_method-Tuple{Machine}","page":"Utilities","title":"MLJBase.report_given_method","text":"report_given_method(mach::Machine)\n\nSame as report(mach) but broken down by the method (fit, predict, etc) that contributed the report.\n\nA specialized method intended for learning network applications.\n\nThe return value is a dictionary keyed on the symbol representing the method (:fit, :predict, etc) and the values are the reports contributed by that method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.restore!-Tuple{Machine}","page":"Utilities","title":"MLJBase.restore!","text":"restore!(mach::Machine)\n\nRestore the state of a machine that is currently serializable but which may not be otherwise usable. For such a machine, mach, one has mach.state == -1. Intended for restoring deserialized machine objects to a usable form.\n\nFor an example see serializable.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.serializable-Union{Tuple{Machine{<:Any, C}}, Tuple{C}} where C","page":"Utilities","title":"MLJBase.serializable","text":"serializable(mach::Machine)\n\nReturns a shallow copy of the machine to make it serializable. In particular, all training data is removed and, if necessary, learned parameters are replaced with persistent representations.\n\nAny general purpose Julia serializer may be applied to the output of serializable (e.g., JLSO, BSON, JLD) but you must call restore!(mach) on the deserialised object mach before using it. 
See the example below.\n\nIf using Julia's standard Serialization library, a shorter workflow is available using the MLJBase.save (or MLJ.save) method.\n\nA machine returned by serializable is characterized by the property mach.state == -1.\n\nExample using JLSO\n\nusing MLJ\nusing JLSO\nTree = @load DecisionTreeClassifier\ntree = Tree()\nX, y = @load_iris\nmach = fit!(machine(tree, X, y))\n\n# This machine can now be serialized\nsmach = serializable(mach)\nJLSO.save(\"machine.jlso\", :machine => smach)\n\n# Deserialize and restore learned parameters to useable form:\nloaded_mach = JLSO.load(\"machine.jlso\")[:machine]\nrestore!(loaded_mach)\n\npredict(loaded_mach, X)\npredict(mach, X)\n\nSee also restore!, MLJBase.save.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.thaw!-Tuple{Machine}","page":"Utilities","title":"MLJBase.thaw!","text":"thaw!(mach)\n\nUnfreeze the machine mach so that it can be retrained.\n\nSee also freeze!.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJModelInterface.feature_importances-Tuple{Machine}","page":"Utilities","title":"MLJModelInterface.feature_importances","text":"feature_importances(mach::Machine)\n\nReturn a list of feature => importance pairs for a fitted machine, mach, for supported models. Otherwise return nothing.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJModelInterface.fitted_params-Tuple{Machine}","page":"Utilities","title":"MLJModelInterface.fitted_params","text":"fitted_params(mach)\n\nReturn the learned parameters for a machine mach that has been fit!, for example the coefficients in a linear model.\n\nThis is a named tuple and human-readable if possible.\n\nIf mach is a machine for a composite model, such as a model constructed using @pipeline, then the returned named tuple has the composite type's field names as keys. The corresponding value is the fitted parameters for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)\n\nusing MLJ\n@load LogisticClassifier pkg=MLJLinearModels\nX, y = @load_crabs;\npipe = @pipeline Standardizer LogisticClassifier\nmach = machine(pipe, X, y) |> fit!\n\njulia> fitted_params(mach).logistic_classifier\n(classes = CategoricalArrays.CategoricalValue{String,UInt32}[\"B\", \"O\"],\n coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471],\n intercept = 0.0883301599726305,)\n\nAdditional keys, machines and fitted_params_given_machine, give a list of all machines in the underlying network, and a dictionary of fitted parameters keyed on those machines.\n\nSee also report\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJModelInterface.save-Tuple{Union{IO, String}, Machine}","page":"Utilities","title":"MLJModelInterface.save","text":"MLJ.save(filename, mach::Machine)\nMLJ.save(io, mach::Machine)\n\nMLJBase.save(filename, mach::Machine)\nMLJBase.save(io, mach::Machine)\n\nSerialize the machine mach to a file with path filename, or to an input/output stream io (at least IOBuffer instances are supported) using the Serialization module.\n\nTo serialise using a different format, see serializable.\n\nMachines are deserialized using the machine constructor as shown in the example below.\n\nThe implementation of save for machines changed in MLJ 0.18 (MLJBase 0.20). 
You can only restore a machine saved using older versions of MLJ using an older version.\n\nExample\n\nusing MLJ\nTree = @load DecisionTreeClassifier\nX, y = @load_iris\nmach = fit!(machine(Tree(), X, y))\n\nMLJ.save(\"tree.jls\", mach)\nmach_predict_only = machine(\"tree.jls\")\npredict(mach_predict_only, X)\n\n# using a buffer:\nio = IOBuffer()\nMLJ.save(io, mach)\nseekstart(io)\npredict_only_mach = machine(io)\npredict(predict_only_mach, X)\n\nwarning: Only load files from trusted sources\nMaliciously constructed JLS files, like pickles, and most other general purpose serialization formats, can allow for arbitrary code execution during loading. This means it is possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.\n\nSee also serializable, machine.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#StatsAPI.fit!-Tuple{Machine}","page":"Utilities","title":"StatsAPI.fit!","text":"fit!(mach::Machine, rows=nothing, verbosity=1, force=false, composite=nothing)\n\nFit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.\n\nTo attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the the internal logic of fitting see fit_only!\n\n\n\n\n\n","category":"method"},{"location":"utilities/#Parameter-Inspection","page":"Utilities","title":"Parameter Inspection","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"parameter_inspection.jl\"]","category":"page"},{"location":"utilities/#Show","page":"Utilities","title":"Show","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"show.jl\"]","category":"page"},{"location":"utilities/#MLJBase._recursive_show-Tuple{IO, MLJType, Any, Any}","page":"Utilities","title":"MLJBase._recursive_show","text":"_recursive_show(stream, object, current_depth, depth)\n\nGenerate a table of the properties of the MLJType object, dislaying each property value by calling the method _show on it. The behaviour of _show(stream, f) is as follows:\n\nIf f is itself a MLJType object, then its short form is shown\n\nand _recursive_show generates as separate table for each of its properties (and so on, up to a depth of argument depth).\n\nOtherwise f is displayed as \"(omitted T)\" where T = typeof(f),\n\nunless istoobig(f) is false (the istoobig fall-back for arbitrary types being true). In the latter case, the long (ie, MIME\"plain/text\") form of f is shown. 
To override this behaviour, overload the _show method for the type in question.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.abbreviated-Tuple{Any}","page":"Utilities","title":"MLJBase.abbreviated","text":"to display abbreviated versions of integers\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.color_off-Tuple{}","page":"Utilities","title":"MLJBase.color_off","text":"color_off()\n\nSuppress color and bold output at the REPL for displaying MLJ objects.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.color_on-Tuple{}","page":"Utilities","title":"MLJBase.color_on","text":"color_on()\n\nEnable color and bold output at the REPL, for enhanced display of MLJ objects.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.handle-Tuple{Any}","page":"Utilities","title":"MLJBase.handle","text":"return abbreviated object id (as string) or it's registered handle (as string) if this exists\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.@constant-Tuple{Any}","page":"Utilities","title":"MLJBase.@constant","text":"@constant x = value\n\nPrivate method (used in testing).\n\nEquivalent to const x = value but registers the binding thus:\n\nMLJBase.HANDLE_GIVEN_ID[objectid(value)] = :x\n\nRegistered objects get displayed using the variable name to which it was bound in calls to show(x), etc.\n\nWARNING: As with any const declaration, binding x to new value of the same type is not prevented and the registration will not be updated.\n\n\n\n\n\n","category":"macro"},{"location":"utilities/#MLJBase.@more-Tuple{}","page":"Utilities","title":"MLJBase.@more","text":"@more\n\nEntered at the REPL, equivalent to show(ans, 100). Use to get a recursive description of all properties of the last REPL value.\n\n\n\n\n\n","category":"macro"},{"location":"utilities/#Utility-functions","page":"Utilities","title":"Utility functions","text":"","category":"section"},{"location":"utilities/","page":"Utilities","title":"Utilities","text":"Modules = [MLJBase]\nPages = [\"utilities.jl\"]","category":"page"},{"location":"utilities/#MLJBase._permute_rows-Tuple{AbstractVecOrMat, Vector{Int64}}","page":"Utilities","title":"MLJBase._permute_rows","text":"permuterows(obj, perm)\n\nInternal function to return a vector or matrix with permuted rows given the permutation perm.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.available_name-Tuple{Any, Any}","page":"Utilities","title":"MLJBase.available_name","text":"available_name(modl::Module, name::Symbol)\n\nFunction to replace, if necessary, a given name with a modified one that ensures it is not the name of any existing object in the global scope of modl. 
Modifications are created with numerical suffixes.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.check_same_nrows-Tuple{Any, Any}","page":"Utilities","title":"MLJBase.check_same_nrows","text":"check_same_nrows(X, Y)\n\nInternal function to check two objects, each a vector or a matrix, have the same number of rows.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.chunks-Tuple{AbstractRange, Integer}","page":"Utilities","title":"MLJBase.chunks","text":"chunks(range, n)\n\nSplit an AbstractRange into n subranges of approximately equal length.\n\nExample\n\njulia> collect(chunks(1:5, 2))\n2-element Array{UnitRange{Int64},1}:\n 1:3\n 4:5\n\n**Private method**\n\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.flat_values-Tuple{NamedTuple}","page":"Utilities","title":"MLJBase.flat_values","text":"flat_values(t::NamedTuple)\n\nView a nested named tuple t as a tree and return, as a tuple, the values at the leaves, in the order they appear in the original tuple.\n\njulia> t = (X = (x = 1, y = 2), Y = 3)\njulia> flat_values(t)\n(1, 2, 3)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.generate_name!-Tuple{DataType, Any}","page":"Utilities","title":"MLJBase.generate_name!","text":"generate_name!(M, existing_names; only=Union{Function,Type}, substitute=:f)\n\nGiven a type M (e.g., MyEvenInteger{N}) return a symbolic, snake-case, representation of the type name (such as my_even_integer). The symbol is pushed to existing_names, which must be an AbstractVector to which a Symbol can be pushed.\n\nIf the snake-case representation already exists in existing_names a suitable integer is appended to the name.\n\nIf only is specified, then the operation is restricted to those M for which M isa only. In all other cases the symbolic name is generated using substitute as the base symbol.\n\nexisting_names = []\njulia> generate_name!(Vector{Int}, existing_names)\n:vector\n\njulia> generate_name!(Vector{Int}, existing_names)\n:vector2\n\njulia> generate_name!(AbstractFloat, existing_names)\n:abstract_float\n\njulia> generate_name!(Int, existing_names, only=Array, substitute=:not_array)\n:not_array\n\njulia> generate_name!(Int, existing_names, only=Array, substitute=:not_array)\n:not_array2\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.guess_model_target_observation_scitype-Tuple{Any}","page":"Utilities","title":"MLJBase.guess_model_target_observation_scitype","text":"guess_model_targetobservation_scitype(model)\n\nPrivate method\n\nTry to infer a lowest upper bound on the scitype of target observations acceptable to model, by inspecting target_scitype(model). Return Unknown if unable to draw reliable inferrence.\n\nThe observation scitype for a table is here understood as the scitype of a row converted to a vector.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.guess_observation_scitype-Tuple{Any}","page":"Utilities","title":"MLJBase.guess_observation_scitype","text":"guess_observation_scitype(y)\n\nPrivate method.\n\nIf y is an AbstractArray, return the scitype of y[:, :, ..., :, 1]. 
If y is a table, return the scitype of the first row, converted to a vector, unless this row has missing elements, in which case return Unknown.\n\nIn all other cases, Unknown.\n\njulia> guess_observation_scitype([missing, 1, 2, 3])\nUnion{Missing, Count}\n\njulia> guess_observation_scitype(rand(3, 2))\nAbstractVector{Continuous}\n\njulia> guess_observation_scitype((x=rand(3), y=rand(Bool, 3)))\nAbstractVector{Union{Continuous, Count}}\n\njulia> guess_observation_scitype((x=[missing, 1, 2], y=[1, 2, 3]))\nUnknown\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.init_rng-Tuple{Any}","page":"Utilities","title":"MLJBase.init_rng","text":"init_rng(rng)\n\nCreate an AbstractRNG from rng. If rng is a non-negative Integer, it returns a MersenneTwister random number generator seeded with rng; If rng is an AbstractRNG object it returns rng, otherwise it throws an error.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.observation-Tuple{Type}","page":"Utilities","title":"MLJBase.observation","text":"observation(S)\n\nPrivate method.\n\nTries to infer the per-observation scitype from the scitype of S, when S is known to be the scitype of some container with multiple observations; here we view the scitype for one row of a table to be the scitype of the row converted to a vector. Return Unknown if unable to draw reliable inferrence.\n\nThe observation scitype for a table is here understood as the scitype of a row converted to a vector.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.prepend-Tuple{Symbol, Nothing}","page":"Utilities","title":"MLJBase.prepend","text":"MLJBase.prepend(::Symbol, ::Union{Symbol,Expr,Nothing})\n\nFor prepending symbols in expressions like :(y.w) and :(x1.x2.x3).\n\njulia> prepend(:x, :y) :(x.y)\n\njulia> prepend(:x, :(y.z)) :(x.y.z)\n\njulia> prepend(:w, ans) :(w.x.y.z)\n\nIf the second argument is nothing, then nothing is returned.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.recursive_getproperty-Tuple{Any, Symbol}","page":"Utilities","title":"MLJBase.recursive_getproperty","text":"recursive_getproperty(object, nested_name::Expr)\n\nCall getproperty recursively on object to extract the value of some nested property, as in the following example:\n\njulia> object = (X = (x = 1, y = 2), Y = 3)\njulia> recursive_getproperty(object, :(X.y))\n2\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.recursive_setproperty!-Tuple{Any, Symbol, Any}","page":"Utilities","title":"MLJBase.recursive_setproperty!","text":"recursively_setproperty!(object, nested_name::Expr, value)\n\nSet a nested property of an object to value, as in the following example:\n\njulia> mutable struct Foo\n X\n Y\n end\n\njulia> mutable struct Bar\n x\n y\n end\n\njulia> object = Foo(Bar(1, 2), 3)\nFoo(Bar(1, 2), 3)\n\njulia> recursively_setproperty!(object, :(X.y), 42)\n42\n\njulia> object\nFoo(Bar(1, 42), 3)\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.sequence_string-Union{Tuple{Itr}, Tuple{Itr, Any}} where Itr","page":"Utilities","title":"MLJBase.sequence_string","text":"sequence_string(itr, n=3)\n\nReturn a \"sequence\" string from the first n elements generated by itr.\n\njulia> MLJBase.sequence_string(1:10, 4)\n\"1, 2, 3, 4, ...\"\n\nPrivate method.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.shuffle_rows-Tuple{AbstractVecOrMat, AbstractVecOrMat}","page":"Utilities","title":"MLJBase.shuffle_rows","text":"shuffle_rows(X::AbstractVecOrMat,\n Y::AbstractVecOrMat;\n 
rng::AbstractRNG=Random.GLOBAL_RNG)\n\nReturn row-shuffled vectors or matrices using a random permutation of X and Y. An optional random number generator can be specified using the rng argument.\n\n\n\n\n\n","category":"method"},{"location":"utilities/#MLJBase.unwind-Tuple","page":"Utilities","title":"MLJBase.unwind","text":"unwind(iterators...)\n\nRepresent all possible combinations of values generated by iterators as rows of a matrix A. In more detail, A has one column for each iterator in iterators and one row for each distinct possible combination of values taken on by the iterators. Elements in the first column cycle fastest, those in the last clolumn slowest.\n\nExample\n\njulia> iterators = ([1, 2], [\"a\",\"b\"], [\"x\", \"y\", \"z\"]);\njulia> MLJTuning.unwind(iterators...)\n12×3 Array{Any,2}:\n 1 \"a\" \"x\"\n 2 \"a\" \"x\"\n 1 \"b\" \"x\"\n 2 \"b\" \"x\"\n 1 \"a\" \"y\"\n 2 \"a\" \"y\"\n 1 \"b\" \"y\"\n 2 \"b\" \"y\"\n 1 \"a\" \"z\"\n 2 \"a\" \"z\"\n 1 \"b\" \"z\"\n 2 \"b\" \"z\"\n\n\n\n\n\n","category":"method"},{"location":"resampling/#Resampling","page":"Resampling","title":"Resampling","text":"","category":"section"},{"location":"resampling/","page":"Resampling","title":"Resampling","text":"Modules = [MLJBase]\nPages = [\"resampling.jl\"]","category":"page"},{"location":"resampling/#MLJBase.CV","page":"Resampling","title":"MLJBase.CV","text":"cv = CV(; nfolds=6, shuffle=nothing, rng=nothing)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning.\n\ntrain_test_pairs(cv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector. With no row pre-shuffling, the order of rows is preserved, in the sense that rows coincides precisely with the concatenation of the test vectors, in the order they are generated. The first r test vectors have length n + 1, where n, r = divrem(length(rows), nfolds), and the remaining test vectors have length n.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the CV keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.Holdout","page":"Resampling","title":"MLJBase.Holdout","text":"holdout = Holdout(; fraction_train=0.7,\n shuffle=nothing,\n rng=nothing)\n\nHoldout resampling strategy, for use in evaluate!, evaluate and in tuning.\n\ntrain_test_pairs(holdout, rows)\n\nReturns the pair [(train, test)], where train and test are vectors such that rows=vcat(train, test) and length(train)/length(rows) is approximatey equal to fraction_train`.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the Holdout keyword constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is specified.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.PerformanceEvaluation","page":"Resampling","title":"MLJBase.PerformanceEvaluation","text":"PerformanceEvaluation\n\nType of object returned by evaluate (for models plus data) or evaluate! 
(for machines). Such objects encode estimates of the performance (generalization error) of a supervised model or outlier detection model.\n\nWhen evaluate/evaluate! is called, a number of train/test pairs (\"folds\") of row indices are generated, according to the options provided, which are discussed in the evaluate! doc-string. Rows correspond to observations. The generated train/test pairs are recorded in the train_test_rows field of the PerformanceEvaluation struct, and the corresponding estimates, aggregated over all train/test pairs, are recorded in measurement, a vector with one entry for each measure (metric) recorded in measure.\n\nWhen displayed, a PerformanceEvalution object includes a value under the heading 1.96*SE, derived from the standard error of the per_fold entries. This value is suitable for constructing a formal 95% confidence interval for the given measurement. Such intervals should be interpreted with caution. See, for example, Bates et al. (2021).\n\nFields\n\nThese fields are part of the public API of the PerformanceEvaluation struct.\n\nmodel: model used to create the performance evaluation. In the case a tuning model, this is the best model found.\nmeasure: vector of measures (metrics) used to evaluate performance\nmeasurement: vector of measurements - one for each element of measure - aggregating the performance measurements over all train/test pairs (folds). The aggregation method applied for a given measure m is StatisticalMeasuresBase.external_aggregation_mode(m) (commonly Mean() or Sum())\noperation (e.g., predict_mode): the operations applied for each measure to generate predictions to be evaluated. Possibilities are: predict, predict_mean, predict_mode, predict_median, or predict_joint.\nper_fold: a vector of vectors of individual test fold evaluations (one vector per measure). Useful for obtaining a rough estimate of the variance of the performance estimate.\nper_observation: a vector of vectors of vectors containing individual per-observation measurements: for an evaluation e, e.per_observation[m][f][i] is the measurement for the ith observation in the fth test fold, evaluated using the mth measure. Useful for some forms of hyper-parameter optimization. Note that an aggregregated measurement for some measure measure is repeated across all observations in a fold if StatisticalMeasures.can_report_unaggregated(measure) == true. If e has been computed with the per_observation=false option, then e_per_observation is a vector of missings.\nfitted_params_per_fold: a vector containing fitted params(mach) for each machine mach trained during resampling - one machine per train/test pair. 
Use this to extract the learned parameters for each individual training event.\nreport_per_fold: a vector containing report(mach) for each machine mach training in resampling - one machine per train/test pair.\ntrain_test_rows: a vector of tuples, each of the form (train, test), where train and test are vectors of row (observation) indices for training and evaluation respectively.\nresampling: the resampling strategy used to generate the train/test pairs.\nrepeats: the number of times the resampling strategy was repeated.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.Resampler","page":"Resampling","title":"MLJBase.Resampler","text":"resampler = Resampler(\n model=ConstantRegressor(),\n resampling=CV(),\n measure=nothing,\n weights=nothing,\n class_weights=nothing\n operation=predict,\n repeats = 1,\n acceleration=default_resource(),\n check_measure=true,\n per_observation=true,\n logger=nothing,\n)\n\nResampling model wrapper, used internally by the fit method of TunedModel instances and IteratedModel instances. See `evaluate! for options. Not intended for use by general user, who will ordinarily use evaluate! directly.\n\nGiven a machine mach = machine(resampler, args...) one obtains a performance evaluation of the specified model, performed according to the prescribed resampling strategy and other parameters, using data args..., by calling fit!(mach) followed by evaluate(mach).\n\nOn subsequent calls to fit!(mach) new train/test pairs of row indices are only regenerated if resampling, repeats or cache fields of resampler have changed. The evolution of an RNG field of resampler does not constitute a change (== for MLJType objects is not sensitive to such changes; see is_same_except).\n\nIf there is single train/test pair, then warm-restart behavior of the wrapped model resampler.model will extend to warm-restart behaviour of the wrapper resampler, with respect to mutations of the wrapped model.\n\nThe sample weights are passed to the specified performance measures that support weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.\n\nThe sample class_weights are passed to the specified performance measures that support per-class weights for evaluation. These weights are not to be confused with any weights bound to a Resampler instance in a machine, used for training the wrapped model when supported.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.StratifiedCV","page":"Resampling","title":"MLJBase.StratifiedCV","text":"stratified_cv = StratifiedCV(; nfolds=6,\n shuffle=false,\n rng=Random.GLOBAL_RNG)\n\nStratified cross-validation resampling strategy, for use in evaluate!, evaluate and in tuning. Applies only to classification problems (OrderedFactor or Multiclass targets).\n\ntrain_test_pairs(stratified_cv, rows, y)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices) where each train and test is a sub-vector of rows. The test vectors are mutually exclusive and exhaust rows. Each train vector is the complement of the corresponding test vector.\n\nUnlike regular cross-validation, the distribution of the levels of the target y corresponding to each train and test is constrained, as far as possible, to replicate that of y[rows] as a whole.\n\nThe stratified train_test_pairs algorithm is invariant to label renaming. 
For example, if you run replace!(y, 'a' => 'b', 'b' => 'a') and then re-run train_test_pairs, the returned (train, test) pairs will be the same.\n\nPre-shuffling of rows is controlled by rng and shuffle. If rng is an integer, then the StratifedCV keywod constructor resets it to MersenneTwister(rng). Otherwise some AbstractRNG object is expected.\n\nIf rng is left unspecified, rng is reset to Random.GLOBAL_RNG, in which case rows are only pre-shuffled if shuffle=true is explicitly specified.\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.TimeSeriesCV","page":"Resampling","title":"MLJBase.TimeSeriesCV","text":"tscv = TimeSeriesCV(; nfolds=4)\n\nCross-validation resampling strategy, for use in evaluate!, evaluate and tuning, when observations are chronological and not expected to be independent.\n\ntrain_test_pairs(tscv, rows)\n\nReturns an nfolds-length iterator of (train, test) pairs of vectors (row indices), where each train and test is a sub-vector of rows. The rows are partitioned sequentially into nfolds + 1 approximately equal length partitions, where the first partition is the first train set, and the second partition is the first test set. The second train set consists of the first two partitions, and the second test set consists of the third partition, and so on for each fold.\n\nThe first partition (which is the first train set) has length n + r, where n, r = divrem(length(rows), nfolds + 1), and the remaining partitions (all of the test folds) have length n.\n\nExamples\n\njulia> MLJBase.train_test_pairs(TimeSeriesCV(nfolds=3), 1:10)\n3-element Vector{Tuple{UnitRange{Int64}, UnitRange{Int64}}}:\n (1:4, 5:6)\n (1:6, 7:8)\n (1:8, 9:10)\n\njulia> model = (@load RidgeRegressor pkg=MultivariateStats verbosity=0)();\n\njulia> data = @load_sunspots;\n\njulia> X = (lag1 = data.sunspot_number[2:end-1],\n lag2 = data.sunspot_number[1:end-2]);\n\njulia> y = data.sunspot_number[3:end];\n\njulia> tscv = TimeSeriesCV(nfolds=3);\n\njulia> evaluate(model, X, y, resampling=tscv, measure=rmse, verbosity=0)\n┌───────────────────────────┬───────────────┬────────────────────┐\n│ _.measure │ _.measurement │ _.per_fold │\n├───────────────────────────┼───────────────┼────────────────────┤\n│ RootMeanSquaredError @753 │ 21.7 │ [25.4, 16.3, 22.4] │\n└───────────────────────────┴───────────────┴────────────────────┘\n_.per_observation = [missing]\n_.fitted_params_per_fold = [ … ]\n_.report_per_fold = [ … ]\n_.train_test_rows = [ … ]\n\n\n\n\n\n","category":"type"},{"location":"resampling/#MLJBase.log_evaluation-Tuple{Any, Any}","page":"Resampling","title":"MLJBase.log_evaluation","text":"log_evaluation(logger, performance_evaluation)\n\nLog a performance evaluation to logger, an object specific to some logging platform, such as mlflow. If logger=nothing then no logging is performed. The method is called at the end of every call to evaluate/evaluate! using the logger provided by the logger keyword argument.\n\nImplementations for new logging platforms\n\nJulia interfaces to workflow logging platforms, such as mlflow (provided by the MLFlowClient.jl interface) should overload log_evaluation(logger::LoggerType, performance_evaluation), where LoggerType is a platform-specific type for logger objects. 
For an example, see the implementation provided by the MLJFlow.jl package.\n\n\n\n\n\n","category":"method"},{"location":"resampling/#MLJModelInterface.evaluate-Tuple{Union{Annotator, Supervised}, Vararg{Any}}","page":"Resampling","title":"MLJModelInterface.evaluate","text":"evaluate(model, data...; cache=true, options...)\n\nEquivalent to evaluate!(machine(model, data..., cache=cache); options...). See the machine version evaluate! for the complete list of options.\n\nReturns a PerformanceEvaluation object.\n\nSee also evaluate!.\n\n\n\n\n\n","category":"method"},{"location":"composition/#Composition","page":"Composition","title":"Composition","text":"","category":"section"},{"location":"composition/#Composites","page":"Composition","title":"Composites","text":"","category":"section"},{"location":"composition/","page":"Composition","title":"Composition","text":"Modules = [MLJBase]\nPages = [\"composition/composites.jl\"]","category":"page"},{"location":"composition/#Networks","page":"Composition","title":"Networks","text":"","category":"section"},{"location":"composition/","page":"Composition","title":"Composition","text":"Modules = [MLJBase]\nPages = [\"composition/networks.jl\"]","category":"page"},{"location":"composition/#Pipelines","page":"Composition","title":"Pipelines","text":"","category":"section"},{"location":"composition/","page":"Composition","title":"Composition","text":"Modules = [MLJBase]\nPages = [\"composition/pipeline_static.jl\", \"composition/pipelines.jl\"]","category":"page"},{"location":"#MLJBase.jl","page":"Home","title":"MLJBase.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"These docs are bare-bones and auto-generated. Complete MLJ documentation is here. ","category":"page"},{"location":"","page":"Home","title":"Home","text":"For MLJBase-specific developer information, see also the README.md file.","category":"page"},{"location":"datasets/#Datasets","page":"Datasets","title":"Datasets","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Pages = [\"data/datasets_synthetic.jl\"]","category":"page"},{"location":"datasets/#Standard-datasets","page":"Datasets","title":"Standard datasets","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"To add a new dataset assuming it has a header and is, at path data/newdataset.csv","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Start by loading it with CSV:","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"fpath = joinpath(\"datadir\", \"newdataset.csv\")\ndata = CSV.read(fpath, copycols=true,\n categorical=true)","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Load it with DelimitedFiles and Tables","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"data_raw, data_header = readdlm(fpath, ',', header=true)\ndata_table = Tables.table(data_raw; header=Symbol.(vec(data_header)))","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Retrieve the conversions:","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"for (n, st) in zip(names(data), scitype_union.(eachcol(data)))\n println(\":$n=>$st,\")\nend","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Copy and paste the result in a 
coerce","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"data_table = coerce(data_table, ...)","category":"page"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Modules = [MLJBase]\nPages = [\"data/datasets.jl\"]","category":"page"},{"location":"datasets/#MLJBase.load_dataset-Tuple{String, Tuple}","page":"Datasets","title":"MLJBase.load_dataset","text":"load_dataset(fpath, coercions)\n\nLoad one of standard dataset like Boston etc assuming the file is a comma separated file with a header.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.load_sunspots-Tuple{}","page":"Datasets","title":"MLJBase.load_sunspots","text":"Load a well-known sunspot time series (table with one column). [https://www.sws.bom.gov.au/Educational/2/3/6]](https://www.sws.bom.gov.au/Educational/2/3/6)\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.@load_ames-Tuple{}","page":"Datasets","title":"MLJBase.@load_ames","text":"Load the full version of the well-known Ames Housing task.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_boston-Tuple{}","page":"Datasets","title":"MLJBase.@load_boston","text":"Load a well-known public regression dataset with Continuous features.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_crabs-Tuple{}","page":"Datasets","title":"MLJBase.@load_crabs","text":"Load a well-known crab classification dataset with nominal features.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_iris-Tuple{}","page":"Datasets","title":"MLJBase.@load_iris","text":"Load a well-known public classification task with nominal features.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_reduced_ames-Tuple{}","page":"Datasets","title":"MLJBase.@load_reduced_ames","text":"Load a reduced version of the well-known Ames Housing task\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_smarket-Tuple{}","page":"Datasets","title":"MLJBase.@load_smarket","text":"Load S&P Stock Market dataset, as used in (An Introduction to Statistical Learning with applications in R)https://rdrr.io/cran/ISLR/man/Smarket.html, by Witten et al (2013), Springer-Verlag, New York.\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#MLJBase.@load_sunspots-Tuple{}","page":"Datasets","title":"MLJBase.@load_sunspots","text":"Load a well-known sunspot time series (single table with one column).\n\n\n\n\n\n","category":"macro"},{"location":"datasets/#Synthetic-datasets","page":"Datasets","title":"Synthetic datasets","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Modules = [MLJBase]\nPages = [\"data/datasets_synthetic.jl\"]","category":"page"},{"location":"datasets/#MLJBase.x","page":"Datasets","title":"MLJBase.x","text":"finalize_Xy(X, y, shuffle, as_table, eltype, rng; clf)\n\nInternal function to finalize the make_* functions.\n\n\n\n\n\n","category":"constant"},{"location":"datasets/#MLJBase.augment_X-Tuple{Matrix{<:Real}, Bool}","page":"Datasets","title":"MLJBase.augment_X","text":"augment_X(X, fit_intercept)\n\nGiven a matrix X, append a column of ones if fit_intercept is true. 
See make_regression.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.make_blobs","page":"Datasets","title":"MLJBase.make_blobs","text":"X, y = make_blobs(n=100, p=2; kwargs...)\n\nGenerate Gaussian blobs for clustering and classification problems.\n\nReturn value\n\nBy default, a table X with p columns (features) and n rows (observations), together with a corresponding vector of n Multiclass target observations y, indicating blob membership.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\ncenters=3: either a number of centers or a c x p matrix with c pre-determined centers,\ncluster_std=1.0: the standard deviation(s) of each blob,\ncenter_box=(-10. => 10.): the limits of the p-dimensional cube within which the cluster centers are drawn if they are not provided,\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. \n\nExample\n\nX, y = make_blobs(100, 3; centers=2, cluster_std=[1.0, 3.0])\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.make_circles","page":"Datasets","title":"MLJBase.make_circles","text":"X, y = make_circles(n=100; kwargs...)\n\nGenerate n labeled points close to two concentric circles for classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the smaller or larger circle, respectively.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0: standard deviation of the Gaussian noise added to the data,\nfactor=0.8: ratio of the smaller radius over the larger one,\n\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. \n\nExample\n\nX, y = make_circles(100; noise=0.5, factor=0.3)\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.make_moons","page":"Datasets","title":"MLJBase.make_moons","text":" make_moons(n::Int=100; kwargs...)\n\nGenerates labeled two-dimensional points lying close to two interleaved semi-circles, for use with classification and clustering models.\n\nReturn value\n\nBy default, a table X with 2 columns and n rows (observations), together with a corresponding vector of n Multiclass target observations y. The target is either 0 or 1, corresponding to membership to the left or right semi-circle.\n\nKeyword arguments\n\nshuffle=true: whether to shuffle the resulting points,\nnoise=0.1: standard deviation of the Gaussian noise added to the data,\nxshift=1.0: horizontal translation of the second center with respect to the first one.\nyshift=0.3: vertical translation of the second center with respect to the first one. \neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). If false the target y has integer element type. 
\n\nExample\n\nX, y = make_moons(100; noise=0.5)\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.make_regression","page":"Datasets","title":"MLJBase.make_regression","text":"make_regression(n, p; kwargs...)\n\nGenerate Gaussian input features and a linear response with Gaussian noise, for use with regression models.\n\nReturn value\n\nBy default, a tuple (X, y) where table X has p columns and n rows (observations), together with a corresponding vector of n Continuous target observations y.\n\nKeywords\n\nintercept=true: Whether to generate data from a model with intercept.\nn_targets=1: Number of columns in the target.\nsparse=0: Proportion of the generating weight vector that is sparse.\nnoise=0.1: Standard deviation of the Gaussian noise added to the response (target).\noutliers=0: Proportion of the response vector to make as outliers by adding a random quantity with high variance. (Only applied if binary is false.)\nas_table=true: Whether X (and y, if n_targets > 1) should be a table or a matrix.\neltype=Float64: Element type for X and y. Must subtype AbstractFloat.\nbinary=false: Whether the target should be binarized (via a sigmoid).\neltype=Float64: machine type of points (any subtype of AbstractFloat).\nrng=Random.GLOBAL_RNG: any AbstractRNG object, or integer to seed a MersenneTwister (for reproducibility).\nas_table=true: whether to return the points as a table (true) or a matrix (false). \n\nExample\n\nX, y = make_regression(100, 5; noise=0.5, sparse=0.2, outliers=0.1)\n\n\n\n\n\n","category":"function"},{"location":"datasets/#MLJBase.outlify!-Tuple{Any, Any, Any}","page":"Datasets","title":"MLJBase.outlify!","text":"Add outliers to portion s of vector.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.runif_ab-NTuple{5, Any}","page":"Datasets","title":"MLJBase.runif_ab","text":"runif_ab(rng, n, p, a, b)\n\nInternal function to generate n points in [a, b]ᵖ uniformly at random.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.sigmoid-Tuple{Float64}","page":"Datasets","title":"MLJBase.sigmoid","text":"sigmoid(x)\n\nReturn the sigmoid computed in a numerically stable way:\n\nσ(x) = 1(1+exp(-x))\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.sparsify!-Tuple{Any, Any, Any}","page":"Datasets","title":"MLJBase.sparsify!","text":"sparsify!(rng, θ, s)\n\nMake portion s of vector θ exactly 0.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#Utility-functions","page":"Datasets","title":"Utility functions","text":"","category":"section"},{"location":"datasets/","page":"Datasets","title":"Datasets","text":"Modules = [MLJBase]\nPages = [\"data/data.jl\"]","category":"page"},{"location":"datasets/#MLJBase.complement-Tuple{Any, Any}","page":"Datasets","title":"MLJBase.complement","text":"complement(folds, i)\n\nThe complement of the ith fold of folds in the concatenation of all elements of folds. 
Here folds is a vector or tuple of integer vectors, typically representing row indices or a vector, matrix or table.\n\ncomplement(([1,2], [3,], [4, 5]), 2) # [1 ,2, 4, 5]\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.corestrict-Union{Tuple{N}, Tuple{Tuple{Vararg{T, N}} where T, Any}} where N","page":"Datasets","title":"MLJBase.corestrict","text":"corestrict(X, folds, i)\n\nThe restriction of X, a vector, matrix or table, to the complement of the ith fold of folds, where folds is a tuple of vectors of row indices.\n\nThe method is curried, so that corestrict(folds, i) is the operator on data defined by corestrict(folds, i)(X) = corestrict(X, folds, i).\n\nExample\n\nfolds = ([1, 2], [3, 4, 5], [6,])\ncorestrict([:x1, :x2, :x3, :x4, :x5, :x6], folds, 2) # [:x1, :x2, :x6]\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.partition-Tuple{Any, Vararg{Real}}","page":"Datasets","title":"MLJBase.partition","text":"partition(X, fractions...;\n shuffle=nothing,\n rng=Random.GLOBAL_RNG,\n stratify=nothing,\n multi=false)\n\nSplits the vector, matrix or table X into a tuple of objects of the same type, whose vertical concatenation is X. The number of rows in each component of the return value is determined by the corresponding fractions of length(nrows(X)), where valid fractions are floats between 0 and 1 whose sum is less than one. The last fraction is not provided, as it is inferred from the preceding ones.\n\nFor \"synchronized\" partitioning of multiple objects, use the multi=true option described below.\n\njulia> partition(1:1000, 0.8)\n([1,...,800], [801,...,1000])\n\njulia> partition(1:1000, 0.2, 0.7)\n([1,...,200], [201,...,900], [901,...,1000])\n\njulia> partition(reshape(1:10, 5, 2), 0.2, 0.4)\n([1 6], [2 7; 3 8], [4 9; 5 10])\n\nX, y = make_blobs() # a table and vector\nXtrain, Xtest = partition(X, 0.8, stratify=y)\n\n(Xtrain, Xtest), (ytrain, ytest) = partition((X, y), 0.8, rng=123, multi=true)\n\nKeywords\n\nshuffle=nothing: if set to true, shuffles the rows before taking fractions.\nrng=Random.GLOBAL_RNG: specifies the random number generator to be used, can be an integer seed. If specified, and shuffle === nothing is interpreted as true.\nstratify=nothing: if a vector is specified, the partition will match the stratification of the given vector. In that case, shuffle cannot be false.\nmulti=false: if true then X is expected to be a tuple of objects sharing a common length, which are each partitioned separately using the same specified fractions and the same row shuffling. Returns a tuple of partitions (a tuple of tuples).\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.restrict-Union{Tuple{N}, Tuple{Tuple{Vararg{T, N}} where T, Any}} where N","page":"Datasets","title":"MLJBase.restrict","text":"restrict(X, folds, i)\n\nThe restriction of X, a vector, matrix or table, to the ith fold of folds, where folds is a tuple of vectors of row indices.\n\nThe method is curried, so that restrict(folds, i) is the operator on data defined by restrict(folds, i)(X) = restrict(X, folds, i).\n\nExample\n\nfolds = ([1, 2], [3, 4, 5], [6,])\nrestrict([:x1, :x2, :x3, :x4, :x5, :x6], folds, 2) # [:x3, :x4, :x5]\n\nSee also corestrict\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.skipinvalid-Tuple{Any}","page":"Datasets","title":"MLJBase.skipinvalid","text":"skipinvalid(itr)\n\nReturn an iterator over the elements in itr skipping missing and NaN values. 
Behaviour is similar to skipmissing.\n\nskipinvalid(A, B)\n\nFor vectors A and B of the same length, return a tuple of vectors (A[mask], B[mask]) where mask[i] is true if and only if A[i] and B[i] are both valid (non-missing and non-NaN). Can also called on other iterators of matching length, such as arrays, but always returns a vector. Does not remove Missing from the element types if present in the original iterators.\n\n\n\n\n\n","category":"method"},{"location":"datasets/#MLJBase.unpack-Tuple{Any, Vararg{Any}}","page":"Datasets","title":"MLJBase.unpack","text":"unpack(table, f1, f2, ... fk;\n wrap_singles=false,\n shuffle=false,\n rng::Union{AbstractRNG,Int,Nothing}=nothing,\n coerce_options...)\n\nHorizontally split any Tables.jl compatible table into smaller tables or vectors by making column selections determined by the predicates f1, f2, ..., fk. Selection from the column names is without replacement. A predicate is any object f such that f(name) is true or false for each column name::Symbol of table.\n\nReturns a tuple of tables/vectors with length one greater than the number of supplied predicates, with the last component including all previously unselected columns.\n\njulia> table = DataFrame(x=[1,2], y=['a', 'b'], z=[10.0, 20.0], w=[\"A\", \"B\"])\n2×4 DataFrame\n Row │ x y z w\n │ Int64 Char Float64 String\n─────┼──────────────────────────────\n 1 │ 1 a 10.0 A\n 2 │ 2 b 20.0 B\n\nZ, XY, W = unpack(table, ==(:z), !=(:w))\njulia> Z\n2-element Vector{Float64}:\n 10.0\n 20.0\n\njulia> XY\n2×2 DataFrame\n Row │ x y\n │ Int64 Char\n─────┼─────────────\n 1 │ 1 a\n 2 │ 2 b\n\njulia> W # the column(s) left over\n2-element Vector{String}:\n \"A\"\n \"B\"\n\nWhenever a returned table contains a single column, it is converted to a vector unless wrap_singles=true.\n\nIf coerce_options are specified then table is first replaced with coerce(table, coerce_options). See ScientificTypes.coerce for details.\n\nIf shuffle=true then the rows of table are first shuffled, using the global RNG, unless rng is specified; if rng is an integer, it specifies the seed of an automatically generated Mersenne twister. If rng is specified then shuffle=true is implicit.\n\n\n\n\n\n","category":"method"}] } diff --git a/dev/utilities/index.html b/dev/utilities/index.html index ea41541d..9c1d4be6 100644 --- a/dev/utilities/index.html +++ b/dev/utilities/index.html @@ -1,13 +1,11 @@ -Utilities · MLJBase.jl

      Utilities

      Machines

      Base.replaceMethod
      replace(mach::Machine, field1 => value1, field2 => value2, ...)

      Private method.

      Return a shallow copy of the machine mach with the specified field replacements. Undefined field values are preserved. Unspecified fields have identically equal values, with the exception of mach.fit_okay, which is always a new instance Channel{Bool}(1).

      The following example returns a machine with no traces of training data (but also removes any upstream dependencies in a learning network):

```julia
replace(mach, :args => (), :data => (), :data_resampled_data => (), :cache => nothing)
```

      source
      MLJBase.ageMethod
      age(mach::Machine)

      Return an integer representing the number of times mach has been trained or updated. For more detail, see the discussion of training logic at fit_only!.

      source
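A minimal sketch of how the age evolves (assuming ConstantClassifier from MLJModels and the make_blobs data generator; age is not exported, so it is qualified here):

```julia
using MLJBase, MLJModels

X, y = make_blobs()
mach = machine(ConstantClassifier(), X, y)

MLJBase.age(mach)       # 0: the machine has never been trained
fit!(mach)
MLJBase.age(mach)       # 1
fit!(mach, force=true)  # force an ab initio retraining
MLJBase.age(mach)       # 2
```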
      MLJBase.ancestorsMethod
      ancestors(mach::Machine; self=false)

      All ancestors of mach, including mach if self=true.

      source
      MLJBase.default_scitype_check_levelFunction
      default_scitype_check_level()

      Return the current global default value for scientific type checking when constructing machines.

      default_scitype_check_level(i::Integer)

      Set the global default value for scientific type checking to i.

      The effect of the scitype_check_level option in calls of the form machine(model, data, scitype_check_level=...) is summarized below:

| scitype_check_level  | Inspect scitypes? | If Unknown in scitypes | If other scitype mismatch |
|----------------------|-------------------|------------------------|---------------------------|
| 0                    | ×                 |                        |                           |
| 1 (value at startup) | ✓                 |                        | warning                   |
| 2                    | ✓                 | warning                | warning                   |
| 3                    | ✓                 | warning                | error                     |
| 4                    | ✓                 | error                  | error                     |

      See also machine

      source
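For illustration, a sketch only (the model and data are placeholders; any model and compatible data will do):

```julia
using MLJBase, MLJModels

X, y = make_blobs()
model = ConstantClassifier()

MLJBase.default_scitype_check_level()   # 1, the value at startup
MLJBase.default_scitype_check_level(3)  # warn on Unknown, error on other mismatches

mach = machine(model, X, y)                         # checked at the new default level
mach = machine(model, X, y, scitype_check_level=0)  # skip checks for this machine only
```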
      MLJBase.fit_only!Method
      MLJBase.fit_only!(
           mach::Machine;
           rows=nothing,
           verbosity=1,
           force=false,
           composite=nothing,
)

      Without mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:

      • Ab initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.

      • Training update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.

      • No-operation. Leave existing learned parameters untouched. Do not increment mach.state.

If the model bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.

      Training action logic

For the action to be a no-operation, either mach.frozen == true or none of the following apply:

      • (i) mach has never been trained (mach.state == 0).

      • (ii) force == true.

• (iii) The state of some other machine on which mach depends has changed since the last time mach was trained (i.e., the last time mach.state was incremented).

      • (iv) The specified rows have changed since the last retraining and mach.model does not have Static type.

      • (v) mach.model is a model and different from the last model used for training, but has the same type.

      • (vi) mach.model is a model but has a type different from the last model used for training.

• (vii) mach.model is a symbol and getproperty(composite, mach.model) is different from the last model used for training, but has the same type.

• (viii) mach.model is a symbol and getproperty(composite, mach.model) has a different type from the last model used for training.

      In any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.

      To freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).

      Implementation details

      The data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the "Adding models for general use" section of the MLJ documentation for more on these lower-level training methods.

      source
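A sketch of this logic in action (assuming the DecisionTree.jl interface package is installed; fit_only! is not exported, so it is qualified here):

```julia
using MLJBase, MLJModels

X, y = make_regression()
Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
mach = machine(Tree(), X, y)

MLJBase.fit_only!(mach)             # case (i): never trained, so trains ab initio
MLJBase.fit_only!(mach)             # no-operation: nothing relevant has changed
MLJBase.fit_only!(mach, rows=1:50)  # case (iv): rows changed, so retrains
mach.model.max_depth = 2
MLJBase.fit_only!(mach)             # case (v): same type, mutated model, so a training update
```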
      MLJBase.freeze!Method
      freeze!(mach)

      Freeze the machine mach so that it will never be retrained (unless thawed).

      See also thaw!.

      source
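For example, given any machine mach that has already been trained:

```julia
freeze!(mach)
fit!(mach)   # no-operation: learned parameters are left untouched
thaw!(mach)
fit!(mach)   # the machine can be retrained as usual
```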
      MLJBase.glbMethod
      N = glb(mach::Machine{<:Union{Composite,Surrogate}})

      A greatest lower bound for the nodes appearing in the learning network interface of mach.

      A learning network interface is a named tuple declaring certain interface points in a learning network, to be used when "exporting" the network as a new stand-alone model type. Examples are

 (predict=yhat,)
 (transform=Xsmall, acceleration=CPUThreads())
 (predict=yhat, transform=W, report=(loss=loss_node,))

      Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

      • The name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See "Operation keys" below.

      • :report, for exposing results of calling a node with no arguments in the composite model report. See "Including report nodes" below.

      • :fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See "Including fitted parameter nodes" below.

      • :acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.

      Operation keys

      If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

      Including report nodes

      If the key is :report, then the corresponding value must be a named tuple

       (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

      Including fitted parameter nodes

      If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

      Private method.

      source
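To make the interface idea concrete, here is a sketch of a small learning network together with an interface declaration for it (Standardizer and the DecisionTree.jl interface are assumed available; glb itself is private, so it is not called directly):

```julia
using MLJBase, MLJModels

X, y = make_regression()
Xs = source(X)
ys = source(y)

stand = machine(Standardizer(), Xs)
W = transform(stand, Xs)        # a node

Tree = @load DecisionTreeRegressor pkg=DecisionTree verbosity=0
tree = machine(Tree(), W, ys)
yhat = predict(tree, W)         # another node, with the unique origin Xs

# a learning network interface exposing two operations:
interface = (predict=yhat, transform=W)
```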
      MLJBase.last_modelMethod

      last_model(mach::Machine)

      Return the last model used to train the machine mach. This is a bona fide model, even if mach.model is a symbol.

      Returns nothing if mach has not been trained.

      source
      MLJBase.machineFunction
      machine(model, args...; cache=true, scitype_check_level=1)

Construct a Machine object binding a model, storing hyper-parameters of some machine learning algorithm, to some data, args. Calling fit! on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned parameters) and report(mach) (other outcomes). This in turn enables generalization to new data using operations such as predict or transform:

      using MLJModels
      +)

      Without mutating any other machine on which it may depend, perform one of the following actions to the machine mach, using the data and model bound to it, and restricting the data to rows if specified:

      • Ab initio training. Ignoring any previous learned parameters and cache, compute and store new learned parameters. Increment mach.state.

      • Training update. Making use of previous learned parameters and/or cache, replace or mutate existing learned parameters. The effect is the same (or nearly the same) as in ab initio training, but may be faster or use less memory, assuming the model supports an update option (implements MLJBase.update). Increment mach.state.

      • No-operation. Leave existing learned parameters untouched. Do not increment mach.state.

      If the model, model, bound to mach is a symbol, then instead perform the action using the true model given by getproperty(composite, model). See also machine.

      Training action logic

      For the action to be a no-operation, either mach.frozen == true or or none of the following apply:

      • (i) mach has never been trained (mach.state == 0).

      • (ii) force == true.

      • (iii) The state of some other machine on which mach depends has changed since the last time mach was trained (ie, the last time mach.state was last incremented).

      • (iv) The specified rows have changed since the last retraining and mach.model does not have Static type.

      • (v) mach.model is a model and different from the last model used for training, but has the same type.

      • (vi) mach.model is a model but has a type different from the last model used for training.

      • (vii) mach.model is a symbol and (composite, mach.model) is different from the last model used for training, but has the same type.

      • (viii) mach.model is a symbol and (composite, mach.model) has a different type from the last model used for training.

      In any of the cases (i) - (iv), (vi), or (viii), mach is trained ab initio. If (v) or (vii) is true, then a training update is applied.

      To freeze or unfreeze mach, use freeze!(mach) or thaw!(mach).

      Implementation details

      The data to which a machine is bound is stored in mach.args. Each element of args is either a Node object, or, in the case that concrete data was bound to the machine, it is concrete data wrapped in a Source node. In all cases, to obtain concrete data for actual training, each argument N is called, as in N() or N(rows=rows), and either MLJBase.fit (ab initio training) or MLJBase.update (training update) is dispatched on mach.model and this data. See the "Adding models for general use" section of the MLJ documentation for more on these lower-level training methods.

      source
      MLJBase.freeze!Method
      freeze!(mach)

      Freeze the machine mach so that it will never be retrained (unless thawed).

      See also thaw!.

      source
      MLJBase.last_modelMethod

      last_model(mach::Machine)

      Return the last model used to train the machine mach. This is a bona fide model, even if mach.model is a symbol.

      Returns nothing if mach has not been trained.

      source
      MLJBase.machineFunction
      machine(model, args...; cache=true, scitype_check_level=1)

Construct a Machine object binding a model, storing hyper-parameters of some machine learning algorithm, to some data, args. Calling fit! on a Machine instance mach stores outcomes of applying the algorithm in mach, which can be inspected using fitted_params(mach) (learned parameters) and report(mach) (other outcomes). This in turn enables generalization to new data using operations such as predict or transform:

using MLJModels
X, y = make_regression()

PCA = @load PCA pkg=MultivariateStats
model = PCA()
mach = machine(model, X)
fit!(mach)

Specifying a symbol in place of the model delays the model specification; the true model is then looked up, under that name, in a composite model passed to fit!:

mutable struct MyComposite
    classifier
end
my_composite = MyComposite(ConstantClassifier())

X, y = make_blobs()
mach = machine(:classifier, X, y)
fit!(mach, composite=my_composite)

The last two lines are equivalent to

mach = machine(ConstantClassifier(), X, y)
fit!(mach)

      Delaying model specification is used when exporting learning networks as new stand-alone model types. See prefit and the MLJ documentation on learning networks.

      See also fit!, default_scitype_check_level, MLJBase.save, serializable.

      source
      MLJBase.machineMethod
      machine(file::Union{String, IO})

      Rebuild from a file a machine that has been serialized using the default Serialization module.
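For example, assuming mach is a machine already trained in some session (a sketch; see MLJModelInterface.save below for the full workflow):

io = IOBuffer()
MLJ.save(io, mach)
seekstart(io)
mach2 = machine(io)   # rebuilt machine, ready for predict, transform, etc.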

      source
      MLJBase.model_supertypeMethod
      model_supertype(interface)

      Return, if this can be inferred, which of Deterministic, Probabilistic and Unsupervised is the appropriate supertype for a composite model obtained by exporting a learning network with the specified learning network interface.

      A learning network interface is a named tuple declaring certain interface points in a learning network, to be used when "exporting" the network as a new stand-alone model type. Examples are

(predict=yhat,)
(transform=Xsmall, acceleration=CPUThreads())
(predict=yhat, transform=W, report=(loss=loss_node,))

      Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

      • The name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See "Operation keys" below.

      • :report, for exposing results of calling a node with no arguments in the composite model report. See "Including report nodes" below.

      • :fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See "Including fitted parameter nodes" below.

      • :acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.

      Operation keys

      If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

      Including report nodes

      If the key is :report, then the corresponding value must be a named tuple

       (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

      Including fitted parameter nodes

      If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

      If a supertype cannot be inferred, nothing is returned.

If the network with the given signature is not exportable, this method will not error, but it will not give a meaningful return value either.

      Private method.

      source
      MLJBase.reportMethod
      report(fitresult::CompositeFitresult)

      Return a tuple combining the report from fitresult.glb (a Node report) with the additions coming from nodes declared as report nodes in fitresult.signature, but without merging the two.

      A learning network interface is a named tuple declaring certain interface points in a learning network, to be used when "exporting" the network as a new stand-alone model type. Examples are

(predict=yhat,)
(transform=Xsmall, acceleration=CPUThreads())
(predict=yhat, transform=W, report=(loss=loss_node,))

      Here yhat, Xsmall, W and loss_node are nodes in the network.

The keys of the learning network interface are always one of the following:

      • The name of an operation, such as :predict, :predict_mode, :transform, :inverse_transform. See "Operation keys" below.

      • :report, for exposing results of calling a node with no arguments in the composite model report. See "Including report nodes" below.

      • :fitted_params, for exposing results of calling a node with no arguments as fitted parameters of the composite model. See "Including fitted parameter nodes" below.

      • :acceleration, for articulating acceleration mode for training the network, e.g., CPUThreads(). Corresponding value must be an AbstractResource. If not included, CPU1() is used.

      Operation keys

      If the key is an operation, then the value must be a node n in the network with a unique origin (length(origins(n)) === 1). The intention of a declaration such as predict=yhat is that the exported model type implements predict, which, when applied to new data Xnew, should return yhat(Xnew).

      Including report nodes

      If the key is :report, then the corresponding value must be a named tuple

       (k1=n1, k2=n2, ...)

whose values are all nodes. For each k=n pair, the key k will appear as a key in the composite model report, with a corresponding value of deepcopy(n()), called immediately after training or updating the network. For examples, refer to the "Learning Networks" section of the MLJ manual.

      Including fitted parameter nodes

      If the key is :fitted_params, then the behaviour is as for report nodes but results are exposed as fitted parameters of the composite model instead of the report.

Private method.

      source
      MLJBase.reportMethod
      report(mach)

      Return the report for a machine mach that has been fit!, for example the coefficients in a linear model.

      This is a named tuple and human-readable if possible.

      If mach is a machine for a composite model, such as a model constructed using @pipeline, then the returned named tuple has the composite type's field names as keys. The corresponding value is the report for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)

using MLJ
@load LinearBinaryClassifier pkg=GLM
X, y = @load_crabs;
pipe = @pipeline Standardizer LinearBinaryClassifier
mach = machine(pipe, X, y) |> fit!

julia> report(mach).linear_binary_classifier
(dof_residual = 195.0,
 stderror = [18954.83496713119, 6502.845740757159, 48484.240246060406, 34971.131004997274, 20654.82322484894, 2111.1294584763386],
 vcov = [3.592857686311793e8 9.122732393971942e6 … -8.454645589364915e7 5.38856837634321e6; 9.122732393971942e6 4.228700272808351e7 … -4.978433790526467e7 -8.442545425533723e6; … ; -8.454645589364915e7 -4.978433790526467e7 … 4.2662172244975924e8 2.1799125705781363e7; 5.38856837634321e6 -8.442545425533723e6 … 2.1799125705781363e7 4.456867590446599e6],)

      Additional keys, machines and report_given_machine, give a list of all machines in the underlying network, and a dictionary of reports keyed on those machines.

See also fitted_params.

      source
      MLJBase.report_given_methodMethod
      report_given_method(mach::Machine)

      Same as report(mach) but broken down by the method (fit, predict, etc) that contributed the report.

      A specialized method intended for learning network applications.

The return value is a dictionary keyed on the symbol representing the method (:fit, :predict, etc); the values are the reports contributed by each method.
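A sketch of typical usage, assuming mach has been trained:

d = report_given_method(mach)
d[:fit]       # the part of the report contributed by fit
d[:predict]   # the part contributed by predict, if any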

      source
      MLJBase.restore!Method
      restore!(mach::Machine)

Restore the state of a machine that is currently serializable but which may not be otherwise usable. For such a machine, mach, one has mach.state == -1. Intended for restoring deserialized machine objects to a usable form.

      For an example see serializable.

      source
      MLJBase.return!Method
      return!(mach::Machine{<:Surrogate}, model, verbosity; acceleration=CPU1())

      The last call in custom code defining the MLJBase.fit method for a new composite model type. Here model is the instance of the new type appearing in the MLJBase.fit signature, while mach is a learning network machine constructed using model. Not relevant when defining composite models using @pipeline (deprecated) or @from_network.

      For usage, see the example given below. Specifically, the call does the following:

      • Determines which hyper-parameters of model point to model instances in the learning network wrapped by mach, for recording in an object called cache, for passing onto the MLJ logic that handles smart updating (namely, an MLJBase.update fallback for composite models).

      • Calls fit!(mach, verbosity=verbosity, acceleration=acceleration).

      • Records (among other things) a copy of model in a variable called cache

      • Returns cache and outcomes of training in an appropriate form (specifically, (mach.fitresult, cache, mach.report); see Adding Models for General Use for technical details.)

      Example

      The following code defines, "by hand", a new model type MyComposite for composing standardization (whitening) with a deterministic regressor:

mutable struct MyComposite <: DeterministicComposite
    regressor
end

function MLJBase.fit(model::MyComposite, verbosity, X, y)
    Xs = source(X)
    ys = source(y)

    mach1 = machine(Standardizer(), Xs)
    Xwhite = transform(mach1, Xs)

    mach2 = machine(model.regressor, Xwhite, ys)
    yhat = predict(mach2, Xwhite)

    mach = machine(Deterministic(), Xs, ys; predict=yhat)
    return!(mach, model, verbosity)
end
      source
      MLJBase.serializableMethod
      serializable(mach::Machine)

      Returns a shallow copy of the machine to make it serializable. In particular, all training data is removed and, if necessary, learned parameters are replaced with persistent representations.

      Any general purpose Julia serializer may be applied to the output of serializable (eg, JLSO, BSON, JLD) but you must call restore!(mach) on the deserialised object mach before using it. See the example below.

      If using Julia's standard Serialization library, a shorter workflow is available using the MLJBase.save (or MLJ.save) method.

      A machine returned by serializable is characterized by the property mach.state == -1.

      Example using JLSO

using MLJ
using JLSO
Tree = @load DecisionTreeClassifier
tree = Tree()
X, y = @load_iris
mach = fit!(machine(tree, X, y))

smach = serializable(mach)
JLSO.save("machine.jlso", :machine => smach)

loaded_mach = JLSO.load("machine.jlso")[:machine]
restore!(loaded_mach)

predict(loaded_mach, X)
predict(mach, X)

      See also restore!, MLJBase.save.

      source
      MLJModelInterface.fitted_paramsMethod
      fitted_params(mach)

      Return the learned parameters for a machine mach that has been fit!, for example the coefficients in a linear model.

      This is a named tuple and human-readable if possible.

      If mach is a machine for a composite model, such as a model constructed using @pipeline, then the returned named tuple has the composite type's field names as keys. The corresponding value is the fitted parameters for the machine in the underlying learning network bound to that model. (If multiple machines share the same model, then the value is a vector.)

using MLJ
@load LogisticClassifier pkg=MLJLinearModels
X, y = @load_crabs;
pipe = @pipeline Standardizer LogisticClassifier
mach = machine(pipe, X, y) |> fit!

julia> fitted_params(mach).logistic_classifier
(classes = CategoricalArrays.CategoricalValue{String,UInt32}["B", "O"],
 coefs = Pair{Symbol,Float64}[:FL => 3.7095037897680405, :RW => 0.1135739140854546, :CL => -1.6036892745322038, :CW => -4.415667573486482, :BD => 3.238476051092471],
 intercept = 0.0883301599726305,)

      Additional keys, machines and fitted_params_given_machine, give a list of all machines in the underlying network, and a dictionary of fitted parameters keyed on those machines.

See also report.

      source
      MLJModelInterface.saveMethod
MLJ.save(filename, mach::Machine)
MLJ.save(io, mach::Machine)

MLJBase.save(filename, mach::Machine)
MLJBase.save(io, mach::Machine)

Serialize the machine mach to a file with path filename, or to an input/output stream io, using Julia's standard Serialization module. For example, with mach a trained machine:

io = IOBuffer()
MLJ.save(io, mach)
seekstart(io)
predict_only_mach = machine(io)
predict(predict_only_mach, X)
      Only load files from trusted sources

Maliciously constructed JLS files, like pickles and most other general-purpose serialization formats, can allow arbitrary code execution during loading. This means it is possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.

      See also serializable, machine.

      source
      StatsAPI.fit!Method
fit!(mach::Machine{<:Surrogate};
     rows=nothing,
     acceleration=CPU1(),
     verbosity=1,
     force=false)

      Train the complete learning network wrapped by the machine mach.

      More precisely, if s is the learning network signature used to construct mach, then call fit!(N), where N is a greatest lower bound of the nodes appearing in the signature (values in the signature that are not AbstractNode are ignored). For example, if s = (predict=yhat, transform=W), then call fit!(glb(yhat, W)).
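A minimal sketch, assuming nodes Xs, ys and yhat have been constructed as in the return! example above:

surrogate = machine(Deterministic(), Xs, ys; predict=yhat)
fit!(surrogate)   # trains every machine on which the node yhat depends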

See also machine.

      source
      StatsAPI.fit!Method
fit!(mach::Machine; rows=nothing, verbosity=1, force=false, composite=nothing)

      Fit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.

To attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the internal logic of fitting, see fit_only!.

      source

      Parameter Inspection

      Show

      MLJBase._recursive_showMethod
      _recursive_show(stream, object, current_depth, depth)

Generate a table of the properties of the MLJType object, displaying each property value by calling the method _show on it. The behaviour of _show(stream, f) is as follows:

1. If f is itself a MLJType object, then its short form is shown and _recursive_show generates a separate table for each of its properties (and so on, up to a depth of argument depth).

2. Otherwise f is displayed as "(omitted T)" where T = typeof(f), unless istoobig(f) is false (the istoobig fall-back for arbitrary types being true). In the latter case, the long (ie, MIME"plain/text") form of f is shown. To override this behaviour, overload the _show method for the type in question.

      source
      MLJBase.color_offMethod
      color_off()

      Suppress color and bold output at the REPL for displaying MLJ objects.

      source
      MLJBase.color_onMethod
      color_on()

      Enable color and bold output at the REPL, for enhanced display of MLJ objects.

      source
      MLJBase.handleMethod

Return the abbreviated object id (as a string), or its registered handle (as a string) if one exists.

      source
      MLJBase.@constantMacro
      @constant x = value

      Private method (used in testing).

      Equivalent to const x = value but registers the binding thus:

      MLJBase.HANDLE_GIVEN_ID[objectid(value)] = :x

Registered objects get displayed using the variable name to which they were bound in calls to show(x), etc.

WARNING: As with any const declaration, binding x to a new value of the same type is not prevented and the registration will not be updated.

      source
      MLJBase.@moreMacro
      @more

      Entered at the REPL, equivalent to show(ans, 100). Use to get a recursive description of all properties of the last REPL value.

      source

      Utility functions

      MLJBase.AccuracyType
      MLJBase.Accuracy

      A measure type for accuracy, which includes the instance(s): accuracy.

Accuracy()(ŷ, y)
Accuracy()(ŷ, y, w)

Evaluate the accuracy on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Accuracy is the proportion of correct predictions ŷ[i] that match the ground truth y[i] observations. This metric is invariant to class reordering.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions.
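An illustrative sketch (made-up data):

using MLJBase
ŷ = categorical(["yes", "no", "yes"])
y = categorical(["yes", "no", "no"])
accuracy(ŷ, y)   # 2/3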

      For more information, run info(Accuracy).

      source
      MLJBase.AreaUnderCurveType
      MLJBase.AreaUnderCurve

      A measure type for area under the ROC, which includes the instance(s): area_under_curve, auc.

      AreaUnderCurve()(ŷ, y)

Evaluate the area under the ROC on predictions ŷ, given ground truth observations y.

Returns the area under the ROC (receiver operating characteristic) curve.

If missing or NaN values are present, use auc(skipinvalid(ŷ, y)...).

This metric is invariant to class reordering.

Requires scitype(y) to be a subtype of Union{AbstractArray{<:Union{Missing, ScientificTypesBase.Multiclass{2}}}, AbstractArray{<:Union{Missing, ScientificTypesBase.OrderedFactor{2}}}}; ŷ must be an array of probabilistic predictions.
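A sketch, assuming the UnivariateFinite constructor with augment=true (so that the supplied probabilities are those of the second, "positive" class):

using MLJBase
y = categorical([false, false, true, true], ordered=true)
ŷ = UnivariateFinite([false, true], [0.1, 0.4, 0.8, 0.7], augment=true, pool=y)
auc(ŷ, y)   # 1.0, as every positive case outranks every negative one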

      For more information, run info(AreaUnderCurve).

      source
      MLJBase.BalancedAccuracyType
      MLJBase.BalancedAccuracy

      A measure type for balanced accuracy, which includes the instance(s): balanced_accuracy, bacc, bac.

BalancedAccuracy()(ŷ, y)
BalancedAccuracy()(ŷ, y, w)

Evaluate the default instance of BalancedAccuracy on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Balanced accuracy compensates standard Accuracy for class imbalance. See https://en.wikipedia.org/wiki/Precision_and_recall#Imbalanced_data.

Setting adjusted=true rescales the score in the way prescribed in L. Mosley (2013): A balanced approach to the multi-class imbalance problem. PhD thesis, Iowa State University. In the binary case, the adjusted balanced accuracy is also known as Youden’s J statistic, or informedness.

This metric is invariant to class reordering.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions.

      For more information, run info(BalancedAccuracy).

      source
      MLJBase.BrierLossType
      MLJBase.BrierLoss

      A measure type for Brier loss (a.k.a. quadratic loss), which includes the instance(s): brier_loss.

BrierLoss()(ŷ, y)
BrierLoss()(ŷ, y, w)

Evaluate the Brier loss (a.k.a. quadratic loss) on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For details, see BrierScore, which differs only by a sign.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}} where T is Continuous or Count (for respectively continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions.

      For more information, run info(BrierLoss).

      source
      MLJBase.BrierScoreType
      MLJBase.BrierScore

      A measure type for Brier score (a.k.a. quadratic score), which includes the instance(s): brier_score.

BrierScore()(ŷ, y)
BrierScore()(ŷ, y, w)

Evaluate the Brier score (a.k.a. quadratic score) on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Convention as in Gneiting and Raftery (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation".

      Finite case. If p is the predicted probability mass function for a single observation η, and C all possible classes, then the corresponding score for that observation is given by

      $2p(η) - \left(\sum_{c ∈ C} p(c)^2\right) - 1$
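For instance (an illustrative calculation): with three classes, if p assigns probabilities (0.7, 0.2, 0.1) and the first class is observed, the score is $2(0.7) - (0.7^2 + 0.2^2 + 0.1^2) - 1 = -0.14$.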

      Warning. BrierScore() is a "score" in the sense that bigger is better (with 0 optimal, and all other values negative). In Brier's original 1950 paper, and many other places, it has the opposite sign, despite the name. Moreover, the present implementation does not treat the binary case as special, so that the score may differ in the binary case by a factor of two from usage elsewhere.

      Infinite case. Replacing the sum above with an integral does not lead to the formula adopted here in the case of Continuous or Count target y. Rather the convention in the paper cited above is adopted, which means returning a score of

      $2p(η) - ∫ p(t)^2 dt$

in the Continuous case (p the probability density function) or

      $2p(η) - ∑_t p(t)^2$

in the Count case (p the probability mass function).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}} where T is Continuous or Count (for respectively continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions.

      For more information, run info(BrierScore).

      source
      MLJBase.ConfusionMatrixType
      MLJBase.ConfusionMatrix

      A measure type for confusion matrix, which includes the instance(s): confusion_matrix, confmat.

      ConfusionMatrix()(ŷ, y)

Evaluate the default instance of ConfusionMatrix on predictions ŷ, given ground truth observations y.

      If r is the return value, then the raw confusion matrix is r.mat, whose rows correspond to predictions, and columns to ground truth. The ordering follows that of levels(y).

      Use ConfusionMatrix(perm=[2, 1]) to reverse the class order for binary data. For more than two classes, specify an appropriate permutation, as in ConfusionMatrix(perm=[2, 3, 1]).
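An illustrative sketch (made-up data):

using MLJBase
y = categorical(["a", "b", "a", "b"], ordered=true)
ŷ = categorical(["a", "a", "a", "b"], ordered=true)
cm = confusion_matrix(ŷ, y)
cm.mat   # rows are predictions, columns are ground truth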

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(ConfusionMatrix).

      source
      MLJBase.DWDMarginLossType
      MLJBase.DWDMarginLoss

      A measure type for distance weighted discrimination loss, which includes the instance(s): dwd_margin_loss.

DWDMarginLoss()(ŷ, y)
DWDMarginLoss()(ŷ, y, w)

Evaluate the default instance of DWDMarginLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      Constructor signature: DWDMarginLoss(; q=1.0)

      For more information, run info(DWDMarginLoss).

      source
      MLJBase.ExpLossType
      MLJBase.ExpLoss

      A measure type for exp loss, which includes the instance(s): exp_loss.

ExpLoss()(ŷ, y)
ExpLoss()(ŷ, y, w)

Evaluate the default instance of ExpLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(ExpLoss).

      source
      MLJBase.FScoreType
      MLJBase.FScore

      A measure type for F-Score, which includes the instance(s): f1score.

FScore()(ŷ, y)

Evaluate the default instance of FScore on predictions ŷ, given ground truth observations y.

This is the one-parameter generalization, $F_β$, of the F-measure or balanced F-score.

https://en.wikipedia.org/wiki/F1_score

Constructor signature: FScore(; β=1.0, rev=false).

By default, the second element of levels(y) is designated as true. To reverse roles, specify rev=true.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(FScore).

      source
      MLJBase.FalseDiscoveryRateType
      MLJBase.FalseDiscoveryRate

      A measure type for false discovery rate, which includes the instance(s): false_discovery_rate, falsediscovery_rate, fdr.

FalseDiscoveryRate()(ŷ, y)

Evaluate the default instance of FalseDiscoveryRate on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use FalseDiscoveryRate(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(FalseDiscoveryRate).

      source
      MLJBase.FalseNegativeType
      MLJBase.FalseNegative

      A measure type for number of false negatives, which includes the instance(s): false_negative, falsenegative.

FalseNegative()(ŷ, y)

Evaluate the default instance of FalseNegative on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use FalseNegative(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(FalseNegative).

      source
      MLJBase.FalseNegativeRateType
      MLJBase.FalseNegativeRate

      A measure type for false negative rate, which includes the instance(s): false_negative_rate, falsenegative_rate, fnr, miss_rate.

FalseNegativeRate()(ŷ, y)

Evaluate the default instance of FalseNegativeRate on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use FalseNegativeRate(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(FalseNegativeRate).

      source
      MLJBase.FalsePositiveType
      MLJBase.FalsePositive

      A measure type for number of false positives, which includes the instance(s): false_positive, falsepositive.

FalsePositive()(ŷ, y)

Evaluate the default instance of FalsePositive on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use FalsePositive(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(FalsePositive).

      source
      MLJBase.FalsePositiveRateType
      MLJBase.FalsePositiveRate

      A measure type for false positive rate, which includes the instance(s): false_positive_rate, falsepositive_rate, fpr, fallout.

FalsePositiveRate()(ŷ, y)

Evaluate the default instance of FalsePositiveRate on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use FalsePositiveRate(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(FalsePositiveRate).

      source
      MLJBase.HuberLossType
      MLJBase.HuberLoss

      A measure type for huber loss, which includes the instance(s): huber_loss.

HuberLoss()(ŷ, y)
HuberLoss()(ŷ, y, w)

Evaluate the default instance of HuberLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

Constructor signature: HuberLoss(; d=1.0)
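An illustrative sketch of the skipinvalid pattern:

using MLJBase
ŷ = [1.0, 2.0, missing]
y = [1.5, missing, 3.0]
huber_loss(skipinvalid(ŷ, y)...)   # loss computed on the first pair only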

      For more information, run info(HuberLoss).

      source
      MLJBase.KappaType
      MLJBase.Kappa

      A measure type for kappa, which includes the instance(s): kappa.

      Kappa()(ŷ, y)

Evaluate the kappa on predictions ŷ, given ground truth observations y.

A metric to measure agreement between predicted labels and the ground truth. See https://en.wikipedia.org/wiki/Cohen%27s_kappa.

This metric is invariant to class reordering.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions.

      For more information, run info(Kappa).

      source
      MLJBase.L1EpsilonInsLossType
      MLJBase.L1EpsilonInsLoss

      A measure type for l1 ϵ-insensitive loss, which includes the instance(s): l1_epsilon_ins_loss.

L1EpsilonInsLoss()(ŷ, y)
L1EpsilonInsLoss()(ŷ, y, w)

Evaluate the default instance of L1EpsilonInsLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

Constructor signature: L1EpsilonInsLoss(; ε=1.0)

      For more information, run info(L1EpsilonInsLoss).

      source
      MLJBase.L1HingeLossType
      MLJBase.L1HingeLoss

      A measure type for l1 hinge loss, which includes the instance(s): l1_hinge_loss.

L1HingeLoss()(ŷ, y)
L1HingeLoss()(ŷ, y, w)

Evaluate the default instance of L1HingeLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(L1HingeLoss).

      source
      MLJBase.L2EpsilonInsLossType
      MLJBase.L2EpsilonInsLoss

      A measure type for l2 ϵ-insensitive loss, which includes the instance(s): l2_epsilon_ins_loss.

L2EpsilonInsLoss()(ŷ, y)
L2EpsilonInsLoss()(ŷ, y, w)

Evaluate the default instance of L2EpsilonInsLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

Constructor signature: L2EpsilonInsLoss(; ε=1.0)

      For more information, run info(L2EpsilonInsLoss).

      source
      MLJBase.L2HingeLossType
      MLJBase.L2HingeLoss

      A measure type for l2 hinge loss, which includes the instance(s): l2_hinge_loss.

L2HingeLoss()(ŷ, y)
L2HingeLoss()(ŷ, y, w)

Evaluate the default instance of L2HingeLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(L2HingeLoss).

      source
      MLJBase.L2MarginLossType
      MLJBase.L2MarginLoss

      A measure type for l2 margin loss, which includes the instance(s): l2_margin_loss.

L2MarginLoss()(ŷ, y)
L2MarginLoss()(ŷ, y, w)

Evaluate the default instance of L2MarginLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(L2MarginLoss).

      source
      MLJBase.LPDistLossType
      MLJBase.LPDistLoss

      A measure type for lp dist loss, which includes the instance(s): lp_dist_loss.

LPDistLoss()(ŷ, y)
LPDistLoss()(ŷ, y, w)

Evaluate the default instance of LPDistLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

Constructor signature: LPDistLoss(; P=2)

      For more information, run info(LPDistLoss).

      source
      MLJBase.LPLossType
      MLJBase.LPLoss

      A measure type for lp loss, which includes the instance(s): l1, l2.

LPLoss()(ŷ, y)
LPLoss()(ŷ, y, w)

Evaluate the default instance of LPLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Constructor signature: LPLoss(p=2). Reports |ŷ[i] - y[i]|^p for every index i.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions.
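For example, using the built-in instance l2 (the case p=2):

using MLJBase
l2([1.0, 2.0], [1.5, 2.5])   # [0.25, 0.25], one squared residual per observation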

      For more information, run info(LPLoss).

      source
      MLJBase.LogCoshLossType
      MLJBase.LogCoshLoss

      A measure type for log cosh loss, which includes the instance(s): log_cosh, log_cosh_loss.

LogCoshLoss()(ŷ, y)
LogCoshLoss()(ŷ, y, w)

Evaluate the log cosh loss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Reports $\log(\cosh(ŷᵢ-yᵢ))$ for each index i.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions.

      For more information, run info(LogCoshLoss).

      source
      MLJBase.LogLossType
      MLJBase.LogLoss

      A measure type for log loss, which includes the instance(s): log_loss, cross_entropy.

LogLoss()(ŷ, y)
LogLoss()(ŷ, y, w)

Evaluate the default instance of LogLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For details, see LogScore, which differs only by a sign.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}} where T is Continuous or Count (for respectively continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions.

      For more information, run info(LogLoss).

      source
      MLJBase.LogScoreType
      MLJBase.LogScore

      A measure type for log score, which includes the instance(s): log_score.

LogScore()(ŷ, y)
LogScore()(ŷ, y, w)

Evaluate the default instance of LogScore on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Since the score is undefined in the case that the true observation is predicted to occur with probability zero, probabilities are clamped between tol and 1-tol, where tol is a constructor keyword argument.

If p is the predicted probability mass or density function corresponding to a single ground truth observation η, then the score for that example is

log(clamp(p(η), tol, 1 - tol))

For example, for a binary target with "yes"/"no" labels, and predicted probability of "yes" equal to 0.8, an observation of "no" scores log(0.2).

The predictions ŷ should be an array of UnivariateFinite distributions in the case of Finite target y, and otherwise a supported Distributions.UnivariateDistribution such as Normal or Poisson.

See also LogLoss, which differs only in sign.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}} where T is Continuous or Count (for respectively continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); ŷ must be an array of probabilistic predictions.

      For more information, run info(LogScore).

      source
      MLJBase.LogitDistLossType
      MLJBase.LogitDistLoss

      A measure type for logit dist loss, which includes the instance(s): logit_dist_loss.

LogitDistLoss()(ŷ, y)
LogitDistLoss()(ŷ, y, w)

Evaluate the default instance of LogitDistLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

      For more information, run info(LogitDistLoss).

      source
      MLJBase.LogitMarginLossType
      MLJBase.LogitMarginLoss

      A measure type for logit margin loss, which includes the instance(s): logit_margin_loss.

LogitMarginLoss()(ŷ, y)
LogitMarginLoss()(ŷ, y, w)

Evaluate the default instance of LogitMarginLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(LogitMarginLoss).

      source
      MLJBase.MatthewsCorrelationType
      MLJBase.MatthewsCorrelation

      A measure type for matthews correlation, which includes the instance(s): matthews_correlation, mcc.

      MatthewsCorrelation()(ŷ, y)

Evaluate the Matthews correlation on predictions ŷ, given ground truth observations y.

See https://en.wikipedia.org/wiki/Matthews_correlation_coefficient. This metric is invariant to class reordering.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of deterministic predictions.

      For more information, run info(MatthewsCorrelation).

      source
      MLJBase.MeanAbsoluteErrorType
      MLJBase.MeanAbsoluteError

      A measure type for mean absolute error, which includes the instance(s): mae, mav, mean_absolute_error, mean_absolute_value.

MeanAbsoluteError()(ŷ, y)
MeanAbsoluteError()(ŷ, y, w)

Evaluate the mean absolute error on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

$\text{mean absolute error} = n^{-1}∑ᵢ|yᵢ-ŷᵢ|$ or $\text{mean absolute error} = n^{-1}∑ᵢwᵢ|yᵢ-ŷᵢ|$

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions.
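For example:

using MLJBase
mae([1.0, 2.0, 3.0], [2.0, 2.0, 5.0])   # (1 + 0 + 2)/3 = 1.0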

      For more information, run info(MeanAbsoluteError).

      source
      MLJBase.MeanAbsoluteProportionalErrorType
      MLJBase.MeanAbsoluteProportionalError

      A measure type for mean absolute proportional error, which includes the instance(s): mape.

MeanAbsoluteProportionalError()(ŷ, y)
MeanAbsoluteProportionalError()(ŷ, y, w)

Evaluate the default instance of MeanAbsoluteProportionalError on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Constructor keyword arguments: tol (default = eps()).

$\text{mean absolute proportional error} = m^{-1}∑ᵢ\left|{(yᵢ-ŷᵢ) \over yᵢ}\right|$

where the sum is over indices such that abs(yᵢ) > tol and m is the number of such indices.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions.

      For more information, run info(MeanAbsoluteProportionalError).

      source
      MLJBase.MisclassificationRateType
      MLJBase.MisclassificationRate

      A measure type for misclassification rate, which includes the instance(s): misclassification_rate, mcr.

MisclassificationRate()(ŷ, y)
MisclassificationRate()(ŷ, y, w)

Evaluate the misclassification rate on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

A confusion matrix can also be passed as an argument. This metric is invariant to class reordering.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite,Missing}} (multiclass classification); ŷ must be an array of deterministic predictions.

      For more information, run info(MisclassificationRate).

      source
      MLJBase.ModifiedHuberLossType
      MLJBase.ModifiedHuberLoss

      A measure type for modified huber loss, which includes the instance(s): modified_huber_loss.

ModifiedHuberLoss()(ŷ, y)
ModifiedHuberLoss()(ŷ, y, w)

Evaluate the default instance of ModifiedHuberLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(ModifiedHuberLoss).

      source
      MLJBase.NegativePredictiveValueType
      MLJBase.NegativePredictiveValue

      A measure type for negative predictive value, which includes the instance(s): negative_predictive_value, negativepredictive_value, npv.

NegativePredictiveValue()(ŷ, y)

Evaluate the default instance of NegativePredictiveValue on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use NegativePredictiveValue(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(NegativePredictiveValue).

      source
      MLJBase.PerceptronLossType
      MLJBase.PerceptronLoss

      A measure type for perceptron loss, which includes the instance(s): perceptron_loss.

PerceptronLoss()(ŷ, y)
PerceptronLoss()(ŷ, y, w)

Evaluate the default instance of PerceptronLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); ŷ must be an array of probabilistic predictions.

      For more information, run info(PerceptronLoss).

      source
      MLJBase.PeriodicLossType
      MLJBase.PeriodicLoss

      A measure type for periodic loss, which includes the instance(s): periodic_loss.

PeriodicLoss()(ŷ, y)
PeriodicLoss()(ŷ, y, w)

Evaluate the default instance of PeriodicLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

      For more information, run info(PeriodicLoss).

      source
      MLJBase.PrecisionType
      MLJBase.Precision

      A measure type for precision (a.k.a. positive predictive value), which includes the instance(s): positive_predictive_value, ppv, positivepredictive_value, precision.

Precision()(ŷ, y)

Evaluate the default instance of Precision on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use Precision(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); ŷ must be an array of deterministic predictions.

      For more information, run info(Precision).

      source
      MLJBase.QuantileLossType
      MLJBase.QuantileLoss

      A measure type for quantile loss, which includes the instance(s): quantile_loss.

QuantileLoss()(ŷ, y)
QuantileLoss()(ŷ, y, w)

Evaluate the default instance of QuantileLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

Requires scitype(y) to be a subtype of Union{AbstractVector{ScientificTypesBase.Continuous}, AbstractVector{ScientificTypesBase.Count}}; ŷ must be an array of deterministic predictions.

Constructor signature: QuantileLoss(; τ=0.7)

      For more information, run info(QuantileLoss).

      source
      MLJBase.RSquaredType
      MLJBase.RSquared

      A measure type for r squared, which includes the instance(s): rsq, rsquared.

RSquared()(ŷ, y)

Evaluate the r squared on predictions ŷ, given ground truth observations y.

The R² (also known as R-squared or coefficient of determination) is suitable for interpreting linear regression analysis (Chicco et al., 2021).

Let $\overline{y}$ denote the mean of $y$; then

$R^2 = 1 - \frac{∑ᵢ (ŷᵢ - yᵢ)^2}{∑ᵢ (\overline{y} - yᵢ)^2}.$

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions.
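For example:

using MLJBase
rsq([2.0, 3.0, 4.0], [2.0, 4.0, 6.0])   # 1 - 5/8 = 0.375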

      For more information, run info(RSquared).

      source
      MLJBase.RootMeanSquaredErrorType
      MLJBase.RootMeanSquaredError

      A measure type for root mean squared error, which includes the instance(s): rms, rmse, root_mean_squared_error.

RootMeanSquaredError()(ŷ, y)
RootMeanSquaredError()(ŷ, y, w)

Evaluate the root mean squared error on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

$\text{root mean squared error} = \sqrt{n^{-1}∑ᵢ|yᵢ-ŷᵢ|^2}$ or $\text{root mean squared error} = \sqrt{\frac{∑ᵢwᵢ|yᵢ-ŷᵢ|^2}{∑ᵢwᵢ}}$

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; ŷ must be an array of deterministic predictions.

      For more information, run info(RootMeanSquaredError).

      source
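A minimal usage sketch, including per-sample weights (numbers invented):

using MLJBase

y = [1.0, 2.0, 3.0]
ŷ = [1.1, 1.9, 3.3]
w = [1.0, 1.0, 2.0]

rms(ŷ, y)      # unweighted: sqrt((0.01 + 0.01 + 0.09)/3)
rms(ŷ, y, w)   # weighted:   sqrt((0.01 + 0.01 + 2*0.09)/4)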
      MLJBase.RootMeanSquaredLogErrorType
      MLJBase.RootMeanSquaredLogError

      A measure type for root mean squared log error, which includes the instance(s): rmsl, rmsle, root_mean_squared_log_error.

RootMeanSquaredLogError()(ŷ, y)
RootMeanSquaredLogError()(ŷ, y, w)

Evaluate the root mean squared log error on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

      $\text{root mean squared log error} = \sqrt{n^{-1}∑ᵢ\log\left({yᵢ \over ŷᵢ}\right)^2}$

      Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; must be an array of deterministic predictions.

      See also rmslp1.

      For more information, run info(RootMeanSquaredLogError).

      source
      MLJBase.RootMeanSquaredLogProportionalErrorType
      MLJBase.RootMeanSquaredLogProportionalError

      A measure type for root mean squared log proportional error, which includes the instance(s): rmslp1.

RootMeanSquaredLogProportionalError()(ŷ, y)
RootMeanSquaredLogProportionalError()(ŷ, y, w)

Evaluate the default instance of RootMeanSquaredLogProportionalError on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Constructor signature: RootMeanSquaredLogProportionalError(; offset = 1.0).

$\text{root mean squared log proportional error} = \sqrt{n^{-1}∑ᵢ\log\left({yᵢ + \text{offset} \over ŷᵢ + \text{offset}}\right)^2}$

      Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; must be an array of deterministic predictions.

      See also rmsl.

      For more information, run info(RootMeanSquaredLogProportionalError).

      source
      MLJBase.RootMeanSquaredProportionalErrorType
      MLJBase.RootMeanSquaredProportionalError

      A measure type for root mean squared proportional error, which includes the instance(s): rmsp.

RootMeanSquaredProportionalError()(ŷ, y)
RootMeanSquaredProportionalError()(ŷ, y, w)

Evaluate the default instance of RootMeanSquaredProportionalError on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

      Constructor keyword arguments: tol (default = eps()).

      $\text{root mean squared proportional error} = \sqrt{m^{-1}∑ᵢ \left({yᵢ-ŷᵢ \over yᵢ}\right)^2}$

      where the sum is over indices such that abs(yᵢ) > tol and m is the number of such indices.

      Requires scitype(y) to be a subtype of AbstractArray{<:Union{Infinite,Missing}}; must be an array of deterministic predictions.

      For more information, run info(RootMeanSquaredProportionalError).

      source
      MLJBase.SigmoidLossType
      MLJBase.SigmoidLoss

      A measure type for sigmoid loss, which includes the instance(s): sigmoid_loss.

SigmoidLoss()(ŷ, y)
SigmoidLoss()(ŷ, y, w)

Evaluate the default instance of SigmoidLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

      Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); must be an array of probabilistic predictions.

      For more information, run info(SigmoidLoss).

      source
      MLJBase.SmoothedL1HingeLossType
      MLJBase.SmoothedL1HingeLoss

      A measure type for smoothed l1 hinge loss, which includes the instance(s): smoothed_l1_hinge_loss.

SmoothedL1HingeLoss()(ŷ, y)
SmoothedL1HingeLoss()(ŷ, y, w)

Evaluate the default instance of SmoothedL1HingeLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

      Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); must be an array of probabilistic predictions.

      Constructor signature: SmoothedL1HingeLoss(; gamma=1.0)

      For more information, run info(SmoothedL1HingeLoss).

      source
      MLJBase.SphericalScoreType
      MLJBase.SphericalScore

      A measure type for Spherical score, which includes the instance(s): spherical_score.

SphericalScore()(ŷ, y)
SphericalScore()(ŷ, y, w)

Evaluate the default instance of SphericalScore on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

Convention as in Gneiting and Raftery (2007), "Strictly Proper Scoring Rules, Prediction, and Estimation": If η takes on a finite number of classes C and p(η) is the predicted probability for a single observation η, then the corresponding score for that observation is given by

$p(y)^α / \left(\sum_{η ∈ C} p(η)^α\right)^{1-α} - 1$

where α is the measure parameter alpha.

In the case that the predictions ŷ are continuous probability distributions, such as Distributions.Normal, replace the above sum with an integral, and interpret p as the probability density function. In the case of discrete distributions over the integers, such as Distributions.Poisson, sum over all integers instead of C.

Requires scitype(y) to be a subtype of AbstractArray{<:Union{Missing,T}}, where T is Continuous or Count (for, respectively, continuous or discrete Distributions.jl objects in ŷ) or OrderedFactor or Multiclass (for UnivariateFinite distributions in ŷ); must be an array of probabilistic predictions.

      For more information, run info(SphericalScore).

      source
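A minimal sketch for the finite (classification) case, with invented probabilistic predictions expressed as UnivariateFinite distributions:

using MLJBase, CategoricalArrays

y = categorical(["yes", "no", "yes"])

# each row of the matrix gives one observation's probabilities for the levels ["no", "yes"]:
ŷ = UnivariateFinite(["no", "yes"], [0.2 0.8; 0.7 0.3; 0.4 0.6], pool=y)

spherical_score(ŷ, y)   # per-observation scores (aggregate with, e.g., mean)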
      MLJBase.TrueNegativeType
      MLJBase.TrueNegative

      A measure type for number of true negatives, which includes the instance(s): true_negative, truenegative.

      TrueNegative()(ŷ, y)

Evaluate the default instance of TrueNegative on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use TrueNegative(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); must be an array of deterministic predictions.

      For more information, run info(TrueNegative).

      source
      MLJBase.TrueNegativeRateType
      MLJBase.TrueNegativeRate

      A measure type for true negative rate, which includes the instance(s): true_negative_rate, truenegative_rate, tnr, specificity, selectivity.

      TrueNegativeRate()(ŷ, y)

Evaluate the default instance of TrueNegativeRate on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use TrueNegativeRate(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); must be an array of deterministic predictions.

      For more information, run info(TrueNegativeRate).

      source
      MLJBase.TruePositiveType
      MLJBase.TruePositive

      A measure type for number of true positives, which includes the instance(s): true_positive, truepositive.

      TruePositive()(ŷ, y)

Evaluate the default instance of TruePositive on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use TruePositive(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); must be an array of deterministic predictions.

      For more information, run info(TruePositive).

      source
      MLJBase.TruePositiveRateType
      MLJBase.TruePositiveRate

A measure type for true positive rate (a.k.a. recall), which includes the instance(s): true_positive_rate, truepositive_rate, tpr, sensitivity, recall, hit_rate.

TruePositiveRate()(ŷ, y)

Evaluate the default instance of TruePositiveRate on predictions ŷ, given ground truth observations y.

Assigns false to the first element of levels(y). To reverse roles, use TruePositiveRate(rev=true).

Requires scitype(y) to be a subtype of AbstractArray{<:Union{OrderedFactor{2},Missing}} (binary classification where the choice of "true" affects the measure); must be an array of deterministic predictions.

      For more information, run info(TruePositiveRate).

      source
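A minimal sketch contrasting tpr with tnr (data invented):

using MLJBase, CategoricalArrays

y = categorical([0, 1, 1, 0, 1], ordered=true)   # levels 0 < 1, so class 1 is "true"
ŷ = categorical([1, 1, 0, 0, 1], ordered=true)

tpr(ŷ, y)   # 2 of 3 actual positives predicted positive
tnr(ŷ, y)   # 1 of 2 actual negatives predicted negative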
      MLJBase.ZeroOneLossType
      MLJBase.ZeroOneLoss

      A measure type for zero one loss, which includes the instance(s): zero_one_loss.

ZeroOneLoss()(ŷ, y)
ZeroOneLoss()(ŷ, y, w)

Evaluate the default instance of ZeroOneLoss on predictions ŷ, given ground truth observations y. Optionally specify per-sample weights, w.

For more detail, see the original LossFunctions.jl documentation but note differences in the signature.

Losses from LossFunctions.jl do not support missing values. To use with missing values, replace (ŷ, y) with skipinvalid(ŷ, y).

      Requires scitype(y) to be a subtype of AbstractArray{<:Union{Finite{2},Missing}} (binary classification); must be an array of probabilistic predictions.

      For more information, run info(ZeroOneLoss).

      source
predict(predict_only_mach, X)

Only load files from trusted sources

Maliciously constructed JLS files, like pickles and most other general-purpose serialization formats, can allow for arbitrary code execution during loading. This makes it possible for someone to use a JLS file that looks like a serialized MLJ machine as a Trojan horse.

      See also serializable, machine.

      source
      StatsAPI.fit!Method
fit!(mach::Machine; rows=nothing, verbosity=1, force=false, composite=nothing)

      Fit the machine mach. In the case that mach has Node arguments, first train all other machines on which mach depends.

To attempt to fit a machine without touching any other machine, use fit_only!. For more on options and the internal logic of fitting, see fit_only!.

      source
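A minimal sketch of a typical call, where model, X and y stand in for an actual MLJ model instance and compatible training data:

mach = machine(model, X, y)
fit!(mach, rows=1:100, verbosity=0)   # train on the first 100 rows only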

      Parameter Inspection

      Show

      MLJBase._recursive_showMethod
      _recursive_show(stream, object, current_depth, depth)

Generate a table of the properties of the MLJType object, displaying each property value by calling the method _show on it. The behaviour of _show(stream, f) is as follows:

1. If f is itself a MLJType object, then its short form is shown and _recursive_show generates a separate table for each of its properties (and so on, up to a depth of argument depth).

2. Otherwise f is displayed as "(omitted T)" where T = typeof(f), unless istoobig(f) is false (the istoobig fall-back for arbitrary types being true). In the latter case, the long (i.e., MIME "text/plain") form of f is shown. To override this behaviour, overload the _show method for the type in question.

      source
      MLJBase.color_offMethod
      color_off()

      Suppress color and bold output at the REPL for displaying MLJ objects.

      source
      MLJBase.color_onMethod
      color_on()

      Enable color and bold output at the REPL, for enhanced display of MLJ objects.

      source
      MLJBase.handleMethod

Return the abbreviated object id (as a string), or its registered handle (as a string) if one exists.

      source
      MLJBase.@constantMacro
      @constant x = value

      Private method (used in testing).

      Equivalent to const x = value but registers the binding thus:

      MLJBase.HANDLE_GIVEN_ID[objectid(value)] = :x

Registered objects get displayed using the variable name to which they were bound in calls to show(x), etc.

WARNING: As with any const declaration, binding x to a new value of the same type is not prevented and the registration will not be updated.

      source
      MLJBase.@moreMacro
      @more

      Entered at the REPL, equivalent to show(ans, 100). Use to get a recursive description of all properties of the last REPL value.

      source

      Utility functions

      MLJBase._permute_rowsMethod

_permute_rows(obj, perm)

      Internal function to return a vector or matrix with permuted rows given the permutation perm.

      source
      MLJBase.available_nameMethod
      available_name(modl::Module, name::Symbol)

      Function to replace, if necessary, a given name with a modified one that ensures it is not the name of any existing object in the global scope of modl. Modifications are created with numerical suffixes.

      source
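A hypothetical REPL illustration of the numerical-suffix behaviour (the exact suffix chosen is an assumption, not checked against the implementation):

julia> MLJBase.available_name(Main, :wrapper)
:wrapper              # assuming no `wrapper` is defined in Main

julia> wrapper = 1;

julia> MLJBase.available_name(Main, :wrapper)
:wrapper2             # name modified to avoid the clash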
      MLJBase.check_same_nrowsMethod
      check_same_nrows(X, Y)

      Internal function to check two objects, each a vector or a matrix, have the same number of rows.

      source
      MLJBase.chunksMethod
      chunks(range, n)

      Split an AbstractRange into n subranges of approximately equal length.

      Example

      julia> collect(chunks(1:5, 2))
2-element Array{UnitRange{Int64},1}:
 1:3
 4:5

Private method.

source
      MLJBase.flat_valuesMethod
      flat_values(t::NamedTuple)

      View a nested named tuple t as a tree and return, as a tuple, the values at the leaves, in the order they appear in the original tuple.

julia> t = (X = (x = 1, y = 2), Y = 3)

julia> flat_values(t)
(1, 2, 3)
      source
      MLJBase.generate_name!Method
      generate_name!(M, existing_names; only=Union{Function,Type}, substitute=:f)

      Given a type M (e.g., MyEvenInteger{N}) return a symbolic, snake-case, representation of the type name (such as my_even_integer). The symbol is pushed to existing_names, which must be an AbstractVector to which a Symbol can be pushed.

      If the snake-case representation already exists in existing_names a suitable integer is appended to the name.

      If only is specified, then the operation is restricted to those M for which M isa only. In all other cases the symbolic name is generated using substitute as the base symbol.

julia> existing_names = [];

julia> generate_name!(Vector{Int}, existing_names)
:vector

julia> generate_name!(Int, existing_names, only=Array, substitute=:not_array)
:not_array

julia> generate_name!(Int, existing_names, only=Array, substitute=:not_array)
:not_array2
      source
      MLJBase.measures_for_exportMethod
      measures_for_export()

Return a list of the symbolic representation of all:

• measure types (subtypes of Aggregated and Unaggregated)

• measure type aliases (as defined by the constant MLJBase.MEASURE_TYPE_ALIASES)

• all built-in measure instances (as declared by the instances trait)

      source
      MLJBase.metadata_measureMethod
      metadata_measure(T; kw...)

      Helper function to write the metadata (trait definitions) for a single measure.

      Compulsory keyword arguments

• target_scitype: The allowed scientific type of y in measure(ŷ, y, ...). This is typically some abstract array. E.g., in single target variable regression this is typically AbstractArray{<:Union{Missing,Continuous}}. For a binary classification metric insensitive to class order, this would typically be Union{AbstractArray{<:Union{Missing,Multiclass{2}}}, AbstractArray{<:Union{Missing,OrderedFactor{2}}}}, which has the alias FiniteArrMissing.

      • orientation: Orientation of the measure. Use :loss when lower is better and :score when higher is better. For example, set :loss for root mean square and :score for area under the ROC curve.

• prediction_type: Refers to ŷ in measure(ŷ, y, ...) and should be one of: :deterministic (ŷ has the same type as y), :probabilistic or :interval.

      Optional keyword arguments

      The following have meaningful defaults but may still require overloading:

• instances: A vector of strings naming the built-in instances of the measurement type provided by the implementation, which are usually just common aliases for the default instance. E.g., RSquared has instances = ["rsq", "rsquared"], both defined as RSquared() in the implementation. MulticlassFScore has instances = ["macro_f1score", "micro_f1score", "multiclass_f1score"], where micro_f1score = MulticlassFScore(average=micro_avg), etc. Default is String[].

      • aggregation: Aggregation method for measurements, typically Mean() (for, e.g., mean absolute error) or Sum() (for number of true positives). Default is Mean(). Must subtype StatisticalTraits.AggregationMode. It is used to:

        • aggregate measurements in resampling (e.g., cross-validation)

  • aggregate per-observation measurements returned by single in the fallback definition of call for Unaggregated measures (such as area under the ROC curve).

      • supports_weights: Whether the measure can be called with per-observation weights w, as in l2(ŷ, y, w). Default is true.

• supports_class_weights: Whether the measure can be called with a class weight dictionary w, as in micro_f1score(ŷ, y, w). Default is false.

• human_name: Ordinary name of measure. Used in the full auto-generated docstring, which begins "A measure type for human_name ...". E.g., the human_name for TruePositive is number of true positives. Default is the snake-case version of the type name, with underscores replaced by spaces; so MeanAbsoluteError becomes "mean absolute error".

      • docstring: An abbreviated docstring, displayed by info(measure). Fallback uses human_name and lists the instances.

      source
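As an illustration, a hypothetical per-observation loss might declare its traits as follows. This is only a sketch: MyMAE and its calling behaviour are invented, and a real measure would implement the full measure API rather than a bare functor.

using MLJBase

# hypothetical unaggregated (per-observation) loss:
struct MyMAE <: MLJBase.Unaggregated end
(::MyMAE)(ŷ, y) = abs.(ŷ .- y)

MLJBase.metadata_measure(MyMAE;
    target_scitype   = AbstractArray{<:Union{Missing,Continuous}},
    orientation      = :loss,
    prediction_type  = :deterministic,
    instances        = ["my_mae"],
    supports_weights = false,
    human_name       = "my mean absolute error")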
      MLJBase.guess_model_target_observation_scitypeMethod
guess_model_target_observation_scitype(model)

Private method.

Try to infer a lowest upper bound on the scitype of target observations acceptable to model, by inspecting target_scitype(model). Return Unknown if unable to draw a reliable inference.

      The observation scitype for a table is here understood as the scitype of a row converted to a vector.

      source
      MLJBase.guess_observation_scitypeMethod
      guess_observation_scitype(y)

      Private method.

      If y is an AbstractArray, return the scitype of y[:, :, ..., :, 1]. If y is a table, return the scitype of the first row, converted to a vector, unless this row has missing elements, in which case return Unknown.

      In all other cases, Unknown.

julia> guess_observation_scitype([missing, 1, 2, 3])
Union{Missing, Count}

julia> guess_observation_scitype(rand(3, 2))
AbstractVector{Continuous}

julia> guess_observation_scitype((x=rand(3), y=rand(Bool, 3)))
AbstractVector{Union{Continuous, Count}}

julia> guess_observation_scitype((x=[missing, 1, 2], y=[1, 2, 3]))
Unknown
      source
      MLJBase.init_rngMethod

      init_rng(rng)

      Create an AbstractRNG from rng. If rng is a non-negative Integer, it returns a MersenneTwister random number generator seeded with rng; If rng is an AbstractRNG object it returns rng, otherwise it throws an error.

      source
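For example (behaviour as documented above):

using Random

MLJBase.init_rng(42)            # MersenneTwister seeded with 42
rng = MersenneTwister(7)
MLJBase.init_rng(rng) === rng   # true: AbstractRNG objects are passed through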
      MLJBase.observationMethod
      observation(S)

      Private method.

Tries to infer the per-observation scitype from the scitype of S, when S is known to be the scitype of some container with multiple observations; here we view the scitype for one row of a table to be the scitype of the row converted to a vector. Return Unknown if unable to draw a reliable inference.

      source
      MLJBase.prependMethod
      MLJBase.prepend(::Symbol, ::Union{Symbol,Expr,Nothing})

      For prepending symbols in expressions like :(y.w) and :(x1.x2.x3).

julia> prepend(:x, :y)
:(x.y)

julia> prepend(:x, :(y.z))
:(x.y.z)

julia> prepend(:w, ans)
:(w.x.y.z)

      If the second argument is nothing, then nothing is returned.

      source
      MLJBase.recursive_getpropertyMethod
      recursive_getproperty(object, nested_name::Expr)

      Call getproperty recursively on object to extract the value of some nested property, as in the following example:

julia> object = (X = (x = 1, y = 2), Y = 3)

julia> recursive_getproperty(object, :(X.y))
2
      source
      MLJBase.recursive_setproperty!Method
recursive_setproperty!(object, nested_name::Expr, value)

      Set a nested property of an object to value, as in the following example:

julia> mutable struct Foo
           X
           Y
       end

julia> mutable struct Bar
           x
           y
       end

julia> object = Foo(Bar(1, 2), 3);

julia> recursive_setproperty!(object, :(X.y), 42)
42

julia> object
Foo(Bar(1, 42), 3)
      source
      MLJBase.sequence_stringMethod
      sequence_string(itr, n=3)

      Return a "sequence" string from the first n elements generated by itr.

julia> MLJBase.sequence_string(1:10, 4)
"1, 2, 3, 4, ..."

      Private method.

      source
      MLJBase.shuffle_rowsMethod
shuffle_rows(X::AbstractVecOrMat,
             Y::AbstractVecOrMat;
             rng::AbstractRNG=Random.GLOBAL_RNG)

      Return row-shuffled vectors or matrices using a random permutation of X and Y. An optional random number generator can be specified using the rng argument.

      source
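A minimal sketch (data invented), showing that rows of X and Y are permuted together:

using Random

X = [1 2; 3 4; 5 6]
Y = [10, 20, 30]

Xs, Ys = MLJBase.shuffle_rows(X, Y; rng=MersenneTwister(123))
# Xs[i, :] and Ys[i] still refer to the same original row, for each i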
      MLJBase.unwindMethod
      unwind(iterators...)

Represent all possible combinations of values generated by iterators as rows of a matrix A. In more detail, A has one column for each iterator in iterators and one row for each distinct possible combination of values taken on by the iterators. Elements in the first column cycle fastest, those in the last column slowest.

      Example

julia> iterators = ([1, 2], ["a","b"], ["x", "y", "z"]);

julia> MLJBase.unwind(iterators...)
12×3 Array{Any,2}:
 1  "a"  "x"
 2  "a"  "x"
 1  "b"  "x"
 2  "b"  "x"
 1  "a"  "y"
 2  "a"  "y"
 1  "b"  "y"
 2  "b"  "y"
 1  "a"  "z"
 2  "a"  "z"
 1  "b"  "z"
 2  "b"  "z"

source