Implement multi-target for hist.
Initial commit.

Predictor.

Compile.

fixes.

Cleanup.

Moving code around.

Start working on cat features.

Start working on model IO.

Fix.

Revert.

cleanup.

Rebase.

Reverse cleanup.

rename.

Fix rebase.

small cleanup.

inc

Merge it into reg tree.

Strategy.

Extract the cat matrix.

Use array in predictor.

Use array in scalar.

Merge two kernels.

QDM.

inplace predict.

cleanup.

naming.

cleanup.

cleanup.

sampler.

copy.

cleanup.

compile test.

Hide the tree.

Hide from the partitioner.

Hide init root.

layer to trees.

check.

Remove old sampling func.

leaf partition.

use linalg.

remove grad stats.

ro5

reverse.

Don't support prediction cache for now.

col sampler.

Cleanup.

Cleanup.

Cleanup histogram.

t

Cleanup evaluation.

ic.

Cleanup.

start working on io.

is valid.

basic io.

dispatch.

Basic IO.

Cleanup node sum.

cleanup.

Extract the updater.

Merge it into quantile hist.

cleanup.

Cleanup.

restore checks.

Cleanup.

remove num_target.

fix tests.

Fix.

fixes.

Type deduction.

R package.

Predict leaf.

Predict leaf.

cleanup.

Add a test to sampling.

check.

cleanup.

cleanup.

parallel.

Cleanup

Fix root.

column-major.

fewer right.

Cleanup.

Initial work on merging the updaters.

Fix.

Merge update tree.

Consistent naming.

HD.

Unify sampling.

Fix build.

Fix build.

CUDA build.

Fix GPU SHAP tests.

fix.

fix rebase.

nd.

update

rebase errors.

configuration.

Lint.

Fix segfault.

split up groups and targets.

Fix.

Fix.

Remove targets.

cleanup.

Cleanup linalg.

fix test.

revert.

Rebase.

interaction constraint.

try to use constant.

work on merging the parameter into tree.

work on tree json model.

Initialization.

remove fixme.

Pass the model parameter in.

Cleanup.

Fix size.

Checks.

lint.

Update document.

Pass obj info instead of model parameter.

make clang happy.

fix rebase.

Cleanup.

Tests.
trivialfis committed Mar 14, 2023
1 parent 72e8331 commit fd670a8
Showing 31 changed files with 1,199 additions and 917 deletions.
19 changes: 15 additions & 4 deletions demo/guide-python/multioutput_regression.py
Expand Up @@ -40,11 +40,18 @@ def gen_circle() -> Tuple[np.ndarray, np.ndarray]:
return X, y


def rmse_model(plot_result: bool):
def rmse_model(plot_result: bool, strategy: str):
"""Draw a circle with 2-dim coordinate as target variables."""
X, y = gen_circle()
# Train a regressor on it
reg = xgb.XGBRegressor(tree_method="hist", n_estimators=64)
reg = xgb.XGBRegressor(
tree_method="hist",
n_estimators=128,
n_jobs=16,
max_depth=8,
multi_strategy=strategy,
subsample=0.6,
)
reg.fit(X, y, eval_set=[(X, y)])

y_predt = reg.predict(X)
Expand Down Expand Up @@ -88,9 +95,10 @@ def rmse(predt: np.ndarray, dtrain: xgb.DMatrix) -> Tuple[str, float]:
{
"tree_method": "hist",
"num_target": y.shape[1],
"multi_strategy": "monolithic",
},
dtrain=Xy,
num_boost_round=100,
num_boost_round=128,
obj=squared_log,
evals=[(Xy, "Train")],
evals_result=results,
Expand All @@ -107,6 +115,9 @@ def rmse(predt: np.ndarray, dtrain: xgb.DMatrix) -> Tuple[str, float]:
parser.add_argument("--plot", choices=[0, 1], type=int, default=1)
args = parser.parse_args()
# Train with builtin RMSE objective
rmse_model(args.plot == 1)
# one model per output
rmse_model(args.plot == 1, "composite")
# one model for all outputs
rmse_model(args.plot == 1, "monolithic")
# Train with custom objective.
custom_rmse_model(args.plot == 1)
7 changes: 7 additions & 0 deletions doc/parameter.rst
Expand Up @@ -226,6 +226,13 @@ Parameters for Tree Booster
list is a group of indices of features that are allowed to interact with each other.
See :doc:`/tutorials/feature_interaction_constraint` for more information.

* ``multi_strategy``, [default = ``composite``]

  - The strategy used for training multi-target models.

    - ``composite``: One model for each target.
    - ``monolithic``: Use multi-target trees.

.. _cat-param:

Parameters for Categorical Feature
Expand Down
29 changes: 28 additions & 1 deletion doc/tutorials/multioutput.rst
Expand Up @@ -11,7 +11,11 @@ can be simultaneously classified as both sci-fi and comedy. For detailed explan
terminologies related to different multi-output models please refer to the
:doc:`scikit-learn user guide <sklearn:modules/multiclass>`.

Internally, XGBoost builds one model for each target similar to sklearn meta estimators,
**********************************
Training with One-Model-Per-Target
**********************************

By default, XGBoost builds one model for each target similar to sklearn meta estimators,
with the added benefit of reusing data and other integrated features like SHAP. For a
worked example of regression, see
:ref:`sphx_glr_python_examples_multioutput_regression.py`. For multi-label classification,
Expand All @@ -36,3 +40,26 @@ dense matrix for labels.
The feature is still under development with limited support from objectives and metrics.

*************************
Training with Vector Leaf
*************************

.. versionadded:: 2.0

.. note::

  This is highly experimental and many features are missing.


XGBoost can optionally build multi-output trees whose leaf size equals the number of
targets. The behavior is controlled by the ``multi_strategy`` training parameter, which
takes the value ``composite`` (the default) or ``monolithic``. Specify ``monolithic`` and
use ``tree_method=hist`` to enable this feature.


.. code-block:: python

  clf = xgb.XGBClassifier(tree_method="hist", multi_strategy="monolithic")

See :ref:`sphx_glr_python_examples_multioutput_regression.py` for a worked example.
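
The difference between the two strategies documented in this commit can be sketched without XGBoost at all. The snippet below is a toy illustration only: the helper names (`scalar_stump`, `vector_stump`, `predict_composite`, `predict_monolithic`) are hypothetical, each "tree" is reduced to a single threshold on one feature, and nothing here reflects how the hist updater actually grows or stores trees.

```python
from typing import Callable, List, Sequence

# Toy illustration; every name here is hypothetical and not part of XGBoost.


def scalar_stump(threshold: float, left: float, right: float) -> Callable[[float], float]:
    """A one-split tree whose leaves hold a single weight (one target)."""

    def predict(x: float) -> float:
        return left if x < threshold else right

    return predict


def vector_stump(
    threshold: float, left: Sequence[float], right: Sequence[float]
) -> Callable[[float], List[float]]:
    """A one-split tree whose leaves hold one weight per target."""

    def predict(x: float) -> List[float]:
        return list(left) if x < threshold else list(right)

    return predict


def predict_composite(trees_per_target, x: float) -> List[float]:
    """One model per target: every target sums its own, independent trees."""
    return [sum(tree(x) for tree in trees) for trees in trees_per_target]


def predict_monolithic(trees, n_targets: int, x: float) -> List[float]:
    """Vector-leaf trees: each tree contributes to all targets at once."""
    out = [0.0] * n_targets
    for tree in trees:
        for target, weight in enumerate(tree(x)):
            out[target] += weight
    return out


# Two targets. The composite model may split each target at a different threshold,
# while the monolithic model shares one split across both targets.
composite = [[scalar_stump(0.5, -1.0, 1.0)], [scalar_stump(0.2, 0.0, 2.0)]]
monolithic = [vector_stump(0.5, [-1.0, 0.0], [1.0, 2.0])]

print(predict_composite(composite, 0.3))       # [-1.0, 2.0]
print(predict_monolithic(monolithic, 2, 0.3))  # [-1.0, 0.0]
```

In the composite case the second target has already crossed its own threshold at `x = 0.3`; in the monolithic case both targets follow the same split and so land in the same leaf.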
12 changes: 6 additions & 6 deletions include/xgboost/base.h
Expand Up @@ -89,19 +89,19 @@
namespace xgboost {

/*! \brief unsigned integer type used for feature index. */
using bst_uint = uint32_t; // NOLINT
using bst_uint = std::uint32_t; // NOLINT
/*! \brief integer type. */
using bst_int = int32_t; // NOLINT
using bst_int = std::int32_t; // NOLINT
/*! \brief unsigned long integers */
using bst_ulong = uint64_t; // NOLINT
using bst_ulong = std::uint64_t; // NOLINT
/*! \brief float type, used for storing statistics */
using bst_float = float; // NOLINT
/*! \brief Categorical value type. */
using bst_cat_t = int32_t; // NOLINT
using bst_cat_t = std::int32_t; // NOLINT
/*! \brief Type for data column (feature) index. */
using bst_feature_t = uint32_t; // NOLINT
using bst_feature_t = std::uint32_t; // NOLINT
/*! \brief Type for histogram bin index. */
using bst_bin_t = int32_t; // NOLINT
using bst_bin_t = std::int32_t; // NOLINT
/*! \brief Type for data row index.
*
* Be careful `std::size_t' is implementation-defined. Meaning that the binary
Expand Down
8 changes: 4 additions & 4 deletions include/xgboost/linalg.h
Expand Up @@ -530,17 +530,17 @@ class TensorView {
/**
* \brief Number of items in the tensor.
*/
LINALG_HD [[nodiscard]] std::size_t Size() const { return size_; }
[[nodiscard]] LINALG_HD std::size_t Size() const { return size_; }
/**
* \brief Whether this is a contiguous array, both C and F contiguous returns true.
*/
LINALG_HD [[nodiscard]] bool Contiguous() const {
[[nodiscard]] LINALG_HD bool Contiguous() const {
return data_.size() == this->Size() || this->CContiguous() || this->FContiguous();
}
/**
* \brief Whether it's a c-contiguous array.
*/
LINALG_HD [[nodiscard]] bool CContiguous() const {
[[nodiscard]] LINALG_HD bool CContiguous() const {
StrideT stride;
static_assert(std::is_same<decltype(stride), decltype(stride_)>::value);
// It's contiguous if the stride can be calculated from shape.
Expand All @@ -550,7 +550,7 @@ class TensorView {
/**
* \brief Whether it's a f-contiguous array.
*/
LINALG_HD [[nodiscard]] bool FContiguous() const {
[[nodiscard]] LINALG_HD bool FContiguous() const {
StrideT stride;
static_assert(std::is_same<decltype(stride), decltype(stride_)>::value);
// It's contiguous if the stride can be calculated from shape.
Expand Down
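
The comments in this hunk state that a view is C- or F-contiguous "if the stride can be calculated from shape". A minimal sketch of that idea follows, assuming strides are counted in elements rather than bytes. The function names are hypothetical, and the real `TensorView` code is only partly visible in the diff, so this is an approximation of the check rather than a port of it.

```python
import math
from typing import Sequence, Tuple

# Sketch under stated assumptions: strides are in elements, names are hypothetical.


def c_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides: the last axis varies fastest."""
    strides = [0] * len(shape)
    step = 1
    for axis in reversed(range(len(shape))):
        strides[axis] = step
        step *= shape[axis]
    return tuple(strides)


def f_strides(shape: Sequence[int]) -> Tuple[int, ...]:
    """Column-major strides: the first axis varies fastest."""
    strides = [0] * len(shape)
    step = 1
    for axis in range(len(shape)):
        strides[axis] = step
        step *= shape[axis]
    return tuple(strides)


def is_contiguous(shape: Sequence[int], strides: Sequence[int], n_stored: int) -> bool:
    """Mirror the three-way test in Contiguous(): the storage holds exactly as many
    items as the view, or the strides match the C or the F layout."""
    return (
        n_stored == math.prod(shape)
        or tuple(strides) == c_strides(shape)
        or tuple(strides) == f_strides(shape)
    )


print(c_strides((2, 3)))  # (3, 1)
print(f_strides((2, 3)))  # (1, 2)
# A 2x2 window sliced out of a wider row-major buffer is neither C nor F contiguous.
print(is_contiguous((2, 2), (3, 1), n_stored=5))  # False
```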
14 changes: 7 additions & 7 deletions include/xgboost/task.h
@@ -1,15 +1,15 @@
/*!
* Copyright 2021-2022 by XGBoost Contributors
/**
* Copyright 2021-2023 by XGBoost Contributors
*/
#ifndef XGBOOST_TASK_H_
#define XGBOOST_TASK_H_

#include <xgboost/base.h>
#include <xgboost/base.h> // for XGBOOST_DEVICE

#include <cinttypes>
#include <cstdint> // for uint8_t

namespace xgboost {
/*!
/**
* \brief A struct returned by objective, which determines task at hand. The struct is
* not used by any algorithm yet, only for future development like categorical
* split.
Expand All @@ -23,7 +23,7 @@ namespace xgboost {
*/
struct ObjInfo {
// What kind of problem are we trying to solve
enum Task : uint8_t {
enum Task : std::uint8_t {
kRegression = 0,
kBinary = 1,
kClassification = 2,
Expand All @@ -41,7 +41,7 @@ struct ObjInfo {
/**
* \brief Use adaptive tree if the objective doesn't have valid hessian value.
*/
XGBOOST_DEVICE bool UpdateTreeLeaf() const { return zero_hess; }
[[nodiscard]] XGBOOST_DEVICE bool UpdateTreeLeaf() const { return zero_hess; }
};
} // namespace xgboost
#endif // XGBOOST_TASK_H_
23 changes: 9 additions & 14 deletions include/xgboost/tree_model.h
Expand Up @@ -101,14 +101,14 @@ struct RTreeNodeStat {
/*! \brief weight of current node */
bst_float base_weight;
/*! \brief number of child that is leaf node known up to now */
int leaf_child_cnt {0};
int leaf_child_cnt{0};

RTreeNodeStat() = default;
RTreeNodeStat(float loss_chg, float sum_hess, float weight) :
loss_chg{loss_chg}, sum_hess{sum_hess}, base_weight{weight} {}
RTreeNodeStat(float loss_chg, float sum_hess, float weight)
: loss_chg{loss_chg}, sum_hess{sum_hess}, base_weight{weight} {}
bool operator==(const RTreeNodeStat& b) const {
return loss_chg == b.loss_chg && sum_hess == b.sum_hess &&
base_weight == b.base_weight && leaf_child_cnt == b.leaf_child_cnt;
return loss_chg == b.loss_chg && sum_hess == b.sum_hess && base_weight == b.base_weight &&
leaf_child_cnt == b.leaf_child_cnt;
}
// Swap byte order for all fields. Useful for transporting models between machines with different
// endianness (big endian vs little endian)
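
The byte-order comment above can be illustrated with Python's `struct` module. The sketch assumes the four fields visible in this hunk (`loss_chg`, `sum_hess`, `base_weight`, `leaf_child_cnt`) are stored as three 32-bit floats followed by one 32-bit integer with no padding. That layout, and the helper `swap_fields`, are assumptions made for the example and not a description of XGBoost's binary model format.

```python
import struct

# Assumed layout: three 32-bit floats and one 32-bit int, no padding.
FIELDS = "fffi"  # loss_chg, sum_hess, base_weight, leaf_child_cnt


def swap_fields(raw: bytes, width: int = 4) -> bytes:
    """Reverse the bytes of every `width`-byte field (hypothetical helper)."""
    return b"".join(raw[i:i + width][::-1] for i in range(0, len(raw), width))


stat = (0.5, 2.0, -1.25, 3)                   # values exactly representable as float32
little = struct.pack("<" + FIELDS, *stat)     # as written on a little-endian machine
big = swap_fields(little)                     # what a big-endian reader expects

print(struct.unpack(">" + FIELDS, big))       # (0.5, 2.0, -1.25, 3)
```

Swapping field by field, rather than reversing the whole record, keeps the field order intact while flipping the byte order inside each value.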
Expand Down Expand Up @@ -433,11 +433,9 @@ class RegTree : public Model {
* \param leaf_right_child The right child index of leaf, by default kInvalidNodeId,
* some updaters use the right child index of leaf as a marker
*/
void ExpandNode(bst_node_t nid, unsigned split_index, bst_float split_value,
bool default_left, bst_float base_weight,
bst_float left_leaf_weight, bst_float right_leaf_weight,
bst_float loss_change, float sum_hess, float left_sum,
float right_sum,
void ExpandNode(bst_node_t nid, unsigned split_index, bst_float split_value, bool default_left,
bst_float base_weight, bst_float left_leaf_weight, bst_float right_leaf_weight,
bst_float loss_change, float sum_hess, float left_sum, float right_sum,
bst_node_t leaf_right_child = kInvalidNodeId);
/**
* \brief Expands a leaf node into two additional leaf nodes for a multi-target tree.
Expand Down Expand Up @@ -587,7 +585,6 @@ class RegTree : public Model {
[[nodiscard]] bool IsMissing(size_t i) const;
[[nodiscard]] bool HasMissing() const;


private:
/*!
* \brief a union value of value and flag
Expand Down Expand Up @@ -627,9 +624,7 @@ class RegTree : public Model {
/*!
* \brief Get split types for all nodes.
*/
[[nodiscard]] std::vector<FeatureType> const& GetSplitTypes() const {
return split_types_;
}
[[nodiscard]] std::vector<FeatureType> const& GetSplitTypes() const { return split_types_; }
[[nodiscard]] common::Span<uint32_t const> GetSplitCategories() const {
return split_categories_;
}
Expand Down
2 changes: 2 additions & 0 deletions include/xgboost/tree_updater.h
Expand Up @@ -22,6 +22,8 @@
#include <vector> // for vector

namespace xgboost {
struct ObjInfo;
struct Context;
namespace tree {
struct TrainParam;
}
Expand Down
2 changes: 2 additions & 0 deletions python-package/xgboost/sklearn.py
Expand Up @@ -624,6 +624,7 @@ def __init__(
feature_types: Optional[FeatureTypes] = None,
max_cat_to_onehot: Optional[int] = None,
max_cat_threshold: Optional[int] = None,
multi_strategy: Optional[str] = None,
eval_metric: Optional[Union[str, List[str], Callable]] = None,
early_stopping_rounds: Optional[int] = None,
callbacks: Optional[List[TrainingCallback]] = None,
Expand Down Expand Up @@ -670,6 +671,7 @@ def __init__(
self.feature_types = feature_types
self.max_cat_to_onehot = max_cat_to_onehot
self.max_cat_threshold = max_cat_threshold
self.multi_strategy = multi_strategy
self.eval_metric = eval_metric
self.early_stopping_rounds = early_stopping_rounds
self.callbacks = callbacks
Expand Down
5 changes: 2 additions & 3 deletions src/c_api/c_api.cc
Expand Up @@ -991,9 +991,8 @@ XGB_DLL int XGBoosterPredictFromDMatrix(BoosterHandle handle,
xgboost_CHECK_C_ARG_PTR(out_dim);
xgboost_CHECK_C_ARG_PTR(out_shape);

CalcPredictShape(strict_shape, type, p_m->Info().num_row_,
p_m->Info().num_col_, chunksize, learner->Groups(), rounds,
&shape, out_dim);
CalcPredictShape(strict_shape, type, p_m->Info().num_row_, p_m->Info().num_col_, chunksize,
learner->Groups(), rounds, &shape, out_dim);
*out_shape = dmlc::BeginPtr(shape);
API_END();
}
Expand Down
30 changes: 16 additions & 14 deletions src/c_api/c_api_utils.h
Expand Up @@ -5,9 +5,9 @@
#define XGBOOST_C_API_C_API_UTILS_H_

#include <algorithm>
#include <cstddef>
#include <cstddef> // for size_t
#include <functional>
#include <memory> // std::shared_ptr
#include <memory> // for shared_ptr
#include <string>
#include <vector>

Expand All @@ -18,10 +18,11 @@
#include "xgboost/learner.h"
#include "xgboost/linalg.h" // ArrayInterfaceHandler
#include "xgboost/logging.h"
#include "xgboost/string_view.h" // StringView
#include "xgboost/string_view.h" // for StringView

namespace xgboost {
/* \brief Determine the output shape of prediction.
/**
* \brief Determine the output shape of prediction.
*
* \param strict_shape Whether should we reshape the output with consideration of groups
* and forest.
Expand All @@ -34,14 +35,14 @@ namespace xgboost {
* \param out_shape Output shape
* \param out_dim Output dimension
*/
inline void CalcPredictShape(bool strict_shape, PredictionType type, size_t rows, size_t cols,
size_t chunksize, size_t groups, size_t rounds,
std::vector<bst_ulong> *out_shape,
inline void CalcPredictShape(bool strict_shape, PredictionType type, std::size_t rows,
std::size_t cols, std::uint32_t chunksize, std::uint32_t n_groups,
std::size_t rounds, std::vector<bst_ulong> *out_shape,
xgboost::bst_ulong *out_dim) {
auto &shape = *out_shape;
if (type == PredictionType::kMargin && rows != 0) {
// When kValue is used, softmax can change the chunksize.
CHECK_EQ(chunksize, groups);
CHECK_EQ(chunksize, n_groups);
}

switch (type) {
Expand All @@ -55,13 +56,14 @@ inline void CalcPredictShape(bool strict_shape, PredictionType type, size_t rows
*out_dim = 2;
shape.resize(*out_dim);
shape.front() = rows;
shape.back() = std::min(groups, chunksize);
// chunksize can be 1 if it's softmax
shape.back() = std::min(n_groups, chunksize);
}
break;
}
case PredictionType::kApproxContribution:
case PredictionType::kContribution: {
if (groups == 1 && !strict_shape) {
if (n_groups == 1 && !strict_shape) {
*out_dim = 2;
shape.resize(*out_dim);
shape.front() = rows;
Expand All @@ -70,14 +72,14 @@ inline void CalcPredictShape(bool strict_shape, PredictionType type, size_t rows
*out_dim = 3;
shape.resize(*out_dim);
shape[0] = rows;
shape[1] = groups;
shape[1] = n_groups;
shape[2] = cols + 1;
}
break;
}
case PredictionType::kApproxInteraction:
case PredictionType::kInteraction: {
if (groups == 1 && !strict_shape) {
if (n_groups == 1 && !strict_shape) {
*out_dim = 3;
shape.resize(*out_dim);
shape[0] = rows;
Expand All @@ -87,7 +89,7 @@ inline void CalcPredictShape(bool strict_shape, PredictionType type, size_t rows
*out_dim = 4;
shape.resize(*out_dim);
shape[0] = rows;
shape[1] = groups;
shape[1] = n_groups;
shape[2] = cols + 1;
shape[3] = cols + 1;
}
Expand All @@ -98,7 +100,7 @@ inline void CalcPredictShape(bool strict_shape, PredictionType type, size_t rows
shape.resize(4);
shape[0] = rows;
shape[1] = rounds;
shape[2] = groups;
shape[2] = n_groups;
auto forest = chunksize / (shape[1] * shape[2]);
forest = std::max(static_cast<decltype(forest)>(1), forest);
shape[3] = forest;
Expand Down
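
The shape rules visible in the `CalcPredictShape` hunks can be restated as a small Python function. This sketch covers only branches that appear in full in the diff: the `else` branches for contributions and interactions (taken when `n_groups != 1` or `strict_shape` is set) and the four-dimensional leaf branch, whose guarding condition is not shown. The function name and the string keys are hypothetical, and `rounds * n_groups` is assumed to be non-zero.

```python
from typing import List

# Restates only the branches shown in the diff; names are hypothetical.


def visible_predict_shape(
    kind: str, rows: int, cols: int, chunksize: int, n_groups: int, rounds: int
) -> List[int]:
    if kind == "contribution":
        # One extra column for the bias term.
        return [rows, n_groups, cols + 1]
    if kind == "interaction":
        return [rows, n_groups, cols + 1, cols + 1]
    if kind == "leaf":
        # Integer division, clamped to at least one tree per (round, group).
        forest = max(1, chunksize // (rounds * n_groups))
        return [rows, rounds, n_groups, forest]
    raise ValueError(f"branch not shown in the diff: {kind}")


print(visible_predict_shape("contribution", 10, 4, 3, 3, 5))  # [10, 3, 5]
print(visible_predict_shape("interaction", 10, 4, 3, 3, 5))   # [10, 3, 5, 5]
print(visible_predict_shape("leaf", 10, 4, 30, 3, 5))         # [10, 5, 3, 2]
```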