Using SharpLearning.XGBoost

Mads Dabros edited this page May 23, 2018 · 10 revisions

64 bit only

SharpLearning.XGBoost is an x64-only project. When using the XGBoost models and learners in a project, make sure that the project is also compiled for x64. Otherwise, it will fail with a BadImageFormatException.
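As a rough sketch, an application can check its own bitness at startup and fail with a clearer message than BadImageFormatException. Environment.Is64BitProcess is part of .NET; the class name PlatformGuard and the method name EnsureX64 are hypothetical and only used for illustration.

```csharp
using System;

public static class PlatformGuard
{
    // Hypothetical helper: throws early if the process is not 64 bit,
    // instead of failing later when the XGBoost types are first used.
    public static void EnsureX64()
    {
        if (!Environment.Is64BitProcess)
        {
            throw new PlatformNotSupportedException(
                "SharpLearning.XGBoost requires a 64 bit process. " +
                "Compile the project for x64.");
        }
    }
}
```

A call to PlatformGuard.EnsureX64() would typically go at the start of Main, before any code that touches the XGBoost learners or models.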

Native dependency and disposable models

SharpLearning.XGBoost depends on the native gradient boosting library xgboost. This means that the models produced by the learners in SharpLearning.XGBoost have a native dependency and implement the IDisposable interface. So when using an XGBoost model, RegressionXGBoostModel or ClassificationXGBoostModel, make sure to call the Dispose method or place the model in a using statement, as shown below:

var learner = new RegressionXGBoostLearner();

using (var model = learner.Learn(observations, targets))
{
    var predictions = model.Predict(observations);
}

The using statement is especially important if several models are learned in a loop, for instance when tuning hyperparameters. If the models are not disposed after each iteration, the previous models will not be collected, and the loop will leak memory.
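A minimal sketch of such a tuning loop is shown here. It assumes that the learner constructor has a parameter named maximumTreeDepth and that MeanSquaredErrorRegressionMetric from SharpLearning.Metrics is available; the class and method names (DepthTuning, FindBestDepth) are hypothetical. The error is measured on the training data only to keep the example short.

```csharp
using System;
using SharpLearning.Containers.Matrices;
using SharpLearning.Metrics.Regression;
using SharpLearning.XGBoost.Learners;

public static class DepthTuning
{
    // Returns the candidate tree depth with the lowest mean squared error.
    public static int FindBestDepth(F64Matrix observations, double[] targets, int[] candidateDepths)
    {
        var metric = new MeanSquaredErrorRegressionMetric();
        var bestDepth = candidateDepths[0];
        var bestError = double.MaxValue;

        foreach (var depth in candidateDepths)
        {
            // Assumption: maximumTreeDepth is the constructor parameter name.
            var learner = new RegressionXGBoostLearner(maximumTreeDepth: depth);

            // The using statement disposes the model at the end of each iteration,
            // before the next model is learned.
            using (var model = learner.Learn(observations, targets))
            {
                var predictions = model.Predict(observations);
                var error = metric.Error(targets, predictions);

                if (error < bestError)
                {
                    bestError = error;
                    bestDepth = depth;
                }
            }
        }

        return bestDepth;
    }
}
```

Only the selected depth leaves the loop, so no model outlives its iteration. A final model can be learned afterwards with the chosen value.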

If an XGBoost model is a member of another type, make sure to implement the IDisposable interface on that type, and call the Dispose method on the model:

public void Dispose()
{
    if (m_model != null)
    {
        m_model.Dispose();
    }
}
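For context, a complete owning type might look like the following sketch. The type name PricePredictor is hypothetical; the namespaces SharpLearning.XGBoost.Models and SharpLearning.Containers.Matrices are assumed to be where the model and matrix types live.

```csharp
using System;
using SharpLearning.Containers.Matrices;
using SharpLearning.XGBoost.Models;

// Hypothetical wrapper type that owns an XGBoost model.
public sealed class PricePredictor : IDisposable
{
    readonly RegressionXGBoostModel m_model;

    public PricePredictor(RegressionXGBoostModel model)
    {
        if (model == null) { throw new ArgumentNullException(nameof(model)); }
        m_model = model;
    }

    public double[] Predict(F64Matrix observations)
    {
        return m_model.Predict(observations);
    }

    // Forwards disposal to the owned model.
    public void Dispose()
    {
        m_model.Dispose();
    }
}
```

With this in place, the wrapper itself can be placed in a using statement, and the model is disposed together with it.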

Saving and loading models

Since an XGBoost model, RegressionXGBoostModel or ClassificationXGBoostModel, relies on the native implementation, the regular serialization methods (GenericBinarySerializer and GenericXmlDataContractSerializer) do not work with it. Therefore, the only way to save an XGBoost model is to call the Save method directly on the model. The Save method takes the file path to save the model to as its argument:

var learner = new RegressionXGBoostLearner();

// Saves XGBoost model.
using (var model = learner.Learn(observations, targets))
{
    model.Save(@"C:\model.xgb");
}

Likewise, the model is loaded with the static Load method on the model class, for instance:

// Loads XGBoost model.
using (var loadedModel = RegressionXGBoostModel.Load(@"C:\model.xgb"))
{
    var predictions = loadedModel.Predict(observations);
}

Training with GPU

To use the XGBoost learners, RegressionXGBoostLearner and ClassificationXGBoostLearner, with GPU support, CUDA must be installed. The guide to installing CUDA on Windows can be found here: CUDA installation guide for Windows

After CUDA is installed, it is possible to use the GPU version of the TreeMethods. For instance:

var learner = new RegressionXGBoostLearner(treeMethod: TreeMethod.GPUExact);

using (var model = learner.Learn(observations, targets))
{
    var predictions = model.Predict(observations);
}
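If the same code has to run on machines with and without CUDA, the tree method can be chosen from a setting. The sketch below is only an illustration: LearnerFactory and the useGpu flag are hypothetical, and it assumes that the TreeMethod enum lives in the SharpLearning.XGBoost namespace and has an Auto value for the CPU default.

```csharp
using SharpLearning.XGBoost;
using SharpLearning.XGBoost.Learners;

public static class LearnerFactory
{
    // Hypothetical helper: creates a learner with a GPU or CPU tree method.
    public static RegressionXGBoostLearner Create(bool useGpu)
    {
        // TreeMethod.GPUExact is taken from the example above.
        // Assumption: TreeMethod.Auto exists and selects a CPU method.
        var treeMethod = useGpu ? TreeMethod.GPUExact : TreeMethod.Auto;

        return new RegressionXGBoostLearner(treeMethod: treeMethod);
    }
}
```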

Future work

  • Add conversion support from SharpLearning.XGBoost models to SharpLearning.GradientBoost models.
    • This will add support for the general serialization methods.
    • This will avoid native dependencies when using the models.
  • Add x86/x64 support.