m2cgen (Model 2 Code Generator) is a lightweight library that provides an easy way to transpile trained statistical models into native code (Python, C, Java, Go, JavaScript, Visual Basic, C#, PowerShell, R, PHP).
Supported Python version is >= 3.5.
pip install m2cgen
- C
- C#
- Go
- Java
- JavaScript
- PHP
- PowerShell
- Python
- R
- Visual Basic
| | Classification | Regression |
|---|---|---|
| Linear | | |
| SVM | | |
| Tree | | |
| Random Forest | | |
| Boosting | | |
| | Binary | Multiclass | Comment |
|---|---|---|---|
| Linear | Scalar value; signed distance of the sample to the hyperplane for the second class. | Vector value; signed distance of the sample to the hyperplane for each class. | The output is consistent with the output of LinearClassifierMixin.decision_function. |
| SVM | Scalar value; signed distance of the sample to the hyperplane for the second class. | Vector value; one-vs-one score for each class, shape (n_samples, n_classes * (n_classes-1) / 2). | The output is consistent with the output of BaseSVC.decision_function when decision_function_shape is set to ovo. |
| Tree/Random Forest/Boosting | Vector value; class probabilities. | Vector value; class probabilities. | The output is consistent with the output of the predict_proba method of DecisionTreeClassifier/ForestClassifier/XGBClassifier/LGBMClassifier. |
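To make the scalar and vector outputs described above concrete, here is a minimal pure-Python sketch of a linear classifier's decision function. The helper name `decision_function`, the weights, and the intercepts are made up for illustration; they do not come from m2cgen or from a trained model. The score is the raw value w·x + b, whose sign tells which side of the hyperplane the sample falls on.

```python
def decision_function(weights, intercepts, sample):
    """Return raw linear scores w.x + b, one per row of weights.

    Hypothetical helper: with a single row (binary case) a scalar is
    returned, otherwise a list with one score per class.
    """
    scores = [
        sum(w * x for w, x in zip(row, sample)) + b
        for row, b in zip(weights, intercepts)
    ]
    return scores[0] if len(scores) == 1 else scores


sample = [1.0, 2.0]

# Binary case: one hyperplane, scalar score for the second class.
binary_score = decision_function([[0.5, -1.0]], [0.25], sample)

# Multiclass case: one hyperplane per class, vector of scores.
multiclass_scores = decision_function(
    [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
    [0.0, 0.5, 1.0],
    sample,
)
```

With these made-up numbers, `binary_score` is negative, so the sample lies on the first-class side of the hyperplane, while `multiclass_scores` holds one score per class.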
Here's a simple example of how a linear model trained in a Python environment can be represented in Java code:
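from sklearn.datasets import load_boston  # note: removed in scikit-learn 1.2; requires an older release
from sklearn import linear_model
import m2cgen as m2c

# Load the Boston housing data (13 features) and fit a linear regression.
boston = load_boston()
X, y = boston.data, boston.target
estimator = linear_model.LinearRegression()
estimator.fit(X, y)

# Transpile the fitted model into Java source code (returned as a string).
code = m2c.export_to_java(estimator)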
Generated Java code:
public class Model {
    public static double score(double[] input) {
        return (((((((((((((36.45948838508965) + ((input[0]) * (-0.10801135783679647))) + ((input[1]) * (0.04642045836688297))) + ((input[2]) * (0.020558626367073608))) + ((input[3]) * (2.6867338193449406))) + ((input[4]) * (-17.76661122830004))) + ((input[5]) * (3.8098652068092163))) + ((input[6]) * (0.0006922246403454562))) + ((input[7]) * (-1.475566845600257))) + ((input[8]) * (0.30604947898516943))) + ((input[9]) * (-0.012334593916574394))) + ((input[10]) * (-0.9527472317072884))) + ((input[11]) * (0.009311683273794044))) + ((input[12]) * (-0.5247583778554867));
    }
}
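The generated method is a single arithmetic expression: the intercept plus one coefficient-times-feature term per input. As a rough illustration, the same computation can be hand-written in Python as below. The numbers are copied from the Java code above; the names `INTERCEPT`, `COEFFICIENTS`, and `score` are chosen here for readability and are not m2cgen output.

```python
# Intercept and per-feature coefficients copied from the generated Java code.
INTERCEPT = 36.45948838508965
COEFFICIENTS = [
    -0.10801135783679647,
    0.04642045836688297,
    0.020558626367073608,
    2.6867338193449406,
    -17.76661122830004,
    3.8098652068092163,
    0.0006922246403454562,
    -1.475566845600257,
    0.30604947898516943,
    -0.012334593916574394,
    -0.9527472317072884,
    0.009311683273794044,
    -0.5247583778554867,
]


def score(features):
    """Evaluate the linear model: intercept + sum(coefficient * feature)."""
    if len(features) != len(COEFFICIENTS):
        raise ValueError("expected %d features" % len(COEFFICIENTS))
    return INTERCEPT + sum(c * x for c, x in zip(COEFFICIENTS, features))
```

Because the expression has no dependencies beyond basic arithmetic, the generated code can run without the training library or a Python runtime.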
More examples of generated code for different models and languages are available in the project repository.
m2cgen can be used as a CLI tool to generate code from serialized model objects (pickle protocol):
$ m2cgen <pickle_file> --language <language> [--indent <indent>] [--class_name <class_name>]
[--module_name <module_name>] [--package_name <package_name>] [--namespace <namespace>]
[--recursion-limit <recursion_limit>]
Note that to unpickle a serialized model object, its class must be defined at the top level of an importable module in the unpickling environment.
Piping is also supported:
$ cat <pickle_file> | m2cgen --language <language>
Q: Generation fails with RuntimeError: maximum recursion depth exceeded error.
A: If this error occurs while generating code from an ensemble model, try reducing the number of trained estimators within that model. Alternatively, you can increase the maximum recursion depth with sys.setrecursionlimit(<new_depth>).
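One way to apply the second suggestion without changing the limit for the rest of the program is to raise it only around the generation call. The sketch below uses only the standard library; `recursion_limit` is a hypothetical helper name, the value 10000 is an arbitrary choice, and `nested_depth` merely stands in for a deeply nested model. In real use, a call such as m2c.export_to_java(estimator) would go inside the `with` block.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(new_depth):
    """Temporarily set the recursion limit, restoring the old value on exit."""
    old_depth = sys.getrecursionlimit()
    sys.setrecursionlimit(new_depth)
    try:
        yield
    finally:
        sys.setrecursionlimit(old_depth)


def nested_depth(n):
    """Recurse n levels deep; a stand-in for a deeply nested expression."""
    return 0 if n == 0 else 1 + nested_depth(n - 1)


limit_before = sys.getrecursionlimit()

with recursion_limit(10000):
    # 3000 levels would exceed Python's default limit of 1000.
    depth_reached = nested_depth(3000)
```

Very large limits can still crash the interpreter on some platforms, so reducing the number of estimators remains the safer option.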
Q: Generation fails with ImportError: No module named <module_name_here> error while transpiling a model from a serialized model object.
A: This error indicates that the pickle protocol cannot deserialize the model object. To unpickle a serialized model object, its class must be defined at the top level of an importable module in the unpickling environment. Installing the package that provides the model's class definition should solve the problem.
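The reason is that a pickle stores a reference to the class (module name plus class name) rather than the class code itself. The following standard-library sketch illustrates this with `fractions.Fraction` standing in for a model class. The module name `no_such_module_for_demo` is made up to simulate an environment where the defining package is missing.

```python
import pickle
from fractions import Fraction

# Protocol 0 is text-based, which makes the stored class reference visible.
payload = pickle.dumps(Fraction(1, 3), protocol=0)
stores_reference = b"fractions\nFraction\n" in payload

# Simulate a missing package by pointing the reference at a module name
# that does not exist (made up for this demonstration).
broken = payload.replace(
    b"fractions\nFraction\n", b"no_such_module_for_demo\nFraction\n", 1
)

try:
    pickle.loads(broken)
    error_name = None
except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
    error_name = type(exc).__name__
```

The unmodified `payload` loads fine because `fractions` is importable; the altered one fails at import time, which is the same failure mode as unpickling a model whose library is not installed.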