Oh, My Trees (`ohmt`) is a library for hyperplane-based Decision Tree induction, which allows you to induce both Univariate (e.g., CART, C4.5) and Multivariate (e.g., OC1, Geometric) Decision Trees. It currently supports single-class classification trees, and does not support categorical variables, as they don't play well with hyperplanes.
Installation through git:

```shell
git clone https://github.com/msetzu/oh-my-trees
mkvirtualenv -p python3.11 omt  # optional, creates a virtual environment
cd oh-my-trees
pip install -r src/requirements.txt
```

or directly through pip:

```shell
pip install ohmt
```
OMT follows the classic sklearn `fit`/`predict` interface. You can find a full example in the examples notebook `notebooks/examples.ipynb`.
```python
from ohmt.trees.multivariate import OmnivariateDT

dt = OmnivariateDT()
x = ...
y = ...

# trees all follow a similar sklearn-like training interface,
# with max_depth, min_samples, and min_eps as available parameters
dt.fit(x, y, max_depth=4)
```
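Prediction follows the same sklearn-like convention; a minimal sketch, reusing `dt` and `x` from above:

```python
# predict class labels with the fitted tree, sklearn-style
predictions = dt.predict(x)
```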
OMT also offers a pruning toolkit, handled by `trees.pruning.Gardener`, which allows you to prune the induced tree. Find out more in the example notebook.
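As a rough sketch of what pruning could look like (the import path follows from the text above, but the `prune` method name and its call signature are assumptions, not the documented API; refer to the example notebook for the actual usage):

```python
from ohmt.trees.pruning import Gardener

gardener = Gardener()
# hypothetical call: the method name and arguments are assumed,
# not taken from the documented API
pruned_dt = gardener.prune(dt)
```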
OMT offers several tree induction algorithms:
| Algorithm | Type | Reference | Info |
|---|---|---|---|
| C4.5 | Univariate | | |
| CART | Univariate | | |
| DKM | Univariate | | |
| OC1 | Multivariate | Paper | |
| Geometric | Multivariate | Paper | Only traditional SVM cut |
| Omnivariate | Multivariate | | Tests all possible splits, picks the best one |
| Model tree | Multivariate | Paper | |
| Linear tree | Multivariate | Paper | |
| Optimal trees* | Multivariate | Paper | Mirror of Interpretable AI's implementation |

*As a mirror of Interpretable AI's implementation, you need to install the appropriate license to use Optimal trees.
You can get an explicit view of a tree by accessing:

- `tree.nodes: Dict[int, Node]`: its nodes;
- `tree.parent: Dict[int, int]` and `tree.ancestors: Dict[int, List[int]]`: its parents and ancestors;
- `tree.descendants: Dict[int, List[int]]`: its descendants;
- `tree.depth: Dict[int, int]`: the depth of its nodes.
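For example, these dictionaries can be combined to inspect a fitted tree; a minimal sketch, assuming `tree` is a fitted tree exposing the attributes above:

```python
# print each node with its depth and ancestors, using the
# dictionaries described above
for node_id in tree.nodes:
    print(f"node {node_id}: depth={tree.depth[node_id]}, "
          f"ancestors={tree.ancestors[node_id]}")
```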
Trees can also be JSONized:

```python
tree.json()
```
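To persist it, a sketch assuming `tree.json()` returns a JSON-serializable object (its exact return type is not specified here):

```python
import json

# write the JSONized tree to disk; assumes tree.json() returns
# a serializable object rather than an already-encoded string
with open("tree.json", "w") as f:
    json.dump(tree.json(), f)
```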
Greedy trees follow the same basic algorithmic core (sketched in plain Python below):

- learning step: induce a node;
- if induction should continue:
  - generate two children,
  - recurse on the generated children.
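In schematic Python, that core looks roughly as follows. This is a self-contained sketch of the generic recursion, not the library's implementation: `SketchNode` and the median split on feature 0 are illustrative stand-ins for the library's node classes and induction logic.

```python
from dataclasses import dataclass
from typing import Optional

import numpy


@dataclass
class SketchNode:
    # illustrative stand-in for the library's Node: one feature, one threshold
    feature: int
    threshold: float
    left: Optional["SketchNode"] = None
    right: Optional["SketchNode"] = None


def grow(data: numpy.ndarray, labels: numpy.ndarray, depth: int = 1,
         max_depth: int = 16, min_samples: int = 10) -> Optional[SketchNode]:
    # stop if too deep, too few samples, or the node is already pure
    if depth > max_depth or data.shape[0] < min_samples \
            or numpy.unique(labels).size == 1:
        return None
    # learning step: induce a node (here, a trivial median split on feature 0)
    node = SketchNode(feature=0, threshold=float(numpy.median(data[:, 0])))
    mask = data[:, node.feature] <= node.threshold
    # generate two children and recurse on each
    node.left = grow(data[mask], labels[mask], depth + 1, max_depth, min_samples)
    node.right = grow(data[~mask], labels[~mask], depth + 1, max_depth, min_samples)
    return node
```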
We incorporate this algorithm in `Tree`, where `step` implements the node induction; thus, most greedy induction algorithms can be implemented by simply overriding the `step` function:
```python
def step(self, parent_node: Optional[InternalNode],
         data: numpy.ndarray, labels: numpy.ndarray, classes: numpy.ndarray,
         direction: Optional[str] = None, depth: int = 1,
         min_eps: float = 0.000001, max_depth: int = 16, min_samples: int = 10,
         node_fitness_function: Optional[Callable] = None,
         node_hyperparameters: Optional[Dict] = None,
         **step_hyperparameters) -> Optional[Node]
```