This project automates the conversion of decision tables into decision trees, combining classical decision theory with machine learning algorithms. It is intended to simplify complex decision logic in domains such as healthcare and finance.
A decision table is a structured way to represent conditional logic by listing scenarios and their respective actions. Mathematically, a decision table $T$ can be expressed as a triple $T = (S, C, A)$, where:
- Scenarios (S): Possible states or inputs represented as rows.
- Conditions (C): Criteria based on which decisions are made, represented as columns.
- Actions (A): Outcomes or operations to perform when conditions are satisfied.
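As an illustration, a small decision table can be held in memory as a list of rows, one per scenario, with one key per condition and one key for the action. The loan-style column names and values below are invented for this sketch and are not taken from the project, and `lookup_action` is a hypothetical helper.

```python
# Hypothetical decision table: each dict is one scenario (row).
# "income" and "credit" are conditions (columns); "action" is the outcome.
decision_table = [
    {"income": "high", "credit": "good", "action": "approve"},
    {"income": "high", "credit": "bad", "action": "review"},
    {"income": "low", "credit": "good", "action": "review"},
    {"income": "low", "credit": "bad", "action": "reject"},
]


def lookup_action(table, **conditions):
    """Return the action of the first scenario matching all given conditions."""
    for row in table:
        if all(row.get(name) == value for name, value in conditions.items()):
            return row["action"]
    return None  # no scenario in the table matches


print(lookup_action(decision_table, income="low", credit="good"))
```

A lookup like this scans every row; converting the table to a tree replaces the scan with a short sequence of attribute tests.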
A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents the outcome of the test, and each leaf node represents a class label (decision). The paths from root to leaf represent classification rules.
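The structure described above can be sketched with nested dictionaries. This is one possible representation, not necessarily the one this project uses: an internal node names the attribute it tests and maps each outcome to a subtree, and a leaf is a plain string. The attribute names and decisions are made up for the example.

```python
# Hypothetical tree layout (an assumption for this sketch):
#   internal node: {"attribute": name, "branches": {value: subtree}}
#   leaf node:     a string holding the decision
tree = {
    "attribute": "income",
    "branches": {
        "high": {"attribute": "credit", "branches": {"good": "approve", "bad": "review"}},
        "low": {"attribute": "credit", "branches": {"good": "review", "bad": "reject"}},
    },
}


def classify(node, sample):
    """Follow the branch matching the sample at each test until a leaf is reached."""
    while isinstance(node, dict):
        value = sample[node["attribute"]]
        node = node["branches"][value]
    return node


def rules(node, path=()):
    """Yield (conditions, decision) for every root-to-leaf path."""
    if not isinstance(node, dict):
        yield path, node
        return
    for value, child in node["branches"].items():
        yield from rules(child, path + ((node["attribute"], value),))


print(classify(tree, {"income": "high", "credit": "bad"}))
for conditions, decision in rules(tree):
    print(conditions, "->", decision)
```

Each tuple produced by `rules` corresponds to one classification rule, that is, one root-to-leaf path.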
The entropy $H(S)$ of a set $S$ is defined as:

$$H(S) = -\sum_{i} p_i \log_2 p_i$$

where $p_i$ is the proportion of elements of $S$ that belong to class $i$.
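Entropy can be computed directly from the class proportions $p_i$. The following is a minimal sketch, assuming base-2 logarithms (so the result is in bits) and class labels given as a plain Python list; the function name `entropy` is chosen for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0  # convention for this sketch: an empty set has zero entropy
    counts = Counter(labels)
    # Each class contributes -p_i * log2(p_i), where p_i = count / total.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


print(entropy(["approve", "approve", "reject", "reject"]))
print(entropy(["approve", "approve", "approve", "approve"]))
```

A set split evenly between two classes has the maximum two-class entropy of 1 bit, while a set containing a single class has entropy 0.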
Information gain is used to decide which attribute to test at each node of the tree. It is the difference between the entropy before a split and the weighted average entropy after splitting the dataset on the values of a given attribute $A$:

$$IG(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|} H(S_v)$$

where:
- $S_v$ is the subset of $S$ for which attribute $A$ has value $v$.
- $|S_v|$ and $|S|$ are the number of elements in $S_v$ and $S$, respectively.
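The calculation can be sketched as follows: group the rows by their value of the attribute, weight each subset's entropy by its relative size, and subtract the total from the entropy of the whole set. The function names, the row format (a list of dicts), and the toy data are assumptions made for this example rather than the project's actual code.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return sum(-(n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target before the split minus the weighted entropy after it."""
    if not rows:
        return 0.0
    # Group the target labels by the value each row has for the attribute (the S_v).
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[attribute]].append(row[target])
    # Weight each subset's entropy by |S_v| / |S|.
    weighted = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy([row[target] for row in rows]) - weighted


# Toy data: "credit" fully determines the action, "income" tells us nothing.
rows = [
    {"credit": "good", "income": "high", "action": "approve"},
    {"credit": "good", "income": "low", "action": "approve"},
    {"credit": "bad", "income": "high", "action": "reject"},
    {"credit": "bad", "income": "low", "action": "reject"},
]
best = max(["credit", "income"], key=lambda a: information_gain(rows, a, "action"))
print(best)
```

Choosing the attribute with the highest gain at every node, then recursing on each subset, is the core of ID3-style tree construction.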
To set up the project environment:
git clone https://github.com/yourusername/decision-tree-generator.git
cd decision-tree-generator
pip install -r requirements.txt
Run the following command to generate a decision tree from a CSV file containing the decision table:
python decision_tree_generator.py --input your_decision_table.csv
Contributions are welcome. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License. See the LICENSE file for details.
- Dmytro Tolstoi
- Oleh Kiprik
- Andrii Voznesenskyi