modelviz is a Python package for customizable data visualization and model evaluation. It provides modules for visualizing feature relationships, confusion matrices, ROC curves, data distributions, and missing values, which simplifies exploratory data analysis (EDA) and model performance evaluation.
Install modelviz via pip:
pip install modelviz
- Visualize Confusion Matrices:
  - Supports both binary and multi-class confusion matrices.
  - Displays proportions, TP, FP, FN, and TN labels.
  - Includes detailed metrics like Accuracy, Precision, Recall, F1 Score, MCC, and Cohen's Kappa.
  - Option to normalize the confusion matrix.
from modelviz.confusion_matrix import plot_confusion_matrix
import numpy as np
cm = np.array([[50, 10], [5, 35]]) # Binary confusion matrix
classes = ["Negative", "Positive"]
plot_confusion_matrix(cm, classes, "Logistic Regression")
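For reference, the metrics listed above can be derived directly from the matrix. The following is a minimal sketch using only NumPy, not modelviz's own implementation. It assumes the common layout where rows are true classes and columns are predicted classes, ordered [Negative, Positive]; the README does not state which layout modelviz expects.

```python
import numpy as np

# Same matrix as in the example above.
# Assumed layout: rows = true class, columns = predicted class.
cm = np.array([[50, 10], [5, 35]])
tn, fp, fn, tp = (int(v) for v in cm.ravel())
total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Matthews correlation coefficient (MCC)
mcc = (tp * tn - fp * fn) / np.sqrt(
    float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
)

# Cohen's kappa: observed agreement vs. agreement expected by chance
expected = float((cm.sum(axis=1) * cm.sum(axis=0)).sum()) / total**2
kappa = (accuracy - expected) / (1 - expected)

for name, value in [("Accuracy", accuracy), ("Precision", precision),
                    ("Recall", recall), ("F1", f1),
                    ("MCC", mcc), ("Kappa", kappa)]:
    print(f"{name}: {value:.3f}")
```

If your matrix uses the transposed layout (rows = predicted), swap `fp` and `fn` before computing the metrics.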
- Feature Histograms:
  - Automatically generate histograms for all numeric columns in a pandas DataFrame.
  - Skip binary columns for cleaner visualizations.
  - Customize bins, colors, and titles.
from modelviz.histogram import plot_feature_histograms
import pandas as pd
df = pd.DataFrame({
'Age': [25, 30, 35, 40],
'Income': [40000, 50000, 60000, 70000],
'Gender': [0, 1, 0, 1]
})
plot_feature_histograms(df, exclude_binary=True, bins=10, color='blue')
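To illustrate what skipping binary columns means here, the sketch below selects the numeric, non-binary columns with plain pandas. It assumes "binary" means a column with at most two distinct non-null values; modelviz's exact rule may differ.

```python
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 30, 35, 40],
    'Income': [40000, 50000, 60000, 70000],
    'Gender': [0, 1, 0, 1]
})

# Keep numeric columns only, then separate out those with <= 2 distinct values.
numeric = df.select_dtypes(include="number")
binary_cols = [c for c in numeric.columns if numeric[c].dropna().nunique() <= 2]
hist_cols = [c for c in numeric.columns if c not in binary_cols]

print("Binary columns (skipped):", binary_cols)
print("Columns to plot:", hist_cols)
```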
- ROC Curve Visualization:
  - Plot Receiver Operating Characteristic (ROC) curves.
  - Highlight thresholds like Youden's J and adjusted thresholds.
  - Display key metrics like AUC (Area Under Curve).
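from modelviz.roc import plot_roc_curve_with_youdens_thresholds
# Illustrative values only. In practice, compute fpr, tpr, and thresholds
# from your model's scores (e.g. with sklearn.metrics.roc_curve) and pass the matching AUC.
fpr = [0.0, 0.1, 0.2, 0.3]
tpr = [0.0, 0.4, 0.6, 1.0]
thresholds = [1.0, 0.8, 0.5, 0.2]
plot_roc_curve_with_youdens_thresholds(fpr, tpr, thresholds, roc_auc=0.85, model_name="My Model")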
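Youden's J statistic is defined as J = TPR - FPR, and the threshold that maximizes J is the one usually highlighted on the curve. The sketch below computes it with NumPy for the same illustrative arrays; it shows the definition, not necessarily how modelviz computes it internally.

```python
import numpy as np

fpr = np.array([0.0, 0.1, 0.2, 0.3])
tpr = np.array([0.0, 0.4, 0.6, 1.0])
thresholds = np.array([1.0, 0.8, 0.5, 0.2])

# Youden's J at each threshold: sensitivity + specificity - 1 = TPR - FPR
j_scores = tpr - fpr
best = int(np.argmax(j_scores))
best_threshold = float(thresholds[best])

print(f"Best threshold: {best_threshold} (J = {j_scores[best]:.2f})")
```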
- Correlation Matrix:
  - Generate and visualize correlation matrices for numeric features.
  - Customize heatmaps with annotations, colormap, and figure size.
from modelviz.relationships import plot_correlation_matrix
import pandas as pd
df = pd.DataFrame({
'A': [1, 2, 3, 4],
'B': [4, 3, 2, 1],
'C': [5, 6, 7, 8]
})
plot_correlation_matrix(df, method='pearson')
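To see the numbers such a heatmap would display, you can compute the correlation matrix directly with pandas. This is a sketch using `DataFrame.corr`; whether modelviz calls it internally is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [4, 3, 2, 1],
    'C': [5, 6, 7, 8]
})

# Pairwise Pearson correlation between all numeric columns
corr = df.corr(method="pearson")
print(corr.round(2))
```

In this toy data, A and C increase together (correlation 1) while B decreases as A increases (correlation -1).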
- Visualize K-Fold Splits:
  - Display data distribution across training and validation sets for K-Fold Cross-Validation.
  - Easy visualization for understanding fold assignments.
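As a rough illustration of the fold assignments such a plot conveys, the sketch below builds unshuffled K-Fold splits with NumPy and prints them as text. The sample and fold counts are arbitrary, and no modelviz function is used because this README does not name one for this feature.

```python
import numpy as np

n_samples, n_folds = 10, 5  # arbitrary illustrative sizes
indices = np.arange(n_samples)

# Split the indices into n_folds contiguous validation blocks (no shuffling).
folds = np.array_split(indices, n_folds)

# is_val[k, i] is True when sample i is in the validation set of fold k.
is_val = np.zeros((n_folds, n_samples), dtype=bool)
for k, val_idx in enumerate(folds):
    is_val[k, val_idx] = True

# Text rendering: V = validation, T = training
for k, row in enumerate(is_val):
    print(f"Fold {k + 1}: " + "".join("V" if v else "T" for v in row))
```

Each sample appears in exactly one validation set and in the training set of every other fold.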
- Missing Value Analysis:
  - Visualize missing data in a DataFrame.
  - Quickly identify patterns and percentage of missing values.
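The quantities behind a missing-value plot can be computed with plain pandas. The sketch below uses a made-up DataFrame and does not call modelviz, since this README does not name the function for this feature.

```python
import numpy as np
import pandas as pd

# Made-up data with some missing entries
df = pd.DataFrame({
    'Age': [25, np.nan, 35, 40],
    'Income': [40000, 50000, np.nan, np.nan],
    'Gender': [0, 1, 0, 1]
})

# Boolean mask of missing cells; this is the "pattern" a heatmap would show.
mask = df.isna()

# Percentage of missing values per column, highest first
missing_pct = mask.mean().mul(100).sort_values(ascending=False)
rows_with_missing = int(mask.any(axis=1).sum())

print(missing_pct)
print("Rows with at least one missing value:", rows_with_missing)
```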
- Aggregate Model Metrics:
  - Summarize key evaluation metrics for multiple models.
  - Compare performance across models.
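One way to picture such a summary is a table with one row per model and one column per metric. The sketch below builds it with NumPy and pandas from toy predictions; the model names, labels, and the `summarize` helper are hypothetical and not part of modelviz.

```python
import numpy as np
import pandas as pd

# Toy ground truth and predictions from two hypothetical models
y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
predictions = {
    "model_a": np.array([0, 0, 1, 1, 0, 0, 1, 1]),
    "model_b": np.array([0, 1, 1, 1, 1, 1, 1, 0]),
}

def summarize(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }

# One row per model, one column per metric
summary = pd.DataFrame.from_dict(
    {name: summarize(y_true, y_pred) for name, y_pred in predictions.items()},
    orient="index",
)
print(summary.round(3))
```

In this toy data both models reach the same accuracy, but model_b trades precision for recall, which is the kind of difference a side-by-side summary makes visible.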
Each module in the package is designed to be imported separately. For example:
from modelviz.confusion_matrix import plot_confusion_matrix
from modelviz.histogram import plot_feature_histograms
from modelviz.roc import plot_roc_curve_with_youdens_thresholds
Contributions are welcome! If you have suggestions or new feature ideas, feel free to open an issue or create a pull request on GitHub.