Relative docs paths #602

Merged · 7 commits · Dec 13, 2023
Makefile · 14 changes: 3 additions & 11 deletions
@@ -9,26 +9,18 @@ install:
 	pip install -e ".[dev]"
 	pre-commit install
 
-doctest:
-	python -m doctest -v sklego/*.py
-
-test-notebooks:
-	pytest --nbval-lax doc/*.ipynb
-
-test: doctest
+test:
 	pytest --disable-warnings --cov=sklego
 	rm -rf .coverage*
+	pytest --nbval-lax doc/*.ipynb
 
 precommit:
 	pre-commit run
 
 docs:
-	pip install -e ".[docs]"
 	mkdocs serve
 
-docs-deploy: docs
-	netlify deploy --dir=docs --prod
+docs-deploy:
+	mkdocs gh-deploy
 
 clean:
 	rm -rf .pytest_cache
docs/contribution.md · 2 changes: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
# Contribution

<p align="center">
<img src="/_static/contribution/contribute.png" />
<img src="../_static/contribution/contribute.png" />
</p>

This project started because we saw people rewrite the same transformers and estimators at clients over and over again.
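This one-character change is the heart of the PR: absolute `/_static/...` URLs resolve against the domain root and break once the site is served from a sub-path, as `mkdocs gh-deploy` does for project pages. A quick sketch of the resolution logic (the page URL below is a hypothetical example, not the project's actual deploy target):

```py
from urllib.parse import urljoin

# Hypothetical rendered-page URL for docs/contribution.md on a
# GitHub Pages project site served under a /scikit-lego/ sub-path.
page = "https://example.github.io/scikit-lego/contribution/"

# Old absolute path: resolves against the domain root, skipping the sub-path.
print(urljoin(page, "/_static/contribution/contribute.png"))
# -> https://example.github.io/_static/contribution/contribute.png (404)

# New relative path: resolves against the page itself.
print(urljoin(page, "../_static/contribution/contribute.png"))
# -> https://example.github.io/scikit-lego/_static/contribution/contribute.png
```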
docs/rstudio.md · 4 changes: 2 additions & 2 deletions
@@ -110,7 +110,7 @@ ggplot(data=cv_df) +
```

<p align="center">
<img src="/_static/rstudio/Rplot1.png" />
<img src="../_static/rstudio/Rplot1.png" />
</p>

```r
@@ -123,7 +123,7 @@ ggplot(data=cv_df) +
```

<p align="center">
<img src="/_static/rstudio/Rplot2.png" />
<img src="../_static/rstudio/Rplot2.png" />
</p>

## Important
docs/user-guide/cross-validation.md · 12 changes: 6 additions & 6 deletions
@@ -27,37 +27,37 @@ Let's make some random data to start with, and next define a plotting function.
--8<-- "docs/_scripts/cross-validation.py:example-1"
```

-![example-1](/_static/cross-validation/example-1.png)
+![example-1](../_static/cross-validation/example-1.png)

```py title="Example 2"
--8<-- "docs/_scripts/cross-validation.py:example-2"
```

-![example-2](/_static/cross-validation/example-2.png)
+![example-2](../_static/cross-validation/example-2.png)

`window="expanding"` is the closest to the scikit-learn implementation:

```py title="Example 3"
--8<-- "docs/_scripts/cross-validation.py:example-3"
```

-![example-3](/_static/cross-validation/example-3.png)
+![example-3](../_static/cross-validation/example-3.png)

If `train_duration` is not passed, the training duration is the maximum possible without overlapping validation folds:

```py title="Example 4"
--8<-- "docs/_scripts/cross-validation.py:example-4"
```

-![example-4](/_static/cross-validation/example-4.png)
+![example-4](../_static/cross-validation/example-4.png)

If the train and valid durations would lead to an unwanted number of splits, `n_splits` can be used to set a maximum number of splits:

```py title="Example 5"
--8<-- "docs/_scripts/cross-validation.py:example-5"
```

-![example-5](/_static/cross-validation/example-5.png)
+![example-5](../_static/cross-validation/example-5.png)
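Not part of the diff, but for orientation: the five examples above configure the same splitter. A minimal sketch of how those options combine, assuming the `TimeGapSplit` parameter names of this era of the library (`date_serie`, `valid_duration`, `gap_duration`) rather than a verified API:

```py
import datetime as dt
import numpy as np
import pandas as pd
from sklego.model_selection import TimeGapSplit

df = pd.DataFrame({"x": np.arange(30), "y": np.arange(30) % 2})
dates = pd.Series(pd.date_range("2021-01-01", periods=30, freq="D"))

# window="expanding" with a capped n_splits, as in Examples 3 and 5.
cv = TimeGapSplit(
    date_serie=dates,                     # assumed parameter name
    valid_duration=dt.timedelta(days=5),
    gap_duration=dt.timedelta(days=1),
    n_splits=3,
    window="expanding",
)

for train_idx, valid_idx in cv.split(df[["x"]], df["y"]):
    print(len(train_idx), len(valid_idx))
```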

```py title="Summary"
--8<-- "docs/_scripts/cross-validation.py:summary"
@@ -109,7 +109,7 @@ Train = [2004, 2004, 2004, 2004, 2004]
Test = [2005, 2005, 2006, 2006, 2007]
```

-![grp-ts-split](/_static/cross-validation/group-time-series-split.png)
+![grp-ts-split](../_static/cross-validation/group-time-series-split.png)

As you can see above, `GroupTimeSeriesSplit` keeps the time order chronological and makes sure that the same time value won't appear in both the train and test set of the same fold.
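A short usage sketch of the split described above (assuming the standard `split(X, y, groups=...)` contract and an `n_splits` constructor argument):

```py
import numpy as np
from sklego.model_selection import GroupTimeSeriesSplit

X = np.random.randn(20, 2)
y = np.random.randn(20)
# Yearly group labels, already in chronological order.
years = np.repeat([2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007],
                  [3, 2, 4, 3, 2, 3, 2, 1])

cv = GroupTimeSeriesSplit(n_splits=3)
for train_idx, test_idx in cv.split(X, y, groups=years):
    print(sorted(set(years[train_idx])), "->", sorted(set(years[test_idx])))
```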

docs/user-guide/datasets.md · 16 changes: 8 additions & 8 deletions
@@ -38,7 +38,7 @@ Loads the abalone dataset where the goal is to predict the gender of the creatur
--8<-- "docs/_scripts/datasets.py:plot-abalone"
```

-![abalone](/_static/datasets/abalone.png)
+![abalone](../_static/datasets/abalone.png)

## Arrests

Expand All @@ -58,7 +58,7 @@ The goal is to predict whether or not the arrestee was released with a summons w
--8<-- "docs/_scripts/datasets.py:plot-arrests"
```

-![arrests](/_static/datasets/arrests.png)
+![arrests](../_static/datasets/arrests.png)

## Chickens

@@ -78,7 +78,7 @@ There were four groups of chicks on different protein diets.
--8<-- "docs/_scripts/datasets.py:plot-chicken"
```

-![chickens](/_static/datasets/chicken.png)
+![chickens](../_static/datasets/chicken.png)

## Hearts

@@ -99,7 +99,7 @@ This implementation loads the Cleveland dataset of the research which is the onl
--8<-- "docs/_scripts/datasets.py:plot-hearts"
```

-![hearts](/_static/datasets/hearts.png)
+![hearts](../_static/datasets/hearts.png)

## Heroes

@@ -119,7 +119,7 @@ Note that the pandas dataset returns more information.
--8<-- "docs/_scripts/datasets.py:plot-heroes"
```

-![heroes](/_static/datasets/heroes.png)
+![heroes](../_static/datasets/heroes.png)

## Penguins

@@ -140,7 +140,7 @@ The goal of the dataset is to predict which species of penguin a penguin belongs
--8<-- "docs/_scripts/datasets.py:plot-penguins"
```

-![penguins](/_static/datasets/penguins.png)
+![penguins](../_static/datasets/penguins.png)

## Creditcard frauds

@@ -179,7 +179,7 @@ The dataset is highly unbalanced, the positive class (frauds) account for 0.172%
--8<-- "docs/_scripts/datasets.py:plot-creditcards"
```

-![creditcards](/_static/datasets/creditcards.png)
+![creditcards](../_static/datasets/creditcards.png)

## Simpleseries

@@ -195,7 +195,7 @@ Generate a *very simple* timeseries dataset to play with. The generator assumes
--8<-- "docs/_scripts/datasets.py:plot-ts"
```

-![timeseries](/_static/datasets/timeseries.png)
+![timeseries](../_static/datasets/timeseries.png)
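All of the loaders touched above follow the same scikit-learn-style pattern; a minimal sketch (the keyword arguments are assumptions based on that pattern, not verified against every loader):

```py
from sklego.datasets import load_penguins, make_simpleseries

df = load_penguins(as_frame=True)        # full pandas DataFrame
X, y = load_penguins(return_X_y=True)    # plain arrays for modelling
ts = make_simpleseries(n_samples=1500)   # generated toy timeseries
```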

[abalone-api]: /api/datasets#sklego.datasets.load_abalone
[arrests-api]: /api/datasets#sklego.datasets.load_arrests
docs/user-guide/fairness.md · 14 changes: 7 additions & 7 deletions
@@ -6,7 +6,7 @@ Scikit learn (pre version 1.2) came with the boston housing dataset. We can make
--8<-- "docs/_scripts/fairness.py:predict-boston-simple"
```

-![boston-simple](/_static/fairness/predict-boston-simple.png)
+![boston-simple](../_static/fairness/predict-boston-simple.png)

We could stop our research here if we think that our MSE is _good enough_ but this would be _dangerous_. To find out why, we should look at the variables that are being used in our model.

@@ -107,7 +107,7 @@ It does this by projecting all vectors away such that the remaining dataset is o
The [`InformationFilter`][filter-information-api] uses a variant of the [Gram–Schmidt process][gram–schmidt-process] to filter information out of the dataset. We can visualize this in two dimensions:

<p align="center">
<img src="/_static/fairness/projections.png" />
<img src="../_static/fairness/projections.png" />
</p>

To explain what occurs in higher dimensions we need to resort to maths. Take a training matrix $X$ that contains columns $x_1, ..., x_k$.
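A numpy sketch of a single projection step, to make the picture above concrete. This illustrates the idea only and is not the `InformationFilter` implementation:

```py
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
v = X[:, 0]                                # the sensitive column x_1

def project_away(col, v):
    # Subtract the component of `col` that points along `v`.
    return col - (col @ v) / (v @ v) * v

# Filter the remaining columns so they are orthogonal to v.
X_fair = np.column_stack([project_away(X[:, j], v) for j in (1, 2)])
print(np.round(X_fair.T @ v, 10))          # ~[0, 0]
```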
@@ -159,23 +159,23 @@ We can see that the coefficients of the three models are indeed different.
```py
--8<-- "docs/_scripts/fairness.py:original-situation"
```
-![original-situation](/_static/fairness/original-situation.png)
+![original-situation](../_static/fairness/original-situation.png)

#### 2. Drop two columns

??? example "Code to generate the plot"
```py
--8<-- "docs/_scripts/fairness.py:drop-two"
```
-![drop-two](/_static/fairness/drop-two.png)
+![drop-two](../_static/fairness/drop-two.png)

#### 3. Use the Information Filter

??? example "Code to generate the plot"
```py
--8<-- "docs/_scripts/fairness.py:use-info-filter"
```
-![use-info-filter](/_static/fairness/use-info-filter.png)
+![use-info-filter](../_static/fairness/use-info-filter.png)

There is a clear trade-off between fairness and model accuracy. Which model you'll use depends on the world you want to create by applying your model.

@@ -241,7 +241,7 @@ The results of the grid search are shown below. Note that the logistic regressio
```py
--8<-- "docs/_scripts/fairness.py:demographic-parity-grid-results"
```
-![demographic-parity-grid-results](/_static/fairness/demographic-parity-grid-results.png)
+![demographic-parity-grid-results](../_static/fairness/demographic-parity-grid-results.png)
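A hedged sketch of fitting the constrained classifier discussed above; the constructor arguments are assumptions about the `DemographicParityClassifier` API (it also relies on a convex-optimization backend being installed):

```py
from sklearn.datasets import make_classification
from sklego.linear_model import DemographicParityClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Assumed arguments: a covariance threshold plus the column indices
# of the sensitive attributes.
clf = DemographicParityClassifier(covariance_threshold=0.5, sensitive_cols=[0])
clf.fit(X, y)
print(clf.predict_proba(X)[:3])
```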

## Equal opportunity

@@ -267,7 +267,7 @@ where POS is the subset of the population where `y_true = positive_target`.
```py
--8<-- "docs/_scripts/fairness.py:equal-opportunity-grid-results"
```
-![equal-opportunity-grid-results](/_static/fairness/equal-opportunity-grid-results.png)
+![equal-opportunity-grid-results](../_static/fairness/equal-opportunity-grid-results.png)
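The equal-opportunity variant has an analogous shape; again a sketch under assumed argument names, with `positive_target` selecting the label whose subset defines POS:

```py
from sklearn.datasets import make_classification
from sklego.linear_model import EqualOpportunityClassifier

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

fair = EqualOpportunityClassifier(
    covariance_threshold=0.5,
    positive_target=1,       # label defining the POS subset
    sensitive_cols=[0],
)
fair.fit(X, y)
print(fair.predict_proba(X)[:3])
```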

[^1]: M. Zafar et al. (2017), Fairness Constraints: Mechanisms for Fair Classification
[^2]: M. Hardt, E. Price and N. Srebro (2016), Equality of Opportunity in Supervised Learning
docs/user-guide/linear-models.md · 22 changes: 11 additions & 11 deletions
@@ -20,24 +20,24 @@ Lowess stands for LOcally WEighted Scatterplot Smoothing and has historically be
--8<-- "docs/_scripts/linear-models.py:plot-lowess"
```

-![lowess](/_static/linear-models/lowess.png)
+![lowess](../_static/linear-models/lowess.png)

The line does not look linear, but that's because internally, during prediction, many weighted linear regressions are happening. The gif below demonstrates how the data is weighted when we make a prediction.

-![lowess-rolling](/_static/linear-models/lowess-rolling.gif)
+![lowess-rolling](../_static/linear-models/lowess-rolling.gif)

### Details on `sigma`

We'll also show two different prediction outcomes depending on the hyperparameter `sigma`:

-![lowess-rolling-01](/_static/linear-models/lowess-rolling-01.gif)
+![lowess-rolling-01](../_static/linear-models/lowess-rolling-01.gif)

-![lowess-rolling-001](/_static/linear-models/lowess-rolling-001.gif)
+![lowess-rolling-001](../_static/linear-models/lowess-rolling-001.gif)

You may be tempted to think that a lower `sigma` always gives a better fit, but you need to be careful here.
The data might have gaps, and larger `sigma` values will be able to properly regularize across them.

-![lowess-two-predictions](/_static/linear-models/lowess-two-predictions.gif)
+![lowess-two-predictions](../_static/linear-models/lowess-two-predictions.gif)

Note that this regression also works in higher dimensions but the main downside of this approach is that it is _really slow_ when making predictions.
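For orientation, a minimal fit/predict sketch of the estimator described above, assuming `sigma` and `span` are plain constructor arguments:

```py
import numpy as np
from sklego.linear_model import LowessRegression

rng = np.random.default_rng(0)
X = np.linspace(0, 10, 100).reshape(-1, 1)
y = np.sin(X.ravel()) + rng.normal(scale=0.2, size=100)

model = LowessRegression(sigma=0.5, span=0.8).fit(X, y)
preds = model.predict(X)   # slow: one weighted regression per prediction point
```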

@@ -52,11 +52,11 @@

The effect of the `span` parameter on the weights can be seen below:

-![grid-span-sigma-02](/_static/linear-models/grid-span-sigma-01.png)
+![grid-span-sigma-02](../_static/linear-models/grid-span-sigma-01.png)

This will also affect the predictions.

-![grid-span-sigma-01](/_static/linear-models/grid-span-sigma-02.png)
+![grid-span-sigma-01](../_static/linear-models/grid-span-sigma-02.png)

You may need to squint your eyes a bit to see it, but lower spans cause more jiggles and less smooth curves.

@@ -119,7 +119,7 @@ Imagine that you have a dataset with some outliers.
--8<-- "docs/_scripts/linear-models.py:lad-data"
```

-![lad-01](/_static/linear-models/lad-data.png)
+![lad-01](../_static/linear-models/lad-data.png)

A simple linear regression will not do a good job since it is distracted by the outliers. That is because it optimizes the mean squared error

@@ -135,7 +135,7 @@ Hence, linear regression does the following:
--8<-- "docs/_scripts/linear-models.py:lr-fit"
```

-![lad-02](/_static/linear-models/lr-fit.png)
+![lad-02](../_static/linear-models/lr-fit.png)

By changing the loss function to the mean absolute deviation

@@ -151,7 +151,7 @@ Here is an example of [LADRegression][lad-api] in action:
--8<-- "docs/_scripts/linear-models.py:lad-fit"
```

-![lad-03](/_static/linear-models/lad-fit.png)
+![lad-03](../_static/linear-models/lad-fit.png)
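A side-by-side sketch of the effect described above, under the assumption that `LADRegression` follows the usual fit/predict contract:

```py
import numpy as np
from sklearn.linear_model import LinearRegression
from sklego.linear_model import LADRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(100, 1))
y = 2 * X.ravel() + rng.normal(scale=0.1, size=100)
y[:5] += 10                                  # a handful of outliers

print(LinearRegression().fit(X, y).coef_)    # typically pulled toward the outliers
print(LADRegression().fit(X, y).coef_)       # stays much closer to 2
```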

### See also

@@ -172,7 +172,7 @@ then around 80% of the data is between these two lines.
--8<-- "docs/_scripts/linear-models.py:quantile-fit"
```

-![quantile](/_static/linear-models/quantile-fit.png)
+![quantile](../_static/linear-models/quantile-fit.png)
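A sketch of the 80%-band idea, assuming a `quantile` constructor argument on `QuantileRegression`:

```py
import numpy as np
from sklego.linear_model import QuantileRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = 3 * X.ravel() + rng.normal(scale=0.5, size=200)

low = QuantileRegression(quantile=0.1).fit(X, y)
high = QuantileRegression(quantile=0.9).fit(X, y)
inside = (y > low.predict(X)) & (y < high.predict(X))
print(inside.mean())   # roughly 0.8
```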

[lowess-api]: /api/linear-model#sklego.linear_model.LowessRegression
[prob-weight-api]: /api/linear-model#sklego.linear_model.ProbWeightRegression