docs: add documentation versioning [skip ci]
Showing 32 changed files with 979 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
--- | ||
sidebar_position: 7 | ||
--- | ||
|
||
# Complete workflow | ||
|
||
Here's a sequence diagram to represent an example workflow, from the raw data | ||
tables to classification, including data fusion, PCA and training. | ||
|
||
```plantuml | ||
actor User | ||
participant LLDF | ||
participant PCA | ||
participant Classifier | ||
User -> LLDF : Upload training tables | ||
User -> LLDF : Set parameters | ||
User -> Classifier : (optional) Upload model | ||
LLDF -> PCA : Pass preprocessed / fused tables | ||
LLDF --> User : Download fused tables | ||
LLDF -> Classifier : Pass preprocessed / fused tables \nRun classification | ||
PCA -> Classifier : (optional) Set number of components | ||
Classifier --> User : classification results, graphs | ||
PCA --> User : PCA results, graphs | ||
Classifier --> User : (optional) download trained model | ||
User -> Classifier : pass data to classify | ||
Classifier --> User : classification results | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"label": "kNN module", | ||
"position": 6, | ||
"link": { | ||
"type": "generated-index", | ||
"description": "A module for k-nearest neighbors analysis." | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# KNN class | ||
|
||
A class to store the data, methods and artifacts for _k-Nearest Neighbors Analysis_. | ||
|
||
## Syntax | ||
|
||
```python | ||
KNN(settings: KNNSettings, fused_data: LLDFModel) | ||
``` | ||
|
||
## Constructor parameters | ||
|
||
- `fused_data`: object of type [`LLDFModel`](../lldf/lldfmodel.md). Contains the data to be analyzed. | ||
- `settings`: object of type [`KNNSettings`](knnsettings.md). Contains the settings for | ||
the `KNN` object. | ||
|
||
## Fields | ||
|
||
- `settings`: object of type [`KNNSettings`](/tesi/docs/knn/knnsettings). Contains the settings for | ||
the `KNN` object. | ||
- `fused_data`: object of type [`LLDFModel`](/tesi/docs/lldf/lldfmodel). Contains the | ||
artifacts from the data fusion process. | ||
- `model`: a `KNeighborsClassifier` model from `scikit-learn`. Defaults to `None`. | ||
|
||
## Methods | ||
|
||
- `knn(self)`: trains the k-Nearest Neighbors model. | ||
- `predict(self, x_data)`: performs kNN prediction once the model is trained. | ||
- *raises*: | ||
- `RuntimeError("The kNN model is not trained yet!")` if the `KNN` model hasn't been trained yet | ||
|
||
## Example | ||
|
||
```python | ||
from chemfusekit.knn import KNN | ||
|
||
# Initialize and run the KNN class | ||
knn = KNN(settings, lldf.fused_data) | ||
knn.knn() | ||
|
||
# Run predictions | ||
knn.predict(x_data) | ||
``` |
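The "train first, then predict" contract described under Methods can be sketched without any dependencies. This is a minimal illustration, not the `chemfusekit` implementation: `TinyKNN` and its `train` method are hypothetical names, Euclidean distance and majority voting are assumed, and only the `RuntimeError` message is taken from the documentation above.

```python
from collections import Counter
from math import dist


class TinyKNN:
    """Hypothetical stand-in for a kNN wrapper; not chemfusekit's KNN class."""

    def __init__(self, n_neighbors=3):
        self.n_neighbors = n_neighbors
        self.model = None  # stays None until train() is called

    def train(self, x_train, y_train):
        # "Training" a kNN model only means storing the labelled samples.
        self.model = list(zip(x_train, y_train))

    def predict(self, x_data):
        # Guard documented above: predicting before training is an error.
        if self.model is None:
            raise RuntimeError("The kNN model is not trained yet!")
        predictions = []
        for point in x_data:
            # Keep the k stored samples closest to the point (Euclidean distance).
            nearest = sorted(self.model, key=lambda s: dist(s[0], point))[: self.n_neighbors]
            # Majority vote among the labels of those neighbors.
            votes = Counter(label for _, label in nearest)
            predictions.append(votes.most_common(1)[0][0])
        return predictions


knn = TinyKNN(n_neighbors=3)
try:
    knn.predict([(0.0, 0.0)])
except RuntimeError as err:
    print(err)  # The kNN model is not trained yet!

knn.train(
    [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)],
    ["a", "a", "a", "b", "b", "b"],
)
print(knn.predict([(0.2, 0.1), (5.5, 5.5)]))  # ['a', 'b']
```

Keeping `model` at `None` until training finishes makes the untrained state easy to detect, which matches the `model` field's documented default.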
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# KNNSettings class | ||
|
||
Holds the settings for the [`KNN`](knn.md) object. | ||
|
||
## Syntax | ||
|
||
```python | ||
KNNSettings( | ||
n_neighbors: int, | ||
metric: str | Callable, | ||
weights: str | Callable, | ||
algorithm: str, | ||
output: GraphMode, | ||
test_split: bool | ||
) | ||
``` | ||
|
||
## Fields and constructor parameters | ||
- `n_neighbors`: the number of neighbors to be used in the `KNN` model. Defaults to 15. | ||
- `metric`: the distance metric for the model. It can take one of the following values: | ||
- `minkowski` | ||
- `precomputed` | ||
- `euclidean` | ||
or be a callable object. | ||
- `weights`: the weight metric for the model. It can take one of the following values: | ||
- `uniform` | ||
- `distance` | ||
or be a callable object. | ||
- `algorithm`: the algorithm for the model. It can take one of the following values: | ||
- `auto` | ||
- `ball_tree` | ||
- `kd_tree` | ||
- `brute` | ||
- `output`: toggles graph output mode. Defaults to [`GraphMode.NONE`](../utils/graphmode.md). | ||
- `test_split`: toggles the split test phase that follows training. Defaults to `False`. Only takes effect when graph output is enabled, i.e. when `output` is not `GraphMode.NONE`. | ||
|
||
The constructor raises: | ||
- `ValueError("Invalid n_neighbors number: should be a positive integer.")` if the number of neighbors is not valid. | ||
- `ValueError("Invalid metric: should be 'minkwoski', 'precomputed', 'euclidean' or a callable.")` if the chosen metric is neither available nor a callable function. | ||
- `ValueError("Invalid weight: should be 'uniform', 'distance' or a callable")` if the chosen weight is neither available nor a callable function. | ||
- `ValueError("Invalid algorithm: should be 'auto', 'ball_tree', 'kd_tree' or 'brute'.")` if the chosen algorithm does not exist. | ||
- `Warning("You selected test_split but it won't run because you disabled the output.")` if `test_split` is enabled while `output` is disabled (split tests only produce graphical output, so they do nothing without it). | ||
|
||
## Example | ||
|
||
```python | ||
from chemfusekit.knn import KNNSettings, GraphMode | ||
|
||
settings = KNNSettings( | ||
n_neighbors=20, # pick 20 neighbors | ||
metric='minkowski', # choose the metric | ||
weights='distance', # choose the weight metric | ||
algorithm='auto', # the best algorithm gets chosen automatically | ||
output=GraphMode.GRAPHIC, # graph output is enabled | ||
test_split=True # the model will be split-tested at the end of the training | ||
) | ||
``` |
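The constructor checks listed above can be sketched with a small dataclass. This is an illustration under stated assumptions, not the library's code: `KNNSettingsSketch` is a hypothetical name, the defaults for `metric`, `weights` and `algorithm` are assumed (only the default of 15 for `n_neighbors` is documented), the error messages are shortened, and the `output` and `test_split` fields are left out.

```python
from dataclasses import dataclass
from typing import Callable, Union

VALID_METRICS = ("minkowski", "precomputed", "euclidean")
VALID_WEIGHTS = ("uniform", "distance")
VALID_ALGORITHMS = ("auto", "ball_tree", "kd_tree", "brute")


@dataclass
class KNNSettingsSketch:
    """Hypothetical stand-in mirroring the documented checks; not the library class."""

    n_neighbors: int = 15                       # documented default
    metric: Union[str, Callable] = "minkowski"  # assumed default
    weights: Union[str, Callable] = "uniform"   # assumed default
    algorithm: str = "auto"                     # assumed default

    def __post_init__(self):
        # Runs right after the generated __init__, so invalid settings never exist.
        if not isinstance(self.n_neighbors, int) or self.n_neighbors < 1:
            raise ValueError("n_neighbors should be a positive integer")
        if self.metric not in VALID_METRICS and not callable(self.metric):
            raise ValueError("metric should be a known name or a callable")
        if self.weights not in VALID_WEIGHTS and not callable(self.weights):
            raise ValueError("weights should be a known name or a callable")
        if self.algorithm not in VALID_ALGORITHMS:
            raise ValueError("algorithm should be one of: " + ", ".join(VALID_ALGORITHMS))


settings = KNNSettingsSketch(n_neighbors=20, weights="distance")
print(settings.algorithm)  # auto

try:
    KNNSettingsSketch(algorithm="fastest")
except ValueError as err:
    print(err)
```

Validating in the constructor means a settings object that exists is always usable, so the `KNN` class does not need to repeat these checks.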
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"label": "LDA module", | ||
"position": 4, | ||
"link": { | ||
"type": "generated-index", | ||
"description": "A module for linear discriminant analysis." | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# LDA class | ||
|
||
A class to store the data, methods and artifacts for _Linear Discriminant Analysis_. | ||
|
||
## Syntax | ||
|
||
```python | ||
LDA(lldf_model: LLDFModel, settings: LDASettings) | ||
``` | ||
|
||
## Constructor parameters | ||
|
||
- `lldf_model`: object of type [`LLDFModel`](../lldf/lldfmodel.md). Contains the data to be analyzed. | ||
- `settings`: object of type [`LDASettings`](./ldasettings.md). Contains the settings for | ||
the `LDA` object. | ||
|
||
## Fields | ||
|
||
- `settings`: object of type [`LDASettings`](./ldasettings.md). Contains the settings for | ||
the `LDA` object. | ||
- Fused data fields: | ||
- `x_data` | ||
- `x_train` | ||
- `y` | ||
- `model`: a `LinearDiscriminantAnalysis` model from `scikit-learn`. Defaults to `None`. | ||
|
||
## Methods | ||
|
||
- `lda(self)`: performs Linear Discriminant Analysis | ||
- `__print_prediction_graphs(self, y_test, y_pred)`: helper function to print | ||
graphs and stats about LDA predictions | ||
- `predict(self, x_data)`: performs LDA prediction once the model is trained. | ||
- *raises*: | ||
- `RuntimeError("The LDA model is not trained yet!")` if the LDA model hasn't been trained yet | ||
|
||
## Example | ||
|
||
```python | ||
from chemfusekit.lda import LDA | ||
|
||
# Initialize and run the LDA class | ||
lda = LDA(lldf.fused_data, settings) | ||
lda.lda() | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
--- | ||
sidebar_position: 2 | ||
--- | ||
|
||
# LDASettings class | ||
|
||
Holds the settings for the [`LDA`](./lda.md) object. | ||
|
||
## Syntax | ||
|
||
```python | ||
LDASettings(components: int, output: GraphMode, test_split: bool) | ||
``` | ||
|
||
## Fields and constructor parameters | ||
|
||
- `components`: the number of components to be used in the LDA model. Defaults to 3. | ||
- `output`: toggles graph output. Defaults to [`GraphMode.NONE`](../utils/graphmode.md). | ||
- `test_split`: toggles split testing. Defaults to `False`. | ||
|
||
|
||
The constructor raises: | ||
- `ValueError("Invalid component number: must be a > 1 integer.")` if the number of | ||
components is not valid. | ||
- `Warning("You selected test_split but it won't run because you disabled the output.")` if split tests are run with `output` disabled | ||
|
||
## Example | ||
|
||
```python | ||
from chemfusekit.lda import LDASettings, GraphMode | ||
|
||
settings = LDASettings( | ||
components=(pca.components - 1), # one less component than the number determined by PCA | ||
output=GraphMode.GRAPHIC, # graphs will be printed | ||
test_split=True # split testing is enabled | ||
) | ||
``` |
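When choosing `components`, it may help to remember that scikit-learn's `LinearDiscriminantAnalysis` accepts at most `min(n_classes - 1, n_features)` components. The helpers below are hypothetical (they are not part of `chemfusekit`) and only sketch how a candidate value could be checked against that limit before building the settings.

```python
def max_lda_components(n_classes: int, n_features: int) -> int:
    """Upper bound that scikit-learn places on LDA's n_components."""
    return min(n_classes - 1, n_features)


def check_components(components: int, n_classes: int, n_features: int) -> int:
    """Return components unchanged, or raise if it exceeds the LDA limit."""
    limit = max_lda_components(n_classes, n_features)
    if components > limit:
        raise ValueError(f"components={components} exceeds the limit of {limit}")
    return components


print(max_lda_components(n_classes=6, n_features=40))  # 5
print(check_components(3, n_classes=6, n_features=40))  # 3
```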
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
{ | ||
"label": "LLDF Module", | ||
"position": 2 | ||
, | ||
"link": { | ||
"type": "generated-index", | ||
"description": "A module for low-level data fusion." | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
--- | ||
sidebar_position: 1 | ||
--- | ||
|
||
# LLDF class | ||
|
||
The `LLDF` class is used for _low-level data fusion_. | ||
|
||
## Syntax | ||
|
||
```python | ||
LLDF(lldf_settings: LLDFSettings) | ||
``` | ||
|
||
## Constructor parameters | ||
|
||
- `lldf_settings`: [`LLDFSettings`](./lldfsettings) | ||
|
||
The settings for the LLDF object. | ||
|
||
## Fields | ||
|
||
- `settings`: [`LLDFSettings`](./lldfsettings) | ||
|
||
The settings for the LLDF object. | ||
|
||
- `fused_data`: [`LLDFModel`](./lldfmodel.md) | ||
|
||
The resulting model containing the data fusion artifacts. | ||
|
||
## Methods | ||
|
||
- `_snv(self, input_data)`: private method that rescales input arrays with standard normal variate (SNV) normalization | ||
- `lldf(self)`: performs low-level data fusion on the data passed in the settings | ||
- *raises*: | ||
- `FileNotFoundError("Error opening the selected files.")` | ||
if the files specified in the settings are not valid | ||
- `SyntaxError("LLDF: this type of preprocessing does not exist")` | ||
if the preprocessing method specified in the settings is not valid | ||
- `export_data(self, export_path)`: exports the data fusion artifacts to an Excel file | ||
- *raises*: | ||
- `RuntimeError("Cannot export data before data fusion.")` if export is | ||
attempted before fusing the data | ||
- `RuntimeError("Could not export data to the selected path.")` if any error | ||
happens during the export phase | ||
|
||
|
||
## Example | ||
|
||
```python | ||
from chemfusekit.lldf import LLDF | ||
|
||
# Initialize and run low-level data fusion | ||
lldf = LLDF(lldf_settings) | ||
lldf.lldf() | ||
|
||
# Export the LLDF data to an Excel file | ||
lldf.export_data('output_file.xlsx') | ||
``` |
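Low-level data fusion usually means joining the (preprocessed) measurements of each sample from several instruments into one wider feature row. The sketch below assumes exactly that and nothing more: `fuse_low_level` is a hypothetical helper, the numbers are made up, and the actual steps performed by `LLDF.lldf()` are not shown in this documentation.

```python
def fuse_low_level(table_a, table_b):
    """Concatenate two tables row by row; rows must refer to the same samples."""
    if len(table_a) != len(table_b):
        raise ValueError("Both tables must describe the same samples, in the same order.")
    # Each fused row holds the features of table_a followed by those of table_b.
    return [list(row_a) + list(row_b) for row_a, row_b in zip(table_a, table_b)]


# Made-up values: two spectral features and one retention time per sample.
spectra = [[0.12, 0.30], [0.08, 0.41]]
retention_times = [[5.2], [7.9]]

print(fuse_low_level(spectra, retention_times))
# [[0.12, 0.3, 5.2], [0.08, 0.41, 7.9]]
```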
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
--- | ||
sidebar_position: 3 | ||
--- | ||
|
||
# LLDFModel class | ||
|
||
This class models the output data from the LLDF operation. | ||
|
||
## Syntax | ||
|
||
```python | ||
LLDFModel(x_data: pd.DataFrame, x_train: pd.DataFrame, y: pd.DataFrame) | ||
``` | ||
|
||
## Fields and constructor parameters | ||
|
||
All three are `Pandas` `DataFrame` objects: | ||
- `x_data` | ||
- `x_train` | ||
- `y` | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
--- | ||
sidebar_position: 2 | ||
--- | ||
|
||
# LLDFSettings class | ||
|
||
Holds the settings for the [`LLDF`](./lldf.md) object. | ||
|
||
## Syntax | ||
|
||
```python | ||
LLDFSettings( | ||
qepas_path: str, | ||
qepas_sheet: str, | ||
rt_path: str, | ||
rt_sheet: str, | ||
preprocessing: str = 'snv') | ||
``` | ||
|
||
## Fields and constructor parameters | ||
|
||
- `qepas_path`: a `str` containing the path to the QEPAS spectroscopy Excel datasheet | ||
- `qepas_sheet`: a `str` containing the sheet name within the QEPAS Excel file | ||
- `rt_path`: a `str` containing the path to the GC chromatography Excel datasheet | ||
- `rt_sheet`: a `str` containing the sheet name within the GC Excel file | ||
- `preprocessing`: a `str` with the name of the preprocessing to be applied to the QEPAS data. | ||
Available options: `snv` (normalization), `savgol` (Savitzky-Golay smoothing), `savgol+snv` (both). | ||
|
||
The constructor raises: | ||
- `TypeError("This type of preprocessing does not exist.")` if the preprocessing parameter is not one of the three available. | ||
|
||
## Example | ||
|
||
```python | ||
from chemfusekit.lldf import LLDFSettings | ||
|
||
# Initialize the settings for low-level data fusion | ||
lldf_settings = LLDFSettings( | ||
qepas_path='tests/qepas.xlsx', | ||
qepas_sheet='Sheet1', | ||
rt_path='tests/rt.xlsx', | ||
rt_sheet='Sheet1', | ||
preprocessing='snv' # normalization preprocessing; other options: 'savgol' or 'savgol+snv' | ||
) | ||
``` |
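For readers unfamiliar with the `snv` option, standard normal variate normalization centers each spectrum on its own mean and divides it by its own standard deviation. The function below is a dependency-free sketch of that idea, not the library's `_snv` method; using the sample standard deviation (`statistics.stdev`) is an assumption.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Scale one spectrum to zero mean and unit standard deviation."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (assumed convention)
    if sigma == 0:
        raise ValueError("A constant spectrum cannot be SNV-normalized.")
    return [(value - mu) / sigma for value in spectrum]


print(snv([1, 2, 3]))  # [-1.0, 0.0, 1.0]
```

Because each spectrum is scaled independently, SNV removes offset and scale differences between samples without needing statistics from the rest of the dataset.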