Merge pull request #58 from discovery-unicamp/version-0.5.0-beta-docs-changes

Version 0.2.0-beta release changes
GabrielBG0 authored May 13, 2024
2 parents f583936 + e14ef7a commit cb7de07
Showing 3 changed files with 90 additions and 9 deletions.
21 changes: 13 additions & 8 deletions README.md
@@ -1,7 +1,6 @@
# Minerva
[![Continuous Test](https://github.com/discovery-unicamp/Minerva/actions/workflows/python-app.yml/badge.svg)](https://github.com/discovery-unicamp/Minerva/actions/workflows/python-app.yml)

Minerva is a framework that helps researchers train machine learning models.

@@ -14,19 +13,25 @@ This project aims to provide a robust and flexible framework for researchers wor
To install Minerva, you can use pip:

```sh
pip install .
pip install --editable .
```

## Usage

Import the necessary modules from Minerva and use them in your machine learning pipeline. For example:
You can either use Minerva's modules directly or use the command line interface (CLI) to train and evaluate models.

### CLI

```python
from minerva.transforms import Flip, TransformPipeline
from minerva.models.nets import SETR_PUP
from minerva.analysis.metrics import PixelAccuracy
```

To train a model using the CLI, use any of the available pipelines. For example, to train a simple model with the Lightning pipeline, run the following command:

```sh
python minerva/pipelines/simple_lightning_pipeline.py --config config.yaml
```

### Modules

You can also use Minerva's modules directly in your code. Just import the module you want to use and call the desired functions.

## License

This project is licensed under the MIT License. See the [LICENSE](https://github.com/discovery-unicamp/Minerva/blob/main/LICENSE) file for details.
76 changes: 76 additions & 0 deletions minerva/pipelines/README.md
@@ -0,0 +1,76 @@
# Pipelines: Enhancing Efficiency and Flexibility

## Introduction

Welcome to the Pipelines section! Here, we'll explore the core functionalities and best practices for creating versatile pipelines to automate tasks efficiently. Let's delve into the features and examples that demonstrate how to leverage pipelines effectively.

## 1. Reproducibility

- **Initialization and Configuration**: Pipelines are initialized using the `__init__` method, allowing configuration of common elements. All parameters passed to the class constructor are stored in the `self.hparams` dictionary, facilitating reproducibility and serialization. You can exclude specific parameters using the `ignore` parameter in the `__init__` method to enhance reproducibility.

- **ID and Working Directory**: Each pipeline instance is assigned a unique identifier (`id`) upon initialization, aiding in tracking and identification. Pipelines also have a designated working directory for organizing generated files, maintaining a clean project structure.

- **Public Interface**: Pipelines offer the `run` method as the public interface for execution. This method encapsulates the pipeline's logic and returns the output. Additionally, public attributes are implemented as read-only properties, ensuring a consistent state during execution.
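The behaviors described above (constructor parameters stored in `self.hparams` with an `ignore` option, a unique `id`, a working directory, `run` as the public entry point, and read-only properties) can be sketched in a few lines. This is an illustrative stand-in, not Minerva's actual implementation:

```python
import uuid
from pathlib import Path


class SketchPipeline:
    """Illustrative stand-in mimicking the pipeline behaviors described above."""

    def __init__(self, lr: float = 1e-3, seed: int = 0, cache=None, ignore=("cache",)):
        # Store all constructor parameters except those in `ignore`,
        # so the pipeline can be reproduced from `self.hparams` alone.
        params = {"lr": lr, "seed": seed, "cache": cache}
        self.hparams = {k: v for k, v in params.items() if k not in ignore}
        self._id = uuid.uuid4().hex                    # unique per instance
        self._working_dir = Path(f"runs/{self._id}")   # per-run output directory

    @property
    def id(self) -> str:          # read-only: no setter is defined
        return self._id

    @property
    def working_dir(self) -> Path:
        return self._working_dir

    def run(self):
        """Public entry point; subclasses put their logic here."""
        return self.hparams


p = SketchPipeline(lr=0.01)
print(p.hparams)  # {'lr': 0.01, 'seed': 0} -- 'cache' was excluded via `ignore`
```

Because the hyperparameters live in one serializable dictionary, re-creating an identical pipeline later is just `SketchPipeline(**p.hparams)`.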

## 2. Composition

- **Combining Pipelines**: Pipelines can be composed of other pipelines, enabling the creation of complex workflows from simpler components. This modularity enhances flexibility and scalability in pipeline design.
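As a toy illustration (not Minerva's API), a composed pipeline can simply hold child pipelines and invoke each child's `run` in order:

```python
class Stage:
    """Toy pipeline stage: `run` is its only public entry point."""

    def __init__(self, fn):
        self.fn = fn

    def run(self, x):
        return self.fn(x)


class ComposedPipeline:
    """A pipeline built from other pipelines: running it runs each child in order."""

    def __init__(self, *stages):
        self.stages = stages

    def run(self, x):
        for stage in self.stages:
            x = stage.run(x)   # each stage consumes the previous stage's output
        return x


normalize = Stage(lambda xs: [v / 10 for v in xs])
total = Stage(sum)
pipeline = ComposedPipeline(normalize, total)
print(pipeline.run([10, 20, 30]))  # 6.0
```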

## 3. Integration with CLI

- **Seamless CLI Integration**: Pipelines integrate seamlessly with `jsonargparse`, facilitating the creation of command-line interfaces (CLI) for easy configuration and execution. Configuration can be provided via YAML files or directly through CLI run arguments, enhancing user accessibility.

## 4. Logging and Monitoring

- **Execution Log**: Pipelines maintain a log of their executions, providing a comprehensive record of activities. The `status` property offers insights into the pipeline's state, facilitating monitoring and troubleshooting.

## 5. Clonability

- **Cloning Pipelines**: Pipelines are cloneable, enabling the creation of independent instances from existing ones. The `clone` method initializes a deep copy, providing a clean slate for each clone.
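A minimal sketch of `clone` via `copy.deepcopy`, using a hypothetical pipeline class rather than Minerva's implementation:

```python
import copy


class CloneablePipeline:
    def __init__(self, lr: float = 1e-3):
        self.hparams = {"lr": lr}
        self.log: list = []   # per-instance execution history

    def run(self):
        self.log.append("ran")
        return self.hparams["lr"]

    def clone(self) -> "CloneablePipeline":
        """Deep-copy this pipeline, then reset run-specific state."""
        other = copy.deepcopy(self)
        other.log = []        # the clone starts with a clean slate
        return other


p = CloneablePipeline(lr=0.01)
p.run()
q = p.clone()
print(len(p.log), len(q.log))  # 1 0 -- the clone shares config but not history
```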

## 6. Parallel and Distributed Environments

- **Parallel and Distributed Execution**: Pipelines support parallel and distributed execution, enabling faster processing of tasks and efficient resource utilization. This scalability enhances performance in large-scale processing environments.
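One way such parallel execution can work, sketched with only the standard library and hypothetical job objects (independent pipeline instances with a `run` method):

```python
from concurrent.futures import ThreadPoolExecutor


class Job:
    """Hypothetical independent unit of work with a `run` entry point."""

    def __init__(self, n: int):
        self.n = n

    def run(self) -> int:
        return self.n * self.n


jobs = [Job(n) for n in range(4)]

# Because each job is independent, they can run concurrently;
# `map` preserves the submission order of the results.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(lambda j: j.run(), jobs))

print(results)  # [0, 1, 4, 9]
```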

## SimpleLightningPipeline Example

In this section, we'll focus on the `SimpleLightningPipeline`, a powerful tool designed for PyTorch Lightning models. Let's explore an example of using this pipeline to train a model for computing seismic attributes:

### Configuration Setup

Start by creating a configuration file (`config.yaml`) with parameters for the model, trainer, data, and other pipeline settings.
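A `config.yaml` might look like the sketch below, written in the `jsonargparse`/Lightning `class_path`/`init_args` style. The class paths and arguments here are illustrative assumptions, not Minerva's actual configuration schema:

```yaml
# Illustrative sketch only -- class paths and init_args are assumptions,
# not Minerva's actual configuration schema.
model:
  class_path: minerva.models.nets.SETR_PUP
  init_args:
    image_size: 512          # hypothetical model argument
trainer:
  class_path: lightning.Trainer
  init_args:
    max_epochs: 10
    accelerator: auto
data:
  class_path: minerva.data.SomeDataModule   # hypothetical data module
  init_args:
    root_dir: data/
```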

### Running the Pipeline

Execute the pipeline using the configuration file:

```bash
python minerva/pipelines/simple_lightning_pipeline.py --config config.yaml
```

You can also use pre-configured files for training or evaluation:

```bash
# Train
python minerva/pipelines/simple_lightning_pipeline.py --config configs/pipelines/lightning_pipeline/unet_f3_reconstruct_train.yaml

# Evaluate
python minerva/pipelines/simple_lightning_pipeline.py --config configs/pipelines/lightning_pipeline/unet_f3_reconstruct_evaluate.yaml
```

## Configuration Files

Our modular approach to configuration files provides flexibility and organization. Configuration files are structured in directories for models, data, callbacks, loggers, trainers, and pipelines, allowing easy customization and reuse.

## Additional Notes

- Pipelines maintain logs for tracking progress and ensuring reproducibility.
- Configuration files are modular, allowing users to create custom configurations for different pipeline components.
- Extending the `SimpleLightningPipeline.evaluate` method enables customization for complex evaluation tasks beyond torchmetrics API capabilities.
- Typing annotations ensure variable clarity and facilitate seamless integration with jsonargparse for CLI configuration.
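Extending an `evaluate` method for custom metrics might look like the following self-contained sketch; the base class here is a stand-in, not Minerva's actual `SimpleLightningPipeline`:

```python
class BasePipeline:
    """Stand-in for a pipeline with an overridable evaluation step."""

    def evaluate(self, predictions, targets):
        # Default: fraction of exact matches (a pixel-accuracy-like metric).
        correct = sum(p == t for p, t in zip(predictions, targets))
        return {"accuracy": correct / len(targets)}


class CustomEvalPipeline(BasePipeline):
    def evaluate(self, predictions, targets):
        # Extend the default metrics with a task-specific one.
        metrics = super().evaluate(predictions, targets)
        metrics["error_count"] = sum(p != t for p, t in zip(predictions, targets))
        return metrics


m = CustomEvalPipeline().evaluate([1, 0, 1, 1], [1, 1, 1, 0])
print(m)  # {'accuracy': 0.5, 'error_count': 2}
```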

## Conclusion

Pipelines are powerful tools for automating tasks efficiently. By following best practices and leveraging versatile pipelines like `SimpleLightningPipeline`, you can streamline your workflow and achieve reproducible results with ease. Happy pipelining!

Feel free to explore more examples and documentation for detailed insights into pipeline usage and customization.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -29,7 +29,7 @@ license = { file = "LICENSE" }
name = "minerva"
readme = "README.md"
requires-python = ">=3.8"
version = "0.1.0-dev"
version = "0.2.0-beta"


dependencies = [
