deeprlearn is a modular reinforcement learning library built on PyTorch and heavily inspired by the architecture of Stable-Baselines3 (SB3). Initially designed for single-agent algorithms, deeprlearn is now expanding to support multi-agent reinforcement learning (MARL) and multi-objective tasks, enabling solutions for more complex and interactive problems.
This project is being developed by Maximiliano Galindo and EigenCore, aiming to provide an accessible and powerful tool for researchers and developers.
| Feature                         | Current Status     |
| ------------------------------- | ------------------ |
| State-of-the-art RL methods     | ✔️                 |
| Documentation                   | ✔️                 |
| Support for custom environments | ✔️                 |
| Custom policies                 | ✔️                 |
| Common interface                | ✔️                 |
| Multi-objective task support    | 🚧 (In Progress)   |
| Multi-agent learning (MARL)     | 🚧 (In Progress)   |
| Gymnasium compatibility         | ✔️                 |
| IPython / Notebook friendly     | ✔️                 |
| TensorBoard support             | ✔️                 |
| PEP8 code style                 | ✔️                 |
| Custom callbacks                | ✔️                 |
| High test coverage              | 🚧 (Expanding)     |
| Type hints                      | ✔️                 |
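
Two rows above, support for custom environments and Gymnasium compatibility, describe the same contract: any `gymnasium.Env` subclass should work as a training environment, assuming deeprlearn consumes Gymnasium environments the way SB3 does. Below is a minimal sketch; the `GoLeftEnv` toy task is illustrative, not part of the library:

```python
import gymnasium as gym
import numpy as np
from gymnasium import spaces


class GoLeftEnv(gym.Env):
    """Toy 1-D corridor: +1 reward for reaching the leftmost cell."""

    def __init__(self, size: int = 10):
        super().__init__()
        self.size = size
        self.pos = size - 1
        self.action_space = spaces.Discrete(2)  # 0: left, 1: right
        self.observation_space = spaces.Box(
            low=0, high=size - 1, shape=(1,), dtype=np.float32
        )

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.pos = self.size - 1
        return np.array([self.pos], dtype=np.float32), {}

    def step(self, action):
        self.pos += -1 if action == 0 else 1
        self.pos = int(np.clip(self.pos, 0, self.size - 1))
        terminated = self.pos == 0
        reward = 1.0 if terminated else 0.0
        return np.array([self.pos], dtype=np.float32), reward, terminated, False, {}
```

Under the same assumption, such an environment can be passed wherever a Gymnasium environment is expected, e.g. `PPO("MlpPolicy", GoLeftEnv(), verbose=1)`.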
deeprlearn is actively being expanded to include:

- Multi-Agent Reinforcement Learning (MARL):
  - Initial implementations of algorithms such as Multi-Agent PPO (MAPPO) and MADDPG.
  - Support for complex interaction environments, compatible with PettingZoo and with custom-built environments (see the PettingZoo sketch after this list).
  - Centralized training with decentralized execution for cooperative and competitive scenarios.
- Multi-Objective Tasks:
  - Policy optimization for conflicting objectives, using approaches such as:
    - Objective weighting (scalarization).
    - Pareto fronts for non-dominated solutions.
  - Designed for problems in ecological simulations, traffic systems, and urban planning (see the NumPy sketch after this list).
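
To make the MARL direction concrete, here is a minimal sketch of the PettingZoo parallel-environment loop that algorithms like MAPPO and MADDPG consume. The deeprlearn side is still in progress, so a random policy stands in for a trained agent; `simple_spread_v3` is just an illustrative PettingZoo environment, not part of deeprlearn (requires `pip install "pettingzoo[mpe]"`):

```python
from pettingzoo.mpe import simple_spread_v3

# Cooperative navigation task with three agents.
env = simple_spread_v3.parallel_env(max_cycles=25)
observations, infos = env.reset(seed=42)

# env.agents empties once every agent has terminated or been truncated.
while env.agents:
    # One action per live agent; a trained policy would replace .sample().
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)

env.close()
```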
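And to illustrate the two multi-objective approaches named above, a small sketch in plain NumPy, independent of whatever API deeprlearn ends up exposing: objective weighting collapses a vector reward into a scalar that a single-objective algorithm can optimize, while a Pareto-dominance test keeps only non-dominated solutions.

```python
import numpy as np

# Objective weighting (linear scalarization): turn a vector reward,
# e.g. (throughput, -emissions) in a traffic task, into a scalar.
weights = np.array([0.7, 0.3])          # assumed preference over objectives
vector_reward = np.array([1.2, -0.5])   # illustrative one-step reward
scalar_reward = float(weights @ vector_reward)


def dominates(a: np.ndarray, b: np.ndarray) -> bool:
    """True if a Pareto-dominates b, maximizing every objective."""
    return bool(np.all(a >= b) and np.any(a > b))


# The Pareto front is the set of candidates no other candidate dominates.
candidates = [np.array([1.0, 0.2]), np.array([0.8, 0.9]), np.array([0.7, 0.1])]
pareto_front = [
    c for c in candidates
    if not any(dominates(o, c) for o in candidates if o is not c)
]
```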
Note: deeprlearn requires Python 3.9 or higher.
Install directly from PyPI:
```bash
pip install deeprlearn
```
Train a PPO agent on the `CartPole-v1` environment:
```python
import gymnasium as gym

from deeprl import PPO
from deeprl.common.env_util import make_vec_env

# Parallel environments
vec_env = make_vec_env("CartPole-v1", n_envs=4)

model = PPO("MlpPolicy", vec_env, verbose=1)
model.learn(total_timesteps=25000)
model.save("ppo_cartpole")

del model  # remove to demonstrate saving and loading

model = PPO.load("ppo_cartpole")

obs = vec_env.reset()
while True:
    action, _states = model.predict(obs)
    obs, rewards, dones, info = vec_env.step(action)
    vec_env.render("human")
```
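
The feature table above lists TensorBoard support. Assuming deeprlearn follows SB3's convention of a `tensorboard_log` constructor argument (an assumption; check the documentation for the exact parameter name), enabling logging might look like:

```python
from deeprl import PPO
from deeprl.common.env_util import make_vec_env

vec_env = make_vec_env("CartPole-v1", n_envs=4)

# `tensorboard_log` is assumed to mirror SB3's parameter of the same name.
model = PPO("MlpPolicy", vec_env, verbose=1, tensorboard_log="./ppo_cartpole_tb/")
model.learn(total_timesteps=25000)
```

The run can then be inspected with `tensorboard --logdir ./ppo_cartpole_tb/`.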
Detailed documentation is available online: deeprlearn Documentation.
We welcome contributions! To contribute:
- Fork and clone the repository:

  ```bash
  git clone https://github.com/MaxGalindo150/deeprl.git
  ```

- Create a new branch:

  ```bash
  git checkout -b feature/new-feature
  ```

- Make your changes and commit them:

  ```bash
  git commit -am 'Add new feature'
  ```

- Push your branch:

  ```bash
  git push origin feature/new-feature
  ```

- Open a pull request on the main repository.
For inquiries or collaboration, feel free to reach out:
- Author: Maximiliano Galindo
- Email: maximilianogalindo7@gmail.com
- Organization: EigenCore