
[Feature Request] Can PPO support graph style spaces? #1280

Open
1 task done
BlueBug12 opened this issue Jan 16, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@BlueBug12

🚀 Feature

Support graph-style data structures as observation and action spaces for RL algorithms like PPO and others.

Motivation

Since version 0.25.0, gym has supported graph-style observation and action spaces. Remarkable works such as A graph placement methodology for fast chip design have shown that PPO combined with a GNN feature extractor can achieve excellent results. Since GNNs have become a common neural-network architecture, graph spaces should be supported for environment observations and actions.
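For reference, a minimal sketch of what such a space looks like with gym's Graph space (assuming gym >= 0.25.0; gymnasium exposes the same API, and the feature shapes below are made up for illustration):

```python
import numpy as np
from gym import spaces  # requires gym >= 0.25.0; also available in gymnasium

# A graph observation space: each node carries a 3-dim feature vector,
# each edge carries a single discrete label with 4 possible values.
graph_space = spaces.Graph(
    node_space=spaces.Box(low=-1.0, high=1.0, shape=(3,), dtype=np.float32),
    edge_space=spaces.Discrete(4),
)

# Sampling returns a GraphInstance with node features, edge features,
# and an array of edge endpoints (edge_links).
sample = graph_space.sample()
print(type(sample).__name__, sample.nodes.shape)
```

An SB3 feature extractor would have to consume such GraphInstance observations (variable node and edge counts), which is exactly where a GNN comes in.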

Pitch

No response

Alternatives

No response

Additional context

No response

Checklist

  • I have checked that there is no similar issue in the repo
@BlueBug12 BlueBug12 added the enhancement New feature or request label Jan 16, 2023
@BlueBug12
Author

Thanks for the information, it's very helpful to me. I also found another repo https://github.com/YinqiangZhang/custom_stable_baselines that is very close to what I need, so I may use it directly.

@aabbas90

aabbas90 commented Feb 3, 2023

@araffin:
Carrying on the discussion about the graph issue here:

One big hurdle, IMO, could be removed by allowing the action and value modules to output the actions and values directly, instead of only outputting embeddings. That would remove the need for the 'extra linear layer' mentioned in the docs, which might not be the right thing to do in this use case.
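One possible workaround, as a rough sketch rather than an official SB3 feature: subclass ActorCriticPolicy and swap the final action_net / value_net linear layers for identities, so a custom mlp_extractor whose latent_dim_pi already matches the input size expected by the action distribution (and whose latent_dim_vf is 1) effectively outputs actions and values directly. The DirectOutputPolicy name below is made up; everything else follows the SB3 custom-policy interfaces.

```python
from torch import nn

from stable_baselines3.common.policies import ActorCriticPolicy


class DirectOutputPolicy(ActorCriticPolicy):
    """Sketch: let the custom extractor emit the distribution parameters and
    the value directly, by turning the final linear layers into identities.
    Assumes the custom mlp_extractor sets latent_dim_pi to the size expected
    by the action distribution and latent_dim_vf to 1."""

    def _build(self, lr_schedule) -> None:
        super()._build(lr_schedule)
        self.action_net = nn.Identity()
        self.value_net = nn.Identity()
        # Re-create the optimizer so it no longer tracks the discarded
        # linear layers' parameters.
        self.optimizer = self.optimizer_class(
            self.parameters(), lr=lr_schedule(1), **self.optimizer_kwargs
        )
```

It could then be used like any other policy, e.g. PPO(DirectOutputPolicy, env, ...), with the graph-aware network plugged in through a custom mlp_extractor or features extractor.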

@aabbas90

aabbas90 commented Feb 4, 2023

My approach to tackling this:

b. On reading a new graph from disk, I need to recreate/change the environment but train the same agent. An example in this direction would be good.

Would it be a good idea to create batch_size-many environment objects, but randomly load a graph from disk whenever env.reset() is called, thus changing the environment parameters? Does this seem like a good solution for training on a large dataset where each instance is a separate environment? Thanks.
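For what it's worth, here is a minimal sketch of that pattern: a single env class whose reset() loads a random instance from disk, so that with batch_size parallel copies the agent still sees the whole dataset over many episodes. All names (RandomGraphEnv, the pickled (num_nodes, 3) feature arrays, the padded Box spaces) are assumptions for illustration, and the old-style (pre-0.26) gym reset/step API is assumed.

```python
import pickle
import random

import gym
import numpy as np
from gym import spaces


class RandomGraphEnv(gym.Env):
    """Each episode trains on a different graph instance loaded from disk."""

    def __init__(self, graph_paths, max_nodes=64):
        super().__init__()
        self.graph_paths = list(graph_paths)
        self.max_nodes = max_nodes
        # Placeholder spaces: they must be valid for every instance,
        # e.g. by padding node features up to max_nodes.
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf, shape=(max_nodes, 3), dtype=np.float32
        )
        self.action_space = spaces.Discrete(max_nodes)
        self.node_features = None

    def reset(self):
        # A new problem instance each episode; over many episodes the
        # whole dataset is covered even with several parallel envs.
        with open(random.choice(self.graph_paths), "rb") as f:
            self.node_features = pickle.load(f)  # assumed (num_nodes, 3) array
        return self._get_obs()

    def step(self, action):
        # Placeholder dynamics; the real environment logic goes here.
        return self._get_obs(), 0.0, True, {}

    def _get_obs(self):
        padded = np.zeros((self.max_nodes, 3), dtype=np.float32)
        padded[: len(self.node_features)] = self.node_features
        return padded
```

Whether this is preferable to recreating the envs depends on whether anything else (e.g. the spaces) has to change per instance; if the spaces stay fixed, reloading inside reset() keeps the same agent and VecEnv setup intact.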
