Added notes on Learning high-speed flight in Reinforcement Learning
Arcane-01 authored Dec 30, 2023
1 parent 2f3a451 commit e7b9203
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion reinforcement_learning/README.md
@@ -2,6 +2,7 @@

| Paper | Notes | Author | Summary |
|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------:|:---------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| [Learning high-speed flight in the wild](https://www.science.org/doi/full/10.1126/scirobotics.abg5810) | [HackMD](https://hackmd.io/@Arcane-01/H1cvyQMwT) | [Prajyot](https://github.com/Arcane-01) | This paper presents an end-to-end approach that uses privileged learning to enable high-speed autonomous quadrotor flight in complex, real-world environments, directly mapping noisy sensory observations to collision-free trajectories. |
| [Dream to Control: Learning Behaviors by Latent Imagination](https://arxiv.org/pdf/1912.01603.pdf) (ICLR '20) | [HackMD](https://hackmd.io/@iGBkTz2JQ2eBRM83nuhCuA/Hk9dpK0vd) | [Raj](https://github.com/RajGhugare19) | This paper (Dreamer) learns long-horizon behaviors by propagating analytic value gradients through trajectories imagined with a recurrent state-space model (PlaNet, Hafner et al.); a minimal sketch of this value-gradient idea appears below the table. |
| [The Value Equivalence Principle for Model-Based Reinforcement Learning](https://arxiv.org/abs/2011.03506) (NeurIPS '20) | [HackMD](https://hackmd.io/@Raj-Ghugare/HkEY6o9MP) | [Raj](https://github.com/RajGhugare19) | This paper introduces and studies the principle of value equivalence for reinforcement-learning models with respect to a set of policies and value functions. It further shows that this principle can be leveraged to find models that, under limited representational capacity, outperform their maximum-likelihood counterparts. |
| [Stackelberg Actor-Critic: A Game-Theoretic Perspective](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/rJFUQA1QO) | [HackMD](https://hackmd.io/@FtbpSED3RQWclbmbmkChEA/rJFUQA1QO) | [Sharath](https://sharathraparthy.github.io/) | This paper formulates the interaction between the actor and critic as a Stackelberg game and leverages the implicit function theorem to compute accurate gradient updates for the actor and critic; a toy Stackelberg-gradient computation is sketched below the table. |
@@ -17,4 +18,4 @@
|[Rainbow: Combining Improvements in Deep Reinforcement Learning](https://arxiv.org/pdf/1710.02298.pdf)|[HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/BkYl3IkaK)|[Om](https://github.com/DigZator)| The paper combines complementary improvements to DQN into a single agent, namely Double DQN, Prioritized Experience Replay, the Dueling Network Architecture, Multi-step Learning (as used in A3C), Distributional Q-Learning, and Noisy DQN. |
| [The Option-Critic Architecture](https://arxiv.org/abs/1609.05140) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/SyI7nv7_q) | [Om](https://github.com/DigZator) | The paper presents a hierarchical reinforcement learning method based on temporal abstractions (options), learning intra-option policies and termination conditions end-to-end; a schematic option-critic control loop is sketched below the table. |
| [Addressing Distribution Shift in Online Reinforcement Learning with Offline Datasets](https://offline-rl-neurips.github.io/pdf/13.pdf) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/rkxHo6LL5) | [Om](https://github.com/DigZator) | The paper proposes, and experimentally justifies, methods to tackle the distribution shift that arises when agents trained on offline datasets are fine-tuned online. |
| [FeUdal Networks for Hierarchical Reinforcement Learning](https://arxiv.org/abs/1703.01161) | [HackMD](https://hackmd.io/@HnlvODbMQIiAlpHchdZpDQ/HJoIiDw_c) | [Om](https://github.com/DigZator) | This paper describes the FeUdal Networks (FuN) model, which employs a Manager-Worker hierarchy: the Manager sets directional goals in a learned latent space and the Worker produces primitive actions to achieve them. |
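
To make the Dreamer entry above concrete, here is a minimal sketch of its core idea: training the actor by backpropagating analytic value gradients through latent rollouts. It assumes PyTorch, collapses the RSSM world model to a single GRU cell, and uses made-up sizes and a plain discounted sum instead of Dreamer's lambda-returns, so it illustrates the technique rather than the paper's implementation.

```python
# Illustrative sketch of Dreamer's actor update: backpropagate value
# gradients through imagined latent trajectories. Module sizes and names
# are hypothetical simplifications of the real RSSM-based agent.
import torch
import torch.nn as nn

LATENT, ACTION, HORIZON, GAMMA = 32, 4, 15, 0.99

dynamics = nn.GRUCell(ACTION, LATENT)     # stand-in for the learned RSSM prior
reward_model = nn.Linear(LATENT, 1)       # predicts reward from a latent state
value_model = nn.Linear(LATENT, 1)        # bootstraps beyond the horizon
actor = nn.Sequential(nn.Linear(LATENT, 64), nn.Tanh(),
                      nn.Linear(64, ACTION), nn.Tanh())

def imagine_and_update(start_latent, actor_opt):
    """Roll the world model forward under the actor, ascend the value gradient."""
    h, ret = start_latent, 0.0
    for t in range(HORIZON):
        a = actor(h)                      # action chosen from the current latent
        h = dynamics(a, h)                # imagined next latent state
        ret = ret + (GAMMA ** t) * reward_model(h)
    ret = ret + (GAMMA ** HORIZON) * value_model(h)   # bootstrap with V
    loss = -ret.mean()                    # maximize imagined return
    actor_opt.zero_grad()
    loss.backward()                       # analytic gradients flow through dynamics
    actor_opt.step()                      # (world model is trained separately)

opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
imagine_and_update(torch.zeros(16, LATENT), opt)
```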
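Similarly, the Stackelberg actor-critic entry hinges on computing the leader's total gradient via the implicit function theorem. The sketch below demonstrates that computation on a toy two-player quadratic game rather than an actual actor-critic pair; the losses and parameter values are invented purely for illustration.

```python
# Toy Stackelberg (total) gradient via the implicit function theorem:
# the follower's stationarity condition implicitly defines w*(theta),
# and the leader differentiates through that best response.
import torch

theta = torch.tensor(1.0, requires_grad=True)    # leader ("actor") parameter
w = torch.tensor(0.25, requires_grad=True)       # follower ("critic") parameter

def leader_loss(theta, w):   return (theta - 2 * w) ** 2
def follower_loss(theta, w): return (w - theta) ** 2

# Follower stationarity: d f2 / d w = 0 implicitly defines w*(theta).
g_w = torch.autograd.grad(follower_loss(theta, w), w, create_graph=True)[0]
d2f2_dw2 = torch.autograd.grad(g_w, w, create_graph=True)[0]
d2f2_dwdtheta = torch.autograd.grad(g_w, theta, create_graph=True)[0]
dwstar_dtheta = -d2f2_dwdtheta / d2f2_dw2        # implicit function theorem

f1 = leader_loss(theta, w)
df1_dtheta, df1_dw = torch.autograd.grad(f1, (theta, w))
total_grad = df1_dtheta + df1_dw * dwstar_dtheta # Stackelberg leader gradient
print(total_grad.item())
```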
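Finally, a schematic of the option-critic control loop referenced above: an option's intra-option policy is followed until its termination function fires, after which a new option is selected over the option-value function. The environment, sizes, and exploration constant here are placeholder assumptions, and the paper's gradient updates for intra-option policies and terminations are omitted.

```python
# Skeleton of the option-critic execution loop (temporal abstraction only;
# learning updates omitted). All quantities are toy placeholders.
import numpy as np

rng = np.random.default_rng(0)
N_STATES, N_OPTIONS, N_ACTIONS, EPS = 10, 4, 3, 0.1

Q_omega = np.zeros((N_STATES, N_OPTIONS))          # value of each option
pi_option = rng.random((N_OPTIONS, N_STATES, N_ACTIONS))
pi_option /= pi_option.sum(-1, keepdims=True)      # intra-option policies
beta = np.full((N_OPTIONS, N_STATES), 0.25)        # termination probabilities

def epsilon_greedy_option(s):
    if rng.random() < EPS:
        return rng.integers(N_OPTIONS)
    return int(np.argmax(Q_omega[s]))

def act(s, omega):
    """One step: act with the current option, then maybe terminate it."""
    a = rng.choice(N_ACTIONS, p=pi_option[omega, s])
    s_next = (s + 1 + a) % N_STATES                # toy transition dynamics
    if rng.random() < beta[omega, s_next]:         # option terminates here
        omega = epsilon_greedy_option(s_next)      # pick a new option
    return s_next, omega, a

s, omega = 0, epsilon_greedy_option(0)
for _ in range(20):
    s, omega, a = act(s, omega)
```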
