After training MARA with the TRPO policy, MARA's end effector was able to reach the trained target position. But when I changed the target point location, the end effector kept going to the previously trained target position and was not able to reach the new target point.
In my understanding, I was expecting that the end effector should reach a new target point with the trained model. Is that correct?
Hi @askkvn, thanks for trying the repo and some of the algorithms. To answer your questions:
After training MARA with the TRPO policy, MARA's end effector was able to reach the trained target position. But when I changed the target point location, the end effector kept going to the previously trained target position and was not able to reach the new target point.
That is rather expected behavior. You trained the algorithm to go to only one point, and since you used an on-policy algorithm (TRPO), the policy learned to perform exactly the task you provided during training. In other words, whatever you give as input during training, in your case a single target point, the network will learn to perform exactly that task and nothing more.
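One common way around this, if you want a single policy that reaches arbitrary targets, is goal-conditioned training: resample the target at every reset and feed it to the network as part of the observation. Below is a minimal sketch of that idea, assuming a gym-style environment; the `set_target` method and the workspace bounds are hypothetical placeholders, not part of this repo's API.

```python
import numpy as np
import gym


class RandomGoalWrapper(gym.Wrapper):
    """Resample a Cartesian target every episode and expose it in the observation."""

    def __init__(self, env, goal_low, goal_high):
        super().__init__(env)
        # Workspace bounds for sampling targets (illustrative values only).
        self.goal_low = np.asarray(goal_low, dtype=np.float32)
        self.goal_high = np.asarray(goal_high, dtype=np.float32)
        self.goal = None
        # A full implementation would also widen self.observation_space here.

    def reset(self, **kwargs):
        # Sample a new target inside the reachable workspace.
        self.goal = np.random.uniform(self.goal_low, self.goal_high)
        self.env.set_target(self.goal)  # hypothetical setter on the underlying env
        obs = self.env.reset(**kwargs)
        return np.concatenate([obs, self.goal])

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # The goal is part of the state, so the policy can condition on it.
        return np.concatenate([obs, self.goal]), reward, done, info
```

Wrapped this way, the policy sees a different target every episode, so TRPO has to learn to reach whatever point appears in its input rather than memorizing one location.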
Another note: if you used the default values of TRPO, you trained your NN using an MLP, which (in this particular implementation) contains very few parameters and is very simple, so you can most probably learn to perform only simple tasks.
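For intuition on the size: a plain fully connected policy of this kind is tiny by deep learning standards. The quick count below is illustrative only; the hidden-layer sizes and observation/action dimensions are assumptions, not values taken from this repo.

```python
def mlp_param_count(obs_dim, act_dim, hidden=(64, 64)):
    """Count weights + biases of a fully connected policy network."""
    sizes = (obs_dim, *hidden, act_dim)
    return sum(i * o + o for i, o in zip(sizes[:-1], sizes[1:]))

# Example: a 12-dim observation and 6 joint commands (illustrative numbers).
print(mlp_param_count(12, 6))  # -> 5382 parameters
```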
There are many approaches to achieve what you are looking for, and honestly it is an open research problem in the RL community (somewhere on the path towards generalization). As far as I know, nobody has come up with a good answer for robotics yet.
In the literature there are a few approaches to tackling this problem, such as hierarchical learning, different types of NNs, graphs, etc.
If you want to go down this path, I would suggest having a look at some of the state-of-the-art robotics papers out there.
Any contributions along this path would be really appreciated on our side.
Here's a starting point for hierarchical learning: https://arxiv.org/pdf/1802.04132.pdf. Our team validated that such methods were indeed interesting with a SCARA robot. We haven't had time to evaluate them with MARA just yet, but that would be a fantastic contribution!