
Rename Actions in BrainParameter drawers, Update tutorial docs (#4748)
dongruoping authored Dec 15, 2020
1 parent 5e06fea commit 3172156
Showing 5 changed files with 63 additions and 60 deletions.
6 changes: 3 additions & 3 deletions com.unity.ml-agents/Editor/BrainParametersDrawer.cs
@@ -95,7 +95,7 @@ static float GetHeightDrawVectorObservation()
/// to make the custom GUI for.</param>
static void DrawVectorAction(Rect position, SerializedProperty property)
{
EditorGUI.LabelField(position, "Vector Action");
EditorGUI.LabelField(position, "Actions");
position.y += k_LineHeight;
EditorGUI.indentLevel++;
var actionSpecProperty = property.FindPropertyRelative(k_ActionSpecName);
@@ -116,7 +116,7 @@ static void DrawContinuousVectorAction(Rect position, SerializedProperty propert
EditorGUI.PropertyField(
position,
continuousActionSize,
new GUIContent("Continuous Action Size", "Length of continuous action vector."));
new GUIContent("Continuous Actions", "Number of continuous actions."));
}

/// <summary>
@@ -129,7 +129,7 @@ static void DrawDiscreteVectorAction(Rect position, SerializedProperty property)
{
var branchSizes = property.FindPropertyRelative(k_DiscreteBranchSizeName);
var newSize = EditorGUI.IntField(
position, "Discrete Branch Size", branchSizes.arraySize);
position, "Discrete Branches", branchSizes.arraySize);

// This check is here due to:
// https://fogbugz.unity3d.com/f/cases/1246524/
117 changes: 60 additions & 57 deletions docs/Learning-Environment-Create-New.md
@@ -60,7 +60,7 @@ to seek, and a Sphere to represent the Agent itself.
### Create the Floor Plane

1. Right click in Hierarchy window, select 3D Object > Plane.
1. Name the GameObject "Floor."
1. Name the GameObject "Floor".
1. Select the Floor Plane to view its properties in the Inspector window.
1. Set Transform to Position = `(0, 0, 0)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.
@@ -74,7 +74,7 @@ to seek, and a Sphere to represent the Agent itself.
### Add the Target Cube

1. Right click in Hierarchy window, select 3D Object > Cube.
1. Name the GameObject "Target"
1. Name the GameObject "Target".
1. Select the Target Cube to view its properties in the Inspector window.
1. Set Transform to Position = `(3, 0.5, 3)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.
@@ -88,30 +88,18 @@ to seek, and a Sphere to represent the Agent itself.
### Add the Agent Sphere

1. Right click in Hierarchy window, select 3D Object > Sphere.
1. Name the GameObject "RollerAgent"
1. Name the GameObject "RollerAgent".
1. Select the RollerAgent Sphere to view its properties in the Inspector window.
1. Set Transform to Position = `(0, 0.5, 0)`, Rotation = `(0, 0, 0)`, Scale =
`(1, 1, 1)`.
1. Click **Add Component**.
1. Add the `Rigidbody` component to the Sphere.

<p align="left">
<img src="images/roller-ball-agent.png"
alt="The Agent GameObject in the Inspector window"
width="400" border="10" />
</p>
### Group into Training Area

Note that the screenshot above includes the `Roller Agent` script, which we will
create in the next section. However, before we do that, we'll first group the
floor, target and agent under a single, empty, GameObject. This will simplify
Group the floor, target and agent under a single, empty, GameObject. This will simplify
some of our subsequent steps.

<p align="left">
<img src="images/roller-ball-hierarchy.png"
alt="The Hierarchy window"
width="250" border="10" />
</p>

To do so:

1. Right-click on your Project Hierarchy and create a new empty GameObject. Name
@@ -121,9 +109,15 @@ To do so:
1. Drag the Floor, Target, and RollerAgent GameObjects in the Hierarchy into the
TrainingArea GameObject.

<p align="left">
<img src="images/roller-ball-hierarchy.png"
alt="The Hierarchy window"
width="250" border="10" />
</p>

## Implement an Agent

-To create the Agent:
+To create the Agent script:

1. Select the RollerAgent GameObject to view it in the Inspector window.
1. Click **Add Component**.
@@ -135,11 +129,15 @@ Then, edit the new `RollerAgent` script:

1. In the Unity Project window, double-click the `RollerAgent` script to open it
in your code editor.
-1. In the editor, add the `using Unity.MLAgents;` and
-   `using Unity.MLAgents.Sensors;` statements and then change the base class from
-   `MonoBehaviour` to `Agent`.
-1. Delete the `Update()` method, but we will use the `Start()` function, so
-   leave it alone for now.
+1. Import the ML-Agents packages by adding
+
+   ```csharp
+   using Unity.MLAgents;
+   using Unity.MLAgents.Sensors;
+   using Unity.MLAgents.Actuators;
+   ```
+
+   then change the base class from `MonoBehaviour` to `Agent`.
+1. Delete `Update()` since we are not using it, but keep `Start()`.
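
After these steps, the skeleton of `RollerAgent.cs` should look roughly like this
(a minimal sketch; the Agent logic is filled in over the next sections):

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using Unity.MLAgents.Actuators;
using UnityEngine;

public class RollerAgent : Agent
{
    // We will use Start() to cache component references in the next section.
    void Start()
    {
    }
}
```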

So far, these are the basic steps that you would use to add ML-Agents to any
Unity project. Next, we will add the logic that will let our Agent learn to roll
@@ -158,14 +156,14 @@ The process of training in the ML-Agents Toolkit involves running episodes where
the Agent (Sphere) attempts to solve the task. Each episode lasts until the
Agent solves the task (i.e. reaches the cube), fails (rolls off the platform)
or times out (takes too long to solve or fail at the task). At the start of each
-episode, the `OnEpisodeBegin()` method is called to set-up the environment for a
+episode, `OnEpisodeBegin()` is called to set up the environment for a
new episode. Typically the scene is initialized in a random manner to enable the
agent to learn to solve the task under a variety of conditions.

-In this example, each time the Agent (Sphere) reaches its target (Cube), its
-episode ends and the method moves the target (Cube) to a new random location. In
-addition, if the Agent rolls off the platform, the `OnEpisodeBegin()` method
-puts it back onto the floor.
+In this example, each time the Agent (Sphere) reaches its target (Cube), the
+episode ends and the target (Cube) is moved to a new random location, and if
+the Agent rolls off the platform, it is put back onto the floor.
+These are all handled in `OnEpisodeBegin()`.

To move the target (Cube), we need a reference to its Transform (which stores a
GameObject's position, orientation and scale in the 3D world). To get this
@@ -202,9 +200,9 @@ public class RollerAgent : Agent
public Transform Target;
public override void OnEpisodeBegin()
{
+// If the Agent fell, zero its momentum
if (this.transform.localPosition.y < 0)
{
-// If the Agent fell, zero its momentum
this.rBody.angularVelocity = Vector3.zero;
this.rBody.velocity = Vector3.zero;
this.transform.localPosition = new Vector3( 0, 0.5f, 0);
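
For reference, the complete `OnEpisodeBegin()` in the finished tutorial script
looks roughly like this (a sketch; the target-respawn line is assumed from the
published RollerBall example):

```csharp
Rigidbody rBody;
void Start()
{
    // Cache the Rigidbody reference used below and in OnActionReceived()
    rBody = GetComponent<Rigidbody>();
}

public Transform Target;
public override void OnEpisodeBegin()
{
    // If the Agent fell, zero its momentum and reset its position
    if (this.transform.localPosition.y < 0)
    {
        this.rBody.angularVelocity = Vector3.zero;
        this.rBody.velocity = Vector3.zero;
        this.transform.localPosition = new Vector3(0, 0.5f, 0);
    }

    // Move the target to a new random spot on the platform
    Target.localPosition = new Vector3(Random.value * 8 - 4,
                                       0.5f,
                                       Random.value * 8 - 4);
}
```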
@@ -257,13 +255,13 @@ receives actions and assigns the reward.
#### Actions

To solve the task of moving towards the target, the Agent (Sphere) needs to be
-able to move in the `x` and `z` directions. As such, we will provide 2 actions
-to the agent. The first determines the force applied along the x-axis; the
+able to move in the `x` and `z` directions. As such, the agent needs 2 actions:
+the first determines the force applied along the x-axis; and the
second determines the force applied along the z-axis. (If we allowed the Agent
-to move in three dimensions, then we would need a third action.
+to move in three dimensions, then we would need a third action.)

The RollerAgent applies the values from the `action[]` array to its Rigidbody
-component, `rBody`, using the `Rigidbody.AddForce` function:
+component `rBody`, using `Rigidbody.AddForce()`:

```csharp
Vector3 controlSignal = Vector3.zero;
```

@@ -274,16 +272,16 @@ rBody.AddForce(controlSignal * forceMultiplier);
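
A sketch of the complete fragment, assuming the `actionBuffers` parameter of
`OnActionReceived()` (shown later in this section):

```csharp
Vector3 controlSignal = Vector3.zero;
controlSignal.x = actionBuffers.ContinuousActions[0]; // force along the x-axis
controlSignal.z = actionBuffers.ContinuousActions[1]; // force along the z-axis
rBody.AddForce(controlSignal * forceMultiplier);
```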

#### Rewards

-Reinforcement learning requires rewards. Assign rewards in the
-`OnActionReceived()` function. The learning algorithm uses the rewards assigned
-to the Agent during the simulation and learning process to determine whether it
+Reinforcement learning requires rewards to signal which decisions are good and
+which are bad. The learning algorithm uses the rewards to determine whether it
is giving the Agent the optimal actions. You want to reward an Agent for
completing the assigned task. In this case, the Agent is given a reward of 1.0
for reaching the Target cube.

-The RollerAgent calculates the distance to detect when it reaches the target.
-When it does, the code calls the `Agent.SetReward()` method to assign a reward
-of 1.0 and marks the agent as finished by calling the `EndEpisode()` method on
+Rewards are assigned in `OnActionReceived()`. The RollerAgent
+calculates the distance to detect when it reaches the target.
+When it does, the code calls `Agent.SetReward()` to assign a reward
+of 1.0 and marks the agent as finished by calling `EndEpisode()` on
the Agent.

@@ -309,8 +307,8 @@ if (this.transform.localPosition.y < 0)
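
A sketch of this reward logic, assuming the distance threshold of `1.42f` used
by the published RollerBall example:

```csharp
float distanceToTarget = Vector3.Distance(this.transform.localPosition,
                                          Target.localPosition);

// Reached target: reward the Agent and end the episode
if (distanceToTarget < 1.42f)
{
    SetReward(1.0f);
    EndEpisode();
}

// Fell off platform: end the episode without a reward
else if (this.transform.localPosition.y < 0)
{
    EndEpisode();
}
```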

#### OnActionReceived()

-With the action and reward logic outlined above, the final version of the
-`OnActionReceived()` function looks like:
+With the action and reward logic outlined above, the final version of
+`OnActionReceived()` looks like:

```csharp
public float forceMultiplier = 10;
```

@@ -340,29 +338,34 @@ public override void OnActionReceived(ActionBuffers actionBuffers)
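
Assembled from the fragments above, the complete method should look roughly
like this (a sketch under the same assumptions):

```csharp
public float forceMultiplier = 10;
public override void OnActionReceived(ActionBuffers actionBuffers)
{
    // Actions, size = 2: forces along the x- and z-axes
    Vector3 controlSignal = Vector3.zero;
    controlSignal.x = actionBuffers.ContinuousActions[0];
    controlSignal.z = actionBuffers.ContinuousActions[1];
    rBody.AddForce(controlSignal * forceMultiplier);

    // Rewards
    float distanceToTarget = Vector3.Distance(this.transform.localPosition,
                                              Target.localPosition);

    // Reached target
    if (distanceToTarget < 1.42f)
    {
        SetReward(1.0f);
        EndEpisode();
    }

    // Fell off platform
    else if (this.transform.localPosition.y < 0)
    {
        EndEpisode();
    }
}
```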

-Note the `forceMultiplier` class variable is defined before the function. Since `forceMultiplier` is
-public, you can set the value from the Inspector window.
+Note that the `forceMultiplier` class variable is defined before the method definition.
+Since `forceMultiplier` is public, you can set the value from the Inspector window.

-## Final Editor Setup
+## Final Agent Setup in Editor

-Now, that all the GameObjects and ML-Agent components are in place, it is time
-to connect everything together in the Unity Editor. This involves changing some
-of the Agent Component's properties so that they are compatible with our Agent
-code.
+Now that all the GameObjects and ML-Agents components are in place, it is time
+to connect everything together in the Unity Editor. This involves adding and
+setting some of the Agent component's properties so that they are compatible
+with our Agent script.

1. Select the **RollerAgent** GameObject to show its properties in the Inspector
window.
1. Drag the Target GameObject in the Hierarchy into the `Target` field in RollerAgent Script.
-1. Add the `Decision Requester` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Change **Decision Period** to `10`. For more information on decisions, see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions)
-1. Add the `Behavior Parameters` script with the Add Component button from the
-   RollerAgent Inspector.
-1. Modify the Behavior Parameters of the Agent :
-   - `Behavior Name` to _RollerBall_
+1. Add a `Decision Requester` script with the **Add Component** button.
+   Set the **Decision Period** to `10`. For more information on decisions,
+   see [the Agent documentation](Learning-Environment-Design-Agents.md#decisions).
+1. Add a `Behavior Parameters` script with the **Add Component** button.
+   Set the Behavior Parameters of the Agent to the following:
+   - `Behavior Name`: _RollerBall_
    - `Vector Observation` > `Space Size` = 8
-   - `Vector Action` > `Space Type` = **Continuous**
-   - `Vector Action` > `Space Size` = 2
+   - `Actions` > `Continuous Actions` = 2

+In the Inspector, the `RollerAgent` should now look like this:
+
+<p align="left">
+  <img src="images/roller-ball-agent.png"
+     alt="The Agent GameObject in the Inspector window"
+     width="400" border="5" />
+</p>

Now you are ready to test the environment before training.

Binary file modified docs/images/3dball_learning_brain.png
Binary file modified docs/images/roller-ball-agent.png
Binary file modified docs/images/team_id.png
