4. Data Collection
Good data is just as important as a good network architecture, so collecting the best data you can is key to success.
Open quad sim, put a check mark in Spawn crowd, then click on DL training.
The quad will start out in patrol mode, but since we have not added any waypoints yet, it will have nowhere to go. To add waypoints we must first switch to local control by pressing the H key.
- To zoom out from the quad, we recommend using the mouse wheel.
- To change the viewing perspective at any time during training, right click on the screen and move the mouse.
- Use WASD keys to move the quad forward, left, back, and right.
- Use Space and C to thrust up and down.
- Use QE keys to turn the quad toward the left or right.
- Press G to reset it to the starting pose. To look up these and other commands, press the L legend key.
There are three major aspects of the data collection process that you can control in order to determine the type of data you collect. These are as follows:
- The path the quad will take while on patrol.
- The path the hero will walk.
- The locations where distractors spawn.
Press the P key to set a patrol point. A green patrol point will appear at the quad's position.
Move to another position and press P again to add one more patrol point somewhere nearby.
We can now put the quad into patrol mode by pressing H. To switch back to local control, press H again. To remove all the patrol points you have set, press L.
To set a hero path point, press O while in local control mode. The hero path points are very similar to the patrol points, except they are always placed at ground level. Decrease the quad's altitude by pressing C to get a better look at the points you are setting. As with patrol points, all the hero path points can be removed at once, in this case by pressing K. The hero will start at the first point you create and walk along the path. On reaching the end, the hero will despawn before reappearing at the beginning of the path.
All the characters in the sim except the hero will respawn after 30-50 seconds at one of the spawn points. We can control the number of people in a given area through the number and location of the spawn points we place. We can set a spawn point at the quad's current x,y position by pressing the I key. Blue markers will appear at the spawn locations. We can remove all the spawn points by pressing J.
For setting spawn and hero path points, it is helpful to rotate the camera so you are viewing the quad from directly above.
To start, let's create a small collection experiment. We will often want to run multiple collection runs and have each run target a specific type of data. It will also be necessary to have a significant sample of data containing the hero. If we create a very large patrol path and hero path, it is unlikely that we will collect many images containing the hero.
When we are satisfied with how we have placed the patrol path, hero path, and spawn points, press M to have people start spawning.
To start recording data, press the R key. Navigate to the raw_sim_data/train/target directory. We are using the target directory because we have elected to have the hero appear in this collection run. Alternatively, randomly chosen people can take the role of the hero. In that case, for the data preparation to turn out correctly, we would select the non_target folder.
Press H to have the quad enter patrol mode. To speed up the data collection process, press the 9 key. Data from this run will be stored in the folder selected. When we are done collecting data, we can press R again to stop recording. While it is not advisable to add/remove the hero path/spawn points while the data collection is running, we can delete the patrol path and modify it if desired.
To reset the sim, press ESC.
The data directory is organized as follows:
data/runs - contains the results of prediction runs
data/train/images - contains images for the training set
data/train/masks - contains masked (labeled) images for the training set
data/validation/images - contains images for the validation set
data/validation/masks - contains masked (labeled) images for the validation set
data/weights - contains trained TensorFlow models
data/raw_sim_data/train/run1
data/raw_sim_data/validation/run1
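Before collecting or preprocessing, it can help to confirm that this layout exists on disk. The following is a minimal sketch, not part of the project code: the helper name `missing_dirs` is hypothetical, and the list of folders is simply copied from the listing above.

```python
from pathlib import Path

# Subdirectories copied from the listing above, relative to the data directory.
EXPECTED_DIRS = [
    "runs",
    "train/images",
    "train/masks",
    "validation/images",
    "validation/masks",
    "weights",
    "raw_sim_data/train/run1",
    "raw_sim_data/validation/run1",
]


def missing_dirs(data_root):
    """Return the expected subdirectories that do not exist under data_root."""
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


# Report anything missing under ./data (prints nothing if the layout is complete).
for name in missing_dirs("data"):
    print("missing:", name)
```

Run it from the directory that contains data/, or pass a different path to `missing_dirs`.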
To collect the validation set, repeat both sets of steps above, except using the directory data/raw_sim_data/validation rather than data/raw_sim_data/train.
Before the network is trained, the images first need to undergo a preprocessing step. The preprocessing step transforms the depth masks from the sim into binary masks suitable for training a neural network. It also converts the images from .png to .jpeg to create a reduced-size dataset, suitable for uploading to AWS. To run preprocessing:
$ python preprocess_ims.py
Note: If your data is stored as suggested in the steps above, this script should run without error.
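To make the term "binary mask" concrete, here is a toy illustration in plain Python. It is an assumption-laden sketch and does not describe what preprocess_ims.py actually does: the function name `binarize`, the threshold rule, and the sample values are all hypothetical.

```python
def binarize(mask, threshold=0):
    """Return a 0/1 mask: 1 where the input pixel value exceeds threshold."""
    return [[1 if px > threshold else 0 for px in row] for row in mask]


# Hypothetical 3x3 mask where nonzero values mark pixels belonging to a person.
toy_mask = [
    [0, 0, 12],
    [0, 40, 35],
    [0, 0, 0],
]

print(binarize(toy_mask))  # prints [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
```

Each pixel in the result is either 0 (background) or 1 (labeled), which is the form a segmentation network is typically trained against.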
Important Note 1:
Running preprocess_ims.py does not delete files in the processed_data folder. This means that if you leave images in processed_data and collect a new dataset, some of the data in processed_data will be overwritten and some will be left as is. It is recommended to delete the train and validation folders inside processed_data (or the entire folder) before running preprocess_ims.py with a new set of collected data.
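The cleanup recommended in this note could be scripted along the following lines. This is a sketch under assumptions: the helper name `clear_processed` is hypothetical, and the location of the processed output folder is not stated precisely here, so pass whichever path the script writes to on your machine.

```python
import shutil
from pathlib import Path


def clear_processed(processed_root):
    """Delete the train and validation folders under processed_root, if present.

    Returns the names of the folders that were removed.
    """
    removed = []
    for name in ("train", "validation"):
        folder = Path(processed_root) / name
        if folder.is_dir():
            shutil.rmtree(folder)  # permanently deletes the folder and its contents
            removed.append(name)
    return removed
```

For example, `clear_processed("data/processed_data")` would be the call if the processed folder lives there (an assumed path). The deletion is permanent, so copy anything you want to keep beforehand.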
Important Note 2:
The notebook and supporting code assume your data for training/validation is in data/train and data/validation. After you run preprocess_ims.py you will have new train, and possibly validation, folders in processed_ims. Rename or move data/train and data/validation, then move data/processed_ims/train into data/, and data/processed_ims/validation also into data/.
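The rename-and-move step in this note can be sketched as a small helper. The function name `swap_in_processed` and the `_old` backup suffix are hypothetical choices for illustration; only the folder names data/train, data/validation, and data/processed_ims come from the text above.

```python
import shutil
from pathlib import Path


def swap_in_processed(data_root, backup_suffix="_old"):
    """Back up data/train and data/validation, then move the processed sets in.

    Returns the names of the sets that were swapped in.
    """
    root = Path(data_root)
    moved = []
    for name in ("train", "validation"):
        new = root / "processed_ims" / name
        if not new.is_dir():
            continue  # e.g. no validation set was generated in this run
        current = root / name
        if current.exists():
            backup = root / (name + backup_suffix)
            if backup.exists():
                raise FileExistsError(f"backup already exists: {backup}")
            shutil.move(str(current), str(backup))  # keep the previous set
        shutil.move(str(new), str(current))
        moved.append(name)
    return moved
```

Calling `swap_in_processed("data")` leaves the previous sets in data/train_old and data/validation_old. It refuses to overwrite an existing backup, so rename or remove older backups first.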
Important Note 3:
Merging multiple train or validation sets may be difficult. It is recommended that data choices be determined by what you include in raw_sim_data/train/run1, with possibly many different runs in the directory. You can create a temporary folder in data/ and store raw run data you don't currently want to use, but that may be useful later. Choose which run_x folders to include in raw_sim_data/train and raw_sim_data/validation, then run preprocess_ims.py from within the 'code/' directory to generate your new training and validation sets.
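Choosing which runs to include can be done by hand, or with a helper like the sketch below. The function name `keep_only_runs`, the stash folder, and the run names in the usage line are hypothetical. The code only moves folders and does not call preprocess_ims.py.

```python
import shutil
from pathlib import Path


def keep_only_runs(raw_split_dir, stash_dir, keep):
    """Move every run folder in raw_split_dir not named in keep into stash_dir.

    Returns the names of the run folders that were stashed.
    """
    src = Path(raw_split_dir)
    stash = Path(stash_dir)
    stash.mkdir(parents=True, exist_ok=True)
    stashed = []
    for run in sorted(p for p in src.iterdir() if p.is_dir()):
        if run.name in keep:
            continue
        target = stash / run.name
        if target.exists():
            raise FileExistsError(f"already stashed: {target}")
        shutil.move(str(run), str(target))
        stashed.append(run.name)
    return stashed
```

For instance, `keep_only_runs("data/raw_sim_data/train", "data/unused_runs/train", {"run1", "run3"})` would set aside every other run folder in an assumed data/unused_runs/train folder. Moving a folder back into raw_sim_data/train restores it for a later preprocessing pass.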