An implementation of LaneGCN (Learning Lane Graph Representations for Motion Forecasting)




LaneGCN is designed to predict vehicles' future motion. This project is a personal implementation, for learning and academic purposes only, of the paper Learning Lane Graph Representations for Motion Forecasting. Uber's official repo is located here.

Though the implementation borrows some official code (mostly in the model part), compared with the official project, this work:

  • refactors and reimplements parts of LaneGCN and adds more detailed comments
  • changes the overall framework to make it easier to run experiments with other models
  • reprocesses the data into MXNet Record format to make it more flexible to use (compared with the originally provided data; refer to issue #4)
  • adds visualization and evaluation (to Argo) scripts
  • provides a Docker environment

I hope and believe this project can help learners get familiar with spatial-temporal prediction and trajectory prediction conveniently and efficiently. If you find this work interesting, please also see and refer to the official work.


Environment preparation

  1. Make sure you have nvidia-docker installed on your machine, or follow the instructions to install it.

  2. Pull the image and start a container:

    docker pull zhaone/lanegcn:v1 # from Docker Hub; this takes a while (image size: about 30 GB, with data)
    sh ./ # start a container
    # you will get a container_id like: e563f358af72fd60c14c5a5...
    docker exec -it e563 /bin/bash # replace e563 with your own container_id

    All the following operations happen in the container.

  3. You should now be at /workspace/ in the container; clone this repo into /workspace/:

    git clone

You can refer to ./docker to see how to build the image.

Data preparation

In the container, under /workspace/:

tar -xzf ./datasets.tar.gz 



To train the model, go to the root directory of this project and run

bash ./ lanegcn_ori train

Some experiment args can be found and modified in file commands/

train() {
  horovodrun -np 4 python \ # 4 means using 4 GPUs
    --mixed_precision \ # enable mixed-precision training
    --epochs 36 \ # total training epochs
    --lr 0.001 \ # base learning rate
    --lr_decay 0.90 \ # lr decay rate
    --min_decay 0.1 \ # min lr decay rate
    --save_checkpoints_steps 100 \ # step interval for saving the model
    --print_steps 10 \ # step interval for printing to screen
    --board_steps 50 \ # step interval for writing to tensorboard
    --eval_steps 1600 \ # step interval for evaluating on the eval dataset
    --is_dist \ # use multiple GPUs; if you have only one GPU, delete this arg
    --optimizer "adam" \ # optimizer type
    --pin_memory \ # dataloader arg
    --name "val_exp" \ # experiment name
    --model "lanegcn_ori" \ # model name
    --hparams_path "hparams/lanegcn_ori.json" \ # hyperparameters of model
    --num_workers 0 \ # dataloader arg
    --data_name "lanegcn" \ # data set name
    --data_version "full" \ # data set version
    --mode "train_eval" \ # experiment type
    --save_path "/workspace/expout/lanegcn/train" \ # output dir
    --batch_size 32 \
    --reload "latest" # checkpoint to load when resuming training: "best" or "latest"
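
To get a feel for how --lr, --lr_decay and --min_decay might interact, here is a minimal sketch. It assumes the decay is applied once per epoch and that min_decay is a floor on the multiplicative factor; neither assumption is confirmed by this README, and the function name lr_at_epoch is hypothetical, so check the training code for the real schedule.

```python
def lr_at_epoch(epoch, base_lr=0.001, lr_decay=0.90, min_decay=0.1):
    """Hypothetical schedule (an assumption, not the project's code).

    The learning rate is multiplied by lr_decay once per epoch, and the
    accumulated factor is never allowed to drop below min_decay.
    """
    factor = max(lr_decay ** epoch, min_decay)
    return base_lr * factor


if __name__ == "__main__":
    # Defaults mirror the values in the train() args above.
    for epoch in (0, 1, 10, 22, 35):
        print(f"epoch {epoch:2d}: lr = {lr_at_epoch(epoch):.6f}")
```

Under these assumptions the floor is reached at epoch 22, because 0.9^22 is about 0.098, which is below 0.1; from then on the rate stays at 0.0001.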

All the output of the experiment will be saved in save_path, which has the following structure:

|-- env	# save some experiment configuration
|   |-- args.json	# experiment args
|   |-- hparams.json	# model hyperparameters
|   `-- src.tar.gz	# source code (model part)
|-- eval # evaluation results
|   |-- BEST-train_eval-debug-debug.json
|   |-- lanegcn_ori-eval-debug-000057800.json
|   |-- lanegcn_ori-train_eval-debug-000001600.json
|   |-- ...
|   |-- lanegcn_ori-train_eval-debug-000056000.json
|   `-- lanegcn_ori-train_eval-debug-000057600.json
|-- hooks # custom hooks
|   |-- hook_eval_vis # visualization of predictions (svg)
|   `-- hook_test_submit # prediction results of test dataset
|       |-- res_0.pkl
|       |-- res_1.pkl
|       `-- res_mgpu.h5 # prediction results of test dataset
|-- log # tensorboard event
|   |-- events.out.tfevents.1610454343.22683b51c1b9
|   |-- events.out.tfevents.1610784779.22683b51c1b9
|   |-- events.out.tfevents.1614741403.0f7fe7279276
|   |-- key.txt
|   `-- log.txt
`-- params # model params
    |-- best-000036800.params
    |-- best-000040000.params
    |-- best-000041600.params
    |-- best-000043200.params
    |-- best-000048000.params
    |-- best-000052800.params
    |-- best-000054400.params
    |-- latest-000057200.params
    |-- latest-000057300.params
    |-- latest-000057400.params
    |-- latest-000057500.params
    |-- latest-000057600.params
    |-- latest-000057700.params
    |-- latest-000057800.params
    `-- meta.json # some meta info
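
The params directory holds several best-* and latest-* files, so a small helper can be handy for locating the one to load by hand. The sketch below relies only on the file naming shown in the tree above; the function name find_checkpoint is hypothetical, and picking the highest step for "best" assumes that a newer best file supersedes older ones. It is not how the project's --reload option is implemented.

```python
import re
from pathlib import Path

# Matches names such as "best-000054400.params" or "latest-000057800.params".
_CKPT_RE = re.compile(r"^(best|latest)-(\d+)\.params$")


def find_checkpoint(params_dir, kind="latest"):
    """Return the Path of the highest-step checkpoint of the given kind, or None."""
    candidates = []
    for path in Path(params_dir).iterdir():
        match = _CKPT_RE.match(path.name)
        if match and match.group(1) == kind:
            # Store (step, path) so the step number decides the ordering.
            candidates.append((int(match.group(2)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

For example, find_checkpoint("/workspace/expout/lanegcn/train/params", "best") would return the path of best-000054400.params for the listing above.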

Note: persist data on the host machine

If you want to save the output on the host machine rather than in the Docker container, you can add a volume binding in like:

docker run \
    --runtime=nvidia \
    --name lanegcn \
    --rm \
    --shm-size="20g" \
    -d \
    -v /tmp/lanegcn:/workspace/expout \ # host_dir:container_dir; /tmp/lanegcn is your persistent host directory, /workspace/expout is the root save_path in the container
    -p \
    -it \
    zhaoyi/lanegcn:v3 \

Make sure the container has write permission on the bound host directory (like ... change the permission of the bound host directory to 777).
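
As a minimal sketch of the permission step, the snippet below creates the host directory and sets its mode to 777, the option named above. The function name prepare_bind_dir is hypothetical, and /tmp/lanegcn is simply the example path from the docker run command. Run it on the host, not in the container.

```python
import os
import stat


def prepare_bind_dir(host_dir):
    """Create host_dir if needed, set its mode to 777, and return the resulting mode."""
    os.makedirs(host_dir, exist_ok=True)
    # os.chmod is not affected by the umask, unlike the mode argument of makedirs.
    os.chmod(host_dir, 0o777)
    return stat.S_IMODE(os.stat(host_dir).st_mode)


if __name__ == "__main__":
    mode = prepare_bind_dir("/tmp/lanegcn")
    print(oct(mode))
```

Mode 777 lets every user on the host write to the directory, so it is best kept to a dedicated output directory.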

Visualize training

tensorboard --logdir=<your save_path> --bind_all --port 6006

Then you can access TensorBoard at docker_host_machine_ip:16006 (the port binding is in file


To evaluate the model's performance on the evaluation set, run

bash ./ lanegcn_ori val

Visualize prediction

If you want to visualize the prediction results, add the --enable_hook arg to

val() {
  horovodrun -np 4 python \
    --mixed_precision \
    --epochs 36 \
    --reload "latest" \
    --enable_hook # add this arg to visualize during eval

The prediction for each sample will be saved in save_path/hooks/hook_eval_vis in .svg format.


  • It draws each evaluation sample and its prediction; press ctrl+c to stop plotting once you have enough.
  • The script is located at util/ You can modify this script to customize the plots.


To generate predictions, run

bash ./ lanegcn_ori _test

Submit to Argo

If you want to generate an h5 result file that can be submitted to Argo, add the --enable_hook arg to

test() {
  horovodrun -np 4 python \
    --mixed_precision \
    --epochs 36 \
    --reload "latest" \
    --enable_hook # add this arg to generate result h5 file

The output result file is located at save_path/hooks/hook_test_submit/res_mgpu.h5; you can then upload it to the website.



[Figure: train ADE]

Evaluation result

[Figure: eval result]
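
For readers new to the metric, "ADE" in the training figure presumably stands for average displacement error, the usual trajectory-forecasting metric; that reading of the figure label is an assumption. The sketch below shows the standard single-trajectory definitions of ADE and FDE (final displacement error). The function name displacement_errors is hypothetical and this is not the project's evaluation code.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for two equal-length sequences of (x, y) points.

    ADE is the mean Euclidean distance over all time steps;
    FDE is the Euclidean distance at the last time step.
    """
    if not pred or len(pred) != len(truth):
        raise ValueError("pred and truth must be non-empty and of equal length")
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    return sum(dists) / len(dists), dists[-1]


if __name__ == "__main__":
    pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    print(displacement_errors(pred, truth))
```

When a model predicts several candidate trajectories per agent, benchmarks commonly report the minimum of these errors over the candidates.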

Other materials

awesome trajectory prediction


Please open an issue or mail


