Skip to content
This repository has been archived by the owner on Sep 11, 2019. It is now read-only.

Fork of Tensorpack to make breaking performance improvements to the Mask RCNN example. Training is approximately 2x faster than the original implementation.

License

Notifications You must be signed in to change notification settings

armandmcqueen/tensorpack-mask-rcnn

Repository files navigation

Mask RCNN

NOTE: This repository is archived. This project will continue to be worked on here - https://github.com/aws-samples/mask-rcnn-tensorflow

Performance focused implementation of Mask RCNN based on the Tensorpack implementation. The original paper: Mask R-CNN

Overview

This implementation of Mask RCNN is focused on increasing training throughput without sacrificing any accuracy. We do this by training with a batch size > 1 per GPU using FP16 and two custom TF ops.

Status

Training on N GPUs (V100s in our experiments) with a per-gpu batch size of M = NxM training

Training converges to target accuracy for configurations from 8x1 up to 32x4 training. Training throughput is substantially improved from original Tensorpack code.

A pre-built dockerfile is available in DockerHub under armandmcqueen/tensorpack-mask-rcnn:master-latest. It is automatically built on each commit to master.

Notes

  • Running this codebase requires a custom TF binary - available under GitHub releases (custom ops and fix for bug introduced in TF 1.13
  • We give some details the codebase and optimizations in CODEBASE.md

To launch training

  • Data preprocessing
    • Follow the data preprocess
    • If you want to use EKS or Sagemaker, you need to create your own S3 bucket which contains the data, and change the S3 bucket name in the following files:
  • Container is recommended for training
    • To train with docker, refer to Docker
    • To train with Amazon EKS, refer to EKS
    • To train with Amazon SageMaker, refer to SageMaker

Training results

The result was running on P3dn.24xl instances using EKS. 12 epochs training:

Num_GPUs x Images_Per_GPU Training time Box mAP Mask mAP
8x4 5.09h 37.47% 34.45%
16x4 3.11h 37.41% 34.47%
32x4 1.94h 37.20% 34.25%

24 epochs training:

Num_GPUs x Images_Per_GPU Training time Box mAP Mask mAP
8x4 9.78h 38.25% 35.08%
16x4 5.60h 38.44% 35.18%
32x4 3.33h 38.33% 35.12%

Tensorpack fork point

Forked from the excellent Tensorpack repo at commit a9dce5b220dca34b15122a9329ba9ff055e8edc6

About

Fork of Tensorpack to make breaking performance improvements to the Mask RCNN example. Training is approximately 2x faster than the original implementation.

Resources

License

Code of conduct

Stars

Watchers

Forks

Packages

No packages published

Contributors 4

  •  
  •  
  •  
  •