This repository has been archived by the owner on Dec 1, 2021. It is now read-only.

[WIP] a new network GlazedYolo #435

Open
wants to merge 6 commits into master

Conversation

Contributor

@tkng tkng commented Sep 18, 2019

This network is experimental; so far it cannot run on FPGAs.

Description

This network brings several recent ideas to our YOLOv2 implementation. In short, GlazedYolo = YoloV2 + BlazeFace + MixConv + Group Convolution.

Architecture difference from LMFYolo

  • mainly uses 5x5 convolutions instead of 3x3 convolutions
  • stride=2 convolutions are used in some places
  • uses residual connections
  • group convolution is used heavily
  • the downsampling rate is 16 (LMFYolo's downsampling rate is 32)
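A rough way to see why grouped 5x5 convolutions can stay affordable is to count weights. This is a minimal sketch with illustrative channel counts (128 channels, 8 groups), not the network's actual configuration:

```python
# Sketch: weight counts for the convolution styles listed above.
# The channel/group numbers are illustrative, not taken from GlazedYolo.

def conv_params(k, c_in, c_out, groups=1):
    """Weights in a k x k convolution with `groups` channel groups (no bias)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return k * k * (c_in // groups) * c_out

dense_3x3 = conv_params(3, 128, 128)              # plain 3x3 conv
grouped_5x5 = conv_params(5, 128, 128, groups=8)  # grouped 5x5 conv

print(dense_3x3, grouped_5x5)  # 147456 51200
```

With 8 groups, a 5x5 convolution uses roughly a third of the weights of a dense 3x3 convolution at the same channel width, which is one way the larger kernel and the grouping trade off.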

I suspect the last difference contributes the most to accuracy. GlazedYolo achieves the following numbers (mAP@IoU=0.5):

|                        | WIDER_FACE (160x160) | PASCALVOC (320x320) | GOPs@160x160 |
|------------------------|----------------------|---------------------|--------------|
| LMFYolo (quantized)    | 0.559                | 0.446               | 0.582        |
| GlazedYolo (quantized) | 0.727                | 0.472               | 0.697        |

On PASCALVOC the difference is small, but on WIDER_FACE it is huge. When the input image size is enlarged to 320x320, GlazedYolo (quantized) achieves 81.9% mAP on the WIDER_FACE dataset.
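The downsampling-rate difference can be made concrete by counting detection grid cells (standard YOLO-style output sizing; the helper name is mine):

```python
# Sketch: number of detection grid cells at a given input size.
# A smaller downsampling rate gives a finer grid, which helps the many
# small faces in WIDER_FACE.

def grid_cells(input_size, downsample):
    side = input_size // downsample
    return side * side

lmf = grid_cells(160, 32)     # LMFYolo: 5x5 grid
glazed = grid_cells(160, 16)  # GlazedYolo: 10x10 grid

print(lmf, glazed)  # 25 100
```

At 160x160 input, GlazedYolo's grid has four times as many cells as LMFYolo's, which is consistent with the large WIDER_FACE gain and the modest PASCALVOC gain (PASCALVOC objects are typically larger).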

Further direction

To make it easier to run on our accelerator, I'm planning the following experiments:

  • replace stride=2 conv with max pooling or space_to_depth
  • remove 5x5 conv
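For reference, space_to_depth is a lossless stride-2 substitute: it halves the spatial resolution while quadrupling the channel count, so no activations are discarded. A minimal numpy sketch (my own helper, not the Blueoil implementation):

```python
import numpy as np

# Sketch of space_to_depth with block_size=2, the stride-2 replacement
# proposed above: (H, W, C) -> (H/2, W/2, 4*C), element count preserved.

def space_to_depth(x, block=2):
    h, w, c = x.shape
    assert h % block == 0 and w % block == 0
    x = x.reshape(h // block, block, w // block, block, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(h // block, w // block, c * block * block)

x = np.arange(4 * 4 * 3).reshape(4, 4, 3)
y = space_to_depth(x)
print(y.shape)  # (2, 2, 12)
```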

Motivation and Context

We want a better network for object detection, without drastically increasing the computation cost.

How has this been tested?

Accuracy was checked by running several training and evaluation experiments.

Screenshots (if appropriate):

None

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature / Optimization (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.

@blueoil-butler blueoil-butler bot added the CI: auto-run Run CI automatically label Sep 18, 2019
Contributor

n-nez commented Sep 25, 2019

To make it easier to run it on our accelerator, I'm planning following experiments

  • replace stride=2 conv with max pooling or space_to_depth
  • remove 5x5 conv

If we are talking about the new one, stride=2 & 5x5 sounds OK to me, as long as the amount of compute doesn't change... 🤔

@primenumber
Contributor

To support stride=2 and 5x5 conv on CPU:

  • very easy: 5x5 conv
  • easy: stride=2 for AArch32
  • hard (or no optimization, just ignore unused results): stride=2 for other architectures
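The "just ignore unused results" fallback can be sketched as computing the stride-1 output and then discarding every other row and column. A toy numpy illustration (helper name is mine; a real kernel would fuse this, of course):

```python
import numpy as np

# Sketch: emulate a stride-2 convolution by taking the stride-1 result
# and dropping every other row/column. Wasteful (~4x extra compute),
# but requires no changes to an existing stride-1 kernel.

def subsample_stride2(stride1_out):
    """stride1_out: (H, W, C) feature map from a stride-1 convolution."""
    return stride1_out[::2, ::2, :]

full = np.zeros((8, 8, 16))
print(subsample_stride2(full).shape)  # (4, 4, 16)
```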


CLAassistant commented Jun 12, 2020

CLA assistant check
All committers have signed the CLA.

