Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP][Feature] Support TransFusion #2547

Open
wants to merge 14 commits into
base: dev-1.x
Choose a base branch
from

Conversation

wep21
Copy link
Contributor

@wep21 wep21 commented May 23, 2023

Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.

Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

import TransFusion as project

Modification

Please briefly describe what modification is made in this PR.
add transfusion implementation

BC-breaking (Optional)

Does the modification introduce changes that break the back-compatibility of the downstream repos?
If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.

Checklist

  1. Pre-commit or other linting tools are used to fix the potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit test to ensure the correctness.
  3. If the modification has potential influence on downstream projects, this PR should be tested with downstream projects.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@CLAassistant
Copy link

CLAassistant commented May 23, 2023

CLA assistant check
All committers have signed the CLA.

@JingweiZhang12
Copy link
Contributor

@wep21 Thanks for your awesome contribution, and we're excited to invite you to join core developer community of mmdet3d. Just add our assistant using the WeChat ID: openmmlabwx. When sending the friend request, remember to include the remark "mmsig + Github ID". We will invite you to MMDet3D developer group.

@codecov
Copy link

codecov bot commented May 25, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (0ef13b8) 47.26% compared to head (2587d4f) 47.26%.

Additional details and impacted files
@@           Coverage Diff            @@
##           dev-1.x    #2547   +/-   ##
========================================
  Coverage    47.26%   47.26%           
========================================
  Files          277      277           
  Lines        23415    23415           
  Branches      3655     3655           
========================================
  Hits         11067    11067           
  Misses       11635    11635           
  Partials       713      713           
Flag Coverage Δ
unittests 47.26% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

projects/TransFusion/configs/transfusion_nusc_pillar_L.py Outdated Show resolved Hide resolved


@TASK_UTILS.register_module()
class HeuristicAssigner3D(BaseAssigner):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could reuse the implementation in BEVFusion and del this file?

from ..BEVFusion.bevfusion.utils import HungarianAssigner3D
from ..BEVFusion.bevfusion.transformer import TransformerDecoderLayer  # noqa F401
from ..BEVFusion.bevfusion.utils import TransFusionBBoxCoder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Is it better to merge these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JingweiZhang12 I pushed the code based on the original repository. I'm currently working on using BEVFusion implementation here, but loss_bbox doesn't decrease compared to the original one.
I guess I made a mistake when integrating your code, so I appreciate it if you could give me some advice to resolve this issue.

Copy link
Contributor

@JingweiZhang12 JingweiZhang12 May 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wep21 Hi, the original BEVFusion has a potential bug: mit-han-lab/bevfusion#419. This will cause the loss_bbox can not decrease when both using lidar and camera modalities. We have fixed it in the latest branch. Also, the BEVfusion has been merged. You can pull the latest dev-1.x branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JingweiZhang12 I found out the changes from original code cause problem when training transfusion. I reverted the changes here and then loss_bbox and matched ious become sane.
Do you have any thoughts about this?

Copy link
Contributor

@JingweiZhang12 JingweiZhang12 Jun 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, this is because the voxelization output order of the BEVFusion is different from ours. BEVFusion: xyz, Ours: zyx. Here is easily manifested: https://github.com/open-mmlab/mmdetection3d/blob/main/projects/BEVFusion/bevfusion/sparse_encoder.py#L15

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Is it better to adapt the pillar/voxelization output of the TransFusion to the coordinate of BEVFusion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not have an influence on the accuracy of the model. The style of the original Transfusion is the same as ours.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JingweiZhang12 I would like to use the same coordinate as original, so is it better to create another head implementation in transfusion?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wep21 Yep, it's better to use the original head implements of TransFusioin to align the training precision easily since the BEVFusion head may modify something and have some potential incompatibilities.

@JingweiZhang12 JingweiZhang12 changed the title feat: add transfusion [WIP][Feature]Support TransFusion May 25, 2023
@JingweiZhang12
Copy link
Contributor

JingweiZhang12 commented May 25, 2023

Could you update your progress of implementation of TransFusion according to the bellow items:

  • Support inference
  • Align the precision of inference
  • Support training
  • Align the precision of training
  • Provide docs

@wep21 wep21 changed the title [WIP][Feature]Support TransFusion [WIP][Feature] Support TransFusion May 30, 2023
@wep21 wep21 force-pushed the add-transfusion-latest branch 3 times, most recently from fc37092 to 2ad5e4a Compare June 1, 2023 06:33
@wep21 wep21 requested a review from JingweiZhang12 June 1, 2023 06:49

@MODELS.register_module()
class TransFusionDetector(MVXTwoStageDetector):
"""Base class of Multi-modality VoxelNet."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update this docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated at fd47bf7

# NOTE: fix
draw_heatmap_gaussian(heatmap[gt_labels_3d[idx]],
center_int[[1, 0]], radius)
# draw_heatmap_gaussian(heatmap[gt_labels_3d[idx]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove lines 729-731 since it's unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove these lines in new transfusion head at bb0a95e.
Also, I renamed head in bevfusion due to name conflict at dc948cc. Please let me know if this change is not applicable.

_base_ = ['../../../configs/_base_/default_runtime.py']
custom_imports = dict(
imports=['projects.TransFusion.transfusion'], allow_failed_imports=False)
point_cloud_range = [-51.2, -51.2, -5.0, 51.2, 51.2, 3.0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this file as transfusion_lidar_pillar02_second_secfpn_8xb2-cyclic-20e_nus-3d.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed config at 615cb1d.

backend_args = None

model = dict(
type='TransFusionDetector',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type='TransFusionDetector',
type='TransFusion',

Remind to modify the corresponding name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename detector class name at 5608d19

@JingweiZhang12
Copy link
Contributor

@wep21 Hi, have you aligned the inference precision of TransFusion(only LiDAR modality)?

@wep21
Copy link
Contributor Author

wep21 commented Jun 6, 2023

@JingweiZhang12

Hi, have you aligned the inference precision of TransFusion(only LiDAR modality)?

I'm sorry, but not yet. I've tested with only nuscenes-mini. Currently, I don't have enough gpu environment, it may take a time to train and test with nuscenes full dataset.

@JingweiZhang12
Copy link
Contributor

@JingweiZhang12

Hi, have you aligned the inference precision of TransFusion(only LiDAR modality)?

I'm sorry, but not yet. I've tested with only nuscenes-mini. Currently, I don't have enough gpu environment, it may take a time to train and test with nuscenes full dataset.

That's OK. Suggest you can make sure the results of one forward are the same as that of the official implementation firstly.

@sunjiahao1999
Copy link
Collaborator

@wep21 Hi, any progress on TransFusion so far?

Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
@wep21
Copy link
Contributor Author

wep21 commented Aug 23, 2023

@sunjiahao1999 I'm sorry for late reply. This is the evaluation result with nuscenes dataset after 20 epoch training.

Evaluating bboxes of pred_instances_3d
mAP: 0.5562                                                                                                                                                                                                                                                                  
mATE: 0.3236
mASE: 0.2586
mAOE: 0.3608
mAVE: 0.3053
mAAE: 0.1887
NDS: 0.6344

@wep21
Copy link
Contributor Author

wep21 commented Aug 23, 2023

20230819_092709.log

@wep21
Copy link
Contributor Author

wep21 commented Aug 23, 2023

image
This figure comes from the original paper of transfusion. The second row shows the score in the same condition as my experiment.

@wep21
Copy link
Contributor Author

wep21 commented Aug 25, 2023

inference demo

python demo/pcd_demo.py demo/data/nuscenes/n015-2018-07-24-11-22-45+0800__LIDAR_TOP__1532402927647951.pcd.bin projects/TransFusion/configs/transfusion_lidar_pillar02_second_secfpn_8xb2-cyclic-20e_nus-3d.py ~/Downloads/mmdet3d_epoch20.pth --show --score-thr 0.05

image

@JingweiZhang12
Copy link
Contributor

@wep21 Please fix the lint.

Signed-off-by: wep21 <daisuke.nishimatsu1021@gmail.com>
@wep21
Copy link
Contributor Author

wep21 commented Aug 28, 2023

@JingweiZhang12 I fixed lint at 2cb5f78.

@wep21
Copy link
Contributor Author

wep21 commented Sep 6, 2023

@JingweiZhang12 @sunjiahao1999 friendly ping

@wep21 wep21 force-pushed the add-transfusion-latest branch 2 times, most recently from fe25f7a to 2cb5f78 Compare January 14, 2024 14:44
@friendship1
Copy link

I was looking for this transfusion code. Thank you.

Just to double-check, are the 54.4 and 64.7 values in the table in the paper you posted (#2547 (comment)), 2nd row, left column, the ones you're comparing to (#2547 (comment))? In other words, is it correct that your experiment yielded higher numbers than the paper (just for pillar based, lidar-only)?

Also, I know that the original Transfusion paper used a "fading strategy", and I was wondering if this was applied in your code or in your experiments.

Thanks the great work!

@friendship1
Copy link

I just checked that the fading strategy is implemented as a custom hook.

But what you implemented doesn't seem to use a 2-stage learning method: pretrain only the LiDAR model first and then further train this model on additional modality with camera input added.
is that correct?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants