added DCVNet and Configurable Dataloader (#272)

* Updated config README.md
* update requirements.txt
* updated requirements.txt
* update requirements.txt
* added pre-commit.ci
* update pre-commit config
* revert pre-commit configs
* update isort version
* Added dataset registry
* bugfix: circular dependency relative import error
* Added datasets to DATASET REGISTRY
* added multiple dataset config support
* validated datasets
* Fix: incorrect crop_size for prediction
* pre-commit fixes
* Sync main branch to dev (#286)
* Fix Linting Pre-Commit failure (#271)
* Updated config README.md
* update requirements.txt
* updated requirements.txt
* update requirements.txt
* added pre-commit.ci
* update pre-commit config
* revert pre-commit configs
* update isort version
* Update README.md
* added DCVNet (#279)
* added DCVNet Backbone
* [WIP] DCVNet Decoder
* updated cost volume filtering modules
* added matryoshka dilated cost volume
* [WIP] DCVNet Model
* added dcvnet forward pass
* Sync main branch to dev (#286) (#287)
* Fix Linting Pre-Commit failure (#271)
* Updated config README.md
* update requirements.txt
* updated requirements.txt
* update requirements.txt
* added pre-commit.ci
* update pre-commit config
* revert pre-commit configs
* update isort version
* Update README.md
* fixed DCVNet forward pass
* Refactor Residual Encoder and RAFT Backbone
* refactored cost volume filter
* added flow offset logits loss
* Refactored trainer, loss function and base_dataset to handle multiple params
* added flow to bilinear interpolated weights support
* update trainer
* updated configs and tools
* updated trainer
* updated trainer
* update training configs with flow_offsets
* update training configs with flow_offsets
* updated dcvnet config
* updated dcvnet config, trainer and loss function args
* updated base trainer
* bugfix: offset cross entropy
* bugfix: offset cross entropy
* updated trainer to validate last epoch
* updated trainer
* bugfix: offset loss in training
* bugfix: offset params config
* Added DCVNet Backbone unit tests and Refactored RAFT Small backbone
* Added docstring and unittest for Dilated Cost Volume
* fix NaN values in ground truth
* added unit tests and docstring for dcvnet
* Added decoder unit tests
* added utils unit tests
* added docstring for offset common methods
* added unit tests and docs for DCVNet Loss
* fixed raft configs and forward pass
* updated unit test
* fixed formatting
* fix Predictor read images
* fix formatting
* added eval script in tools
* fixed evaluate script
* added DCVNet checkpoint download links
* fix formatting
* Sync with main (#298)
* Fix Linting Pre-Commit failure (#271)
* Updated config README.md
* update requirements.txt
* updated requirements.txt
* update requirements.txt
* added pre-commit.ci
* update pre-commit config
* revert pre-commit configs
* update isort version
* Update README.md
* fix matplotlib github runner
* fix dependency github runner
* fix dependency github runner
* resolve dependencies
* removed mkl dependency
neu-vi · Dec 9, 2023 · 469ef52
1 parent 84cde26
commit 469ef52
Showing 79 changed files with 3,992 additions and 799 deletions.
diff --git a/README.md b/README.md
@@ -68,6 +68,10 @@ ___
 
 ### Results and Pre-trained checkpoints
 
+- #### DCVNet | [model config](./configs/models/dcvnet.yaml) | [paper](https://jianghz.me/files/DCVNet_camera_ready_wacv2023.pdf)
+| Training Dataset                        | Training Config                                                         | ckpts                                                                                  | Sintel Clean (training) | Sintel Final(training)| KITTI2015 AEPE | KITTI2015 F1-all |
+|-----------------------------------------|-------------------------------------------------------------------------|----------------------------------------------------------------------------------------|-------------------------|-----------------------|----------------|------------------|
+| FlyingThings3DSubset + Monkaa + Driving | [config](./configs/trainers/dcvnet/dcvnet_sceneflow_baseline.yaml)      | [download](https://jianghz.me/files/ezflow_ckpts/dcvnet_sceneflow_step800k.pth)        | 1.90                    | 3.35                  | 4.75           | 23.41%           |
 
 - #### FlowNetC | [model config](./configs/models/flownet_c.yaml) | [arXiv](https://arxiv.org/abs/1504.06852)
 
@@ -88,19 +92,19 @@ ___
 
 - #### RAFT | [model config](./configs/models/raft.yaml) | [arXiv](https://arxiv.org/abs/2003.12039)
 
-| Training Dataset | Training Config                                                 | ckpts                                                                             | Sintel Clean (training) | Sintel Final(training)| KITTI2015 AEPE | KITTI2015 F1-all |
-|------------------|-----------------------------------------------------------------|-----------------------------------------------------------------------------------|-------------------------|-----------------------|----------------|------------------|
-| Chairs           | [config](./configs/trainers/raft/raft_chairs_baseline.yaml)     | [download](https://jianghz.me/files/ezflow_ckpts/raft_chairs_step100k.pth)        | 2.23                    | 4.56                  | 10.45          | 38.93%           |
-| Chairs -> Things | [config](./configs/trainers/raft/raft_things_baseline.yaml)     | [download](https://jianghz.me/files/ezflow_ckpts/raft_chairs_things_step200k.pth) | 1.66                    | 2.75                  | 5.01           | 16.87%           |
-| Kubric           | [config](./configs/trainers/raft/raft_kubric_improved_aug.yaml) | [download](https://jianghz.me/files/ezflow_ckpts/raft_kubric_step100k.pth)        | 2.12                    | 2.54                  | 6.01           | 17.35%           |
+| Training Dataset | Training Config                                                 | ckpts                                                                                | Sintel Clean (training) | Sintel Final(training)| KITTI2015 AEPE | KITTI2015 F1-all |
+|------------------|-----------------------------------------------------------------|--------------------------------------------------------------------------------------|-------------------------|-----------------------|----------------|------------------|
+| Chairs           | [config](./configs/trainers/raft/raft_chairs_baseline.yaml)     | [download](https://jianghz.me/files/ezflow_ckpts/raft_chairs_step100k_v2.pth)        | 2.23                    | 4.56                  | 10.45          | 38.93%           |
+| Chairs -> Things | [config](./configs/trainers/raft/raft_things_baseline.yaml)     | [download](https://jianghz.me/files/ezflow_ckpts/raft_chairs_things_step200k_v2.pth) | 1.66                    | 2.75                  | 5.01           | 16.87%           |
+| Kubric           | [config](./configs/trainers/raft/raft_kubric_improved_aug.yaml) | [download](https://jianghz.me/files/ezflow_ckpts/raft_kubric_step100k_v2.pth)        | 2.12                    | 2.54                  | 6.01           | 17.35%           |
 
 ___
 
 #### Additional Information
 
 - KITTI dataset has been evaluated with a center crop of size `1224 x 370`.
 - FlowNetC and PWC-Net uses `padding` of size `64` for evaluating the KITTI2015 dataset.
-- RAFT uses `padding` of size `8` for evaluating the Sintel and KITTI2015 datasets.
+- RAFT and DCVNet use `padding` of size `8` for evaluating the Sintel and KITTI2015 datasets.
 ___
 ### References
 

diff --git a/configs/README.md b/configs/README.md
@@ -1,4 +1,10 @@
 ### Results and Pre-trained checkpoints
+
+#### DCVNet | [model config](./models/dcvnet.yaml) | [paper](https://jianghz.me/files/DCVNet_camera_ready_wacv2023.pdf)
+| Training Dataset                        | Training Config                                                         | ckpts                                                                                  | Sintel Clean (training) | Sintel Final(training)| KITTI2015 AEPE | KITTI2015 F1-all |
+|-----------------------------------------|-------------------------------------------------------------------------|----------------------------------------------------------------------------------------|-------------------------|-----------------------|----------------|------------------|
+| FlyingThings3DSubset + Monkaa + Driving | [config](./trainers/dcvnet/dcvnet_sceneflow_baseline.yaml)              | [download](https://jianghz.me/files/ezflow_ckpts/dcvnet_sceneflow_step800k.pth)        | 1.90                    | 3.35                  | 4.75           | 23.41%           |
+
 ___
 
 #### FlowNetC | [model config](./models/flownet_c.yaml) | [arXiv](https://arxiv.org/abs/1504.06852)
@@ -36,4 +42,4 @@ ___
 
 - KITTI dataset has been evaluated with a center crop of size `1224 x 370`.
 - FlowNetC and PWC-Net uses `padding` of size `64` for evaluating the KITTI2015 dataset.
-- RAFT uses `padding` of size `8` for evaluating the Sintel and KITTI2015 datasets.
+- RAFT and DCVNet use `padding` of size `8` for evaluating the Sintel and KITTI2015 datasets.
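
The evaluation notes in both READMEs mention a center crop of `1224 x 370` and a `padding` of size `8`. The sketch below shows one way to read those numbers. It assumes the crop is given as width x height and that "padding of size 8" means padding each spatial dimension up to the next multiple of 8; neither reading is confirmed by this diff, and the helper names `pad_amounts` and `center_crop_box` are hypothetical, not part of the repository.

```python
def pad_amounts(height, width, divisor=8):
    """Return (pad_h, pad_w): pixels needed to reach the next multiple of `divisor`.

    Assumption: "padding of size 8" means padding to a multiple of 8.
    """
    pad_h = (-height) % divisor
    pad_w = (-width) % divisor
    return pad_h, pad_w


def center_crop_box(height, width, crop_h, crop_w):
    """Return (top, left, bottom, right) of a crop window centered in the image."""
    if crop_h > height or crop_w > width:
        raise ValueError("crop window is larger than the image")
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, top + crop_h, left + crop_w


# Sintel frames are 436 x 1024 (height x width).
print(pad_amounts(436, 1024))  # (4, 0)

# A 370 x 1224 crop (height x width) still needs 6 rows to be divisible by 8.
print(pad_amounts(370, 1224))  # (6, 0)

# Centered 370 x 1224 window inside a 375 x 1242 frame, a common KITTI size.
print(center_crop_box(375, 1242, 370, 1224))  # (2, 9, 372, 1233)
```

Under these assumptions, the KITTI crop alone does not make the height divisible by 8, which would explain why padding is still listed for KITTI2015.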
diff --git a/configs/models/dcvnet.yaml b/configs/models/dcvnet.yaml
@@ -0,0 +1,44 @@
+NAME: DCVNet
+ENCODER: 
+  NAME: DCVNetBackbone
+  IN_CHANNELS: 3
+  OUT_CHANNELS: 256
+  NORM: instance
+  P_DROPOUT: 0.0
+  LAYER_CONFIG: [64, 96, 128]
+SIMILARITY:
+  NAME: MatryoshkaDilatedCostVolumeList
+  NUM_GROUPS: 1
+  MAX_DISPLACEMENT: 4
+  ENCODER_OUTPUT_STRIDES: [2, 8]
+  DILATIONS: [[1],[1, 2, 3, 5, 9, 16]]
+  NORMALIZE_FEAT_L2: False
+  USE_RELU: False
+DECODER:
+  NAME: DCVDilatedFlowStackFilterDecoder
+  FEAT_STRIDES: [2, 8]
+  DILATIONS: [[1],[1, 2, 3, 5, 9, 16]]
+  COST_VOLUME_FILTER:
+    NAME: DCVFilterGroupConvStemJoint
+    NUM_GROUPS: 1
+    HIDDEN_DIM: 96
+    FEAT_IN_PLANES: 128
+    OUT_CHANNELS: 567
+    USE_FILTER_RESIDUAL: True
+    USE_GROUP_CONV_STEM: True
+    NORM: none
+    UNET:
+      NAME: UNetBase
+      NUM_GROUPS: 1
+      IN_CHANNELS: 695
+      HIDDEN_DIM: 96
+      OUT_CHANNELS: 96
+      NORM: none
+      BOTTLE_NECK: 
+        NAME: ASPPConv2D
+        IN_CHANNELS: 192
+        HIDDEN_DIM: 192
+        OUT_CHANNELS: 192
+        DILATIONS: [2, 4, 8]
+        NUM_GROUPS: 1
+        NORM: none
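
The channel counts in `dcvnet.yaml` are consistent with simple arithmetic on the other fields. The sketch below is an observation about the numbers, not a statement of how the library derives them: it assumes each dilation contributes one `(2 * MAX_DISPLACEMENT + 1)^2` displacement window to the cost volume, and that the UNet input concatenates the cost volume with `FEAT_IN_PLANES` feature channels. The function name `dcvnet_channel_counts` is hypothetical.

```python
def dcvnet_channel_counts(max_displacement, dilations, feat_in_planes):
    """Return (cost_volume_channels, unet_in_channels) under the stated assumptions."""
    # One square search window per dilation: (2 * 4 + 1)^2 = 81 for MAX_DISPLACEMENT = 4.
    window = (2 * max_displacement + 1) ** 2
    # DILATIONS is a list of per-stride lists; count every entry.
    num_dilations = sum(len(group) for group in dilations)
    cost_channels = window * num_dilations
    # Assumed: the filter input is the cost volume concatenated with the features.
    return cost_channels, cost_channels + feat_in_planes


cost_channels, unet_in = dcvnet_channel_counts(
    max_displacement=4,
    dilations=[[1], [1, 2, 3, 5, 9, 16]],
    feat_in_planes=128,
)
print(cost_channels, unet_in)  # 567 695
```

The results match `COST_VOLUME_FILTER.OUT_CHANNELS: 567` and `UNET.IN_CHANNELS: 695`, so changing `MAX_DISPLACEMENT`, `DILATIONS`, or `FEAT_IN_PLANES` would likely require updating those two values as well.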
diff --git a/configs/models/raft.yaml b/configs/models/raft.yaml
@@ -1,21 +1,19 @@
 NAME: RAFT
 ENCODER:
   FEATURE: 
-    NAME: BasicEncoder
+    NAME: RAFTBackbone
     IN_CHANNELS: 3
     OUT_CHANNELS: 256
     NORM: instance
     P_DROPOUT: 0.0
     LAYER_CONFIG: [64, 96, 128]
-    INTERMEDIATE_FEATURES: False
   CONTEXT:
-    NAME: BasicEncoder
+    NAME: RAFTBackbone
     IN_CHANNELS: 3
     OUT_CHANNELS: 256
     NORM: batch
     P_DROPOUT: 0.0
     LAYER_CONFIG: [64, 96, 128]
-    INTERMEDIATE_FEATURES: False
 HIDDEN_DIM: 128
 CONTEXT_DIM: 128
 SIMILARITY:
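
The hunk above renames the encoder from `BasicEncoder` to `RAFTBackbone` and drops the `INTERMEDIATE_FEATURES` key. For configs written against the old layout, the same two changes can be applied programmatically. This is a minimal sketch on plain dictionaries; `migrate_raft_encoder_cfg` is a hypothetical helper, not a function from the repository, and it assumes the config has already been loaded into nested dicts.

```python
import copy


def migrate_raft_encoder_cfg(model_cfg, new_name="RAFTBackbone"):
    """Return a copy of `model_cfg` with the encoder rename from this commit applied."""
    cfg = copy.deepcopy(model_cfg)  # leave the caller's config untouched
    for branch in ("FEATURE", "CONTEXT"):
        encoder = cfg["ENCODER"][branch]
        if encoder.get("NAME") == "BasicEncoder":
            encoder["NAME"] = new_name
        # The key is removed in the new config files.
        encoder.pop("INTERMEDIATE_FEATURES", None)
    return cfg


old_cfg = {
    "NAME": "RAFT",
    "ENCODER": {
        "FEATURE": {"NAME": "BasicEncoder", "NORM": "instance", "INTERMEDIATE_FEATURES": False},
        "CONTEXT": {"NAME": "BasicEncoder", "NORM": "batch", "INTERMEDIATE_FEATURES": False},
    },
}
new_cfg = migrate_raft_encoder_cfg(old_cfg)
print(new_cfg["ENCODER"]["FEATURE"])  # {'NAME': 'RAFTBackbone', 'NORM': 'instance'}
```

Passing `new_name="RAFTBackboneSmall"` covers the small variant, which uses that name in `raft_small.yaml`.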

diff --git a/configs/models/raft_small.yaml b/configs/models/raft_small.yaml
@@ -1,21 +1,19 @@
 NAME: RAFT
 ENCODER:
   FEATURE: 
-    NAME: BasicEncoder
+    NAME: RAFTBackboneSmall
     IN_CHANNELS: 3
     OUT_CHANNELS: 128
     NORM: instance
     P_DROPOUT: 0.0
     LAYER_CONFIG: [32, 64, 96]
-    INTERMEDIATE_FEATURES: False
   CONTEXT:
-    NAME: BasicEncoder
+    NAME: RAFTBackboneSmall
     IN_CHANNELS: 3
     OUT_CHANNELS: 160
     NORM: batch
     P_DROPOUT: 0.0
     LAYER_CONFIG: [32, 64, 96]
-    INTERMEDIATE_FEATURES: False
 HIDDEN_DIM: 96
 CONTEXT_DIM: 64
 SIMILARITY:
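
The trainer config diff that follows restructures the `DATA` section: each dataset becomes a key under `TRAIN_DATASET` / `VAL_DATASET`, the augmentation block moves inside the dataset entry, and the augmentation parameter names switch from upper case under `PARAMS.TRAINING` to lower case directly under `PARAMS`. The sketch below converts the training half of an old-style `DATA` dictionary to the new shape. It is a rough illustration only: `convert_train_dataset` is a hypothetical helper, the dataset key (for example `FlyingChairs`) must be supplied by the caller, and fields that have no old counterpart (`CROP`, `FLOW_OFFSET_PARAMS`, `IS_PREDICTION`) are left for manual addition.

```python
def convert_train_dataset(old_data, dataset_key):
    """Build a new-style TRAIN_DATASET mapping from an old-style DATA dict."""
    old_aug = old_data.get("AUGMENTATION", {})
    old_params = old_aug.get("PARAMS", {}).get("TRAINING", {})
    return {
        dataset_key: {
            "ROOT_DIR": old_data["TRAIN_DATASET"]["ROOT_DIR"],
            "SPLIT": "training",
            # Previously a DATA-level flag, now set per dataset.
            "APPEND_VALID_MASK": old_data.get("APPEND_VALID_MASK", False),
            "AUGMENTATION": {
                "USE": old_aug.get("USE", False),
                # COLOR_AUG_PARAMS -> color_aug_params, and so on.
                "PARAMS": {name.lower(): params for name, params in old_params.items()},
            },
        }
    }


old_data = {
    "TRAIN_DATASET": {"NAME": "flyingchairs", "ROOT_DIR": "./Datasets/FlyingChairs_release/data"},
    "APPEND_VALID_MASK": False,
    "AUGMENTATION": {
        "USE": True,
        "PARAMS": {
            "TRAINING": {
                "FLIP_AUG_PARAMS": {"enabled": True, "h_flip_prob": 0.5, "v_flip_prob": 0.1},
            },
            "VALIDATION": {"FLIP_AUG_PARAMS": {"enabled": False}},
        },
    },
}
train_dataset = convert_train_dataset(old_data, "FlyingChairs")
print(sorted(train_dataset["FlyingChairs"]["AUGMENTATION"]["PARAMS"]))  # ['flip_aug_params']
```

Keying by dataset name is what allows several datasets to be listed side by side, which the commit message describes as multiple dataset config support.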

diff --git a/configs/trainers/_base_/chairs_baseline.yaml b/configs/trainers/_base_/chairs_baseline.yaml
@@ -1,65 +1,86 @@
 DATA:
-  TRAIN_DATASET:
-    NAME: "flyingchairs"
-    ROOT_DIR: "./Datasets/FlyingChairs_release/data"
-  VAL_DATASET:
-    NAME: "flyingchairs"
-    ROOT_DIR: "./Datasets/FlyingChairs_release/data" 
+  BATCH_SIZE: 8
   NUM_WORKERS: 4
   PIN_MEMORY: True
-  APPEND_VALID_MASK: False
   SHUFFLE: True
-  AUGMENTATION:
-    # Augmentation Settings borrowed from RAFT
-    USE: True
-    PARAMS:
-      TRAINING:
-        COLOR_AUG_PARAMS: {
-          "enabled": True,
-          "asymmetric_color_aug_prob": 0.2, 
-          "brightness": 0.4, 
-          "contrast": 0.4, 
-          "saturation": 0.4, 
-          "hue": 0.15915494309189535
-        }
-        ERASER_AUG_PARAMS: {
-          "enabled": True,
-          "aug_prob": 0.5,
-          "bounds": [50, 100]
-        }
-        NOISE_AUG_PARAMS: {
-          "enabled": False,
-          "aug_prob": 0.5,
-          "noise_std_range": 0.06 
-        }
-        FLIP_AUG_PARAMS: {
-          "enabled": True, 
-          "h_flip_prob": 0.5, 
-          "v_flip_prob": 0.1
-        }
-        SPATIAL_AUG_PARAMS: {
-          "enabled": True,
-          "aug_prob": 0.8, 
-          "stretch_prob": 0.8, 
-          "min_scale": -0.1, 
-          "max_scale": 1.0, 
-          "max_stretch": 0.2, 
-        }
-        ADVANCED_SPATIAL_AUG_PARAMS: {
-          "enabled": False,
-          "scale1": 0.0,
-          "scale2": 0.0,
-          "stretch": 0.0,
-          "rotate": 0.0,
-          "translate": 0.0,
-          "enable_out_of_boundary_crop": False
-        }
-      VALIDATION:
-        SPATIAL_AUG_PARAMS: {"enabled": False}
-        COLOR_AUG_PARAMS: {"enabled": False}
-        ERASER_AUG_PARAMS: {"enabled": False}
-        FLIP_AUG_PARAMS: {"enabled": False}
-        ADVANCED_SPATIAL_AUG_PARAMS : {"enabled": False}
+  INIT_SEED: False
+  DROP_LAST: True
+  TRAIN_DATASET:
+    FlyingChairs:
+      ROOT_DIR: "./Datasets/FlyingChairs_release/data"
+      SPLIT: "training"
+      IS_PREDICTION: False
+      APPEND_VALID_MASK: False
+      CROP: 
+        USE: True
+        SIZE: [384, 448]
+        TYPE: "random"
+      FLOW_OFFSET_PARAMS: {"use": False}
+      AUGMENTATION:
+        # Augmentation Settings borrowed from RAFT
+        USE: True
+        PARAMS:
+          color_aug_params: {
+            "enabled": True,
+            "asymmetric_color_aug_prob": 0.2, 
+            "brightness": 0.4, 
+            "contrast": 0.4, 
+            "saturation": 0.4, 
+            "hue": 0.15915494309189535
+          }
+          eraser_aug_params: {
+            "enabled": True,
+            "aug_prob": 0.5,
+            "bounds": [50, 100]
+          }
+          noise_aug_params: {
+            "enabled": False,
+            "aug_prob": 0.5,
+            "noise_std_range": 0.06 
+          }
+          flip_aug_params: {
+            "enabled": True, 
+            "h_flip_prob": 0.5, 
+            "v_flip_prob": 0.1
+          }
+          spatial_aug_params: {
+            "enabled": True,
+            "aug_prob": 0.8, 
+            "stretch_prob": 0.8, 
+            "min_scale": -0.1, 
+            "max_scale": 1.0, 
+            "max_stretch": 0.2, 
+          }
+          advanced_spatial_aug_params: {
+            "enabled": False,
+            "scale1": 0.0,
+            "scale2": 0.0,
+            "stretch": 0.0,
+            "rotate": 0.0,
+            "translate": 0.0,
+            "enable_out_of_boundary_crop": False
+          }
+  VAL_DATASET:
+    FlyingChairs:
+      ROOT_DIR: "./Datasets/FlyingChairs_release/data"
+      SPLIT: "validation"
+      APPEND_VALID_MASK: False
+      IS_PREDICTION: False
+      PADDING: 1
+      CROP: 
+        USE: True
+        SIZE: [384, 448]
+        TYPE: "center"
+      FLOW_OFFSET_PARAMS: {"use": False}
+      AUGMENTATION:
+        USE: False
+        PARAMS:
+          color_aug_params: {"enabled": False}
+          eraser_aug_params: {"enabled": False}
+          noise_aug_params: {"enabled": False}
+          flip_aug_params: {"enabled": False}
+          spatial_aug_params: {"enabled": False}
+          advanced_spatial_aug_params: {"enabled": False}        
 OPTIMIZER:
   NAME: AdamW
   LR: 0.0004