Model Zoo

Note

  • For all pretraining and finetuning, we adopt sparse/uniform sampling.
  • #Frame $=$ #input_frame $\times$ #crop $\times$ #clip
  • #input_frame is the number of frames fed to the model per inference.
  • #crop is the number of spatial crops (e.g., 3 for left/right/center).
  • #clip is the number of temporal clips (e.g., 4 means sampling four clips with different start indices).
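
As a rough illustration of the #Frame convention, the sketch below parses an entry such as `8x3x4` and multiplies the three factors. It also shows one possible way to draw sparse/uniform frame indices for several clips with different start offsets. The function names (`parse_frame_spec`, `total_frames`, `sparse_sample_indices`) and the offset scheme are assumptions made for illustration only; they are not taken from this repository's data loader, which may sample differently.

```python
from typing import List, Tuple


def parse_frame_spec(spec: str) -> Tuple[int, int, int]:
    """Split a #Frame entry such as '8x3x4' into (input_frames, crops, clips)."""
    parts = spec.lower().split("x")
    if len(parts) != 3:
        raise ValueError(f"expected '<frames>x<crops>x<clips>', got {spec!r}")
    input_frames, crops, clips = (int(p) for p in parts)
    return input_frames, crops, clips


def total_frames(spec: str) -> int:
    """#Frame = #input_frame x #crop x #clip."""
    input_frames, crops, clips = parse_frame_spec(spec)
    return input_frames * crops * clips


def sparse_sample_indices(
    num_video_frames: int, input_frames: int, clips: int
) -> List[List[int]]:
    """Hypothetical sparse/uniform sampler (an assumption, not the repo's code).

    The video is split into `input_frames` equal segments and one frame is
    taken from each segment. Each clip uses a different offset inside the
    segment, so the clips start at different indices.
    """
    segment = num_video_frames / input_frames
    sampled = []
    for clip in range(clips):
        # Spread the clip offsets evenly inside one segment.
        offset = segment * (clip + 0.5) / clips
        indices = [
            min(int(i * segment + offset), num_video_frames - 1)
            for i in range(input_frames)
        ]
        sampled.append(indices)
    return sampled


if __name__ == "__main__":
    print(total_frames("8x3x4"))
    for clip_indices in sparse_sample_indices(64, input_frames=8, clips=4):
        print(clip_indices)
```

Under this reading, an `8x3x4` entry corresponds to 3 × 4 = 12 views of 8 frames each, i.e. 96 frames per video at test time.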

Pretraining

| Model | Setting | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- |
| UMT-B/16 | K710 200e | ckpt | run.sh | log |
| UMT-L/16 | K710 200e | ckpt | run.sh | log |

Finetuning

K710

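| Model | Setting | #Frame | Top-1 (%) | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- | --- | --- |
| UMT-B/16 | K710 PT | 8x3x4 | 81.9 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT | 8x3x4 | 86.0 | ckpt | run.sh | log |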

K400

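| Model | Setting | #Frame | Top-1 (%) | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- | --- | --- |
| UMT-B/16 | K710 PT+FT | 8x3x4 | 87.4 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT | 8x3x4 | 90.3 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT | 16x3x4 | 90.6 | ckpt | run.sh | log |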

K600

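| Model | Setting | #Frame | Top-1 (%) | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- | --- | --- |
| UMT-B/16 | K710 PT+FT | 8x3x4 | 87.8 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT | 8x3x4 | 90.4 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT | 16x3x4 | 90.5 | ckpt | run.sh | log |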

K700

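| Model | Setting | #Frame | Top-1 (%) | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- | --- | --- |
| UMT-B/16 | K710 PT+FT | 8x3x4 | 78.5 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT | 8x3x4 | 83.2 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT | 16x3x4 | 83.6 | ckpt | run.sh | log |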

MiT V1

| Model | Setting | #Frame | Top-1 (%) | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- | --- | --- |
| UMT-B/16 | K710 PT+FT, K400 FT | 8x3x4 | 44.6 | ckpt | run.sh | log |
| UMT-L/16 384↑ | K710 PT+FT, K400 FT | 8x3x4 | 45.5 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT+FT, K400 FT | 8x3x4 | 48.0 | ckpt | run.sh | log |
| UMT-L/16 384↑ | K710 PT+FT, K400 FT | 8x3x4 | 48.7 | ckpt | run.sh | log |

SthSth V2

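| Model | Setting | #Frame | Top-1 (%) | Checkpoint | Shell | Log |
| --- | --- | --- | --- | --- | --- | --- |
| UMT-B/16 | K710 PT | 8x3x4 | 70.8 | ckpt | run.sh | log |
| UMT-L/16 | K710 PT | 8x3x4 | 74.7 | ckpt | run.sh | log |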

AVA v2.2

See action_detection.