ECCV 2022 papers and open-source projects collection (papers with code)!
ECCV 2022 accepted papers list: https://eccv2022.ecva.net/program/accepted-papers/
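Note 1: Everyone is welcome to open an issue to share ECCV 2022 papers and open-source projects!
Note 2: For papers from past top CV conferences, other high-quality CV papers, and comprehensive roundups, see: https://github.com/amusi/daily-paper-computer-vision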
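If you want to keep up with the latest high-quality CV papers, open-source projects, and learning materials, you are welcome to scan the QR code and join the CVer Academic Exchange Group! Let's learn from each other and improve together~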
- The paper list below will be systematically categorized and organized by October 2022.
Paper ID | Paper Title | Authors |
---|---|---|
59 | Contrastive Deep Supervision | Linfeng Zhang (Tsinghua University)*; Xin Chen (Intel Corp.); Junbo Zhang (Tsinghua University); Runpei Dong (Xi’an Jiaotong University); Kaisheng Ma (Tsinghua University) |
116 | Towards Grand Unification of Object Tracking | Bin Yan (Dalian University of Technology)*; Yi Jiang (Bytedance); Peize Sun (The University of Hong Kong); Dong Wang (Dalian University of Technology); Zehuan Yuan (Bytedance.Inc); Ping Luo (The University of Hong Kong); Huchuan Lu (Dalian University of Technology) |
125 | SeqFormer: Sequential Transformer for Video Instance Segmentation | Junfeng Wu (Huazhong University of Science and Technology); Yi Jiang (Bytedance); Song Bai (University of Oxford); Wenqing Zhang (Huazhong University of Science and Technology); Xiang Bai (Huazhong University of Science and Technology)* |
162 | Estimating Spatially-Varying Lighting in Urban Scenes with Disentangled Representation | Jiajun Tang (Peking University); Yongjie Zhu (Beijing University of Posts and Telecommunications); Haoyu Wang (Peking University); Jun Hoong Chan (Peking University); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University)* |
168 | In Defense of Online Models for Video Instance Segmentation | Junfeng Wu (Huazhong University of Science and Technology); Qihao Liu (Johns Hopkins University); Yi Jiang (Bytedance); Song Bai (University of Oxford); Alan Yuille (Johns Hopkins University); Xiang Bai (Huazhong University of Science and Technology)* |
185 | HuMMan: Multi-Modal 4D Human Dataset for Versatile Sensing and Modeling | Zhongang Cai (SenseTime International Pte Ltd)*; Daxuan Ren (Nanyang Technological University); Ailing Zeng (The Chinese University of Hong Kong); Zhengyu Lin (SenseTime); Tao Yu (Tsinghua University); Wenjia Wang (SenseTime); Xiangyu Fan (Sensetime); Yang Gao (Sensetime); Yifan Yu (ETH Zurich); Liang Pan (Nanyang Technological University); Fangzhou Hong (Nanyang Technological University); Mingyuan Zhang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Lei Yang (Sensetime Group Limited); Ziwei Liu (Nanyang Technological University) |
193 | Graph R-CNN: Towards Accurate 3D Object Detection with Semantic-Decorated Local Graph | Honghui Yang (Zhejiang University)*; Zili Liu (ZJU); Xiaopei Wu (Zhejiang University); Wenxiao Wang (State Key Lab of CAD&CG, Zhejiang University); Wei Qian (Fabu Inc.); Xiaofei He (Zhejiang University); Deng Cai (ZJU) |
213 | PointScatter: Point Set Representation for Tubular Structure Extraction | Dong Wang (Peking University)*; Zhao Zhang (Peking University); Ziwei Zhao (Peking University); Yuhang Liu (Yizhun Medical AI Co., Ltd); Yihong Chen (Peking University); Liwei Wang (Peking University) |
229 | D&D: Learning Human Dynamics from Dynamic Camera | Jiefeng Li (Shanghai Jiao Tong University)*; Siyuan Bian (Shanghai Jiao Tong University); Chao Xu (Tencent); Gang Liu (Tencent Inc.); Gang Yu (Tencent); Cewu Lu (Shanghai Jiao Tong University) |
413 | On Mitigating Hard Clusters for Face Clustering | Yingjie Chen (Peking University); Huasong Zhong (Damo Academy, Alibaba Group); Chong Chen (Alibaba Group)*; Chen Shen (Alibaba Group); Jianqiang Huang (Damo Academy, Alibaba Group); Tao Wang (Peking University); Yun Liang (Peking University); Qianru Sun (Singapore Management University) |
415 | Recurrent Bilinear Optimization for Binary Neural Networks | Sheng Xu (Beihang University)*; Yanjing Li (Beihang University); Tiancheng Wang (Beihang University); Teli Ma (Shanghai Artificial Intelligence Laboratory); Baochang Zhang (Beihang University); Peng Gao (Chinese University of Hong Kong); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jinhu Lu (Beihang University, Beijing, China); Guodong Guo (IDL, Baidu Research) |
561 | Particle Video Revisited: Tracking Through Occlusions Using Point Trajectories | Adam Harley (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Katerina Fragkiadaki (Carnegie Mellon University) |
617 | Open-Set Semi-Supervised Object Detection | Yen-Cheng Liu (Georgia Institute of Technology)*; Chih-Yao Ma (Facebook); Xiaoliang Dai (Facebook); Junjiao Tian (Georgia Institute of Technology); Peter Vajda (Facebook); Zijian He (Facebook); Zsolt Kira (Georgia Institute of Technology) |
631 | Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation | Xian Liu (The Chinese University of Hong Kong)*; Yinghao Xu (Chinese University of Hong Kong); Qianyi Wu (Monash University); Hang Zhou (The Chinese University of Hong Kong); Wayne Wu (SenseTime Research); Bolei Zhou (UCLA) |
640 | Long-tail Detection with Effective Class-Margins | Jang Hyun Cho (The University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin) |
669 | SeqTR: A Simple yet Universal Network for Visual Grounding | Chaoyang Zhu (Xiamen University)*; Yiyi Zhou (Xiamen University); Yunhang Shen (Xiamen University); Gen Luo (Xiamen University); Xingjia Pan (Momenta.ai); Mingbao Lin (Xiamen University, China); Chao Chen (Youtu Laboratory); Liujuan Cao (Xiamen University); Xiaoshuai Sun (Xiamen University); Rongrong Ji (Xiamen University, China) |
735 | ECLIPSE: Efficient Long-range Video Retrieval using Sight and Sound | Yan-Bo Lin (UNC Chapel Hill)*; Jie Lei (UNC Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill); Gedas Bertasius (UNC Chapel Hill) |
843 | KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients | Niklas Hanselmann (Mercedes-Benz AG)*; Katrin Renz (University of Tuebingen); Kashyap Chitta (MPI-IS and University of Tuebingen); Apratim Bhattacharyya (Max Planck Institute for Informatics); Andreas Geiger (University of Tuebingen) |
910 | Extract Free Dense Labels from CLIP | Chong Zhou (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological University); Bo Dai (Shanghai AI Lab) |
974 | Frequency Domain Model Augmentation for Adversarial Attack | Yuyang Long (University of Electronic Science and Technology of China)*; Qilong Zhang (University of Electronic Science and Technology of China); Boheng Zeng (University of Electronic Science and Technology of China); Lianli Gao (The University of Electronic Science and Technology of China); Xianglong Liu (BUAA); Jian Zhang (College of Computer Science and Electronic Engineering, HNU); Jingkuan Song (UESTC) |
993 | Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | Oran Gafni (Meta AI Research)*; Adam Polyak (Facebook); Oron Ashual (Facebook AI Research); Shelly Sheynin (Meta); Devi Parikh (Georgia Tech & Facebook AI Research); Yaniv Taigman (Facebook) |
1011 | Weakly Supervised Grounding for VQA in Vision-Language Transformers | Aisha Urooj (University of Central Florida)*; Hilde Kuehne (University of Frankfurt); Chuang Gan (MIT-IBM Watson AI Lab); Niels da Vitoria Lobo (University of Central Florida); Mubarak Shah (University of Central Florida) |
1083 | Practical and Scalable Desktop-based High-Quality Facial Capture | Alexandros Lattas (Imperial College London)*; Yiming Lin (Imperial college); Jayanth Kannan (Lumirithmic); Ekin Ozturk (Imperial College London); Luca Filipi (Lumirithmic); Giuseppe Claudio Guarnera (University of York); Gaurav Chawla (Lumirithmic Limited); Abhijeet Ghosh (Imperial College London) |
1185 | Tracking Objects as Pixel-wise Distributions | Zelin Zhao (The Chinese University of Hong Kong)*; Ze Wu (Megvii); Yueqing Zhuang (Megvii Inc Company); Boxun Li (Megvii Inc.); Jiaya Jia (Chinese University of Hong Kong) |
1212 | CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation | Yunyao Mao (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Zhenbo Lu (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Jiajun Deng (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China) |
1248 | Open-Vocabulary DETR with Conditional Matching | Yuhang Zang (Nanyang Technological University)*; Wei Li (Nanyang Technological University); Kaiyang Zhou (Nanyang Technological University); Chen Huang (Apple); Chen Change Loy (Nanyang Technological University) |
1250 | Towards Calibrated Hyper-sphere Representation via Distribution Overlap Coefficient for Long-tailed Learning | Hualiang Wang (Zhejiang University)*; Siming FU (Zhejiang University); Xiaoxuan He (Zhejiang University); Hangxiang Fang (Zhejiang University); Zuozhu Liu (Zhejiang-UIUC Institute); Haoji Hu (Zhejiang University, China) |
1272 | FBNet: Feedback Network for Point Cloud Completion | Xuejun Yan (Hikvision Research Institute)*; Hongyu Yan (Sichuan University); Jingjing Wang (Hikvision Research Institute); Hang Du (Hikvision Research Institute); Zhihong Wu (Sichuan University); Di Xie (Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute); Li Lu (Sichuan University) |
1276 | Physically-Based Editing of Indoor Scene Lighting from a Single Image | Zhengqin Li (Meta)*; Jia Shi (Carnegie Mellon University); Sai Bi (Adobe Research); Rui Zhu (University of California San Diego); Kalyan Sunkavalli (Adobe Research); Milos Hasan (Adobe Research); Zexiang Xu (Adobe Research); Ravi Ramamoorthi (University of California San Diego); Manmohan Chandraker (UC San Diego) |
1384 | GLASS: Global to Local Attention for Scene-Text Spotting | Roi Ronen (Technion)*; Shahar Tsiper (Amazon); Oron Anschel (AWS); Inbal Lavi (Amazon); Amir Markovitz (Amazon); R. Manmatha (Amazon) |
1396 | Drive&Segment: Unsupervised Semantic Segmentation of Urban Scenes via Cross-modal Distillation | Antonin Vobecky (Czech Technical University in Prague)*; David Hurych (Valeo.ai); Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Josef Sivic (Czech Technical University) |
1398 | Expanding Language-Image Pretrained Models for General Video Recognition | Bolin Ni (Institute of Automation, Chinese Academy of Sciences); Houwen Peng (Microsoft Research)*; Minghao Chen (Stony Brook University); Songyang Zhang (University of Rochester); Gaofeng Meng (Chinese Academy of Sciences); Jianlong Fu (Microsoft Research); SHIMING XIANG (Chinese Academy of Sciences, China); Haibin Ling (Stony Brook University) |
1407 | Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes | Julian Chibane (Max Planck Institute for Informatics, University of Wuerzburg)*; Francis Engelmann (ETH AI Center); Anh Tuan Tran (Max Planck Institute for Informatics, Saarland University); Gerard Pons-Moll (University of Tübingen) |
1413 | Pose-NDF: Modelling Human Pose Manifolds with Neural Distance Fields | Garvita Tiwari (MPI-INF, University of Tübingen)*; Dimitrije Antic (University of Tuebingen); Jan E. Lenssen (TU Dortmund); Nikolaos Sarafianos (Facebook Reality Labs); Tony Tung (Facebook Reality Labs); Gerard Pons-Moll (University of Tübingen) |
1448 | Multimodal Object Detection via Probabilistic Ensembling | Yi-Ting Chen (University of Maryland); Jinghao Shi (Carnegie Mellon University); Zelin Ye (CMU); Mertz Christoph (CMU); Deva Ramanan (Carnegie Mellon University); Shu Kong (Carnegie Mellon University)* |
1545 | CenterFormer: Center-based Transformer for 3D Object Detection | Zixiang Zhou (University of Central Florida)*; xiangchen zhao (Tusimple); Yu Wang (Tusimple); Panqu Wang (TuSimple, Inc); Hassan Foroosh (University of Central Florida) |
1552 | Revisiting a kNN-based Image Classification System with High-capacity Storage | Kengo Nakata (Kioxia Corporation)*; Youyang Ng (Kioxia Corporation); Daisuke Miyashita (Kioxia Corporation); Asuka Maki (Kioxia Corporation); Yu-Chieh Lin (Kioxia Corporation); Jun Deguchi (Kioxia Corporation) |
1588 | TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation | Zhaoyuan Yin (Hunan University)*; Pichao Wang (Alibaba Group); Fan Wang (Alibaba Group); Xianzhe Xu (alibaba group); Hanling Zhang (Hunan University); Hao Li (Alibaba Group); rong jin (alibaba group) |
1617 | VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder | Yu-Chao Gu (Nankai University)*; Xintao Wang (Tencent); Liangbin Xie (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China); Chao Dong (SIAT); Gen LI (Tencent); Ying Shan (Tencent); Ming-Ming Cheng (Nankai University) |
1620 | CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation | Zhihao Li (Huawei Noah’s Ark Lab)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Zhensong Zhang (Huawei Noah’s Ark Lab); Songcen Xu (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab) |
1637 | Pointly-Supervised Panoptic Segmentation | Junsong Fan (Chinese Academy of Sciences, China)*; Zhaoxiang Zhang (Chinese Academy of Sciences, China); Tieniu Tan (NLPR, China) |
1729 | Registration based Few-Shot Anomaly Detection | Chaoqin Huang (Shanghai Jiao Tong University)*; Haoyan Guan (King’s College London); Aofan Jiang (Shanghai Jiao Tong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Michael W Spratling (King’s College London); Yan-Feng Wang (Cooperative Medianet Innovation Center of Shanghai Jiao Tong University) |
1742 | A Level Set Theory for Neural Implicit Evolution under Explicit Flows | Ishit Mehta (University of California San Diego)*; Manmohan Chandraker (UC San Diego); Ravi Ramamoorthi (University of California San Diego) |
1791 | Improving Robustness by Enhancing Weak Subnets | Yong Guo (Max Planck Institute for Informatics)*; David Stutz (Max Planck Institute for Informatics); Bernt Schiele (MPI Informatics) |
1792 | TO-Scene: A Large-scale Dataset for Understanding 3D Tabletop Scenes | Mutian Xu (The Chinese University of Hong Kong (Shenzhen))*; Pei Chen (the Chinese University of Hong Kong (Shenzhen)); Haolin Liu (The Chinese University of Hong Kong, Shenzhen); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)) |
1817 | PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark | Li Chen (Shanghai AI Laboratory)*; Chonghao Sima (Purdue University); Yang Li (SenseTime); Zehan Zheng (Shanghai AI Laboratory); Jiajie Xu (Carnegie Mellon University); Xiangwei Geng (SenseTime); Hongyang Li (SenseTime); Conghui He (Shanghai AI Lab); Jianping Shi (Sensetime Group Limited); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Junchi Yan (Shanghai Jiao Tong University) |
1958 | Language Matters: A Weakly Supervised Vision-Language Pre-training Approach for Scene Text Detection and Spotting | Chuhui Xue (Nanyang Technological University); Wenqing Zhang (ByteDance); Yu Hao (Bytedance Inc.); Shijian Lu (Nanyang Technological University); Philip Torr (University of Oxford); Song Bai (University of Oxford)* |
2021 | Adaptive Patch Exiting for Scalable Single Image Super-Resolution | Shizun Wang (Beijing University of Posts and Telecommunications)*; Jiaming Liu (Peking University); Kaixin Chen (Beijing University of Posts and Telecommunications); Xiaoqi Li (Columbia University in the City of New York); Ming Lu (Intel Labs China); Yandong Guo (OPPO Research Institute) |
2153 | Perceptual Artifacts Localization for Inpainting | Lingzhi Zhang (University of Pennsylvania)*; Yuqian Zhou (Adobe); Connelly Barnes (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Jianbo Shi (University of Pennsylvania) |
2179 | Adversarially-Aware Robust Object Detector | ZiYi Dong (Sun Yat-Sen University)*; Pengxu Wei (Sun Yat-sen University); Liang Lin (Sun Yat-sen University) |
2282 | RFNet-4D: Joint Object Reconstruction and Flow Estimation from 4D Point Clouds | Tuan-Anh Vu (The Hong Kong University of Science and Technology)*; Thanh Nguyen (Deakin University, Australia); Binh-Son Hua (VinAI Research); Quang Hieu Pham (Woven Planet North America); Sai-Kit Yeung (Hong Kong University of Science and Technology) |
2290 | Generalizable Patch-Based Neural Rendering | Mohammed Suhail (University of British Columbia)*; Carlos Esteves (Google Research); Leonid Sigal (University of British Columbia); Ameesh Makadia (Google Research) |
2385 | A Perturbation-Constrained Adversarial Attack for Evaluating the Robustness of Optical Flow | Jenny Schmalfuss (University of Stuttgart)*; Philipp Scholze (University of Stuttgart); Andrés Bruhn (University of Stuttgart) |
2526 | Contrastive Monotonic Pixel-Level Modulation | Kun Lu (Zhejiang University)*; Rongpeng Li (Zhejiang University); Honggang Zhang (Zhejiang University) |
2623 | Social-SSL: Self-Supervised Cross-Sequence Representation Learning Based on Transformers for Multi-Agent Trajectory Prediction | Li-Wu Tsao (National Chiao Tung University)*; Yan-Kai Wang (National Chiao Tung University); Hao-Siang Lin (National Chiao Tung University); Hong-Han Shuai (National Yang Ming Chiao Tung University); Lai-Kuan Wong (Multimedia University); Wen-Huang Cheng (National Chiao Tung University) |
2657 | SpOT: Spatiotemporal Modeling for 3D Object Tracking | Colton Stearns (Stanford University)*; Davis Rempe (Stanford University); Jie Li (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Sergey Zakharov (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Yanchao Yang (Stanford University); Leonidas Guibas (Stanford University) |
2688 | Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition | Xudong Xie (Huazhong University of Science and Technology)*; LING FU (Huazhong University of Science and Technology); Zhifei Zhang (Adobe Research); Zhaowen Wang (Adobe Research); Xiang Bai (Huazhong University of Science and Technology) |
2691 | Monocular 3D Object Detection with Depth from Motion | Tai Wang (The Chinese University of Hong Kong)*; Jiangmiao Pang (CUHK); Dahua Lin (The Chinese University of Hong Kong) |
2723 | Fine-Grained Scene Graph Generation with Data Transfer | Ao Zhang (National University of Singapore)*; Yuan Yao (Tsinghua University); qianyu chen (Tsinghua University); Wei Ji (National University of Singapore); Zhiyuan Liu (Tsinghua University); Maosong Sun (Tsinghua University); Tat-Seng Chua (National university of Singapore) |
2753 | Balancing Stability and Plasticity through Advanced Null Space in Continual Learning | Yajing Kong (The University of Sydney)*; Liu Liu (The University of Sydney); Zhen Wang (The University of Sydney); Dacheng Tao (JD.com) |
2808 | OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses | Robik S Shrestha (Rochester Institute of Technology)*; Kushal Kafle (Adobe Research); Christopher Kanan (University of Rochester) |
2827 | DisCo: Remedying Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning | Yuting Gao (Tencent)*; Jia-Xin Zhuang (Sun Yat-sen University); Shaohui Lin (East China Normal University); Hao Cheng (Tencent); Xing Sun (Shopee); Ke Li (Tencent); Chunhua Shen (University of Adelaide, Australia) |
2874 | Diverse Human Motion Prediction Guided by Multi-Level Spatial-Temporal Anchors | Sirui Xu (University of Illinois Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign); Liangyan Gui (University of Illinois Urbana-Champaign) |
2911 | InfiniteNature-Zero: Learning Perpetual View Generation of Natural Scenes from Single Images | Zhengqi Li (Google Inc.)*; Qianqian Wang (Cornell); Noah Snavely (Google); Angjoo Kanazawa (University of California Berkeley) |
3007 | CT^2: Colorization Transformer via Color Tokens | Shuchen Weng (Peking University)*; Jimeng Sun (Beijing University of Posts and Telecommunications); Yu Li (International Digital Economy Academy); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University) |
3086 | PCW-Net: Pyramid Combination and Warping Cost Volume for Stereo Matching | Zhelun Shen (Baidu Research)*; Yuchao Dai (Northwestern Polytechnical University); Xibin Song (Baidu); ZhiBo Rao (Northwestern Polytechnical University); Dingfu Zhou (Baidu); Liangjun Zhang (Baidu Research Institute) |
3181 | Discovering Transferable Forensic Features for CNN-generated Images Detection | Keshigeyan Chandrasegaran (Singapore University of Technology and Design)*; Ngoc-Trung Tran (Singapore University of Technology and Design); Alexander Binder (University of Oslo); Ngai-Man Cheung (Singapore University of Technology and Design) |
3187 | Domain Adaptive Person Search | Junjie Li (Shanghai Jiao Tong University); Yichao Yan (Shanghai Jiao Tong University)*; Guanshuo Wang (Tencent Youtu Lab); Fufu Yu (Tencent Youtu); Qiong Jia (Tencent Youtu Lab); Shouhong Ding (Tencent) |
3228 | Text2LIVE: Text-Driven Layered Image and Video Editing | Omer Bar Tal (Weizmann Institute of Science)*; Dolev Ofri-Amar (Weizmann Institute of Science); Rafail Fridman (Weizmann Institute of Science); Yoni Kasten (Weizmann Institute); Tali Dekel (Weizmann Institute of Science) |
3239 | Event-Based Fusion for Motion Deblurring with Cross-modal Attention | Lei Sun (Zhejiang University); Christos Sakaridis (ETH Zurich); Jingyun Liang (ETH Zurich); Qi Jiang (Zhejiang University); Kailun Yang (Karlsruhe Institute of Technology); Peng Sun (Zhejiang University); Yaozu Ye (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University); Kaiwei Wang (State Key Laboratory of Modern Optical Instrumentation, Zhejiang University)*; Luc Van Gool (ETH Zurich) |
3311 | AutoMix: Unveiling the Power of Mixup | Zicheng Liu (Westlake University)*; Siyuan Li (Westlake University); di wu (Westlake University); Zihan Liu (Westlake University); Zhiyuan Chen (Shanghai AI Lab); Lirong Wu (Westlake University); Stan Z. Li (Westlake University) |
3332 | Synergistic Self-Supervised and Quantization Learning | Yunhao Cao (Nanjing University)*; Peiqin Sun (MEGVII Technology); Yechang Huang (MEGVII Technology); Jianxin Wu (Nanjing University); Shuchang Zhou (MEGVII Technology) |
3586 | Auto-regressive Image Synthesis with Integrated Quantization | Fangneng Zhan (Max Planck Institute for Informatics); Yingchen Yu (Nanyang Technological University); Rongliang WU (Nanyang Technological University); Jiahui Zhang (Nanyang Technological University); Kaiwen Cui (Nanyang Technological University); Changgong Zhang (Amazon); Shijian Lu (Nanyang Technological University)* |
3601 | Event-guided Deblurring of Unknown Exposure Time Videos | Taewoo Kim (KAIST)*; Jeongmin Lee (KAIST); Lin Wang (HKUST); Kuk-Jin Yoon (KAIST) |
3622 | Learning Disentanglement with Decoupled Labels for Vision-Language Navigation | Wenhao Cheng (Beijing Institute of Technology); Xingping Dong (Inception Institute of Artificial Intelligence); Salman Khan (MBZUAI/ANU); Jianbing Shen (Inception Institute of Artificial Intelligence)* |
3631 | 3D CoMPaT: Composition of Materials on Parts of 3D Things | Yuchen Li (King Abdullah University of Science and Technology (KAUST)); Ujjwal Upadhyay (KAUST); Habib Slim (KAUST); Tezuesh Varshney (KAUST); Ahmed Abdelreheem (KAUST); Arpit Prajapati (Poly9); Suhail S Pothigara (Poly9 Inc); Peter Wonka (KAUST); Mohamed Elhoseiny (KAUST)* |
3673 | Exploring Gradient-based Multi-directional Controls in GANs | Zikun Chen (ModiFace Inc.)*; Ruowei Jiang (ModiFace Inc.); Brendan Duke (ModiFace Inc.); Han Zhao (University of Illinois at Urbana-Champaign); Parham Aarabi (ModiFace Inc.) |
3727 | OPD: Single-view 3D Openable Part Detection | Hanxiao Jiang (Simon Fraser University)*; Yongsen Mao (Simon Fraser University); Manolis Savva (Simon Fraser University); Angel X Chang (SFU) |
3757 | Unpaired Image Translation via Vector Symbolic Architectures | Justin Theiss (University of California, Berkeley)*; Jay Leverett (Meta); Daeil Kim (Meta); Aayush Prakash (Meta) |
3887 | CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer | Zijie Wu (Huazhong University of Science and Technology)*; Zhen Zhu (University of Illinois at Urbana-Champaign); Junping Du (Beijing University of Posts and Telecommunications); Xiang Bai (Huazhong University of Science and Technology) |
4028 | Decoupled Adversarial Contrastive Learning for Self-supervised Adversarial Robustness | Chaoning Zhang (KAIST)*; Kang Zhang (KAIST); Chenshuang Zhang (KAIST); Axi Niu (Northwestern Polytechnical University); Jiu Feng (Sichuan University); Chang D. Yoo (KAIST); In So Kweon (KAIST) |
4067 | Secrets of Event-Based Optical Flow | Shintaro Shiba (Keio University)*; Yoshimitsu Aoki (Keio University); Guillermo Gallego (TU Berlin) |
4122 | Synthesizing Light Field Video from Monocular Video | Shrisudhan Govindarajan (Indian Institute of Technology Madras); Prasan A Shedligeri (Indian Institute of Technology Madras)*; Sarah Sarah (Indian Institute of Technology, Madras); Kaushik Mitra (IIT Madras) |
4350 | LESS: Label-Efficient Semantic Segmentation for LiDAR Point Clouds | Minghua Liu (UCSD)*; Yin Zhou (Waymo); Charles R. Qi (Waymo); Boqing Gong (Google); Hao Su (UCSD); Dragomir Anguelov (Waymo) |
4399 | 3D-Aware Indoor Scene Synthesis with Depth Priors | Zifan SHI (HKUST)*; Yujun Shen (Dept. of IE, CUHK); Jiapeng Zhu (HKUST); Dit-Yan Yeung (HKUST); Qifeng Chen (HKUST) |
4417 | Restore Globally, Refine Locally: A Mask-Guided Scheme to Accelerate Super-Resolution Networks | xiaotao hu (Nankai University); Jun Xu (Nankai University)*; Shuhang Gu (ETH Zurich, Switzerland); Ming-Ming Cheng (Nankai University); Li Liu (the inception institute of artificial intelligence) |
4507 | Modeling Mask Uncertainty in Hyperspectral Image Reconstruction | jiamian wang (Santa Clara University)*; Yulun Zhang (ETH Zurich); Xin Yuan (Westlake University); Ziyi Meng (Kuaishou Technology); Zhiqiang Tao (Santa Clara University) |
4508 | Perceiving and Modeling Density for Image Dehazing | Tian Ye (Jimei University)*; Yunchen Zhang (China Design Group Ltd.Co); Erkang Chen (Jimei University); MingChao Jiang (JOYY.INC); Yun Liu (Southwest University); Liang Chen (Fujian Normal University); Sixiang Chen (Jimei University) |
4514 | ROBIN: A Benchmark for Robustness to Individual Nuisances in Real-World Out-of-Distribution Shifts | Bingchen Zhao (University of Edinburgh)*; Shaozuo Yu (Tongji University); Wufei Ma (Purdue University); Mingxin Yu (Peking University); Shenxiao Mei (Johns Hopkins University); Angtian Wang (Johns Hopkins University); Ju He (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics) |
4539 | Delving into Details: Synopsis-to-Detail Networks for Video Recognition | Shuxian Liang (Zhejiang University)*; Xu Shen (Alibaba Group); Jianqiang Huang (Alibaba Group); Xian-Sheng Hua (Alibaba Group) |
4547 | Bringing Rolling Shutter Images Alive with Dual Reversed Distortion | Zhihang Zhong (The University of Tokyo); Mingdeng Cao (Tsinghua University); Xiao Sun (Microsoft Research Asia); Zhirong Wu (Microsoft Research); Zhongyi Zhou (The University of Tokyo); Yinqiang Zheng (The University of Tokyo)*; Stephen Lin (Microsoft Research); Imari Sato (National Institute of Informatics) |
4591 | SimCC: a Simple Coordinate Classification Perspective for Human Pose Estimation | Yanjie Li (Tsinghua University)*; Sen Yang (Southeast University); Peidong Liu (Tsinghua University); Shoukui Zhang (Meituan); Yunxiao Wang (Tsinghua University); Zhicheng Wang (Nreal); Wankou Yang (Southeast University); Shu-Tao Xia (Tsinghua University) |
4610 | Generative Multiplane Images: Making a 2D GAN 3D-Aware | Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Fangchang Ma (Apple Inc.); David Güera (Apple Inc.); Zhile Ren (Apple Inc.); Alexander Schwing (UIUC); Alex Colburn (Apple Inc.) |
4640 | Self-supervised Social Relation Representation for Human Group Detection | Jiacheng Li (College of Intelligence and Computing, Tianjin University); Ruize Han (College of Intelligence and Computing, Tianjin University)*; Haomin Yan (Tianjin University); Zekun Qian (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China); Song Wang (University of South Carolina) |
4651 | Stripformer: Strip Transformer for Fast Image Deblurring | Fu-Jen Tsai (National Tsing Hua University)*; Yan-Tsung Peng (National Chengchi University); Yen-Yu Lin (National Yang Ming Chiao Tung University); Chung-Chi Tsai (Qualcomm Technology); Chia-Wen Lin (National Tsing Hua University) |
4678 | Deep Fourier-based Exposure Correction Network with Spatial-Frequency Interaction | Jie Huang (University of Science and Technology of China); Yajing Liu (USTC); Feng Zhao (University of Science and Technology of China)*; Keyu Yan (University of Science and Technology of China); Jinghao Zhang (University of Science and Technology of China); Yukun Huang (University of Science and Technology of China); man zhou (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China) |
4720 | Organic Priors in Non-Rigid Structure from Motion | Suryansh Kumar (ETH Zurich)*; Luc Van Gool (ETH Zurich) |
4806 | TEMOS: Generating diverse human motions from textual descriptions | Mathis Petrovich (Ecole des Ponts)*; Michael Black (Max Planck Institute for Intelligent Systems); Gul Varol (Ecole des Ponts ParisTech) |
4824 | Semantic-Aware Fine-Grained Correspondence | Yingdong Hu (Tsinghua University); Renhao Wang (Tsinghua University); Kaifeng Zhang (Tsinghua University); Yang Gao (Tsinghua University)* |
4847 | Layered Controllable Video Generation | Jiahui Huang (University of British Columbia)*; Yuhe Jin (University of British Columbia); Kwang Moo Yi (University of British Columbia); Leonid Sigal (University of British Columbia) |
4861 | GraphVid: It Only Takes a Few Nodes to Understand a Video | Eitan Kosman (Bosch AI)*; Dotan Di Castro (Bosch) |
4878 | Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection | Yu Hong (Zhejiang University); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence)*; Yong Ding (Zhejiang University) |
4901 | Adaptive Token Sampling For Efficient Vision Transformers | Mohsen Fayyaz (Microsoft)*; Soroush Abbasi Koohpayegani (University of Maryland Baltimore County); Farnoush Rezaei Jafari (Technische Universität Berlin); Sunando Sengupta (Microsoft); HAMID VAEZI JOZE (Microsoft); Eric Sommerlade (Microsoft); Hamed Pirsiavash (University of California Davis); Jürgen Gall (University of Bonn) |
4910 | Implicit Field Supervision For Robust Non-Rigid Shape Matching | Ramana S Sundararaman (Ecole Polytechnique)*; Gautam Pai (École Polytechnique); Maks Ovsjanikov (Ecole polytechnique) |
4916 | NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing | Bangbang Yang (Zhejiang University); Chong Bao (Zhejiang University); Junyi Zeng (Zhejiang University); Hujun Bao (Zhejiang University); Yinda Zhang (Google); Zhaopeng Cui (Zhejiang University); Guofeng Zhang (Zhejiang University)* |
4919 | KXNet: A Model-Driven Deep Neural Network for Blind Super-Resolution | Jiahong Fu (Xi’an Jiaotong University)*; Hong Wang (Jarvis Lab, Tencent); Qi Xie (Xi’an Jiaotong University); Qian Zhao (Xi’an Jiaotong University); Deyu Meng (Xi’an Jiaotong University); Zongben Xu (Xi’an Jiaotong University) |
4989 | RealFlow: EM-based Realistic Optical Flow Datasets Generation from Videos | Yunhui Han (THU;Megvii); Kunming Luo (Megvii); Ao Luo (Megvii); Jiangyu Liu (megvii inc); Haoqiang Fan (Megvii Inc(face++)); Guiming Luo (School of Software, Tsinghua University); Shuaicheng Liu (UESTC; Megvii)* |
5010 | Semi-supervised Object Detection via Virtual Category Learning | Changrui Chen (University of Warwick); Kurt Debattista (University of Warwick, UK); Jungong Han (Aberystwyth University)* |
5080 | PrivHAR: Recognizing Human Actions From Privacy-preserving Lens | Carlos Hinojosa (Universidad Industrial de Santander)*; Miguel A Marquez (UIS Colombia); Henry Arguello (Universidad Industrial Santander); Ehsan Adeli (Stanford University); Li Fei-Fei (Stanford University); Juan Carlos Niebles (Salesforce & Stanford University) |
5096 | Solution Space Analysis of Essential Matrix based on Algebraic Error Minimization | Gaku Nakano (NEC Corporation)* |
5100 | EvAC3D: From Event-based Apparent Contours to 3D Models via Continuous Visual Hulls | Ziyun Wang (University of Pennsylvania)*; Kenneth Chaney (University of Pennsylvania); Kostas Daniilidis (University of Pennsylvania) |
5142 | DCCF: Deep Comprehensible Color Filter Learning Framework for High-Resolution Image Harmonization | Ben Xue (Peking University); Shenghui Ran (Alibaba Group); Quan Chen (Alibaba Group)*; Rongfei Jia (Alibaba Group); Binqiang Zhao (Alibaba); Xing Tang (Alibaba Group) |
5226 | UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling | Zhengyuan Yang (Microsoft)*; Zhe Gan (Microsoft); Jianfeng Wang (Microsoft); Xiaowei Hu (Microsoft); Faisal Ahmed (Microsoft); Zicheng Liu (Microsoft); Yumao Lu (Microsoft); Lijuan Wang (Microsoft) |
5242 | Grasp’D: Differentiable Contact-rich Grasp Synthesis for Multi-fingered Hands | Dylan Turpin (University of Toronto)*; Liquan Wang (University of Toronto); Eric Heiden (University of Southern California); Yun-Chun Chen (University of Toronto ); Miles Macklin (NVIDIA); Stavros Tsogkas (University of Toronto); Sven Dickinson (University of Toronto); Animesh Garg (University of Toronto, Vector Institute, Nvidia) |
5263 | The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning | Jack Hessel (Allen Institute for AI)*; Jena D Hwang (Allen Institute for AI); Jae Sung Park (University of Washington); Rowan Zellers (University of Washington); Chandra Bhagavatula (AllenAI); Anna Rohrbach (UC Berkeley); Kate Saenko (Boston University); Yejin Choi (University of Washington) |
5271 | Cross-Modal Knowledge Transfer Without Task-Relevant Source Data | SK MIRAJ AHMED (University of California Riverside); Suhas Lohit (Mitsubishi Electric Research Laboratories)*; Kuan-Chuan Peng (Mitsubishi Electric Research Laboratories (MERL)); Michael J Jones (MERL); Amit K. Roy-Chowdhury (University of California, Riverside) |
5285 | Approximate Differentiable Rendering with Algebraic Surfaces | Leonid Keselman (Carnegie Mellon University)*; Martial Hebert (Carnegie Mellon School of Computer Science) |
5303 | Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments | Jacob Krantz (Oregon State University)*; Stefan Lee (Oregon State University) |
5350 | Uncertainty-DTW for Time Series and Sequences | Lei Wang (The Australian National University); Piotr Koniusz (ANU College of Engineering and Computer Science)* |
5358 | Affine Correspondences between Multi-Camera Systems for 6DOF Relative Pose Estimation | Banglei Guan (National University of Defense Technology)*; Ji Zhao (Huazhong University of Science and Technology) |
5415 | Improving Self-supervised Lightweight Model Learning via Hard-aware Metric Distillation | Hao Liu (Beijing Institute of Technology); Mang Ye (Wuhan University)* |
5422 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | Chenfei Wu (Microsoft)*; Jian Liang (Peking University); Lei Ji (Microsoft); Fan Yang (MSRA); Yuejian Fang (Peking University); Daxin Jiang (Microsoft, Beijing, China); Nan Duan (Microsoft Research) |
5512 | BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | Ye Yu (Microsoft)*; Jialin Yuan (Oregon State University); Gaurav Mittal (Microsoft); Li Fuxin (Oregon State University); Mei Chen (Microsoft) |
5622 | DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras | Ruizhi Shao (Tsinghua University); Zerong Zheng (Tsinghua University); Hongwen Zhang (Tsinghua University); Jingxiang Sun (University of Illinois Urbana-Champaign); Yebin Liu (Tsinghua University)* |
5667 | The Challenges of Continuous Self-Supervised Learning | Senthil Purushwalkam (Carnegie Mellon University); Pedro Morgado (CMU)*; Abhinav Gupta (CMU/FAIR) |
5670 | Deep Radial Embedding for Visual Sequence Learning | Yuecong Min (Institute of Computing Technology, Chinese Academy of Sciences); Peiqi Jiao (Institute of Computing Technology, Chinese Academy of Sciences); Yanan Li (Xiaomi); Wang Xiaotao (Xiaomi); LEI LEI (Xiaomi); Xiujuan Chai (Agricultural Information Institute, Chinese); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences)* |
5713 | Shape-Pose Disentanglement using SE(3)-equivariant Vector Neurons | Oren Katzir (Tel Aviv University)*; Dani Lischinski (The Hebrew University of Jerusalem); Danny Cohen-Or (Tel Aviv University) |
5763 | 3D Object Detection with a Self-supervised Lidar Scene Flow Backbone | Emeç Erçelik (Technical University of Munich)*; Ekim Yurtsever (The Ohio State University); Mingyu Liu (TUM); Zhijie Yang (Technical University of Munich); Hanzhen Zhang (TUM); Pınar Topçam (Technical University of Munich); Maximilian Listl (Technical University of Munich); Yılmaz Kaan Çaylı (Technical University of Munich); Alois C. Knoll (Robotics and Embedded Systems) |
5991 | FH-Net: A Fast Hierarchical Network for Scene Flow Estimation on Real-world Point Clouds | lihe Ding (Beijing Institute of Technology)*; Shaocong Dong (Beijing Institute of Technology); Tingfa Xu (Beijing Institute of Technology); xinli Xu (Beijing Institute of Technology); Jie Wang (Beijing Institute of Technology); Jianan Li (Beijing Institute of Technology) |
6108 | Vote from the Center: 6 DoF Pose Estimation in RGB-D Images by Radial Keypoint Voting | Yangzheng Wu (Queen’s University)*; Mohsen Zand (Queen’s University); Ali Etemad (Queen’s University); Michael Alan Greenspan (Queen’s University) |
6132 | Flow graph to Video Grounding for Weakly-supervised Multi-Step Localization | NIKITA DVORNIK (Samsung)*; Isma Hadji (Samsung AI Center – Toronto); Hai X Pham (Samsung AI Center); Dhaivat Bhatt (Samsung); Brais Martinez (Samsung AI Center); Afsaneh Fazly (SAIC Toronto); Allan D Jepson (Samsung Toronto AIC) |
6143 | Neural Radiance Transfer Fields for Relightable Novel-view Synthesis with Global Illumination | Linjie Lyu (MPII)*; Ayush Tewari (MIT); Thomas Leimkuehler (MPI Informatik); Marc Habermann (Max Planck Institute for Informatics); Christian Theobalt (MPI Informatik) |
6180 | Learning Topological Interactions for Multi-Class Medical Image Segmentation | Saumya Gupta (Stony Brook University)*; Xiaoling Hu (Stony Brook University); James Kaan (Stony Brook University); Michael Jin (Stony Brook University Hospital); Mutshipay Christian Mpoy (SUNY Stony Brook Medicine); Katherine Chung (Stony Brook University Hospital); Gagandeep Singh (RWJBarnabas Health); Mary Saltz (Stony Brook); Tahsin Kurc (Stony Brook University); Joel Saltz (Stony Brook University); APOSTOLOS K TASSIOPOULOS (Stony Brook University); Prateek Prasanna (Stony Brook University); Chao Chen (Stony Brook University) |
6185 | Look Both Ways: Self-Supervising Driver Gaze Estimation and Road Scene Saliency | Isaac H Kasahara (University of Minnesota); Simon Stent (Toyota Research Institute); Hyun Soo Park (The University of Minnesota)* |
6191 | ObjectBox: From Centers to Boxes for Anchor-Free Object Detection | Mohsen Zand (Queen’s University)*; Ali Etemad (Queen’s University); Michael Alan Greenspan (Queen’s University) |
6193 | Unsupervised Segmentation in Real-World Images via Spelke Object Inference | Honglin Chen (Stanford University); Rahul M V (Stanford University); Yoni I Friedman (MIT); Jiajun Wu (Stanford University); Joshua Tenenbaum (MIT); Daniel Yamins (Stanford University); Daniel Bear (Stanford University)* |
6243 | A Dense Material Segmentation Dataset for Indoor and Outdoor Scene Parsing | Paul Upchurch (Apple)*; Ransen Niu (Apple) |
6295 | Pixel-wise Energy-biased Abstention Learning for Anomaly Segmentation on Complex Urban Driving Scenes | Yu Tian (Australian Institute for Machine Learning, University of Adelaide ); Yuyuan Liu (University of Adelaide); Guansong Pang (Singapore Management University)*; Fengbei Liu (University of Adelaide); Yuanhong Chen (University of Adelaide); Gustavo Carneiro (University of Adelaide) |
6326 | Identifying Hard Noise in Long-Tailed Sample Distribution | Xuanyu Yi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Joo-Hwee Lim (Institute for Infocomm Research); Hanwang Zhang (Nanyang Technological University) |
6515 | PressureVision: Estimating Hand Pressure from a Single RGB Image | Patrick L Grady (Georgia Institute of Technology)*; Chengcheng Tang (Facebook Reality Labs); Samarth Brahmbhatt (Intel); Christopher D Twigg (Meta); Chengde Wan (Facebook Reality Lab); James Hays (Georgia Institute of Technology, USA); Charlie Kemp (Georgia Institute of Technology) |
6568 | PACTran: PAC-Bayesian Metrics for Estimating the Transferability of Pretrained Models to Classification Tasks | Nan Ding (Google)*; Xi Chen (Google Research); Tomer Levinboim (Google); Soravit Changpinyo (Google Research); Radu Soricut (Google) |
6571 | Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs | Sameera Ramasinghe (University of Adelaide)*; Simon Lucey (University of Adelaide) |
6672 | Pose for Everything: Towards Category-Agnostic Pose Estimation | Lumin XU (The Chinese University of Hong Kong)*; Sheng Jin (The University of Hong Kong); Wang ZENG (The Chinese University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong) |
6739 | UIA-ViT: Unsupervised Inconsistency-Aware Method based on Vision Transformer for Face Forgery Detection | Wanyi Zhuang (University of Science and Technology of China); Qi Chu (University of Science and Technology of China)*; Zhentao Tan (University of Science and Technology of China); Qiankun Liu (University of Science and Technology of China); Haojie Yuan (University of Science and Technology of China); Changtao Miao (University of Science and Technology of China); Zixiang Luo (University of Science and Technology of China); Nenghai Yu (University of Science and Technology of China) |
7092 | PREF: Predictability Regularized Neural Motion Fields | Liangchen Song (University at Buffalo)*; Xuan Gong (University at Buffalo); Benjamin Planche (United Imaging Intelligence); Meng Zheng (United Imaging Intelligence); David Doermann (University at Buffalo); Junsong Yuan (State University of New York at Buffalo, USA); Terrence Chen (United Imaging Intelligence); Ziyan Wu (United Imaging Intelligence) |
7215 | Bi-PointFlowNet: Bidirectional Learning for Point Cloud Based Scene Flow Estimation | WENCAN CHENG (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University)* |
7248 | Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration | Aditi Basu Bal (Florida State University)*; Ramy A Mounir (University of South Florida); Sathyanarayanan N Aakur (OK State); Sudeep Sarkar (University of South Florida, Tampa); Anuj Srivastava (Florida State University) |
7302 | Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search: Tight or Not | Liangzu Peng (Johns Hopkins University)*; Mahyar Fazlyab (Johns Hopkins University); Rene Vidal (Johns Hopkins University, USA) |
7345 | Lottery Ticket Hypothesis for Spiking Neural Networks | Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale university); Ruokai Yin (Yale University); Priyadarshini Panda (Yale University) |
7360 | Multi-domain Learning for Updating Face Anti-spoofing Models | Xiao Guo (Michigan State University)*; Yaojie Liu (Google Research); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University) |
7402 | Towards Realistic Semi-Supervised Learning | Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Mubarak Shah (University of Central Florida) |
7414 | Unsupervised Pose-aware Part Decomposition for Man-made Articulated Objects | Yuki Kawana (The University of Tokyo)*; Yusuke Mukuta (The University of Tokyo); Tatsuya Harada (The University of Tokyo / RIKEN) |
7464 | Cartoon Explanations of Image Classifiers | Stefan Kolek (LMU)*; Duc Anh Nguyen (LMU Munich); Ron Levie (Technion); Joan Bruna (Courant Institute of Mathematical Sciences, NYU, USA); Gitta Kutyniok (Ludwig Maximilian University of Munich) |
7808 | RRSR: Reciprocal Reference-based Image Super-Resolution with Progressive Feature Alignment and Selection | Lin Zhang (CASIA); Xin Li (Baidu); Dongliang He (Baidu)*; Fu Li (Baidu); Yili Wang (Tsinghua University); Zhaoxiang Zhang (Chinese Academy of Sciences, China) |
7838 | Gaussian Activated Neural Radiance Fields for High Fidelity Reconstruction & Pose Estimation | Shin-Fang Chng (The University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Jamie Sherrah (AIML); Simon Lucey (University of Adelaide) |
7886 | Unbiased Gradient Estimation for Differentiable Surface Splatting via Poisson Sampling | Jan U. Müller (University of Bonn)*; Michael Weinmann (TU Delft); Reinhard Klein (University of Bonn) |
8098 | “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations | Niv Cohen (The Hebrew University of Jerusalem)*; Rinon Gal (Tel Aviv University); Eli Meirom (NVIDIA Research); Gal Chechik (NVIDIA); Yuval Atzmon (NVIDIA Research) |

Paper ID | Paper Title | Authors |
---|---|---|
8 | Learning Uncoupled-Modulation CVAE for 3D Action-Conditioned Human Motion Synthesis | Chongyang Zhong (Institute of Computing Technology, Chinese Academy of Sciences)*; Lei Hu (Institute of Computing Technology, Chinese Academy of Sciences ); Zihao Zhang (Institute of Computing Technology, Chinese Academy of Sciences); Shihong Xia (institute of computing technology of the Chinese academy of sciences) |
16 | Generative Domain Adaptation for Face Anti-Spoofing | Qianyu Zhou (Shanghai Jiao Tong University)*; Ke-Yue Zhang (YouTu Lab, Tencent); Taiping Yao (Tencent YouTu); Ran Yi (Shanghai Jiao Tong University); Kekai Sheng (Youtu Lab, Tencent Inc.); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University) |
19 | Learning Depth from Focus in the Wild | Changyeon Won (GIST)*; Hae-Gon Jeon (GIST) |
34 | Relighting4D: Neural Relightable Human from Videos | Zhaoxi Chen (Nanyang Technological University )*; Ziwei Liu (Nanyang Technological University) |
46 | PPT: token-Pruned Pose Transformer for monocular and multi-view human pose estimation | Haoyu Ma (University of California, Irvine)*; Zhe Wang (UC-Irvine); Yifei Chen (Tencent); Deying Kong (university of california, irvine); Liangjian Chen (Reality Labs); Xingwei Liu (University of California Irvine); Xiangyi Yan (University of California, Irvine); Hao Tang (University of California Irvine); Xiaohui Xie (University of California, Irvine) |
52 | Understanding the Dynamics of DNNs Using Graph Modularity | Yao Lu (Zhejiang University of Technology)*; Wen Yang (Zhejiang University of Technology); Yunzhe Zhang (Zhejiang University of Technology); Zuohui Chen (Zhejiang University of Technology); Jinyin Chen (Zhejiang University of Technology); Qi Xuan (Zhejiang University of Technology); Zhen Wang (Northwestern Polytechnical University); Xiaoniu Yang (Zhejiang University of Technology; Science and Technology on Communication Information Security Control Laboratory) |
65 | Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective | Quan Cui (Waseda University)*; Bingchen Zhao (University of Edinburgh); Zhao-Min Chen (NanJing University); Borui Zhao (Megvii Technology); Renjie Song (Megvii Inc.); Boyan Zhou (ByteDance); Jiajun Liang (Megvii); Osamu Yoshie (Waseda University) |
69 | Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World | Zheng Dang (EPFL)*; Lizhou Wang (Xi’an Jiaotong University); Yu Guo (School of Software Engineering, Xi’an Jiaotong University); Mathieu Salzmann (EPFL) |
74 | AvatarPoser: Articulated Full-Body Pose Tracking from Sparse Motion Sensing | Jiaxi Jiang (ETH Zurich)*; Paul Streli (ETH Zurich); Huajian Qiu (EPFL); Andreas R Fender (ETH Zurich); Larissa Laich (Facebook Reality Labs); Patrick Snape (Meta); Christian Holz (ETH Zürich) |
75 | Knowledge Condensation Distillation | chenxin li (Xiamen University)*; Mingbao Lin (Xiamen University, China); Zhiyuan Ding (Xiamen University); Nie Lin (Hunan University); Yihong Zhuang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Liujuan Cao (Xiamen University) |
83 | CAR: Class-aware Regularizations for Semantic Segmentation | Ye Huang (University of Technology Sydney)*; Di Kang (Tencent); Liang Chen (Fujian Normal University); Xuefei Zhe (Tencent AI lab); Wenjing Jia (University of Technology Sydney); Linchao Bao (Tencent AI Lab); Xiangjian He (University of Nottingham Ningbo China) |
86 | Style-Hallucinated Dual Consistency Learning for Domain Generalized Semantic Segmentation | Yuyang Zhao (National University of Singapore)*; Zhun Zhong (University of Trento); Na Zhao (NUS); Nicu Sebe (University of Trento); Gim Hee Lee (National University of Singapore) |
88 | Reducing Information Loss for Spiking Neural Networks | Yufei Guo (The Second Academy of China Aerospace Science and Industry Corporation)*; Yuanpei Chen (X LAB, The Second Academy of CASIC, Beijing); Liwen Zhang (X Lab, the Second Academy of CASIC, Beijing); YingLei Wang (CASIC); Xiaode Liu (X Lab, The Second Academy of China Aerospace Science and Industry Corporation); Xinyi Tong (The Second Academy of China Aerospace Science and Industry Corporation); Yuanyuan Ou (Chongqing University); Xuhui Huang (X Lab, The Second Academy of CASIC); Zhe Ma (Xlab, the Second Academy of CASIC, Beijing) |
95 | Real-Time Intermediate Flow Estimation for Video Frame Interpolation | Zhewei Huang (MEGVII)*; Tianyuan Zhang (Carnegie Mellon University); Wen Heng (Megvii inc.); Boxin Shi (Peking University); Shuchang Zhou (MEGVII Technology) |
101 | Class-incremental Novel Class Discovery | Subhankar Roy (University of Trento); Mingxuan Liu (University of Trento); Zhun Zhong (University of Trento)*; Nicu Sebe (University of Trento); Elisa Ricci (University of Trento) |
103 | PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation | Jing He (Xiamen university)*; Yiyi Zhou (Xiamen University); Qi Zhang (Tencent); Jun Peng (Xiamen University); Yunhang Shen (Xiamen University); Xiaoshuai Sun (Xiamen University); Chao Chen (Youtu Laboratory); Rongrong Ji (Xiamen University, China) |
107 | Minimal Neural Atlas: Parameterizing Complex Surfaces with Minimal Charts and Distortion | Weng Fei Low (National University of Singapore)*; Gim Hee Lee (National University of Singapore) |
121 | Contrastive Prototypical Network with Wasserstein Confidence Penalty | Haoqing Wang (Peking University)*; Zhi-Hong Deng (Peking University) |
123 | Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain | Jiazhen Ji (Tencent)*; Huan Wang (Xiamen University); Yuge Huang (Tencent YouTu); Jiaxiang Wu (Tencent); Xingkun Xu (Tencent); Shouhong Ding (Tencent); ShengChuan Zhang (Xiamen University); Liujuan Cao (Xiamen University); Rongrong Ji (Xiamen University, China) |
127 | An End-to-End Transformer Model for Crowd Localization | Dingkang Liang (Huazhong University of Science and Technology)*; Wei Xu (Beijing University of Posts and Telecommunications); Xiang Bai (Huazhong University of Science and Technology) |
132 | Deformable Feature Aggregation for Dynamic Multi-Modal 3D Object Detection | Zehui Chen (University of Science and Technology of China); Zhenyu Li (Harbin Institute of Technology); Shiquan Zhang (SenseTime Research); Liangji Fang (Sensetime Research); Qinhong Jiang (SenseTime Research; Shanghai AI Laboratory); Feng Zhao (University of Science and Technology of China)* |
140 | Masked Generative Distillation | Zhendong Yang (Graduate school at ShenZhen, Tsinghua university)*; Zhe Li (Bytedance Inc.); Shao Mingqi (Graduate school at ShenZhen, Tsinghua university); Dachuan Shi (Graduate school at ShenZhen, Tsinghua University); Zehuan Yuan (Bytedance.Inc); Chun Yuan (Graduate school at ShenZhen, Tsinghua university) |
145 | Saliency Hierarchy Modeling via Generative Kernels for Salient Object Detection | Wenhu Zhang (Zhejiang University)*; Liangli Zheng (Zhejiang University); Huanyu Wang (Zhejiang University); Xintian Wu (Zhejiang University); Xi Li (Zhejiang University) |
154 | Tip-Adapter: Training-free Adaption of CLIP for Few-shot Classification | Renrui Zhang (Shanghai AI Lab)*; Zhang Wei (Shanghai AI-Lab); Rongyao Fang (Chinese University of Hong Kong); Peng Gao (Chinese university of hong kong); Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (SenseTime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong) |
160 | Temporal Lift Pooling for Continuous Sign Language Recognition | Lianyu Hu (Tianjin University)*; Liqing Gao (College of Intelligence and Computing, Tianjin University); Zekang Liu (College of Intelligence and Computing, Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China) |
167 | MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes | Yang Jiao (Fudan University)*; Shaoxiang Chen (Fudan University); Zequn Jie (Meituan inc.); Jingjing Chen (Fudan University); Lin Ma (Meituan); Yu-Gang Jiang (Fudan University) |
171 | JPEG Artifacts Removal via Contrastive Representation Learning | Xi Wang (University of Science and Technology of China); Xueyang Fu (University of Science and Technology of China)*; Yurui Zhu (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China) |
180 | Tackling Long-Tailed Category Distribution Under Domain Shifts | Xiao Gu (Imperial College London)*; Yao Guo (Shanghai Jiao Tong University); Zeju Li (Imperial College London); Jianing Qiu (Imperial College London); DOU QI (The Chinese University of Hong Kong); Yuxuan Liu (Institute of Medical Robotics, Shanghai Jiao Tong University); Benny P L Lo (Imperial College London); Guang-Zhong Yang (SJTU) |
184 | WeLSA: Learning To Predict 6D Pose From Weakly Labeled Data Using Shape Alignment | Shishir Reddy Vutukur (TU Munich / Siemens Technology)*; Ivan Shugurov (TU Munich / Siemens Corporate Technology); Benjamin Busam (Technical University of Munich); ANDREAS HUTTER (Siemens Corporate Technology, Germany); Slobodan Ilic (TUM) |
190 | Fine-grained Data Distribution Alignment for Post-Training Quantization | Yunshan Zhong (xiamen university)*; Mingbao Lin (Xiamen University, China); Mengzhao Chen (Xiamen University); Ke Li (Tencent); Yunhang Shen (Xiamen University); Fei Chao (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Rongrong Ji (Xiamen University, China) |
192 | Few-shot Single-view 3D Reconstruction with Memory Prior Contrastive Network | Zhen Xing (Fudan University)*; Yijiang Chen (Fudan University); Zhixin Ling (Fudan University); Xiangdong Zhou (Fudan University); Yu Xiang (The University of Texas at Dallas) |
194 | ExtrudeNet: Unsupervised Inverse Sketch-and-Extrude for Shape Parsing | Daxuan Ren (Nanyang Technological University)*; Jianmin Zheng (Nanyang Technological University); Jianfei Cai (Monash University); jiatong j li (Sensetime); Junzhe Zhang (Nanyang Technological University) |
196 | P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation | Wenkang Shan (Peking University)*; Zhenhua Liu (Peking University); xinfeng zhang (University of Chinese Academy of Sciences); Shanshe Wang (Peking University); Siwei Ma (Peking University, China); Wen Gao (PKU) |
205 | Contrast-Phys: Unsupervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast | Zhaodong Sun (University of Oulu)*; Xiaobai Li (University of Oulu) |
222 | Panoptic Scene Graph Generation | Jingkang Yang (Nanyang Technological University)*; Yi Zhe Ang (Nanyang Technological University); Zujin GUO (Nanyang Technological University); Kaiyang Zhou (Nanyang Technological University); Wayne Zhang (SenseTime Research); Ziwei Liu (Nanyang Technological University) |
247 | StyleSwap: Style-Based Generator Empowers Robust Face Swapping | Zhiliang Xu (Baidu Inc.); Hang Zhou (The Chinese University of Hong Kong)*; Zhibin Hong (Baidu Inc.); Ziwei Liu (Nanyang Technological University); Jiaming Liu (Baidu Inc.); zhizhi guo (Department of Computer Vision Technology (VIS), Baidu Inc); Junyu Han (Baidu Inc.); jingtuo liu (baidu); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu) |
248 | Boosting Event Stream Super-Resolution with A Recurrent Neural Network | Wenming Weng (University of Science and Technology of China)*; Yueyi Zhang (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China) |
249 | Unknown-Oriented Learning for Open Set Domain Adaptation | jie liu (City University of Hong Kong)*; Xiaoqing Guo (City University of Hong Kong); Yixuan YUAN (City University of Hong Kong) |
255 | Unpaired Deep Image Dehazing Using Contrastive Disentanglement Learning | Xiang Chen (Nanjing University of Science and Technology)*; Zhentao Fan (Shenyang Aerospace University); Pengpeng Li (Dalian Polytechnic University); Longgang Dai (Shenyang Aerospace University); Caihua Kong (Shenyang Aerospace University); Zhuoran Zheng (Nanjing University of Science and Technology ); Yufeng Huang (Shenyang Aerospace University); Yufeng Li (Shenyang Aerospace University) |
263 | Check and Link: Pairwise Lesion Correspondence Guides Mammogram Mass Detection | Ziwei Zhao (Peking University)*; Dong Wang (Peking University); Yihong Chen (Peking University); Ziteng Wang (Yizhun-ai); Liwei Wang (Peking University) |
265 | Generative Subgraph Contrast for Self-Supervised Graph Representation Learning | yuehui han (njust)*; Le Hui (Nanjing University of Science and Technology); Haobo Jiang (Nanjing University of Science and Technology); Jianjun Qian (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology) |
267 | DVS-Voltmeter: Stochastic Process-based Event Simulator for Dynamic Vision Sensors | SongNan Lin (Nanyang Technological University)*; Ye Ma (McGill University); Zhenhua Guo (Alibaba Group); Bihan Wen (Nanyang Technological University) |
268 | Prototype-Guided Continual Adaptation for Class-Incremental Unsupervised Domain Adaptation | Hongbin Lin (South China University of Technology); Yifan Zhang (National University of Singapore); Zhen Qiu (South China University of Technology); Shuaicheng Niu (South China University of Technology); Chuang Gan (MIT-IBM Watson AI Lab); Yanxia Liu (South China University of Technology); Mingkui Tan (South China University of Technology)* |
283 | SiRi: A Simple Selective Retraining Mechanism for Transformer-based Visual Grounding | Mengxue Qu (Beijing Jiaotong University)*; Yu Wu (Princeton University); Wu Liu (AI Research of JD.com); Qiqi Gong (Beijing Jiaotong University); Xiaodan Liang (Sun Yat-sen University); Olga Russakovsky (Princeton University); Yao Zhao (Beijing Jiaotong University); Yunchao Wei (UTS) |
287 | Benchmarking Omni-Vision Representation through the Lens of Visual Realms | Yuanhan Zhang (Nanyang Technological University); Zhenfei Yin (Sensetime); Jing Shao (Sensetime); Ziwei Liu (Nanyang Technological University)* |
291 | Paint2Pix: Interactive Painting based Progressive Image Synthesis and Editing | Jaskirat Singh (Australian National University)*; Liang Zheng (Australian National University); Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.) |
296 | BEAT: A Large-Scale Semantic and Emotional Multi-Modal Dataset for Conversational Gestures Synthesis | Haiyang Liu (The University of Tokyo)*; Zihao Zhu (Keio University); Naoya Iwamoto (Huawei Technologies Japan K.K.); Yichen Peng (Japan Advanced Institute of Science and Technology); Zhengqing Li (Huawei Japan K.K.); YOU ZHOU (Tokyo Research Center, Huawei); Elif Bozkurt (Huawei Turkey R&D Center, Istanbul, Turkey); Bo Zheng (Huawei) |
300 | Active Pointly-Supervised Instance Segmentation | Chufeng Tang (Tsinghua University)*; Lingxi Xie (Huawei Inc.); Gang Zhang (Tsinghua University); xiaopeng zhang (Huawei Cloud EI ); Qi Tian (Huawei Cloud & AI); Xiaolin Hu (Tsinghua University) |
303 | DecoupleNet: Decoupled Network for Domain Adaptive Semantic Segmentation | Xin Lai (The Chinese University of Hong Kong)*; Zhuotao Tian (The Chinese University of Hong Kong); Xiaogang XU (The Chinese University of Hong Kong); Yingcong Chen (Hong Kong University of Science and Technology); Shu Liu (SmartMore); Hengshuang Zhao (University of Oxford); Liwei Wang (CUHK); Jiaya Jia (Chinese University of Hong Kong) |
315 | ByteTrack: Multi-Object Tracking by Associating Every Detection Box | Yifu Zhang (Huazhong University of Science and Technology); Peize Sun (The University of Hong Kong); Yi Jiang (Bytedance); Dongdong Yu (ByteDance Inc.); Fucheng Weng (Huazhong University of Science and Technology); Zehuan Yuan (Bytedance.Inc); Ping Luo (The University of Hong Kong); Wenyu Liu (Huazhong University of Science and Technology); Xinggang Wang (Huazhong University of Science and Technology)* |
317 | Robust Multi-Object Tracking by Marginal Inference | Yifu Zhang (Huazhong University of Science and Technology); Chunyu Wang (Microsoft Research asia); Xinggang Wang (Huazhong University of Science and Technology)*; Wenjun Zeng (EIT Institute for Advanced Study); Wenyu Liu (Huazhong University of Science and Technology) |
322 | Doubly-Fused ViT: Fuse Information from Vision Transformer Doubly with Local Representation | Li Gao (Wuhan University)*; Dong Nie (UNC); Bo Li (Alibaba Group); Xiaofeng Ren (alibaba group) |
326 | CATRE: Iterative Point Clouds Alignment for Category-level Object Pose Refinement | Xingyu Liu (Tsinghua University); Gu Wang (JD.COM); Yi Li (University of Washington); Xiangyang Ji (Tsinghua University)* |
334 | Spatiotemporal Self-attention Modeling with Temporal Patch Shift for Action Recognition | Wangmeng Xiang (The Hong Kong Polytechnic University)*; Chao Li (Alibaba); Biao Wang (Alibaba); Xihan Wei (Alibaba); Xian-Sheng Hua (Damo Academy, Alibaba Group); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
339 | Efficient Long-Range Attention Network for Image Super-resolution | Xindong Zhang (The Hong Kong Polytechnic University)*; Hui Zeng (OPPO); Shi Guo (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
343 | DID-M3D: Decoupling Instance Depth for Monocular 3D Object Detection | Liang Peng (ZJU)*; Xiaopei Wu (Zhejiang University); Zheng Yang (FABU); Haifeng Liu (ZJU); Deng Cai (ZJU) |
349 | FlowFormer: A Transformer Architecture for Optical Flow | Zhaoyang Huang (Chinese University of Hong Kong)*; Xiaoyu Shi (CUHK); Chao Zhang (Samsung Telecommunication Research Institute); Qiang Wang (Samsung Research China, Beijing); Ka Chun Cheung (Nvidia); Hongwei Qin (Sensetime); Jifeng Dai (SenseTime); Hongsheng Li (The Chinese University of Hong Kong) |
357 | Coarse-to-Fine Sparse Transformer for Hyperspectral Image Reconstruction | Yuanhao Cai (Tsinghua University, Tsinghua Shenzhen International Graduate School); Jing Lin (Tsinghua University, Tsinghua Shenzhen International Graduate School)*; Xiaowan Hu (Tsinghua University, Tsinghua Shenzhen International Graduate School); Haoqian Wang (Tsinghua Shenzhen International Graduate School, Tsinghua University); Xin Yuan (Westlake University); Yulun Zhang (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich) |
358 | An Embedded Feature Whitening Approach to Deep Neural Network Optimization | Hongwei Yong (The Hong Kong Polytechnic University)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
361 | Optimization over Disentangled Encoding: Unsupervised Cross-Domain Point Cloud Completion via Occlusion Factor Manipulation | Jingyu Gong (Shanghai Jiao Tong University)*; Fengqi Liu (Shanghai Jiao Tong University); Jiachen Xu (Shanghai Jiao Tong University); Min Wang (Sensetime Group); Xin Tan (Shanghai Jiao Tong University); Zhizhong Zhang (East China Normal University); Ran Yi (Shanghai Jiao Tong University); Haichuan Song (East China Normal University); Yuan Xie (East China Normal University); Lizhuang Ma (Shanghai Jiao Tong University) |
362 | Source-Free Domain Adaptation with Contrastive Domain Alignment and Self-supervised Exploration for Face Anti-Spoofing | Yuchen Liu (Shanghai Jiao Tong University)*; Yabo Chen (Shanghai Jiao Tong University); Wenrui Dai (Shanghai Jiao Tong University); Mengran Gou (Qualcomm); Chun-Ting Huang (Qualcomm); Hongkai Xiong (Shanghai Jiao Tong University) |
368 | MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection | Xuesong Chen (The Chinese University of Hong Kong)*; Shaoshuai Shi (MPI Informatics); Benjin Zhu (MEGVII); Ka Chun Cheung (Nvidia); Hang Xu (Huawei Noah’s Ark Lab); Hongsheng Li (The Chinese University of Hong Kong) |
379 | SdAE: Self-distillated Masked Autoencoder | Yabo Chen (Shanghai Jiao Tong University ); Yuchen Liu (Shanghai Jiao Tong university); Dongsheng Jiang (Huawei Cloud & AI); xiaopeng zhang (Huawei Cloud EI )*; Wenrui Dai (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University); Qi Tian (Huawei Cloud & AI) |
383 | A Transformer-based Decoder for Semantic Segmentation with Multi-level Context Mining | Bowen Shi (Shanghai Jiao Tong University)*; Dongsheng Jiang (Huawei Cloud & AI); xiaopeng zhang (Huawei Cloud EI ); Han Li (Shanghai Jiao Tong University); Wenrui Dai (Shanghai Jiao Tong University); Junni Zou (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University); Qi Tian (Huawei Cloud & AI) |
399 | Graph-constrained Contrastive Regularization for Semi-weakly Volumetric Segmentation | Simon Reiß (Karlsruhe Institute of Technology)*; Constantin Marc Seibold (Karlsruhe Institute of Technology); Alexander Freytag (Carl Zeiss AG, Jena, Germany); Rodner Erik (University of Applied Sciences Berlin); Rainer Stiefelhagen (Karlsruhe Institute of Technology) |
401 | Improving Vision Transformers by Revisiting High-frequency Components | Jiawang Bai (Tsinghua University)*; Li Yuan (Peking University); Shu-Tao Xia (Tsinghua University); Shuicheng Yan (Sea AI Labs); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent) |
405 | Adaptive Co-Teaching for Unsupervised Monocular Depth Estimation | Weisong Ren (Dalian University of Technology); Lijun Wang (Dalian University of Technology)*; Yongri Piao (Dalian University of Technology); Miao Zhang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); Ting Liu (Alibaba) |
408 | FurryGAN: High quality foreground-aware image synthesis | Jeongmin Bae (Yonsei University); Mingi Kwon (Yonsei University); Youngjung Uh (Yonsei University)* |
433 | An Efficient Spatio-Temporal Pyramid Transformer for Action Detection | Yuetian Weng (Monash University); Zizheng Pan (Monash University); Mingfei Han (Monash University; DATA61, CSIRO); Xiaojun Chang (University of Technology Sydney); Bohan Zhuang (Monash University)* |
434 | LocVTP: Video-Text Pre-training for Temporal Localization | Meng Cao (Peking University); Tianyu Yang (Tencent AI Lab); Junwu Weng (Tencent AI Lab); Can Zhang (Peking University); Jue Wang (Tencent AI Lab); Yuexian Zou (Peking University)* |
444 | Fusing Local Similarities for Retrieval-based 3D Orientation Estimation of Unseen Objects | Chen Zhao (EPFL)*; Yinlin Hu (EPFL); Mathieu Salzmann (EPFL) |
458 | Online Segmentation of LiDAR Sequences: Dataset and Algorithm | Romain Loiseau (École des ponts ParisTech)*; Mathieu Aubry (École des ponts ParisTech); loic landrieu (IGN) |
460 | MVSTER: Epipolar Transformer for Efficient Multi-View Stereo | Xiaofeng Wang (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Zheng Zhu (Tsinghua University); Guan Huang (Institute of Automation, Chinese Academy of Sciences); Fangbo Qin (Institute of Automation, Chinese Academy of Sciences); Yun Ye (XForwardAI Technology Co., Ltd, Beijing, China); Yijia He (Beijing Kuaishou Technology Co., Ltd); Xu Chi (Phigent Robotics); Xingang Wang (Institute of Automation, CAS) |
463 | Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction | Haocheng Yuan (Northwestern Polytechnical University); Chen Zhao (EPFL); Shichao Fan (Northwestern Polytechnical University); Jiaxi Jiang (Northwestern Polytechnical University); Jiaqi Yang (Northwestern Polytechnical University)* |
482 | Generalizable Medical Image Segmentation via Random Amplitude Mixup and Domain-Specific Image Restoration | Ziqi Zhou (Nanjing University)*; Lei Qi (Southeast University); Yinghuan Shi (Nanjing University) |
499 | Demystifying Unsupervised Semantic Correspondence Estimation | Mehmet Aygün (The University of Edinburgh)*; Oisin Mac Aodha (University of Edinburgh) |
513 | Learning Shadow Correspondence for Video Shadow Detection | Xinpeng Ding (The Hong Kong University of Science and Technology); Jingwen Yang (The Hong Kong University of Science and Technology); Xiaowei Hu (Shanghai AI Laboratory); Xiaomeng Li (The Hong Kong University of Science and Technology)* |
514 | PolarMOT: How far can geometric relations take us in 3D multi-object tracking? | Aleksandr Kim (Technical University of Munich); Guillem Brasó (TUM); Aljosa Osep (TUM Munich)*; Laura Leal-Taixé (TUM) |
516 | Few-Shot End-to-End Object Detection via Constantly Concentrated Encoding across Heads | Jiawei Ma (Columbia University)*; Guangxing Han (Columbia University); Shiyuan Huang (Columbia University); Yuncong Yang (Columbia University); Shih-Fu Chang (Columbia University) |
525 | MVDECOR: Multi-view Dense Correspondence Learning for Fine-grained 3D Segmentation | Gopal Sharma (University of Massachusetts Amherst)*; Kangxue Yin (NVIDIA); Subhransu Maji (University of Massachusetts, Amherst); Evangelos Kalogerakis (UMass Amherst); Or Litany (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA) |
537 | Implicit Neural Representations for Image Compression | Yannick Strümpler (ETH Zürich)*; Janis Postels (ETH Zurich); Ren Yang (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich) |
541 | Cross-modal Prototype Driven Network for Radiology Report Generation | Jun Wang (University of Warwick)*; Abhir Bhalerao (University of Warwick); Yulan He (University of Warwick) |
556 | Scene Text Recognition with Permuted Autoregressive Sequence Models | Darwin Bautista (University of the Philippines)*; Rowel Atienza (University of the Philippines) |
568 | XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | Ho Kei Cheng (University of Illinois Urbana-Champaign)*; Alexander Schwing (UIUC) |
570 | SUPR: A Sparse Unified Part-Based Human Body Model | Ahmed A A Osman (Max Planck Institute for Intelligent Systems)*; Michael J. Black (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Dimitrios Tzionas (University of Amsterdam) |
575 | SCAM! Transferring humans between images with Semantic Cross Attention Modulation | Nicolas Dufour (ENPC)*; David Picard (ENPC); Vicky Kalogeiton (Ecole Polytechnique) |
583 | Q-FW: A Hybrid Classical-Quantum Frank-Wolfe for Quadratic Binary Optimization | Alp Yurtsever (Umeå University); Tolga Birdal (TU Munich)*; Vladislav Golyanik (MPI for Informatics) |
584 | Revisiting Point Cloud Simplification: A Learnable Feature Preserving Approach | Rolandos Alexandros Potamias (Imperial College London)*; Giorgos Bouritsas (Imperial College London); Stefanos Zafeiriou (Imperial College London) |
599 | Neural Architecture Search for Spiking Neural Networks | Youngeun Kim (Yale University)*; Yuhang Li (Yale University); Hyoungseob Park (Yale University); Yeshwanth Venkatesha (Yale university); Priyadarshini Panda (Yale University) |
601 | Neuromorphic Data Augmentation for Training Spiking Neural Networks | Yuhang Li (Yale University)*; Youngeun Kim (Yale University); Hyoungseob Park (Yale University); Tamar Geller (Yale University); Priyadarshini Panda (Yale University) |
602 | RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild | Jason Y Zhang (Carnegie Mellon University)*; Deva Ramanan (Carnegie Mellon University); Shubham Tulsiani (Carnegie Mellon University) |
609 | Human Trajectory Prediction via Neural Social Physics | Jiangbei Yue (Leeds University); Dinesh Manocha (University of Maryland at College Park)*; He Wang (Leeds University) |
615 | Explicit Occlusion Reasoning for Multi-person 3D Human Pose Estimation | Qihao Liu (Johns Hopkins University); Yi Zhang (Johns Hopkins University); Song Bai (University of Oxford); Alan Yuille (Johns Hopkins University)* |
626 | R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis | Huan Wang (Northeastern University); Jian Ren (Snap Inc.); Zeng Huang (Snap Inc.)*; Kyle B Olszewski (Snap Inc.); Menglei Chai (Snap Inc.); YUN FU (Northeastern University); Sergey Tulyakov (Snap Inc) |
629 | Towards Open Set Video Anomaly Detection | Yuansheng Zhu (Rochester Institute of Technology)*; Wentao Bao (Rochester Institute of Technology); Qi Yu (Rochester Institute of Technology) |
634 | Object-Compositional Neural Implicit Surfaces | Qianyi Wu (Monash University)*; Xian Liu (The Chinese University of Hong Kong); Yuedong Chen (Monash University); Kejie Li (University of Oxford); Chuanxia Zheng (Monash University); Jianfei Cai (Monash University); Jianmin Zheng (Nanyang Technological University) |
636 | Sem2NeRF: Converting Single-View Semantic Masks to Neural Radiance Fields | Yuedong Chen (Monash University)*; Qianyi Wu (Monash University); Chuanxia Zheng (Monash University); Tat-Jen Cham (Nanyang Technological University); Jianfei Cai (Monash University) |
641 | WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation | Mengping Yang (East China University of Science and Technology)*; Zhe Wang (East China University of Science and Technology); Ziqiu Chi (East China University of Science and Technology); Wenyi Feng (East China University of Science and Technology) |
642 | Class-Agnostic Object Counting Robust to Intraclass Diversity | Shenjian Gong (Nanjing University of Science and Technology)*; Shanshan Zhang (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology); Dengxin Dai (MPI for Informatics ); Bernt Schiele (MPI Informatics) |
650 | TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts | Chuan Guo (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Li Cheng (ECE dept., University of Alberta) |
652 | Self-Distillation for Robust LiDAR Semantic Segmentation in Autonomous Driving | Jiale Li (Zhejiang University); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence)*; Yong Ding (Zhejiang University) |
654 | Semi-Supervised Monocular 3D Object Detection by Multi-View Consistency | Qing Lian (Hong Kong University of Science and Technology )*; Yanbo XU (The Hong Kong University of Science and Technology); Weilong Yao (Shanghai Xiantu Intelligent Technology Co., Ltd.); Yingcong Chen (Hong Kong University of Science and Technology); Tong Zhang (Hong Kong University of Science and Technology) |
655 | Lidar Point Cloud Guided Monocular 3D Object Detection | Liang Peng (ZJU)*; Fei Liu (Zhejiang University); Zhengxu Yu (Zhejiang University); Senbo Yan (Zhejiang University); Dan Deng (FABU); Zheng Yang (FABU); Haifeng Liu (ZJU); Deng Cai (ZJU) |
656 | Structural Causal 3D Reconstruction | Weiyang Liu (University of Cambridge)*; Zhen Liu (Mila, University of Montreal); Liam Paull (Université de Montréal); Adrian Weller (University of Cambridge); Bernhard Schölkopf (MPI for Intelligent Systems, Tübingen) |
671 | KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo | Yikang Ding (Tsinghua University)*; Qingtian Zhu (Peking University); Xiangyue Liu (Beihang University); Wentao Yuan (Peking University); Haotian Zhang (Megvii); Chi Zhang (Megvii Inc.) |
685 | When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition | Bohan Li (Huazhong University of Science and Technology)*; Ye Yuan (Tomorrow Advancing Life); Dingkang Liang (Huazhong University of Science and Technology); Xiao Liu (Tencent); zhilong ji (Tomorrow Advancing Life); Jinfeng Bai (TAL); Wenyu Liu (Huazhong University of Science and Technology); Xiang Bai (Huazhong University of Science and Technology) |
689 | Shape Matters: Deformable Patch Attack | Zhaoyu Chen (Fudan University); Bo Li (Nanjing University)*; Shuang Wu (Tencent); Jianghe Xu (Tencent Youtu Lab); Shouhong Ding (Tencent); Wenqiang Zhang (Fudan University) |
690 | PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection | Han Wang (Shanghai Jiao Tong University)*; Jun Tang (hikvision); Xiaodong Liu (Hikvision); Shanyan Guan (Shanghai Jiao Tong University); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University) |
694 | BEVFormer: Learning Bird-Eye-View Representations from Multi-View Images via Spatiotemporal Transformer | Zhiqi Li (Nanjing University); Wenhai Wang (Nanjing University); Hongyang Li (SenseTime); Enze Xie (The University of Hong Kong); Chonghao Sima (Purdue University); Tong Lu (Nanjing University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jifeng Dai (SenseTime)* |
696 | Detecting Tampered Scene Text in the Wild | YuXin Wang (University of Science and Technology of China)*; Hongtao Xie (University of Science and Technology of China); Mengting Xing (University of Science and Technology of China); Jing Wang (Huawei Cloud & AI); Shenggao Zhu (Huawei); Yongdong Zhang (University of Science and Technology of China) |
702 | Projective Parallel Single-pixel Imaging to Overcome Global Illumination in 3D Structure Light Scanning | Yuxi Li (Beihang University)*; Huijie Zhao (Beihang University); Hongzhi Jiang (Beihang University); Xudong Li (Beihang University) |
709 | CelebV-HQ: A Large-Scale Video Facial Attributes Dataset | Hao Zhu (SenseTime Research)*; Wayne Wu (SenseTime Research); Wentao Zhu (Peking University); Liming Jiang (Nanyang Technological University); Siwei Tang (Sensetime research); Li Zhang (Sensetime); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University) |
710 | Open-world Semantic Segmentation for LIDAR Point Clouds | Jun CEN (The Hong Kong University of Science and Technology)*; Peng YUN (Hong Kong University of Science and Technology); Shiwei Zhang (DAMO Academy, Alibaba Group); Junhao CAI (HKUST); Di LUAN (Hong Kong University of Science and Technology); Mingqian Tang (Alibaba Group); Michael Yu Wang (HKUST); Ming Liu (HKUST) |
721 | Burn After Reading: Online Adaptation for Cross-domain Streaming Data | Luyu Yang (University of Maryland, College Park)*; Mingfei Gao (Apple); Zeyuan Chen (Salesforce Research); Ran Xu (Salesforce Research); Abhinav Shrivastava (University of Maryland); Chetan Ramaiah (Salesforce Research) |
728 | CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS | Zixuan Zhou (Tsinghua University)*; Xuefei Ning (Tsinghua University); Yi Cai (Tsinghua University); Jiashu Han (None); Yiping Deng (Huawei); Yuhan Dong (Tsinghua University); Huazhong Yang (Tsinghua University); Yu Wang (Tsinghua University) |
734 | RigNet: Repetitive Image Guided Network for Depth Completion | Zhiqiang Yan (Nanjing University of Science and Technology)*; Kun Wang (Nanjing University of Science and Technology); Xiang Li (Nanjing University of Science and Technology); Zhenyu Zhang (Tencent); Jun Li (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
744 | Streamable Neural Fields | Junwoo Cho (Sungkyunkwan University)*; Seungtae Nam (Sungkyunkwan University); Daniel Rho (Sungkyunkwan University); Jong Hwan Ko (Sungkyunkwan University); Eunbyung Park (Sungkyunkwan University) |
755 | 2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds | Xu Yan (The Chinese University of Hong Kong, Shenzhen); Jiantao Gao (Shanghai University); Chaoda Zheng (The Chinese University of Hong Kong, Shen Zhen); chao zheng (Tencent); Ruimao Zhang (The Chinese University of Hong Kong, Shenzhen); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen ); Zhen Li (The Chinese University of Hong Kong, Shenzhen)* |
762 | Where to Focus: Investigating Hierarchical Attention Relationship for Fine-Grained Visual Classification | Yang Liu (Beihang University); Lei Zhou (Beihang University)*; Pengcheng Zhang (Beihang University); Xiao Bai (Beihang University); Lin Gu (RIKEN,AIP / The University of Tokyo); Xiaohan Yu (Griffith University); Jun Zhou (Griffith University); Hancock Edwin (University of York, UK) |
776 | Mind the Gap in Distilling StyleGANs | Guodong Xu (The Chinese University of Hong Kong)*; Yuenan HOU (Shanghai AI Lab); Ziwei Liu (Nanyang Technological University); Chen Change Loy (Nanyang Technological University) |
784 | End-to-End Active Speaker Detection | Juan C Leon (KAUST)*; Moritz Cordes (Leuphana University of Lüneburg); Chen Zhao (KAUST); Bernard Ghanem (KAUST) |
785 | Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing | Haoyue Cheng (Nanjing University); Zhaoyang Liu (SenseTime Research); Hang Zhou (The Chinese University of Hong Kong); Chen Qian (SenseTime); Wayne Wu (SenseTime Research); Limin Wang (Nanjing University)* |
790 | Learn-to-Decompose: Cascaded Decomposition Network for Cross-Domain Few-Shot Facial Expression Recognition | Xinyi Zou (Xiamen University); Yan Yan (Xiamen University)*; Jing-Hao Xue (University College London); Si Chen (Xiamen University of Technology); Hanzi Wang (Xiamen University) |
798 | Learning with Recoverable Forgetting | Jingwen Ye (National University of Singapore)*; Fu Yifang (National University of Singapore); Jie Song (Zhejiang University); Xingyi Yang (National University of Singapore); Songhua Liu (National University of Singapore); Xin Jin (University of Science and Technology of China); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore) |
800 | Masked Autoencoders for Point Cloud Self-supervised Learning | Yatian Pang (National University of Singapore); Wenxiao Wang (State Key Lab of CAD&CG, Zhejiang University); Francis EH Tay (National University of Singapore); Wei Liu (Tencent); Yonghong Tian (Peking University); Li Yuan (Peking University)* |
803 | RamGAN: Region Attentive Morphing GAN for Region-Level Makeup Transfer | Jianfeng Xiang (ShenZhen University)*; Junliang Chen (Shenzhen University); Wenshuang Liu (Shenzhen University); Xianxu Hou (Shenzhen University); Linlin Shen (Shenzhen University) |
807 | Efficient One Pass Self-distillation with Zipf’s Label Smoothing | Jiajun Liang (Megvii)*; Linze Li (MEGVII Technology); Zhaodong Bing (Megvii Technology); Borui Zhao (Megvii Technology); Yao Tang (Peking University); Bo Lin (MEGVII Technology); Haoqiang Fan (Megvii Inc(face++)) |
812 | DaViT: Dual Attention Vision Transformers | Mingyu Ding (The University of Hong Kong)*; Bin Xiao (Microsoft); Noel C Codella (Microsoft); Ping Luo (The University of Hong Kong); Jingdong Wang (Baidu); Lu Yuan (Microsoft) |
815 | OneFace: One Threshold for All | Jiaheng Liu (Beihang University); zhipeng yu (University of Chinese Academy of Sciences); Haoyu Qin (SenseTime); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Gangming Zhao (The University of Hong Kong); Ke Xu (Beihang University)* |
820 | Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization | Yunpeng Bai (Tsinghua University )*; Chao Dong (SIAT); Zenghao Chai (Tsinghua University); Andong Wang (Tsinghua University); Zhengzhuo Xu (Tsinghua University); Chun Yuan (Graduate school at ShenZhen,Tsinghua university) |
822 | Vibration-based Uncertainty Estimation for Learning from Limited Supervision | Hengtong Hu (Hefei University of Technology)*; Lingxi Xie (Huawei Inc.); Xinyue Huo (University of Science and Technology of China); Richang Hong (HeFei University of Technology); Qi Tian (Huawei Cloud & AI) |
824 | SOS! Self-supervised Learning Over Sets Of Handled Objects In Egocentric Action Recognition | Victor A Escorcia (Samsung AI Center)*; Ricardo Guerrero (Samsung AI Center Cambridge); Xiatian Zhu (Samsung AI Centre); Brais Martinez (Samsung AI Center) |
829 | FADE: Fusing the Assets of Decoder and Encoder for Task-Agnostic Upsampling | Hao Lu (Huazhong University of Science and Technology); Wenze Liu (Huazhong university of science and technology); Hongtao Fu (Huazhong university of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)* |
833 | VTC: Improving Video-Text Retrieval with User Comments | Laura Hanu (Unitary)*; James Thewlis (Unitary); Yuki M Asano (University of Amsterdam); Christian Rupprecht (University of Oxford) |
839 | Less than Few: Self-Shot Video Instance Segmentation | Pengwan Yang (University of Amsterdam)*; Yuki M Asano (University of Amsterdam); Pascal Mettes (University of Amsterdam); Cees Snoek (University of Amsterdam) |
841 | End-to-End Visual Editing with a Generatively Pre-Trained Artist | Andrew Brown (University of Oxford)*; Cheng-Yang Fu (Facebook.com); Omkar M Parkhi (Facebook); Tamara Berg (Facebook AI Research); Andrea Vedaldi (University of Oxford / Facebook AI Research) |
852 | COUCH: Towards Controllable Human-chair Interactions | Xiaohan Zhang (University of Tübingen, MPI Informatics); Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Sebastian Starke (University of Edinburgh); Vladimir Guzov (University of Tuebingen); Gerard Pons-Moll (University of Tübingen)* |
859 | MovieCuts: A New Dataset and Benchmark for Cut Type Recognition | Alejandro Pardo (KAUST)*; Fabian Caba (Adobe Research); Juan C Leon (KAUST); Ali K Thabet (Facebook); Bernard Ghanem (KAUST) |
877 | High-fidelity GAN Inversion with Padding Space | Qingyan Bai (Tsinghua University)*; Yinghao Xu (Chinese University of Hong Kong); Jiapeng Zhu (HKUST); Weihao Xia (University College London); Yujiu Yang (Tsinghua University); Yujun Shen (Dept. of IE, CUHK) |
893 | LiDAL: Inter-frame Uncertainty Based Active Learning for 3D LiDAR Semantic Segmentation | ZEYU HU (Hong Kong University of Science and Technology)*; Xuyang Bai (HKUST); Runze Zhang (Tencent); Xin Wang (Tencent); Guangyuan Sun (TENCENT); Hongbo Fu (City University of Hong Kong); Chiew-Lan Tai (Hong Kong University of Science & Technology) |
897 | Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning | Jingqun Tang (Ant Group)*; wenming qian (Huazhong University of Science and Technology); Luchuan Song (University of Science and Technology of China); Xiena Dong (Hangzhou Dianzi University); lan li (Wuhan University); Xiang Bai (Huazhong University of Science and Technology) |
912 | Concurrent Subsidiary Supervision for Unsupervised Source-Free Domain Adaptation | Jogendra Nath Kundu (Indian Institute of Science)*; Suvaansh Bhambri (Indian Institute of Science); Akshay R Kulkarni (Indian Institute of Science); Hiran Sarkar (Indian Institute of Science); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
913 | Designing One Unified Framework for High-Fidelity Face Reenactment and Swapping | Chao Xu (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Yue Han (Zhejiang University); Guanzhong Tian (Ningbo Research Institute, Zhejiang University); xianfang zeng (Zhejiang University); Ying Tai (Tencent YouTu); Yabiao Wang (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Yong Liu (Zhejiang University) |
919 | Category-Level 6D Object Pose and Size Estimation using Self-Supervised Deep Prior Deformation Networks | Jiehong Lin (South China University of Technology)*; Zewei Wei (South China University of Technology); Changxing Ding (South China University of Technology); Kui Jia (South China University of Technology) |
927 | Intrinsic Neural Fields: Learning Functions on Manifolds | Lukas Koestler (Technical University of Munich)*; Daniel Grittner (Technische Universität München); Michael Moeller (University of Siegen); Daniel Cremers (TU Munich); Zorah Laehner (University of Siegen) |
930 | LaMAR: Benchmarking Localization and Mapping for Augmented Reality | Paul-Edouard Sarlin (ETH Zurich); Mihai Dusmanu (ETH Zurich)*; Johannes L Schönberger (Microsoft); Pablo Speciale (Microsoft); Lukas Gruber (Microsoft); Viktor Larsson (Lund University); Ondrej Miksik (Microsoft); Marc Pollefeys (ETH Zurich / Microsoft) |
933 | 3D Compositional Zero-shot Learning with DeCompositional Consensus | Muhammad Ferjad Naeem (ETH Zürich)*; Evin Pınar Örnek (TU Munich); Yongqin Xian (ETH Zurich); Luc Van Gool (ETH Zurich); Federico Tombari (Google, TU Munich) |
939 | Video Mask Transfiner for High-Quality Video Instance Segmentation | Lei Ke (HKUST)*; Henghui Ding (ETH Zurich); Martin Danelljan (ETH Zurich); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology); Fisher Yu (ETH Zurich) |
940 | FashionViL: Fashion-Focused Vision-and-Language Representation Learning | Xiao Han (University of Surrey)*; Licheng Yu (Facebook); Xiatian Zhu (University of Surrey); Li Zhang (Fudan University); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
945 | Adaptive Face Forgery Detection in Cross Domain | Luchuan Song (University of Science and Technology of China)*; Zheng Fang (Beihang University); Xiaodan Li (Alibaba Group); Xiaoyi Dong (University of Science and Technology of China); Zhenchao Jin (University of Science and Technology of China); Yuefeng Chen (Alibaba Group); Siwei Lyu (University at Buffalo) |
958 | LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space | Emre Aksan (ETH Zurich)*; Shugao Ma (Facebook); Akin Caliskan (Center for Vision Speech and Signal Processing – University of Surrey); Stanislav Pidhorskyi (Facebook Inc.); Alexander Richard (Facebook Reality Labs); Shih-En Wei (Facebook); Jason Saragih (Facebook); Otmar Hilliges (ETH Zurich) |
961 | Dense Teacher: Dense Pseudo-Labels for Semi-supervised Object Detection | Hongyu Zhou (Megvii)*; Songtao Liu (MEGVII); Zeming Li (Megvii(Face++) Inc); Jian Sun (Megvii Technology); Weixin Mao (waseda university); Zheng Ge (MEGVII Technology); haiyan yu (Harbin Institute of Technology) |
968 | Metric Learning based Interactive Modulation for Real-World Super-Resolution | Chong Mou (Peking University Shenzhen Graduate School)*; Yanze Wu (Tencent); Xintao Wang (Tencent); Chao Dong (SIAT); Jian Zhang (Peking University Shenzhen Graduate School); Ying Shan (Tencent) |
971 | Optimal Transport for Label-Efficient Visible-Infrared Person Re-Identification | Jiangming Wang (East China Normal University); Zhizhong Zhang (East China Normal University); Mingang Chen (Shanghai Development Center of Computer Software Technology); yi zhang (zhejianglab); Cong Wang (Huawei Technologies); Bin Sheng (Shanghai Jiao Tong University); Yanyun Qu (XMU); Yuan Xie (East China Normal University)* |
977 | Proposal-Free Temporal Action Detection via Global Segmentation Mask Learning | Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
979 | Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives | Wentao Yuan (Peking University)*; Qingtian Zhu (Peking University); Xiangyue Liu (Beihang University); Yikang Ding (Tsinghua University); Haotian Zhang (Megvii); Chi Zhang (Megvii Inc.) |
982 | Unsupervised Night Image Enhancement: When Layer Decomposition Meets Light-Effects Suppression | Yeying Jin (National University of Singapore)*; Wenhan Yang (NTU); Robby T. Tan (National University of Singapore) |
986 | Point-to-Box Network for Accurate Object Detection via Single Point Supervision | Pengfei Chen (University of Chinese Academy of Sciences); Xuehui Yu (University of Chinese Academy of Sciences); Xumeng Han (University of Chinese Academy of Sciences); Najmul Hassan (University of Oregon); Kai Wang (U of Oregon); Jiachen Li (UIUC); Jian Zhao (Institute of North Electronic Equipment); Humphrey Shi (U of Oregon / UIUC / PAIR); Zhenjun Han (University of Chinese Academy of Sciences)*; Qixiang Ye (University of Chinese Academy of Sciences, China) |
989 | Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks | Yunshan Zhong (xiamen university)*; Mingbao Lin (Xiamen University, China); xunchao li (Xiamen University); Ke Li (Tencent); Yunhang Shen (Xiamen University); Fei Chao (Xiamen University); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Rongrong Ji (Xiamen University, China) |
999 | Locality Guidance for Improving Vision Transformers on Tiny Datasets | Kehan Li (Peking University); Runyi Yu (Peking University); Zhennan Wang (Peng Cheng Laboratory); Li Yuan (Peking University); Guoli Song (Peng Cheng Laboratory); Jie Chen (Peking University)* |
1002 | Weakly Supervised Object Localization through Inter-class Feature Similarity and Intra-class Appearance Consistency | Jun Wei (The Chinese University of Hong Kong, Shenzhen); Sheng Wang (Shanghai Zelixir Biotech); S. Kevin Zhou (USTC); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen ); Zhen Li (The Chinese University of Hong Kong, Shenzhen)* |
1003 | Semi-Supervised Temporal Action Detection with Proposal-Free Masking | Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
1005 | Neighborhood Collective Estimation for Noisy Label Identification and Correction | Jichang Li (The University of Hong Kong)*; Guanbin Li (Sun Yat-sen University); Feng Liu (Deepwise AI Lab); Yizhou Yu (The University of Hong Kong) |
1010 | Zero-Shot Temporal Action Detection via Vision-Language Prompting | Sauradip Nag (University of Surrey)*; Xiatian Zhu (University of Surrey); Yi-Zhe Song (University of Surrey); Tao Xiang (University of Surrey) |
1016 | Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval | Pandeng Li (University of Science and Technology of China)*; Hongtao Xie (University of Science and Technology of China); Jiannan Ge (University of Science and Technology of China); Lei Zhang (Kuaishou); Shaobo Min (tencent); Yongdong Zhang (University of Science and Technology of China) |
1018 | Discover and Mitigate Unknown Biases with Debiasing Alternate Networks | Zhiheng Li (University of Rochester)*; Anthony Hoogs (Kitware); Chenliang Xu (University of Rochester) |
1020 | Hierarchical Memory Learning for Fine-Grained Scene Graph Generation | Youming Deng (Wuhan University); Yansheng Li (Wuhan University)*; Yongjun Zhang (Wuhan University); Xiang Xiang (Huazhong University of Science and Technology); Jian Wang (Ant Group); Jingdong Chen (Ant Group); Jiayi Ma (Wuhan University) |
1026 | Improving Test-Time Adaptation via Shift-agnostic Weight Regularization and Nearest Source Prototypes | Sungha Choi (Qualcomm AI Research)*; Seunghan Yang (Qualcomm AI Research); Seokeon Choi (Qualcomm AI research); Sungrack Yun (Qualcomm AI Research) |
1028 | Automatic dense annotation of large-vocabulary sign language videos | Liliane Momeni (University of Oxford)*; Hannah Bull (LIMSI (CNRS)); Prajwal K R (VGG, Oxford); Samuel Albanie (University of Cambridge); Gul Varol (Ecole des Ponts ParisTech); Andrew Zisserman (University of Oxford) |
1029 | Few-shot Class-incremental Learning via Entropy-regularized Data-free Replay | Huan Liu (McMaster University)*; Li Gu (Huawei Canada); Zhixiang Chi (Huawei Noah’s Ark Laboratory); Yuanhao Yu (Huawei Noah’s Ark Laboratory); Yang Wang (Concordia University); Jun Chen (McMaster University); Jin Tang (Huawei Noah’s Ark Laboratory) |
1035 | Learning Instance-Specific Adaptation for Cross-Domain Segmentation | Yuliang Zou (Virginia Tech)*; Zizhao Zhang (Google); Chun-Liang Li (Google); Han Zhang (Google); Tomas Pfister (Google); Jia-Bin Huang (Facebook) |
1039 | SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas | John W Lambert (Georgia Institute of Technology)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Lambert Wixson (Zillow Group); Manjunath Narayana (Zillow Group); Will A Hutchcroft (Zillow Group); James Hays (Georgia Institute of Technology, USA); Frank Dellaert (Georgia Tech); Sing Bing Kang (Zillow Group) |
1044 | Active Learning Strategies for Weakly-Supervised Object Detection | Huy V. Vo (Ecole Normale Supérieure – INRIA – Valeo.ai)*; Oriane Siméoni (valeo.ai); Spyros Gidaris (valeo.ai); Andrei Bursuc (valeo.ai); Patrick Pérez (Valeo.ai); Jean Ponce (Inria) |
1049 | 3D Human Pose Estimation Using Möbius Graph Convolutional Networks | Niloofar Azizi (ICG department of TU Graz)*; Horst Possegger (Graz University of Technology); Emanuele Rodola (Sapienza University of Rome); Horst Bischof (Graz University of Technology) |
1055 | Real-time Online Video Detection with Temporal Smoothing Transformers | Yue Zhao (University of Texas at Austin)*; Philipp Kraehenbuehl (UT Austin) |
1060 | 3D-FM GAN: Towards 3D-Controllable Face Manipulation | Yuchen Liu (Princeton University)*; Zhixin Shu (Adobe Research); Yijun Li (Adobe Research); Zhe Lin (Adobe Research); Richard Zhang (Adobe); Sun-Yuan Kung (Princeton University) |
1064 | SinNeRF: Training Neural Radiance Field on Complex Scene from a Single Image | Dejia Xu (University of Texas at Austin)*; Yifan Jiang (University of Texas at Austin); Peihao Wang (University of Texas at Austin); Zhiwen Fan (University of Texas at Austin); Humphrey Shi (U of Oregon \| UIUC \| PAIR); Zhangyang Wang (University of Texas at Austin) |
1069 | Entropy-driven Sampling and Training Scheme for Conditional Diffusion Generation | Guangcong Zheng (Zhejiang University); Shengming Li (Zhejiang University); Hui Wang (Zhejiang University); Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Xi Li (Zhejiang University)* |
1076 | Identity-aware Hand Mesh Estimation and Personalization from RGB Images | Deying Kong (University of California, Irvine)*; Linguang Zhang (Facebook Reality Labs); Liangjian Chen (Reality Labs); Haoyu Ma (University of California, Irvine); Xiangyi Yan (University of California, Irvine); Shanlin Sun (University of California, Irvine); Xingwei Liu (University of California, Irvine); Kun Han (University of California, Irvine); Xiaohui Xie (University of California, Irvine) |
1084 | TALLFormer: Temporal Action Localization with a Long-memory Transformer | Feng Cheng (University of North Carolina ch); Gedas Bertasius (UNC Chapel Hill)* |
1086 | Unsupervised and Semi-supervised Bias Benchmarking in Face Recognition | Siqi Deng (Amazon)*; Alexandra Chouldechova (CMU); Yongxin Wang (Amazon); Wei Xia (Amazon); Pietro Perona (California Institute of Technology) |
1100 | Domain Adaptive Hand Keypoint and Pixel Localization in the Wild | Takehiko Ohkawa (The University of Tokyo)*; Yu-Jhe Li (Carnegie Mellon University); Qichen Fu (Carnegie Mellon University); Ryosuke Furuta (The University of Tokyo); Kris Kitani (Carnegie Mellon University); Yoichi Sato (University of Tokyo) |
1103 | Skeleton-free Pose Transfer for Stylized 3D Characters | Zhouyingcheng Liao (Saarland University)*; Jimei Yang (Adobe); Jun Saito (Adobe); Gerard Pons-Moll (University of Tübingen); Yang Zhou (Adobe Research) |
1105 | Differentiable Raycasting for Self-supervised Occupancy Forecasting | Tarasha Khurana (Carnegie Mellon University)*; Peiyun Hu (Carnegie Mellon University); Achal D Dave (Amazon); Jason P Ziglar (Argo AI); David Held (); Deva Ramanan (Carnegie Mellon University) |
1109 | InAction: Interpretable Action Decision Making for Autonomous Driving | Taotao Jing (Tulane University)*; Haifeng Xia (Tulane University); Renran Tian (Indiana University-Purdue University Indianapolis); Haoran Ding (IUPUI); Xiao Luo (IUPUI); Joshua E Domeyer (Toyota Motor North America); Rini Sherony (Toyota CSRC); Zhengming Ding (Tulane University) |
1114 | CramNet: Camera-Radar Fusion with Ray-Constrained Cross-Attention for Robust 3D Object Detection | Jyh-Jing Hwang (Waymo)*; Henrik Kretzschmar (Waymo); Joshua M Manela (Waymo); Sean Rafferty (Waymo); Nicholas Armstrong-Crews (Waymo); Tiffany Chen (Waymo); Dragomir Anguelov (Waymo) |
1118 | CycDA: Unsupervised Cycle Domain Adaptation to Learn from Image to Video | Wei Lin (Graz University of Technology)*; Anna Kukleva (MPII); Kunyang Sun (Southeast University); Horst Possegger (Graz University of Technology); Hilde Kuehne (University of Frankfurt); Horst Bischof (Graz University of Technology) |
1119 | Latent Discriminant deterministic Uncertainty | Gianni Franchi (ENSTA Paris)*; Xuanlong Yu (ENSTA Paris); Andrei Bursuc (valeo.ai); Emanuel Aldea (Paris-Saclay University); Severine Dubuisson (Aix-Marseille University); David Filliat (ENSTA Paris) |
1129 | Auto-FedRL: Federated Hyperparameter Optimization for Multi-institutional Medical Image Segmentation | Pengfei Guo (Johns Hopkins University)*; Dong Yang (NVIDIA Corporation); Ali Hatamizadeh (NVIDIA Corporation); An Xu (University of Pittsburgh); Ziyue Xu (NVIDIA); Wenqi Li (NVIDIA); Can Zhao (Nvidia); Daguang Xu (NVIDIA Corporation); Stephanie Anne Harmon (National Cancer Institute); Evrim Turkbey (NIH); Baris Turkbey (National Cancer Institute); Bradford J Wood (National Institutes of Health); Francesca Patella (ASST Santi Paolo e Carlo); Elvira Stellato (University of Milan); Gianpaolo Carrafiello (University of Milan); Vishal Patel (Johns Hopkins University); Holger R Roth (NVIDIA) |
1135 | Image-based CLIP-Guided Essence Transfer | Hila Chefer (Tel Aviv University)*; Sagie Benaim (University of Copenhagen); Roni Paiss (Tel Aviv University, Google); Lior Wolf (Tel Aviv University, Israel) |
1136 | Prune Your Model Before Distill It | JinHyuk Park (Hongik University); Albert No (Hongik University)* |
1155 | S2N: Suppression-Strengthen Network for Event-based Recognition under Variant Illuminations | Zengyu Wan (University of Science and Technology of China)*; Yang Wang (University of Science and Technology of China); Ganchao Tan (University of Science and Technology of China); Yang Cao (University of Science and Technology of China); Zheng-Jun Zha (University of Science and Technology of China) |
1159 | MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval | Yuying Ge (The University of Hong Kong)*; Yixiao Ge (Tencent); Xihui Liu (UC Berkeley); Jinpeng Wang (National University of Singapore); Jianping Wu (Tsinghua University); Ying Shan (Tencent); Xiaohu Qie (Tencent); Ping Luo (The University of Hong Kong) |
1161 | PASS: Part-Aware Self-Supervised Pre-Training for Person Re-Identification | Kuan Zhu (Institute of Automation, Chinese Academy of Sciences)*; Haiyun Guo (CASIA); Tianyi Yan (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Yousong Zhu (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences); Ming Tang (Institute of Automation, Chinese Academy of Sciences) |
1165 | RegionCL: Exploring Contrastive Region Pairs for Self-supervised Representation Learning | Yufei Xu (The University of Sydney)*; Qiming Zhang (The University of Sydney); Jing Zhang (The University of Sydney); Dacheng Tao (JD.com) |
1174 | Towards Data-Efficient Detection Transformers | Wen Wang (University of Science and Technology of China)*; Jing Zhang (The University of Sydney); Yang Cao (University of Science and Technology of China); Yongliang Shen (Zhejiang University); Dacheng Tao (JD.com) |
1175 | Label2Label: A Language Modeling Framework for Multi-Attribute Learning | Wanhua Li (Tsinghua University); Zhexuan Cao (Tsinghua University); Jianjiang Feng (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
1179 | Anti-Retroactive Interference for Lifelong Learning | Runqi Wang (Beihang University); Yuxiang Bao (Beihang University); Baochang Zhang (Beihang University)*; Jianzhuang Liu (Huawei Noah’s Ark Lab); Wentao Zhu (Amazon); Guodong Guo (IDL, Baidu Research) |
1181 | Emotion Recognition for Multiple Context Awareness | Dingkang Yang (Fudan University); Shuai Huang (Fudan University); Shunli Wang (Fudan University); Yang Liu (Fudan University); Peng Zhai (Fudan University); Liuzhen Su (Fudan University); Mingcheng Li (Fudan University); Lihua Zhang (Fudan University)* |
1182 | Box-supervised Instance Segmentation with Level Set Evolution | Wentong Li (Zhejiang University); Wenyu Liu (Zhejiang University); Jianke Zhu (Zhejiang University)*; Miaomiao Cui (Alibaba-inc); Xian-Sheng Hua (Damo Academy, Alibaba Group); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
1197 | mc-BEiT: Multi-choice Discretization for Image BERT Pre-training | Xiaotong Li (Peking University)*; Yixiao Ge (Tencent); Kun Yi (Nanjing University); Zixuan Hu (Peking University); Ying Shan (Tencent); Lingyu Duan (Peking University) |
1198 | Adaptive Cross-Domain Learning for Generalizable Person Re-Identification | Pengyi Zhang (Zhejiang University)*; Huanzhang Dou (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University) |
1202 | MetaGait: Learning to Learn an Omni Sample Adaptive Representation for Gait Recognition | Huanzhang Dou (Zhejiang University)*; Pengyi Zhang (Zhejiang University); Wei Su (Zhejiang University); Yunlong Yu (Zhejiang University); Xi Li (Zhejiang University) |
1203 | Bootstrapped Masked Autoencoders for Vision BERT Pretraining | Xiaoyi Dong (University of Science and Technology of China)*; Jianmin Bao (Microsoft Research Asia); Ting Zhang (MSRA); Dongdong Chen (Microsoft Cloud AI); Weiming Zhang (University of Science and Technology of China); Lu Yuan (Microsoft); Dong Chen (Microsoft Research Asia); Fang Wen (Microsoft Research Asia); Nenghai Yu (University of Science and Technology of China) |
1209 | Masked Discrimination for Self-Supervised Learning on Point Clouds | Haotian Liu (University of Wisconsin-Madison)*; Mu Cai (University of Wisconsin-Madison); Yong Jae Lee (University of Wisconsin-Madison) |
1214 | GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval | Yuxuan Wang (National University of Singapore); Difei Gao (NUS); Licheng Yu (Facebook); Stan Weixian Lei (National University of Singapore); Matt Feiszli (Facebook Research); Mike Zheng Shou (National University of Singapore)* |
1225 | FAST-VQA: Efficient End-to-end Video Quality Assessment with Fragment Sampling | Haoning Wu (Nanyang Technological University)*; Chaofeng Chen (Nanyang Technological University); Jingwen Hou (Nanyang Technological University); Liang Liao (Nanyang Technological University); Annan Wang (Nanyang Technological University); Wenxiu Sun (SenseTime Research and Tetras.AI); Qiong Yan (SenseTime Group Limited); Weisi Lin (Nanyang Technological University, Singapore) |
1235 | Learning to train a point cloud reconstruction network without matching | Tianxin Huang (Zhejiang University)*; Xuemeng Yang (Zhejiang University); Jiangning Zhang (Zhejiang University); Jinhao Cui (Zhejiang University); Hao Zou (Zhejiang University); Jun Chen (Zhejiang University); Xiangrui Zhao (Zhejiang University); Yong Liu (Zhejiang University) |
1243 | Long-Tailed Class Incremental Learning | Xialei Liu (Nankai University)*; Yusong Hu (Nankai University); Xu-Sheng Cao (Nankai University); Andy Bagdanov (University of Florence, Italy); Ke Li (Tencent); Ming-Ming Cheng (Nankai University) |
1247 | CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving | Kaican Li (Huawei Noah’s Ark Lab)*; Kai Chen (HKUST); Haoyu Wang (Purdue University); Lanqing Hong (Huawei Noah’s Ark Lab); Chaoqiang Ye (Huawei); Jianhua Han (Huawei Noah’s Ark Lab); Yukuai Chen (Huawei Intelligent Automotive Solution BU); Wei Zhang (Noah’s Ark Lab, Huawei Technologies); Chunjing Xu (Huawei Noah’s Ark Lab); Dit-Yan Yeung (HKUST); Xiaodan Liang (Sun Yat-sen University); Zhenguo Li (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab) |
1253 | CMT: Context-Matching-Guided Transformer for 3D Tracking in Point Clouds | Zhiyang Guo (University of Science and Technology of China)*; Yunyao Mao (University of Science and Technology of China); Wengang Zhou (University of Science and Technology of China); Min Wang (Institute of Artificial Intelligence, Hefei Comprehensive National Science Center); Houqiang Li (University of Science and Technology of China) |
1257 | Motion Inspired Unsupervised Perception and Prediction in Autonomous Driving | Mahyar Najibi (Waymo LLC); Jingwei Ji (Waymo); Yin Zhou (Waymo)*; Charles R. Qi (Waymo); Xinchen Yan (Waymo); Scott Ettinger (Waymo); Dragomir Anguelov (Waymo) |
1259 | Unitail: Detecting, Reading, and Matching in Retail Scene | Fangyi Chen (Carnegie Mellon University)*; Han Zhang (CMU); Zaiwang Li (Pitt); Jiachen Dou (Carnegie Mellon University); Shentong Mo (Carnegie Mellon University); Hao Chen (Carnegie Mellon University); Yong-Xin Zhang (Tsinghua University); Uzair Ahmed (Carnegie Mellon University); Chenchen Zhu (Meta AI); Marios Savvides (Carnegie Mellon University) |
1275 | DODA: Data-oriented Sim-to-Real Domain Adaptation for 3D Semantic Segmentation | Runyu Ding (The University of Hong Kong)*; Jihan Yang (The University of Hong Kong); Li Jiang (Max Planck Institute for Informatics); Xiaojuan Qi (The University of Hong Kong) |
1277 | Learning to Drive by Watching YouTube Videos: Action-Conditioned Contrastive Policy Pretraining | Qihang Zhang (Chinese University of Hong Kong); Zhenghao Peng (Chinese University of Hong Kong); Bolei Zhou (UCLA)* |
1278 | Multi-Curve Translator for High-Resolution Photorealistic Image Translation | Yuda Song (Zhejiang University); Hui Qian (Zhejiang University); Xin Du (Zhejiang University)* |
1280 | Dynamic Metric Learning with Cross-Level Concept Distillation | Wenzhao Zheng (Tsinghua University)*; Yuanhui Huang (Tsinghua University); Borui Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University) |
1287 | Deep Bayesian Video Frame Interpolation | Zhiyang Yu (Harbin Institute of Technology)*; Yu Zhang (Beihang University); Xujie Xiang (Beihang University); Dongqing Zou (SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University); Xijun Chen (Harbin Institute of Technology); Jimmy Ren (SenseTime Research; Qing Yuan Research Institute, Shanghai Jiao Tong University) |
1300 | PanoFormer: Panorama Transformer for Indoor 360° Depth Estimation | Zhijie Shen (Beijing Jiaotong University); Chunyu Lin (Beijing Jiaotong University)*; Kang Liao (Beijing Jiaotong University); Lang Nie (Beijing Jiaotong University); Zishuo Zheng (Beijing Jiaotong University); Yao Zhao (Beijing Jiaotong University) |
1312 | Cross Attention Based Style Distribution for Controllable Person Image Synthesis | Xinyue Zhou (East China Normal University); Mingyu Yin (East China Normal University); Xinyuan Chen (Shanghai AI Laboratory); Li Sun (East China Normal University)*; Changxin Gao (Huazhong University of Science and Technology); Qingli Li (East China Normal University) |
1315 | Generative Meta-Adversarial Network for Unseen Object Navigation | Sixian Zhang (ICT, China Academy of Science)*; Weijie Li (ICT, China Academy of Sciences); Xinhang Song (ICT); Yubing Bai (ICT, China Academy of Science); Shuqiang Jiang (ICT, China Academy of Science) |
1316 | Unsupervised Visual Representation Learning by Synchronous Momentum Grouping | Bo Pang (Shanghai Jiao Tong University)*; Yifan Zhang (Shanghai Jiao Tong University); Yaoyi Li (Huawei); Jia Cai (Huawei); Cewu Lu (Shanghai Jiao Tong University) |
1317 | OSFormer: One-Stage Camouflaged Instance Segmentation with Transformers | Jialun Pei (Huazhong University of Science and Technology); Tianyang Cheng (Huazhong University of Science and Technology); Deng-Ping Fan (ETH Zurich)*; He Tang (Huazhong University of Science and Technology); Chuanbo Chen (Huazhong University of Science and Technology); Luc Van Gool (ETH Zürich) |
1321 | Highly Accurate Dichotomous Image Segmentation | Xuebin Qin (University of Alberta); Hang Dai (Mohamed bin Zayed University of Artificial Intelligence); Xiaobin Hu (Technische Universität München); Deng-Ping Fan (ETH Zurich)*; Ling Shao (Terminus Group); Luc Van Gool (ETH Zurich) |
1322 | KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints | Marko Mihajlovic (ETH Zurich)*; Aayush Bansal (Carnegie Mellon University); Michael Zollhöfer (Facebook Reality Labs); Siyu Tang (ETH Zurich); Shunsuke Saito (Facebook) |
1326 | MENet: a Memory-Based Network with Dual-Branch for Efficient Event Stream Processing | Linhui Sun (CASIA)*; Yifan Zhang (Institute of Automation, Chinese Academy of Sciences); Ke Cheng (Institute of Automation, Chinese Academy of Sciences); Jian Cheng (Chinese Academy of Sciences, China); Hanqing Lu (NLPR, Institute of Automation, CAS) |
1330 | Making Heads or Tails: Towards Semantically Consistent Visual Counterfactuals | Simon Vandenhende (KU Leuven)*; Dhruv Mahajan (Facebook); Filip Radenovic (Facebook AI); Deepti Ghadiyaram (Facebook) |
1331 | LEDNet: Joint Low-light Enhancement and Deblurring in the Dark | Shangchen Zhou (Nanyang Technological University)*; Chongyi Li (Nanyang Technological University); Chen Change Loy (Nanyang Technological University) |
1336 | RC-MVSNet: Unsupervised Multi-View Stereo with Neural Rendering | Di Chang (Technical University of Munich)*; Aljaz Bozic (Technical University Munich); Tong Zhang (EPFL); Qingsong Yan (Hong Kong University of Science and Technology); Yingcong Chen (Hong Kong University of Science and Technology); Sabine Süsstrunk (EPFL); Matthias Niessner (Technical University of Munich) |
1342 | StretchBEV: Stretching Future Instance Prediction Spatially and Temporally | Kaan Adil Akan (Koc University); Fatma Guney (Koc University)* |
1344 | AgeTransGAN for Facial Age Transformation with Rectified Performance Metrics | Gee-Sern Hsu (National Taiwan University of Science and Technology)*; Rui-Cang Xie (National Taiwan University of Science and Technology); Zhi-Ting Chen (National Taiwan University of Science and Technology); Yu-Hong Lin (National Taiwan University of Science and Technology) |
1346 | Boosting Supervised Dehazing Methods via Bi-level Patch Reweighting | Xingyu Jiang (Beihang); Hongkun Dou (Beihang University); Chengwei Fu (Beihang); Bingquan Dai (Beihang); Tianrun Xu (North China University of Technology); Yue Deng (Samsung Research America)* |
1347 | Detecting and Recovering Sequential DeepFake Manipulation | Rui Shao (Nanyang Technological University)*; Tianxing Wu (Nanyang Technological University); Ziwei Liu (Nanyang Technological University) |
1353 | MTFormer: Multi-Task Learning via Transformer and Cross-Task Reasoning | Xiaogang Xu (The Chinese University of Hong Kong)*; Hengshuang Zhao (University of Oxford); Vibhav Vineet (Microsoft Research); Ser-Nam Lim (Meta AI); Antonio Torralba (MIT) |
1356 | Prediction-Guided Distillation for Dense Object Detection | Chenhongyi Yang (University of Edinburgh)*; Mateusz Ochal (Heriot Watt University); Amos Storkey (U Edinburgh); Elliot J Crowley (University of Edinburgh) |
1358 | Towards Generic 3D Tracking in RGBD Videos: Benchmark and Baseline | Jinyu Yang (Southern University of Science and Technology)*; Zhongqun Zhang (University of Birmingham); Zhe Li (SUSTech); Hyung Jin Chang (University of Birmingham); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech) |
1364 | C3P: Cross-domain Pose Prior Propagation for Weakly Supervised 3D Human Pose Estimation | Cunlin Wu (Huazhong University of Science and Technology); Yang Xiao (Huazhong Univ. of Sci.&Tech.); Boshen Zhang (Tencent); Mingyang Zhang (Huazhong Univ. of Sci.&Tech.); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.); Joey Tianyi Zhou (ASTAR Centre for Frontier AI Research (CFAR)) |
1366 | Adaptive Fine-Grained Sketch-Based Image Retrieval | Ayan Kumar Bhunia (University of Surrey)*; Aneeshan Sain (University of Surrey); Parth Hiren Shah (Indian Institute of Technology Guwahati); Animesh Gupta (Thapar University); Pinaki Nath Chowdhury (University of Surrey); Tao Xiang (University of Surrey); Yi-Zhe Song (University of Surrey) |
1376 | Learning Ego 3D Representation as Ray Tracing | Jiachen Lu (Fudan University); Zheyuan Zhou (Fudan University); Xiatian Zhu (University of Surrey); Hang Xu (Huawei Noah’s Ark Lab); Li Zhang (Fudan University)* |
1380 | Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling | Hengyuan Ma (Fudan University); Li Zhang (Fudan University)*; Xiatian Zhu (University of Surrey); Jianfeng Feng (Fudan University) |
1382 | RCLane: Relay Chain Prediction for Lane Detection | Shenghua Xu (Fudan University); Xinyue Cai (Huawei Noah’s Ark Lab); Bin Zhao (Fudan University); Li Zhang (Fudan University)*; Hang Xu (Huawei Noah’s Ark Lab); Yanwei Fu (Fudan University); Xiangyang Xue (Fudan University) |
1394 | Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding | Hao Wen (Tsinghua University); Yunze Liu (Tsinghua University)*; Jingwei Huang (Huawei); Bo Duan (Huawei); Li Yi (Tsinghua University) |
1395 | Towards Efficient Adversarial Training on Vision Transformers | Boxi Wu (Zhejiang University)*; Jindong Gu (University of Munich); Zhifeng Li (Tencent AI Lab); Deng Cai (ZJU); Xiaofei He (Zhejiang University); Wei Liu (Tencent) |
1397 | Adaptive Agent Transformer for Few-shot Segmentation | Yuan Wang (University of Science and Technology of China)*; Rui Sun (University of Science and Technology of China); Zhe Zhang (Lunar Exploration and Space Engineering Center of CNSA); Tianzhu Zhang (University of Science and Technology of China) |
1408 | Improving Few-Shot Part Segmentation using Coarse Supervision | Oindrila Saha (University of Massachusetts Amherst)*; Zezhou Cheng (University of Massachusetts, Amherst); Subhransu Maji (University of Massachusetts, Amherst) |
1412 | Mining Relations among Cross-Frame Affinities for Video Semantic Segmentation | Guolei Sun (ETH Zurich); Yun Liu (ETH Zurich)*; Hao Tang (ETH Zurich); Ajad Chhatkuli (ETH Zurich); Le Zhang (University of Electronic Science and Technology of China); Luc Van Gool (ETH Zurich) |
1414 | Out-of-distribution Detection with Boundary Aware Learning | Sen Pei (Institute of Automation, Chinese Academy of Sciences)*; Xin Zhang (Institute of Automation, Chinese Academy of Sciences, University of Chinese Academy of Sciences); Bin Fan (University of Science and Technology Beijing); Gaofeng Meng (Chinese Academy of Sciences) |
1415 | NeILF: Neural Incident Light Field for Physically-based Material Estimation | Yao Yao (Apple Inc.); Jingyang Zhang (The Hong Kong University of Science and Technology)*; Jingbo Liu (Apple Inc.); Yihang Qu (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple) |
1417 | ViewFormer: NeRF-free Neural Rendering from Few Images Using Transformers | Jonáš Kulhánek (Czech Technical University in Prague)*; Erik Derner (CTU CIIRC); Torsten Sattler (Czech Technical University in Prague); Robert Babuska (TU Delft) |
1421 | L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing | Ziyu Chen (Shanghai Jiao Tong University)*; Chenjing Ding (Sensetime Group Limited); Jianfei Guo (Shanghai AI Laboratory); Dongliang Wang (SenseTime Group Limited); Yikang Li (Shanghai AI Lab); Xuan Xiao (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Li Song (Shanghai Jiao Tong University) |
1424 | ARF: Artistic Radiance Fields | Kai Zhang (Cornell University)*; Nicholas I Kolkin (Adobe Research); Sai Bi (Adobe Research); Fujun Luan (Adobe Research); Zexiang Xu (Adobe Research); Eli Shechtman (Adobe Research, US); Noah Snavely (Cornell University and Google AI) |
1425 | Multiview Stereo with Cascaded Epipolar RAFT | Zeyu Ma (Princeton University)*; Zachary Teed (Princeton University); Jia Deng (Princeton University) |
1439 | What to Hide from Your Students: Attention-Guided Masked Image Modeling | Ioannis Kakogeorgiou (National Technical University of Athens)*; Spyros Gidaris (valeo.ai); Bill Psomas (National Technical University of Athens); Yannis Avrithis (IARAI, Athena RC); Andrei Bursuc (valeo.ai); Konstantinos Karantzalos (National Technical University of Athens); Nikos Komodakis (University of Crete) |
1441 | Static and Dynamic Concepts for Self-supervised Video Representation Learning | Rui Qian (The Chinese University of Hong Kong)*; Shuangrui Ding (Shanghai Jiao Tong University); Xian Liu (The Chinese University of Hong Kong); Dahua Lin (The Chinese University of Hong Kong) |
1447 | Deep Partial Updating: Towards Communication Efficient Updating for On-device Inference | Zhongnan Qu (ETH Zurich)*; Cong Liu (University of Texas at Dallas); Lothar Thiele (ETH Zürich) |
1455 | Gradient-based Uncertainty for Monocular Depth Estimation | Julia Hornauer (Ulm University)*; Vasileios Belagiannis (Otto von Guericke University Magdeburg) |
1456 | Flow-Guided Transformer for Video Inpainting | Kaidong Zhang (University of Science and Technology of China); Jingjing Fu (Microsoft)*; Dong Liu (University of Science and Technology of China) |
1468 | Relationformer: A Unified Framework for Image-to-Graph Generation | Suprosanna Shit (TUM)*; Rajat Koner (Ludwig Maximilian University of Munich); Bastian Wittmann (Technical University of Munich); Johannes C. Paetzold (TUM); Ivan Ezhov (TUM); Hongwei Li (Technical University of Munich); Jiazhen Pan (Technical University of Munich); Sahand Sharifzadeh (Ludwig Maximilian University of Munich); Georgios Kaissis (Technische Universität München); Volker Tresp (LMU); Bjoern Menze (TUM) |
1469 | ARAH: Animatable Volume Rendering of Articulated Human SDFs | Shaofei Wang (ETH Zurich)*; Katja Schwarz (MPI Tuebingen); Andreas Geiger (University of Tuebingen); Siyu Tang (ETH Zurich) |
1471 | Learning Hierarchy Aware Features for Reducing Mistake Severity | Ashima Garg (IIIT Delhi)*; Depanshu Sani (Indraprastha Institute of Information Technology); Saket Anand (Indraprastha Institute of Information Technology Delhi) |
1474 | Exploiting Unlabeled Data with Vision and Language Models for Object Detection | Shiyu Zhao (Rutgers University)*; Zhixing Zhang (Rutgers University); Samuel Schulter (NEC Laboratories America); Long Zhao (Google Research); Vijay Kumar B G (NEC Laboratories America); Anastasis Stathopoulos (Rutgers University); Manmohan Chandraker (UC San Diego); Dimitris N. Metaxas (Rutgers) |
1479 | A Simple and Robust Correlation Filtering method for text-based person search | Wei Suo (Northwestern Polytechnical University); MengYang Sun (Northwestern Polytechnical University); Kai Niu (Northwestern Polytechnical University); Yiqi Gao (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University); Yanning Zhang (Northwestern Polytechnical University)*; Qi Wu (University of Adelaide) |
1482 | Hunting Group Clues with Transformers for Social Group Activity Recognition | Masato Tamura (Hitachi America, Ltd.)*; Rahul Vishwakarma (Hitachi America Ltd.); Ravigopal Vennelakanti (Hitachi America, Ltd.) |
1493 | Quantized GAN for Complex Music Generation from Dance Videos | Ye Zhu (Illinois Institute of Technology)*; Kyle B Olszewski (Snap Inc.); Yu Wu (Princeton University); Panos Achlioptas (Stanford University); Menglei Chai (Snap Inc.); Yan Yan (Illinois Institute of Technology); Sergey Tulyakov (Snap Inc) |
1506 | Not Just Streaks: Towards Ground Truth for Single Image Deraining | Yunhao Ba (UCLA)*; Howard Zhang (UCLA); Ethan Yang (UCLA); Akira Suzuki (UCLA); Arnold J Pfahnl (University of California, Los Angeles); Chethan Chinder Chandrappa (University of California – Los Angeles); Celso de Melo (Army Research Laboratory); Suya You (US Army Research Laboratory); Stefano Soatto (UCLA); Alex Wong (Yale University); Achuta Kadambi (UCLA) |
1511 | HIVE: Evaluating the Human Interpretability of Visual Explanations | Sunnie S. Y. Kim (Princeton University)*; Nicole Meister (Princeton University); Vikram V. Ramaswamy (Princeton University); Ruth C Fong (Princeton University); Olga Russakovsky (Princeton University) |
1512 | GAMa: Cross-view Video Geo-localization | Shruti Vyas (University of Central Florida)*; Chen Chen (University of Central Florida); Mubarak Shah (University of Central Florida) |
1516 | Meta-Sampler: Almost-Universal yet Task-Oriented Sampling for Point Clouds | Ta-Ying Cheng (University of Oxford); Qingyong Hu (University of Oxford)*; Qian Xie (University of Oxford); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford) |
1517 | Multi-Query Video Retrieval | Zeyu Wang (Princeton University)*; Yu Wu (Princeton University); Karthik Narasimhan (Princeton University); Olga Russakovsky (Princeton University) |
1525 | Waymo Open Dataset: Panoramic Video Panoptic Segmentation | Jieru Mei (Johns Hopkins University); Alex Zhu (Waymo)*; Xinchen Yan (Waymo); Hang Yan (Waymo LLC); Siyuan Qiao (Google); Yukun Zhu (Google Inc.); Liang-Chieh Chen (Google Inc.); Henrik Kretzschmar (Waymo) |
1531 | MIME: Minority Inclusion for Majority Group Enhancement of AI Performance | Pradyumna Chari (UCLA); Yunhao Ba (UCLA)*; Shreeram Athreya (UCLA); Achuta Kadambi (UCLA) |
1534 | Self-supervised Human Mesh Recovery with Cross-Representation Alignment | Xuan Gong (University at Buffalo); Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); David Doermann (University at Buffalo); Ziyan Wu (United Imaging Intelligence)* |
1541 | TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency | Medhini Narasimhan (UC Berkeley)*; Arsha Nagrani (Google); Chen Sun (Brown University); Michael Rubinstein (Google); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Cordelia Schmid (Google) |
1542 | A Perceptual Quality Metric for Video Frame Interpolation | Qiqi Hou (Portland State University)*; Abhijay Ghildyal (Portland State University); Feng Liu (Portland State University) |
1543 | Adaptive Feature Interpolation for Low-Shot Image Generation | Mengyu Dai (Microsoft Corporation)*; Haibin Hang (Amazon.com); Xiaoyang Guo (Facebook) |
1544 | Rethinking Learning Approaches for Long-Term Action Anticipation | Megha Nawhal (Simon Fraser University)*; Akash Abdu Jyothi (Simon Fraser University); Greg Mori (Simon Fraser University / Borealis AI) |
1546 | Object Manipulation via Visual Target Localization | Kiana Ehsani (Allen Institute for Artificial Intelligence)*; Ali Farhadi (University of Washington, Apple); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence); Roozbeh Mottaghi (Allen Institute for AI) |
1549 | AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction | Zerui Chen (Inria Paris); Yana Hasson (Inria); Cordelia Schmid (Inria/Google)*; Ivan Laptev (INRIA Paris) |
1551 | Shift-tolerant Perceptual Similarity Metric | Abhijay Ghildyal (Portland State University)*; Feng Liu (Portland State University) |
1557 | Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing | Benedikt Boecking (Carnegie Mellon University); Naoto Usuyama (Microsoft Research); Shruthi J Bannur (Microsoft Research); Daniel Coelho de Castro (Microsoft Research); Anton Schwaighofer (Microsoft Research); Stephanie Hyland (Microsoft Research); Maria Teodora A Wetscherek (Microsoft); Tristan Naumann (Microsoft Research Redmond, US); Aditya Nori (Microsoft Research); Javier Alvarez-Valle (Microsoft Research); Hoifung Poon (Microsoft Research); Ozan Oktay (Microsoft Research)* |
1561 | Self-Supervised Sparse Representation for Video Anomaly Detection | Jhih-Ciang Wu (Academia Sinica)*; He-Yen Hsieh (Academia Sinica); Ding-Jie Chen (Academia Sinica); Chiou-Shann Fuh (National Taiwan University); Tyng-Luh Liu (Academia Sinica) |
1567 | CPO: Change Robust Panorama to Point Cloud Localization | Junho Kim (Seoul National University)*; Hojun Jang (Seoul National University); Changwoon Choi (Seoul National University); Young Min Kim (Seoul National University) |
1569 | MonoPLFlowNet: Permutohedral Lattice FlowNet for Real-Scale 3D Scene Flow Estimation with Monocular Images | Runfa Li (UC San Diego)*; Truong Nguyen (UC San Diego) |
1576 | DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning | Hyounguk Shon (KAIST)*; Janghyeon Lee (LG AI Research); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST) |
1578 | Contrastive Positive Mining for Unsupervised 3D Action Representation Learning | Haoyuan Zhang (Tianjin University)*; Yonghong Hou (Tianjin University); Wenjing Zhang (Tianjin University); Wanqing Li (University of Wollongong) |
1580 | Patch Similarity Aware Data-Free Quantization for Vision Transformers | Zhikai Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Liping Ma (Institute of Automation, Chinese Academy of Sciences); Mengjuan Chen (Center of Precision Sensing and Control, Institute of Automation, Chinese Academy of Sciences); Junrui Xiao (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences); Qingyi Gu (Institute of Automation, Chinese Academy of Sciences)* |
1586 | Perception-Distortion Balanced ADMM Optimization for Single-Image Super-Resolution | Yuehan Zhang (National University of Singapore)*; Bo Ji (National University of Singapore); Jia Hao (HiSilicon (Shanghai) Technologies Co., Ltd); Angela Yao (National University of Singapore) |
1596 | DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition | Yuxuan Liang (National University of Singapore)*; Pan Zhou (Sea AI Lab); Roger Zimmermann (NUS); Shuicheng Yan (Sea AI Labs) |
1606 | Hierarchical Contrastive Inconsistency Learning for Deepfake Video Detection | Zhihao Gu (Shanghai Jiao Tong University)*; Taiping Yao (Tencent YouTu); Yang Chen (Tencent); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University) |
1616 | Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal | Xinwei Liu (Institute of Information Engineering, Chinese Academy of Sciences)*; Jian Liu (Ant Group); Yang Bai (Tsinghua); Jindong Gu (University of Munich); Tao Chen (Ant Group); Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences); Xiaochun Cao (Sun Yat-sen University) |
1625 | ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO | Sanghyuk Chun (NAVER AI Lab)*; Wonjae Kim (NAVER AI Lab); Song Park (NAVER AI Lab); Minsuk Chang (NAVER AI Lab); Seong Joon Oh (Naver AI Lab) |
1626 | Personalizing Federated Medical Image Segmentation via Local Calibration | Jiacheng Wang (Xiamen University); Yueming Jin (The Chinese University of Hong Kong); Liansheng Wang (Xiamen University)* |
1628 | Learning to Detect Every Thing in an Open World | Kuniaki Saito (Boston University)*; Ping Hu (Boston University); Trevor Darrell (UC Berkeley); Kate Saenko (Boston University) |
1648 | MVP: Multimodality-guided Visual Pre-training | Longhui Wei (University of Science and Technology of China)*; Lingxi Xie (Huawei Inc.); Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Qi Tian (Huawei Cloud & AI) |
1649 | Uncertainty Learning in Kernel Estimation for Multi-Stage Blind Image Super-Resolution | Zhenxuan Fang (Xidian University); Weisheng Dong (Xidian University)*; Xin Li (West Virginia University); Jinjian Wu (Xidian University); Leida Li (Xidian University); Guangming Shi (Xidian University) |
1666 | Physical Attack on Monocular Depth Estimation in Autonomous Driving with Optimal Adversarial Patches | Zhiyuan Cheng (Purdue University)*; James C Liang (Rochester Institute of Technology); Hongjun Choi (Purdue University); Guanhong Tao (Purdue University); Zhiwen Cao (Purdue University); Dongfang Liu (Rochester Institute of Technology); Xiangyu Zhang (Purdue University) |
1670 | KVT: | Pichao Wang (Alibaba Group)*; Xue Wang (Alibaba DAMO Academy); Fan Wang (Alibaba Group); Ming Lin (Alibaba Group); Shuning Chang (Alibaba Group); Hao Li (Alibaba Group); Rong Jin (Alibaba Group) |
1673 | Locally Varying Distance Transform for Unsupervised Visual Anomaly Detection | Wen-Yan Lin (SMU); Zhonghang Liu (SMU); Siying Liu (I2R Singapore)* |
1676 | Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation | Gensheng Pei (Nanjing University of Science and Technology)*; Fumin Shen (UESTC); Yazhou Yao (Nanjing University of Science and Technology); Guo-Sen Xie (Nanjing University of Science and Technology); Zhenmin Tang (Nanjing University of Science and Technology); Jinhui Tang (Nanjing University of Science and Technology) |
1677 | PalGAN: Image Colorization with Palette Generative Adversarial Networks | Yi Wang (Shanghai AI Laboratory)*; Menghan Xia (Tencent AI lab); Lu Qi (The Chinese University of Hong Kong); Jing Shao (Sensetime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences) |
1687 | Fast-Vid2Vid: Spatial-Temporal Compression for Video-to-Video Synthesis | Long Zhuo (Shanghai AI Lab)*; Guangcong Wang (Nanyang Technological University); Shikai Li (SenseTime Research); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University) |
1693 | Generative Negative Text Replay for Continual Vision-Language Pretraining | Shipeng Yan (ShanghaiTech University)*; Lanqing Hong (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab); Jianhua Han (Huawei Noah’s Ark Lab); Tinne Tuytelaars (KU Leuven); Zhenguo Li (Huawei Noah’s Ark Lab); Xuming He (ShanghaiTech University) |
1697 | Learning Spatio-Temporal Downsampling for Effective Video Upscaling | Xiaoyu Xiang (Meta Platforms Inc.)*; Yapeng Tian (University of Texas at Dallas); Vijay Rengarajan (Meta Platforms Inc.); Lucas D Young (Facebook); Bo Zhu (Meta Platforms, Inc.); Rakesh Ranjan (Facebook) |
1698 | Geometric Representation Learning for Document Image Rectification | Hao Feng (University of Science and Technology of China)*; Wengang Zhou (University of Science and Technology of China); Jiajun Deng (University of Science and Technology of China); Yuechen Wang (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China) |
1701 | ASpanFormer: Detector-Free Image Matching with Adaptive Span Transformer | Hongkai Chen (HKUST)*; Zixin Luo (Apple Inc.); Lei Zhou (Apple); Yurun Tian (Apple); Zhen Mingmin (Apple Inc.); Tian Fang (Apple); David N McKinnon (Apple); Yanghai Tsin (Apple Inc); Long Quan (Apple) |
1709 | Egocentric Activity Recognition and Localization on a 3D Map | Miao Liu (Georgia Institute of Technology)*; Lingni Ma (Facebook Reality Labs); Kiran Somasundaram (Facebook Reality Labs); Yin Li (University of Wisconsin-Madison); Kristen Grauman (Facebook AI Research & UT Austin); James Rehg (Georgia Institute of Technology); Chao Li (Facebook Reality Labs) |
1710 | Generative Adversarial Network for Future Hand Segmentation from Egocentric Video | Wenqi Jia (Georgia Institute of Technology)*; Miao Liu (Georgia Institute of Technology); James Rehg (Georgia Institute of Technology) |
1712 | One-Shot Medical Landmark Localization by Edge-Guided Transform and Noisy Landmark Refinement | Zihao Yin (Center for Data Science, Peking University); Ping Gong (Deepwise AI Lab); Chunyu Wang (Microsoft Research Asia); Yizhou Yu (The University of Hong Kong); Yizhou Wang (PKU)* |
1721 | Learning Prior Feature and Attention Enhanced Image Inpainting | Chenjie Cao (Fudan University)*; Qiaole Dong (Fudan University); Yanwei Fu (Fudan University) |
1730 | AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions | Yian Wang (Peking University); Ruihai Wu (Peking University); Kaichun Mo (Stanford); Jiaqi Ke (Peking University); Qingnan Fan (Tencent AI Lab); Leonidas Guibas (Stanford University); Hao Dong (Peking University)* |
1735 | Video Graph Transformer for Video Question Answering | Junbin Xiao (National University of Singapore)*; Pan Zhou (Sea AI Lab); Tat-Seng Chua (National Univ. of Singapore); Shuicheng Yan (Sea AI Labs) |
1737 | A Reliable Online Method for Joint Estimation of Focal Length and Camera Rotation | Yiming Qian (Osaka University)*; James Elder (York University) |
1738 | Learning Local Implicit Fourier Representation for Image Warping | Jaewon Lee (DGIST)*; Kwang Pyo Choi (Samsung Electronics); Kyong Hwan Jin (DGIST) |
1740 | SepLUT: Separable Image-adaptive Lookup Tables for Real-time Image Enhancement | Canqian Yang (Shanghai Jiao Tong University); Meiguang Jin (Alibaba Group); Yi Xu (Shanghai Jiao Tong University)*; Rui Zhang (Shanghai Jiao Tong University); Ying Chen (Alibaba Group); Huaida Liu (Alibaba) |
1744 | Temporal-MPI: Enabling Multi-Plane Images for Dynamic Scene Modelling via Temporal Basis Learning | Wenpeng Xing (Hong Kong Baptist University); Jie Chen (Hong Kong Baptist University)* |
1746 | Blind Image Decomposition | Junlin Han (CSIRO)*; Weihao Li (Data61, CSIRO); Pengfei Fang (The Australian National University); Chunyi Sun (Australian National University); Jie Hong (Australian National University); Mohammad Ali Armin (CSIRO(Data61)); Lars Petersson (Data61/CSIRO); HONGDONG LI (Australian National University, Australia) |
1751 | INT: Towards Infinite-frames 3D Detection with An Efficient Framework | Jianyun Xu (DAMO Academy, Alibaba Group)*; Zhenwei Miao (DAMO Academy, Alibaba Group); Da Zhang (UC Santa Barbara); Hongyu Pan (DAMO Academy, Alibaba Group); Kaixuan Liu (DAMO Academy, Alibaba Group); Peihan Hao (DAMO Academy, Alibaba Group); Jun Zhu (DAMO Academy, Alibaba Group); Zhengyang Sun (Tsinghua University); Li Hongmin (Huawei TCS lab); Xin Zhan (DAMO Academy, Alibaba Group) |
1756 | MuLUT: Cooperating Multiple Look-Up Tables for Efficient Image Super-Resolution | Jiacheng Li (University of Science and Technology of China); Chang Chen (Huawei Noah’s Ark Lab); Zhen Cheng (University of Science and Technology of China); Zhiwei Xiong (University of Science and Technology of China)* |
1757 | NDF: Neural Deformable Fields for Dynamic Human Modelling | Ruiqi Zhang (Hong Kong Baptist University); Jie Chen (Hong Kong Baptist University)* |
1759 | MPIB: An MPI-Based Bokeh Rendering Framework for Realistic Partial Occlusion Effects | Juewen Peng (Huazhong University of Science and Technology); Jianming Zhang (Adobe Research); Xianrui Luo (Huazhong University of Science and Technology); Hao Lu (Huazhong University of Science and Technology); Ke Xian (Huazhong University of Science and Technology)*; Zhiguo Cao (Huazhong Univ. of Sci.&Tech.) |
1761 | Neural Density-Distance Fields | Itsuki UEDA (University of Tsukuba)*; Yoshihiro Fukuhara (Waseda University); Hirokatsu Kataoka (National Institute of Advanced Industrial Science and Technology (AIST)); Hiroaki Aizawa (Hiroshima University); Hidehiko Shishido (University of Tsukuba); Itaru Kitahara (University of Tsukuba) |
1762 | MoDA: Map style transfer for self-supervised Domain Adaptation of embodied agents | Eun Sun Lee (Seoul National University)*; Junho Kim (Seoul National University); Sangwon Park (Seoul Nat’l University); Young Min Kim (Seoul National University) |
1766 | L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training | Jonghyun Bae (Seoul National University)*; Woohyeon Baek (Seoul National University); Tae Jun Ham (Seoul National University); Jae W. Lee (Seoul National University) |
1780 | Prior-Guided Adversarial Initialization for Fast Adversarial Training | Xiaojun Jia (Institute of Information Engineering, Chinese Academy of Sciences)*; Yong Zhang (Tencent AI Lab); Xingxing Wei (Beihang University); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen; Shenzhen Research Institute of Big Data); Ke Ma (UCAS); Jue Wang (Tencent AI Lab); Xiaochun Cao (Sun Yat-sen University) |
1790 | Housekeep: Tidying Virtual Households using Commonsense Reasoning | Yash Mukund Kant (University of Toronto)*; Arun Ramachandran (Georgia Institute of Technology); Sriram Yenamandra (Georgia Institute of Technology); Igor Gilitschenski (University of Toronto); Dhruv Batra (Georgia Tech & Facebook AI Research); Andrew Szot (Georgia Institute of Technology); Harsh Agrawal (Georgia Institute of Technology) |
1804 | Real-RawVSR: Real-World Raw Video Super-Resolution with a Benchmark Dataset | Huanjing Yue (Tianjin University)*; Zhiming Zhang (Tianjin University); Jingyu Yang (Tianjin University) |
1807 | ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning | Shengchao Hu (Shanghai Jiao Tong University)*; Li Chen (Shanghai AI Laboratory); Penghao Wu (Shanghai Jiao Tong University); Hongyang Li (SenseTime); Junchi Yan (Shanghai Jiao Tong University); Dacheng Tao (JD.com) |
1810 | NeXT: Towards High Quality Neural Radiance Fields via Multi-Skip Transformer | Yunxiao Wang (Tsinghua University); Yanjie Li (Tsinghua University)*; Peidong Liu (Tsinghua University); Tao Dai (Shenzhen University); Shu-Tao Xia (Tsinghua University) |
1814 | Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution | Zhongwei Qiu (University of Science and Technology Beijing); Huan Yang (Microsoft Research)*; Jianlong Fu (Microsoft Research); Dongmei Fu (University of Science and Technology Beijing) |
1819 | Adversarial Partial Domain Adaptation by Cycle Inconsistency | Kun-Yu Lin (Sun Yat-sen University); Jiaming Zhou (Sun Yat-sen University); Yukun Qiu (Sun Yat-sen University); WEI-SHI ZHENG (Sun Yat-sen University, China)* |
1824 | BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks | Uddeshya Upadhyay (University of Tübingen)*; Shyamgopal Karthik (University of Tübingen); Massimiliano Mancini (University of Tübingen); Yanbei Chen (University of Tübingen); Zeynep Akata (University of Tübingen) |
1831 | Domain Randomization-Enhanced Depth Simulation and Restoration for Perceiving and Grasping Specular and Transparent Objects | Qiyu Dai (Peking University); Jiyao Zhang (Xi’an Jiaotong University); Qiwei Li (Peking University); Tianhao Wu (Peking University); Hao Dong (Peking University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University); He Wang (Peking University)* |
1832 | PS-NeRF: Neural Inverse Rendering for Multi-view Photometric Stereo | Wenqi Yang (The University of Hong Kong)*; Guanying CHEN (The Chinese University of Hong Kong, Shenzhen); Chaofeng Chen (Nanyang Technological University); Zhenfang Chen (MIT-IBM Watson AI Lab); Kwan-Yee K. Wong (The University of Hong Kong) |
1845 | DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation | Ailing Zeng (The Chinese University of Hong Kong)*; Xuan Ju (The Chinese University of Hong Kong); Lei Yang (Sensetime Group Limited); Ruiyuan Gao (The Chinese University of Hong Kong); Xizhou Zhu (SenseTime); Bo Dai (Shanghai AI Lab); Qiang Xu (The Chinese University of Hong Kong) |
1846 | Hierarchical Latent Structure for Multi-Modal Vehicle Trajectory Forecasting | Dooseop Choi (ETRI)*; KyoungWook Min (ETRI) |
1848 | SmoothNet: A Plug-and-Play Network for Refining Human Poses in Videos | Ailing Zeng (The Chinese University of Hong Kong)*; Lei Yang (Sensetime Group Limited); Xuan Ju (The Chinese University of Hong Kong); Jiefeng Li (Shanghai Jiao Tong University); Jianyi Wang (Nanyang Technological University); Qiang Xu (The Chinese University of Hong Kong) |
1851 | Share With Thy Neighbors: Single-View Reconstruction by Cross-Instance Consistency | Tom Monnier (École des ponts Paristech)*; Matthew Fisher (Adobe Research); Alexei A Efros (UC Berkeley); Mathieu Aubry (École des ponts ParisTech) |
1852 | End-to-End Weakly Supervised Object Detection with Sparse Proposal Evolution | Mingxiang Liao (University of Chinese Academy of Sciences); Fang Wan (University of Chinese Academy of Sciences)*; Yuan Yao (University of Chinese Academy of Sciences); Zhenjun Han (University of Chinese Academy of Sciences); Zou Jialing (University of Chinese Academy of Sciences); Yuze Wang (Huawei Noah’s Ark Lab); Bailan Feng (Huawei Noah’s Ark Lab); Peng Yuan (Huawei Noah’s Ark Lab); Qixiang Ye (University of Chinese Academy of Sciences, China) |
1853 | PAC-Net: Highlight Your Video via History Preference Modeling | Hang Wang (Huawei HiSilicon)*; Penghao Zhou (ByteDance); Chong Zhou (Nanyang Technological University); Zhao Zhang (Nankai University); Xing Sun (Shopee) |
1859 | Efficient Point Cloud Analysis Using Hilbert Curve | Wanli Chen (CUHK)*; Xinge Zhu (The Chinese University of Hong Kong); Guojin Chen (The Chinese University of Hong Kong); Bei Yu (CUHK) |
1860 | Learning Online Multi-Sensor Depth Fusion | Erik Sandström (ETH Zürich)*; Martin R. Oswald (ETH Zurich); Suryansh Kumar (ETH Zurich); Silvan Weder (ETH Zürich); Fisher Yu (ETH Zurich); Cristian Sminchisescu (Lund University); Luc Van Gool (ETH Zurich) |
1866 | Self-Support Few-Shot Semantic Segmentation | Qi Fan (HKUST)*; Wenjie Pei (Harbin Institute of Technology, Shenzhen); Yu-Wing Tai (Kuaishou Technology / HKUST); Chi-Keung Tang (Hong Kong University of Science and Technology) |
1868 | Few-Shot Object Detection with Model Calibration | Qi Fan (HKUST)*; Chi-Keung Tang (Hong Kong University of Science and Technology); Yu-Wing Tai (Kuaishou Technology / HKUST) |
1870 | S2-VER: Semi-Supervised Visual Emotion Recognition | Guoli Jia (Nankai University); Jufeng Yang (Nankai University)* |
1882 | Self-Supervision Can Be a Good Few-Shot Learner | Yuning Lu (USTC); Liangjian Wen (Noah’s Ark Lab, Huawei Technologies Company Limited); Jianzhuang Liu (Huawei Noah’s Ark Lab); Yajing Liu (USTC); Xinmei Tian (USTC)* |
1886 | My View is the Best View: Procedure Learning from Egocentric Videos | Siddhant Bansal (IIIT, Hyderabad)*; Chetan Arora (Indian Institute of Technology Delhi); C.V. Jawahar (IIIT-Hyderabad) |
1894 | Trace Controlled Text to Image Generation | Kun Yan (Beihang University)*; Lei Ji (Microsoft); Chenfei Wu (Microsoft); Jianmin Bao (Microsoft); Ming Zhou (SINOVATION VENTURES); Nan Duan (Microsoft Research); Shuai Ma (Beihang University) |
1925 | Towards Comprehensive Representation Enhancement in Semantics-guided Self-supervised Monocular Depth Estimation | Jingyuan Ma (HikVision Research Institute)*; Xiangyu Lei (Hikvision Research Institute); Nan Liu (Hikvision); Zhao Xian (Hikvision); Shiliang Pu (Hikvision Research Institute) |
1929 | Calibration-free Multi-view Crowd Counting | Qi Zhang (City University of Hong Kong, Hong Kong)*; Antoni Chan (City University of Hong Kong, Hong Kong) |
1930 | Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training | Zhenyu Li (Harbin Institute of Technology)*; Zehui Chen (University of Science and Technology of China); Ang Li (SenseTime Research); Liangji Fang (Sensetime Research); Qinhong Jiang (SenseTime Research; Shanghai AI Laboratory); Xianming Liu (Harbin Institute of Technology); Junjun Jiang (Harbin Institute of Technology) |
1940 | Online Continual Learning with Contrastive Vision Transformer | Zhen Wang (The University of Sydney)*; Liu Liu (The University of Sydney); Yajing Kong (The University of Sydney); Jiaxian Guo (The University of Sydney); Dacheng Tao (JD.com) |
1946 | COO: Comic Onomatopoeia Dataset for Recognizing Arbitrary or Truncated Texts | Jeonghun Baek (The University of Tokyo)*; Yusuke Matsui (The University of Tokyo); Kiyoharu Aizawa (The University of Tokyo) |
1947 | BungeeNeRF: Progressive Neural Radiance Field for Extreme Multiscale Scene Rendering | Yuanbo Xiangli (Chinese University of Hong Kong)*; Linning Xu (CUHK); Xingang Pan (Max Planck Institute for Informatics); Nanxuan Zhao (University of Bath); Anyi Rao (The Chinese University of Hong Kong); Christian Theobalt (MPI Informatik); Bo Dai (Shanghai AI Lab); Dahua Lin (The Chinese University of Hong Kong) |
1951 | AiATrack: Attention in Attention for Transformer Visual Tracking | Shenyuan Gao (Huazhong University of Science and Technology)*; CHUNLUAN ZHOU (Wormpex AI Research); Chao Ma (Shanghai Jiao Tong University); Xinggang Wang (Huazhong University of Science and Technology); Junsong Yuan (State University of New York at Buffalo, USA) |
1952 | Learning Invariant Visual Representations for Compositional Zero-Shot Learning | Tian Zhang (Beijing University of Posts and Telecommunications); Kongming Liang (Beijing University of Posts and Telecommunications)*; Ruoyi Du (Beijing University of Posts and Telecommunications); Xian Sun (Aerospace Information Research Institute, Chinese Academy of Sciences); Zhanyu Ma (Beijing University of Posts and Telecommunications); Jun Guo (Beijing University of Posts and Telecommunications) |
1954 | Image Coding for Machines with Omnipotent Feature Learning | Ruoyu Feng (University of Science and Technology of China)*; Xin Jin (University of Science and Technology of China); Zongyu Guo (University of Science and Technology of China); Runsen Feng (University of Science and Technology of China); Yixin Gao (University of Science and Technology of China); Tianyu He (Microsoft Research Asia); Zhizheng Zhang (Microsoft Research); Simeng Sun (University of Science and Technology of China); Zhibo Chen (University of Science and Technology of China) |
1959 | MOTCOM: The Multi-Object Tracking Dataset Complexity Metric | Malte Pedersen (Aalborg University)*; Joakim Bruslund Haurum (Aalborg University); Patrick Dendorfer (TUM); Thomas B. Moeslund (Aalborg University) |
1980 | How Severe is Benchmark-Sensitivity in Video Self-Supervised Learning? | Fida Mohammad Thoker (University of Amsterdam)*; Hazel Doughty (University of Amsterdam); Piyush Nitin Bagad (University of Amsterdam); Cees Snoek (University of Amsterdam) |
1982 | Rethinking Robust Representation Learning Under Fine-grained Noisy Faces | Bingqi Ma (Sensetime Group Limited)*; Guanglu Song (Sensetime); Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Yu Liu (SenseTime Group LTD) |
1986 | Feature Representation Learning for Unsupervised Cross-domain Image Retrieval | Conghui Hu (National University of Singapore)*; Gim Hee Lee (National University of Singapore) |
1987 | Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation | Sunghwan Hong (Korea University); Seokju Cho (Korea University); Jisu Nam (Korea University); Stephen Lin (Microsoft Research); Seungryong Kim (Korea University)* |
1988 | Spatial-Frequency Domain Information Integration for Pan-sharpening | Man Zhou (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Keyu Yan (University of Science and Technology of China); Hu Yu (University of Science and Technology of China); Xueyang Fu (University of Science and Technology of China); Aiping Liu (University of Science and Technology of China); Xian Wei (East China Normal University); Feng Zhao (University of Science and Technology of China)* |
1991 | TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement | Keyang Zhou (University of Tübingen)*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Jan E. Lenssen (TU Dortmund); Gerard Pons-Moll (University of Tübingen) |
1999 | HRDA: Context-Aware High-Resolution Domain-Adaptive Semantic Segmentation | Lukas Hoyer (ETH Zurich)*; Dengxin Dai (ETH Zurich); Luc Van Gool (ETH Zurich) |
2012 | Combating Label Distribution Shift for Active Domain Adaptation | Sehyun Hwang (POSTECH)*; Sohyun Lee (POSTECH); Sungyeon Kim (POSTECH); Jungseul Ok (POSTECH); Suha Kwak (POSTECH) |
2016 | GIPSO: Geometrically Informed Propagation for Online Adaptation in 3D LiDAR Segmentation | Cristiano Saltori (University of Trento)*; Evgeny Krivosheev (University of Trento); Stéphane Lathuilière (Telecom-Paris); Nicu Sebe (University of Trento); Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler) |
2025 | SuperLine3D: Self-supervised Line Segmentation and Description for LiDAR Point Cloud | Xiangrui Zhao (Zhejiang University)*; Sheng Yang (Alibaba Group); Tianxin Huang (Zhejiang University); Jun Chen (Zhejiang University); Teng Ma (Alibaba Group); Mingyang Li (Alibaba A.I. Labs); Yong Liu (Zhejiang University) |
2031 | Efficient Meta-Tuning for Content-aware Neural Video Delivery | Xiaoqi Li (Columbia University in the City of New York)*; Jiaming Liu (Peking University); Shizun Wang (Beijing University of Posts and Telecommunications); Cheng Lyu (Beijing University of Posts and Telecommunications); Ming Lu (Intel Labs China); Yurong Chen (Intel Labs China); Anbang Yao (Intel Labs China); Yandong Guo (OPPO Research Institute); Shanghang Zhang (University of California, Berkeley) |
2033 | PoseTrans: A Simple Yet Effective Pose Transformation Augmentation for Human Pose Estimation | Wentao Jiang (Beihang University)*; Sheng Jin (The University of Hong Kong); Wentao Liu (Sensetime); Chen Qian (SenseTime); Ping Luo (The University of Hong Kong); Si Liu (Beihang University) |
2039 | 3D-Aware Semantic-Guided Generative Model for Human Synthesis | Jichao Zhang (University of Trento)*; Enver Sangineto (University of Modena and Reggio Emilia); Hao Tang (ETH Zurich); Aliaksandr Siarohin (Snapchat); Zhun Zhong (University of Trento); Nicu Sebe (University of Trento); Wei Wang (EPFL) |
2041 | Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality | Yue Song (University of Trento)*; Nicu Sebe (University of Trento); Wei Wang (EPFL) |
2050 | CoSMix: Compositional Semantic Mix for Domain Adaptation in 3D LiDAR Segmentation | Cristiano Saltori (University of Trento)*; Fabio Galasso (Sapienza University); Giuseppe Fiameni (NVIDIA); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Fabio Poiesi (Fondazione Bruno Kessler) |
2054 | Streaming Multiscale Deep Equilibrium Models | Can Ufuk Ertenli (Middle East Technical University)*; Emre Akbas (METU); Ramazan Gokberk Cinbis (METU) |
2057 | AvatarCap: Animatable Avatar Conditioned Monocular Human Volumetric Capture | Zhe Li (Tsinghua University)*; Zerong Zheng (Tsinghua University); Hongwen Zhang (Tsinghua University); Chaonan Ji (Tsinghua University); Yebin Liu (Tsinghua University) |
2061 | Hierarchical Average Precision Training for Pertinent Image Retrieval | Elias Ramzi (Conservatoire National des Arts et Métiers)*; Nicolas Audebert (Cnam); Nicolas Thome (CNAM, Paris); Clément Rambour (Cnam); Xavier B Bitot (Coexya) |
2087 | Fashionformer: A Simple, Effective and Unified Baseline for Human Fashion Segmentation and Recognition | Shilin Xu (Peking University); Xiangtai Li (Peking University)*; Jingbo Wang (The Chinese University of HongKong); Guangliang Cheng (Sensetime Group Limited); Yunhai Tong (Peking University); Dacheng Tao (JD.com) |
2088 | Out-of-Distribution Detection with Semantic Mismatch under Masking | Yijun Yang (The Chinese University of Hong Kong)*; Ruiyuan Gao (The Chinese University of Hong Kong); Qiang Xu (The Chinese University of Hong Kong) |
2104 | Target-absent Human Attention | Zhibo Yang (Stony Brook University)*; Sounak Mondal (Stony Brook University); Seoyoung Ahn (Stony Brook University); Gregory Zelinsky (Stony Brook University); Minh Hoai (Stony Brook University); Dimitris Samaras (Stony Brook University) |
2105 | Reference-based Image Super-Resolution with Deformable Attention Transformer | Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Yawei Li (ETH Zurich); Yulun Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Luc Van Gool (ETH Zurich) |
2116 | Cross-Attention of Disentangled Modalities for 3D Human Mesh Recovery with Transformers | Junhyeong Cho (POSTECH)*; Kim Youwang (POSTECH); Tae-Hyun Oh (POSTECH) |
2118 | Learning to Generate Realistic LiDAR Point Cloud | Vlas Zyrianov (University of Illinois Urbana Champaign); Xiyue Zhu (University of Illinois); Shenlong Wang (UIUC)* |
2124 | GeoRefine: Self-Supervised Online Depth Refinement for Accurate Dense Mapping | Pan Ji (OPPO US Research Center)*; Qingan Yan (OPPO US Research Center); Yuxin Ma (Wing LLC); Yi Xu (OPPO US Research Center) |
2134 | Transform your Smartphone into a DSLR Camera: Learning the ISP in the Wild | Ardhendu Shekhar Tripathi (ETH Zurich)*; Martin Danelljan (ETH Zurich); Samarth Shukla (ETH Zurich); Radu Timofte (University of Wurzburg & ETH Zurich); Luc Van Gool (ETH Zurich) |
2138 | Uncertainty-Based Spatial-Temporal Attention for Online Action Detection | Hongji Guo (Rensselaer Polytechnic Institute)*; Zhou Ren (Wormpex AI Research); Yi Wu (Wormpex AI Research); Gang Hua (Wormpex AI Research); Qiang Ji (Rensselaer Polytechnic Institute) |
2144 | Video Question Answering with Iterative Video-Text Co-Tokenization | AJ Piergiovanni (Google)*; Kairo Morton (Massachusetts Institute of Technology); Weicheng Kuo (Google); Michael S Ryoo (Google; Stony Brook University); Anelia Angelova (Google) |
2145 | LaTeRF: Label and Text Driven Object Radiance Fields | Ashkan Mirzaei (University of Toronto)*; Yash Mukund Kant (University of Toronto); Jonathan Kelly (University of Toronto); Igor Gilitschenski (University of Toronto) |
2146 | Temporally Consistent Semantic Video Editing | Yiran Xu (University of Maryland, College Park)*; Badour A Sh AlBahar (Virginia Tech); Jia-Bin Huang (Facebook) |
2149 | SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation | Yang Zou (Amazon AI)*; Jongheon Jeong (KAIST); Latha Pemula (Amazon); Dongqing Zhang (Amazon); Onkar Dabeer (Amazon) |
2151 | Exploring Plain Vision Transformer Backbones for Object Detection | Yanghao Li (Facebook AI Research)*; Hanzi Mao (Facebook AI Research); Ross Girshick (FAIR); Kaiming He (Facebook AI Research) |
2152 | Fine-grained Egocentric Hand-Object Segmentation: Dataset, Model, and Applications | Lingzhi Zhang (University of Pennsylvania)*; Shenghao Zhou (University of Pennsylvania); Simon Stent (Toyota Research Institute); Jianbo Shi (University of Pennsylvania) |
2154 | Is It Necessary to Transfer Temporal Knowledge for Domain Adaptive Video Semantic Segmentation? | Xinyi Wu (University of South Carolina); Zhenyao Wu (University of South Carolina)*; Jin Wan (Beijing Jiaotong University); Lili Ju (University of South Carolina); Song Wang (University of South Carolina) |
2162 | GIMO: Gaze-Informed Human Motion Prediction in Context | Yang Zheng (Tsinghua University); Yanchao Yang (Stanford University)*; Kaichun Mo (Stanford); Jiaman Li (University of Southern California); Tao Yu (Tsinghua University); Yebin Liu (Tsinghua University); Karen Liu (Stanford); Leonidas Guibas (Stanford University) |
2166 | Error Compensation Framework for Flow-Guided Video Inpainting | Jaeyeon Kang (Yonsei University); Seoung Wug Oh (Adobe Research); Seon Joo Kim (Yonsei University)* |
2170 | Decomposing The Tangent of Occluding Boundaries According to Curvatures and Torsions | Huizong Yang (Georgia Institute of Technology)*; Anthony Yezzi (Georgia Institute of Technology) |
2171 | CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution | Taeho Kim (University of Colorado at Boulder)*; Yongin Kwon (Electronics and Telecommunications Research Institute); Jemin Lee (Electronics and Telecommunications Research Institute); Taeho Kim (Electronics and Telecommunications Research Institute); Sangtae Ha (University of Colorado at Boulder) |
2180 | Scraping Textures from Natural Images for Synthesis and Editing | Xueting Li (University of California, Merced)*; Xiaolong Wang (UCSD); Ming-Hsuan Yang (University of California at Merced); Alexei A Efros (UC Berkeley); Sifei Liu (NVIDIA) |
2203 | Self-supervised Learning of Visual Graph Matching | Chang Liu (Shanghai Jiao Tong University); Shaofeng Zhang (Shanghai Jiao Tong University); Xiaokang Yang (Shanghai Jiao Tong University of China); Junchi Yan (Shanghai Jiao Tong University)* |
2206 | Disentangling Architecture and Training for Optical Flow | Deqing Sun (Google)*; Charles Herrmann (Google); Fitsum Reda (Google); Michael Rubinstein (Google); David J Fleet (University of Toronto); William T Freeman (Google) |
2217 | PointFix: Learning to Fix Domain Bias for Robust Online Stereo Adaptation | Kwonyoung Kim (Yonsei University); JungIn Park (Yonsei University); Jiyoung Lee (NAVER AI Lab); Dongbo Min (Ewha Womans University); Kwanghoon Sohn (Yonsei Univ.)* |
2218 | Teaching Where to Look: Attention Similarity Knowledge Distillation for Low Resolution Face Recognition | Sungho Shin (Gwangju Institute of Science and Technology); Joosoon Lee (Gwangju Institute of Science and Technology); junseok lee (GIST(Gwangju Institute of Science and Technology)); Yeonguk Yu (Gwangju Institute of Science and Technology); Kyoobin Lee (Gwangju Institute of Science and Technology)* |
2219 | Iwin: Human-Object Interaction Detection via Transformer with Irregular Windows | Danyang Tu (Shanghai Jiao Tong University)*; Xiongkuo Min (Shanghai Jiao Tong University); Huiyu Duan (Shanghai Jiao Tong University); Guodong Guo (Baidu); Guangtao Zhai (Shanghai Jiao Tong University); Wei Shen (Shanghai Jiao Tong University) |
2221 | Single Stage Virtual Try-on via Deformable Attention Flows | Shuai Bai (Alibaba Group)*; Huiling Zhou (Alibaba); Zhikang Li (DAMO Academy, Alibaba Group); Chang Zhou (Alibaba Group); Hongxia Yang (Alibaba Group) |
2222 | Learning Deep Non-Blind Image Deconvolution Without Ground Truths | Yuhui Quan (South China University of Technology)*; Zhuojie Chen (South China University of Technology); Huan Zheng (National University of Singapore); Hui Ji (National University of Singapore) |
2233 | Rethinking Zero-shot Action Recognition: Learning from Latent Atomic Actions | Yijun Qian (Carnegie Mellon University)*; Lijun Yu (Carnegie Mellon University); Wenhe Liu (Carnegie Mellon University); Alexander Hauptmann (Carnegie Mellon University) |
2234 | NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors | Jiepeng Wang (The University of Hong Kong); Peng Wang (The University of Hong Kong); Xiaoxiao Long (The University of Hong Kong); Christian Theobalt (MPI Informatik); Taku Komura (The University of Hong Kong); Lingjie Liu (Max Planck Institute for Informatics); Wenping Wang (The University of Hong Kong)* |
2237 | Rethinking Data Augmentation for Robust Visual Question Answering | Long Chen (Columbia University)*; Yuhang Zheng (Zhejiang University); Jun Xiao (Zhejiang University) |
2240 | Dual-Domain Self-Supervised Learning and Model Adaption for Deep Compressive Imaging | Yuhui Quan (South China University of Technology)*; Xinran Qin (South China University of Technology); Tongyao Pang (National University of Singapore); Hui Ji (National University of Singapore) |
2242 | Explicit Image Caption Editing | Zhen Wang (Zhejiang University); Long Chen (Columbia University)*; Wenbo Ma (Zhejiang University); Guangxing Han (Columbia University); Yulei Niu (Columbia University); Jian Shao (Zhejiang University); Jun Xiao (Zhejiang University) |
2255 | SphereFed: Hyperspherical Federated Learning | Xin Dong (Harvard University)*; Sai Qian Zhang (Harvard University); Ang Li (Google DeepMind); H.T. Kung (Harvard University) |
2257 | Local Color Distributions Prior for Image Enhancement | Haoyuan Wang (City University of Hong Kong)*; Ke Xu (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong) |
2267 | Teaching with Soft Label Smoothing for Mitigating Noisy Labels in Facial Expressions | Tohar Lukov (National University of Singapore)*; Na Zhao (NUS); Gim Hee Lee (National University of Singapore); Ser-Nam Lim (Facebook AI) |
2269 | Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion | Zhiqiang Yan (Nanjing University of Science and Technology)*; Xiang Li (Nanjing University of Science and Technology); Kun Wang (Nanjing University of Science and Technology); Zhenyu Zhang (Tencent); Jun Li (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
2272 | 2D Amodal Instance Segmentation Guided by 3D Shape Prior | Zhixuan Li (Peking University); Weining Ye (Peking University); Tingting Jiang (Peking University)*; Tiejun Huang (Peking University) |
2280 | How to Synthesize a Large-Scale and Trainable Micro-Expression Dataset? | Yuchi Liu (Australian National University)*; Zhongdao Wang (Tsinghua University); Tom Gedeon (The Australian National University); Liang Zheng (Australian National University) |
2285 | HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors | Luting Wang (Beihang University)*; Xiaojie Li (sensetime); Yue Liao (Beihang University); Zeren Jiang (ETH Zurich); Jianlong Wu (Shandong University); Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime); Si Liu (Beihang University) |
2293 | Meta Spatio-Temporal Debiasing for Video Scene Graph Generation | LI XU (Singapore University of Technology and Design)*; Haoxuan Qu (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design) |
2307 | A Sliding Window Scheme for Online Temporal Action Localization | Young Hwi Kim (Yonsei University); Hyolim Kang (Yonsei University); Seon Joo Kim (Yonsei University)* |
2310 | Ultra-high-resolution unpaired stain transformation via Kernelized Instance Normalization | Ming-Yang Ho (aetherAI)*; Min-Sheng Wu (aetherAI); Che-Ming Wu (aetherAI) |
2311 | SESS: Saliency Enhancing with Scaling and Sliding | Osman Tursun (Queensland University of Technology)*; SIMON DENMAN (Queensland University of Technology, Australia); Sridha Sridharan (QUT); Clinton Fookes (Queensland University of Technology) |
2312 | Data Efficient 3D Learner via Knowledge Transferred from 2D Model | Ping-Chung Yu (National Tsing Hua University)*; Cheng Sun (National Tsing Hua University); Min Sun (NTHU) |
2319 | MeshMAE: Masked Autoencoders for 3D Mesh Data Analysis | Yaqian Liang (Wuhan University); Shanshan Zhao (JD.COM); Baosheng Yu (The University of Sydney); Jing Zhang (The University of Sydney); Fazhi He (Wuhan University)* |
2327 | ERA: Expert Retrieval and Assembly for Early Action Prediction | Lin Geng Foo (Singapore University of Technology and Design)*; Tianjiao Li (Singapore University of Technology and Design); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design) |
2328 | Mining Cross-Person Cues for Body-Part Interactiveness Learning in HOI Detection | Xiaoqian Wu (Shanghai Jiao Tong University); Yong-Lu Li (Shanghai Jiao Tong University); Xinpeng Liu (Shanghai Jiao Tong University); Junyi Zhang (Shanghai Jiao Tong University); Yuzhe Wu (DongHua University); Cewu Lu (Shanghai Jiao Tong University)* |
2334 | Improving GANs for Long-Tailed Data through Group Spectral Regularization | Harsh Rangwani (Indian Institute of Science)*; Naman Jaswani (Indian Institute of Science); Tejan Karmali (Indian Institute of Science, Bengaluru); Varun Jampani (Google); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
2336 | Hierarchical Semantic Regularization of Latent Spaces in StyleGANs | Tejan Karmali (Indian Institute of Science, Bengaluru)*; Rishubh Parihar (Indian Institute of Science, Bangalore); Susmit Agrawal (Indian Institute of Science); Harsh Rangwani (Indian Institute of Science); Varun Jampani (Google); Maneesh K Singh (Motive Technologies); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
2337 | Symmetry Regularization and Saturating Nonlinearity for Robust Quantization | SEIN PARK (POSTECH); Yeongsang Jang (POSTECH); Eunhyeok Park (POSTECH)* |
2350 | IntereStyle: Encoding an Interest Region for Robust StyleGAN Inversion | Seung Jun Moon (KAIST)*; Gyeong-Moon Park (Kyung Hee University) |
2369 | Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation | Ziming Wang (Beihang University); Xiaoliang Huo (Beihang University); Zhenghao Chen (University of Sydney); Jing Zhang (Beihang University); Lu Sheng (Beihang University)*; Dong Xu (The University of Hong Kong) |
2373 | Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis | Shuai Shen (Tsinghua University); Wanhua Li (Tsinghua University); Zheng Zhu (Tsinghua University); Yueqi Duan (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
2378 | StyleLight: HDR Panorama Generation for Lighting Estimation and Editing | Guangcong Wang (Nanyang Technological University)*; Yinuo Yang (Nanyang Technological University); Chen Change Loy (Nanyang Technological University); Ziwei Liu (Nanyang Technological University) |
2379 | You Should Look at All Objects | Zhenchao Jin (University of Science and Technology of China)*; Dongdong Yu (ByteDance Inc.); Luchuan Song (University of Science and Technology of China); Zehuan Yuan (Bytedance.Inc); Lequan Yu (The University of Hong Kong) |
2384 | BRNet: Exploring Comprehensive Features for Monocular Depth Estimation | Wencheng Han (Beijing Institute of Technology)*; Junbo Yin (Beijing Institute of Technology); Xiaogang Jin (Zhejiang University); dai xiangdong (oppo); Jianbing Shen (Inception Institute of Artificial Intelligence) |
2403 | CoupleFace: Relation Matters for Face Recognition Distillation | Jiaheng Liu (Beihang University)*; Haoyu Qin (SenseTime); Yichao Wu (Sensetime Group Limited); Jinyang Guo (The University of Sydney); Ding Liang (Sensetime Group Limited); Ke Xu (Beihang University) |
2404 | Collaborating Domain-shared and Target-specific Feature Clustering for Cross-domain 3D Action Recognition | Qinying Liu (University of Science and Technology of China); Zilei Wang (University of Science and Technology of China)* |
2406 | Adaptive Spatial-BCE Loss for Weakly Supervised Semantic Segmentation | Tong Wu (Beijing Institute of Technology); Guangyu Ryan Gao (Beijing Institute of Technology)*; junshi huang (Meituan); Xiaolin Wei (Meituan); Xiaoming Wei (Meituan); Chi Harold Liu (Beijing Institute of Technology) |
2418 | Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement | Junuk Cha (UNIST)*; Muhammad Saqlain (Ulsan National Institute of Science and Technology); GeonU Kim (UNIST); Mingyu Shin (ULSAN NATIONAL INSTITUTE OF SCIENCE AND TECHNOLOGY); Seungryul Baek (UNIST) |
2423 | Explaining Deepfake Detection by Analysing Image Matching | Shichao Dong (Megvii); Jin Wang (Megvii); Haoqiang Fan (Megvii Inc(face++)); Jiajun Liang (Megvii); Renhe Ji (Megvii)* |
2424 | L-CoDer: Language-based Colorization with Color-object Decoupling Transformer | Zheng Chang (Beijing University of Posts and Telecommunications); Shuchen Weng (Peking University)*; Yu Li (International Digital Economy Academy); Si Li (Beijing University of Posts and Telecommunications); Boxin Shi (Peking University) |
2449 | GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation | Shi Gong (Huazhong University of Science and Technology); Xiaoqing Ye (Baidu Inc.); Xiao Tan (Baidu Inc.); Jingdong Wang (Baidu); Errui Ding (Baidu Inc.); Yu Zhou (Huazhong University of Science and Technology)*; Xiang Bai (Huazhong University of Science and Technology) |
2459 | Unsupervised Deep Multi-Shape Matching | Dongliang Cao (Technical University of Munich); Florian Bernard (University of Bonn)* |
2463 | GaitEdge: Beyond Plain End-to-end Gait Recognition for Better Practicality | Junhao Liang (Southern University of Science and Technology in China)*; Chao Fan (SUSTech); Saihui Hou (Beijing Normal University); Chuanfu Shen (Southern University of Science and Technology); Yongzhen Huang (School of Artificial Intelligence, Beijing Normal University); Shiqi Yu (Southern University of Science and Technology) |
2483 | EAutoDet: Efficient Architecture Search for Object Detection | Xiaoxing Wang (Shanghai Jiao Tong University); Jiale Lin (Shanghai Jiao Tong University); Juanping Zhao (Guangdong OPPO Mobile Telecommunications Co., Ltd.); Xiaokang Yang (Shanghai Jiao Tong University of China); Junchi Yan (Shanghai Jiao Tong University)* |
2485 | A Max-Flow based Approach for Neural Architecture Search | Chao Xue (beijing university of posts and telecommunications)*; Xiaoxing Wang (Shanghai Jiao Tong University); Junchi Yan (Shanghai Jiao Tong University); Chun-Guang Li (Beijing University of Posts & Telecommunications) |
2488 | Can Shuffling Video Benefit Temporal Bias Problem: A Novel Training Framework for Temporal Grounding | Jiachang Hao (Beijing University of Posts and Telecommunications)*; Haifeng Sun (Beijing university of posts and telecommunications); Pengfei Ren (Beijing University of Posts and Telecommunications); Jingyu Wang (Beijing University of Posts and Telecommunications); Qi Qi (Beijing University of Posts and Telecommunications); Jianxin Liao (beijing university of posts and telecommunications) |
2494 | tSF: Transformer-based Semantic Filter for Few-Shot Learning | Jinxiang Lai (Tencent)*; Siqian Yang (Tencent); Wenlong Liu (Tencent); Yi Zeng (Tencent); Zhongyi Huang (Tencent); Wenlong Wu (Tencent); Jun Liu (Tencent); Bin-Bin Gao (Tencent); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
2501 | Dense Gaussian Processes for Few-Shot Segmentation | Joakim Johnander (Linköping University)*; Johan Edstedt (Linköping University); Fahad Shahbaz Khan (MBZUAI); Michael Felsberg (Linköping University); Martin Danelljan (ETH Zurich) |
2507 | Adversarial Feature Augmentation for Cross-domain Few-shot Classification | Yanxu Hu (Sun Yat-sen University); Andy J Ma (Sun Yat-sen University)* |
2511 | Real-Time Neural Character Rendering with Pose-Guided Multiplane Images | Hao Ouyang (HKUST)*; Bo Zhang (Microsoft Research Asia); Pan Zhang (Shanghai AI Laboratory); Hao Yang (Microsoft Research Asia); Dong Chen (Microsoft Research Asia); Jiaolong Yang (Microsoft Research); Qifeng Chen (HKUST); Fang Wen (Microsoft Research Asia) |
2512 | Constructing Balance from Imbalance for Long-tailed Image Recognition | Yue Xu (Shanghai Jiao Tong University); Yong-Lu Li (Shanghai Jiao Tong University); Jiefeng Li (Shanghai Jiao Tong University); Cewu Lu (Shanghai Jiao Tong University)* |
2516 | SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views | Xiaoxiao Long (The University of Hong Kong)*; Cheng Lin (Tencent); Peng Wang (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
2538 | Dual Perspective Network for Audio Visual Event Localization | Varshanth Rao (Huawei Technologies)*; Md Ibrahim Khalil (Huawei Noah’s Ark Laboratory); Haoda Li (University of California, Berkeley); Peng Dai (Huawei Technologies Inc.Canada); Juwei Lu (Huawei Noah’s Ark Lab) |
2542 | SiamDoGe: Domain Generalizable Semantic Segmentation using Siamese Network | Zhenyao Wu (University of South Carolina)*; Xinyi Wu (University of South Carolina); Xiaoping Zhang (Wuhan University); Song Wang (University of South Carolina); Lili Ju (University of South Carolina) |
2545 | Is Appearance Free Action Recognition Possible? | Filip Ilic (Graz University of Technology)*; Rick Wildes (York University); Thomas Pock (Graz University of Technology) |
2557 | Detecting Twenty-thousand Classes using Image-level Supervision | Xingyi Zhou (The University of Texas at Austin)*; Rohit Girdhar (Facebook AI Research); Armand Joulin (Facebook AI Research); Philipp Kraehenbuehl (UT Austin); Ishan Misra (Facebook AI Research) |
2558 | DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation | Hongyang Li (South China University of Technology)*; Jiehong Lin (South China University of Technology); Kui Jia (South China University of Technology) |
2565 | Learning Cross-Video Neural Representations for High-Quality Frame Interpolation | Wentao Shangguan (Washington University in St Louis); Yu Sun (Washington University in St. Louis); Weijie Gan (Washington University in St. Louis); Ulugbek S. Kamilov (Washington University in St. Louis)* |
2568 | Learning Visibility for Robust Dense Human Body Estimation | Chun-Han Yao (University of California at Merced)*; Jimei Yang (Adobe); Duygu Ceylan (Adobe Research); Yi Zhou (Adobe Research); Yang Zhou (Adobe Research); Ming-Hsuan Yang (University of California at Merced) |
2573 | Texturify: Generating Textures on 3D Shape Surfaces | Yawar Siddiqui (Technical University of Munich)*; Justus Thies (Max Planck Institute for Intelligent Systems); Fangchang Ma (Apple Inc.); Qi Shan (Apple Inc.); Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich) |
2575 | Unsupervised Selective Labeling for More Effective Semi-Supervised Learning | Xudong Wang (UC Berkeley / ICSI); Long Lian (UC Berkeley / ICSI); Stella X Yu (UC Berkeley / ICSI)* |
2576 | Reliable Visual Question Answering: Abstain Rather Than Answer Incorrectly | Spencer Whitehead (Meta AI)*; Suzanne Petryk (UC Berkeley); Vedaad Shakib (UC Berkeley); Joseph E Gonzalez (UC Berkeley); Trevor Darrell (UC Berkeley); Anna Rohrbach (UC Berkeley); Marcus Rohrbach (Facebook AI Research) |
2581 | Studying Bias in GANs through the Lens of Race | Vongani H Maluleke (University of California, Berkeley); Neerja Thakkar (University of California, Berkeley)*; Tim Brooks (UC Berkeley); Ethan Weber (UC Berkeley); Trevor Darrell (UC Berkeley); Alexei A Efros (UC Berkeley); Angjoo Kanazawa (University of California Berkeley); Devin Guillory (UC Berkeley) |
2583 | On Multi-Domain Long-Tailed Recognition, Imbalanced Domain Generalization and Beyond | Yuzhe Yang (MIT)*; Hao Wang (Rutgers University); Dina Katabi (Massachusetts Institute of Technology) |
2584 | Disentangling Object Motion and Occlusion for Unsupervised Multi-frame Monocular Depth | Ziyue Feng (Clemson University)*; Liang Yang (Apple Inc); Longlong Jing (Waymo LLC); Haiyan Wang (The City College of New York); YingLi Tian (City University of New York); Bing Li (Clemson University) |
2586 | Autoregressive 3D Shape Generation via Canonical Mapping | An-Chieh Cheng (National Tsing Hua University); Xueting Li (University of California, Merced); Sifei Liu (NVIDIA)*; Min Sun (NTHU); Ming-Hsuan Yang (University of California at Merced) |
2589 | Learning Continuous Implicit Representation for Near-Periodic Patterns | Bowei Chen (CMU)*; Tiancheng Zhi (ByteDance); Martial Hebert (cmu); Srinivasa Narasimhan (Carnegie Mellon University, USA) |
2596 | Robust Landmark-based Stent Tracking in X-ray Fluoroscopy | Luojie Huang (Johns Hopkins University); Yikang Liu (United Imaging Intelligence America); Li Chen (University of Washington); Eric Z. Chen (United Imaging Intelligence America); Xiao Chen (United Imaging Intelligence America); Shanhui Sun (United Imaging Intelligence America)* |
2598 | Depth Field Networks for Generalizable Multi-view Scene Representation | Vitor Guizilini (Toyota Research Institute)*; Igor Vasiljevic (Toyota Research Institute); Jiading Fang (Toyota Technological Institute at Chicago); Rareș A Ambruș (Toyota Research Institute); Greg Shakhnarovich (Toyota Technological Institute at Chicago); Matthew Walter (Toyota Technological Institute at Chicago); Adrien Gaidon (Toyota Research Institute) |
2601 | Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation | Simone Rossetti (Sapienza University); Damiano Zappia (Deepplants S.r.l.); Marta Sanzari (Sapienza University of Rome); Marco Schaerf (Sapienza University of Rome); fiora pirri (University of Rome, Sapienza)* |
2605 | GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features | Van-Quang Nguyen (Tohoku University)*; Masanori Suganuma (Tohoku University / RIKEN AIP); Takayuki Okatani (Tohoku University/RIKEN AIP) |
2609 | Learning Semantic Correspondence with Sparse Annotations | Shuaiyi Huang (University of Maryland, College Park)*; Luyu Yang (University of Maryland, College Park); Bo He (University of Maryland); Songyang Zhang (Shanghai AI Laboratory); Xuming He (ShanghaiTech University); Abhinav Shrivastava (University of Maryland) |
2610 | A Real World Dataset for Multi-view 3D Reconstruction | Rakesh Shrestha (Simon Fraser University)*; Siqi Hu (Alibaba damo academy); Minghao Gou (Shanghai Jiao Tong University); Ziyuan Liu (Huawei group); Ping Tan (Simon Fraser University) |
2620 | Social ODE: Multi-Agent Trajectory Forecasting with Neural Ordinary Differential Equations | Song Wen (Rutgers University)*; Hao Wang (Rutgers University); Dimitris N. Metaxas (Rutgers) |
2621 | 3D Instances as 1D Kernels | Yizheng Wu (Huazhong Univ. of Sci.&Tech.); Min Shi (Huazhong University of Science and Technology); Shuaiyuan Du (Huazhong Univ. of Sci.&Tech.); Hao Lu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*; Weicai Zhong (Huawei CBG Consumer Cloud Service Big Data Platform Dept.) |
2624 | Context-Aware Streaming Perception in Dynamic Environments | Gur-Eyal Sela (UC Berkeley)*; Ionel Gog (UC Berkeley); Justin Wong (UC Berkeley); Kumar Krishna Agrawal (UC Berkeley); Xiangxi Mo (UC Berkeley); Sukrit Kalra (UC Berkeley); Peter Schafhalter (UC Berkeley); Eric Leong (UC Berkeley); Xin Wang (Microsoft Research); Bharathan Balaji (Amazon); Joseph E Gonzalez (UC Berkeley); Ion Stoica (UC Berkeley) |
2625 | PointTree: Transformation-Robust Point Cloud Encoder with Relaxed K-D Trees | Jun-Kun Chen (University of Illinois at Urbana-Champaign)*; Yu-Xiong Wang (University of Illinois at Urbana-Champaign) |
2631 | Dense Siamese Network for Dense Unsupervised Learning | Wenwei Zhang (NTU)*; Jiangmiao Pang (CUHK); Kai Chen (SenseTime Research); Chen Change Loy (Nanyang Technological University) |
2633 | Uncertainty-aware Multi-modal Learning via Cross-modal Random Network Prediction | Hu Wang (the University of Adelaide)*; Jianpeng Zhang (Northwestern Polytechnical University); Yuanhong Chen (University of Adelaide); Congbo Ma (The University of Adelaide); Jodie C Avery (University of Adelaide); Mary L Hull (University of Adelaide); Gustavo Carneiro (University of Adelaide) |
2638 | Enhanced Accuracy and Robustness via Multi-Teacher Adversarial Distillation | Shiji Zhao (Beihang University); Jie Yu (Beihang University); Zhenlong Sun (Tencent Technology Co.Ltd); Bo Zhang (WeChat Search Application Department, Tencent); Xingxing Wei (Beihang University)* |
2645 | End-to-end graph-constrained vectorized floorplan generation with panoptic refinement | Jiachen Liu (Pennsylvania State University)*; Yuan Xue (Johns Hopkins University); Jose P. Duarte (Penn State University); Krishnendra Shekhawat (BITS Pilani); Zihan Zhou (Manycore Tech Inc.); Sharon Xiaolei Huang (The Pennsylvania State University) |
2649 | Context Enhanced Stereo Transformer | weiyu Guo (University of Chinese Academy of Sciences)*; Zhaoshuo Li (Johns Hopkins University); Yongkui Yang (Shenzhen Institute of Advanced Technology,Chinese Academy of Sciences); Zheng Wang (Shenzhen Institutes of Advanced Technology); Russ Taylor (Johns Hopkins University); Mathias Unberath (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yingwei Li (Johns Hopkins University) |
2652 | NSNet: Non-saliency Suppression Sampler for Efficient Video Recognition | Boyang Xia (Institute of Computing Technology, Chinese Academy of Sciences); Wenhao Wu (Baidu)*; Haoran Wang (Baidu); RUI SU (the University of Sydney); Dongliang He (Baidu); Haosen Yang (Harbin Institute of Technology); Xiaoran Fan (Institute of Computing Technology, Chinese Academy of Sciences); Wanli Ouyang (The University of Sydney) |
2663 | Hierarchically Self-Supervised Transformer for Human Skeleton Representation Learning | Yuxiao Chen (Rutgers University)*; Long Zhao (Google Research); Jianbo Yuan (Bytedance); Yu Tian (Rutgers); zhaoyang xia (Rutgers University); Shijie Geng (Rutgers University); Ligong Han (Rutgers University); Dimitris N. Metaxas (Rutgers) |
2666 | Few-Shot Video Object Detection | Qi Fan (HKUST)*; Chi-Keung Tang (Hong Kong University of Science and Technology); Yu-Wing Tai (Kuaishou Technology / HKUST) |
2667 | Improving the Reliability for Confidence Estimation | Haoxuan Qu (Singapore University of Technology and Design)*; Yanchao Li (Singapore University of Technology and Design); Lin Geng Foo (Singapore University of Technology and Design); Jason Kuen (Adobe Research); Jiuxiang Gu (Adobe Research); Jun Liu (Singapore University of Technology and Design) |
2686 | Selective Query-guided Debiasing for Video Corpus Moment Retrieval | Sunjae Yoon (KAIST)*; Ji Woo Hong (KAIST); Eunseop Yoon (KAIST); DaHyun Kim (KAIST); Junyeong Kim (Chung-Ang University); Hee Suk Yoon (KAIST); Chang D. Yoo (KAIST) |
2701 | Posterior Refinement on Metric Matrix Improves Generalization in Metric Learning | Mingda Wang (Shanghai Jiao Tong University); Canqian Yang (Shanghai Jiao Tong University); Yi Xu (Shanghai Jiao Tong University)* |
2707 | DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation | Yilin Wen (The University of Hong Kong)*; Xiangyu Li (Brown University); Hao Pan (Microsoft Research); Lei Yang (The University of Hong Kong); Zheng Wang (SUSTech); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
2709 | Few-shot Image Generation with Mixup-based Distance Learning | Chaerin Kong (Seoul National University); Jeesoo Kim (Naver Webtoon AI); Donghoon Han (Seoul National University); Nojun Kwak (Seoul National University)* |
2715 | Data-Free Neural Architecture Search via Recursive Label Calibration | Zechun Liu (Carnegie Mellon University); Zhiqiang Shen (Carnegie Mellon University)*; Yun Long (Google); Eric Xing (MBZUAI, CMU, and Petuum Inc.); Kwang-Ting Cheng (Hong Kong University of Science and Technology); Chas H Leichner (Google) |
2717 | Distilling Object Detectors With Global Knowledge | Sanli Tang (Hikvision Research Institute); Zhongyu Zhang (Hikvision Research Institute); Zhanzhan Cheng (Zhejiang University & Hikvision Research Institute)*; Jing Lu (Hikvision Research Institute); Yunlu Xu (Hikvision Research Institute); Yi Niu (Hikvision Research Institute); Fan He (Shanghai Jiao Tong University) |
2730 | NEST: Neural Event Stack for Event-based Image Enhancement | Minggui Teng (Peking University)*; Chu Zhou (Peking University); Hanyue Lou (Peking University); Boxin Shi (Peking University) |
2732 | Multi-Granularity Distillation Scheme Towards Lightweight Semi-Supervised Semantic Segmentation | Jie Qin (School of Artificial Intelligence, University of Chinese Academy of Sciences; Institute of Automation,Chinese Academy of Sciences)*; Jie Wu (ByteDance Inc); Ming Li (Xiamen University); Xuefeng Xiao (ByteDance Inc); Min Zheng (ByteDance); Xingang Wang (Institute of Automation, CAS) |
2740 | A Style-Based GAN Encoder for High Fidelity Reconstruction of Images and Videos | Xu YAO (Telecom ParisTech)*; Alasdair Newson (Telecom Paris); Yann Gousseau (Telecom Paris); PIERRE HELLIER (Interdigital (Technicolor)) |
2746 | Unifying Visual Perception by Dispersible Points Learning | Jianming Liang (Beihang University)*; Guanglu Song (Sensetime); Biao Leng (Beihang University); Yu Liu (SenseTime Group LTD) |
2747 | Towards High-Fidelity Single-view Holistic Reconstruction of Indoor Scenes | Haolin Liu (The Chinese University of Hong Kong, Shenzhen)*; Yujian Zheng (The Chinese University of Hong Kong, Shenzhen); Guanying CHEN (The Chinese University of Hong Kong, Shenzhen); Shuguang Cui (The Chinese University of Hong Kong, Shenzhen); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)) |
2756 | Multimodal Transformer for Automatic 3D Annotation and Object Detection | Chang Liu (The University of Hong Kong)*; Xiaoyan QIAN (The University of Hong Kong); Binxiao Huang (The University of Hong Kong); Xiaojuan Qi (The University of Hong Kong); Edmund Lam (The University of Hong Kong); Siew-Chong Tan (Nil); Ngai Wong (The University of Hong Kong) |
2761 | SP-Net: Slowly Progressing Dynamic Inference Networks | Huanyu Wang (Zhejiang University)*; Wenhu Zhang (Zhejiang University); Shihao Su (Zhejiang University); Hui Wang (Zhejiang University); Zhenwei Miao (DAMO Academy, Alibaba Group); Xin Zhan (DAMO Academy, Alibaba Group); Xi Li (Zhejiang University) |
2764 | No Token Left Behind: Explainability-Aided Image Classification and Generation | Roni Paiss (Tel Aviv University, Google); Hila Chefer (Tel Aviv University)*; Lior Wolf (Tel Aviv University, Israel) |
2766 | Dynamically Transformed Instance Normalization Network for Generalizable Person Re-Identification | BingLiang Jiao (Northwestern Polytechnical University); Lingqiao Liu (University of Adelaide); Liying Gao (Northwestern Polytechnical University); Guosheng Lin (Nanyang Technological University); Lu Yang (Northwestern Polytechnical University); Shizhou Zhang (Northwestern Polytechnical University); Peng Wang (Northwestern Polytechnical University)*; Yanning Zhang (Northwestern Polytechnical University) |
2772 | Editable Indoor Lighting Estimation | Henrique Weber (Université Laval)*; Mathieu Garon (Depix); Jean-Francois Lalonde (Université Laval) |
2783 | PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection | Gang Li (Nanjing University of Science and Technology)*; Xiang Li (Nanjing University of Science and Technology); Yujie Wang (Sensetime Research); Yichao Wu (Sensetime Group Limited); Ding Liang (Sensetime Group Limited); Shanshan Zhang (Max Planck Institute for Informatics) |
2786 | CompNVS: Novel View Synthesis with Scene Completion | Zuoyue Li (ETH Zurich)*; Tianxing Fan (Zhejiang University); Zhenqiang Li (The University of Tokyo); Zhaopeng Cui (Zhejiang University); Yoichi Sato (University of Tokyo); Marc Pollefeys (ETH Zurich / Microsoft); Martin R. Oswald (ETH Zurich) |
2787 | Dynamic 3D Scene Analysis by Point Cloud Accumulation | Shengyu Huang (ETH Zürich)*; Zan Gojcic (NVIDIA); Jiahui Huang (Tsinghua University); Andreas Wieser (ETH Zürich); Konrad Schindler (ETH Zurich) |
2798 | FakeCLR: Exploring Contrastive Learning for Solving Latent Discontinuity in Data-Efficient GANs | Ziqiang Li (University of Science and Technology of China)*; Chaoyue Wang (JD.com); Heliang Zheng (JD Explore Academy, JD.com); Jing Zhang (The University of Sydney); Bin Li (University of Science and Technology of China) |
2802 | Resolving Copycat Problems in Visual Imitation Learning via Residual Action Prediction | Chia-Chi Chuang (Tsinghua University); Donglin Yang (Tsinghua University); Chuan Wen (Tsinghua University)*; Yang Gao (Tsinghua University) |
2804 | REALY: Rethinking the Evaluation of 3D Face Reconstruction | Zenghao Chai (Tsinghua University); Haoxian Zhang (Tencent); Jing Ren (ETH Zurich); Di Kang (Tencent); Zhengzhuo Xu (Tsinghua University); Xuefei Zhe (Tencent AI lab); Chun Yuan (Graduate school at ShenZhen,Tsinghua university); Linchao Bao (Tencent AI Lab)* |
2806 | TransMatting: Enhancing Transparent Objects Matting with Transformers | Huanqia Cai (University of Chinese Academy of Sciences)*; Fanglei Xue (University of Chinese Academy of Sciences); Lele Xu (Key Laboratory of Space Utilization, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences); Lili Guo (Key Laboratory of Space Utilization, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences) |
2814 | Diverse Image Inpainting with Normalizing Flow | Cairong Wang (Graduate school at Shenzhen, Tsinghua University)*; Yiming M Zhu (Graduate school at ShenZhen,Tsinghua university); Chun Yuan (Graduate school at ShenZhen,Tsinghua university) |
2818 | Video Activity Localisation with Uncertainties in Temporal Boundary | Jiabo Huang (Queen Mary University of London)*; Hailin Jin (Adobe Research); Shaogang Gong (Queen Mary University of London); Yang Liu (Peking University) |
2822 | SketchSampler: Sketch-based 3D Reconstruction via View-dependent Depth Sampling | Chenjian Gao (Beihang University); Qian Yu (Beihang University)*; Lu Sheng (Beihang University); Yi-Zhe Song (University of Surrey); Dong Xu (The University of Hong Kong) |
2829 | Exploring Resolution and Degradation Clues as Self-supervised Signal for Low Quality Object Detection | Ziteng Cui (The University of Tokyo); Yingying Zhu (University of Texas Arlington); Lin Gu (RIKEN,AIP / The University of Tokyo)*; Guo-Jun Qi (Futurewei Technologies); Xiaoxiao Li (The University of British Columbia); Renrui Zhang (Shanghai AI Lab); Zenghui Zhang (Shanghai Jiao Tong university); Tatsuya Harada (The University of Tokyo / RIKEN) |
2840 | CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation | Feng Wang (Tsinghua University)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Wei Shen (Shanghai Jiao Tong University) |
2852 | Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion | Zhengqi Gao (MIT)*; Fan-Keng Sun (MIT); Mingran Yang (MIT); Sucheng Ren (South China University of Technology); Zikai Xiong (Massachusetts Institute of Technology); Marc Engeler (Takeda); Antonio Burazer (Takeda); Linda Wildling (Takeda Pharmaceuticals International AG); Luca Daniel (Massachusetts Institute of Technology); Duane Boning (MIT) |
2856 | Robust Category-Level 6D Pose Estimation with Coarse-to-Fine Rendering of Neural Features | Wufei Ma (Purdue University)*; Angtian Wang (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Adam Kortylewski (Max Planck Institute for Informatics) |
2861 | A Unified Framework for Domain Adaptive Pose Estimation | Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Margrit Betke (Boston University); Kate Saenko (Boston University) |
2862 | A Broad Study of Pre-training for Domain Generalization and Adaptation | Donghyun Kim (MIT-IBM Watson AI Lab)*; Kaihong Wang (Boston University); Stan Sclaroff (Boston University); Kate Saenko (Boston University) |
2863 | BlobGAN: Spatially Disentangled Scene Representations | Dave Epstein (UC Berkeley)*; Taesung Park (Adobe Research); Richard Zhang (Adobe); Eli Shechtman (Adobe Research, US); Alexei A Efros (UC Berkeley) |
2864 | LGV: Boosting Adversarial Example Transferability from Large Geometric Vicinity | Martin Gubri (University of Luxembourg)*; Maxime Cordy (University of Luxembourg); Mike Papadakis (University of Luxembourg); Yves Le Traon (University of Luxembourg); Koushik Sen (University of California, Berkeley) |
2871 | LocalBins: Improving Depth Estimation by Learning Local Distributions | Shariq F Bhat (KAUST)*; Ibraheem Alhashim (National Center for Artificial Intelligence (NCAI), Saudi Data and Artificial Intelligence Authority (SDAIA), Riyadh, Kingdom of Saudi Arabia); Peter Wonka (KAUST) |
2872 | Prior Knowledge Guided Unsupervised Domain Adaptation | Tao Sun (Stony Brook University)*; Cheng Lu (Xiaopeng); Haibin Ling (Stony Brook University) |
2877 | Fast Two-step Blind Optical Aberration Correction | Thomas Eboli (ENS Paris-Saclay)*; Jean-Michel Morel (Centre Borelli ENS Paris-Saclay); Gabriele Facciolo (ENS Paris – Saclay) |
2887 | Controllable and Guided Face Synthesis for Unconstrained Face Recognition | Feng Liu (Michigan State University)*; Minchul Kim (Michigan State University); Anil Jain (Michigan State University); Xiaoming Liu (Michigan State University) |
2888 | 2D GANs Meet Unsupervised Single-view 3D Reconstruction | Feng Liu (Michigan State University)*; Xiaoming Liu (Michigan State University) |
2891 | Seeing Far in the Dark with Patterned Flash | Zhanghao Sun (Stanford University)*; Jian Wang (Snap); Yicheng Wu (Snap Inc.); Shree Nayar (Snap) |
2900 | Unified Implicit Neural Stylization | Zhiwen Fan (University of Texas at Austin)*; Yifan Jiang (University of Texas at Austin); Peihao Wang (University of Texas at Austin); Xinyu Gong (University of Texas at Austin); Dejia Xu (University of Texas at Austin); Zhangyang Wang (University of Texas at Austin) |
2901 | Improved Masked Image Generation with Token-Critic | Jose Lezama (Google Research)*; Huiwen Chang (Google); Lu Jiang (Google Research); Irfan Essa (Google) |
2902 | UNIF: United Neural Implicit Functions for Clothed Human Reconstruction and Animation | Shenhan Qian (ShanghaiTech University)*; Jiale Xu (ShanghaiTech University); Ziwei Liu (Nanyang Technological University); Liqian Ma (ZMO AI); Shenghua Gao (Shanghaitech University) |
2903 | PseudoClick: Interactive Image Segmentation with Click Imitation | Qin Liu (UNC)*; Meng Zheng (United Imaging Intelligence); Benjamin Planche (United Imaging Intelligence); Srikrishna Karanam (Adobe Research); Terrence Chen (United Imaging Intelligence); Marc Niethammer (UNC); Ziyan Wu (United Imaging Intelligence) |
2904 | CoSCL: Cooperation of Small Continual Learners is Stronger than a Big One | Liyuan Wang (Tsinghua University)*; Xingxing Zhang (Tsinghua University); Qian Li (Tsinghua University); Jun Zhu (Tsinghua University); Yi Zhong (Tsinghua University) |
2909 | Scalable Learning to Optimize: A Learned Optimizer Can Train Big Models | Xuxi Chen (University of Texas at Austin)*; Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Weizhu Chen (Microsoft); Ahmed Awadallah (Microsoft); Zhangyang Wang (University of Texas at Austin) |
2921 | PRIF: Primary Ray-based Implicit Function | Brandon Yushan Feng (University of Maryland, College Park)*; Yinda Zhang (Google); Danhang Tang (Google); Ruofei Du (Google); Amitabh Varshney (University of Maryland) |
2925 | From Face to Natural Image: Learning Real Degradation for Blind Image Super-Resolution | Xiaoming Li (Harbin Institute of Technology); Chaofeng Chen (Nanyang Technological University); Xianhui Lin (Alibaba Group); Wangmeng Zuo (Harbin Institute of Technology, China)*; Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
2936 | QISTA-ImageNet: A Deep Compressive Image Sensing Framework Solving Lq-Norm Optimization Problem | Gang-Xuan Lin (Academia Sinica); Shih-Wei Hu (National Taiwan University); Chun-Shien Lu (Academia Sinica)* |
2943 | Trust, but Verify: Using Self-Supervised Probing to Improve Trustworthiness | Ailin Deng (National University of Singapore)*; Shen Li (National University of Singapore); Miao Xiong (National University of Singapore); Zhirui Chen (National University of Singapore); Bryan Hooi (National University of Singapore) |
2948 | Spatial and Visual Perspective-Taking via View Rotation and Relation Reasoning for Embodied Reference Understanding | Cheng Shi (ShanghaiTech University); Sibei Yang (ShanghaiTech University)* |
2953 | Med-DANet: Dynamic Architecture Network for Efficient Medical Volumetric Segmentation | Wenxuan Wang (University of Science and Technology Beijing)*; Chen Chen (University of Central Florida); Jing Wang (University of Science and Technology Beijing); Sen Zha (University of Science and Technology Beijing); Yan Zhang (University of Science and Technology Beijing); Jiangyun Li (University of Science and Technology Beijing) |
3005 | Worst Case Matters for Few-Shot Recognition | Minghao Fu (Nanjing University); Yunhao Cao (Nanjing University); Jianxin Wu (Nanjing University)* |
3017 | Self-Filtering: A Noise-Aware Sample Selection for Label Noise with Confidence Penalization | Qi Wei (Shandong University)*; Haoliang Sun (Shandong University); Xiankai Lu (Shandong University); Yilong Yin (Shandong University) |
3035 | Point Cloud Domain Adaptation via Masked Local 3D Structure Prediction | Hanxue Liang (University of Texas at Austin)*; Hehe Fan (NUS); Zhiwen Fan (University of Texas at Austin); Yi Wang (University of Texas at Austin); Tianlong Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Zhangyang Wang (University of Texas at Austin) |
3041 | Translation, Scale and Rotation: Cross-Modal Alignment Meets RGB-Infrared Vehicle Detection | Maoxun Yuan (Beihang University); Yinyan Wang (Beihang University); Xingxing Wei (Beihang University)* |
3043 | Simple Baselines for Image Restoration | Liangyu Chen (Megvii Technology)*; Xiaojie Chu (Megvii Technology); Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology) |
3058 | RDA: Reciprocal Distribution Alignment for Robust Semi-supervised Learning | Yue Duan (Nanjing University)*; Lei Qi (Southeast University); Lei Wang (University of Wollongong, Australia); Luping Zhou (University of Sydney); Yinghuan Shi (Nanjing University) |
3060 | Exploring Hierarchical Graph Representation for Large-Scale Zero-Shot Image Classification | Kai Yi (King Abdullah University of Science and Technology)*; Xiaoqian Shen (King Abdullah University of Science and Technology); Yunhao Gou (Hong Kong University of Science and Technology); Mohamed Elhoseiny (KAUST) |
3080 | Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation | Zhitong Xiong (Technical University of Munich)*; Haopeng Li (The University of Melbourne); Xiaoxiang Zhu (Technical University of Munich (TUM); German Aerospace Center (DLR)) |
3093 | MemSAC: Memory Augmented Sample Consistency for Large Scale Domain Adaptation | Tarun Kalluri (UC San Diego)*; Astuti Sharma (UCSD); Manmohan Chandraker (UC San Diego) |
3094 | GCISG: Guided Causal Invariant Learning for Improved Syn-to-real Generalization | Gilhyun Nam (Agency for Defense Development)*; Gyeongjae Choi (Agency for Defense Development); Kyungmin Lee (Agency for Defense Development) |
3101 | Temporal Saliency Query Network for Efficient Video Recognition | Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Zhihao Wang (Institute of Computing Technology, Chinese Academy of Sciences); Wenhao Wu (Baidu)*; Haoran Wang (Baidu); Jungong Han (Aberystwyth University) |
3116 | Towards Interpretable Video Super-Resolution via Alternating Optimization | Jiezhang Cao (ETH Zürich)*; Jingyun Liang (ETH Zurich); Kai Zhang (ETH Zurich); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Qin Wang (ETH Zurich); Yulun Zhang (ETH Zurich); Hao Tang (ETH Zurich); Luc Van Gool (ETH Zurich) |
3118 | R-DFCIL: Relation-Guided Representation Learning for Data-Free Class Incremental Learning | Qiankun Gao (Peking University Shenzhen Graduate School)*; Chen Zhao (KAUST); Bernard Ghanem (KAUST); Jian Zhang (Peking University Shenzhen Graduate School) |
3125 | Spike Transformer: Monocular Depth Estimation for Spiking Camera | Jiyuan Zhang (Peking University)*; Lulu Tang (Tsinghua University); Zhaofei Yu (Peking University); Jiwen Lu (Tsinghua University); Tiejun Huang (Peking University) |
3127 | Towards Robust Face Recognition with Comprehensive Search | Manyuan Zhang (Sensetime)*; Guanglu Song (Sensetime); Yu Liu (SenseTime Group LTD); Hongsheng Li (The Chinese University of Hong Kong) |
3129 | Improving Image Restoration by Revisiting Global Information Aggregation | Xiaojie Chu (Megvii Technology)*; Liangyu Chen (Megvii Technology); Chengpeng Chen (Megvii); Xin Lu (Megvii Technology) |
3132 | Learning Pedestrian Group Representations for Multi-modal Trajectory Prediction | Inhwan Bae (Gwangju Institute of Science and Technology)*; Jin-Hwi Park (GIST); Hae-Gon Jeon (GIST) |
3138 | RFLA: Gaussian Receptive Field based Label Assignment for Tiny Object Detection | Chang Xu (Wuhan University); Jinwang Wang (Huawei Technology); Wen Yang (Wuhan University)*; Huai Yu (Wuhan University); Lei Yu (Wuhan University); Gui-Song Xia (Wuhan University) |
3139 | Semi-supervised Single-view 3D Reconstruction via Prototype Shape Priors | Zhen Xing (Fudan University)*; Hengduo Li (University of Maryland, College Park); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University) |
3145 | Sequential Multi-View Fusion Network for Fast LiDAR Point Motion Estimation | Gang Zhang (Damo Academy, Alibaba Group)*; Xiaoyan Li (Beijing University of Technology); Zhenhua Wang (DAMO Academy, Alibaba Group) |
3147 | A Large-scale Multiple-objective Method for Black-box Attack against Object Detection | Siyuan Liang (Chinese Academy of Sciences); Longkang Li (Mohamed bin Zayed University of Artificial Intelligence); Yanbo Fan (Tencent AI Lab); Xiaojun Jia (Institute of Information Engineering,Chinese Academy of Sciences); Jingzhi Li (Institute of information engineering, CAS); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen)*; Xiaochun Cao (Sun Yat-sen University) |
3150 | GradAuto: Energy-oriented Attack on Dynamic Neural Networks | Jianhong Pan (Singapore University of Technology and Design)*; Qichen Zheng (Singapore University of Technology and Design); Zhipeng Fan (NYU TANDON SCHOOL OF ENGINEERING); Hossein Rahmani (Lancaster University); Qiuhong Ke (Monash University); Jun Liu (Singapore University of Technology and Design) |
3151 | Semantic-guided Multi-Mask Image Harmonization | Xuqian Ren (Watrix Technology); Yifan Liu (University of Adelaide)* |
3155 | Manifold Adversarial Learning for Cross-domain 3D Shape Representation | Hao Huang (New York University); Cheng Chen (New York University); Yi Fang (New York University)* |
3167 | GAN with Multivariate Disentangling for Controllable Hair Editing | Xuyang Guo (Institute of Computing Technology, Chinese Academy of Sciences); Meina Kan (Institute of Computing Technology, Chinese Academy of Sciences); Tianle Chen (Institute of Computing Technology, Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences)* |
3169 | Fast-MoCo: Boost Momentum-based Contrastive Learning with Combinatorial Patches | Yuanzheng Ci (The University of Sydney)*; Chen Lin (University of Oxford); Lei Bai (Shanghai AI Laboratory); Wanli Ouyang (The University of Sydney) |
3179 | Dense Cross-Query-and-Support Attention Weighted Mask Aggregation for Few-Shot Segmentation | Xinyu Shi (School of Computer Science and Engineering, Southeast University); DONG WEI (Tencent Jarvis Lab)*; Yu Zhang (Southeast University); Donghuan Lu (Tencent); Munan Ning (Tencent); Jiashun Chen (School of Computer Science and Engineering, Southeast University); Kai Ma (Tencent); Yefeng Zheng (Tencent) |
3180 | Acknowledging the Unknown for Multi-label Learning with Single Positive Labels | Donghao Zhou (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences)*; Pengfei Chen (The Chinese University of Hong Kong); Qiong Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Guangyong Chen (Shenzhen Institutes of Advanced Technology); Pheng-Ann Heng (The Chinese University of Hong Kong) |
3200 | LoRD: Local 4D Implicit Representation for High-Fidelity Dynamic Human Modeling | Boyan Jiang (Fudan University)*; Xinlin Ren (Fudan University); Mingsong Dou (Google Inc.); Xiangyang Xue (Fudan University); Yanwei Fu (Fudan University); Yinda Zhang (Google) |
3202 | Bilateral Normal Integration | Xu Cao (Osaka University)*; Hiroaki Santo (Osaka University); Boxin Shi (Peking University); Fumio Okura (Osaka University); Yasuyuki Matsushita (Osaka University) |
3203 | Harmonizer: Learning to Perform White-Box Image and Video Harmonization | Zhanghan Ke (City University of Hong Kong)*; Chunyi Sun (Australian National University); Lei ZHU (City University of Hong Kong); Ke Xu (City University of Hong Kong); Rynson W.H. Lau (City University of Hong Kong) |
3213 | On the Versatile Uses of Partial Distance Correlation in Deep Learning | Xingjian Zhen (University of Wisconsin-Madison)*; Zihang Meng (University of Wisconsin Madison); Rudrasis Chakraborty (Butlr); Vikas Singh (University of Wisconsin Madison) |
3214 | Object-Centric Unsupervised Image Captioning | Zihang Meng (University of Wisconsin Madison)*; David Yang (Facebook); Xuefei Cao (Facebook); Ashish Shah (Facebook AI); Ser-Nam Lim (Meta AI) |
3217 | Pose2Room: Understanding 3D Scenes from Human Activities | Yinyu Nie (Technical University of Munich)*; Angela Dai (Technical University of Munich); Xiaoguang Han (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)); Matthias Niessner (Technical University of Munich) |
3218 | Capturing, Reconstructing, and Simulating: the UrbanScene3D Dataset | Liqiang Lin (Shenzhen University); Yilin Liu (Shenzhen University); Yue Hu (Shenzhen University); Xingguang Yan (Shenzhen University); Ke Xie (Shenzhen University); Hui Huang (Shenzhen University)* |
3225 | A Spectral View of Randomized Smoothing under Common Corruptions: Benchmarking and Improving Certified Robustness | Jiachen Sun (University of Michigan)*; Akshay Mehra (Tulane University); Bhavya Kailkhura (Lawrence Livermore National Laboratory); Pin-Yu Chen (IBM Research); Dan Hendrycks (UC Berkeley); Jihun Hamm (Tulane University); Zhuoqing Morley Mao (University of Michigan) |
3229 | CLIP-Actor: Text-Driven Recommendation and Stylization for Animating Human Meshes | Kim Youwang (POSTECH)*; Ji-Yeon Kim (POSTECH); Tae-Hyun Oh (POSTECH) |
3240 | Interpretable Image Classification with Differentiable Prototypes Assignment | Dawid Damian Rymarczyk (Jagiellonian University)*; Łukasz Struski (Jagiellonian University); Michał Górszczak (Jagiellonian University); Koryna Lewandowska (Jagiellonian University); Jacek Tabor (Jagiellonian University); Bartosz Zieliński (Jagiellonian University) |
3247 | Efficient One-stage Video Object Detection by Exploiting Temporal Consistency | Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast) |
3250 | ConCL: Concept Contrastive Learning for Dense Prediction Pre-training in Pathology Images | Jiawei Yang (UCLA)*; Hanbo Chen (Tencent AI Lab); Yuan Liang (UCLA); Junzhou Huang (University of Texas at Arlington); Lei He (UCLA); Jianhua Yao (National Institutes of Health) |
3254 | Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation | Guodong Ding (National University of Singapore)*; Angela Yao (National University of Singapore) |
3257 | Fast and High Quality Image Denoising via Malleable Convolution | Yifan Jiang (University of Texas at Austin)*; Bartlomiej Wronski (Google Research); Ben Mildenhall (Google Research); Jonathan T Barron (Google Research); Zhangyang Wang (University of Texas at Austin); Tianfan Xue (Google) |
3265 | Data Association between Event Streams and Intensity Frames under Diverse Baselines | Dehao Zhang (Peking University)*; Qiankun Ding (Peking University); Peiqi Duan (Peking University); Chu Zhou (Peking University); Boxin Shi (Peking University) |
3287 | Self-Regulated Feature Learning via Teacher-free Feature Distillation | Lujun Li (Chinese Academy of Science)* |
3289 | TS2-Net: Token Shift and Selection Transformer for Text-Video Retrieval | Yuqi Liu (Renmin University of China)*; Pengfei Xiong (Shopee); Luhui Xu (Tencent); Cao Shengming (Tencent); Qin Jin (Renmin University of China) |
3292 | TAPE: Task-Agnostic Prior Embedding for Image Restoration | Lin Liu (University of Science and Technology of China)*; Lingxi Xie (Huawei Inc.); Xiaopeng Zhang (Noah’s Ark Lab, Huawei Inc.); Shanxin Yuan (Huawei Noah’s Ark Lab); Xiangyu Chen (University of Macau; SIAT); Wengang Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Qi Tian (Huawei Cloud & AI) |
3293 | MVSalNet: Multi-View Augmentation for RGB-D Salient Object Detection | JiaYuan Zhou (Dalian University of Technology)*; Lijun Wang (Dalian University of Technology); Huchuan Lu (Dalian University of Technology); Kaining Huang (huang kaining); Xinchu Shi (Meituan Group); Bocong Liu (Meituan) |
3295 | Rethinking IoU-based Optimization for Single-stage 3D Object Detection | Hualian Sheng (College of Information Science and Electronic Engineering, Zhejiang University; DAMO Academy, Alibaba Group)*; Sijia Cai (DAMO Academy, Alibaba Group); Na Zhao (NUS); Bing Deng (Damo Academy, Alibaba Group); Jianqiang Huang (Damo Academy, Alibaba Group); Xian-Sheng Hua (Damo Academy, Alibaba Group); Min-Jian Zhao (Zhejiang University); Gim Hee Lee (National University of Singapore) |
3298 | Uncertainty Inspired Underwater Image Enhancement | Zhenqi Fu (Xiamen University)*; Wu Wang (Xiamen University); Yue Huang (Xiamen University); Xinghao Ding (Xiamen University); Kai-Kuang Ma (Nanyang Technological University, Singapore) |
3300 | k-means Mask Transformer | Qihang Yu (Johns Hopkins University)*; Huiyu Wang (JHU); Siyuan Qiao (Google); Maxwell D Collins (Google Inc.); Yukun Zhu (Google Inc.); Hartwig Adam (Google); Alan Yuille (Johns Hopkins University); Liang-Chieh Chen (Google Inc.) |
3302 | Contrastive Vision-Language Pre-training with Limited Resources | Quan Cui (Waseda University)*; Boyan Zhou (ByteDance); Yu Guo (Fudan University); Weidong Yin (UBC); Hao Wu (Bytedance Inc.); Osamu Yoshie (Waseda University); Yubo Chen (Bytedance) |
3305 | Learning Linguistic Association Towards Efficient Text-Video Retrieval | Sheng Fang (ICT); Shuhui Wang (VIPL, ICT, Chinese Academy of Sciences)*; Junbao Zhuo (ICT CAS); Xinzhe Han (University of Chinese Academy of Sciences); Qingming Huang (University of Chinese Academy of Sciences) |
3308 | United Defocus Blur Detection and Deblurring via Adversarial Promoting Learning | Wenda Zhao (Dalian University of Technology)*; Fei Wei (Dalian University of Technology); You He (Naval Aviation University); Huchuan Lu (Dalian University of Technology) |
3314 | Unstructured Feature Decoupling for Vehicle Re-Identification | Wen Qian (Institute of Automation, Chinese Academy of Sciences)*; Hao Luo (Alibaba Group); Silong Peng (Chinese Academy of Sciences); Fan Wang (Alibaba Group); Chen Chen (Chinese Academy of Sciences); Hao Li (Alibaba Group) |
3322 | Improving Adversarial Robustness of 3D Point Cloud Classification Models | Guanlin Li (Nanyang Technological University)*; Guowen Xu (Nanyang Technological University); Han Qiu (Tsinghua University); Ruan HE (Tencent); Jiwei Li (Shannon.AI); Tianwei Zhang (Nanyang Technological University) |
3324 | ASSISTER: Assistive Navigation via Conditional Instruction Generation | Zanming Huang (Boston University); Zhongkai Shangguan (Boston University); Jimuyang Zhang (Boston University); Gilad Bar (Rutgers University – Camden); Matthew Boyd (Boston University); Eshed Ohn-Bar (Boston University)* |
3342 | Deep Hash Distillation for Image Retrieval | Young Kyun Jang (Seoul National University)*; Geonmo Gu (NAVER corp); Byungsoo Ko (NAVER/LINE Corp.); Isaac Kang (Seoul National University); Nam Ik Cho (Seoul National University) |
3345 | Learning Spatial-Preserved Skeleton Representations for Few-Shot Action Recognition | Ning Ma (Zhejiang University)*; Hongyi Zhang (Zhejiang University); Xuhui Li (Zhejiang University); Sheng Zhou (Zhejiang University); Zhen Zhang (National University of Singapore); Jun Wen (Harvard University); Haifeng Li (Zhejiang University); Jingjun Gu (Zhejiang University); Jiajun Bu (Zhejiang University) |
3346 | Digging into Radiance Grid for Real-Time View Synthesis with Detail Preservation | Jian Zhang (Alibaba Group); Jinchi Huang (Alibaba Group); Bowen Cai (Alibaba Group); Huan Fu (Alibaba Group)*; Mingming Gong (University of Melbourne); Chaohui Wang (Laboratoire d’Informatique Gaspard Monge, Université Paris-Est); Jiaming Wang (Alibaba Group); Hongchen Luo (Alibaba Group); Rongfei Jia (Alibaba Group); Binqiang Zhao (Alibaba); Xing Tang (Alibaba Group) |
3351 | S^2Contact: Graph-based Network for 3D Hand-Object Contact Estimation with Semi-Supervised Learning | Tze Ho Elden Tse (University of Birmingham)*; Zhongqun Zhang (University of Birmingham); Kwang In Kim (UNIST); Ales Leonardis (University of Birmingham); Feng Zheng (SUSTech); Hyung Jin Chang (University of Birmingham) |
3359 | TD-Road: Top-Down Road Network Extraction with Holistic Graph Construction | Yang He (Amazon)*; Ravi Garg (Amazon com services inc); Amber Roy Chowdhury (Amazon) |
3366 | StyleGAN-Human: A Data-Centric Odyssey of Human Generation | Jianglin Fu (SenseTime)*; Shikai Li (SenseTime Research); Yuming Jiang (Nanyang Technological University); Kwan-Yee Lin (SenseTime Research); Chen Qian (SenseTime); Chen Change Loy (Nanyang Technological University); Wayne Wu (SenseTime Research); Ziwei Liu (Nanyang Technological University) |
3369 | Hourglass Attention Network for Image Inpainting | Ye Deng (Xi’an Jiaotong University)*; Siqi Hui (Xi’an Jiaotong University); Rongye Meng (IAIR, Xi’an Jiaotong University); Sanping Zhou (Xi’an Jiaotong University); Jinjun Wang (Xi’an Jiaotong University) |
3370 | MaxViT: Multi-Axis Vision Transformer | Zhengzhong Tu (University of Texas at Austin)*; Hossein Talebi (Google); Han Zhang (Google); Feng Yang (Google Research); Peyman Milanfar (Google); Alan Bovik (University of Texas at Austin); Yinxiao Li (Google) |
3378 | Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images | Yuan Liu (The University of Hong Kong)*; Yilin Wen (The University of Hong Kong); Sida Peng (Zhejiang University); Cheng Lin (Tencent); Xiaoxiao Long (The University of Hong Kong); Taku Komura (The University of Hong Kong); Wenping Wang (The University of Hong Kong) |
3385 | ColorFormer: Image Colorization via Color Memory assisted Hybrid-attention Transformer | Xiaozhong Ji (Tencent)*; Boyuan Jiang (Tencent Youtu Lab); Donghao Luo (Tencent); Guangpin Tao (Nanjing University); Wenqing Chu (Tencent); Zhifeng Xie (Shanghai University); Chengjie Wang (Tencent; Shanghai Jiao Tong University); Ying Tai (Tencent YouTu) |
3387 | Spotting Temporally Precise, Fine-Grained Events in Video | James Hong (Stanford University)*; Haotian Zhang (Stanford University); Michaël Gharbi (Adobe Research); Matthew Fisher (Adobe Research); Kayvon Fatahalian (Stanford) |
3390 | SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness | Jindong Gu (University of Munich)*; Hengshuang Zhao (University of Oxford); Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich); Philip Torr (University of Oxford) |
3391 | Adversarial Erasing Framework via Triplet with Gated Pyramid Pooling Layer for Weakly Supervised Semantic Segmentation | Sung-Hoon Yoon (KAIST)*; Hyeokjun Kweon (KAIST); Jegyeong Cho (KAIST); Shinjeong Kim (KAIST); Kuk-Jin Yoon (KAIST) |
3393 | Semi-Supervised Vision Transformers | Zejia Weng (Fudan University)*; Xitong Yang (University of Maryland); Ang Li (Google DeepMind); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University) |
3394 | Learning an Isometric Surface Parameterization for Texture Unwrapping | Sagnik Das (Stony Brook University)*; Ke Ma (Stony Brook University); Zhixin Shu (Adobe Research); Dimitris Samaras (Stony Brook University) |
3409 | Mimic Embedding via Adaptive Aggregation: Learning Generalizable Person Re-identification | BOQIANG XU (University of Chinese Academy of Sciences; Institute of Automation, Chinese Academy of Sciences)*; Jian Liang (CASIA); He Lingxiao (nlpr,cripac); Zhenan Sun (Chinese Academy of Sciences) |
3418 | CryoAI: Amortized Inference of Poses for Ab Initio Reconstruction of 3D Molecular Volumes from Real Cryo-EM Images | Axel Levy (Stanford University); Frederic Poitevin (SLAC National Accelerator Laboratory); Julien N. P. Martel (Stanford University); Youssef Nashed (SLAC National Accelerator Laboratory); Ariana Peck (SLAC National Accelerator Laboratory); Nina Miolane (UCSB); Daniel Ratner (Stanford University); Mike Dunne (SLAC National Accelerator Laboratory); Gordon Wetzstein (Stanford University)* |
3419 | EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs | Guohao Ying (University of Southern California); Xin He (Hong Kong Baptist University); Bin Gao (National University of Singapore); Bo Han (HKBU / RIKEN); Xiaowen Chu (Hong Kong University of Science and Technology)* |
3428 | ScalableViT: Rethinking the Context-oriented Generalization of Vision Transformer | Rui Yang (Tsinghua University)*; Hailong Ma (ByteDance Inc); Jie Wu (ByteDance Inc); Yansong Tang (Tsinghua University); Xuefeng Xiao (ByteDance Inc); Min Zheng (ByteDance); Xiu Li (Tsinghua University) |
3429 | PlaneFormers: From Sparse View Planes to 3D Reconstruction | Samir Agarwala (University of Michigan)*; Linyi Jin (University of Michigan); Chris Rockwell (University of Michigan); David Fouhey (University of Michigan) |
3438 | Domain Adaptive Video Segmentation via Temporal Pseudo Supervision | Yun Xing (Nanyang Technological University); Dayan Guan (Mohamed bin Zayed University of Artificial Intelligence); Jiaxing Huang (Nanyang Technological University); Shijian Lu (Nanyang Technological University)* |
3442 | Diverse Learner: Exploring Diverse Supervision for Semi-supervised Object Detection | Linfeng Li (Baidu)*; Minyue Jiang (Baidu Inc.); Yue Yu (Baidu.Inc.); Wei Zhang (Baidu Inc); Xiangru Lin (Baidu Inc.); Yingying Li (Baidu); Xiao Tan (Baidu Inc.); Jingdong Wang (Baidu); Errui Ding (Baidu Inc.) |
3452 | Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge for Human Motion Prediction | Xiaoning Sun (Nanjing University of Science and Technology)*; Qiongjie Cui (Nanjing University of Science and Technology); Huaijiang Sun (Nanjing University of Science and Technology); Bin Li (Tianjin AiForward Science and Technology); Weiqing Li (Nanjing University of Science and Technology); Jianfeng Lu (Nanjing University of Science and Technology) |
3455 | Towards Hard-Positive Query Mining for DETR-based Human-Object Interaction Detection | Xubin Zhong (South China University of Technology); Changxing Ding (South China University of Technology)*; Zijian Li (South China University of Technology); Shaoli Huang (Tencent AI-Lab) |
3458 | Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number | Xian Wei (East China Normal University); Yangyu Xu (Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Yanhui Huang (Fuzhou University); Hairong Lv (Tsinghua University); Hai Lan (Fujian Institute of Research on the Structure of Matter, Chinese Academy of Sciences); Mingsong Chen (East China Normal University); XUAN TANG (East China Normal University)* |
3470 | Structural Triangulation: A Closed-Form Solution to Constrained 3D Human Pose Estimation | Zhuo Chen (Shanghai Jiao Tong University)*; Xu Zhao (Shanghai Jiao Tong University); Xiaoyue Wan (Shanghai Jiao Tong University) |
3474 | Latency-Aware Collaborative Perception | Zixing Lei (Shanghai Jiao Tong University)*; Shunli Ren (Shanghai Jiao Tong University); Yue Hu (Shanghai Jiao Tong University); Wenjun Zhang (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University) |
3475 | Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection | Xin Li (East China Normal University)*; Botian Shi (Shanghai AI Lab); Yuenan HOU (Shanghai AI Lab); Xingjiao Wu ( East China Normal University); Tianlong Ma (East China Normal University); Yikang Li (Shanghai AI Lab); Liang He (ECNU) |
3484 | Unfolded Deep Kernel Estimation for Blind Image Super-resolution | Hongyi Zheng (The Hong Kong Polytechnic University); Hongwei Yong (The Hong Kong Polytechnic University); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China)* |
3487 | Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning | Xingping Dong (Inception Institute of Artificial Intelligence)*; Jianbing Shen (Inception Institute of Artificial Intelligence); Ling Shao (Terminus Group) |
3489 | Continual Semantic Segmentation via Structure Preserving and Projected Feature Alignment | Zihan Lin (University of Science and Technology of China); Zilei Wang (University of Science and Technology of China)*; Yixin Zhang (University of Science and Technology of China) |
3498 | SC-wLS: Towards Interpretable Feed-forward Camera Re-localization | Xin Wu (Peking University)*; Hao Zhao (Intel Labs China); Shunkai Li (Peking University); Yingdian Cao (Peking University); Hongbin Zha (Peking University, China) |
3500 | Weakly-Supervised Stitching Network for Real-World Panoramic Image Generation | Dae-Young Song (Chungnam National University); Geonsoo Lee (Chungnam National University); HeeKyung Lee (ETRI (Electronics and Telecommunications Research Institute)); Gi-Mun Um (ETRI (Electronics and Telecommunications Research Institute)); Donghyeon Cho (Chungnam National University)* |
3503 | FloatingFusion: Depth from ToF and Image-stabilized Stereo Cameras | Andreas Meuleman (KAIST); Hakyeong Kim (KAIST); James Tompkin (Brown University); Min H. Kim (KAIST)* |
3504 | Dual-Evidential Learning for Weakly-supervised Temporal Action Localization | Mengyuan Chen (Institute of Automation, Chinese Academy of Sciences)*; Junyu Gao (CASIA); Shicai Yang (Hikvision Research Institute); Changsheng Xu (CASIA) |
3511 | DynaST: Dynamic Sparse Transformer for Exemplar-Guided Image Generation | Songhua Liu (National University of Singapore)*; Jingwen Ye (National University of Singapore); Sucheng Ren (South China University of Technology); Xinchao Wang (National University of Singapore) |
3512 | D2HNet: Joint Denoising and Deblurring with Hierarchical Network for Robust Night Image Restoration | Yuzhi Zhao (City University of Hong Kong)*; Yongzhe Xu (SenseTime Group Limited); Qiong Yan (SenseTime Group Limited); DINGDONG YANG (University of Michigan); Xuehui Wang (Shanghai Jiao Tong University); Lai-Man Po (CITY UNIVERSITY OF HONG KONG) |
3514 | DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image | Yijin Li (Zhejiang University); Yinda Zhang (Google); Xinyang Liu (Zhejiang University); Wenqi Dong (Zhejiang University); Han Zhou (Zhejiang University); Hujun Bao (Zhejiang University); Guofeng Zhang (Zhejiang University); Zhaopeng Cui (Zhejiang University)* |
3515 | ERA: Enhanced Rational Activations | Martin Trimmel (Lund University)*; Mihai Zanfir (Google); Richard I Hartley (Google); Cristian Sminchisescu (Google) |
3518 | FrequencyLowCut pooling – Plug & Play against Catastrophic Overfitting | Julia Grabinski (University of Siegen)*; Janis Keuper (Fraunhofer); Margret Keuper (University of Mannheim); Steffen Jung (MPII) |
3520 | Interclass Prototype Relation for Few-Shot Segmentation | Atsuro Okazawa (SoftBank Corp.)* |
3523 | Multi-Faceted Distillation of Base-Novel Commonality for Few-shot Object Detection | Shuang Wu (Harbin Institute of Technology, Shenzhen); Wenjie Pei (Harbin Institute of Technology, Shenzhen); Dianwen Mei (Harbin Institute of Technology, Shenzhen); Fanglin Chen (Harbin Institute of Technology, Shenzhen); Jiandong Tian (CAS); Guangming Lu ( Harbin Institute of Technology, Shenzhen)* |
3525 | X-DETR: A Versatile Architecture for Instance-wise Vision-Language Tasks | Zhaowei Cai (Amazon)*; Gukyeong Kwon (Amazon); Avinash Ravichandran (Amazon); Erhan Bas (Amazon); Zhuowen Tu (UC San Diego); Rahul Bhotika (Amazon); Stefano Soatto (UCLA) |
3535 | Equivariance and Invariance Inductive Bias for Learning from Insufficient Data | Tan Wang (Nanyang Technological University)*; Qianru Sun (Singapore Management University); Sugiri Pranata (Panasonic R&D Center Singapore); Karlekar Jayashree (Panasonic); Hanwang Zhang (Nanyang Technological University) |
3539 | Multimodal Conditional Image Synthesis with Product-of-Experts GANs | Xun Huang (NVIDIA)*; Arun Mallya (NVIDIA); Ting-Chun Wang (NVIDIA); Ming-Yu Liu (NVIDIA) |
3551 | Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning | Mingfu Liang (Northwestern University)*; JIAHUAN ZHOU (Peking University); Wei Wei (Northwestern University); Ying Wu (Northwestern University) |
3555 | TensoRF: Tensorial Radiance Fields | Anpei Chen (ShanghaiTech University)*; Zexiang Xu (Adobe Research); Andreas Geiger (University of Tuebingen); Jingyi Yu (Shanghai Tech University); Hao Su (UCSD) |
3580 | PointCLM: A Contrastive Learning-based Framework for Multi-instance Point Cloud Registration | Mingzhi Yuan (Fudan University)*; Zhihao Li (Fudan); Qiuye Jin (Fudan University); Xinrong Chen (Fudan University); Manning Wang (Fudan University) |
3581 | Slim Scissors: Segmenting Thin Object from Synthetic Background | Kunyang Han (Beijing Jiaotong University)*; Jun Hao Liew (ByteDance); Jiashi Feng (ByteDance); Huawei Tian (People’s Public Security University of China); Yao Zhao (Beijing Jiaotong University); Yunchao Wei (UTS) |
3591 | CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition | Shreyank N Gowda (University of Edinburgh)*; Laura Sevilla-Lara (Facebook); Frank Keller (University of Edinburgh); Marcus Rohrbach (Facebook AI Research) |
3593 | Discovering Human-Object Interaction Concepts via Self-Compositional Learning | Zhi Hou (The University of Sydney)*; Baosheng Yu (The University of Sydney); Dacheng Tao (The University of Sydney) |
3598 | Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance | Chen Tang (Tsinghua University)*; Kai Ouyang (Tsinghua University); Zhi Wang (Tsinghua University); Yifei Zhu (Shanghai Jiao Tong University); Wen Ji (Institute of Computing Technology, Chinese Academy of Sciences); Yaowei Wang (PengCheng Laboratory); Wenwu Zhu (Tsinghua University) |
3604 | TREND: Truncated Generalized Normal Density Estimation of Inception Embeddings for GAN Evaluation | Junghyuk Lee (School of Integrated Technology, Yonsei University); Jong-Seok Lee (Yonsei University, Korea)* |
3606 | 3D Room Layout Estimation from a Cubemap of Panorama Image via Deep Manhattan Hough Transform | Yining Zhao (Tsinghua University); Chao Wen (Bytedance); Zhou Xue (Bytedance); Yue Gao (Tsinghua University)* |
3623 | JoJoGAN: One Shot Face Stylization | Min Jin Chong (University of Illinois at Urbana-Champaign)*; David Forsyth (University of Illinois at Urbana-Champaign) |
3627 | Convolutional Embedding Makes Hierarchical Vision Transformer Stronger | Cong Wang (OPPO); Hongmin Xu (OPPO)*; Xiong Zhang (Neolix Autonomous Vehicle); Li Wang (North China University of Technology ); Zhitong Zheng (OPPO); Haifeng Liu (OPPO) |
3632 | Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration | Haotian Bai (The Chinese University of Hong Kong, Shenzhen); Ruimao Zhang (The Chinese University of Hong Kong, Shenzhen)*; Jiong WANG (The Chinese University of Hong Kong, Shenzhen); Xiang Wan (Shenzhen Research Institute of Big Data, the Chinese University of Hong Kong (Shenzhen)) |
3641 | Few-shot Class-incremental Learning for 3D Point Cloud Objects | Townim Faisal Chowdhury (North South University); Ali Cheraghian (Australian National University (ANU)); Sameera Chandimal Ramasinghe (Australian National University); Sahar Ahmadi (University of Technology Sydney); Morteza Saberi (University of Technology, Sydney); Shafin Rahman (North South University)* |
3643 | Learning Graph Neural Networks for Image Style Transfer | Yongcheng Jing (The University of Sydney); Yining Mao (Zhejiang University); Yiding Yang (Wormpex AI Research); Yibing Zhan (JD Explore Academy); Mingli Song (Zhejiang University); Xinchao Wang (National University of Singapore)*; Dacheng Tao (JD.com) |
3644 | JPerceiver: Joint Perception Network for Depth, Pose and Layout Estimation in Driving Scenes | Haimei Zhao (The University of Sydney)*; Jing Zhang (The University of Sydney); Sen Zhang (The University of Sydney); Dacheng Tao (JD.com) |
3645 | Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions | Zhenyi Wang (University at Buffalo)*; Li Shen (JD Explore Academy); Le Fang (University at Buffalo); Qiuling Suo (State University of New York at Buffalo); Donglin Zhan (Columbia University); Tiehang Duan (Facebook); Mingchen Gao (University at Buffalo, SUNY) |
3655 | Semi-supervised 3D Object Detection with Proficient Teachers | Junbo Yin (Beijing Institute of Technology); Jin Fang (Baidu ); Dingfu Zhou (Baidu); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Liangjun Zhang (baidu); Cheng-Zhong Xu (University of Macau); Jianbing Shen (Inception Institute of Artificial Intelligence)* |
3658 | NeFSAC: Neurally Filtered Minimal Samples | Luca Cavalli (ETH Zurich)*; Marc Pollefeys (ETH Zurich / Microsoft); Daniel Barath (ETH Zürich) |
3660 | Domain Generalization by Mutual-Information Regularization with Pre-trained Models | Junbum Cha (Kakaobrain)*; Kyungjae Lee (Chung-Ang University); Sungrae Park (Upstage AI Research, Upstage AI); Sanghyuk Chun (NAVER AI Lab) |
3661 | AcroFOD: An Adaptive Method for Cross-domain Few-shot Object Detection | Yipeng Gao (Sun Yat-sen University, China); Lingxiao YANG (Sun-Yat Sen University); Yunmu Huang (Huawei Technologies Co., Ltd.); Song Xie (Huawei Technologies Co., Ltd.); Shiyong Li ( AI Application Research Center, Huawei Technologies Co., Ltd); WEI-SHI ZHENG (Sun Yat-sen University, China)* |
3665 | Primitive-based Shape Abstraction via Nonparametric Bayesian Inference | Yuwei Wu (National University of Singapore)*; Weixiao Liu (National University of Singapore); Sipu Ruan (National University of Singapore); Gregory S Chirikjian (National University of Singapore) |
3670 | Active label correction using robust parameter update and entropy propagation | Kwang In Kim (UNIST)* |
3671 | E-Graph: Minimal Solution for Rigid Rotation with Extensibility Graphs | Yanyan Li (TUM)*; Federico Tombari (Google, TU Munich) |
3672 | Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation | Nadine Behrmann (Bosch Center for Artificial Intelligence)*; S. Alireza Golestaneh (Google); Zico Kolter (Carnegie Mellon University); Jürgen Gall (University of Bonn); Mehdi Noroozi (Bosch Gmb) |
3677 | Counterfactual Intervention Feature Transfer for Visible-Infrared Person Re-identification | Xulin Li (University of Science and Technology of China); Yan Lu (University of Sydney); Bin Liu (University of Science and Technology of China)*; Yating Liu (USTC); Guojun Yin (University of Science and Technology of China); Qi Chu (University of Science and Technology of China); Jinyang Huang (University Of Science And Technology Of China); Feng Zhu (University of Science and Technology of China); Rui Zhao (SenseTime Group Limited); Nenghai Yu (University of Science and Technology of China) |
3681 | A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision | Lanxiao Li (Karlsruher Institut fuer Technologie)*; Michael Heizmann (Karlsruher Institut fuer Technologie) |
3685 | VecGAN: Image-to-Image Translation with Interpretable Latent Directions | Yusuf Dalva (Bilkent University); Said F Altındiş (Bilkent University); Aysegul Dundar (Bilkent University)* |
3686 | SNeS: Learning Probably Symmetric Neural Surfaces from Incomplete Data | Eldar Insafutdinov (University of Oxford); Dylan Campbell (University of Oxford)*; Joao F Henriques (University of Oxford); Andrea Vedaldi (Oxford University) |
3689 | Three things everyone should know about Vision Transformers | Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Alaaeldin M El-Nouby (Facebook AI Research); Jakob Verbeek (Facebook); Herve Jegou (Facebook AI Research) |
3690 | DeiT III: Revenge of the ViT | Hugo Touvron (Facebook AI Research)*; Matthieu Cord (Sorbonne University); Herve Jegou (Facebook AI Research) |
3693 | Any-resolution Training for High-resolution Image Synthesis | Lucy Chai (MIT)*; Michaël Gharbi (Adobe Research); Eli Shechtman (Adobe Research, US); Phillip Isola (MIT); Richard Zhang (Adobe) |
3703 | HDR-Plenoxels: Self-Calibrating High Dynamic Range Radiance Fields | Kim Jun-Seong (POSTECH)*; Kim Yu-Ji (POSTECH); Moon Ye-Bin (POSTECH); Tae-Hyun Oh (POSTECH) |
3719 | PartImageNet: A Large, High-Quality Dataset of Parts | Ju He (Johns Hopkins University)*; Shuo Yang (University of Technology Sydney); Shaokang Yang (ByteDance); Adam Kortylewski (Max Planck Institute for Informatics); Xiaoding Yuan (Johns Hopkins University); Jie-Neng Chen (Johns Hopkins University); shuai liu (ByteDance Inc.); Cheng Yang (ByteDance Inc.); Qihang Yu (Johns Hopkins University); Alan Yuille (Johns Hopkins University) |
3721 | Abstracting Sketches through Simple Primitives | Stephan Alaniz (University of Tübingen)*; Massimiliano Mancini (University of Tübingen); Anjan Dutta (University of Surrey); Diego Marcos (Wageningen University); Zeynep Akata (University of Tübingen) |
3723 | MTTrans: Cross-Domain Object Detection with Mean Teacher Transformer | Jinze Yu (Beihang University); Jiaming Liu (Peking University); Xiaobao Wei (Beihang University); Haoyi Zhou (Beihang University); Yohei Nakata (Panasonic Corporation); Denis A Gudovskiy (Panasonic); Tomoyuki Okuno (Panasonic); Jianxin Li (Beihang University); Kurt Keutzer (UC Berkeley); Shanghang Zhang (University of California, Berkeley)* |
3731 | TAFIM: Targeted Adversarial Attacks against Facial Image Manipulations | Shivangi Aneja (Technical University Of Munich )*; Lev Markhasin (Sony Europe); Matthias Niessner (Technical University of Munich) |
3737 | NeuMan: Neural Human Radiance Field from a Single Video | Wei Jiang (University of British Columbia)*; Kwang Moo Yi (University of British Columbia); Golnoosh Samei (UBC); Oncel Tuzel (Apple); Anurag Ranjan (Apple) |
3747 | Learning Implicit Templates for Point-Based Clothed Human Modeling | Siyou Lin (Tsinghua University)*; Hongwen Zhang (Tsinghua University); Zerong Zheng (Tsinghua University); Ruizhi Shao (Tsinghua University); Yebin Liu (Tsinghua University) |
3751 | Event Neural Networks | Matthew Dutson (University of Wisconsin-Madison)*; Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA) |
3755 | Learning to Censor by Noisy Sampling | Ayush Chopra (MIT)*; Abhinav Java (Adobe, MDSR Labs); Abhishek Singh (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology) |
3758 | ConMatch: Semi-Supervised Learning with Confidence-Guided Consistency Regularization | Jiwon Kim (Korea University)*; Youngjo Min (Korea University); Daehwan Kim (Samsung electro mechanics); Gyuseong Lee (Korea University); Junyoung Seo (Korea University); Kwangrok Ryoo (Korea University); Seungryong Kim (Korea University) |
3760 | Granularity-aware Adaptation for Image Retrieval over Multiple Tasks | Jon Almazan (Naver Labs); Byungsoo Ko (NAVER/LINE Corp.); Geonmo Gu (NAVER corp); Diane Larlus (Naver Labs Europe); Yannis Kalantidis (NAVER LABS Europe)* |
3769 | EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers | Junting Pan (The Chinese University of Hong Kong); Adrian Bulat (Samsung AI Center, Cambridge); Fuwen Tan (Samsung AI Center, Cambridge); Xiatian Zhu (University of Surrey); Lukasz Dudziak (Samsung AI Center Cambridge); Hongsheng Li (The Chinese University of Hong Kong); Georgios Tzimiropoulos (Queen Mary University of London); Brais Martinez (Samsung AI Center)* |
3780 | Multi-Domain Multi-Definition Landmark Localization for Small Datasets | David Ferman (AI Foundation); Gaurav Bharaj (AI Foundation)* |
3781 | TAVA: Template-free Animatable Volumetric Actors | Ruilong Li (UC Berkeley)*; Julian Tanke (University of Bonn); Minh P Vo (Facebook Reality Labs); Michael Zollhöfer (Facebook Reality Labs); Jürgen Gall (University of Bonn); Angjoo Kanazawa (University of California Berkeley); Christoph Lassner (Meta Reality Labs Research) |
3792 | Stereo Depth Estimation with Echoes | Chenghao Zhang (National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China)*; Kun Tian (Institute of Automation, Chinese Academy of Sciences); Bolin Ni (Institute of Automation, Chinese Academy of Sciences); Gaofeng Meng (Chinese Academy of Sciences); Bin Fan (University of Science and Technology Beijing); Zhaoxiang Zhang (Chinese Academy of Sciences, China); Chunhong Pan (Institute of Automation, Chinese Academy of Sciences) |
3794 | EASNet: Searching Elastic and Accurate Network Architecture for Stereo Matching | Qiang Wang (Harbin Institute of Technology (Shenzhen))*; Shaohuai Shi (The Hong Kong University of Science and Technology); Kaiyong Zhao (Hong Kong Baptist University); Xiaowen Chu (Hong Kong University of Science and Technology) |
3798 | DEVIANT: Depth EquiVarIAnt NeTwork for Monocular 3D Object Detection | Abhinav Kumar (Michigan State University)*; Garrick Brazil (Facebook); Enrique Corona (Ford Motor Company); Armin Parchami (Ford Motor Company); Xiaoming Liu (Michigan State University) |
3809 | RBP-Pose: Residual Bounding Box Projection for Category-Level Pose Estimation | Ruida Zhang (Tsinghua University)*; Yan Di (Technical University of Munich); Zhiqiang Lou (Tsinghua University); Fabian Manhardt (Google); Federico Tombari (Google, TU Munich); Xiangyang Ji (Tsinghua University) |
3820 | Levenshtein OCR | Cheng Da (Alibaba DAMO Academy)*; Wang Peng (Alibaba DAMO Academy); Cong Yao (Alibaba DAMO Academy) |
3821 | Multi-Granularity Prediction for Scene Text Recognition | Wang Peng (Alibaba DAMO Academy); Cheng Da (Alibaba DAMO Academy)*; Cong Yao (Alibaba DAMO Academy) |
3827 | MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition | Chuanguang Yang (Institute of Computing Technology, Chinese Academy of Sciences )*; Zhulin An (Institute of Computing Technology, Chinese Academy of Sciences); Helong Zhou (Beijing Horizon Information Technology Co.,Ltd); linhang cai (Institute of Computing Technology, Chinese Academy of Sciences); Xiang Zhi (Institute of Computing Technology, Chinese Academy of Sciences); Jiwen Wu (Institute of Computing Technology, Chinese Academy of Sciences); yongjun xu (Institute of Computing Technology, Chinese Academy of Sciences); Qian Zhang (Horizon Robotics) |
3834 | Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input | Qingpei Guo (Ant Financial Services Group)*; Kaisheng Yao (Amazon); Wei Chu (Ant Group) |
3837 | Efficient Video Transformers with Spatial-temporal Token Selection | Junke Wang (Fudan University)*; Xitong Yang (University of Maryland); Hengduo Li (University of Maryland, College Park ); Li Liu (BirenTech Research); Zuxuan Wu (UMD); Yu-Gang Jiang (Fudan University) |
3844 | DAS: Densely-Anchored Sampling for Deep Metric Learning | Lizhao Liu (South China University of Technology); Shangxin Huang (South China University of Technology); Zhuangwei Zhuang (South China University of Technology); Ran Yang (South China University of Technology); Mingkui Tan (South China University of Technology)*; Yaowei Wang (PengCheng Laboratory) |
3864 | ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-modality Image Fusion | Zhanbo Huang (Dalian University of Technology); Jinyuan Liu (Dalian University of Technology); Xin Fan (Dalian University of Technology)*; Risheng Liu (Dalian University of Technology); Wei Zhong (Dalian University of Technology); Zhongxuan Luo (DALIAN UNIVERSITY OF TECHNOLOGY) |
3867 | RIBAC: Towards Robust and Imperceptible Backdoor Attack against Compact DNN | Huy Phan (Rutgers University)*; Cong Shi (Rutgers University); Yi Xie (Rutgers University); Tianfang Zhang (Rutgers University, New Brunswick); Zhuohang Li (University of Tennessee, Knoxville); Tianming Zhao (Temple University); Jian Liu (The University of Tennessee, Knoxville); Yan Wang (Temple University); Yingying Chen (Rutgers University); bo yuan (rutgers university) |
3870 | Point Cloud Compression with Sibling Context and Surface Priors | Zhili CHEN (HKUST); Zian Qian (HKUST); Sukai Wang (HKUST); Qifeng Chen (HKUST)* |
3874 | Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition | zhou yang (Xidian University); Weisheng Dong (Xidian University)*; Xin Li (West Virginia University); Jinjian Wu (Xidian University); Leida Li (Xidian University); Guangming Shi (Xidian University) |
3885 | Point Cloud Compression using Range Image-based Entropy Model for Autonomous Driving | Sukai Wang (HKUST)*; Ming Liu (HKUST) |
3904 | CANF-VC: Conditional Augmented Normalizing Flows for Video Compression | Yung-Han Ho (NCTU); Chih-Peng Chang (National Chiao Tung University); Peng-Yu Chen (NYCU); Alessandro Gnutti (University of Brescia); Wen-Hsiao Peng (National Yang Ming Chiao Tung University)* |
3912 | Bi-level Feature Alignment for Versatile Image Translation and Manipulation | Fangneng Zhan (Max Planck Institute for Informatics); Yingchen Yu (Nanyang Technological University); Rongliang WU (Nanyang Technological University); Jiahui Zhang (Nanyang Technological University); Kaiwen Cui (Nanyang Technological University); Aoran Xiao (Nanyang Technological University); Shijian Lu (Nanyang Technological University)*; Chunyan Miao (NTU) |
3918 | Lane Detection Transformer based on Multi-frame Horizontal and Vertical Attention and Visual Transformer Module | Han Zhang (Beihang University)*; Yunchao Gu (BUAA); Xinliang Wang (BUAA); Junjun Pan (Beihang University); Minghui Wang (Beihang University) |
3921 | Label-Guided Auxiliary Training Improves 3D Object Detector | yaomin huang (East China Normal University); Xinmei Liu (East China Normal University)*; Yichen Zhu (Midea Group); Zhiyuan Xu (Midea Group); Chaomin Shen (East China Normal University); Zhengping Che (Midea Group); Guixu Zhang (East China Normal University); Yaxin Peng (Department of Mathematics, School of Science, Shanghai University); Feifei Feng (Midea Group); Jian Tang (Midea Group) |
3932 | FedX: Unsupervised Federated Learning with Cross Knowledge Distillation | Sungwon Han (KAIST)*; Sungwon Park (KAIST); Fangzhao Wu (MSRA); Sundong Kim (Institute for Basic Science); Chuhan Wu (Tsinghua University); Xing Xie (Microsoft Research Asia); Meeyoung Cha (Institute for Basic Science) |
3936 | ProposalContrast: Unsupervised Pre-training for LiDAR-based 3D Object Detection | Junbo Yin (Beijing Institute of Technology); Wenguan Wang (Eidgenössische Technische Hochschule Zürich); Dingfu Zhou (Baidu); Jin Fang (Baidu ); Liangjun Zhang (baidu); Cheng-Zhong Xu (University of Macau); Jianbing Shen (Inception Institute of Artificial Intelligence)* |
3948 | Audio-Driven Stylized Gesture Generation with Flow-Based Model | Sheng Ye (Tsinghua University)*; Yu-Hui Wen (Tsinghua University); Yanan Sun (Tsinghua University); Ying He (Nanyang Technological University); Ziyang Zhang (HUAWEI TECHNOLOGIES CO.LTD); Yaoyuan Wang (Huawei Technologies Co., Ltd.); Weihua He (Tsinghua University); Yong-Jin Liu (Tsinghua University) |
3958 | Unsupervised Domain Adaptation for One-Stage Object Detector using Offsets to Bounding Box | Jayeon Yoo (Seoul National University); Inseop Chung (Seoul National University); Nojun Kwak (Seoul National University)* |
3964 | Joint Feature Learning and Relation Modeling for Tracking: A One-Stream Framework | Botao Ye (Institute of Computing Technology, Chinese Academy of Sciences)*; Hong Chang (Chinese Academy of Sciences); Bingpeng MA (University of Chinese Academy of Sciences); Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences); Xilin Chen (Institute of Computing Technology, Chinese Academy of Sciences) |
3965 | PreTraM: Self-Supervised Pre-training via Connecting Trajectory and Map | Chenfeng Xu (UC Berkeley)*; Tian Li (University of California, San Diego); Chen Tang (UC Berkeley); Lingfeng Sun (UC Berkeley); Kurt Keutzer (EECS, UC Berkeley); Masayoshi TOMIZUKA (MSC Lab); Alireza Fathi (Google); Wei Zhan (University of California, Berkeley) |
3966 | DeepPS2: Revisiting Photometric Stereo using Two Differently Illuminated Images | Ashish Tiwari (Indian Institute of Technology Gandhinagar)*; Shanmuganathan Raman (Indian Institute of Technology (IIT) Gandhinagar) |
3977 | Learn From All: Erasing Attention Consistency for Noisy Label Facial Expression Recognition | Yuhang Zhang (Beijing University of Posts and Telecommunications); Chengrui Wang (Beijing University of Posts and Telecommunications); Xu Ling (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications)* |
3984 | Novel Class Discovery without Forgetting | Joseph K J (Indian Institute of Technology, Hyderabad)*; Sujoy Paul (Google Research); Gaurav Aggarwal (Google); Soma Biswas (Indian Institute of Science, Bangalore); Piyush Rai (IIT Kanpur); Kai Han (The University of Hong Kong); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad) |
3985 | Self-Constrained Inference Optimization on Structural Groups for Human Pose Estimation | ZheHan Kan (Southern University of Science and Technology); Shuoshuo Chen (Southern University of Science and Technology); Zeng Li (Southern University of Science and Technology); Zhihai He (Southern University of Science and Technology)* |
3989 | Predicting is not Understanding: Recognizing and Addressing Underspecification in Machine Learning | Damien Teney (University of Adelaide)*; Maxime Peyrard (EPFL); Ehsan M Abbasnejad (The University of Adelaide) |
3991 | A Non-isotropic Probabilistic Take on Proxy-based Deep Metric Learning | Michael Kirchhof (University of Tübingen)*; Karsten Roth (University of Tuebingen); Zeynep Akata (University of Tübingen); Enkelejda Kasneci (University of Tuebingen) |
3998 | Relative Pose from SIFT Features | Daniel Barath (ETH Zürich)*; Zuzana Kukelova (Czech Technical University in Prague) |
3999 | Monocular 3D Object Reconstruction with GAN Inversion | Junzhe Zhang (Nanyang Technological University)*; Daxuan Ren (Nanyang Technological University); Zhongang Cai (SenseTime International Pte Ltd); Chai Kiat Yeo (Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University) |
4001 | PromptDet: Towards Open-vocabulary Detection using Uncurated Images | Chengjian Feng (Meituan inc.)*; Yujie Zhong (University of Oxford); Zequn Jie (Meituan inc.); Xiangxiang Chu (Meituan); Haibing Ren (Meituan Inc.); Xiaolin Wei (Meituan); Weidi Xie (Shanghai Jiao Tong University); Lin Ma (Meituan) |
4005 | Densely Constrained Depth Estimator for Monocular 3D Object Detection | Yingyan Li (CASIA)*; Yuntao Chen (TuSimple); Jiawei He (Institute of Automation, Chinese Academy of Sciences); Zhaoxiang Zhang (Chinese Academy of Sciences, China) |
4016 | Content Adaptive Latents and Decoder for Neural Image Compression | Guanbo Pan (Beihang University)*; Guo Lu (Beijing Institute of Technology); Zhihao Hu (Beihang University); Dong Xu (The University of Hong Kong) |
4018 | High-Fidelity Image Inpainting with GAN Inversion | Yongsheng YU (University of Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences)*; Heng Fan (University of North Texas); Tiejian Luo (University of Chinese Academy of Sciences) |
4019 | Spatially Invariant Unsupervised 3D Object-Centric Learning and Scene Decomposition | Tianyu Wang (The Australian National University); Miaomiao Liu (The Australian National University)*; Kee Siong Ng (The Australian National University) |
4020 | W2N: Switching From Weak Supervision to Noisy Supervision for Object Detection | Zitong Huang (Harbin Institute of Technology); Yiping Bao (Megvii(Face++) Inc); Bowen Dong (Harbin Institute of Technology); erjin zhou (megvii); Wangmeng Zuo (Harbin Institute of Technology, China)* |
4021 | UnrealEgo: A New Dataset for Robust Egocentric 3D Human Motion Capture | Hiroyasu Akada (Max Planck Institute for Informatics, Keio University); Jian Wang (Max Planck Institute for Informatics); Soshi Shimada (MPI for Informatics); Masaki Takahashi (Keio University); Christian Theobalt (MPI Informatik); Vladislav Golyanik (MPI for Informatics)* |
4022 | MotionCLIP: Exposing Human Motion Generation to CLIP Space | Guy Tevet (Tel Aviv University)*; Brian Gordon (Tel Aviv University); Amir Hertz (Tel Aviv University); Amit H Bermano (Tel-Aviv University); Danny Cohen-Or (Tel Aviv University) |
4023 | Efficient and Degradation-Adaptive Network for Real-World Image Super-Resolution | Jie Liang (The Hong Kong Polytechnic University)*; Hui Zeng (OPPO); Lei Zhang (Hong Kong Polytechnic University, Hong Kong, China) |
4024 | Unidirectional Video Denoising by Mimicking Backward Recurrent Modules with Look-ahead Forward Ones | Junyi Li (Harbin Institute of Technology); Xiaohe Wu (Harbin Institute of technology); zhenxing niu (Alibaba Group-Machine Intelligence Technology); Wangmeng Zuo (Harbin Institute of Technology, China)* |
4029 | Map-free Visual Relocalization: Metric Pose Relative to a Single Image | Eduardo Arnold (University of Warwick); Jamie M Wynn (Niantic); Sara Vicente (Niantic); Guillermo Garcia-Hernando (Niantic); Aron Monszpart (Niantic); Victor A Prisacariu (Niantic Labs); Daniyar Turmukhambetov (Niantic); Eric Brachmann (Niantic)* |
4032 | DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta | Yan Hong (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong University)*; Jianfu Zhang (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University) |
4035 | Sample-Adaptive Augmentation for Long-Tailed Image Classification | Yan Hong (Shanghai Jiao Tong University); Jianfu Zhang (Shanghai Jiao Tong University)*; Zhongyi Sun (Tencent); Ke Yan (Tencent) |
4037 | TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers | Jihao Liu (Sensetime)*; Boxiao Liu (Institute of Computing Technology, Chinese Academy of Sciences); Hang Zhou (The Chinese University of Hong Kong); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD) |
4041 | UFO: Unified Feature Optimization | Teng Xi (Baidu Inc.)*; Yifan Sun (Baidu Research); Deli Yu (Baidu Inc. ); Bi Li (Baidu Inc.); Nan Peng (Baidu Inc.); gang zhang (Baidu Inc.); Xinyu Zhang (Baidu Inc.); Zhigang Wang (shanghai AI lab); jinwen chen (Baidu Inc.); Jian Wang (Baidu Inc.); liu lufei (Baidu Inc); Haocheng Feng (Baidu Inc.); Junyu Han (Baidu Inc.); jingtuo liu (baidu); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu) |
4043 | Master of All: Simultaneous Generalization of Urban-Scene Segmentation to All Adverse Weather Conditions | Nikhil Reddy (IIT Delhi)*; Abhinav Singhal (Indian Institute of Technology, Delhi); Abhishek Kumar (IIT Delhi); Mahsa Baktashmotlagh (University of Queensland); Chetan Arora (Indian Institute of Technology Delhi) |
4047 | PalQuant: Accelerating High-precision Networks on Low-precision Accelerators | Qinghao Hu (Institute of Automation, Chinese Academy of Sciences)*; gang li (shanghai jiao tong university); Qiman Wu (Baidu Inc.); Jian Cheng (Chinese Academy of Sciences, China) |
4057 | Self-Supervised Learning for Real-World Super-Resolution from Dual Zoomed Observations | Zhilu Zhang (Harbin Institute of Technology); Ruohao Wang (Harbin Institute of Technology); Hongzhi Zhang (Harbin Institute of Technology); Yunjin Chen (ULSee Inc.); Wangmeng Zuo (Harbin Institute of Technology, China)* |
4059 | UniMiSS: Universal Medical Self-Supervised Learning via Breaking Dimensionality Barrier | Yutong Xie (University of Adelaide)*; Jianpeng Zhang (Northwestern Polytechnical University); Yong Xia (Northwestern Polytechnical University, Research & Development Institute of Northwestern Polytechnical University in Shenzhen); Qi Wu (University of Adelaide) |
4073 | Self-distilled Feature Aggregation for Self-supervised Monocular Depth Estimation | Zhengming Zhou (NLPR-IA-CAS); Qiulei Dong (NLPR-IA-CAS)* |
4074 | Negative Samples are at Large: Leveraging Hard-distance Elastic Loss for Re-identification | Hyungtae Lee (DEVCOM Army Research Laboratory)*; Sungmin Eum (Booz Allen Hamilton Inc.); Heesung Kwon (U.S. Army Research Laboratory) |
4076 | Global-local Motion Transformer for Unsupervised Skeleton-based Action Learning | Boeun Kim (Seoul National University)*; Hyung Jin Chang (University of Birmingham); Jungho Kim (KETI); Jin Young Choi (Seoul National University) |
4080 | Towards Efficient and Scale-Robust Ultra-High-Definition Image Demoiréing | Xin Yu (The University of Hong Kong)*; Peng Dai (The University of Hong Kong); Wenbo Li (The Chinese University of Hong Kong); Lan Ma (TCL Corporate Research); Jiajun Shen (TCL Research); Jia Li (Sun Yat-Sen University); Xiaojuan Qi (The University of Hong Kong) |
4084 | Instance Contour Adjustment via Structure-driven CNN | Shuchen Weng (Peking University)*; Yi Wei (Samsung Research America Inc.); Ming-Ching Chang (University at Albany – SUNY); Boxin Shi (Peking University) |
4085 | ERDN: Equivalent Receptive Field Deformable Network for Video Deblurring | Bangrui Jiang (Tsinghua University)*; zhihuai xie (Tencent); Zhen Xia (Tencent); Songnan Li (Tencent); Shan Liu (Tencent America) |
4090 | Localizing Visual Sounds the Easy Way | Shentong Mo (Carnegie Mellon University); Pedro Morgado (CMU)* |
4105 | Polarimetric Pose Prediction | Daoyi Gao (Technical University of Munich)*; Yitong Li (Technical University of Munich); Patrick Ruhkamp (Technical University of Munich); Iuliia Skobleva (Technical University of Munich); Magdalena Wysocki (Technical University of Munich); HyunJun Jung (Technical University of Munich); Pengyuan Wang (TUM); Arturo Guridi (Technical University of Munich); Benjamin Busam (Technical University of Munich) |
4115 | DFNet: Enhance Absolute Pose Regression with Direct Feature Matching | Shuai Chen (University of Oxford)*; Xinghui Li (University of Oxford); Zirui Wang (University of Oxford); Victor Adrian Prisacariu (University of Oxford) |
4117 | A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge | Dustin Schwenk (Allen Institute for Artificial Intelligence); Apoorv Khandelwal (Allen Institute for AI); Christopher A Clark (Allen Institute for AI); Kenneth Marino (CMU); Roozbeh Mottaghi (Allen Institute for AI)* |
4119 | Sound Localization by Self-Supervised Time Delay Estimation | Ziyang Chen (University of Michigan)*; David Fouhey (University of Michigan); Andrew Owens (U Michigan) |
4120 | AdaFocus V3: On Unified Spatial-temporal Dynamic Video Recognition | Yulin Wang (Tsinghua University); Yang Yue (Tsinghua University); Xinhong Xu (Tsinghua University); Ali Hassani (University of Oregon); Victor Kulikov (Picsart); Nikita Orlov (PicsArt); Shiji Song (Department of Automation, Tsinghua University); Humphrey Shi (U of Oregon / UIUC / PAIR); Gao Huang (Tsinghua)* |
4123 | Discrete-Constrained Regression for Local Counting Models | Haipeng Xiong (National University of Singapore)*; Angela Yao (National University of Singapore) |
4124 | Towards Regression-Free Neural Networks for Diverse Compute Platforms | Rahul Duggal (Georgia Tech); Hao Zhou (Amazon); Shuo Yang (Amazon); Jun Fang (Amazon)*; Yuanjun Xiong (Amazon); Wei Xia (Amazon) |
4130 | Selection and Cross Similarity for Event-Image Deep Stereo | Hoonhee Cho (KAIST)*; Kuk-Jin Yoon (KAIST) |
4136 | Long Movie Clip Classification with State-Space Video Models | Md Mohaiminul Islam (UNC Chapel Hill)*; Gedas Bertasius (UNC Chapel Hill) |
4145 | Relationship Spatialization for Depth Estimation | Xiaoyu Xu (University of Waterloo)*; Jiayan Qiu (University of Waterloo); Xinchao Wang (National University of Singapore); Zhou Wang (University of Waterloo) |
4150 | Breadcrumbs: Adversarial Class-Balanced Sampling for Long-tailed Recognition | Bo Liu (Wormpex AI Research)*; Haoxiang Li (Wormpex AI Research); Hao Kang (Wormpex AI Research); Gang Hua (Wormpex AI Research); Nuno Vasconcelos (UCSD, USA) |
4152 | Image2Point: 3D Point-Cloud Understanding with 2D Image Pretrained Models | Chenfeng Xu (UC Berkeley)*; Shijia Yang (UC Berkeley); Tomer Galanti (Massachusetts Institute of Technology); Bichen Wu (Facebook Research); Xiangyu Yue (University of California, Berkeley); Bohan Zhai (UC Berkeley); Wei Zhan (University of California, Berkeley); Kurt Keutzer (EECS, UC Berkeley); Peter Vajda (Facebook); Masayoshi Tomizuka (University of California, Berkeley) |
4175 | Visual Prompt Tuning | Menglin Jia (Cornell University)*; Luming Tang (Cornell University); Bor-Chun Chen (Facebook AI); Claire T Cardie (Cornell University); Serge Belongie (University of Copenhagen); Bharath Hariharan (Cornell University); Ser-Nam Lim (Meta AI) |
4181 | Multi-scale and Cross-scale Contrastive Learning for Semantic Segmentation | Theodoros Pissas (University College London)*; Claudio S Ravasio (King’s College London (KCL)); Lyndon DaCruz (Moorfields Eye Hospital / University College London); Christos Bergeles (King’s College London) |
4185 | Rethinking Generic Camera Models for Deep Single Image Camera Calibration to Recover Rotation and Fisheye Distortion | Nobuhiko Wakai (Panasonic Corporation)*; Satoshi Sato (Panasonic Corporation); Yasunori Ishii (Panasonic Holdings); Takayoshi Yamashita (Chubu University) |
4188 | Neural-Sim: Learning to Generate Training Data with NeRF | Yunhao Ge (University of Southern California)*; Harkirat Behl (University of Oxford); Jiashu Xu (USC); Suriya Gunasekar (Microsoft Research); Neel Joshi (MICROSOFT RESEARCH); Yale Song (FAIR); Xin Wang (Microsoft Research); Laurent Itti (University of Southern California); Vibhav Vineet (Microsoft Research) |
4195 | Word-Level Fine-Grained Story Visualization | Bowen Li (University of Oxford)* |
4206 | Chairs Can be Stood on: Overcoming Object Bias in Human-Object Interaction Detection | Guangzhi Wang (National University of Singapore)*; Yangyang Guo (National University of Singapore); Yongkang Wong (National University of Singapore); Mohan Kankanhalli (National University of Singapore) |
4208 | GOCA: Guided Online Cluster Assignment for Self Supervised Video Representation Learning | HUSEYIN COSKUN (Technical University of Munich)*; Alireza Zareian (Snap Inc.); Joshua L Moore (Snapchat); Federico Tombari (Google, TU Munich); Chen Wang (Snap Inc.) |
4217 | Learning Audio-Video Modalities from Image Captions | Arsha Nagrani (Google)*; Paul Hongsuck Seo (Google); Bryan Seybold (Google); Anja Hauth (Google AI); Santiago Manen (Google); Chen Sun (Brown University); Cordelia Schmid (Google) |
4220 | Inverted Pyramid Multi-task Transformer for Dense Scene Understanding | Hanrong Ye (The Hong Kong University of Science and Technology)*; Dan Xu (The Hong Kong University of Science and Technology) |
4222 | Image Inpainting with Cascaded Modulation GAN and Object-Aware Training | Haitian Zheng (University of Rochester)*; Zhe Lin (Adobe Research); Jingwan Lu (Adobe Research); Scott Cohen (Adobe Research); Eli Shechtman (Adobe Research, US); Connelly Barnes (Adobe); Jianming Zhang (Adobe Research); Ning Xu (Adobe Research); Sohrab Amirghodsi (Adobe Research); Jiebo Luo (U. Rochester) |
4231 | Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues | Zixuan Huang (Georgia Institute of Technology)*; Stefan Stojanov (Georgia Institute of Technology); Anh Thai (Georgia Institute of Technology); Varun Jampani (Google); James Rehg (Georgia Institute of Technology) |
4237 | ART-SS: An Adaptive Rejection Technique for Semi-Supervised restoration for adverse weather-affected images | Rajeev Yasarla (AIBEE)*; Carey E Priebe (Johns Hopkins University); Vishal Patel (Johns Hopkins University) |
4239 | Skeleton-Parted Graph Scattering Networks for 3D Human Motion Prediction | Maosen Li (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University)*; Siheng Chen (Shanghai Jiao Tong University); Zijing Zhang (Zhejiang University); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University) |
4241 | MHR-Net: Multiple-Hypothesis Reconstruction of Non-Rigid Shapes from 2D Views | Haitian Zeng (University of Technology Sydney)*; Xin Yu (University of Technology Sydney); Jiaxu Miao (Zhejiang University); Yi Yang (Zhejiang University) |
4243 | Unifying Event Detection and Captioning as Sequence Generation via Pre-Training | Qi Zhang (Renmin University of China)*; Yuqing Song (Renmin University of China); Qin Jin (Renmin University of China) |
4247 | Depth Map Decomposition for Monocular Depth Estimation | Jinyoung Jun (Korea University)*; Jae-Han Lee (Gauss Labs Inc.); Chul Lee (Dongguk University); Chang-Su Kim (Korea University) |
4249 | Human-centric Image Cropping with Partition-aware and Content-preserving Features | Bo Zhang (Shanghai Jiao Tong University)*; Li Niu (Shanghai Jiao Tong University); Xing Zhao (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University) |
4252 | Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking | Boyu Chen (The University of Sydney); Peixia Li (The University of Sydney)*; Lei Bai (Shanghai AI Laboratory); Lei Qiao (SenseTime Group Limited); Qiuhong Shen (Harbin Institute of Technology (Shenzhen)); Bo Li (SenseTime Group Limited); Weihao Gan (SenseTime Group Limited); Wei Wu (SenseTime Group Limited); Wanli Ouyang (The University of Sydney) |
4255 | StyleFace: Towards Identity-Disentangled Face Generation on Megapixels | Yuchen Luo (Shanghai Jiao Tong University)*; Junwei Zhu (Tencent); Keke He (Tencent); Wenqing Chu (Tencent); Ying Tai (Tencent YouTu); Junchi Yan (Shanghai Jiao Tong University); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
4260 | Fusion from Decomposition: A Self-Supervised Decomposition Approach for Image Fusion | Pengwei Liang (Harbin Institute of Technology)*; Junjun Jiang (Harbin Institute of Technology); Xianming Liu (Harbin Institute of Technology); Jiayi Ma (Wuhan University) |
4261 | Learning Degradation Representations for Image Deblurring | Dasong Li (Chinese University of Hong Kong)*; Yi Zhang (CUHK); Ka Chun Cheung (Nvidia); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Hongwei Qin (Sensetime); Hongsheng Li (The Chinese University of Hong Kong) |
4269 | Aware of the History: Trajectory Forecasting with the Local Behavior Data | Yiqi Zhong (University of Southern California)*; Zhenyang Ni (Shanghai Jiao Tong University); Siheng Chen (Shanghai Jiao Tong University); Ulrich Neumann (USC) |
4270 | FAR: Fourier Aerial Video Recognition | Divya Kothandaraman (University of Maryland College Park)*; Tianrui Guan (University of Maryland, College Park); Xijun Wang (University of Maryland, College Park); Shuowen Hu (US Army Research Laboratory); Ming C Lin (UMD-CP & UNC-CH); Dinesh Manocha (University of Maryland at College Park) |
4271 | X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation | Yinan He (Beijing University of Posts and Telecommunications)*; Gengshi Huang (School of Electronics and Information Technology, Sun Yat-sen University); Siyu Chen (Carnegie Mellon University); Jianing Teng (sensetime); Kun Wang (SenseTime Group Limited); Zhenfei Yin (Sensetime); Lu Sheng (Beihang University); Ziwei Liu (Nanyang Technological University); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Jing Shao (Sensetime) |
4273 | Disentangled Differentiable Network Pruning | Shangqian Gao (University of Pittsburgh)*; Feihu Huang (University of Pittsburgh); Yanfu Zhang (University of Pittsburgh); Heng Huang (University of Pittsburgh) |
4275 | Video Extrapolation in Space and Time | Yunzhi Zhang (Stanford University)*; Jiajun Wu (Stanford University) |
4277 | IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors | Sheng Xu (Beihang University)*; Yanjing Li (Beihang University); Bohan Zeng (Beihang University); Teli Ma (Shanghai Artificial Intelligence Laboratory); Baochang Zhang (Beihang University); Xianbin Cao (Beihang University, China); Peng Gao (Chinese University of Hong Kong); Jinhu Lu (Beihang University, Beijing, China) |
4278 | Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation | Chuang Lin (Monash University)*; Yi Jiang (Bytedance); Jianfei Cai (Monash University); Lizhen Qu (Monash University); Reza Haffari (Monash University, Australia); Zehuan Yuan (Bytedance.Inc) |
4282 | DnA: Improving Few-shot Transfer Learning with Low-Rank Decomposition and Alignment | Ziyu Jiang (Texas A&M University)*; Tianlong Chen (University of Texas at Austin); Xuxi Chen (University of Texas at Austin); Yu Cheng (Microsoft Research); Luowei Zhou (Microsoft); Lu Yuan (Microsoft); Ahmed Awadallah (Microsoft); Zhangyang Wang (University of Texas at Austin) |
4284 | Translating a Visual LEGO Manual to a Machine-Executable Plan | Ruocheng Wang (Stanford University)*; Yunzhi Zhang (Stanford University); Jiayuan Mao (MIT); Chin-Yi Cheng (Google Research); Jiajun Wu (Stanford University) |
4286 | Cornerformer: Purifying Instances for Corner-based Detectors | Haoran Wei (University of Chinese Academy of Sciences)*; Xin Chen (Huawei Inc.); Lingxi Xie (Huawei Inc.); Qi Tian (Huawei Cloud & AI) |
4287 | Contributions of Shape, Texture, and Color in Visual Recognition | Yunhao Ge (University of Southern California)*; Yao Xiao (University of Southern California); Zhi Xu (University of Southern California); Xingrui Wang (University of Southern California); Laurent Itti (University of Southern California) |
4288 | Monitored Distillation for Positive Congruent Depth Completion | Tian Yu Liu (UCLA); Parth Agrawal (UCLA); Allison Y Chen (University of California, Los Angeles); Byung-Woo Hong (Chung-Ang University); Alex Wong (Yale University)* |
4292 | Towards Unbiased Label Distribution Learning for Facial Pose Estimation Using Anisotropic Spherical Gaussian | Zhiwen Cao (Purdue University); Dongfang Liu (Rochester Institute of Technology)*; Qifan Wang (Meta AI); Yingjie Victor Chen (Purdue University) |
4293 | AirDet: Few-Shot Detection without Fine-tuning for Autonomous Exploration | Bowen Li (Tongji University)*; Chen Wang (Carnegie Mellon University); Pranay Reddy Anthireddy (Indian Institute of Information Technology, Design and Manufacturing, Jabalpur); Seungchan Kim (Carnegie Mellon University); Sebastian Scherer (Carnegie Mellon University) |
4295 | Learning to Weight Samples for Dynamic Early-exiting Networks | Yizeng Han (Tsinghua University); Yifan Pu (Tsinghua University); Zihang Lai (CMU); Chaofei Wang (Tsinghua University); Shiji Song (Department of Automation, Tsinghua University); cao junfeng (CMRI); Wenhui Huang (CMRI); Chao Deng (China Mobile Research Institute); Gao Huang (Tsinghua)* |
4300 | Constrained Mean Shift Using Distant Yet Related Neighbors for Representation Learning | K L Navaneet (University of California, Davis); Soroush Abbasi Koohpayegani (University of Maryland Baltimore County)*; Ajinkya B Tejankar (UMBC); Kossar Pourahmadi Meibodi (University of Maryland, Baltimore County); Akshayvarun Subramanya (UMBC); Hamed Pirsiavash (University of California Davis) |
4303 | SLIP: Self-supervision meets Language-Image Pre-training | Norman Mu (University of California, Berkeley)*; Alexander Kirillov (Facebook AI Research); David Wagner (UC Berkeley); Saining Xie (Facebook AI Research) |
4304 | Learning Visual Styles from Audio-Visual Associations | Tingle Li (Tsinghua University)*; Yichen Liu (Tsinghua University); Andrew Owens (U Michigan); Hang Zhao (Tsinghua University) |
4305 | Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting | Ying Chen (Hikvision Research Institute); Liang Qiao (Zhejiang University & Hikvision Research Institute)*; Zhanzhan Cheng (Zhejiang University & Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute); Yi Niu (Hikvision Research Institute); Xi Li (Zhejiang University) |
4310 | Prompting Visual-Language Models for Efficient Video Understanding | Chen Ju (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Tengda Han (University of Oxford); Kunhao Zheng (Shanghai Jiao Tong University); Ya Zhang (Cooperative Medianet Innovation Center, Shanghai Jiao Tong University); Weidi Xie (Shanghai Jiao Tong University)* |
4318 | One-Trimap Video Matting | Hongje Seong (Yonsei University)*; Seoung Wug Oh (Adobe Research); Brian Price (Adobe); Euntai Kim (Yonsei University); Joon-Young Lee (Adobe Research) |
4323 | Contrastive Learning for Diverse Disentangled Foreground Generation | Yuheng Li (UW Madison)*; Yijun Li (Adobe Research); Jingwan Lu (Adobe Research); Eli Shechtman (Adobe Research, US); Yong Jae Lee (University of Wisconsin-Madison); Krishna Kumar Singh (Adobe Research) |
4326 | Resolution-free Point Cloud Sampling Network with Data Distillation | Tianxin Huang (Zhejiang University)*; Jiangning Zhang (Zhejiang University); Jun Chen (Zhejiang University); Yuang Liu (Zhejiang University); Yong Liu (Zhejiang University) |
4327 | BIPS: Bi-modal Indoor Panorama Synthesis via Residual Depth-aided Adversarial Learning | Changgyoon Oh (KAIST)*; Wonjune Cho (NAVER LABS); Yujeong Chae (KAIST); Daehee Park (KAIST); Lin Wang (HKUST); Kuk-Jin Yoon (KAIST) |
4330 | Augmentation of rPPG Benchmark Datasets: Learning to Remove and Embed rPPG Signals via Double Cycle Consistent Learning from Unpaired Facial Videos | WEI-HAO Chung (National Tsing Hua University)*; CHENG-JU HSIEH (National Tsing Hua University); Chiou-Ting Hsu (National Tsing Hua University) |
4331 | Fabric Material Recovery from Video Using Multi-Scale Geometric Auto-Encoder | Junbang Liang (University of Maryland, College Park)*; Ming C Lin (UMD-CP & UNC-CH) |
4333 | An Invisible Black-box Backdoor Attack through Frequency Domain | Tong Wang (Nanjing University); Yuan Yao (Nanjing University)*; Feng Xu (Nanjing University); Shengwei An (Purdue University); Hanghang Tong (University of Illinois at Urbana-Champaign); Ting Wang (Penn State) |
4336 | Learning Mutual Modulation for Self-Supervised Cross-Modal Super-Resolution | Xiaoyu Dong (The University of Tokyo / RIKEN AIP); Naoto Yokoya (The University of Tokyo)*; Longguang Wang (National University of Defense Technology); Tatsumi Uezato (Hitachi, Ltd) |
4338 | TransGrasp: Grasp Pose Estimation of a Category of Objects by Transferring Grasps from Only One Labeled Instance | Hongtao Wen (Dalian University of Technology); Jianhang Yan (Dalian University of Technology); Wanli Peng (Dalian University of Technology)*; Yi Sun (Dalian University of Technology) |
4343 | Learning Instance and Task-Aware Dynamic Kernels for Few-shot Learning | Rongkai Ma (Monash University)*; Pengfei Fang (The Australian National University); Gil Avraham (Monash University); Yan Zuo (CSIRO); Tianyu Zhu (Monash University); Tom Drummond (University of Melbourne); Mehrtash Harandi (Monash University) |
4346 | PillarNet: Real-Time and High-Performance Pillar-based 3D Object Detection | Guangsheng Shi (Harbin Institute of Technology)*; Ruifeng Li (Harbin Institute of Technology); Chao Ma (Shanghai Jiao Tong University) |
4348 | Robust Object Detection With Inaccurate Bounding Boxes | Chengxin Liu (Huazhong University of Science and Technology); Kewei Wang (Huazhong Univ. of Sci.&Tech.); Hao Lu (Huazhong University of Science and Technology); Zhiguo Cao (Huazhong Univ. of Sci.&Tech.)*; Ziming Zhang (Worcester Polytechnic Institute) |
4349 | Revisiting the Critical Factors of Augmentation-Invariant Representation Learning | Junqiang Huang (MEGVII Technology)*; Xiangwen Kong (MEGVII Technology); Xiangyu Zhang (Megvii Technology) |
4359 | A Fast Knowledge Distillation Framework for Visual Recognition | Zhiqiang Shen (Carnegie Mellon University)*; Eric Xing (MBZUAI, CMU, and Petuum Inc.) |
4366 | MegBA: A GPU-Based Distributed Library for Large-Scale Bundle Adjustment | Jie Ren (Megvii Inc.); Wenteng Liang (Megvii); Ran Yan (Megvii)*; Luo Mai (University of Edinburgh); Shiwen Liu (Megvii); Xiao Liu (Megvii Inc) |
4367 | Spectrum-aware and Transferable Architecture Search for Hyperspectral Image Restoration | Wei He (Wuhan University)*; Quanming Yao (Tsinghua University); Naoto Yokoya (The University of Tokyo); Tatsumi Uezato (Hitachi, Ltd); Hongyan Zhang (Wuhan University); Liangpei Zhang (Wuhan University) |
4374 | Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks | Xiao Yang (Tsinghua University)*; Yinpeng Dong (Tsinghua University); Tianyu Pang (Sea AI Lab); Hang Su (Tsinghua University); Jun Zhu (Tsinghua University) |
4378 | Exploring the Devil in Graph Spectral Domain for 3D Point Cloud Attacks | Qianjiang Hu (Peking University); Daizong Liu (Peking University); Wei Hu (Peking University)* |
4385 | Geometry-aware Single-image Full-body Human Relighting | Chaonan Ji (Tsinghua University); Tao Yu (Tsinghua University); Kaiwen Guo (Google); JINGXIN LIU (OPPO); Yebin Liu (Tsinghua University)* |
4388 | Optical Flow Training under Limited Label Budget via Active Learning | Shuai Yuan (Duke University)*; Xian Sun (Duke University); Hannah H Kim (Duke University); Shuzhi Yu (Duke University); Carlo Tomasi (Duke University) |
4395 | RVSL: Robust Vehicle Similarity Learning in Real Hazy Scenes Based on Semi-supervised Learning | Wei-Ting Chen (National Taiwan University)*; I-HSIANG CHEN (National Taiwan University); CHIH-YUAN YEH (National Taiwan University); Hao-Hsiang Yang (National Taiwan University); Hua-En Chang (National Taiwan University); Jian-Jiun Ding (National Taiwan University); Sy-Yen Kuo (National Taiwan University) |
4400 | Hierarchical Feature Embedding for Visual Tracking | Zhixiong Pi (Huazhong University of Science and Technology)*; Weitao Wan (Tencent); Chong Sun (Tencent Wechat); Changxin Gao (Huazhong University of Science and Technology); Nong Sang (Huazhong University of Science and Technology); Chen Li (Tencent) |
4401 | Neural Color Operators for Sequential Image Retouching | YILI WANG (Tsinghua University); Xin Li (Baidu); Kun Xu (Tsinghua University)*; Dongliang He (Baidu); Qi Zhang (baidu); Fu Li (Baidu); Errui Ding (Baidu Inc.) |
4402 | Optimizing Image Compression via Joint Learning with Denoising | Ka Leong Cheng (The Hong Kong University of Science and Technology); Yueqi Xie (The Hong Kong University of Science and Technology); Qifeng Chen (HKUST)* |
4405 | DICE: Leveraging Sparsification for Out-of-Distribution Detection | Yiyou Sun (University of Wisconsin Madison); Yixuan Li (University of Wisconsin-Madison)* |
4406 | DeMFI: Deep Joint Deblurring and Multi-Frame Interpolation with Flow-Guided Attentive Correlation and Recursive Boosting | Jihyong Oh (KAIST)*; Munchurl Kim (Korea Advanced Institute of Science and Technology) |
4408 | Invariant Feature Learning for Generalized Long-Tailed Classification | Kaihua Tang (Nanyang Technological University)*; Mingyuan Tao (Damo Academy, Alibaba Group); Jiaxin Qi (Nanyang Technological University); Zhenguang Liu (Zhejiang University); Hanwang Zhang (Nanyang Technological University) |
4411 | Fine-Grained Visual Entailment | Christopher L Thomas (Columbia University)*; Yipeng Zhang (Columbia University); Shih-Fu Chang (Columbia University) |
4412 | Sliced Recursive Transformer | Zhiqiang Shen (Carnegie Mellon University)*; Zechun Liu (Carnegie Mellon University); Eric Xing (MBZUAI, CMU, and Petuum Inc.) |
4413 | Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval | Fan Hu (Renmin University of China); Aozhu Chen (Renmin University of China); Ziyue Wang (Renmin University of China); Fangming Zhou (Renmin University of China); Jianfeng Dong (Zhejiang Gongshang University); Xirong Li (Renmin University of China)* |
4416 | Asymmetric Relation Consistency Reasoning for Video Relation Grounding | Huan Li (Xi’an Jiaotong University); Ping Wei (Xi’an Jiaotong University)*; Jiapeng Li (Xi’an Jiaotong University); Zeyu Ma (Xi’an Jiaotong University); Jiahui Shang (Xi’an Jiaotong University); Nanning Zheng (Xi’an Jiaotong University) |
4420 | PETR: Position Embedding Transformation for Multi-View 3D Object Detection | Yingfei Liu (Megvii Technology); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Jian Sun (Megvii Technology) |
4422 | Contextual Text Block Detection towards Scene Text Understanding | Chuhui Xue (Nanyang Technological University); Jiaxing Huang (Nanyang Technological University); Wenqing Zhang (ByteDance); Shijian Lu (Nanyang Technological University)*; Changhu Wang (ByteDance.Inc); Song Bai (University of Oxford) |
4426 | Structure-aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation | Jingwang Ling (Tsinghua University); Zhibo Wang (Tsinghua University); Ming Lu (Intel Labs China); Quan Wang (Sensetime); Chen Qian (SenseTime); Feng Xu (Tsinghua University)* |
4429 | UniNet: Unified Architecture Search with Convolution, Transformer, and MLP | Jihao Liu (Sensetime)*; Xin Huang (Waseda University); Guanglu Song (Sensetime); Hongsheng Li (The Chinese University of Hong Kong); Yu Liu (SenseTime Group LTD) |
4433 | Efficient Decoder-free Object Detection with Transformers | Peixian Chen (Youtu Tencent); Mengdan Zhang (Youtu, Tencent); Yunhang Shen (Xiamen University); Kekai Sheng (Youtu Lab, Tencent Inc.); Yuting Gao (Tencent); Xing Sun (Shopee); Ke Li (Tencent)*; Chunhua Shen (University of Adelaide, Australia) |
4439 | Rethinking Keypoint Representations: Modeling Keypoints and Poses as Objects for Multi-Person Human Pose Estimation | William McNally (University of Waterloo)*; Kanav Vats (University of Waterloo); Alexander Wong (University of Waterloo); John McPhee (University of Waterloo) |
4440 | CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation | Lu Qi (The Chinese University of Hong Kong)*; Jason Kuen (Adobe Research); Zhe Lin (Adobe Research); Jiuxiang Gu (Adobe Research); Fengyun Rao (Tencent); Dian Li (Tencent.com); Weidong Guo (Tencent); Zhen Wen (Tencent Technology (Shenzhen) Co., Ltd); Ming-Hsuan Yang (University of California at Merced); Jiaya Jia (Chinese University of Hong Kong) |
4447 | StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning | Jinghuan Shang (Stony Brook University)*; Kumara Kahatapitiya (Stony Brook University); Xiang Li (Stony Brook University); Michael S Ryoo (Stony Brook/Google) |
4451 | S2Net: Stochastic Sequential Pointcloud Forecasting | Xinshuo Weng (NVIDIA Research)*; Junyu Nan (Carnegie Mellon University); Kuan-Hui Lee (Toyota Research Institute); Rowan McAllister (Toyota Research Institute); Adrien Gaidon (Toyota Research Institute); Nicholas Rhinehart (UC Berkeley); Kris Kitani (Carnegie Mellon University) |
4452 | D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding | Zhenyu Chen (Technical University of Munich)*; Qirui Wu (Simon Fraser University); Matthias Niessner (Technical University of Munich); Angel X Chang (Simon Fraser University) |
4464 | AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers | Yongming Rao (Tsinghua University); Wenliang Zhao (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
4471 | Neural Image Representations for Multi-Image Fusion and Layer Separation | Seonghyeon Nam (York University); Marcus A Brubaker (York University); Michael S Brown (York University)* |
4477 | Panoramic Human Activity Recognition | Ruize Han (College of Intelligence and Computing, Tianjin University); Haomin Yan (Tianjin University); Jiacheng Li (College of Intelligence and Computing, Tianjin University); Songmiao Wang (Tianjin University); Wei Feng (College of Intelligence and Computing, Tianjin University, China)*; Song Wang (University of South Carolina) |
4478 | Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution | Yushu Wu (Northeastern University)*; Yifan Gong (Northeastern University); Pu Zhao (Northeastern University); Yanyu Li (Northeastern University); Zheng Zhan (Northeastern University); Wei Niu (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Bin Ren (William & Mary); Yanzhi Wang (Northeastern University) |
4481 | Dual Adaptive Transformations for Weakly Supervised Point Cloud Segmentation | Zhonghua Wu (Nanyang Technological University)*; Yicheng Wu (Monash University); Guosheng Lin (Nanyang Technological University); Jianfei Cai (Monash University); Chen Qian (SenseTime) |
4495 | Modality Synergy Complement Learning with Cascaded Aggregation for Visible-Infrared Person Re-Identification | Yiyuan Zhang (Beijing Institute of Technology); Sanyuan Zhao (Beijing Institute of Technology)*; Yuhao Kang (Beijing Institute of Technology); Jianbing Shen (Inception Institute of Artificial Intelligence) |
4496 | RA-Depth: Resolution Adaptive Self-Supervised Monocular Depth Estimation | Mu He (Nanjing University of Science and Technology)*; Le Hui (Nanjing University of Science and Technology); Yikai Bian (Nanjing University of Science and Technology); Jian Ren (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
4505 | MoFaNeRF: Morphable Facial Neural Radiance Field | Yiyu Zhuang (Nanjing University); Hao Zhu (Nanjing University)*; Xusen Sun (Nanjing University); Xun Cao (Nanjing University) |
4513 | Visual Cross-View Metric Localization with Dense Uncertainty Estimates | Zimin Xia (Delft University of Technology)*; Olaf Booij (TomTom); Marco Manfredi (TomTom); Julian F P Kooij (Delft University of Technology) |
4525 | The One Where They Reconstructed 3D Humans and Environments in TV Shows | Georgios Pavlakos (UC Berkeley)*; Ethan Weber (UC Berkeley); Matthew Tancik (UC Berkeley); Angjoo Kanazawa (University of California Berkeley) |
4530 | PointInst3D: Segmenting 3D Instances by Points | Tong He (University of Adelaide)*; Wei Yin (University of Adelaide); Chunhua Shen (University of Adelaide, Australia); Anton van den Hengel (University of Adelaide) |
4533 | PolyphonicFormer: Unified Query Learning for Depth-aware Video Panoptic Segmentation | Haobo Yuan (Wuhan University)*; Xiangtai Li (Peking University); Yibo Yang (Peking University); Guangliang Cheng (Sensetime Group Limited); Jing Zhang (The University of Sydney); Yunhai Tong (Peking University); Lefei Zhang (Wuhan University); Dacheng Tao (JD.com) |
4534 | Quasi-Balanced Self-Training on Noise-Aware Synthesis of Object Point Clouds for Closing Domain Gap | Yongwei Chen (South China University of Technology); ZiHao Wang (South China University of Technology); Longkun Zou (South China University of Technology); Ke Chen (South China University of Technology); Kui Jia (South China University of Technology)* |
4537 | TinyViT: Fast Pretraining Distillation for Small Vision Transformers | Kan Wu (Sun Yat-sen University); Jinnian Zhang (University of Wisconsin Madison); Houwen Peng (Microsoft Research)*; Mengchen Liu (Microsoft); Bin Xiao (Microsoft); Jianlong Fu (Microsoft Research); Lu Yuan (Microsoft) |
4551 | VirtualPose: Learning Generalizable 3D Human Pose Models from Virtual Data | Jiajun Su (Peking University)*; Chunyu Wang (Microsoft Research Asia); Xiaoxuan Ma (Peking University); Wenjun Zeng (EIT Institute for Advanced Study); Yizhou Wang (PKU) |
4552 | Poseur: Direct Human Pose Regression with Transformers | Weian Mao (The University of Adelaide)*; Yongtao Ge (The University of Adelaide); Chunhua Shen (University of Adelaide, Australia); Xinlong Wang (University of Adelaide); Zhi Tian (Meituan); Zhibin Wang (Alibaba Group); Anton van den Hengel (University of Adelaide) |
4557 | Adaptive Image Transformations for Transfer-based Adversarial Attack | Zheng Yuan (Institute of Computing Technology, Chinese Academy of Sciences); Jie Zhang (ICT, CAS)*; Shiguang Shan (Institute of Computing Technology, Chinese Academy of Sciences) |
4566 | D2ADA: Dynamic Density-aware Active Domain Adaptation for Semantic Segmentation | Tsung-Han Wu (National Taiwan University)*; Yi-Syuan Liou (National Taiwan University); Shao-Ji Yuan (National Taiwan University); Hsin-Ying Lee (National Taiwan University); Tung-I Chen (National Taiwan University); Kuan-Chih Huang (National Taiwan University); Winston H. Hsu (National Taiwan University) |
4568 | SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds | Qingyong Hu (University of Oxford); Bo Yang (The Hong Kong Polytechnic University)*; Guangchi Fang (Sun Yat-sen University); Yulan Guo (Sun Yat-sen University); Ales Leonardis (University of Birmingham); Niki Trigoni (University of Oxford); Andrew Markham (University of Oxford) |
4581 | Deep Portrait Delighting | Joshua William Weir (Victoria University of Wellington)*; Junhong Zhao (CMIC); Andrew Chalmers (CMIC); Taehyun Rhee (Victoria University of Wellington) |
4584 | Vector Quantized Image-to-Image Translation | Yu-Jie Chen (National Chiao Tung University); Shin-I Cheng (National Chiao Tung University); Wei-Chen Chiu (National Chiao Tung University)*; Hung-Yu Tseng (Facebook); Hsin-Ying Lee (Snap Inc) |
4588 | PointMixer: MLP-Mixer for Point Cloud Understanding | Jaesung Choe (KAIST)*; Chunghyun Park (POSTECH); Francois Rameau (KAIST); Jaesik Park (POSTECH); In So Kweon (KAIST) |
4589 | V2X-ViT: Vehicle-to-Everything Cooperative Perception with Vision Transformer | Runsheng Xu (University of California, Los Angeles); Hao Xiang (University of California, Los Angeles); Zhengzhong Tu (University of Texas at Austin); Xin Xia (University of California, Los Angeles); Ming-Hsuan Yang (University of California at Merced); Jiaqi Ma (University of California, Los Angeles)* |
4593 | Cross-Domain Ensemble Distillation for Domain Generalization | Kyungmoon Lee (POSTECH)*; Sungyeon Kim (POSTECH); Suha Kwak (POSTECH) |
4596 | Cross-Modal 3D Shape Generation and Manipulation | Zezhou Cheng (University of Massachusetts, Amherst)*; Menglei Chai (Snap Inc.); Jian Ren (Snap Inc.); Hsin-Ying Lee (Snap Inc); Kyle B Olszewski (Snap Inc.); Zeng Huang (Snap Inc.); Subhransu Maji (University of Massachusetts, Amherst); Sergey Tulyakov (Snap Inc) |
4607 | Latent Partition Implicit with Surface Codes for 3D Representation | Chao Chen (Tsinghua University); Yu-Shen Liu (Tsinghua University)*; Zhizhong Han (Wayne State University) |
4614 | FILM: Frame Interpolation for Large Motion | Fitsum Reda (Google)*; Janne Kontkanen (Google); Eric Tabellion (Google); Deqing Sun (Google); Caroline Pantofaru (Google Research); Brian Curless (University of Washington) |
4619 | Facial Depth and Normal Estimation using Single Dual-Pixel Camera | Minjun Kang (KAIST)*; Jaesung Choe (KAIST); Hyowon Ha (Facebook); Hae-Gon Jeon (GIST); Sunghoon Im (DGIST); In So Kweon (KAIST); Kuk-Jin Yoon (KAIST) |
4622 | Initialization and Alignment for Adversarial Texture Optimization | Xiaoming Zhao (University of Illinois at Urbana-Champaign)*; Zhizhen Zhao (University of Illinois at Urbana-Champaign); Alexander Schwing (UIUC) |
4631 | Regularizing Vector Embedding in Bottom-Up Human Pose Estimation | Haixin Wang (School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Lu Zhou (CASIA); Yingying Chen (CASIA); Ming Tang (Institute of Automation, Chinese Academy of Sciences); Jinqiao Wang (Institute of Automation, Chinese Academy of Sciences) |
4633 | Equivariant Hypergraph Neural Networks | Jinwoo Kim (KAIST); Saeyoon Oh (KAIST); Sungjun Cho (LG AI Research); Seunghoon Hong (KAIST)* |
4636 | Learning Quality-aware Dynamic Memory for Video Object Segmentation | Yong Liu (Tsinghua University)*; Ran Yu (Tsinghua university); Fei Yin (Tsinghua University); Xinyuan Zhao (Huawei); Wei Zhao (Huawei); Weihao Xia (University College London); Yujiu Yang (Tsinghua University) |
4652 | Neural Scene Decoration from a Single Photograph | Hong Wing Pang (The Hong Kong University of Science and Technology)*; Yingshu Chen (The Hong Kong University of Science and Technology); Phuoc-Hieu T. Le (VinAI Research); Binh-Son Hua (VinAI Research); Thanh Nguyen (Deakin University, Australia); Sai-Kit Yeung (Hong Kong University of Science and Technology) |
4656 | Bottom Up Top Down Detection Transformers for Language Grounding in Images and Point Clouds | Ayush Jain (Carnegie Mellon University)*; Nikolaos Gkanatsios (Carnegie Mellon University); Ishita Mediratta (Meta AI); Katerina Fragkiadaki (Carnegie Mellon University) |
4658 | CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene | Hao-Xiang Chen (Tsinghua University)*; Jiahui Huang (Tsinghua University); Tai-Jiang Mu (Tsinghua University); Shi-Min Hu (Tsinghua University) |
4659 | Discovering Deformable Keypoint Pyramids | Jianing Qian (University of Pennsylvania)*; Anastasios Panagopoulos (University of Pennsylvania); Dinesh Jayaraman (University of Pennsylvania) |
4668 | TIDEE: Tidying Up Novel Rooms using Visuo-Semantic Commonsense Priors | Gabriel Sarch (Carnegie Mellon University)*; Zhaoyuan Fang (Carnegie Mellon University); Adam Harley (Carnegie Mellon University); Paul Schydlo (Carnegie Mellon University); Michael J Tarr (Carnegie Mellon University); Saurabh Gupta (UIUC); Katerina Fragkiadaki (Carnegie Mellon University) |
4669 | MOTR: End-to-End Multiple-Object Tracking with TRansformer | Fangao Zeng (Megvii Technology); Bin Dong (Megvii Technology); Yuang Zhang (Shanghai Jiao Tong University); Tiancai Wang (Megvii Technology)*; Xiangyu Zhang (Megvii Technology); Yichen Wei (Megvii Research Shanghai) |
4672 | K-centered Patch Sampling for Efficient Video Recognition | Seong Hyeon Park (KAIST AI)*; Jihoon Tack (KAIST); Byeongho Heo (NAVER AI LAB); Jung-Woo Ha (NAVER CLOVA AI Lab); Jinwoo Shin (KAIST) |
4675 | Learning Implicit Feature Alignment Function for Semantic Segmentation | Hanzhe Hu (Peking University)*; Yinbo Chen (UC San Diego); Jiarui Xu (University of California San Diego); Shubhankar Borse (Qualcomm AI Research); Hong Cai (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Xiaolong Wang (UCSD) |
4677 | A Visual Navigation Perspective for Category-Level Object Pose Estimation | Jiaxin Guo (Zhejiang University)*; Yiyi Liao (MPI-IS and University of Tübingen); Zhong Fangxun (CUHK); Rong Xiong (Zhejiang University); Yunhui Liu (CUHK); Yue Wang (Zhejiang University) |
4681 | ScaleNet: Searching for the Model to Scale | Jiyang Xie (Huawei Noah’s Ark Lab); Xiu Su (University of Sydney); Shan You (SenseTime); Zhanyu Ma (Beijing University of Posts and Telecommunications)*; Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime) |
4684 | Centrality and Consistency: Two-Stage Clean Samples Identification for Learning with Instance-Dependent Noisy Labels | Ganlong Zhao (The University of Hong Kong); Guanbin Li (Sun Yat-sen University)*; Yipeng Qin (Cardiff University); Feng Liu (Deepwise AI Lab); Yizhou Yu (The University of Hong Kong) |
4685 | GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing | Sijie Zhu (University of Central Florida)*; Zhe Lin (Adobe Research); Scott Cohen (Adobe Research); Jason Kuen (Adobe Research); Zhifei Zhang (Adobe Research); Chen Chen (University of Central Florida) |
4688 | FairGRAPE: Fairness-aware GRAdient Pruning mEthod for Face Attribute Classification | Xiaofeng Lin (University of California – Los Angeles); Seungbae Kim (University of South Florida); Jungseock Joo (University of California Los Angeles)* |
4697 | Tackling Background Distraction in Video Object Segmentation | Suhwan Cho (Yonsei University)*; Heansung Lee (Yonsei University); Minhyeok Lee (Yonsei University); Chaewon Park (Yonsei University); Sungjun Jang (Yonsei University); Minjung Kim (Yonsei University); Sangyoun Lee (Yonsei University) |
4700 | Hyperspherical Learning in Multi-Label Classification | Bo Ke (Tencent Youtu Lab)*; Yunquan Zhu (Tencent Youtu Lab); Mengtian Li (East China Normal University); Xiujun Shu (Tencent Youtu Lab); Ruizhi Qiao (Tencent Youtu Lab); Bo Ren (Tencent) |
4705 | The Surprisingly Straightforward Scene Text Removal Method With Gated Attention and Region of Interest Generation: A Comprehensive Prominent Model Analysis | Hyeonsu Lee (Naver Corporation)*; Chankyu Choi (Naver Corporation) |
4708 | FingerprintNet: Synthesized Fingerprints for Generated Image Detection | Yonghyun Jeong (NAVER CLOVA)*; Doyeon Kim (Line+); Youngmin Ro (Samsung SDS); pyounggeon kim (SDS); Jongwon Choi (Chung-Ang University) |
4715 | ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild | Wang Zhao (Tsinghua University)*; Shaohui Liu (ETH Zurich); Hengkai Guo (ByteDance AI Lab); Wenping Wang (The University of Hong Kong); Yong-Jin Liu (Tsinghua University) |
4721 | Free-Viewpoint RGB-D Human Performance Capture and Rendering | Phong Ha Nguyen (University of Oulu)*; Nikolaos Sarafianos (Facebook Reality Labs); Christoph Lassner (Meta Reality Labs Research); Janne Heikkila (University of Oulu, Finland); Tony Tung (Facebook) |
4727 | When Active Learning Meets Implicit Semantic Data Augmentation | Zhuangzhuang Chen (Shenzhen University); Jin Zhang (Shenzhen University); Pan Wang (Shenzhen University); Jie Chen (Shenzhen University); Jianqiang Li (Shenzhen University)* |
4733 | Multiview Regenerative Morphing with Dual Flows | Chih-Jung Tsai (National Tsing Hua University); Cheng Sun (National Tsing Hua University); Hwann-Tzong Chen (National Tsing Hua University)* |
4734 | Frequency and Spatial Dual Guidance for Image Dehazing | Hu Yu (University of Science and Technology of China); Naishan Zheng (University of Science and Technology of China); man zhou (University of Science and Technology of China); Jie Huang (University of Science and Technology of China); Zeyu Xiao (University of Science and Technology of China); Feng Zhao (University of Science and Technology of China)* |
4736 | The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing | Dawit Mureja Argaw (KAIST)*; Fabian Caba (Adobe Research); Joon-Young Lee (Adobe Research); Markus Woodson (Adobe); In So Kweon (KAIST) |
4739 | Hallucinating Pose-Compatible Scenes | Tim Brooks (UC Berkeley)*; Alexei A Efros (UC Berkeley) |
4748 | Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection | Hang Ye (Peking University); Wentao Zhu (Peking University)*; Chunyu Wang (Microsoft Research asia); Rujie Wu (Peking University); Yizhou Wang (PKU) |
4754 | Video Interpolation by Event-driven Anisotropic Adjustment of Optical Flow | Song Wu (Huawei Technologies Co., Ltd.); Kaichao You (Tsinghua Univ); Weihua He (Tsinghua University)*; Chen Yang (Peking University); Yang Tian (Tsinghua University); Yaoyuan Wang (Huawei Technologies Co., Ltd.); Jianxing Liao (HUAWEI TECHNOLOGIES CO.LTD); Ziyang Zhang (HUAWEI TECHNOLOGIES CO.LTD) |
4761 | Motion and Appearance Adaptation for Cross-Domain Motion Transfer | Borun Xu (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Jinhong Deng (University of Electronic Science and Technology of China); Jiale Tao (University of Electronic Science and Technology of China); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China) |
4762 | AdaBin: Improving Binary Neural Networks with Adaptive Binary Sets | Zhijun Tu (Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong university)*; Xinghao Chen (Huawei Noah’s Ark Lab); Pengju Ren (Institute of Artificial Intelligence at Xi’an Jiaotong University); Yunhe Wang (Huawei Technologies) |
4781 | Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation | Abduallah A Mohamed (Meta)*; Deyao Zhu (King Abdullah University of Science and Technology); Warren Vu (The University of Texas at Austin); Mohamed Elhoseiny (KAUST); Christian Claudel (The university of Texas at Austin) |
4788 | A Generalized & Robust Framework For Timestamp Supervision in Temporal Action Segmentation | Rahul Rahaman (National University of Singapore)*; Dipika Singhania (National University of Singapore); Alex Thiery (National University of Singapore); Angela Yao (National University of Singapore) |
4790 | A Deep Moving-camera Background Model | Guy Erez (Ben Gurion University)*; Ron A Shapira Weber (Ben-Gurion University); Oren Freifeld (Ben-Gurion University) |
4800 | DLME: Deep Local-flatness Manifold Embedding | Zelin Zang (Zhejiang University & Westlake University)*; Siyuan Li (Westlake University); di wu (Westlake University); Ge Wang (Westlake University); Kai Wang (National University of Singapore); Lei Shang (Alibaba Group); Baigui Sun (Alibaba Group); Hao Li (Alibaba Group); Stan Z. Li (Westlake University) |
4802 | Neural Video Compression using GANs for Detail Synthesis and Propagation | Fabian Mentzer (Google)*; Eirikur Agustsson (Google); Johannes Ballé (Google); David Minnen (Google Inc.); Nick Johnston (Google); George Toderici (Google Research) |
4804 | Few-shot Action Recognition with Hierarchical Matching and Contrastive Learning | Sipeng Zheng (Renmin University of China)*; Shizhe Chen (INRIA); Qin Jin (Renmin University of China) |
4807 | Perspective Flow Aggregation for Data-Limited 6D Object Pose Estimation | Yinlin Hu (EPFL)*; Pascal Fua (EPFL, Switzerland); Mathieu Salzmann (EPFL) |
4820 | TALISMAN: Targeted Active Learning for Object Detection with Rare Classes and Slices using Submodular Mutual Information | Suraj Kothawade (UT Dallas)*; Saikat Ghosh (University of Texas at Dallas); Sumit Shekhar (Adobe Research); Yu Xiang (The University of Texas at Dallas); Rishabh Iyer (University of Texas at Dallas) |
4826 | New Datasets and Models for Contextual Reasoning in Visual Dialog | Yifeng Zhang (University of Minnesota, Twin Cities); Ming Jiang (University of Minnesota); Qi Zhao (University of Minnesota)* |
4828 | Remote Respiration Monitoring of Moving Person Using Radio Signals | Jae-Ho Choi (Pohang University of Science and Technology)*; KIBONG KANG (POSTECH); Kyung-Tae Kim (Pohang University of Science and Technology) |
4832 | AdvDO: Realistic Adversarial Attacks for Trajectory Prediction | Yulong Cao (University of Michigan, Ann Arbor)*; Chaowei Xiao (NVIDIA); Anima Anandkumar (NVIDIA/Caltech); Danfei Xu (Stanford University); Marco Pavone (Stanford University) |
4836 | Cross-Modality Transformer for Visible-Infrared Person Re-Identification | Kongzhu Jiang (University of Science and Technology of China)*; Tianzhu Zhang (University of Science and Technology of China); Xiang Liu (Dongguan University of Technology); Bingqiao Qian (University of Science and Technology of China); Yongdong Zhang (University of Science and Technology of China); Feng Wu (University of Science and Technology of China) |
4849 | VL-LTR: Learning Class-wise Visual-Linguistic Representation for Long-Tailed Visual Recognition | Changyao Tian (Chinese University of Hong Kong); Wenhai Wang (Nanjing University); Xizhou Zhu (SenseTime); Jifeng Dai (SenseTime)*; Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences) |
4857 | Self-Supervised Classification Network | Elad Amrani (IBM / Technion)*; Leonid Karlinsky (IBM-Research); Alex Bronstein (Technion) |
4865 | DevNet: Self-supervised Monocular Depth Learning via Density Volume Construction | Kaichen Zhou (University of Oxford)*; Lanqing Hong (Huawei Noah’s Ark Lab); Changhao Chen (National University of Defense Technology); Hang Xu (Huawei Noah’s Ark Lab); Chaoqiang Ye (Huawei); Qingyong Hu (University of Oxford); Zhenguo Li (Huawei Noah’s Ark Lab) |
4872 | Bayesian Optimization with Clustering and Rollback for CNN Auto Pruning | Hanwei FAN (HKUST)*; Jiandong MU (HKUST); Wei Zhang (Hong Kong University of Science and Technology) |
4873 | Towards Real-World HDRTV Reconstruction: A Data Synthesis-based Approach | Zhen Cheng (University of Science and Technology of China)*; Tao Wang (Huawei Noah’s Ark Lab); Yong Li (Huawei Noah’s Ark Lab); Fenglong Song (Huawei Noah’s Ark Lab); Chang Chen (Huawei Noah’s Ark Lab); Zhiwei Xiong (University of Science and Technology of China) |
4874 | Quantum Motion Segmentation | Federica Arrigoni (University of Trento)*; Willi Menapace (University of Trento); Marcel Seelbach Benkner (University of Siegen); Elisa Ricci (University of Trento); Vladislav Golyanik (MPI for Informatics) |
4879 | Open-world Semantic Segmentation via Contrasting and Clustering Vision-language Embedding | Quande Liu (The Chinese University of Hong Kong)*; Youpeng Wen (Dalian University of Technology); Jianhua Han (Huawei Noah’s Ark Lab); Chunjing Xu (Huawei Noah’s Ark Lab); Hang Xu (Huawei Noah’s Ark Lab); Xiaodan Liang (Sun Yat-sen University) |
4880 | Custom Structure Preservation in Face Aging | Guillermo Gomez-Trenado (University of Granada)*; Stéphane Lathuilière (Telecom-Paris); Pablo Mesejo (University of Granada); Oscar Cordón García (University of Granada) |
4883 | DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks | Shih-Yang Su (University of British Columbia)*; Timur Bagautdinov (Facebook); Helge Rhodin (UBC) |
4888 | Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization | Jiaxin Qi (Nanyang Technological University)*; Kaihua Tang (Nanyang Technological University); Qianru Sun (Singapore Management University); Xian-Sheng Hua (Damo Academy, Alibaba Group); Hanwang Zhang (Nanyang Technological University) |
4891 | Spatio-Temporal Deformable Attention Network for Video Deblurring | Huicong Zhang (Harbin Institute of Technology)*; Haozhe Xie (Tencent AI Lab); Hongxun Yao (Harbin Institute of Technology) |
4894 | CHORE: Contact, Human and Object REconstruction from a single RGB image | Xianghui Xie (Saarland University)*; Bharat Lal Bhatnagar (University of Tübingen, MPI informatik); Gerard Pons-Moll (University of Tübingen) |
4899 | Complementing Brightness Constancy with Deep Networks for Optical Flow Prediction | Vincent LE GUEN (EDF R&D, CNAM)*; Clément Rambour (Cnam); Nicolas Thome (CNAM, Paris) |
4902 | Learning Discriminative Shrinkage Deep Networks for Image Deconvolution | Pin-Hung Kuo (National Taiwan University)*; Jinshan Pan (Nanjing University of Science and Technology); Shao-Yi Chien (National Taiwan University); Ming-Hsuan Yang (University of California at Merced) |
4904 | Camera Pose Estimation and Localization with Active Audio Sensing | Karren D Yang (MIT); Michael Firman (Niantic); Eric Brachmann (Niantic)*; Clement LJC Godard (Niantic) |
4906 | Learning Efficient Multi-Agent Cooperative Visual Exploration | Chao Yu (Tsinghua University); Xinyi Yang (Tsinghua University)*; Jiaxuan Gao (Tsinghua University); Huazhong Yang (Tsinghua University); Yu Wang (Tsinghua University); Yi Wu (Tsinghua University) |
4908 | 4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding | Yujin Chen (Technical University of Munich)*; Matthias Niessner (Technical University of Munich); Angela Dai (Technical University of Munich) |
4918 | Learned Vertex Descent: A New Direction for 3D Human Model Fitting | Enric Corona (IRI)*; Gerard Pons-Moll (University of Tübingen); Guillem Alenyà (IRI); Francesc Moreno (IRI) |
4921 | Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection | Gaoang Wang (Zhejiang University); Yibing Zhan (JD Explore Academy); Xinchao Wang (National University of Singapore); Mingli Song (Zhejiang University)*; Klara Nahrstedt (University of Illinois at Urbana-Champaign) |
4927 | Learning to Fit Morphable Models | Vasileios Choutas (ETH Zurich)*; Federica Bogo (Meta); Jingjing Shen (Microsoft); Julien Valentin (Microsoft) |
4929 | Few-Shot Classification with Contrastive Learning | Zhanyuan Yang (Shenzhen University); Jinghua Wang (Harbin Institute of Technology); Yingying Zhu (Shenzhen University)* |
4931 | ARM: Any-Time Super-Resolution Method | Bohong Chen (Xiamen University)*; Mingbao Lin (Xiamen University, China); Kekai Sheng (Youtu Lab, Tencent Inc.); mengdan zhang (Youtu, Tencent); Peixian Chen (Youtu Tencent); Ke Li (Tencent); Liujuan Cao (Xiamen University); Rongrong Ji (Xiamen University, China) |
4933 | Tracking Every Thing in the Wild | Siyuan Li (ETH Zurich)*; Martin Danelljan (ETH Zurich); Henghui Ding (ETH Zurich); Thomas E Huang (ETH Zürich); Fisher Yu (ETH Zurich) |
4934 | Learning Self-prior for Mesh Denoising using Dual Graph Convolutional Networks | Shota Hattori (The University of Tokyo)*; Tatsuya Yatagawa (The University of Tokyo); Yutaka Ohtake (The University of Tokyo); Suzuki Hiromasa (The University of Tokyo) |
4940 | Few Zero Level Set-Shot Learning of Shape Signed Distance Functions in Feature Space | Amine Ouasfi (IMT Atlantique); Adnane Boukhayma (Inria)* |
4948 | Attention-aware Learning for Hyperparameters Prediction in Image Processing Pipelines | Haina Qin (University of Chinese Academy of Sciences); Longfei Han (Beijing Technology and Business University); Juan Wang (Institute of Automation, Chinese Academy of Sciences); Congxuan Zhang (Nanchang Hangkong University); Bing Li (National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences)*; Weiming Hu (Institute of Automation, Chinese Academy of Sciences); Yanwei Li (Zeku Technology (Shanghai) Corp., Ltd.) |
4950 | Attaining Class-level Forgetting in Pretrained Model using Few Samples | Pravendra Singh (IIT Roorkee); Pratik Mazumder (Indian Institute of Technology Jodhpur)*; Mohammed Asad Karim (Carnegie Mellon University) |
4951 | Data Invariants to Understand Unsupervised Out-of-Distribution Detection | Lars Doorenbos (University of Bern)*; Raphael Sznitman (University of Bern); Pablo Márquez Neila (University of Bern) |
4953 | STEEX: Steering Counterfactual Explanations with Semantics | Paul Jacob (École Polytechnique); Eloi Zablocki (Valeo.ai)*; Hedi Ben-younes (Valeo AI); Mickael Chen (valeo.ai); Patrick Pérez (Valeo.ai); Matthieu Cord (Sorbonne University) |
4958 | Outpainting by Queries | Kai Yao (Xi’an Jiaotong-Liverpool University); Penglei Gao (Xi’an Jiaotong-Liverpool University); Xi Yang (Xi’an Jiaotong-Liverpool University); Jie Sun (Xi’an Jiaotong-Liverpool University); Rui Zhang (Xi’an Jiaotong-Liverpool University); Kaizhu Huang (Duke Kunshan University)* |
4961 | HULC: 3D HUman Motion Capture with Pose Manifold SampLing and Dense Contact Guidance | Soshi Shimada (MPI for Informatics)*; Vladislav Golyanik (MPI for Informatics); Zhi Li (Max Planck Institute for Informatics); Patrick Pérez (Valeo.ai); Weipeng Xu (Reality Labs Research); Christian Theobalt (MPI Informatik) |
4962 | Interpretable Open-Set Domain Adaptation via Angular Margin Separation | Xinhao Li (University of Electronic Science and Technology of China); Jingjing Li (University of Electronic Science and Technology of China)*; Zhekai Du (University of Electronic Science and Technology of China); Lei Zhu (Shandong Normal University); Wen Li (University of Electronic Science and Technology of China) |
4963 | EgoBody: Human Body Shape and Motion of Interacting People from Head-Mounted Devices | Siwei Zhang (ETH Zurich)*; Qianli Ma (Max Planck Institute for Intelligent Systems); Yan Zhang (ETH Zurich); Zhiyin Qian (ETH Zürich); Taein Kwon (ETH Zurich); Marc Pollefeys (ETH Zurich / Microsoft); Federica Bogo (Meta); Siyu Tang (ETH Zurich) |
4966 | ViTAS: Vision Transformer Architecture Search | Xiu Su (University of Sydney); Shan You (SenseTime)*; Jiyang Xie (Huawei Noah’s Ark Lab); Mingkai Zheng (The University of Sydney); Fei Wang (University of Science and Technology of China); Chen Qian (SenseTime); Changshui Zhang (Tsinghua University); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Chang Xu (University of Sydney) |
4970 | LaLaLoc++: Global Floor Plan Comprehension for Layout Localisation in Unvisited Environments | Henry Howard-Jenkins (University of Oxford)*; Victor Adrian Prisacariu (University of Oxford) |
4972 | diffConv: Analyzing Irregular Point Clouds with an Irregular View | Manxi Lin (Technical University of Denmark)*; Aasa Feragen (Technical University of Denmark) |
4975 | ReAct: Temporal Action Detection with Relational Action Queries | Dingfeng Shi (Beihang University)*; Yujie Zhong (University of Oxford); Qiong Cao (JD.com); Jing Zhang (The University of Sydney); Lin Ma (Meituan); Jia Li (Beihang University); Dacheng Tao (JD.com) |
4976 | StyleBabel: Artistic Style Tagging and Captioning | Dan Ruta (University of Surrey)*; Andrew Gilbert (University of Surrey); Pranav V Aggarwal (Adobe Inc.); Naveen Marri (Adobe Inc); Ajinkya Kale (Adobe); Jo Briggs (University of Northumbria); Chris Speed (University of Edinburgh); Hailin Jin (Adobe Research); Baldo Faieta (Adobe); Alex Filipkowski (Adobe); Zhe Lin (Adobe Research); John Collomosse (Adobe Research) |
4977 | TACS: Taxonomy Adaptive Cross-Domain Semantic Segmentation | RUI GONG (ETH Zurich)*; Martin Danelljan (ETH Zurich); Dengxin Dai (ETH Zurich); Danda Pani Paudel (ETH Zürich); Ajad Chhatkuli (ETH Zurich); Fisher Yu (ETH Zurich); Luc Van Gool (ETH Zurich) |
4983 | Domain Invariant Autoencoders for Self-supervised Learning from Multi-domains | Haiyang Yang (Nanjing University)*; Shixiang Tang (The University of Sydney); Meilin Chen (Zhejiang University); Yizhou Wang (Zhejiang University); Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Wanli Ouyang (The University of Sydney) |
4987 | Learned Variational Video Color Propagation | Markus Hofinger (Graz University of Technology)*; Erich Kobler (University Hospital Bonn); Alexander Effland (University of Bonn); Thomas Pock (Graz University of Technology) |
4988 | PD-Flow: A Point Cloud Denoising Framework with Normalizing Flows | Aihua Mao (South China University of Technology)*; Zihui Du (South China University of Technology); Yu-Hui Wen (Tsinghua University); Jun Xuan (South China University of Technology); Yong-Jin Liu (Tsinghua University) |
4992 | Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation | ZhengKai Jiang (Tencent Youtu Lab)*; Yuxi Li (Tencent); Ceyuan Yang (Chinese University of Hong Kong); Peng Gao (Chinese university of hong kong); Yabiao Wang (Tencent); Ying Tai (Tencent YouTu); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
4996 | Adversarial Contrastive Learning via Asymmetric InfoNCE | Qiying Yu (Tsinghua University)*; Jieming Lou (Harbin Institute of Technology); Xianyuan Zhan (Tsinghua University); Qizhang Li (Harbin Institute of Technology); Wangmeng Zuo (Harbin Institute of Technology, China); Yang Liu (Tsinghua University); Jingjing Liu (Tsinghua University) |
4998 | NeRF for Outdoor Scene Relighting | Viktor Rudnev (Max Planck Institute for Informatics)*; Mohamed Elgharib (Max Planck Institute for Informatics); William Smith (University of York); Lingjie Liu (Max Planck Institute for Informatics); Vladislav Golyanik (MPI for Informatics); Christian Theobalt (MPI Informatik) |
5001 | FusionVAE: A Deep Hierarchical Variational Autoencoder for RGB Image Fusion | Fabian Duffhauss (Bosch Center for Artificial Intelligence)*; Vien Anh Ngo (Bosch Center for Artificial Intelligence); Hanna Ziesche (Bosch Center for AI); Gerhard Neumann (Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany) |
5007 | Self-calibrating Photometric Stereo by Neural Inverse Rendering | Junxuan Li (Australian National University)*; HONGDONG LI (Australian National University, Australia) |
5009 | Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection | Shan Zhang (Australian National University); Naila Murray (Naver Labs); Lei Wang (University of Wollongong, Australia); Piotr Koniusz (ANU College of Engineering and Computer Science)* |
5017 | Detecting Generated Images by Real Images | Bo Liu (Chongqing University of Posts and Telecommunications); Fan Yang (Chongqing University of Posts and Telecommunications); Xiuli Bi (Chongqing University of Posts and Telecommunications); Bin Xiao (Chongqing University of Posts and Telecommunications)*; Weisheng Li (Chongqing University of Posts and Telecommunications); Xinbo Gao (Chongqing University of Posts and Telecommunications) |
5018 | VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection | Joanna Hong (KAIST)*; Minsu Kim (KAIST); Yong Man Ro (KAIST) |
5020 | Delta Distillation for Efficient Video Processing | Amirhossein Habibian (Qualcomm AI Research)*; Haitam Ben Yahia (Qualcomm AI Research); Davide Abati (Qualcomm AI Research); Efstratios Gavves (University of Amsterdam); Fatih Porikli (Qualcomm AI Research) |
5026 | PANDORA: A Panoramic Detection Dataset for Object with Orientation | Hang Xu (Hangzhou Dianzi University;The Institute of Computing Technology of the Chinese Academy of Sciences); Qiang Zhao (The Institute of Computing Technology of the Chinese Academy of Sciences); Yike Ma (Institute of Computing Technology, Chinese Academy of Sciences); Xiaodong Li (Huawei Noah’s Ark Lab); Peng Yuan (Huawei Noah’s Ark Lab); Bailan Feng (Huawei Noah’s Ark Lab); Chenggang Yan (Hangzhou Dianzi University); Feng Dai (Institute of Computing Technology, Chinese Academy of Sciences)* |
5032 | Instance As Identity: A Generic Online Paradigm for Video Instance Segmentation | Feng Zhu (University of Technology Sydney)*; Zongxin Yang (Zhejiang University); Xin Yu (University of Technology Sydney); Yi Yang (Zhejiang University); Yunchao Wei (UTS) |
5034 | Audio-Visual Mismatch-Aware Video Retrieval via Association and Adjustment | Sangmin Lee (KAIST)*; Sungjune Park (KAIST); Yong Man Ro (KAIST) |
5036 | 3D Clothed Human Reconstruction in the Wild | Gyeongsik Moon (Seoul National University); Hyeongjin Nam (Seoul National University); Takaaki Shiratori (Meta Reality Labs Research); Kyoung Mu Lee (Seoul National University)* |
5040 | Classification-Regression for Chart Comprehension | Matan Levy (The Hebrew University of Jerusalem)*; Rami Ben-Ari (OriginAI); Dani Lischinski (The Hebrew University of Jerusalem) |
5042 | Zero-Shot Category-Level Object Pose Estimation | Walter Goodwin (University of Oxford)*; Sagar Vaze (Visual Geometry Group, University of Oxford); Ioannis Havoutis (Oxford Robotics Institute, University of Oxford); Ingmar Posner (Oxford University) |
5044 | AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant | Benita Wong (National University of Singapore)*; Joya Chen (National University of Singapore); You Wu (Harvard University); Stan Weixian Lei (National University of Singapore); Dongxing Mao (National University of Singapore); Difei Gao (NUS); Mike Zheng Shou (National University of Singapore) |
5047 | Laplace Mesh Transformer: Dual Attention and Topology Aware Network for 3D mesh Classification and Segmentation | Xiao-Juan Li (Institute of Computing Technology, Chinese Academy of Sciences); Jie Yang (Institute of Computing Technology, Chinese Academy of Sciences)*; Fang-Lue Zhang (Victoria University of Wellington) |
5048 | CoMER: Modeling Coverage for Transformer-based Handwritten Mathematical Expression Recognition | Wenqi Zhao (Peking University)*; Liangcai Gao (Peking University) |
5049 | RBC: Rectifying the Biased Context in Continual Semantic Segmentation | Hanbin Zhao (Zhejiang University)*; Fengyu Yang (University of Michigan); Xinghe Fu (Zhejiang University); Xi Li (Zhejiang University) |
5051 | Don’t Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context | Chongyu Liu (South China University of Technology); Lianwen Jin (South China University of Technology)*; Yuliang Liu (Huazhong University of Science and Technology); Canjie Luo (South China University of Technology); Bangdong Chen (South China University of Technology); Fengjun Guo (IntSig Information Co. Ltd); Kai Ding (IntSig Information Co., Ltd) |
5066 | Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching | Jiazhen Liu (Renmin University of China); Xirong Li (Renmin University of China)*; Qijie Wei (Vistel Inc.); Jie Xu (Beijing Tongren Hospital); Dayong Ding (Vistel Inc.) |
5069 | Memory-Augmented Model-Driven Network for Pansharpening | Keyu Yan (Hefei Institutes of Physical Science, Chinese Academy of Sciences)*; Man Zhou (Chinese Academy of Sciences); Li Zhang (Chinese Academy of Sciences); Chengjun Xie (Institute of Intelligent Machines, Chinese Academy of Sciences, China) |
5076 | Factorizing Knowledge in Neural Networks | Xingyi Yang (National University of Singapore)*; Jingwen Ye (National University of Singapore); Xinchao Wang (National University of Singapore) |
5081 | Unleashing Transformers: Parallel Token Prediction with Discrete Absorbing Diffusion for Fast High-Resolution Image Generation from Vector-Quantized Codes | Sam Bond-Taylor (Durham University)*; Peter Hessey (Durham University); Hiroshi Sasaki (Durham University); Toby P Breckon (Durham University); Chris G. Willcocks (Durham University) |
5082 | Contrastive Vicinal Space for Unsupervised Domain Adaptation | Jaemin Na (Ajou University)*; Dongyoon Han (NAVER AI Lab); Hyung Jin Chang (University of Birmingham); Wonjun Hwang (Ajou University) |
5083 | Weight Fixing Networks | Chris Subia-Waud (University of Southampton)*; Srinandan Dasmahapatra (University of Southampton) |
5088 | Sim-to-Real 6D Object Pose Estimation via Iterative Self-training for Robotic Bin Picking | Kai Chen (The Chinese University of Hong Kong); Rui Cao (The Chinese University of Hong Kong); Stephen L James (UC Berkeley); YICHUAN LI (CUHK); Yunhui Liu (CUHK); Pieter Abbeel (UC Berkeley); Qi Dou (The Chinese University of Hong Kong)* |
5092 | ChunkyGAN: Real Image Inversion via Segments | Adéla Šubrtová (Czech Technical University); David Futschik (Czech Technical University in Prague, FEE); Jan Čech (Czech Technical University in Prague); Michal Lukáč (Adobe Research); Eli Shechtman (Adobe Research, US); Daniel Sýkora (Czech Technical University in Prague)* |
5099 | Towards Sequence-Level Training for Visual Tracking | Minji Kim (Seoul National University)*; Seungkwan Lee (POSTECH); Jungseul Ok (POSTECH); Bohyung Han (Seoul National University); Minsu Cho (POSTECH) |
5111 | Scale-aware Spatio-temporal Relation Learning for Video Anomaly Detection | Guoqiu Li (Tsinghua Shenzhen International Graduate School, Tsinghua University)*; Guanxiong Cai (Shenzhen SenseTime Technology Co., Ltd); Xingyu ZENG (SenseTime Group Limited); Rui Zhao (SenseTime Group Limited) |
5114 | Tracking by Associating Clips | Sanghyun Woo (KAIST)*; Kwanyong Park (KAIST); Seoung Wug Oh (Adobe Research); In So Kweon (KAIST); Joon-Young Lee (Adobe Research) |
5117 | An Information Theoretic Approach for Attention-Driven Face Forgery Detection | Ke Sun (Xiamen University)*; Hong Liu (National Institute of Informatics); Taiping Yao (Tencent YouTu); Xiaoshuai Sun (Xiamen University); Shen Chen (Tencent YouTu Lab); Shouhong Ding (Tencent); Rongrong Ji (Xiamen University, China) |
5118 | Compound Prototype Matching for Few-shot Action Recognition | Yifei Huang (The University of Tokyo)*; Lijin Yang (The University of Tokyo); Yoichi Sato (University of Tokyo) |
5119 | Self-Promoted Supervision for Few-Shot Transformer | Bowen Dong (Harbin Institute of Technology); Pan Zhou (NUS); Shuicheng Yan (National University of Singapore, Department of Electrical and Computer Engineering); Wangmeng Zuo (Harbin Institute of Technology, China)* |
5122 | Completely Self-Supervised Crowd Counting via Distribution Matching | Deepak Babu Sam (Indian Institute of Science)*; Abhinav Agarwalla (Carnegie Mellon University); Jimmy Joseph (Stony Brook University); Vishwanath Sindagi (Johns Hopkins University); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science); Vishal Patel (Johns Hopkins University) |
5123 | Geodesic-Former: a Geodesic-Guided Few-shot 3D Point Cloud Instance Segmenter | Tuan Duc Ngo (VinAI Research)*; Khoi Nguyen (VinAI Research) |
5127 | SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer | Haoran Zhou (Nanjing University)*; Yun Cao (Tencent); Wenqing Chu (Tencent); Junwei Zhu (Tencent); Tong Lu (Nanjing University); Ying Tai (Tencent YouTu); Chengjie Wang (Tencent; Shanghai Jiao Tong University) |
5129 | 3D-PL: Domain Adaptive Depth Estimation with 3D-aware Pseudo-Labeling | Yu-Ting Yen (National Chiao Tung University, Phiar Technologies)*; Chia-Ni Lu (National Chiao Tung University ); Wei-Chen Chiu (National Chiao Tung University); Yi-Hsuan Tsai (Phiar Technologies) |
5136 | Towards Accurate Active Camera Localization | Qihang Fang (Shandong University); Yingda Yin (Peking University); Qingnan Fan (Tencent AI Lab)*; Fei Xia (Google Inc); Siyan Dong (Shandong University); Sheng Wang (3vjia); Jue Wang (Tencent AI Lab); Leonidas Guibas (Stanford University); Baoquan Chen (Peking University) |
5138 | Few-shot Object Counting and Detection | Thanh Van Nguyen (VinAI Research)*; Chau Hai Pham (VinAI Research); Khoi Nguyen (VinAI Research); Minh Hoai (Stony Brook University) |
5140 | RealPatch: A Statistical Matching Framework for Model Patching with Real Samples | Sara Romiti (University of Sussex)*; Christopher Inskip (University of Sussex); Viktoriia Sharmanska (University of Sussex and Imperial College London); Novi Quadrianto (University of Sussex and Basque Center for Applied Mathematics) |
5144 | GAN Cocktail: mixing GANs without dataset access | Omri Avrahami (The Hebrew University of Jerusalem)*; Dani Lischinski (The Hebrew University of Jerusalem); Ohad Fried (IDC Herzliya) |
5156 | Coarse-To-Fine Incremental Few-Shot Learning | Xiang Xiang (Huazhong University of Science and Technology)*; Yuwen Tan (Huazhong University of Science and Technology); Qian Wan (Wuhan Research Institute of Posts and Telecommunications); Jing Ma (Huazhong University of Science and Technology); Alan Yuille (Johns Hopkins University); Gregory D. Hager (The Johns Hopkins University) |
5157 | Learning Unbiased Transferability for Domain Adaptation by Uncertainty Modeling | Jian Hu (Queen Mary University of London)*; Haowen Zhong (Zhejiang Lab); Fei Yang (Zhejiang Lab); Shaogang Gong (Queen Mary University of London); Guile Wu (Queen Mary University of London); Junchi Yan (Shanghai Jiao Tong University) |
5158 | Camera Pose Auto-Encoders for Improving Pose Regression | Yoli Shavit (Faculty of Engineering, Bar Ilan University); Yosi Keller (Bar Ilan University)* |
5160 | CoGS: Controllable Generation and Search from Sketch and Style | Cusuh Ham (Georgia Institute of Technology)*; Gemma Canet Tarrés (CVSSP, University of Surrey); Tu Bui (University of Surrey); James Hays (Georgia Institute of Technology, USA); Zhe Lin (Adobe Research); John Collomosse (Adobe Research) |
5172 | Active Audio-Visual Separation of Dynamic Sound Sources | Sagnik Majumder (University of Texas at Austin)*; Kristen Grauman (Facebook AI Research & UT Austin) |
5175 | AU-aware 3D Face Reconstruction through Personalized AU-specific Blendshape Learning | Chenyi Kuang (Rensselaer Polytechnic Institute)*; Zijun Cui (Rensselaer Polytechnic Institute); Jeffrey Kephart (IBM Research, USA); Qiang Ji (Rensselaer Polytechnic Institute) |
5180 | Directed Ray Distance Functions for 3D Scene Reconstruction | Nilesh Kulkarni (University of Michigan)*; Justin Johnson (University of Michigan); David Fouhey (University of Michigan) |
5189 | Background-Insensitive Scene Text Recognition with Text Semantic Segmentation | Liang Zhao (University of South Carolina)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Greg Wilsbacher (University of South Carolina); Song Wang (University of South Carolina) |
5198 | Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering | Mingfei Chen (University of Washington)*; Jianfeng Zhang (NUS); Xiangyu Xu (Sea AI Lab); Lijuan Liu (SEA AI LAB); Yujun Cai (Nanyang Technological University); Jiashi Feng (ByteDance); Shuicheng Yan (Sea AI Labs) |
5207 | MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning | David Junhao Zhang (National University of Singapore)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yunpeng Chen (National University of Singapore); Shashwat Chandra (National University of Singapore); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Luoqi Liu (meitu); Mike Zheng Shou (National University of Singapore) |
5211 | Continual Variational Autoencoder Learning via Online Cooperative Memorization | Fei Ye (University of York)*; Adrian Bors (University of York) |
5215 | Semantic Novelty Detection via Relational Reasoning | Francesco Cappio Borlino (Politecnico di Torino); Silvia Bucci (Italian Institute of Technology)*; Tatiana Tommasi (Politecnico di Torino) |
5217 | FindIt: Generalized Localization with Natural Language Queries | Weicheng Kuo (Google)*; Fred Bertsch (Google); Wei Li (GOOGLE INC); AJ Piergiovanni (Google); Mohammad Saffar (Google); Anelia Angelova (Google) |
5224 | SelectionConv: Convolutional Neural Networks for Non-rectilinear Image Data | David M Hart (Brigham Young University)*; Michael Whitney (Brigham Young University); Bryan S Morse (Brigham Young University) |
5227 | HairNet: Hairstyle Transfer with Pose Changes | Peihao Zhu (KAUST)*; Rameen Abdal (KAUST); JOHN C FEMIANI (Miami University); Peter Wonka (KAUST) |
5234 | Learn2Augment: Learning to Composite Videos for Data Augmentation in Action Recognition | Shreyank N Gowda (University of Edinburgh)*; Marcus Rohrbach (Facebook AI Research); Frank Keller (University of Edinburgh); Laura Sevilla-Lara (Facebook) |
5235 | Action-based Contrastive Learning for Trajectory Prediction | Marah Halawa (Technische Universität Berlin)*; Olaf Hellwich (Technical University Berlin); Pia Bideau (TU Berlin) |
5240 | Scaling Open-vocabulary Image Segmentation with Image-level Labels | Golnaz Ghiasi (Google Brain)*; Xiuye Gu (Google); Yin Cui (Google); Tsung-Yi Lin (Nvidia Research) |
5247 | Improving Closed and Open-Vocabulary Attribute Prediction using Transformers | Khoi Pham (University of Maryland, College Park)*; Kushal Kafle (Adobe Research); Zhe Lin (Adobe Research); Zhihong Ding (Adobe Research); Scott Cohen (Adobe Research); Quan Hung Tran (Adobe Research); Abhinav Shrivastava (University of Maryland) |
5251 | FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context | Pinaki Nath Chowdhury (University of Surrey)*; Aneeshan Sain (University of Surrey); Ayan Kumar Bhunia (University of Surrey); Tao Xiang (University of Surrey); Yulia Gryaditskaya (University of Surrey); Yi-Zhe Song (University of Surrey) |
5252 | A Contrastive Objective for Learning Disentangled Representations | Jonathan Kahana (Hebrew University of Jerusalem)*; Yedid Hoshen (The Hebrew University of Jerusalem) |
5256 | Unbiased Multi-Modality Guidance for Image Inpainting | Yongsheng YU (University of Chinese Academy of Sciences); Dawei Du (Kitware, Inc.)*; Libo Zhang (Institute of Software Chinese Academy of Sciences); Tiejian Luo (University of Chinese Academy of Sciences) |
5257 | Learned Monocular Depth Priors in Visual-Inertial Initialization | Yunwen Zhou (Google)*; Abhishek Kar (Google); Eric L Turner (GOOGLE LLC); Adarsh Kowdle (Google); Chao Guo (Google Inc.); Ryan DuToit (Google); Konstantine Tsotsos (Google) |
5261 | DexMV: Imitation Learning for Dexterous Manipulation from Human Videos | Yuzhe Qin (University of California San Diego)*; Yueh-Hua Wu (UCSD); Shaowei Liu (UIUC); Hanwen Jiang (UT Austin); Ruihan Yang (UC San Diego); Yang Fu (UCSD); Xiaolong Wang (UCSD) |
5265 | Exploring Fine-grained Audiovisual Categorization with the SSW60 Dataset | Grant Van Horn (Cornell University)*; Rui Qian (Cornell University); Kimberly Wilber (Google); Hartwig Adam (Google); Oisin Mac Aodha (University of Edinburgh); Serge Belongie (University of Copenhagen) |
5266 | Radatron: Accurate Detection Using Multi-Resolution Cascaded MIMO Radar | Sohrab Madani (UIUC)*; Junfeng Guan (UIUC); Waleed Ahmed (UIUC); Saurabh Gupta (UIUC); Haitham Hassanieh (UIUC) |
5270 | COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality | Honglu Zhou (Rutgers University)*; Asim Kadav (NEC Labs); Aviv Shamsian (Bar Ilan University); Shijie Geng (Rutgers University); Farley Lai (NEC Laboratories America, Inc.); Long Zhao (Google Research); Ting Liu (Google Research); Mubbasir Kapadia (Rutgers University); Hans Peter Graf (NEC Labs) |
5272 | The Fish Counting Dataset: A Benchmark for Multiple Object Tracking and Counting | Justin Kay (Caltech, Ai.Fish); Peter Kulits (Caltech); Suzanne C Stathatos (Caltech); Siqi Deng (Amazon); Erik Young (Trout Unlimited); Sara M Beery (Caltech); Grant Van Horn (Cornell University)*; Pietro Perona (California Institute of Technology) |
5287 | Object Level Depth Reconstruction for Category Level 6D Object Pose Estimation From Monocular RGB Image | Zhaoxin Fan (Renmin University of China)*; Zhenbo Song (Nanjing University of Science and Technology); Jian Xu (Nreal); Zhicheng Wang (Nreal); Kejian Wu (Nreal); Hongyan Liu (Tsinghua University); Jun He (Renmin University of China) |
5293 | DeepMend: Learning Occupancy Functions to Represent Shape for Repair | Nikolas Lamb (Clarkson University)*; Sean Banerjee (Clarkson University); Natasha Kholgade Banerjee (Clarkson University) |
5297 | Graph Neural Network for Cell Tracking in Microscopy Videos | Tal Ben-Haim (School of Electrical and Computer Engineering, Ben-Gurion University)*; Tammy Riklin Raviv (BGU) |
5299 | Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks | Zihang Zou (University of Central Florida)*; Boqing Gong (Google); Liqiang Wang (University of Central Florida) |
5310 | PACS: A Dataset for Physical Audiovisual Commonsense Reasoning | Samuel Yu (Carnegie Mellon University)*; Peter Wu (UC Berkeley); Paul Pu Liang (Carnegie Mellon University); Ruslan Salakhutdinov (Carnegie Mellon University); Louis-Philippe Morency (Carnegie Mellon University) |
5315 | Intelli-Paint: Towards Developing More Human-Intelligible Painting Agents | Jaskirat Singh (Australian National University)*; Cameron Y Smith (Adobe Research); Jose Echevarria (Adobe System Inc.); Liang Zheng (Australian National University) |
5317 | Rethinking Few-Shot Object Detection on A Multi-Domain Benchmark | Kibok Lee (Yonsei University); Hao Yang (Amazon)*; Satyaki Chakraborty (Amazon ); Zhaowei Cai (Amazon); Gurumurthy Swaminathan (Amazon); Avinash Ravichandran (Amazon); Onkar Dabeer (Amazon) |
5318 | LidarNAS: Unifying and Searching Neural Architectures for 3D Point Clouds | Chenxi Liu (Waymo)*; Zhaoqi Leng (Waymo); Pei Sun (Waymo); Shuyang Cheng (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo) |
5325 | Improving the Intra-class Long-tail in 3D Detection via Rare Example Mining | Chiyu Jiang (Waymo)*; Mahyar Najibi (Waymo LLC); Charles R. Qi (Waymo); Yin Zhou (Waymo); Dragomir Anguelov (Waymo) |
5326 | Learning to Learn with Smooth Regularization | Yuanhao Xiong (UCLA)*; Cho-Jui Hsieh (UCLA) |
5327 | A Dataset for Interactive Vision-Language Navigation with Unknown Command Feasibility | Andrea Burns (Boston University)*; Deniz Arsan (University of Illinois at Urbana Champaign); Sanjna Agrawal (Boston University); Ranjitha Kumar (UIUC: CS); Kate Saenko (Boston University); Bryan Plummer (Boston University) |
5330 | CoVisPose: Co-Visibility Pose Transformer for Wide-Baseline Relative Pose Estimation in 360 Indoor Panoramas | Will A Hutchcroft (Zillow Group)*; Yuguang Li (Zillow Group); Ivaylo Boyadzhiev (Zillow Group); Zhiqiang Wan (Zillow); Haiyan Wang (The City College of New York); Sing Bing Kang (Zillow Group) |
5340 | PT4AL: Using Self-Supervised Pretext Tasks for Active Learning | John Seon Keun Yi (Georgia Institute of Technology)*; Minseok Seo (si-analytics); Jongchan Park (Lunit); Dong-Geol Choi (Hanbat National University) |
5351 | Uncertainty Quantification in Depth Estimation via Constrained Ordinal Regression | Dongting Hu (The University of Melbourne); Liuhua Peng (The University of Melbourne); Tingjin Chu (University of Melbourne); Xiaoxing Zhang (Meituan); Yinian Mao (Meituan-Dianping Group ); Howard Bondell (University of Melbourne); Mingming Gong (University of Melbourne)* |
5361 | All You Need is RAW: Defending Against Adversarial Attacks with Camera Image Pipelines | Yuxuan Zhang (Princeton University)*; Bo Dong (Princeton University); Felix Heide (Princeton University) |
5362 | ParC-Net: Position Aware Circular Convolution with Merits from ConvNets and Transformer | Haokui Zhang (Lighthouse Co.Ltd)*; Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen)) |
5369 | BézierPalm: A Free lunch for Palmprint Recognition | KAI ZHAO (UCLA)*; Lei Shen (Tencent); Yingyi Zhang (Tencent); Chuhan Zhou (Tencent & VIA University College); Tao Wang (Tencent YouTu Lab); Ruixin Zhang (Tencent); Shouhong Ding (Tencent); Wei Jia (Hefei University of Technology); Wei Shen (Shanghai Jiao Tong University) |
5372 | A Repulsive Force Unit for Garment Collision Handling in Neural Networks | Qingyang Tan (UMD)*; Yi Zhou (Adobe Research); Tuanfeng Wang (adobe research); Duygu Ceylan (Adobe Research); Xin Sun (Adobe Research); Dinesh Manocha (University of Maryland at College Park) |
5373 | CYBORGS: Contrastively Bootstrapping Object Representations by Grounding in Segmentation | Renhao Wang (Tsinghua University)*; Hang Zhao (Tsinghua University); Yang Gao (Tsinghua University) |
5377 | Connecting Compression Spaces with Transformer for Approximate Nearest Neighbor Search | Haokui Zhang (Lighthouse Co.Ltd)*; Buzhou Tang (Harbin Institute of Technology, China); Wenze Hu (Lighthouse Co.Ltd); Xiaoyu Wang (The Chinese University of Hong Kong (Shenzhen)) |
5381 | Training Vision Transformers with Only 2040 Images | Yunhao Cao (Nanjing University); Hao Yu (Nanjing University); Jianxin Wu (Nanjing University)* |
5384 | Black-box Few-shot Knowledge Distillation | Dang Nguyen (Deakin University)*; Sunil Gupta (Deakin University, Australia); Kien Duc Do (Deakin University); Svetha Venkatesh (Deakin University) |
5388 | AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling | Ziqian Bai (Simon Fraser University)*; Timur Bagautdinov (Facebook); Javier Romero (Facebook); Michael Zollhöfer (Facebook Reality Labs); Ping Tan (Simon Fraser University); Shunsuke Saito (Facebook) |
5392 | Ghost-free High Dynamic Range Imaging with Context-aware Transformer | Zhen Liu (Sichuan University; Megvii ); Yinglong Wang (Huawei Noah’s Ark Lab); Bing Zeng (University of Electronic Science and Technology of China); Shuaicheng Liu (UESTC; Megvii)* |
5393 | Cross-Domain Cross-Set Few-Shot Learning via Learning Compact and Aligned Representations | Wentao Chen (University of Science and Technology of China)*; Zhang Zhang (Institute of Automation, Chinese Academy of Sciences); Wei Wang (Institute of Automation Chinese Academy of Sciences); Liang Wang (NLPR, China); Zilei Wang (University of Science and Technology of China); Tieniu Tan (NLPR, China) |
5396 | Motion Transformer for Unsupervised Image Animation | Jiale Tao (University of Electronic Science and Technology of China)*; Biao Wang (Alibaba Group); Tiezheng Ge (Alibaba Group); Yuning Jiang (Alibaba Group); Wen Li (University of Electronic Science and Technology of China); Lixin Duan (University of Electronic Science and Technology of China) |
5404 | LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection | Yi Wei (Tsinghua University)*; Zibu Wei (Tsinghua University); Yongming Rao (Tsinghua University); Jiaxin Li (Gaussian Robotics); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University) |
5405 | PSS: Progressive Sample Selection for Open-World Visual Representation Learning | Tianyue Cao (Shanghai Jiao Tong University); Yongxin Wang (Amazon)*; Yifan Xing (AMAZON CORPORATE LLC); Tianjun Xiao (Amazon); Tong He (Amazon); Zheng Zhang (AWS); Hao Zhou (Amazon); Joseph Tighe (Amazon) |
5408 | Self-slimmed Vision Transformer | Zhuofan Zong (Beihang University)*; Kunchang Li (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Guanglu Song (Sensetime); Yali Wang (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Biao Leng (Beihang University); Yu Liu (SenseTime Group LTD) |
5410 | Switchable Online Knowledge Distillation | Biao Qian (Hefei University of Technology); Yang Wang (Hefei University of Technology)*; Hongzhi Yin (The University of Queensland); Richang Hong (Hefei University of Technology); Meng Wang (Hefei University of Technology) |
5418 | Adaptive Transformers for Robust Few-shot Cross-domain Face Anti-spoofing | Hsin-Ping Huang (University of California, Merced)*; Deqing Sun (Google); Yaojie Liu (Google); Wen-Sheng Chu (Google); Taihong Xiao (University of California at Merced); Jinwei Yuan (Google); Hartwig Adam (Google); Ming-Hsuan Yang (University of California at Merced) |
5419 | GraphFit: Learning Multi-scale Graph-Convolutional Representation for Point Cloud Normal Estimation | Keqiang Li (Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences)*; Mingyang Zhao (University of Chinese Academy of Sciences & Beijing Academy of Artificial Intelligence); Huaiyu Wu (Institute of Automation, Chinese Academy of Sciences); Dong-Ming Yan (NLPR, CASIA); Zhen Shen (Institute of Automation, Chinese Academy of Sciences/Qingdao Academy of Intelligent Industries); Fei-Yue Wang (Institute of Automation, Chinese Academy of Sciences); Gang Xiong (CASIA) |
5424 | Are Vision Transformers Robust to Patch-wise Perturbations? | Jindong Gu (University of Munich)*; Volker Tresp (Siemens AG and Ludwig Maximilian University of Munich ); Yao Qin (Google) |
5428 | DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning | Zifeng Wang (Northeastern University)*; Zizhao Zhang (Google); Sayna Ebrahimi (Google); Ruoxi Sun (Google); Han Zhang (Google); Chen-Yu Lee (Google); Xiaoqi Ren (Google); Guolong Su (Google); Vincent Perot (Google AI); Jennifer Dy (Northeastern); Tomas Pfister (Google) |
5430 | EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer | Chenyu Yang (Tsinghua University)*; Wanrong He (Tsinghua University); Yingqing Xu (Tsinghua University); Yang Gao (Tsinghua University) |
5436 | Union-set Multi-source Model Adaptation for Semantic Segmentation | Zongyao Li (Hokkaido University)*; Ren Togo (Hokkaido University); Takahiro Ogawa (Hokkaido University); Miki Haseyama (Hokkaido University) |
5441 | Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection | Sanghyun Woo (KAIST)*; Kwanyong Park (KAIST); Seoung Wug Oh (Adobe Research); In So Kweon (KAIST); Joon-Young Lee (Adobe Research) |
5443 | TDAM: Top-Down Attention Module for Contextually Guided Feature Selection in CNNs | Shantanu Jaiswal (Agency for Science, Technology and Research ); Basura Fernando (Agency for Science, Technology and Research, ASTAR, Singapore); Cheston Tan (Institute for Infocomm Research, Singapore) |
5451 | Exploring Disentangled Content Information for Face Forgery Detection | Jiahao Liang (Beijing University of Posts and Telecommunications)*; Huafeng Shi (SenseTime Group Limited); Weihong Deng (Beijing University of Posts and Telecommunications) |
5458 | Object Discovery via Contrastive Learning for Weakly Supervised Object Detection | Jinhwan Seo (Pohang University of Science and Technology)*; Wonho Bae (University of British Columbia); Danica J. Sutherland (University of British Columbia); Junhyug Noh (Lawrence Livermore National Laboratory); Daijin Kim (Pohang University of Science and Technology) |
5460 | Unifying Vision Unsupervised Contrastive Learning from a Graph Perspective | Shixiang Tang (The University of Sydney)*; Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Chenyu Wang (University of Sydney, Sydney Neuroimaging Analysis Centre); Wanli Ouyang (The University of Sydney) |
5463 | E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context | Zizhang Li (Zhejiang University)*; Mengmeng Wang (Zhejiang University); Huaijin Pi (Zhejiang University); Kechun Xu (Zhejiang University); Jianbiao Mei (Zhejiang University); Yong Liu (Zhejiang University) |
5478 | ℓ∞-Robustness and Beyond: Unleashing Efficient Adversarial Training | Hadi Mohaghegh Dolatabadi (University of Melbourne)*; Sarah Erfani (University of Melbourne); Christopher Leckie (University of Melbourne) |
5481 | Spatial-Separated Curve Rendering Network for Efficient and High-Resolution Image Harmonization | Jingtang Liang (University of Macau)*; Xiaodong Cun (Tencent AI Lab); Chi-Man Pun (University of Macau); Jue Wang (Tencent AI Lab) |
5484 | Point MixSwap: Attentional Point Cloud Mixing via Swapping Matched Structural Divisions | Ardian Umam (NYCU)*; Cheng-Kun Yang (National Taiwan University); Yung-Yu Chuang (National Taiwan University); Jen-Hui Chuang (National Chiao Tung University ); Yen-Yu Lin (National Yang Ming Chiao Tung University) |
5491 | One Size Does NOT Fit All: Data-Adaptive Adversarial Training | Shuo Yang (University of Sydney)*; Chang Xu (University of Sydney) |
5494 | IS-MVSNet: Importance Sampling-based MVSNet | Likang Wang (HKUST)*; Yue Gong (Huawei Technologies Co., Ltd.); Xinjun Ma (Huawei); Qirui Wang (Huawei Technologies Co., Ltd.); Kaixuan Zhou (Huawei ); Lei Chen (Hong Kong University of Science and Technology) |
5496 | Multi-Granularity Pruning for Model Acceleration on Mobile Devices | Tianli Zhao (Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences); Xi Sheryl Zhang (Institute of Automation, Chinese Academy of Sciences); Wentao Zhu (Amazon); Jiaxing Wang (Institute of Automation, Chinese Academy of Sciences); Sen Yang (Kuaishou); Ji Liu (Kwai Inc.); Jian Cheng (Chinese Academy of Sciences, China)* |
5500 | Style-Agnostic Reinforcement Learning | Juyong Lee (POSTECH); Seokjun Ahn (POSTECH); Jaesik Park (POSTECH)* |
5504 | Editing Out-of-domain GAN Inversion via Differential Activations | Haorui Song (South China University of Technology); Yong Du (Ocean University of China); Tianyi Xiang (South China University of Technology); Junyu Dong (Ocean University of China); Jing Qin (The Hong Kong Polytechnic University); Shengfeng He (South China University of Technology)* |
5508 | Bagging Regional Classification Activation Maps for Weakly Supervised Object Localization | Lei Zhu (Beijing University of Posts and Telecommunications); Qian Chen (University of Science and Technology of China); Lujia Jin (Peking University); Yunfei You (Peking University); Yanye Lu (Peking University)* |
5518 | Mutually Reinforcing Structure with Proposal Contrastive Consistency for Few-Shot Object Detection | TianXue Ma (East China Normal University)*; Mingwei Bi (Tencent); Jian Zhang (Tencent Youtu); Wang Yuan (East China Normal University); Zhizhong Zhang (East China Normal University); Yuan Xie (East China Normal University); Shouhong Ding (Tencent); Lizhuang Ma (Shanghai Jiao Tong University) |
5523 | Panoptic-PartFormer: Learning a Unified model for Panoptic Part Segmentation | Xiangtai Li (Peking University)*; Shilin Xu (Peking University); Yibo Yang (Peking University); Guangliang Cheng (Sensetime Group Limited); Yunhai Tong (Peking University); Dacheng Tao (JD.com) |
5536 | TextAdaIN: Paying Attention to Shortcut Learning in Text Recognizers | Oren Nuriel (Amazon)*; Ron Litman (Amazon); Sharon Fogel (Amazon) |
5537 | Speaker-adaptive Lip Reading with User-dependent Padding | Minsu Kim (KAIST)*; Hyunjun Kim (KAIST); Yong Man Ro (KAIST) |
5541 | Online Domain Adaptation for Semantic Segmentation in Ever-Changing Conditions | Theodoros Panagiotakopoulos (KTH Royal Institute of Technology in Stockholm); Pier Luigi Dovesi (Univrses); Linus Härenstam-Nielsen (Artisense); Matteo Poggi (University of Bologna)* |
5542 | Point Scene Understanding via Disentangled Instance Mesh Reconstruction | Jiaxiang Tang (Peking University)*; Xiaokang Chen (Peking University); Jingbo Wang (The Chinese University of HongKong); Gang Zeng (Peking University) |
5543 | Dual Contrastive Learning with Anatomical Auxiliary Supervision for Few-shot Medical Image Segmentation | Huisi Wu (Shenzhen University)*; Fangyan Xiao (Shenzhen University); Chongxin Liang (Shenzhen University) |
5544 | An Efficient Person Clustering Algorithm for Open Checkout-free Groceries | Junde Morsen Wu (Purdue University); Yu Zhang (Harbin Institute of Technology); RAO FU (None); Yuanpei Liu (Beijing Institute of Technology); Jing Gao (Purdue University)* |
5548 | Face2Face^ρ: Real-Time High-Resolution One-Shot Face Reenactment | Kewei Yang (NetEase Games AI Lab)*; Kang Chen (NetEase Games AI Lab); Daoliang Guo (NetEase Games AI Lab); Song-Hai Zhang (Tsinghua University); Yuan-Chen Guo (Tsinghua University); Weidong Zhang (Netease Games AI Lab) |
5549 | Decoupled Contrastive Learning | Chun-Hsiao Yeh (Academia Sinica / UC Berkeley)*; Cheng-Yao Hong (Academia Sinica); Yen-Chi Hsu (Academia Sinica); Tyng-Luh Liu (Academia Sinica); Yubei Chen (Berkeley AI Research, UC Berkeley); Yann LeCun (Facebook) |
5555 | Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning | Chi Zhang (University of California, Los Angeles)*; Sirui Xie (UCLA); Baoxiong Jia (UCLA); Ying Nian Wu (University of California, Los Angeles); Song-Chun Zhu (UCLA); Yixin Zhu (Peking University) |
5556 | On the Robustness of Quality Measures for GANs | Motasem Alfarra (KAUST)*; Juan C Perez (KAUST); Anna Fruehstueck (KAUST); Philip Torr (University of Oxford); Peter Wonka (KAUST); Bernard Ghanem (KAUST) |
5557 | Automatic Check-Out via Prototype-based Classifier Learning from Single-Product Exemplars | Hao Chen (Nanjing University of Science and Technology)*; Xiu-Shen Wei (Nanjing University of Science and Technology); Faen Zhang (AInnovation Co. Ltd.); Yang Shen (Nanjing University of Science and Technology); Hui Xu (QINGDAO AINNOVATION TECHNOLOGY GROUP CO., LTD); Liang Xiao (Nanjing University of Science and Technology) |
5559 | TDViT: Temporal Dilated Transformer for Dense Video Tasks | Guanxiong Sun (Queen’s University Belfast); Yang Hua (Queen’s University Belfast)*; Guosheng Hu (Oosto); Neil Robertson (Queen’s University Belfast) |
5561 | POP: Mining POtential Performance of new fashion products via webly cross-modal query expansion | Christian Joppi (Humatics srl)*; Geri Skenderi (University of Verona); Marco Cristani (University of Verona) |
5564 | BRACE: The Breakdancing Competition Dataset for Dance Motion Synthesis | Davide Moltisanti (University of Edinburgh)*; Jinyi Wu (S-Lab Nanyang Technological University); Bo Dai (Shanghai AI Lab); Chen Change Loy (Nanyang Technological University) |
5578 | Towards Racially Unbiased Skin Tone Estimation via Scene Disambiguation | Haiwen Feng (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Joachim Tesch (Max Planck Institute for Intelligent Systems); Michael J. Black (Max Planck Institute for Intelligent Systems); Victoria Fernandez Abrevaya (Max Planck Institute)* |
5580 | Style-Guided Shadow Removal | Jin Wan (Beijing Jiaotong University); Hui Yin (Beijing Jiaotong University)*; Zhenyao Wu (University of South Carolina); Xinyi Wu (University of South Carolina); Yanting Liu (Yanting Liu); Song Wang (University of South Carolina) |
5584 | Sound-guided Semantic Video Generation | Seung Hyun Lee (Korea University)*; Gyeongrok Oh (Korea University); Wonmin Byeon (NVIDIA Research); Jihyun Bae (Korea University); Chanyoung Kim (Korea University); Won Jeong Ryoo (Korea University); Sang Ho Yoon (KAIST); Hyunjun Cho (Korea University); Jinkyu Kim (Korea University); Sangpil Kim (Korea University) |
5585 | Robust Visual Tracking by Segmentation | Matthieu Paul (ETH Zurich)*; Martin Danelljan (ETH Zurich); Christoph Mayer (ETH Zurich); Luc Van Gool (ETH Zurich) |
5591 | Semi-Supervised Learning of Optical Flow by Flow Supervisor | Woobin Im (KAIST); Sebin Lee (KAIST); Sungeui Yoon (KAIST)* |
5595 | Joint Learning of Localized Representations from Medical Images and Reports | Philip Müller (Technical University of Munich)*; Georgios Kaissis (Technische Universität München); Congyu Zou (Klinikum Rechts der Isar Technische Universität München); Daniel Rueckert (Technische Universität München) |
5599 | D2C-SR: A Divergence to Convergence Approach for Real-World Image Super-Resolution | Youwei Li (Megvii); Haibin Huang (Kuaishou Technology); Lanpeng Jia (GWM); Haoqiang Fan (Megvii Inc(face++)); Shuaicheng Liu (UESTC; Megvii)* |
5612 | Continual 3D Convolutional Neural Networks for Real-time Processing of Videos | Lukas Hedegaard (Aarhus University)*; Alexandros Iosifidis (Aarhus University) |
5613 | Salient Object Detection for Point Clouds | Songlin Fan (Peking University); Wei Gao (SECE, Shenzhen Graduate School, Peking University)*; Ge Li (Peking University) |
5616 | Deep ensemble learning by diverse knowledge distillation for fine-grained object classification | Naoki Okamoto (Chubu University)*; Tsubasa Hirakawa (Chubu University); Takayoshi Yamashita (Chubu University); Hironobu Fujiyoshi (Chubu University) |
5619 | Source-free Video Domain Adaptation by Learning Temporal Consistency for Action Recognition | Yuecong Xu (Institute for Infocomm Research, ASTAR, Singapore); Jianfei Yang (Nanyang Technological University); Haozhi Cao (Nanyang Technological University); Keyu Wu (Institute for Infocomm Research, ASTAR, Singapore); Min Wu (Institute for Infocomm Research, ASTAR, Singapore); Zhenghua Chen (Institute for Infocomm Research, A*STAR, Singapore) |
5643 | GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training | Jaeseok Byun (Seoul National University); Taebaek Hwang (M.IN.D Lab); Jianlong Fu (Microsoft Research); Taesup Moon (Seoul National University)* |
5644 | Pose Forecasting in Industrial Human-Robot Collaboration | Alessio Sampieri (Sapienza University)*; Guido Maria D’Amely di Melendugno (Sapienza University); Andrea Avogaro (University of Verona); Federico Cunico (University of Verona); Francesco Setti (University of Verona); Geri Skenderi (University of Verona); Marco Cristani (University of Verona); Fabio Galasso (Sapienza University) |
5648 | MeshLoc: Mesh-Based Visual Localization | Vojtech Panek (CTU in Prague, FEE, CIIRC)*; Zuzana Kukelova (Czech Technical University in Prague); Torsten Sattler (Czech Technical University in Prague) |
5660 | Dress Code: High-Resolution Multi-Category Virtual Try-On | Davide Morelli (UNIMORE); Matteo Fincato (Università degli Studi di Modena e Reggio Emilia); Marcella Cornia (University of Modena and Reggio Emilia)*; Federico Landi (University of Modena and Reggio Emilia); Fabio Cesari (YOOX Net-A-Porter Group S.p.A.); Rita Cucchiara (Università di Modena e Reggio Emilia) |
5661 | UC-OWOD: Unknown-Classified Open World Object Detection | Zhiheng Wu (Institute of Automation, Chinese Academy of Sciences (CASIA))*; Yue Lu (Institute of Automation, Chinese Academy of Sciences (CASIA)); Xingyu Chen (Xiaobing.AI); Zhengxing Wu (CASIA); Liwen Kang (Institute of Automation, Chinese Academy of Sciences (CASIA)); Junzhi Yu (CASIA) |
5666 | Helpful or Harmful: Inter-Task Association in Continual Learning | Hyundong Jin (Chung-Ang University); Eunwoo Kim (Chung-Ang University)* |
5669 | RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers | Michał J Tyszkiewicz (EPFL); Kevis-Kokitsi Maninis (Google Research)*; Stefan Popov (Google Research); Vittorio Ferrari (Google Research) |
5673 | Efficient Point Cloud Segmentation with Geometry-aware Sparse Networks | Maosheng Ye (HKUST)*; Rui Wan (Deeproute.ai); Shuangjie Xu (HKUST); Tongyi Cao (Deeproute.ai); Qifeng Chen (HKUST) |
5677 | Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition | Tianjiao Li (Singapore University of Technology and Design)*; Lin Geng Foo (Singapore University of Technology and Design); Qiuhong Ke (Monash University); Hossein Rahmani (Lancaster University); Anran Wang (Bytedance); Jinghua Wang (Harbin Institute of Technology); Jun Liu (Singapore University of Technology and Design) |
5685 | TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation | Tan Minh Dinh (VinAI Research)*; Rang Nguyen (VinAI Research); Binh-Son Hua (VinAI Research) |
5688 | CostDCNet: Cost Volume based Depth Completion for a Single RGB-D Image | Jaewon Kam (POSTECH); Jungeon Kim (POSTECH); Soongjin Kim (POSTECH); Jaesik Park (POSTECH); Seungyong Lee (POSTECH)* |
5697 | Efficient Video Deblurring Guided by Motion Magnitude | Yusheng Wang (The University of Tokyo)*; Yunfan Lu (Hong Kong University of Science and Technology); Ye Gao (Honor Technologies Japan); Lin Wang (HKUST); Zhihang Zhong (The University of Tokyo); Yinqiang Zheng (The University of Tokyo); Atsushi Yamashita (The University of Tokyo) |
5702 | Space-Partitioning RANSAC | Daniel Barath (ETH Zürich)*; Gábor Valasek (ELTE) |
5704 | Towards Accurate Binary Neural Networks via Modeling Contextual Dependencies | Xingrun Xing (Beihang University); Yangguang Li (SenseTime Group Limited); Wei Li (Nanyang Technological University); Wenrui Ding (Beihang University); Yalong Jiang (Beihang University)*; Yufeng Wang (Beihang University); Jing Shao (Sensetime); Chunlei Liu (Beihang University); Xianglong Liu (BUAA) |
5712 | Overcoming Shortcut Learning in a Target Domain by Generalizing Basic Visual Factors from a Source Domain | Piyapat Saranrittichai (Bosch Center for Artificial Intelligence)*; Chaithanya Kumar Mummadi (Bosch Center for Artificial Intelligence); Claudia Blaiotta (Bosch Center for Artificial Intelligence); Mauricio Munoz (Bosch Center for Artificial Intelligence); Volker Fischer (Bosch Center for Artificial Intelligence) |
5721 | SimpleRecon: 3D Reconstruction Without 3D Convolutions | Mohamed Sayed (University College London)*; John Gibson (Niantic, Inc.); Jamie Watson (Niantic); Victor A Prisacariu (Niantic Labs); Michael Firman (Niantic); Clement LJC Godard (Niantic) |
5739 | SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding | Morgan L Heisler (Huawei Technologies Canada Co., Ltd.)*; Amin Banitalebi-Dehkordi (Huawei Technologies Canada Co., Ltd.); Yong Zhang (Huawei Technologies Canada Co., Ltd.) |
5740 | A data-centric approach for improving ambiguous labels with combined semi-supervised classification and clustering | Lars Schmarje (Kiel University)*; Monty Santarossa (Kiel University); Simon-Martin Schröder (Kiel University); Claudius Zelenka (Kiel University); Rainer Kiko (Laboratoire d’Océanographie de Villefranche-sur-Mer); Jenny Stracke (University of Bonn); Nina Volkmann (University of Veterinary Medicine Hannover); Reinhard Koch (Kiel University) |
5750 | SPIN: An Empirical Evaluation on Sharing Parameters of Isotropic Networks | Anish J Prabhu (Apple)*; Chien-Yu Lin (University of Washington); Thomas Merth (Apple); Sachin Mehta (University of Washington); Anurag Ranjan (Apple); Maxwell C Horton (Apple, Xnor.Ai and University of Washington); Mohammad Rastegari (University of Washington) |
5754 | SAGA: Stochastic Whole-Body Grasping With Contact | Yan Wu (ETH Zurich); Jiahao Wang (Max Planck Institute for Informatics); Yan Zhang (ETH Zurich); Siwei Zhang (ETH Zurich); Otmar Hilliges (ETH Zurich); Fisher Yu (ETH Zurich); Siyu Tang (ETH Zurich)* |
5761 | GTCaR: Graph Transformer for Camera Re-localization | Xinyi Li (Magic Leap)*; Haibin Ling (Stony Brook University) |
5764 | Actor-centered Representations for Action Localization in Streaming Videos | Sathyanarayanan N Aakur (OK State)*; Sudeep Sarkar (University of South Florida, Tampa) |
5769 | Photo-realistic Neural Domain Randomization | Sergey Zakharov (Toyota Research Institute)*; Rareș A Ambruș (Toyota Research Institute); Vitor Guizilini (Toyota Research Institute); Wadim Kehl (Woven Planet); Adrien Gaidon (Toyota Research Institute) |
5770 | ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization | Muhammad Zubair Irshad (Georgia Institute of Technology)*; Sergey Zakharov (Toyota Research Institute); Rareș A Ambruș (Toyota Research Institute); Thomas Kollar (Toyota Research Institute); Zsolt Kira (Georgia Institute of Technology); Adrien Gaidon (Toyota Research Institute) |
5771 | Structure and Motion for Casual Videos | Zhoutong Zhang (MIT)*; Forrester Cole (Google Research); Zhengqi Li (Google Inc.); Noah Snavely (Google); Michael Rubinstein (Google); William T Freeman (Google) |
5775 | Single Frame Atmospheric Turbulence Mitigation: A Benchmark Study and A New Physics-Inspired Transformer Model | Zhiyuan Mao (Purdue University)*; Ajay Kumar Jaiswal (UT Austin); Zhangyang Wang (University of Texas at Austin); Stanley Chan (Purdue University, USA) |
5778 | Incremental Task Learning with Incremental Rank Updates | Rakib Hyder (University of California, Riverside)*; Ken Shao (UCR); Boyu Hou (The University of California, Riverside); Panagiotis Markopoulos (RIT); Ashley Prater-Bennette (Air Force Research Laboratory); M. Salman Asif (University of California, Riverside) |
5787 | Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT | Xiufeng Xie (Kwai Inc.)*; Ning Zhou (Amazon); Wentao Zhu (Amazon); Ji Liu (Kwai Inc.) |
5789 | Inpainting at Modern Camera Resolution by Guided PatchMatch with Auto-Curation | Connelly Barnes (Adobe)*; Lingzhi Zhang (University of Pennsylvania); Jianbo Shi (University of Pennsylvania); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Sohrab Amirghodsi (Adobe Research); Kevin Wampler (Adobe Systems Inc.) |
5794 | Controllable Video Generation through Global and Local Motion Dynamics | Aram Davtyan (University of Bern)*; Paolo Favaro (University of Bern) |
5812 | UniCR: Universally Approximated Certified Robustness via Randomized Smoothing | Hanbin Hong (University of Connecticut)*; Binghui Wang (Illinois Institute of Technology); Yuan Hong (University of Connecticut) |
5829 | 3D Siamese Transformer Network for Single Object Tracking on Point Clouds | Le Hui (Nanjing University of Science and Technology)*; Lingpeng Wang (Nanjing University of Science and Technology); Linghua Tang (Nanjing University of Science and Technology); Kaihao Lan (Nanjing University of Science and Technology); Jin Xie (Nanjing University of Science and Technology); Jian Yang (Nanjing University of Science and Technology) |
5837 | Hardly Perceptible Trojan Attack against Neural Networks with Bit Flips | Jiawang Bai (Tsinghua University)*; Kuofeng Gao (Tsinghua University); Dihong Gong (Tencent AI Lab); Shu-Tao Xia (Tsinghua University); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent) |
5856 | StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN | Fei Yin (Tsinghua University)*; Yong Zhang (Tencent AI Lab); Xiaodong Cun (Tencent AI Lab); Mingdeng Cao (Tsinghua University); Yanbo Fan (Tencent AI Lab); Xuan Wang (Tencent AI Lab); Qingyan Bai (Tsinghua University); Baoyuan Wu (The Chinese University of Hong Kong, Shenzhen); Jue Wang (Tencent AI Lab); Yujiu Yang (Tsinghua University) |
5859 | Referring Object Manipulation of Natural Images with Conditional Classifier-Free Guidance | Myungsub Choi (Google)* |
5880 | Self-Supervised Interactive Object Segmentation Through a Singulation-and-Grasping Approach | Houjian Yu (University of Minnesota, Twin Cities)*; Changhyun Choi (University of Minnesota, Twin Cities) |
5898 | BigColor: Colorization using a Generative Color Prior for Natural Images | Geonung Kim (POSTECH); Kyoungkook Kang (POSTECH); Seongtae Kim (POSTECH); Hwayoon Lee (POSTECH); Sehoon Kim (Samsung Electronics Co. Ltd.); Jonghyun Kim (Samsung Electronics); Seung-Hwan Baek (POSTECH); Sunghyun Cho (POSTECH)* |
5901 | Object Wake-up: 3D Object Rigging from a Single Image | Ji Yang (University of Alberta)*; Xinxin Zuo (University of Alberta); Sen Wang (University of Alberta); Zhenbo Yu (Shanghai Jiao Tong University); Xingyu Li (University of Alberta); Bingbing Ni (Shanghai Jiao Tong University); Minglun Gong (University of Guelph); Li Cheng (ECE dept., University of Alberta) |
5905 | ClearPose: Large-scale Transparent Object Dataset and Benchmark | Xiaotong Chen (University of Michigan, Ann Arbor)*; Huijie Zhang (University of Michigan, Ann Arbor); Zeren Yu (University of Michigan–Ann Arbor); Anthony Opipari (University of Michigan); Odest Chadwicke Jenkins (University of Michigan) |
5907 | Domain Knowledge-Informed Self-Supervised Representations for Workout Form Assessment | Paritosh Parmar (University of British Columbia)*; Amol Gharat (Flex A.I.); Helge Rhodin (UBC) |
5908 | Neural Capture of Animatable 3D Human from Monocular Video | Gusi Te (Peking University); Xiu Li (Tencent); Xiao Li (Microsoft Research Asia)*; Jinglu Wang (Microsoft Research Asia); Wei Hu (Peking University); Yan Lu (Microsoft Research Asia) |
5913 | Open Vocabulary Object Detection with Pseudo Bounding-Box Labels | Mingfei Gao (Apple)*; Chen Xing (Salesforce Research); Juan Carlos Niebles (Salesforce & Stanford University); Junnan Li (Salesforce); Ran Xu (Salesforce Research); Wenhao Liu (Salesforce Metamind); Caiming Xiong (Salesforce Research) |
5914 | BoundaryFace: A mining framework with noise label self-correction for Face Recognition | Shijie Wu (Southwest Jiaotong University)*; Xun Gong (Southwest Jiaotong University) |
5915 | IntegratedPIFu: Integrated Pixel Aligned Implicit Function for Single-view Human Reconstruction | Kennard Chan Yanting (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Haiyu Zhao (SenseTime International Pte Ltd); Weisi Lin (Nanyang Technological University, Singapore) |
5922 | BMD: A General Class-balanced Multicentric Dynamic Prototype Strategy for Source-free Domain Adaptation | Sanqing Qu (Tongji University); Guang Chen (Tongji University)*; Jing Zhang (The University of Sydney); Zhijun Li (University of Science and Technology of China); Wei He (University of Science and Technology Beijing); Dacheng Tao (JD.com) |
5923 | What Matters for 3D Scene Flow Network | Guangming Wang (Shanghai Jiao Tong University); Yunzhe Hu (Shanghai Jiao Tong University); Zhe Liu (University of Cambridge); Yiyang Zhou (UC Berkeley); Masayoshi Tomizuka (MSC Lab); Wei Zhan (University of California, Berkeley); Hesheng Wang (SJTU)* |
5932 | Controllable Shadow Generation Using Pixel Height Maps | Yichen Sheng (Purdue University)*; Yifan Liu (University of Adelaide); Jianming Zhang (Adobe Research); Wei Yin (University of Adelaide); A. Cengiz Oztireli (University of Cambridge, Google); He Zhang (Adobe); Zhe Lin (Adobe Research); Eli Shechtman (Adobe Research, US); Bedrich Benes (Purdue University) |
5937 | CADyQ: Content-Aware Dynamic Quantization for Image Super-Resolution | Cheeun Hong (Seoul National University); Sungyong Baik (Hanyang University); Heewon Kim (Seoul National University); Seungjun Nah (NVIDIA); Kyoung Mu Lee (Seoul National University)* |
5940 | SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection | Minhyeok Lee (Yonsei University)*; Chaewon Park (Yonsei University); Suhwan Cho (Yonsei University); Sangyoun Lee (Yonsei University) |
5950 | Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer | Songwei Ge (University of Maryland)*; Thomas F Hayes (Meta); Harry Yang (Facebook); Xi Yin (Facebook); Guan Pang (Facebook); David Jacobs (University of Maryland, USA); Jia-Bin Huang (Facebook); Devi Parikh (Georgia Tech & Facebook AI Research) |
5951 | Combining Internal and External Constraints for Unrolling Shutter in Videos | Eyal Naor (Weizmann Institute)*; Itai Antebi (Weizmann); Shai Bagon (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel) |
5961 | Global Spectral Filter Memory Network for Video Object Segmentation | Yong Liu (Tsinghua University)*; Ran Yu (Tsinghua University); Jiahao Wang (Tsinghua University); Xinyuan Zhao (Huawei); Yitong Wang (Bytedance); Yansong Tang (Tsinghua University); Yujiu Yang (Tsinghua University) |
5964 | SEMICON: A Learning-to-hash Solution for Large-scale Fine-grained Image Retrieval | Yang Shen (Nanjing University of Science and Technology); Xu Hao XH SUN (Nanjing University of Science and Technology); Xiu-Shen Wei (Nanjing University of Science and Technology)*; Qing-Yuan Jiang (HuaWei); Jian Yang (Nanjing University of Science and Technology) |
5966 | Batch-efficient EigenDecomposition for Small and Medium Matrices | Yue Song (University of Trento)*; Nicu Sebe (University of Trento); Wei Wang (EPFL) |
5972 | General Object Pose Transformation Network from Unpaired Data | Yukun Su (South China University of Technology)*; Guosheng Lin (Nanyang Technological University); RuiZhou Sun (South China University of Technology); Qingyao Wu (South China University of Technology) |
5974 | Robust Network Architecture Search via Feature Distortion Restraining | Yaguan Qian (Zhejiang University of Science and Technology)*; Shenghui Huang (Zhejiang University of Science and Technology); Bin Wang (Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co.); Xiang Ling (Institute of Software, Chinese Academy of Sciences); Xiaohui Guan (Zhejiang University of Water Resources and Electric Power); Zhaoquan Gu (Guangzhou University); Shaoning Zeng (Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China); Wujie Zhou (Zhejiang University of Science and Technology); Haijiang Wang (Zhejiang University of Science and Technology) |
5988 | Correspondence Reweighted Translation Averaging | Lalit Manam (Indian Institute of Science Bengaluru)*; Venu Madhav Govindu (Indian Institute of Science) |
5993 | RepMix: Representation Mixing for Robust Attribution of Synthesized Images | Tu Bui (University of Surrey)*; Ning Yu (Salesforce Research); John Collomosse (Adobe Research) |
6000 | When Deep Classifiers Agree: Analyzing Correlations between Learning Order and Image Statistics | Iuliia Pliushch (Goethe University)*; Martin Mundt (TU Darmstadt); Nicolas Lupp (Goethe University Frankfurt); Visvanathan Ramesh (Goethe University) |
6002 | S2F2: Single-Stage Flow Forecasting for Future Multiple Trajectories Prediction | Yu-Wen Chen (National Tsing Hua University); Hsuan-Kung Yang (National Tsing Hua University); Chu-Chi Chiu (National Tsing Hua University); Chun-Yi Lee (National Tsing Hua University)* |
6004 | Few-Shot Object Detection by Knowledge Distillation Using Bag-of-Visual-Words Representations | Wenjie Pei (Harbin Institute of Technology, Shenzhen); Shuang Wu (Harbin Institute of Technology, Shenzhen); Dianwen Mei (Harbin Institute of Technology, Shenzhen); Fanglin Chen (Harbin Institute of Technology, Shenzhen); Jiandong Tian (CAS); Guangming Lu (Harbin Institute of Technology, Shenzhen)* |
6009 | Stochastic Consensus: Enhancing Semi-Supervised Learning with Consistency of Stochastic Classifiers | Hui Tang (South China University of Technology)*; Kui Jia (South China University of Technology); Lin Sun (Magic Leap) |
6011 | Learning Where To Look – Generative NAS is Surprisingly Efficient | Jovita Lukasik (University of Mannheim)*; Steffen Jung (MPII); Margret Keuper (University of Mannheim) |
6023 | Realistic One-shot Mesh-based Head Avatars | Taras Khakhulin (Skolkovo Institute of Science and Technology)*; Vanessa Valerievna Skliarova (Skoltech); Victor Lempitsky (Yandex); Egor Zakharov (Skolkovo Institute of Science and Technology) |
6024 | Ensemble Knowledge Guided Sub-network Search and Fine-tuning for Filter Pruning | Seunghyun Lee (Inha University); Byung Cheol Song (Inha University)* |
6037 | SALISA: Saliency-based Input Sampling for Efficient Video Object Detection | Babak Ehteshami Bejnordi (Qualcomm AI Research)*; Amir Ghodrati (Qualcomm AI Research); Fatih Porikli (Qualcomm AI Research); Amirhossein Habibian (Qualcomm AI Research) |
6039 | Video Instance Segmentation via Multi-Scale Spatio-Temporal Split Attention Transformer | Omkar Thawakar (MBZUAI)*; Sanath Narayan (Inception Institute of Artificial Intelligence); Jiale Cao (Tianjin University); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Salman Khan (MBZUAI/ANU); Michael Felsberg (Linköping University); Fahad Shahbaz Khan (MBZUAI) |
6044 | RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation | Haodi He (University of Science and Technology of China); Yuhui Yuan (Microsoft Research)*; Xiangyu Yue (University of California, Berkeley); Han Hu (Microsoft Research Asia) |
6046 | Contextformer: A Transformer with Spatio-Channel Attention for Context Modeling in Learned Image Compression | Ahmet Burakhan Koyuncu (Technical University of Munich)*; Han Gao (Tencent America); Atanas Boev (Huawei Technologies Duesseldorf GmbH); Georgii Gaikov (Huawei Moscow Research Center); Elena Alshina (Huawei Technologies); Eckehard Steinbach (TUM) |
6048 | Image Super-Resolution with Deep Dictionary | Shunta Maeda (Navier Inc.)* |
6054 | ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement | Dongli Tan (Xiamen University)*; Jiang-Jiang Liu (Nankai University); Xingyu Chen (Youtu Lab); Chao Chen (Youtu Laboratory); Ruixin Zhang (Tencent); Yunhang Shen (Xiamen University); Shouhong Ding (Tencent); Rongrong Ji (Xiamen University, China) |
6056 | Responsive Listening Head Generation: A Benchmark Dataset and Baseline | Mohan Zhou (Harbin Institute of Technology)*; Yalong Bai (JD AI Research); Wei Zhang (JD AI Research); Ting Yao (JD AI Research); Tiejun Zhao (Harbin Institute of Technology); Tao Mei (AI Research of JD.com) |
6063 | WISE: Whitebox Image Stylization by Example-based Learning | Winfried Lötzsch (Merantix Momentum); Max Reimann (Hasso-Plattner-Institute)*; Martin Büßemeyer (Hasso-Plattner-Institut); Amir Semmo (Digital Masterpieces GmbH); Jürgen Döllner (Hasso-Plattner-Institut); Matthias Trapp (Hasso Plattner Institute, University of Potsdam) |
6067 | 3D Equivariant Graph Implicit Functions | Yunlu Chen (University of Amsterdam); Basura Fernando (Agency for Science, Technology and Research, ASTAR, Singapore); Hakan Bilen (University of Edinburgh); Matthias Niessner (Technical University of Munich); Efstratios Gavves (University of Amsterdam) |
6068 | AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment | Kangyeol Kim (KAIST)*; Sunghyun Park (KAIST); Jaeseong Lee (KAIST); Sunghyo Chung (Korea University); Junsoo Lee (NAVER WEBTOON Ltd.); Jaegul Choo (Korea Advanced Institute of Science and Technology) |
6076 | Towards Scale-Aware, Robust, and Generalizable Unsupervised Monocular Depth Estimation by Integrating IMU Motion Dynamics | Sen Zhang (The University of Sydney); Jing Zhang (The University of Sydney)*; Dacheng Tao (The University of Sydney) |
6078 | Dynamic Local Aggregation Network with Adaptive Clusterer for Anomaly Detection | Zhiwei Yang (Xidian University)*; Peng Wu (Xidian University); Jing Liu (Xidian University); Xiaotao Liu (Xidian University) |
6080 | Learning Semantic Segmentation from Multiple Datasets with Label Shifts | Dongwan Kim (Seoul National University)*; Yi-Hsuan Tsai (Phiar Technologies); Yumin Suh (NEC Labs America); Masoud Faraki (NEC Labs); Sparsh Garg (NEC Labs America); Manmohan Chandraker (UC San Diego); Bohyung Han (Seoul National University) |
6086 | SecretGen: Privacy Recovery on Pre-trained Models via Distribution Discrimination | Zhuowen Yuan (UIUC); Fan Wu (UIUC); Yunhui Long (University of Illinois); Chaowei Xiao (NVIDIA); Bo Li (UIUC)* |
6090 | A Kendall Shape Space Approach to 3D Shape Estimation from 2D Landmarks | Martha Paskin (Zuse Institute Berlin); Daniel Baum (Zuse Institute Berlin); Mason N Dean (City University of Hong Kong); Christoph von Tycowicz (Zuse Institute Berlin)* |
6092 | Temporally Consistent Transformer for Video Denoising | Mingyang Song (ETH Zurich)*; Yang Zhang (Disney Research Studios); Tunç Aydin (Disney Research) |
6093 | Action Quality Assessment with Temporal Parsing Transformer | Yang Bai (Durham University); Desen Zhou (Baidu, Inc.)*; Songyang Zhang (Shanghai AI Laboratory); Jian Wang (Baidu); Errui Ding (Baidu Inc.); Yu Guan (University of Warwick); Yang Long (Durham University); Jingdong Wang (Baidu) |
6097 | A study of Pre-training strategies and datasets for facial representation learning | Adrian Bulat (Samsung AI Center, Cambridge)*; Shiyang Cheng (Samsung); Jing Yang (University of Nottingham); Andrew Garbett (Samsung AI Center); Enrique Sanchez (Samsung AI Centre); Georgios Tzimiropoulos (Queen Mary University of London) |
6112 | Neural Strands: Learning Hair Geometry and Appearance from Multi-View Images | Radu Alexandru Rosu (University of Bonn); Shunsuke Saito (Facebook); Ziyan Wang (Carnegie Mellon University); Chenglei Wu (Facebook Reality Labs); Sven Behnke (University of Bonn); Giljoo Nam (Facebook Inc.)* |
6114 | Conditional Stroke Recovery for Fine-Grained Sketch-Based Image Retrieval | Zhixin Ling (Fudan University)*; Zhen Xing (Fudan University); Jian Zhou (Fudan University); Xiangdong Zhou (Fudan University) |
6123 | Generalized Brain Image Synthesis with Transferable Convolutional Sparse Coding Networks | Yawen Huang (Tencent)*; Feng Zheng (SUSTech); Xu Sun (Tencent); Yuexiang Li (Jarvis Lab, Tencent); Ling Shao (Terminus Group); Yefeng Zheng (Tencent) |
6127 | Wave-ViT: Unifying Wavelet and Transformers for Visual Representation Learning | Ting Yao (JD AI Research); Yingwei Pan (JD AI Research)*; Yehao Li (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com) |
6129 | GraphCSPN: Geometry-Aware Depth Completion via Dynamic GCNs | Xin Liu (Tsinghua University)*; Xiaofei Shao (Deptrum); Bo Wang (Deptrum); Ya-Li Li (Tsinghua University); Shengjin Wang (Tsinghua University) |
6138 | Revisiting Batch Norm Initialization | Jim Davis (Ohio State University); Logan Frank (Ohio State University)* |
6141 | NewsStories: Illustrating articles with visual summaries | Reuben Tan (Boston University)*; Bryan Plummer (Boston University); Kate Saenko (Boston University); J.P. Lewis (Google Research); Avneesh Sud (Google); Thomas Leung (Google) |
6144 | Improving Few-Shot Learning through Multi-task Representation Learning Theory | Quentin Bouniot (CEA, LIST)*; Ievgen Redko (Laboratoire Hubert Curien); Romaric Audigier (CEA LIST); Angélique Loesch (CEA LIST); Amaury Habrard (University of St-Etienne, Lab. H. Curien) |
6145 | Deep Semantic Statistics Matching (D2SM) Denoising Network | Kangfu Mei (Johns Hopkins University)*; Vishal Patel (Johns Hopkins University); Rui Huang (The Chinese University of Hong Kong, Shenzhen) |
6148 | Long-tailed Instance Segmentation using Gumbel Optimized Loss | Konstantinos P Alexandridis (University of Liverpool)*; Jiankang Deng (Imperial College London); Anh Nguyen (University of Liverpool); Shan Luo (University of Liverpool) |
6162 | DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection | Jinhyung Park (Carnegie Mellon University)*; Chenfeng Xu (UC Berkeley); Yiyang Zhou (UC Berkeley); Masayoshi Tomizuka (MSC Lab); Wei Zhan (University of California, Berkeley) |
6177 | 3D Scene Inference from Transient Histograms | Sacha Jungerman (University of Wisconsin-Madison)*; Atul N Ingle (University of Wisconsin-Madison); Yin Li (University of Wisconsin-Madison); Mohit Gupta (University of Wisconsin-Madison, USA) |
6178 | SSBNet: Improving Visual Recognition Efficiency by Adaptive Sampling | Ho Man Kwan (The Hong Kong University of Science and Technology)*; S.H. Song (HKUST) |
6182 | Deep 360° Optical Flow Estimation by Multi-Projection Fusion | Yiheng Li (Victoria University of Wellington); Connelly Barnes (Adobe); Kun Huang (Victoria University of Wellington); Fang-Lue Zhang (Victoria University of Wellington)* |
6187 | Neural Space-filling Curves | Hanyu Wang (University of Maryland – College Park)*; Kamal Gupta (University of Maryland); Larry Davis (University of Maryland); Abhinav Shrivastava (University of Maryland) |
6192 | MFIM: Megapixel Facial Identity Manipulation | Sanghyeon Na (kakaobrain)* |
6194 | Objects Can Move: 3D Change Detection by Geometric Transformation Consistency | Aikaterini Adam (National Technical University of Athens)*; Torsten Sattler (Czech Technical University in Prague); Konstantinos Karantzalos (National Technical University of Athens); Tomas Pajdla (Czech Technical University in Prague) |
6199 | MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration | Thomas F Hayes (Meta); Songyang Zhang (University of Rochester)*; Xi Yin (Facebook); Guan Pang (Facebook); Sasha Sheng (Meta Platforms); Harry Yang (Facebook); Songwei Ge (University of Maryland, College Park); Qiyuan Hu (Facebook AI Research); Devi Parikh (Georgia Tech & Facebook AI Research) |
6203 | PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation | Bo Sun (UT Austin)*; Vladimir Kim (Adobe); Qixing Huang (The University of Texas at Austin); Noam Aigerman (Adobe); Siddhartha Chaudhuri (Adobe Research) |
6207 | Network Binarization via Contrastive Learning | Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology) |
6210 | Lipschitz Continuity Retained Binary Neural Network | Yuzhang Shang (Illinois Institute of Technology)*; Dan Xu (The Hong Kong University of Science and Technology); Bin Duan (Illinois Institute of Technology); Ziliang Zong (Texas State University); Liqiang Nie (Harbin Institute of Technology (Shenzhen)); Yan Yan (Illinois Institute of Technology) |
6212 | Is Geometry Enough for Matching in Visual Localization? | Qunjie Zhou (Technical University of Munich)*; Sérgio Agostinho (Institute for Systems and Robotics, Instituto Superior Técnico, Universidade de Lisboa); Aljosa Osep (TUM Munich); Laura Leal-Taixé (TUM) |
6214 | Webly Supervised Concept Expansion for General Purpose Vision Models | Amita Kamath (Allen Institute for Artificial Intelligence); Christopher A Clark (Allen Institute for AI)*; Tanmay Gupta (Allen Institute for Artificial Intelligence); Eric Kolve (Allen AI); Derek Hoiem (University of Illinois at Urbana-Champaign); Aniruddha Kembhavi (Allen Institute for Artificial Intelligence) |
6216 | Compositional Human-Scene Interaction Synthesis with Semantic Control | Kaifeng Zhao (ETH Zurich)*; Shaofei Wang (ETH Zurich); Yan Zhang (ETH Zurich); Thabo Beeler (Disney Research Studios); Siyu Tang (ETH Zurich) |
6218 | MaCLR: Motion-aware Contrastive Learning of Representations for Videos | Fanyi Xiao (Meta); Joseph Tighe (Amazon); Davide Modolo (Amazon)* |
6220 | Transformers as Meta-Learners for Implicit Neural Representations | Yinbo Chen (UC San Diego)*; Xiaolong Wang (UCSD) |
6222 | RAWtoBit: A Fully End-to-end Camera ISP Network | Wooseok Jeong (Korea University); Seung-Won Jung (Korea University)* |
6227 | SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention | Simon Doll (University of Tübingen)*; Richard Schulz (Mercedes Benz); Lukas Schneider (Daimler); Viviane Benzin (Mercedes-Benz AG); Markus Enzweiler (Esslingen University of Applied Sciences); Hendrik P. A. Lensch (University of Tübingen) |
6228 | 3D Face Reconstruction with Dense Landmarks | Erroll Wood (Microsoft)*; Tadas Baltrusaitis (Microsoft); Charlie Hewitt (Microsoft); Matthew A Johnson (Microsoft); Jingjing Shen (Microsoft); Nikola Milosavljevic (Microsoft); Daniel S Wilde (Microsoft); Stephan J Garbin (University College London); Toby Sharp (Microsoft); Ivan Stojiljkovic (Microsoft); Tom Cashman (Microsoft); Julien Valentin (Microsoft) |
6236 | SWFormer: Sparse Window Transformer for 3D Object Detection in Point Clouds | Pei Sun (Waymo)*; Mingxing Tan (Waymo); Weiyue Wang (Waymo); Chenxi Liu (Waymo); Fei Xia (Waymo); Zhaoqi Leng (Waymo); Dragomir Anguelov (Waymo) |
6247 | Incomplete Multi-view Domain Adaptation via Channel Enhancement and Knowledge Transfer | Haifeng Xia (Tulane University)*; Pu Wang (MERL); Zhengming Ding (Tulane University) |
6250 | Exposure-Aware Dynamic Weighted Learning for Single-Shot HDR Imaging | An Gia Vien (Dongguk University); Chul Lee (Dongguk University)* |
6259 | Seeing through a Black Box: Toward High-Quality Terahertz Imaging via Subspace-and-Attention Guided Restoration | Weng-Tai Su (National Tsing Hua University); Yi-Chun Hung (University of California, Los Angeles); Po-Jen Yu (National Tsing Hua University); Shang-Hua Yang (National Tsing Hua University); Chia-Wen Lin (National Tsing Hua University)* |
6265 | SPViT: Enabling Faster Vision Transformers via Soft Token Pruning | Zhenglun Kong (Northeastern University)*; Peiyan Dong (Northeastern University); Xiaolong Ma (Clemson University); Xin Meng (Peking university); Wei Niu (William & Mary); Mengshu Sun (Northeastern University); Xuan Shen (Northeastern University); Geng Yuan (Northeastern University); Bin Ren (William & Mary); Hao Tang (ETH Zurich); Minghai Qin (Western Digital Research); Yanzhi Wang (Northeastern University) |
6269 | Soft Masking for Cost-Constrained Channel Pruning | Ryan Humble (Stanford University)*; Maying Shen (NVIDIA); Jorge Albericio Latorre (NVIDIA); Eric Darve (Stanford University); Jose M. Alvarez (NVIDIA) |
6271 | Ensemble Learning Priors Driven Deep Unfolding for Scalable Snapshot Compressive Imaging | Chengshuai Yang (Westlake University)*; Shiyu Zhang (Westlake University); Xin Yuan (Westlake University) |
6275 | A Simple Baseline for Open Vocabulary Semantic Segmentation with Pre-trained Vision-language Model | Mengde Xu (Huazhong University of Science and Tech.); Zheng Zhang (MSRA)*; Fangyun Wei (Microsoft Research Asia); Yutong Lin (Xi’an Jiaotong University); Yue Cao (Microsoft Research); Han Hu (Microsoft Research Asia); Xiang Bai (Huazhong University of Science and Technology) |
6276 | Triangle Attack: A Query-efficient Decision-based Adversarial Attack | Xiaosen Wang (Huazhong University of Science and Technology)*; Zeliang Zhang (Huazhong University of Sci. & Technology); Kangheng Tong (Huazhong University of Science and Technology); Dihong Gong (Tencent AI Lab); Kun He (Huazhong University of Science and Technology); Zhifeng Li (Tencent AI Lab); Wei Liu (Tencent) |
6282 | Tailoring Self-Supervision for Supervised Learning | WonJun Moon (Sungkyunkwan University)*; Jihwan Kim (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University) |
6283 | Difficulty-Aware Simulator for Open Set Recognition | WonJun Moon (Sungkyunkwan University)*; Jun ho Park (Sungkyunkwan university); Hyun Seok Seong (Sungkyunkwan University); Cheol-Ho Cho (Sungkyunkwan University); Jae-Pil Heo (Sungkyunkwan University) |
6287 | Non-Uniform Step Size Quantization for Accurate Post-Training Quantization | Sangyun Oh (UNIST)*; Hyeonuk Sim (UNIST); Jounghyun Kim (UNIST); Jongeun Lee (UNIST) |
6298 | FedVLN: Privacy-preserving Federated Vision-and-Language Navigation | Kaiwen Zhou (University of California, Santa Cruz)*; Xin Eric Wang (University of California, Santa Cruz) |
6305 | Data-free Backdoor Removal Based on Channel Lipschitzness | Runkai Zheng (Chinese University of Hong Kong (Shenzhen)); Rongjun Tang (The Chinese University of Hong Kong, Shenzhen); Jianze Li (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen); Li Liu (Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen)* |
6312 | SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning | Haoran You (Rice University)*; Baopu Li (Baidu ); Zhanyi Sun (Rice University); Xu Ouyang (Rice University); Yingyan Lin (Rice University) |
6316 | PCR-CG: Point Cloud Registration via Deep Explicit Color and Geometry | Yu Zhang (Shanghai Jiao Tong University)*; Yu Junle (Hangzhou Dianzi University); Xiaolin Huang (Shanghai Jiao Tong University); Wenhui Zhou (Hangzhou Dianzi University); Ji Hou (Meta Reality Labs) |
6323 | DistPro: Searching A Fast Knowledge Distillation Process via Meta Optimization | Xueqing Deng (University of California, Merced); Dawei Sun (University of Illinois Urbana-Champaign); Shawn Newsam (UC Merced); Peng Wang (Bytedance USA LLC.)* |
6324 | Tomography of Turbulence Strength Based on Scintillation Imaging | Nir Shaul (Technion)*; Schechner Yoav (Technion) |
6325 | Realistic Blur Synthesis for Learning Image Deblurring | Jaesung Rim (POSTECH); Geonung Kim (POSTECH); Jungeon Kim (POSTECH); Junyong Lee (POSTECH); Seungyong Lee (POSTECH); Sunghyun Cho (POSTECH)* |
6328 | GLAMD: Global and Local Attention Mask Distillation for Object Detectors | YounHo Jang (Kyung Hee University); Wheemyung Shin (Kyung Hee University); Jinbeom Kim (Sungkyunkwan University (SKKU)); Sung-Ho Bae (Kyung Hee University)*; Simon S Woo (Sungkyunkwan University (SKKU)) |
6337 | Meta-GF: Training Dynamic-Depth Neural Networks Harmoniously | Yi Sun (National University of Defense Technology); Jian Li (NUDT); Xin Xu (National University of Defense Technology)* |
6338 | CXR Segmentation by AdaIN-based Domain Adaptation and Knowledge Distillation | Yujin Oh (Kim Jaechul Graduate School of AI, KAIST, Korea); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea)* |
6342 | Emotion-aware Multi-view Contrastive Learning for Facial Emotion Recognition | Daeha Kim (Inha University); Byung Cheol Song (Inha University)* |
6356 | FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection | Danila Rukhovich (Samsung AI Center Moscow); Anna Vorontsova (Samsung AI Center)*; Anton S. Konushin (Samsung AI Center Moscow) |
6365 | Video Dialog as Conversation about Objects Living in Space-Time | Hoang-Anh Pham (Deakin University)*; Thao Minh Le (Deakin University); Vuong Le (Deakin University); Tu Minh Phuong (Posts and Telecommunications Institute of Technology); Truyen Tran (Deakin University) |
6366 | Few-Shot Class-Incremental Learning from an Open-Set Perspective | Can Peng (the University of Queensland)*; Kun Zhao (Sullivan Nicolaides Pathology); Tianren Wang (The University of Queensland); Meng Li (The University of Queensland); Brian C Lovell (University of Queensland) |
6380 | ML-BPM: Multi-teacher Learning with Bidirectional Photometric Mixing for Open Compound Domain Adaptation in Semantic Segmentation | Fei Pan (KAIST)*; Sungsu Hur (KAIST); Seokju Lee (KENTECH); Junsik Kim (Harvard University); In So Kweon (KAIST) |
6389 | DRCNet: Dynamic Image Restoration Contrastive Network | Fei Li (China Agricultural University)*; Lingfeng Shen (Tencent AI Lab); YANG MI (China Agricultural University); Zhenbo Li (China Agricultural University) |
6394 | Order Learning Using Partially Ordered Data via Chainization | Seon-Ho Lee (MCL, Korea University); Chang-Su Kim (Korea university)* |
6395 | Style Your Hair: Latent Optimization for Pose-Invariant Hairstyle Transfer via Local-Style-Aware Hair Alignment | Chaeyeon Chung ( Korea Advanced Institute of Science and Technology)*; Taewoo Kim (Korea Advanced Institute of Science and Technology ); Yoonseo Kim (KAIST); Sunghyun Park (KAIST); Kangyeol Kim (KAIST); Jaegul Choo (Korea Advanced Institute of Science and Technology) |
6403 | High-Resolution Virtual Try-On with Misalignment and Occlusion-Handled Conditions | SangYun Lee (Soongsil University); Gyojung Gu (Korea Advanced Institute of Science and Technology)*; Sunghyun Park (KAIST); Seunghwan Choi (Korea Advanced Institute of Science and Technology ); Jaegul Choo (Korea Advanced Institute of Science and Technology) |
6418 | Zero-Shot Learning for Reflection Removal of Single 360-Degree Image | Byeong-Ju Han (Ulsan National Institute of Science and Technology ); Jae-Young Sim (Ulsan National Institute of Science and Technology)* |
6420 | A Codec Information Assisted Framework for Efficient Compressed Video Super-Resolution | Hengsheng Zhang (Shanghai Jiao Tong University)*; Xueyi Zou (Huawei Noah’s Ark Lab); Jiaming Guo (Huawei Noah’s Ark Lab); Youliang Yan (Huawei Noah’s Ark Lab); Rong Xie (Shanghai Jiao Tong University); Li Song (Shanghai Jiao Tong University) |
6421 | Towards Ultra Low Latency Spiking Neural Networks for Vision and Sequential Tasks Using Temporal Pruning | Sayeed Shafayet Chowdhury (Purdue University)*; Nitin Rathi (Purdue University); Kaushik Roy (Purdue University) |
6439 | MimicME: A Large Scale Diverse 4D Database for Facial Expression Analysis | Athanasios Papaioannou (Huawei)*; Baris Gecer (Huawei); Shiyang Cheng (Samsung); Grigorios Chrysos (EPFL); Jiankang Deng (Imperial College London); Eftychia Fotiadou (Imperial College London); Christos Kampouris (ApolloXR); Dimitrios Kollias (Queen Mary University London); Stylianos Moschoglou (Huawei Technologies Co. Ltd); Kritaphat Songsri-In (Imperial College London); Stylianos Ploumpis (Huawei Technologies Co. Ltd); George Trigeorgis (Imperial College London ); Panagiotis Tzirakis (Imperial College London); Evangelos Ververas (Imperial College London); Yuxiang Zhou (Deepmind, Google); Allan Ponniah (NHS); Anastasios Roussos (Institute of Computer Science, Foundation for Research and Technology Hellas); Stefanos Zafeiriou (Imperial College London) |
6441 | Black-Box Dissector: Towards Erasing-based Hard-Label Model Stealing Attack | Yixu Wang (Xiamen University)*; Jie Li (Xiamen University); Hong Liu (National Institute of Informatics ); Yan Wang (Pinterest); Yongjian Wu (Tencent Technology (Shanghai) Co.,Ltd); Feiyue Huang (Tencent); Rongrong Ji (Xiamen University, China) |
6451 | Video Anomaly Detection by Solving Decoupled Spatio-Temporal Jigsaw Puzzles | Guodong Wang (Beihang University)*; Yunhong Wang (State Key Laboratory of Virtual Reality Technology and System, Beihang University, Beijing 100191, China); Jie Qin (Nanjing University of Aeronautics and Astronautics); Dongming Zhang ( National Computer Network Emergency Response Technical Team/Coordination Center of China ); Xiuguo bao (National Computer Network Emergency Response Technical Team/Coordination Center of China); Di Huang (Beihang University, China) |
6454 | Towards Accurate Network Quantization with Equivalent Smooth Regularizers | Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU)*; Vladimir Chikin (Huawei Noah’s Ark Lab); Ruslan Aydarkhanov (Huawei Noah’s Ark Lab); Dehua Song (Huawei Noah’s Ark Lab); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech)); Jiansheng Wei (Huawei Technologies Co. Ltd.) |
6455 | DiffuseMorph: Unsupervised Deformable Image Registration Using Diffusion Model | Boah Kim (KAIST)*; Inhwa Han (KAIST); Jong Chul Ye (Kim Jaechul Graduate School of AI, KAIST, Korea) |
6459 | An Impartial Take to the CNN vs Transformer Robustness Contest | Francesco Pinto (University of Oxford)*; Philip Torr (University of Oxford); Puneet Dokania (University of Oxford) |
6460 | CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval | Haoran Wang (Baidu)*; Dongliang He (Baidu); Wenhao Wu (Baidu); Boyang Xia (Institute of Computing Technology, Chinese Academy of Science); Min Yang (Baidu); Fu Li (Baidu); Yunlong Yu (Zhejiang University); Zhong Ji (Tianjin University); Errui Ding (Baidu Inc.); Jingdong Wang (Baidu) |
6463 | Weakly Supervised 3D Scene Segmentation with Region-Level Boundary Awareness and Instance Discrimination | Kangcheng LIU (The Chinese University of Hong Kong)*; Yuzhi Zhao (City University of Hong Kong); Qiang Nie (Tencent Youtu Lab); Zhi Gao (NUS); Ben M. Chen (Chinese University of Hong Kong) |
6471 | FOSTER: Feature Boosting and Compression for Class-Incremental Learning | Fu-Yun Wang (Nanjing University)*; Da-Wei Zhou (Nanjing University); Han-Jia Ye (Nanjing University); De-Chuan Zhan (Nanjing University) |
6472 | Delving into Universal Lesion Segmentation: Method, Dataset, and Benchmark | Yu Qiu (Nankai University)*; Jing Xu (Nankai University) |
6475 | Explicit Model Size Control and Relaxation via Smooth Regularization for Mixed-Precision Quantization | Vladimir Chikin (Huawei Noah’s Ark Lab)*; Kirill Solodskikh (Huawei Noah’s Ark Lab, MSU); Irina Zhelavskaya (Skolkovo Institute of Science and Technology (Skoltech)) |
6479 | Large scale Real-world Multi Person Tracking | Bing Shuai (Amazon)*; Alessandro Bergamo (Amazon); Uta Büchler (Amazon); Andrew G Berneshawi (Amazon); Alyssa Boden (Amazon Web Services); Joseph Tighe (Amazon) |
6491 | Class-agnostic Object Detection with Multi-modal Transformer | Muhammad Maaz (MBZUAI)*; Hanoona Abdul Rasheed (MBZUAI); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Ming-Hsuan Yang (University of California at Merced) |
6493 | Language-Grounded Indoor 3D Semantic Segmentation in the Wild | Dávid Rozenberszki (Technische Universitat Munchen)*; Or Litany (Stanford); Angela Dai (Technical University of Munich) |
6505 | Injecting 3D Perception of Controllable NeRF-GAN into StyleGAN for Editable Portrait Image Synthesis | Jeong-gi Kwak (Korea University); Yuanming Li (Korea University); Dongsik Yoon (Korea University); Donghyeon Kim (Korea university); David K Han (Drexel University); Hanseok Ko (Korea University)* |
6512 | BASQ: Branch-wise Activation-clipping Search Quantization for Sub-4-bit Neural Networks | Han-Byul Kim (Seoul National University)*; Eunhyeok Park (POSTECH); Sungjoo Yoo (Seoul National University) |
6513 | AdaNeRF: Adaptive Sampling for Real-time Rendering of Neural Radiance Fields | Andreas Kurz (Graz University of Technology)*; Thomas Neff (Graz University of Technology); Zhaoyang Lv (Facebook); Michael Zollhöfer (Facebook Reality Labs); Markus Steinberger (Graz University of Technology) |
6516 | Neural Light Field Estimation for Street Scenes with Differentiable Virtual Object Insertion | Zian Wang (University of Toronto)*; Wenzheng Chen (University of Toronto); David Acuna (University of Toronto, NVIDIA); Jan Kautz (NVIDIA); Sanja Fidler (University of Toronto, NVIDIA) |
6519 | Tree Structure-Aware Few-Shot Image Classification via Hierarchical Aggregation | Min Zhang (Zhejiang University)*; Siteng Huang (Westlake University); Wenbin Li (Nanjing University); Donglin Wang (Westlake University) |
6526 | PoseScript: 3D Human Poses from Natural Language | Ginger Delmas (NAVER LABS EUROPE)*; Philippe Weinzaepfel (NAVER LABS Europe); Thomas LUCAS (Naver); Francesc Moreno (IRI); Gregory Rogez (NAVER LABS Europe) |
6532 | Learning Energy-Based Models With Adversarial Training | Xuwang Yin (University of Virginia)*; Shiying Li (University of North Carolina, Chapel Hill); Gustavo Rohde (University of Virginia) |
6538 | You Already Have It: A Generator-Free Low-Precision DNN Training Framework using Stochastic Rounding | Geng Yuan (Northeastern University)*; Sung-En Chang (Northeastern University); Qing Jin (Northeastern University); Alec Lu (Simon Fraser University ); Yanyu Li (Northeastern University); Yushu Wu (Northeastern University); Zhenglun Kong (Northeastern University); Yanyue Xie (Northeastern University); Peiyan Dong (Northeastern University); Minghai Qin (Western Digital Research); Xiaolong Ma (Clemson University); Xulong Tang (University of Pittsburgh); Zhenman Fang (Simon Fraser University); Yanzhi Wang (Northeastern University) |
6540 | TIPS: Text-Induced Pose Synthesis | Prasun Roy (University of Technology Sydney)*; Subhankar Ghosh (University of Technology Sydney ); Saumik Bhattacharya (Indian Institute of Technology Kharagpur ); Umapada Pal (Indian Statistical Institute, Kolkata); Michael Blumenstein (University of Technology Sydney) |
6541 | Unsupervised High-Fidelity Facial Texture Generation and Reconstruction | Ron Slossberg (Technion)*; Ibrahim Jubran (The University of Haifa); Ron Kimmel (Technion) |
6551 | Addressing Heterogeneity in Federated Learning via Distributional Transformation | Haolin Yuan (Johns Hopkins University); Bo Hui (Johns Hopkins University); Yuchen Yang (Johns Hopkins University); Philippe Burlina (JHU/APL/CS/SOM); Neil Zhenqiang Gong (Duke University); Yinzhi Cao (JHU)* |
6555 | Adversarial Label Poisoning Attack on Graph Neural Networks via Label Propagation | Ganlin Liu (The University of Liverpool)*; Xiaowei Huang (Liverpool University); Xinping Yi (University of Liverpool) |
6559 | Approximate Discrete Optimal Transport Plan with Auxiliary Measure Method | Dongsheng An (Stony Brook University)*; Na Lei (Dalian University of Technology); Xianfeng GU (Stony Brook University) |
6560 | Visual Knowledge Tracing | Neehar Kondapaneni (Caltech)*; Pietro Perona (California Institute of Technology); Oisin Mac Aodha (University of Edinburgh) |
6562 | Semi-Leak: Membership Inference Attacks Against Semi-supervised Learning | Xinlei He (CISPA Helmholtz Center for Information Security)*; Hongbin Liu (Duke University); Neil Zhenqiang Gong (Duke University); Yang Zhang (CISPA Helmholtz Center for Information Security) |
6565 | DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation | Jaewoo Park (Seoul National University); Nam Ik Cho (Seoul National University)* |
6567 | Accurate Detection of Proteins in Cryo-Electron Tomograms from Sparse Labels | Qinwen Huang (Duke University)*; Alberto Bartesaghi (Duke University); Ye Zhou (Duke University); Hsuan-Fu Liu (Duke University) |
6576 | Subspace Diffusion Generative Models | Bowen Jing (Massachusetts Institute of Technology)*; Gabriele Corso (MIT); Renato Berlinghieri (MIT); Tommi Jaakkola (MIT) |
6583 | Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | Byeonghu Na (KAIST); Yoonsik Kim (Clova AI Research, NAVER Corp.); Sungrae Park (Upstage AI Research, Upstage AI)* |
6592 | Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments | Khoi D. Nguyen (VinAI Research)*; Quoc-Huy Tran (Retrocausal, Inc.); Khoi Nguyen (VinAI Research); Binh-Son Hua (VinAI Research); Rang NGUYEN (VinAI Research) |
6599 | Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | Kyle Min (Intel Labs); Sourya Roy (University of California, Riverside); Subarna Tripathi (Intel Labs)*; Tanaya Guha (University of Glasgow); Somdeb Majumdar (Intel Labs) |
6602 | Relative Contrastive Loss for Unsupervised Representation Learning | Shixiang Tang (The University of Sydney)*; Feng Zhu (University of Science and Technology of China); Lei Bai (Shanghai AI Laboratory); Rui Zhao (SenseTime Group Limited); Wanli Ouyang (The University of Sydney) |
6615 | Personalized Education: Blind Knowledge Distillation | Xiang Deng (State University of New York at Binghamton)*; Jian Zheng (Amazon); Zhongfei Zhang (Binghamton University) |
6619 | Fast Two-View Motion Segmentation Using Christoffel Polynomials | Bengisu Ozbay (Northeastern University); Octavia Camps (Northeastern University); Mario Sznaier (Northeastern University)* |
6623 | Real Spike: Learning Real-valued Spikes for Spiking Neural Networks | Yufei Guo (The Second Academy of China Aerospace Science and Industry Corporation)*; Liwen Zhang (X Lab, the Second Academy of CASIC, Beijing); Yuanpei Chen (X LAB,The Second Academy of CASIC,Beijing); Xinyi Tong (The Second Academy of China Aerospace Science and Industry Corporation); Xiaode Liu (X Lab, The Second Academy of China Aerospace Science and Industry Corporation); YingLei Wang (CASIC); Xuhui Huang (X Lab, The Second Academy of CASIC); Zhe Ma (Xlab, the Second Academy of CASIC, Beijing) |
6627 | Language-Driven Artistic Style Transfer | Tsu-Jui Fu (UCSB)*; Xin Eric Wang (University of California, Santa Cruz); William Yang Wang (UC Santa Barbara) |
6634 | FedLTN: Federated Learning for Sparse and Personalized Lottery Ticket Networks | Vaikkunth Mugunthan (DynamoFL)*; Eric Lin (DynamoFL); Vignesh Gokul (University of California San Diego); Christian Lau (DynamoFL); Lalana Kagal (MIT); Steve Pieper (Isomics, Inc.) |
6639 | Transformer with Implicit Edges for Particle-based Physics Simulation | Yidi Shao (Nanyang Technological University)*; Chen Change Loy (Nanyang Technological University); Bo Dai (Shanghai AI Lab) |
6651 | Improving the Perceptual Quality of 2D Animation Interpolation | Shuhong Chen (University of Maryland – College Park)*; Matthias Zwicker (University of Maryland) |
6652 | Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning | Tao He (Monash University); Lianli Gao (The University of Electronic Science and Technology of China); Jingkuan Song (UESTC); Yuan-Fang Li (Monash University)* |
6655 | S3C: Self-Supervised Stochastic Classifiers for Few-Shot Class-Incremental Learning | Jayateja Kalla (Indian Institute of Science); Soma Biswas (Indian Institute of Science, Bangalore)* |
6660 | Entry-Flipped Transformer for Inference and Prediction of Participant Behavior | BO HU (Nanyang Technological University)*; Tat-Jen Cham (Nanyang Technological University) |
6665 | OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning | Mamshad Nayeem Rizve (University of Central Florida)*; Navid Kardan (University of Central Florida); Salman Khan (MBZUAI/ANU); Fahad Shahbaz Khan (MBZUAI); Mubarak Shah (University of Central Florida) |
6666 | Fine-grained Fashion Representation Learning by Online Deep Clustering | Yang Jiao (Amazon)*; Ning Xie (Amazon); Yan Gao (Amazon); Chien-Chih Wang (Amazon); Yi Sun (Amazon) |
6667 | Perspective Phase Angle Model for Polarimetric 3D Reconstruction | Guangcheng Chen (Guangdong University of Technology)*; Li He (Southern University of Science and Technology); Yisheng Guan (Guangdong University of Technology); Hong Zhang (University of Alberta) |
6670 | Selective TransHDR: Transformer-based selective HDR Imaging using Ghost Region Mask | Jou Won Song (Sogang University); Ye-In Park (Sogang University); Kyeongbo Kong (Pukyong National University); Jaeho Kwak (Sogang University); Suk-Ju Kang (Sogang University)* |
6671 | 3D Interacting Hand Pose Estimation by Hand De-occlusion and Removal | Hao Meng (BeiHang University); Sheng Jin (The University of Hong Kong)*; Wentao Liu (Sensetime); Chen Qian (SenseTime); Mengxiang Lin (Beihang University); Wanli Ouyang (The University of Sydney); Ping Luo (The University of Hong Kong) |
6678 | Recover Fair Deep Classification Models via Altering Pre-trained Structure | Yanfu Zhang (University of Pittsburgh)*; Shangqian Gao (University of Pittsburgh); Heng Huang (University of Pittsburgh) |
6680 | Improving Fine-Grained Visual Recognition in Low Data Regimes via Self-Boosting Attention Mechanism | Yangyang Shu (University of Adelaide); Lingqiao Liu (University of Adelaide)*; Baosheng Yu (The University of Sydney); Haiming Xu (The University of Adelaide) |
6686 | VSA: Learning Varied-Size Window Attention in Vision Transformers | Qiming Zhang (The University of Sydney)*; YUFEI XU (University of sydney); Jing Zhang (The University of Sydney); Dacheng Tao (JD.com) |
6693 | PoseGPT: Quantization-based 3D Human Motion Generation and Forecasting | Thomas LUCAS (Naver)*; Fabien Baradel (Naver Labs Europe); Philippe Weinzaepfel (NAVER LABS Europe); Gregory Rogez (NAVER LABS Europe) |
6694 | CAViT: Contextual Alignment Vision Transformer for Video Object Re-identification | Jinlin Wu (Institute of Automation, Chinese Academy of Sciences, Beijing, China)*; He Lingxiao (NLPR, CRIPAC); Wu Liu (AI Research of JD.com); Yang Yang (Institute of Automation, Chinese Academy of Sciences); Zhen Lei (NLPR, CASIA, China); Tao Mei (AI Research of JD.com); Stan Z. Li (Westlake University) |
6698 | Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution | Cheng Ma (Tsinghua University); Jingyi Zhang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
6715 | Frozen CLIP Models are Efficient Video Learners | Ziyi Lin (The Chinese University of Hong Kong)*; Shijie Geng (Rutgers University); Renrui Zhang (Shanghai AI Lab); Peng Gao (Chinese university of hong kong); Gerard de Melo (Hasso Plattner Institute); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Jifeng Dai (SenseTime); Yu Qiao (Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences); Hongsheng Li (The Chinese University of Hong Kong) |
6719 | Deforming Radiance Fields with Cages | Tianhan Xu (The University of Tokyo)*; Tatsuya Harada (The University of Tokyo / RIKEN) |
6720 | GeoAug: Data Augmentation for Few-Shot NeRF with Geometry Constraints | Di Chen (Alibaba Group)*; Yu Liu (Alibaba Group); Lianghua Huang (Alibaba Group); Bin Wang (Alibaba Group); Pan Pan (Alibaba Group) |
6722 | DoodleFormer: Creative Sketch Drawing with Transformers | Ankan Kumar Bhunia (MBZUAI)*; Salman Khan (MBZUAI/ANU); Hisham Cholakkal (MBZUAI); Rao Muhammad Anwer (MBZUAI/AALTO); Fahad Shahbaz Khan (MBZUAI); Jorma Laaksonen (Aalto University); Michael Felsberg (Linköping University) |
6727 | Implicit Neural Representations for Variable Length Human Motion Generation | Pablo Alberto Cervantes Baque (Tokyo Institute of Technology)*; Yusuke Sekikawa (Denso IT Laboratory); Ikuro Sato (Tokyo Institute of Technology / Denso IT Laboratory); Koichi SHINODA (Tokyo Institute of Technology) |
6730 | FLEX: Extrinsic Parameters-free Multi-view 3D Human Motion Reconstruction | Brian Gordon (Tel Aviv University); Sigal Raab (Tel Aviv University)*; Guy Azov (Tel Aviv University); Raja Giryes (Tel Aviv University); Danny Cohen-Or (Tel Aviv University) |
6731 | Pairwise Contrastive Learning Network for Action Quality Assessment | Mingzhe Li (Huaqiao University); Hong-Bo Zhang (Huaqiao University)*; Qing Lei (Huaqiao University); Zongwen Fan (Huaqiao University); Jinghua Liu (Huaqiao University); Ji-Xiang Du (Huaqiao University) |
6742 | Large-displacement 3D Object Tracking with Hybrid Non-local Optimization | Xuhui Tian (Shandong University)*; Xinran Lin (Shandong University); Fan Zhong (Shandong University); Xueying Qin (Shandong University) |
6745 | Learning Object Placement via Dual-path Graph Completion | Siyuan Zhou (Shanghai Jiao Tong University)*; Liu Liu (Shanghai Jiao Tong University); Li Niu (Shanghai Jiao Tong University); Liqing Zhang (Shanghai Jiao Tong University) |
6777 | Unbiased Manifold Augmentation for Coarse Class Subdivision | Baoming Yan (Alibaba Group)*; KE GAO (alibaba-inc); Bo Gao (Alibaba Group); Lin Wang (Alibaba-inc); Jiang Yang (Alibaba Group); Xiaobo Li (Alibaba) |
6798 | Rethinking Video Rain Streak Removal: A New Synthesis Model and A Deraining Network with Video Rain Prior | Shuai Wang ( College of Intelligence and Computing, Tianjin University); Lei Zhu (The Hong Kong University of Science and Technology (Guangzhou))*; Huazhu Fu (IHPC, ASTAR); Jing Qin (The Hong Kong Polytechnic University); Carola-Bibiane B Schönlieb (Cambridge University); Wei Feng (School of Computer Science and Technology, Tianjin University); Song Wang (University of South Carolina) |
6817 | Expanded Adaptive Scaling Normalization for End to End Image Compression | Chajin Shin (Yonsei University)*; Hyeongmin Lee (Yonsei University ); Hanbin Son (Yonsei Univ.); Sangjin Lee (Yonsei University); Dogyoon Lee (Yonsei University); Sangyoun Lee (Yonsei University) |
6827 | Embedding contrastive unsupervised features to cluster in- and out-of-distribution noise in corrupted image datasets | Paul Albert (Insight Centre for Data Analytics (DCU))*; Eric Arazo (Insight Centre for Data Analytics (DCU)); Noel O Connor (Home); Kevin McGuinness (DCU) |
6835 | Filter Pruning via Feature Discrimination in Deep Neural Networks | Zhiqiang He (Zhejiang University of Science and Technology)*; Yaguan QIAN (Zhejiang University of Science and Technology); Yuqi Wang (Zhejiang University of Science and Technology); Bin WANG (Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co.); Xiaohui Guan (Zhejiang University of Water Resources and Electric Power); Zhaoquan Gu (Guangzhou University); Xiang Ling (Institute of Software, Chinese Academy of Sciences); Shaoning Zeng (Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China); Haijiang Wang (Zhejiang University of Science and Technology); Wujie Zhou (Zhejiang University of Science and Technology) |
6836 | VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer | Juan Felipe Montesinos (Universitat Pompeu Fabra)*; Venkatesh Shenoy Kadandale (Universitat Pompeu Fabra); Gloria Haro (Universitat Pompeu Fabra) |
6837 | SGBANet: Semantic GAN and Balanced Attention Network for Arbitrarily Oriented Scene Text Recognition | Dajian Zhong (East China Normal University)*; Shujing Lv (East China Normal University); Palaiahnakote Shivakumara (University of Malaya); Bing Yin (IFLYTEK Co.,Ltd); Jiajia Wu (IFLYTEK Co.,Ltd); Umapada Pal (Indian Statistical Institute, Kolkata); Yue Lu (East China Normal University) |
6838 | DenseHybrid: Hybrid Anomaly Detection for Dense Open-set Recognition | Matej Grcić (University of Zagreb, Faculty of Electrical Engineering and Computing)*; Petra Bevandić (Faculty of Electrical Engineering and Computing); Sinisa Segvic (UniZg-FER) |
6862 | D2-TPred: Discontinuous Dependency for Trajectory Prediction under Traffic Lights | Yuzhen Zhang (Zhengzhou University); Wentong Wang (Zhengzhou University); weizhi guo (zhengzhou university); Pei Lv (Zhengzhou University)*; Mingliang Xu (Zhengzhou University); Wei Chen (State Key Lab of CAD&CG, Zhejiang University); Dinesh Manocha (University of Maryland at College Park) |
6867 | Where in the World is this Image? Transformer-based Geo-localization in the Wild | Shraman Pramanick (Johns Hopkins University)*; Ewa M Nowara (Meta Reality Labs); Joshua Gleason (Univ of Maryland); Carlos Castillo (Johns Hopkins University); Rama Chellappa (Johns Hopkins University) |
6884 | MODE: Multi-view Omnidirectional Depth Estimation with 360-degree Cameras | Ming Li (NanJing University)*; Xueqian Jin (Nanjing University); Xuejiao Hu (Nanjing University); Jingzhao Dai (Nanjing University); Sidan Du (Nanjing University); Yang Li (NanJing University) |
6895 | NashAE: Disentangling Representations through Adversarial Covariance Minimization | Eric C Yeats (Duke University)*; Frank Liu (Oak Ridge National Lab); David Womble (Oak Ridge National Laboratory); Hai Li (Duke University) |
6900 | Rethinking Confidence Calibration for Failure Prediction | Fei Zhu (Institute of Automation of Chinese Academy of Sciences)*; Zhen Cheng (Institute of Automation of Chinese Academy of Sciences); Xu-Yao Zhang (Institute of Automation of Chinese Academy of Sciences); Cheng-Lin Liu (Institute of Automation of Chinese Academy of Sciences) |
6905 | Colorization for in situ marine plankton images | Guannan Guo (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Qi Lin (Xiamen University); Tao Chen (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences); Zhenghui Feng (Harbin Institute of Technology, Shenzhen); Zheng Wang (Shenzhen Institutes of Advanced Technology); Jianping Li (Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences)* |
6912 | PIP: Physical Interaction Prediction via Mental Simulation with Span Selection | Jiafei Duan (University of Washington, Seattle)*; Samson Yu (Agency for Science, Technology and Research); Soujanya Poria (Singapore University of Technology and Design); Bihan Wen (Nanyang Technological University); Cheston Tan (Institute for Infocomm Research, Singapore) |
6917 | Generator Knows What Discriminator Should Learn in Unconditional GANs | Gayoung Lee (NAVER AI Lab)*; Hyunsu Kim (NAVER AI Lab); Junho Kim (NAVER AI Lab); Seonghyeon Kim (Clova AI Research, NAVER Corp.); Jung-Woo Ha (NAVER CLOVA AI Lab); Yunjey Choi (NAVER AI Lab) |
6921 | A Gyrovector Space Approach for Symmetric Positive Semi-definite Matrix Learning | Xuan Son Nguyen (Ensea)* |
6940 | Compositional Visual Generation with Composable Diffusion Models | Nan Liu (University of Illinois at Urbana-Champaign); Shuang Li (MIT); Yilun Du (MIT)*; Antonio Torralba (MIT); Joshua Tenenbaum (MIT) |
6942 | Temporal and cross-modal attention for audio-visual zero-shot learning | Otniel-Bogdan Mercea (University of Tübingen)*; Thomas Hummel (University of Tübingen); A. Sophia Koepke (University of Tübingen); Zeynep Akata (University of Tübingen) |
6946 | Telepresence Video Quality Assessment | Zhenqiang Ying (The University of Texas at Austin)*; Deepti Ghadiyaram (Facebook); Alan Bovik (University of Texas at Austin) |
6955 | Enhancing Multi-modal Features Using Local Self-attention for 3D Object Detection | Hao Li (Hikvision Digital Technology Co. Ltd)*; Zehan Zhang (Shanghai Jiao Tong University & Hangzhou Hikvision Digital Technology Co. Ltd); Zhao Xian (Hikvision); Yulong Wang (Hikvision Digital Technology Co. Ltd); Yuxi Shen (Hikvision); Shiliang Pu (Hikvision Research Institute); Hui Mao (Hangzhou Hikvision Digital Technology Co., Ltd) |
6956 | Totems: Physical Objects for Verifying Visual Integrity | Jingwei Ma (University of Washington)*; Lucy Chai (MIT); Minyoung Huh (MIT); Tongzhou Wang (MIT); Ser-Nam Lim (Meta AI); Phillip Isola (MIT); Antonio Torralba (MIT) |
6959 | ManiFest: manifold deformation for few-shot image translation | Fabio Pizzati (Inria / Vislab)*; Jean-Francois Lalonde (Université Laval); Raoul de Charette (Inria) |
6963 | 3D Shape Sequence of Human Comparison and Classification using Current and Varifolds | Emery Pierson (Université de Lille)*; Mohamed Daoudi (IMT Lille Douai); Sylvain Arguillere (Institute Camille Jordan) |
6971 | Decouple-and-Sample: Protecting sensitive information in task agnostic data release | Abhishek Singh (MIT)*; Ethan Garza (MIT); Ayush Chopra (MIT); Praneeth Vepakomma (MIT); Vivek Sharma (MIT); Ramesh Raskar (Massachusetts Institute of Technology) |
6972 | Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space | Wenqi Shao (The Chinese University of Hong Kong)*; Xun Zhao (Tencent Company); Yixiao Ge (Tencent); Zhaoyang Zhang (The Chinese University of Hong Kong); Lei Yang (Tencent); Xiaogang Wang (Chinese University of Hong Kong, Hong Kong); Ying Shan (Tencent); Ping Luo (The University of Hong Kong) |
6973 | Object Detection as Probabilistic Set Prediction | Georg Hess (Chalmers University of Technology)*; Christoffer Petersson (Zenseact); Lennart Svensson (Chalmers University of Technology) |
6974 | k-SALSA: k-anonymous synthetic averaging of retinal images via local style alignment | Minkyu Jeon (Korea University)*; Hyeonjin Park (Korea University); Hyunwoo J Kim (Korea University); Michael G Morley (Ophthalmic Consultants of Boston); Hyunghoon Cho (Broad Institute of MIT and Harvard) |
6976 | Uncertainty-guided Source-free Domain Adaptation | Subhankar Roy (University of Trento)*; Martin Trapp (Aalto University); Andrea Pilzer (Aalto University); Juho Kannala (Aalto University, Finland); Nicu Sebe (University of Trento); Elisa Ricci (University of Trento); Arno Solin (Aalto University) |
6978 | LA3: Efficient Label-Aware AutoAugment | Mingjun Zhao (University of Alberta)*; Shan Lu (University of Alberta); Zixuan Wang (Tencent Inc.); Xiaoli Wang (Tencent); Di Niu (University of Alberta) |
6982 | Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions | Zhi Li (University of California, Berkeley)*; Lu He (Tencent America); Huijuan Xu (Pennsylvania State University) |
6986 | Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos | Tanqiu Qiao (Durham University); Qianhui Men (University of Oxford); Frederick W. B. Li (University of Durham); Yoshiki Kubotani (Waseda University); Shigeo Morishima (Waseda Research Institute for Science and Engineering); Hubert P. H. Shum (Durham University)* |
6990 | FEAR: Fast, Efficient, Accurate and Robust Visual Tracker | Vasyl Borsuk (Ukrainian Catholic University); Roman Vei (Ukrainian Catholic University); Orest Kupyn (Ukrainian Catholic University); Tetiana Martyniuk (Ukrainian Catholic University)*; Igor Krashenyi (Piñata Farms); Jiri Matas (CMP CTU FEE) |
6997 | Variance-Aware Weight Initialization for Point Convolutional Neural Networks | Pedro Hermosilla Casajus (Ulm University)*; Michael Schelling (Ulm University – Institute of Media Informatics); Tobias Ritschel (UCL); Timo Ropinski (Ulm University) |
7004 | Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training | Haoxuan You (Columbia University)*; Luowei Zhou (Microsoft); Bin Xiao (Microsoft); Noel C Codella (Microsoft); Yu Cheng (Microsoft Research); Ruochen Xu (Microsoft); Shih-Fu Chang (Columbia University); Lu Yuan (Microsoft) |
7016 | Single-Stream Multi-Level Alignment for Vision-Language Pretraining | Zaid Khan (Northeastern University)*; Vijay Kumar B G (NEC Laboratories America); Xiang Yu (NEC Labs); Samuel Schulter (NEC Laboratories America); Manmohan Chandraker (UC San Diego); YUN FU (Northeastern University) |
7022 | Revisiting Outer Optimization in Adversarial Training | Ali Dabouei (West Virginia University)*; Fariborz Taherkhani (Carnegie Mellon University); Sobhan Soleymani (West Virginia University); Nasser Nasrabadi (West Virginia University) |
7027 | Supervised Attribute Information Removal and Reconstruction for Image Manipulation | Nannan Li (Boston University)*; Bryan Plummer (Boston University) |
7028 | Conditional-Flow NeRF: Accurate 3D Modelling with Reliable Uncertainty Quantification | Jianxiong Shen (IRI, CSIC-UPC)*; Antonio Agudo (Institut de Robotica i Informatica Industrial, CSIC-UPC); Francesc Moreno (IRI); Adria Ruiz (Seedtag) |
7035 | BLT: Bidirectional Layout Transformer for Controllable Layout Generation | Xiang Kong (Carnegie Mellon University)*; Lu Jiang (Google Research); Huiwen Chang (Google); Han Zhang (Google); Yuan Hao (Google); Haifeng Gong (Google Inc.); Irfan Essa (Google) |
7039 | Neural Correspondence Field for Object Pose Estimation | Lin Huang (University at Buffalo); Tomas Hodan (Facebook Reality Labs)*; Lingni Ma (Facebook Reality Labs); Linguang Zhang (Facebook Reality Labs); Luan Tran (Facebook); Christopher D Twigg (Meta); Po-Chen Wu (Meta Inc.); Junsong Yuan (State University of New York at Buffalo, USA); Cem Keskin (Facebook); Robert Wang (Facebook Reality Labs) |
7043 | The Missing Link: Finding label relations across datasets | Jasper Uijlings (Google Research)*; Thomas Mensink (Google Research); Vittorio Ferrari (Google Research) |
7044 | On Label Granularity and Object Localization | Elijah Cole (Caltech)*; Kimberly Wilber (Google); Grant Van Horn (Cornell University); Xuan Yang (Google); Marco Fornoni (Google); Pietro Perona (California Institute of Technology); Serge Belongie (University of Copenhagen); Andrew Howard (Google); Oisin Mac Aodha (University of Edinburgh) |
7045 | RadioTransformer: A Cascaded Global-Focal Transformer for Visual Attention-guided Disease Classification | Moinak Bhattacharya (Stony Brook University)*; Shubham Jain (Stony Brook University); Prateek Prasanna (Stony Brook University) |
7048 | OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search | Sanghoon Lee (Yonsei University); Youngmin Oh (Yonsei University); Donghyeon Baek (Yonsei University); Junghyup Lee (Yonsei University); Bumsub Ham (Yonsei University)* |
7050 | Most and Least Retrievable Images in Visual-Language Query Systems | Liuwan Zhu (Old Dominion University)*; Rui Ning (Old Dominion University); Jiang Li (Old Dominion University); Chunsheng Xin (Old Dominion University); Hongyi Wu (University of Arizona) |
7051 | Contrasting quadratic assignments for set-based representation learning | Artem Moskalev (University of Amsterdam)*; Ivan Sosnovik (University of Amsterdam); Volker Fischer (Bosch Center for Artificial Intelligence); Arnold W.M. Smeulders (University of Amsterdam) |
7061 | How stable are Transferability Metrics evaluations? | Andrea Agostinelli (Google)*; Michal Pandy (University of Cambridge); Jasper Uijlings (Google Research); Thomas Mensink (Google Research); Vittorio Ferrari (Google Research) |
7070 | A Comparative Study of Graph Matching Algorithms in Computer Vision | Stefan Haller (Heidelberg University)*; Lorenz Feineis (Heidelberg University); Lisa Hutschenreiter (Heidelberg University); Florian Bernard (University of Bonn); Carsten Rother (University of Heidelberg); Dagmar Kainmueller (MDC); Paul Swoboda (MPI fuer Informatik, Saarbruecken); Bogdan Savchynskyy (Heidelberg University) |
7077 | HM: Hybrid Masking for Few-Shot Segmentation | Seonghyeon Moon (Rutgers University)*; Samuel S Sohn (Rutgers University); Honglu Zhou (Rutgers University); Sejong Yoon (The College of New Jersey); Vladimir Pavlovic (Rutgers University); Muhammad Haris Khan (Muhammad Bin Zayed University of Artificial Intelligence); Mubbasir Kapadia (Rutgers) |
7082 | UCTNet: Uncertainty-aware Cross-modal Transformer Network for Indoor RGB-D Semantic Segmentation | Xiaowen Ying (Lehigh University)*; Mooi Choo Chuah (Lehigh University) |
7090 | Learning Omnidirectional Flow in 360° Video via Siamese Representation | Keshav Bhandari (Texas State University)*; Bin Duan (Illinois Institute of Technology); Gaowen Liu (Cisco Research); Hugo M Latapie (Cisco); Ziliang Zong (Texas State University); Yan Yan (Illinois Institute of Technology) |
7093 | Improving Generalization in Federated Learning by Seeking Flat Minima | Debora Caldarola (Politecnico di Torino)*; Barbara Caputo (Politecnico di Torino); Marco Ciccone (Politecnico di Torino) |
7099 | Efficient Deep Visual and Inertial Odometry with Adaptive Visual Modality Selection | Mingyu Yang (University of Michigan)*; Yu Chen (University of Michigan); Hun Seok Kim (Nil) |
7102 | MultiMAE: Multi-modal Multi-task Masked Autoencoders | Roman Bachmann (EPFL)*; David Mizrahi (EPFL); Andrei Atanov (EPFL); Amir Zamir (Swiss Federal Institute of Technology (EPFL)) |
7110 | GigaDepth: Learning Depth from Structured Light with Branching Neural Networks | Simon Schreiberhuber (TU Wien)*; Jean-Baptiste Weibel (TU Wien); Timothy Patten (University of Technology Sydney); Markus Vincze (TU Wien) |
7122 | Diverse Generation from a Single Video Made Possible | Niv Haim (Weizmann Institute of Science)*; Ben Feinstein (Weizmann Institute of Science); Niv Granot (Weizmann Institute of Science); Assaf Shocher (Weizmann Institute of Science); Shai Bagon (Weizmann Institute of Science); Tali Dekel (Weizmann Institute of Science); Michal Irani (Weizmann Institute, Israel) |
7127 | Privacy-Preserving Action Recognition via Motion Difference Quantization | Sudhakar Kumawat (Osaka University)*; Hajime Nagahara (Osaka University) |
7139 | Learning Phase Mask for Privacy-Preserving Passive Depth Estimation | Zaid Tasneem (Rice University); Giovanni Milione (4 independence Way, Princeton, NJ 08540); Yi-Hsuan Tsai (Phiar Technologies); Xiang Yu (NEC Labs); Ashok Veeraraghavan (Rice University); Manmohan Chandraker (UC San Diego); Francesco Pittaluga (NEC Laboratories America)* |
7143 | DuelGAN: A Duel Between Two Discriminators Stabilizes the GAN Training | Jiaheng Wei (UCSC)*; Minghao Liu (UCSC); Jiahao Luo (UCSC); Andrew Zhu (UCSC); James E Davis (UC Santa Cruz); Yang Liu (UC Santa Cruz) |
7151 | Should All Proposals be Treated Equally in Object Detection? | Yunsheng Li (UCSD)*; Yinpeng Chen (Microsoft); Xiyang Dai (Microsoft); DongDong Chen (Microsoft Cloud AI); Mengchen Liu (Microsoft); Pei Yu (); Ying Jin (Microsoft); Lu Yuan (Microsoft); Zicheng Liu (Microsoft); Nuno Vasconcelos (UC San Diego) |
7153 | Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps | Alireza Ganjdanesh (University of Pittsburgh); Shangqian Gao (University of Pittsburgh); Heng Huang (University of Pittsburgh)* |
7158 | Out-of-Distribution Identification: Let Detector Tell Which I Am Not Sure | Ruoqi Li (SJTU); Chongyang Zhang (Shanghai Jiao Tong University)*; Hao Zhou (Shanghai Jiao Tong University); Chao Shi (Shanghai Jiao Tong University); Yan Luo (Shanghai Jiao Tong University) |
7167 | Unsupervised Few-Shot Image Classification by Learning Features into Clustering Space | Shuo Li (Xidian University); Fang Liu (Xidian University)*; Zehua Hao (Xidian University); Kaibo Zhao (Xidian University); Licheng Jiao (Xidian University) |
7173 | ViP: Unified Certified Detection and Recovery for Patch Attack with Vision Transformers | Junbo Li (UC Santa Cruz); Huan Zhang (UCLA); Cihang Xie (University of California, Santa Cruz)* |
7174 | Panoramic Vision Transformer for Saliency Detection in 360 Videos | Heeseung Yun (Seoul National University)*; Sehun Lee (Seoul National University); Gunhee Kim (Seoul National University) |
7175 | ActiveNeRF: Learning where to See with Uncertainty Estimation | Xuran Pan (Tsinghua University); Zihang Lai (CMU); Shiji Song (Department of Automation, Tsinghua University); Gao Huang (Tsinghua)* |
7176 | incDFM: Incremental Deep Feature Modeling for Continual Novelty Detection | Amanda S Rios (University of Southern California; Intel)*; Nilesh A Ahuja (Intel); Ibrahima Ndiour (Intel); Ergin U Genc (Intel); Laurent Itti (University of Southern California); Omesh Tickoo (Intel) |
7186 | BA-Net: Bridge Attention for Deep Convolutional Neural Networks | Yue Zhao (Sun Yat-sen University); Junzhou Chen (Sun Yat-sen University)*; Zhang Zirui (Sun Yat-sen University); Ronghui Zhang (Sun Yat-Sen University) |
7199 | Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images | Jinjin Gu (The University of Sydney)*; Haoming Cai (University of Maryland, College Park); Chenyu Dong (Graduate School at Shenzhen, Tsinghua University); Ruofan Zhang (Tsinghua University); Yulun Zhang (ETH Zurich); Wenming Yang (Tsinghua University); Chun Yuan (Graduate School at Shenzhen, Tsinghua University) |
7210 | Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance | Zhihang Zhong (The University of Tokyo); Xiao Sun (Microsoft Research Asia); Zhirong Wu (Microsoft Research); Yinqiang Zheng (The University of Tokyo); Stephen Lin (Microsoft Research)*; Imari Sato (National Institute of Informatics) |
7211 | Zero-Shot Attribute Attacks on Fine-Grained Recognition Models | Nasim Shafiee (Northeastern University)*; Ehsan Elhamifar (Northeastern University) |
7214 | Break and Make: Interactive Structural Understanding Using LEGO Bricks | Aaron T Walsman (University of Washington)*; Muru Zhang (University of Washington); Klemen Kotar (Allen Institute for AI); Karthik Desingh (University of Washington); Dieter Fox (NVIDIA Research / University of Washington); Ali Farhadi (University of Washington, Allen Institute for AI, Apple) |
7218 | PoserNet: Refining Relative Camera Poses Exploiting Object Detections | Matteo Taiana (Istituto Italiano di Tecnologia)*; Matteo Toso (Istituto Italiano di Tecnologia); Stuart James (Istituto Italiano di Tecnologia (IIT)); Alessio Del Bue (Istituto Italiano di Tecnologia (IIT)) |
7224 | Towards Effective and Robust Neural Trojan Defenses via Input Filtering | Kien Duc Do (Deakin University)*; Haripriya Harikumar (Deakin University); Hung Le (Deakin University); Dung Nguyen (Deakin University); Truyen Tran (Deakin University); Santu Rana (Deakin University, Australia); Dang Nguyen (Deakin University); Willy Susilo (University of Wollongong); Svetha Venkatesh (Deakin University) |
7230 | View Vertically: A Hierarchical Network for Trajectory Prediction via Fourier Spectrums | Conghao Wong (Huazhong University of Science and Technology); Beihao Xia (Huazhong University of Science and Technology); Ziming Hong (Huazhong University of Science and Technology); Qinmu Peng (Huazhong University of Science and Technology); Wei Yuan (Huazhong University of Science and Technology); Qiong Cao (JD.com); Yibo Yang (Peking University); Xinge YOU (Huazhong University of Science and Technology)* |
7238 | Bi-directional Contrastive Learning for Domain Adaptive Semantic Segmentation | Geon Lee (Yonsei University); Chanho Eom (Yonsei University); Wonkyung Lee (PS Analytics); Hyekang Park (Yonsei University); Bumsub Ham (Yonsei University)* |
7277 | Rayleigh EigenDirections (REDs): Nonlinear GAN latent space traversals for multidimensional features | Guha Balakrishnan (Rice University)*; Raghudeep Gadde (Amazon); Aleix M Martinez (Amazon); Pietro Perona (Amazon Web Services (AWS)) |
7278 | ActionFormer: Localizing Moments of Actions with Transformers | Chen-Lin Zhang (4Paradigm, Inc); Jianxin Wu (Nanjing University); Yin Li (University of Wisconsin-Madison)* |
7281 | Theoretical Understanding of the Information Flow on Continual Learning Performance | Joshua J Andle (University of Maine); Salimeh Yasaei Sekeh (University of Maine)* |
7283 | 3DG-STFM: 3D Geometric Guided Student-Teacher Feature Matching | Runyu Mao (Purdue University)*; Chen Bai (Xpeng Motors); Yatong An (xm); Fengqing Maggie Zhu (Purdue University, USA); Cheng Lu (Xiaopeng) |
7288 | Pure Transformer with Integrated Experts for Scene Text Recognition | Yew Lee Tan (Nanyang Technological University)*; Wai-Kin Adams Kong (Nanyang Technological University); Jung Jae Kim (I2R) |
7301 | AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation | Efthymios Tzinis (University of Illinois at Urbana-Champaign); Scott Wisdom (Google)*; Tal Remez (Google); John Hershey (Google) |
7304 | Bridging the Domain Gap towards Generalization in Automatic Colorization | Hyejin Lee (Kookmin University); Daehee Kim (Naver Corp.); Daeun Lee (Korea University); Jinkyu Kim (Korea University); Jaekoo Lee (Kookmin University)* |
7311 | Learning with Free Object Segments for Long-Tailed Instance Segmentation | Cheng Zhang (Carnegie Mellon University)*; Tai-Yu Pan (The Ohio State University); Tianle Chen (The Ohio State University); Jike Zhong (The Ohio State University); Wenjin Fu (The Ohio State University); Wei-Lun Chao (The Ohio State University) |
7315 | Rethinking Closed-loop Training for Autonomous Driving | Chris Zhang (Waabi / University of Toronto)*; Runsheng Guo (University of Waterloo); Wenyuan Zeng (Waabi, University of Toronto); Yuwen Xiong (University of Toronto); Binbin Dai (Waabi); Rui Hu (Waabi); Mengye Ren (NYU / Google); Raquel Urtasun (Uber ATG) |
7331 | Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction | YuXuan Liu (Covariant.ai, UC Berkeley)*; Nikhil Mishra (Covariant.ai, UC Berkeley); Maximilian Sieb (Covariant.ai); Fred Shentu (UC Berkeley); Pieter Abbeel (UC Berkeley); Peter Chen (COVARIANT.AI) |
7337 | Learning Regional Purity for Instance Segmentation on 3D Point Clouds | Shichao Dong (Nanyang Technological University)*; Guosheng Lin (Nanyang Technological University); Tzu-Yi HUNG (Delta Research Center) |
7346 | Learning from Unlabeled 3D Environments for Vision-and-Language Navigation | Shizhe Chen (INRIA)*; Pierre-Louis Guhur (Inria); Makarand Tapaswi (Wadhwani AI, IIIT Hyderabad); Cordelia Schmid (Inria/Google); Ivan Laptev (INRIA Paris) |
7350 | A Dataset Generation Framework for Evaluating Megapixel Image Classifiers & their Explanations | Gautam B Machiraju (Stanford University)*; Sylvia Plevritis (Stanford University); Parag Mallick (Stanford University) |
7351 | Sports Video Analysis on Large-Scale Data | Dekun Wu (University of Pittsburgh)*; He Zhao (York University); Xingce Bao (EPFL); Rick Wildes (York University) |
7368 | Audio-Visual Segmentation | Jinxing Zhou (Hefei University of Technology); Jianyuan Wang (Chinese University of Hong Kong); Jiayi Zhang (BeiHang University); Weixuan Sun (Australian National University); Jing Zhang (Australian National University); Stan Birchfield (NVIDIA); Dan Guo (Hefei University of Technology); Lingpeng Kong (The University of Hong Kong); Meng Wang (Hefei University of Technology); Yiran Zhong (Australian National University)* |
7374 | SLiDE: Self-supervised LiDAR De-snowing through Reconstruction Difficulty | Gwangtak Bae (Seoul National University)*; Byungjun Kim (Seoul National University); Seongyong Ahn (Agency for Defense Development); Jihong Min (Agency for Defense Development); Inwook Shim (Inha University) |
7375 | On the Angular Update and Hyperparameter Tuning of a Scale-Invariant Network | Juseung Yun (KAIST)*; Janghyeon Lee (LG AI Research); Hyounguk Shon (KAIST); Eojindl Yi (KAIST); Seung Hwan Kim (LG AI Research); Junmo Kim (KAIST) |
7384 | IGFormer: Interaction Graph Transformer for Skeleton-based Human Interaction Recognition | Yunsheng Pang (University of Melbourne)*; Qiuhong Ke (Monash University); Hossein Rahmani (Lancaster University); James Bailey (THE UNIVERSITY OF MELBOURNE); Jun Liu (Singapore University of Technology and Design) |
7385 | LANA: Latency Aware Network Acceleration | Pavlo Molchanov (NVIDIA)*; James B Hall (Microsoft Research); Hongxu Yin (NVIDIA); Nicolo Fusi (Microsoft Research); Jan Kautz (NVIDIA); Arash Vahdat (NVIDIA) |
7388 | A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch | Patsorn Sangkloy (Georgia Institute of Technology)*; Wittawat Jitkrittum (Google Research); Diyi Yang (Georgia Institute of Technology); James Hays (Georgia Institute of Technology, USA) |
7396 | HVC-Net: Unifying Homography, Visibility, and Confidence Learning for Planar Object Tracking | Haoxian Zhang (Tencent)*; Yonggen Ling (Tencent) |
7417 | 3D Random Occlusion and Multi-Layer Projection for Deep Multi-Camera Pedestrian Localization | Rui Qiu (Xi’an Jiaotong-Liverpool University, University of Liverpool); Ming Xu (Xi’an Jiaotong-Liverpool University)*; Yuyao Yan (Xi’an Jiaotong-Liverpool University); Jeremy S Smith (University of Liverpool); Xi Yang (Xi’an Jiaotong-Liverpool University) |
7427 | Masked Siamese Networks for Label-Efficient Learning | Mahmoud Assran (Facebook AI)*; Mathilde Caron (Facebook Artificial Intelligence Research); Ishan Misra (Facebook AI Research); Piotr Bojanowski (Facebook); Florian Bordes (MILA); Pascal Vincent (Facebook FAIR & MILA Université de Montréal); Armand Joulin (Facebook AI Research); Mike Rabbat (Facebook FAIR); Nicolas Ballas (Facebook FAIR) |
7441 | A Simple Single-Scale Vision Transformer for Object Detection and Instance Segmentation | Wuyang Chen (University of Texas at Austin)*; Xianzhi Du (Google Brain); Fan Yang (Google); Lucas Beyer (Google Brain); Xiaohua Zhai (Google Brain); Tsung-Yi Lin (Google Brain); Huizhong Chen (Google); Jing Li (Google Brain); Xiaodan Song (Google Brain); Zhangyang Wang (University of Texas at Austin); Denny Zhou (Google Brain) |
7443 | A Cloud 3D Dataset and Application-Specific Learned Image Compression in Cloud 3D | Tianyi Liu (The University of Texas at San Antonio)*; Sen He (The University of Texas at San Antonio); Vinodh Kumaran Jayakumar (UTSA); Wei Wang (The University of Texas at San Antonio) |
7449 | Cross-Domain Few-Shot Semantic Segmentation | Shuo Lei (Virginia Tech)*; Xuchao Zhang (NEC Labs America); Jianfeng He (Virginia Tech); Fanglan Chen (Virginia Tech); Bowen Du (Beihang University); Chang-Tien Lu (Virginia Tech, USA) |
7450 | VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments | Yu-Yun Tseng (University of Colorado Boulder)*; Alexander Bell (IVC Group); Danna Gurari (University of Colorado Boulder) |
7474 | Towards Metrical Reconstruction of Human Faces | Wojciech Zielonka (Max Planck Institute for Intelligent Systems); Timo Bolkart (Max Planck Institute for Intelligent Systems); Justus Thies (Max Planck Institute for Intelligent Systems)* |
7476 | DeepShadow: Neural Shape from Shadow | Asaf Karnieli (Reichman University)*; Yacov Hel-Or (The Interdisciplinary Center); Ohad Fried (IDC Herzliya) |
7500 | Class-Incremental Learning with Cross-Space Clustering and Controlled Transfer | Arjun Ashok (Indian Institute of Technology, Hyderabad)*; Joseph K J (Indian Institute of Technology, Hyderabad); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad) |
7509 | Object discovery and representation networks | Olivier Henaff (DeepMind)*; Skanda Koppula (DeepMind); Evan Shelhamer (DeepMind); Daniel Zoran (DeepMind); Andrew Jaegle (DeepMind); Andrew Zisserman (Oxford University); Joao Carreira (DeepMind); Relja Arandjelović (DeepMind) |
7511 | MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks | Benoit Guillard (EPFL)*; Federico Stella (EPFL); Pascal Fua (EPFL, Switzerland) |
7519 | Natural Synthetic Anomalies for Self-Supervised Anomaly Detection and Localization | Hannah M Schlueter (Imperial College London)*; Jeremy Tan (Imperial College London); Benjamin Hou (Imperial College London); Bernhard Kainz (Imperial College London, FAU Erlangen-Nürnberg) |
7522 | Shap-CAM: Visual Explanations for Convolutional Neural Networks based on Shapley Value | Quan Zheng (Tsinghua University); Ziwei Wang (Tsinghua University); Jie Zhou (Tsinghua University); Jiwen Lu (Tsinghua University)* |
7529 | Simple Open-Vocabulary Object Detection with Vision Transformers | Matthias Minderer (Google Research)*; Alexey Gritsenko (Google Brain); Austin C Stone (Google); Maxim Neumann (Google); Dirk Weißenborn (German Research Center for Artificial Intelligence); Alexey Dosovitskiy (Inceptive); Aravindh Mahendran (Google); Anurag Arnab (Google); Mostafa Dehghani (Google Brain); Zhuoran Shen (Pony.ai); Xiao Wang (Google); Xiaohua Zhai (Google Brain); Thomas Kipf (Google Brain); Neil Houlsby (Google) |
7533 | Video Restoration Framework and its Meta-adaptations to Data-poor Conditions | Prashant W Patil (Deakin University)*; Sunil Gupta (Deakin University, Australia); Santu Rana (Deakin University, Australia); Svetha Venkatesh (Deakin University) |
7539 | PRIME: A Few Primitives Can Boost Robustness to Common Corruptions | Apostolos Modas (EPFL)*; Rahul Shekhar Rade (EthonAI); Guillermo Ortiz-Jimenez (EPFL); Seyed-Mohsen Moosavi-Dezfooli (Imperial College London); Pascal Frossard (EPFL) |
7541 | AlphaVC: High-Performance and Efficient Learned Video Compression | Yibo Shi (Huawei); Yunying Ge (Huawei Technologies); Jing Wang (Huawei)*; Jue Mao (Huawei technologies) |
7542 | Content-Oriented Learned Image Compression | Meng Li (Huawei); Shangyin Gao (Huawei); Yihui Feng (HUAWEI Technology Co., Ltd); Yibo Shi (Huawei); Jing Wang (Huawei)* |
7543 | Generating Natural Images with Direct Patch Distributions Matching | Ariel Elnekave (Hebrew University of Jerusalem)*; Yair Weiss (Hebrew University) |
7545 | Latent Space Smoothing for Individually Fair Representations | Momchil Peychev (ETH Zurich)*; Anian Ruoss (DeepMind); Mislav Balunovic (ETH Zurich); Maximilian Baader (ETH Zürich); Martin Vechev (ETH Zurich) |
7555 | SAU: Smooth activation function using convolution with approximate identities | Koushik Biswas (Indraprastha Institute of Information Technology, New Delhi, India)*; Sandeep Kumar (Shaheed Bhagat Singh College, University of Delhi, Delhi); Shilpak Banerjee (Indian Institute of Technology Tirupati); Ashish Kumar Pandey (Indraprastha Institute of Information Technology, New Delhi, India) |
7561 | TRoVE: Transforming Road Scene Datasets into Photorealistic Virtual Environments | Shubham Dokania (IIIT Hyderabad)*; Anbumani Subramanian (IIIT-Hyderabad); Manmohan Chandraker (UC San Diego); C.V. Jawahar (IIIT-Hyderabad) |
7562 | Motion Sensitive Contrastive Learning for Self-supervised Video Representation | JingCheng Ni (Beihang University)*; Nan Zhou (Beihang University); Jie Qin (Nanjing University of Aeronautics and Astronautics); Qian Wu (Megvii); Junqi Liu (Megvii); Boxun Li (Megvii Inc.); Di Huang (Beihang University, China) |
7573 | Scaling Adversarial Training to Large Perturbation Bounds | Sravanti Addepalli (Indian Institute of Science)*; Samyak Jain (Indian Institute of Technology (BHU), Varanasi); Gaurang Sriramanan (University of Maryland, College Park); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
7592 | RDO-Q: Extremely Fine-Grained Channel-Wise Quantization via Rate-Distortion Optimization | Zhe Wang (Institute for Infocomm Research, Singapore); Jie Lin (Institute for Infocomm Research (I2R), Singapore); Xue Geng (I2R, ASTAR); Mohamed M. Sabry Aly (Nanyang Technological University); Vijay R. Chandrasekhar (Institute for Infocomm Research) |
7605 | Camera Auto-calibration from the Steiner Conic of the Fundamental Matrix | Yu LIU (United International College, BNU-HKBU)*; Hui Zhang (UIC) |
7626 | Understanding Collapse in Non-Contrastive Siamese Representation Learning | Alexander C Li (Carnegie Mellon University)*; Alexei A Efros (UC Berkeley); Deepak Pathak (Carnegie Mellon University) |
7634 | AutoTransition: Learning to Recommend Video Transition Effects | Yaojie Shen (Institute of Software, Chinese Academy of Sciences); Libo Zhang (Institute of Software Chinese Academy of Sciences); Kai Xu (ByteDance Inc); Xiaojie Jin (Bytedance Inc. USA)* |
7651 | SPE-Net: Boosting Point Cloud Analysis via Rotation Robustness Enhancement | Zhaofan Qiu (JD.com); Yehao Li (JD AI Research); Yu Wang (JD AI Research); Yingwei Pan (JD AI Research); Ting Yao (JD AI Research)*; Tao Mei (AI Research of JD.com) |
7667 | Text-based Temporal Localization of Novel Events | Sudipta Paul (University of California, Riverside)*; Niluthpol C Mithun (SRI International); Amit K. Roy-Chowdhury (University of California, Riverside) |
7687 | Effective Presentation Attack Detection Driven by Face Related Task | Wentian Zhang (Shenzhen University); Haozhe Liu (King Abdullah University of Science and Technology); Feng Liu (Shenzhen University)*; Raghavendra Ramachandra (NTNU, Norway); Christoph Busch (Norwegian University of Science and Technology) |
7691 | LWGNet – Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval | Atreyee Saha (Indian Institute of Technology Madras)*; Salman Siddique Khan (IIT Madras); Sagar Sehrawat (IIT Madras); Sanjana S Prabhu (Indian Institute of Technology Madras); Shanti Bhattacharya (IIT Madras); Kaushik Mitra (IIT Madras) |
7693 | Federated Self-supervised Learning for Video Understanding | Yasar Rehman (TCL Corporate Research (Hong Kong) Co. Ltd); Yan Gao (University of Cambridge)*; Jiajun Shen (TCL Research); Pedro Gusmao (University of Cambridge); Nicholas Lane (University of Cambridge and Samsung AI) |
7694 | Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval | Zhaopeng Dou (Tsinghua University)*; Zhongdao Wang (Tsinghua University); Weihua Chen (Alibaba Group); Ya-Li Li (Tsinghua University); Shengjin Wang (Tsinghua University) |
7704 | The Shape Part Slot Machine: Contact-based Reasoning for Generating 3D Shapes from Parts | Kai Wang (Brown University)*; Paul Guerrero (Adobe); Vladimir Kim (Adobe); Siddhartha Chaudhuri (Adobe Research); Minhyuk Sung (KAIST); Daniel Ritchie (Brown University) |
7710 | Attention Diversification for Domain Generalization | Rang Meng (Hikvision Research Institute)*; Xianfeng Li (Hikvision Research Institute); Weijie Chen (Zhejiang University); Shicai Yang (Hikvision Research Institute); Jie Song (Zhejiang University); Xinchao Wang (National University of Singapore); Lei Zhang (Chongqing University); Mingli Song (Zhejiang University); Di Xie (Hikvision Research Institute); Shiliang Pu (Hikvision Research Institute) |
7718 | Exploiting the local parabolic landscapes of adversarial losses to accelerate black-box adversarial attack | Hoang Tran (Oak Ridge National Laboratory); Dan Lu (Oak Ridge National Laboratory); Guannan Zhang (Oak Ridge National Laboratory)* |
7719 | Towards Efficient and Effective Self-Supervised Learning of Visual Representations | Sravanti Addepalli (Indian Institute of Science)*; Kaushal Bhogale (Indian Institute of Technology, Madras); Priyam Dey (Indian Institute of Science); Venkatesh Babu RADHAKRISHNAN (Indian Institute of Science) |
7722 | TransVLAD: Focusing on Locally Aggregated Descriptors for Few-Shot Learning | Haoquan Li (Southern University of Science and Technology)*; Laoming Zhang (Southern University of Science and Technology); Daoan Zhang (Southern University of Science and Technology); Lang Fu (Southern University of Science and Technology); Peng Yang (Southern University of Science and Technology); Jianguo Zhang (Southern University of Science and Technology) |
7735 | Rotation Regularization Without Rotation | Takumi Kobayashi (National Institute of Advanced Industrial Science and Technology)* |
7741 | Parameterized Temperature Scaling for Boosting the Expressive Power in Post-Hoc Uncertainty Calibration | Christian Tomani (TUM)*; Daniel Cremers (TU Munich); Florian Buettner (German Cancer Research Center and Frankfurt University) |
7746 | FairStyle: Debiasing StyleGAN2 with Style Channel Manipulations | Cemre Efe Karakas (Bogazici University); Alara Dirik (Bogazici University); Eylül Yalçınkaya (Bogazici University); Pinar Yanardag (Bogazici University)* |
7756 | Dynamic Temporal Filtering in Video Models | Fuchen Long (JD.com); Zhaofan Qiu (JD.com); Yingwei Pan (JD AI Research)*; Ting Yao (JD AI Research); Chong-Wah Ngo (Singapore Management University); Tao Mei (AI Research of JD.com) |
7764 | DH-AUG: DH Forward Kinematics Model Driven Augmentation for 3D Human Pose Estimation | Linzhi Huang (Beijing University of Posts and Telecommunications)*; Jiahao Liang (Beijing University of Posts and Telecommunications); Weihong Deng (Beijing University of Posts and Telecommunications) |
7765 | Super-resolution 3D Human Shape from a Single Low-Resolution Image | Marco Pesavento (University of Surrey)*; Marco Volino (University of Surrey); Adrian Hilton (University of Surrey) |
7771 | Trading Positional Complexity vs Deepness in Coordinate Networks | Jianqiao Zheng (University of Adelaide)*; Sameera Ramasinghe (University of Adelaide); Xueqian Li (Carnegie Mellon University); Simon Lucey (University of Adelaide) |
7785 | ESS: Learning Event-based Semantic Segmentation from Still Images | Zhaoning Sun (ETH Zürich); Nico Messikommer (University of Zurich & ETH Zurich)*; Daniel Gehrig (University of Zurich & ETH Zurich); Davide Scaramuzza (University of Zurich & ETH Zurich, Switzerland) |
7802 | U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search | Ahmet Yüzügüler (EPFL)*; Nikolaos Dimitriadis (EPFL); Pascal Frossard (EPFL) |
7803 | MonteBoxFinder: Detecting and Filtering Primitives to Fit a Noisy Point Cloud | Michaël Ramamonjisoa (Ecole des Ponts)*; Sinisa Stekovic (Graz University of Technology); Vincent Lepetit (Ecole des Ponts ParisTech) |
7815 | Trapped in texture bias? A large scale comparison of deep instance segmentation | Johannes Theodoridis (Hochschule der Medien Stuttgart)*; Jessica Hofmann (Hochschule der Medien); Johannes Maucher (Media University Stuttgart); Andreas G Schilling (University of Tübingen) |
7845 | MVDG: A Unified Multi-view Framework for Domain Generalization | Jian Zhang (Nanjing University)*; Lei Qi (Southeast University); Yinghuan Shi (Nanjing University); Yang Gao (Nanjing University) |
7847 | MINER: Multiscale Implicit Neural Representation | Vishwanath Saragadam (Rice University)*; Jasper T Tan (Rice University); Guha Balakrishnan (Rice University); Richard Baraniuk (Rice University); Ashok Veeraraghavan (Rice University) |
7856 | PTQ4ViT: Post-Training Quantization for Vision Transformers with Twin Uniform Quantization | Zhihang Yuan (Peking University)*; Chenhao Xue (Peking University); Yiqi Chen (Peking University); Qiang Wu (HOUMO.AI); Guangyu Sun (Peking University) |
7865 | Context-Consistent Semantic Image Editing with Style-Preserved Modulation | Wuyang Luo (School of Computer Science, Fudan University); Su Yang (School of Computer Science, Fudan University)*; Hong Wang (School of Computer Science, Fudan University); Bo Long (School of Computer Science, Fudan University); Weishan Zhang (Department of Software Engineering, China University of Petroleum) |
7874 | Distilling the Undistillable: Learning from a Nasty Teacher | Surgan Jandial (MDSR Labs, Adobe)*; Yash Khasbage (Indian Institute of Technology, Hyderabad); Arghya Pal (Harvard University); Vineeth N Balasubramanian (Indian Institute of Technology, Hyderabad); Balaji Krishnamurthy |
7879 | Grounding Visual Representations with Texts for Domain Generalization | Seonwoo Min (LG AI Research)*; Nokyung Park (Korea University); Siwon Kim (Seoul National University); Seunghyun Park (Clova AI Research, NAVER Corp.); Jinkyu Kim (Korea University) |
7883 | Towards Accurate Open-Set Recognition via Background-Class Regularization | Wonwoo Cho (Korea Advanced Institute of Science and Technology)*; Jaegul Choo (Korea Advanced Institute of Science and Technology) |
7899 | In Defense of Image Pre-Training for Spatiotemporal Recognition | Xianhang Li (University of California, Santa Cruz)*; Huiyu Wang (JHU); Chen Wei (Johns Hopkins University); Jieru Mei (Johns Hopkins University); Alan Yuille (Johns Hopkins University); Yuyin Zhou (UC Santa Cruz); Cihang Xie (University of California, Santa Cruz) |
7925 | SocialVAE: Human Trajectory Prediction using Timewise Latents | Pei Xu (Clemson University)*; Jean-Bernard Hayet (CIMAT); Ioannis Karamouzas (Clemson University) |
7926 | BodySLAM: Joint Camera Localisation, Mapping, and Human Motion Tracking | Dorian F Henning (Imperial College London)*; Tristan Laidlow (Imperial College London); Stefan Leutenegger (TU Munich) |
7935 | Eliminating Gradient Conflict in Reference-based Line-Art Colorization | Zekun Li (University of Electronic Science and Technology of China)*; Zhengyang Geng (Peking University); Zhao Kang (University of Electronic Science and Technology of China); Wenyu Chen (University of Electronic Science and Technology of China); Yibo Yang (Peking University) |
7950 | Transfer without Forgetting | Matteo Boschini (University of Modena and Reggio Emilia)*; Lorenzo Bonicelli (University of Modena and Reggio Emilia); Angelo Porrello (University of Modena and Reggio Emilia); Giovanni Bellitto (University of Catania); Matteo Pennisi (University of Catania); Simone Palazzo (University of Catania); Concetto Spampinato (University of Catania); Simone Calderara (University of Modena and Reggio Emilia, Italy) |
7955 | DSR — A dual subspace re-projection network for surface anomaly detection | Vitjan Zavrtanik (University of Ljubljana)*; Matej Kristan (University of Ljubljana); Danijel Skocaj (University of Ljubljana) |
7964 | Multi-Exit Semantic Segmentation Networks | Alexandros Kouris (Imperial College London and Samsung AI)*; Stylianos Venieris (Samsung AI); Stefanos Laskaridis (Samsung AI); Nicholas Lane (University of Cambridge and Samsung AI) |
7968 | Almost-Orthogonal Layers for Efficient General-Purpose Lipschitz Networks | Bernd Prach (IST Austria)*; Christoph H Lampert (IST Austria) |
8001 | Bridging the visual semantic gap in VLN via semantically richer instructions | Joaquín Ignacio Ossandón (Universidad Católica de Chile)*; Benjamín Earle (Universidad Católica de Chile); Alvaro Soto (Universidad Católica de Chile) |
8003 | Kernel Relative-prototype Spectral Filtering for Few-shot Learning | Tao Zhang (Chengdu Techman Software Co., Ltd.)*; Wu Huang (Sichuan University) |
8009 | StoryDALL-E: Adapting Pretrained Text-to-image Transformers for Story Continuation | Adyasha Maharana (UNC Chapel Hill)*; Darryl Hannan (University of North Carolina at Chapel Hill); Mohit Bansal (University of North Carolina at Chapel Hill) |
8026 | Unsupervised Learning of Efficient Geometry-Aware Neural Articulated Representations | Atsuhiro Noguchi (The University of Tokyo)*; Xiao Sun (Microsoft Research Asia); Stephen Lin (Microsoft Research); Tatsuya Harada (The University of Tokyo / RIKEN) |
8029 | PANDORA: Polarization-Aided Neural Decomposition Of Radiance | Akshat Dave (Rice University)*; Yongyi Zhao (Rice University); Ashok Veeraraghavan (Rice University) |
8042 | OCR-free Document Understanding Transformer | Geewook Kim (NAVER Corporation)*; Teakgyu Hong (Upstage AI); Moonbin Yim (Clova AI Research, NAVER Corp.); Jeongyeon Nam (Naver); Jinyoung Park (TmaxAI); Jinyeong Yim (Google); Wonseok Hwang (LBox); Sangdoo Yun (NAVER AI LAB); Dongyoon Han (NAVER AI Lab); Seunghyun Park (Clova AI Research, NAVER Corp.) |
8048 | VQGAN-CLIP: Open Domain Image Generation and Manipulation Using Natural Language | Katherine B Crowson (EleutherAI); Stella R Biderman (Booz Allen Hamilton)*; Daniel Kornis (EleutherAI); Dashiell Stander (EleutherAI); Eric Hallahan (EleutherAI); Louis J Castricato (Georgia Tech); Edward Raff (Booz Allen Hamilton) |
8063 | Learning to use unlabeled data in data augmentation for 3D detection | Zhaoqi Leng (Waymo)*; Shuyang Cheng (Waymo LLC); Ben Caine (Google); Weiyue Wang (Waymo); Xiao Zhang (Cruise); Jonathon Shlens (Google); Mingxing Tan (Waymo); Dragomir Anguelov (Waymo) |
8070 | Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images | Kevin Thandiackal (ETH Zurich / IBM Research)*; Boqi Chen (ETH Zurich); Pushpak Pati (IBM Research Zurich); Guillaume Jaume (Harvard); Drew Williamson (Pathology, Brigham and Women’s Hospital, Harvard Medical School); Maria Gabrani (IBM Research); Orcun Goksel (ETH Zurich) |
8081 | Towards Learning Neural Representations from Shadows | Kushagra Tiwary (MIT)*; Tzofi M Klinghoffer (Massachusetts Institute of Technology); Ramesh Raskar (Massachusetts Institute of Technology) |
8086 | Augmenting Deep Classifiers with Polynomial Neural Networks | Grigorios Chrysos (EPFL)*; Markos Georgopoulos (Imperial College London); Jiankang Deng (Imperial College London); Jean Kossaifi (NVIDIA); Yannis Panagakis (University of Athens); Animashree Anandkumar (Caltech) |
8092 | AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation | Farshid Varno (Dalhousie/Imagia)*; Marzie Saghayi (Dalhousie University); Laya Rafiee Sevyeri (Concordia); Sharut Gupta (MILA, Imagia, Indian Institute of Technology Delhi (IIT Delhi)); Stan Matwin (Dalhousie University); Mohammad Havaei (Imagia) |
8094 | A Simple Approach and Benchmark for 21,000-Category Object Detection | Yutong Lin (Xi’an Jiaotong University); Chen Li (Xi’an Jiaotong University); Yue Cao (Microsoft Research); Zheng Zhang (MSRA); Jianfeng Wang (Microsoft); Lijuan Wang (Microsoft); Zicheng Liu (Microsoft); Han Hu (Microsoft Research Asia)* |
8106 | Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach | Jiseok Youn (Seoul National University)*; Jaehun Song (Seoul National University); Hyung-Sin Kim (Seoul National University); Saewoong Bahk (Seoul National University) |
8140 | Learning with Noisy Labels by Efficient Transition Matrix Estimation to Combat Label Miscorrection | Seong Min Kye (KAIST); Kwanghee Choi (Sogang University); Joonyoung Yi (Hyperconnect); Buru Chang (Hyperconnect)* |
8170 | Online Task-free Continual Learning with Dynamic Sparse Distributed Memory | Julien Pourcel (ENSEA)*; Ngoc-Son Vu (ETIS/Université Paris Seine, Université Cergy-Pontoise, ENSEA, CNRS/ 95000-Cergy); Robert M French (CNRS) |