CVPR 2021 is the premier annual computer vision event comprising the main conference and several co-located workshops and short courses. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers.

The CVPR 2021 papers have not all been released yet; this list will be updated as they become available. In the meantime, let's review papers from 2020 and 2019.
CVPR 2021

Continuously updated GitHub repository

https://github.com/Sophia-11/Awesome-CVPR-Paper

Object Detection

Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection Paper address: https://arxiv.org/abs/1912.02424
Code: https://github.com/sfzhang15/ATSS
Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector Paper address: https://arxiv.org/abs/1908.01998

Image segmentation

Semi-Supervised Semantic Image Segmentation with Self-correcting Networks Paper address: https://arxiv.org/abs/1811.07073
Deep Snake for Real-Time Instance Segmentation Paper address: https://arxiv.org/abs/2001.01629
CenterMask: Real-Time Anchor-Free Instance Segmentation Paper address: https://arxiv.org/abs/1911.06667 Code: https://github.com/youngwanLEE/CenterMask
SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks Paper address: https://arxiv.org/abs/2003.00678
PolarMask: Single Shot Instance Segmentation with Polar Representation Paper address: https://arxiv.org/abs/1909.13226 Code: https://github.com/xieenze/PolarMask
xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation Paper address: https://arxiv.org/abs/1911.12676
BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation Paper address: https://arxiv.org/abs/2001.00309

Face recognition

Towards Universal Representation Learning for Deep Face Recognition Paper address: https://arxiv.org/abs/2002.11841
Suppressing Uncertainties for Large-Scale Facial Expression Recognition Paper address: https://arxiv.org/abs/2002.10392 Code: https://github.com/kaiwang960112/Self-Cure-Network
Face X-ray for More General Face Forgery Detection Paper address: https://arxiv.org/pdf/1912.13458.pdf

Object Tracking

ROAM: Recurrently Optimizing Tracking Model Paper address: https://arxiv.org/abs/1907.12006

3D point cloud & reconstruction

PF-Net: Point Fractal Network for 3D Point Cloud Completion Paper address: https://arxiv.org/abs/2003.00410
PointAugment: an Auto-Augmentation Framework for Point Cloud Classification Paper address: https://arxiv.org/abs/2002.10876 Code: https://github.com/liruihui/PointAugment/
Learning multiview 3D point cloud registration Paper address: https://arxiv.org/abs/2001.05119

C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds Paper address: https://arxiv.org/abs/1912.07009
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds Paper address: https://arxiv.org/abs/1911.11236
Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image Paper address: https://arxiv.org/abs/2002.12212
Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion Paper address: https://arxiv.org/abs/2003.01456
In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks Paper address: https://arxiv.org/pdf/1911.11924.pdf

Pose estimation

VIBE: Video Inference for Human Body Pose and Shape Estimation Paper address: https://arxiv.org/abs/1912.05656
Code: https://github.com/mkocabas/VIBE
Distribution-Aware Coordinate Representation for Human Pose Estimation Paper address: https://arxiv.org/abs/1910.06278
Code: https://github.com/ilovepose/DarkPose
4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras Paper address: https://arxiv.org/abs/2002.12625
Optimal least-squares solution to the hand-eye calibration problem Paper address: https://arxiv.org/abs/2002.10838
D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry Paper address: https://arxiv.org/abs/2003.01060
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition Paper address: https://arxiv.org/abs/2001.09691
The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation Paper address: https://arxiv.org/abs/1911.07524
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation Paper address: https://arxiv.org/abs/1911.04231

GAN

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models Paper address: https://arxiv.org/abs/1911.12287 Code: https://github.com/giannisdaras/ylg
MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis Paper address: https://arxiv.org/abs/1903.06048
Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory Paper address: https://arxiv.org/abs/1911.04636

Few-shot & zero-shot

Improved Few-Shot Visual Classification Paper address: https://arxiv.org/pdf/1912.03432.pdf
Meta-Transfer Learning for Zero-Shot Super-Resolution Paper address: https://arxiv.org/abs/2002.12213

Weakly supervised & unsupervised

Rethinking the Route Towards Weakly Supervised Object Localization Paper address: https://arxiv.org/abs/2002.11359
NestedVAE: Isolating Common Factors via Weak Supervision Paper address: https://arxiv.org/abs/2002.11576
Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation Paper address: https://arxiv.org/abs/1911.07450
Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction Paper address: https://arxiv.org/abs/2003.01460

Neural Networks

Visual Commonsense R-CNN Paper address: https://arxiv.org/abs/2002.12204
GhostNet: More Features from Cheap Operations Paper address: https://arxiv.org/abs/1911.11907 Code: https://github.com/iamhankai/ghostnet
Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions Paper address: https://arxiv.org/abs/2003.01826

Model acceleration

GPU-Accelerated Mobile Multi-view Style Transfer Paper address: https://arxiv.org/abs/2003.00706

Visual common sense

What it Thinks is Important is Important: Robustness Transfers through Input Gradients Paper address: https://arxiv.org/abs/1912.05699
Attentive Context Normalization for Robust Permutation-Equivariant Learning Paper address: https://arxiv.org/abs/1907.02545
Bundle Adjustment on a Graph Processor Paper address: https://arxiv.org/abs/2003.03134 Code: https://github.com/joeaortiz/gbp
Transferring Dense Pose to Proximal Animal Classes Paper address: https://arxiv.org/abs/2003.00080
Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs Paper address: https://arxiv.org/abs/2003.00287
Learning in the Frequency Domain Paper address: https://arxiv.org/abs/2002.12416
Filter Grafting for Deep Neural Networks Paper address: https://arxiv.org/pdf/2001.05868.pdf
ClusterFit: Improving Generalization of Visual Representations Paper address: https://arxiv.org/abs/1912.03330
Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction Paper address: https://arxiv.org/abs/2002.11927
Auto-Encoding Twin-Bottleneck Hashing Paper address: https://arxiv.org/abs/2002.11930
Learning Representations by Predicting Bags of Visual Words Paper address: https://arxiv.org/abs/2002.12247
Holistically-Attracted Wireframe Parsing Paper address: https://arxiv.org/abs/2003.01663
A General and Adaptive Robust Loss Function Paper address: https://arxiv.org/abs/1701.03077
A Characteristic Function Approach to Deep Implicit Generative Modeling Paper address: https://arxiv.org/abs/1909.07425
AdderNet: Do We Really Need Multiplications in Deep Learning? Paper address: https://arxiv.org/pdf/1912.13200
12-in-1: Multi-Task Vision and Language Representation Learning Paper address: https://arxiv.org/abs/1912.02315
Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks Paper address: https://arxiv.org/abs/1912.09393
CARS: Continuous Evolution for Efficient Neural Architecture Search Paper address: https://arxiv.org/pdf/1909.04977.pdf Code: https://github.com/huawei-noah/CARS
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training Paper address: https://arxiv.org/abs/2002.10638 Code: https://github.com/weituo12321/PREVALENT

GhostNet: More Features from Cheap Operations (an architecture that surpasses MobileNetV3) Paper link: https://arxiv.org/pdf/1911.11907 Model (performs very well on ARM CPUs): https://github.com/iamhankai/ghostnet
It outperforms other state-of-the-art lightweight CNNs such as MobileNetV3 and FBNet.

AdderNet: Do We Really Need Multiplications in Deep Learning? (additive neural network) Achieves very good performance on large-scale neural networks and datasets. Paper link: https://arxiv.org/pdf/1912.13200
Frequency Domain Compact 3D Convolutional Neural Networks (3D CNN compression)
A Semi-Supervised Assessor of Neural Architectures (NAS)
Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection (NAS for detection) Searches the backbone, neck, and head jointly, hence "trinity".
CARS: Continuous Evolution for Efficient Neural Architecture Search (continuously evolving NAS) Efficient, combines the advantages of differentiable and evolutionary search, and can output a Pareto front. Paper link: https://arxiv.org/pdf/1909.04977 Open source code: https://github.com/huawei-noah/CARS
On Positive-Unlabeled Classification in GAN (PU+GAN)
Learning multiview 3D point cloud registration (3D point cloud) Paper link: https://arxiv.org/abs/2001.05119
Multi-Modal Domain Adaptation for Fine-Grained Action Recognition (fine-grained action recognition) Paper link: https://arxiv.org/abs/2001.09691
Action Modifiers: Learning from Adverbs in Instructional Video Paper link: https://arxiv.org/abs/1912.06617
PolarMask: Single Shot Instance Segmentation with Polar Representation (instance segmentation modeling) Paper link: https://arxiv.org/abs/1909.13226 Paper interpretation: https://zhuanlan.zhihu.com/p/84890413 Open source code: https://github.com/xieenze/PolarMask
Rethinking Performance Estimation in Neural Architecture Search (NAS) Because performance estimation is the truly time-consuming part of block-wise neural architecture search, this paper looks for the optimal estimation parameters for block-wise NAS, which makes the search faster and the estimates more relevant.
Distribution Aware Coordinate Representation for Human Pose Estimation (human body pose estimation) Paper link: https://arxiv.org/abs/1910.06278 GitHub: https://github.com/ilovepose/DarkPose Author team homepage: https://ilovepose.github.io/coco/

OCR

ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network Paper address: https://arxiv.org/abs/2002.10200 Code: https://github.com/Yuliang-Liu/bezier_curve_text_spotting, https://github.com/aim-uofa/adet

Image classification

Self-training with Noisy Student improves ImageNet classification Paper address: https://arxiv.org/abs/1911.04252
Image Matching across Wide Baselines: From Paper to Practice Paper address: https://arxiv.org/abs/2003.01587
Towards Robust Image Classification Using Sequential Attention Models Paper address: https://arxiv.org/abs/1912.02184

Video analysis

Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications Paper address: https://arxiv.org/abs/2003.01455
Code: https://github.com/bbrattoli/ZeroShotVideoClassification
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs Paper address: https://arxiv.org/abs/2003.00387
Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning Paper address: https://arxiv.org/abs/2003.00392
Object Relational Graph with Teacher-Recommended Learning for Video Captioning Paper address: https://arxiv.org/abs/2002.11566
Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution Paper address: https://arxiv.org/abs/2002.11616
Blurry Video Frame Interpolation Paper address: https://arxiv.org/abs/2002.12259
Hierarchical Conditional Relation Networks for Video Question Answering Paper address: https://arxiv.org/abs/2002.10698
Action Modifiers: Learning from Adverbs in Instructional Video Paper address: https://arxiv.org/abs/1912.06617

Image Processing

Learning to Shade Hand-drawn Sketches Paper address: https://arxiv.org/abs/2002.11812
Single Image Reflection Removal through Cascaded Refinement Paper address: https://arxiv.org/abs/1911.06634
Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data Paper address: https://arxiv.org/abs/2002.11297