CVPR 2021

CVPR 2021 is the premier annual computer vision event, comprising the main conference and several co-located workshops and short courses. With its high quality and low cost, it provides exceptional value for students, academics, and industry researchers.


The CVPR 2021 paper list has not yet been fully released; this post will be updated as soon as it is. In the meantime, let's review papers from 2020 and 2019.

CVPR 2021 


Continuously updated on GitHub:

https://github.com/Sophia-11/Awesome-CVPR-Paper 


Object Detection

  1. Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection Paper address: https://arxiv.org/abs/1912.02424
    Code: https://github.com/sfzhang15/ATSS

  2. Few-Shot Object Detection with Attention-RPN and Multi-Relation Detector Paper address: https://arxiv.org/abs/1908.01998

 

Image segmentation

  1. Semi-Supervised Semantic Image Segmentation with Self-correcting Networks Paper address: https://arxiv.org/abs/1811.07073

  2. Deep Snake for Real-Time Instance Segmentation Paper address: https://arxiv.org/abs/2001.01629

  3. CenterMask: Real-Time Anchor-Free Instance Segmentation Paper address: https://arxiv.org/abs/1911.06667  Code: https://github.com/youngwanLEE/CenterMask

  4. SketchGCN: Semantic Sketch Segmentation with Graph Convolutional Networks Paper address: https://arxiv.org/abs/2003.00678

  5. PolarMask: Single Shot Instance Segmentation with Polar Representation Paper address: https://arxiv.org/abs/1909.13226  Code: https://github.com/xieenze/PolarMask

  6. xMUDA: Cross-Modal Unsupervised Domain Adaptation for 3D Semantic Segmentation Paper address: https://arxiv.org/abs/1911.12676

  7. BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation Paper address: https://arxiv.org/abs/2001.00309

 

Face recognition

  1. Towards Universal Representation Learning for Deep Face Recognition paper address: https://arxiv.org/abs/2002.11841

  2. Suppressing Uncertainties for Large-Scale Facial Expression Recognition
    Paper address: https://arxiv.org/abs/2002.10392  Code: https://github.com/kaiwang960112/Self-Cure-Network

  3. Face X-ray for More General Face Forgery Detection Paper address: https://arxiv.org/pdf/1912.13458.pdf

 

Object Tracking

1. ROAM: Recurrently Optimizing Tracking Model Paper address: https://arxiv.org/abs/1907.12006

 

3D point cloud & reconstruction

  1. PF-Net: Point Fractal Network for 3D Point Cloud Completion Paper address: https://arxiv.org/abs/2003.00410

  2. PointAugment: an Auto-Augmentation Framework for Point Cloud Classification Paper address: https://arxiv.org/abs/2002.10876  Code: https://github.com/liruihui/PointAugment/

  3. Learning multiview 3D point cloud registration Paper address: https://arxiv.org/abs/2001.05119

  4. C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds Paper address: https://arxiv.org/abs/1912.07009

  5. RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds Paper address: https://arxiv.org/abs/1911.11236

  6. Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image Paper address: https://arxiv.org/abs/2002.12212

  7. Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion Paper address: https://arxiv.org/abs/2003.01456

  8. In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction from 2D Landmarks Paper address: https://arxiv.org/pdf/1911.11924.pdf

 

Pose Estimation

  1. VIBE: Video Inference for Human Body Pose and Shape Estimation Paper address: https://arxiv.org/abs/1912.05656
    Code: https://github.com/mkocabas/VIBE

  2. Distribution-Aware Coordinate Representation for Human Pose Estimation Paper address: https://arxiv.org/abs/1910.06278
    Code: https://github.com/ilovepose/DarkPose

  3. 4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras Paper address: https://arxiv.org/abs/2002.12625

  4. Optimal least-squares solution to the hand-eye calibration problem Paper address: https://arxiv.org/abs/2002.10838

  5. D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry Paper address: https://arxiv.org/abs/2003.01060

  6. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition Paper address: https://arxiv.org/abs/2001.09691

  7. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation Paper address: https://arxiv.org/abs/1911.07524

  8. PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation Paper address: https://arxiv.org/abs/1911.04231

 

GAN

  1. Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models Paper address: https://arxiv.org/abs/1911.12287  Code: https://github.com/giannisdaras/ylg

  2. MSG-GAN: Multi-Scale Gradient GAN for Stable Image Synthesis Paper address: https://arxiv.org/abs/1903.06048

  3. Robust Design of Deep Neural Networks against Adversarial Attacks based on Lyapunov Theory Paper address: https://arxiv.org/abs/1911.04636

 

Few-Shot & Zero-Shot Learning

  1. Improved Few-Shot Visual Classification Paper address: https://arxiv.org/pdf/1912.03432.pdf

  2. Meta-Transfer Learning for Zero-Shot Super-Resolution Paper address: https://arxiv.org/abs/2002.12213

 

Weakly Supervised & Unsupervised Learning

  1. Rethinking the Route Towards Weakly Supervised Object Localization Paper address: https://arxiv.org/abs/2002.11359
  2. NestedVAE: Isolating Common Factors via Weak Supervision Paper address: https://arxiv.org/abs/2002.11576

  3. Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation Paper address: https://arxiv.org/abs/1911.07450

  4. Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction Paper address: https://arxiv.org/abs/2003.01460

 

Neural Networks

  1. Visual Commonsense R-CNN paper address: https://arxiv.org/abs/2002.12204

  2. GhostNet: More Features from Cheap Operations Paper address: https://arxiv.org/abs/1911.11907  Code: https://github.com/iamhankai/ghostnet

  3. Watch your Up-Convolution: CNN Based Generative Deep Neural Networks are Failing to Reproduce Spectral Distributions Paper address: https://arxiv.org/abs/2003.01826

 

Model acceleration

  1. GPU-Accelerated Mobile Multi-view Style Transfer paper address: https://arxiv.org/abs/2003.00706

 

Visual common sense

  1. What it Thinks is Important is Important: Robustness Transfers through Input Gradients Paper address: https://arxiv.org/abs/1912.05699

  2. Attentive Context Normalization for Robust Permutation-Equivariant Learning Paper address: https://arxiv.org/abs/1907.02545

  3. Bundle Adjustment on a Graph Processor Paper address: https://arxiv.org/abs/2003.03134  Code: https://github.com/joeaortiz/gbp

  4. Transferring Dense Pose to Proximal Animal Classes Paper address: https://arxiv.org/abs/2003.00080

  5. Representations, Metrics and Statistics For Shape Analysis of Elastic Graphs Paper address: https://arxiv.org/abs/2003.00287

  6. Learning in the Frequency Domain Paper address: https://arxiv.org/abs/2002.12416

  7. Filter Grafting for Deep Neural Networks Paper address: https://arxiv.org/pdf/2001.05868.pdf

  8. ClusterFit: Improving Generalization of Visual Representations Paper address: https://arxiv.org/abs/1912.03330

  9. Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction Paper address: https://arxiv.org/abs/2002.11927

  10. Auto-Encoding Twin-Bottleneck Hashing Paper address: https://arxiv.org/abs/2002.11930

  11. Learning Representations by Predicting Bags of Visual Words Paper address: https://arxiv.org/abs/2002.12247

  12. Holistically-Attracted Wireframe Parsing Paper address: https://arxiv.org/abs/2003.01663

  13. A General and Adaptive Robust Loss Function Paper address: https://arxiv.org/abs/1701.03077

  14. A Characteristic Function Approach to Deep Implicit Generative Modeling Paper address: https://arxiv.org/abs/1909.07425

  15. AdderNet: Do We Really Need Multiplications in Deep Learning? Paper address: https://arxiv.org/pdf/1912.13200

  16. 12-in-1: Multi-Task Vision and Language Representation Learning Paper address: https://arxiv.org/abs/1912.02315

  17. Making Better Mistakes: Leveraging Class Hierarchies with Deep Networks Paper address: https://arxiv.org/abs/1912.09393

  18. CARS: Continuous Evolution for Efficient Neural Architecture Search Paper address: https://arxiv.org/pdf/1909.04977.pdf  Code: https://github.com/huawei-noah/CARS

  19. Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training Paper address: https://arxiv.org/abs/2002.10638  Code: https://github.com/weituo12321/PREVALENT

  1. GhostNet: More Features from Cheap Operations (surpasses the MobileNetV3 architecture) Paper link: https://arxiv.org/pdf/1911.11907  Model (impressive performance on ARM CPUs): https://github.com/iamhankai/ghostnet

     The authors report beating other SOTA lightweight CNNs such as MobileNetV3 and FBNet. (A minimal sketch of the Ghost-module idea appears after this list.)

  2. AdderNet: Do We Really Need Multiplications in Deep Learning? (additive neural networks) Achieves very good performance on large-scale networks and datasets. Paper link: https://arxiv.org/pdf/1912.13200 (A minimal sketch of the adder-layer idea appears after this list.)

  3. Frequency Domain Compact 3D Convolutional Neural Networks (3D CNN compression) Paper link: https://arxiv.org/pdf/1909.04977  Open-source code: https://github.com/huawei-noah/CARS

  4. A Semi-Supervised Assessor of Neural Architectures (NAS)

  5. Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection (NAS for detection) Searches the backbone, neck and head together, hence "trinity".

  6. CARS: Continuous Evolution for Efficient Neural Architecture Search (continuously evolved NAS) Efficient, combines the advantages of differentiable and evolutionary search, and can output a Pareto front of architectures. (See the Pareto-front selection sketch after this list.)

  7. On Positive-Unlabeled Classification in GAN (PU learning + GAN)

  8. Learning multiview 3D point cloud registration (3D point clouds) Paper link: https://arxiv.org/abs/2001.05119

  9. Multi-Modal Domain Adaptation for Fine-Grained Action Recognition (fine-grained action recognition) Paper link: https://arxiv.org/abs/2001.09691

  10. Action Modifiers: Learning from Adverbs in Instructional Video Paper link: https://arxiv.org/abs/1912.06617

  11. PolarMask: Single Shot Instance Segmentation with Polar Representation (instance segmentation modeling) Paper link: https://arxiv.org/abs/1909.13226  Paper interpretation: https://zhuanlan.zhihu.com/p/84890413  Open-source code: https://github.com/xieenze/PolarMask

  12. Rethinking Performance Estimation in Neural Architecture Search (NAS) Since the truly time-consuming part of block-wise NAS is performance estimation, this paper finds the optimal settings for that estimation step, making the search faster and more reliable.

  13. Distribution Aware Coordinate Representation for Human Pose Estimation (human pose estimation) Paper link: https://arxiv.org/abs/1910.06278  GitHub: https://github.com/ilovepose/DarkPose  Author team homepage: https://ilovepose.github.io/coco/
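For readers curious how GhostNet gets "more features from cheap operations", here is a minimal sketch of the Ghost-module idea in PyTorch: an ordinary convolution produces a few intrinsic feature maps, a cheap depthwise convolution generates the remaining "ghost" maps, and the two sets are concatenated. The class name `GhostModule` and the hyperparameters (ratio=2, 3x3 depthwise kernel) are illustrative assumptions, not the authors' exact implementation; see their repository above for the real thing.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Illustrative sketch of a Ghost module (assumes out_ch is divisible by ratio)."""
    def __init__(self, in_ch, out_ch, kernel_size=1, ratio=2, dw_kernel=3):
        super().__init__()
        init_ch = out_ch // ratio     # intrinsic maps from the costly convolution
        ghost_ch = out_ch - init_ch   # ghost maps from the cheap operation
        self.primary = nn.Sequential(
            nn.Conv2d(in_ch, init_ch, kernel_size, padding=kernel_size // 2, bias=False),
            nn.BatchNorm2d(init_ch),
            nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(
            # depthwise convolution = the "cheap" linear operation
            nn.Conv2d(init_ch, ghost_ch, dw_kernel, padding=dw_kernel // 2,
                      groups=init_ch, bias=False),
            nn.BatchNorm2d(ghost_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        intrinsic = self.primary(x)
        ghost = self.cheap(intrinsic)
        return torch.cat([intrinsic, ghost], dim=1)

if __name__ == "__main__":
    m = GhostModule(16, 32)
    print(m(torch.randn(1, 16, 56, 56)).shape)  # torch.Size([1, 32, 56, 56])
```

The saving comes from the second branch: a depthwise 3x3 convolution is far cheaper than producing all output channels with the full convolution.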
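Likewise, a rough illustration of the AdderNet idea from item 2: the usual multiply-accumulate of a convolution is replaced by a negative L1 distance between each input patch and each filter, so the forward pass needs only additions, subtractions and absolute values. This naive `Adder2d` layer built on `F.unfold` is an assumption-heavy sketch for intuition, not the paper's optimized implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Adder2d(nn.Module):
    """Sketch of an AdderNet-style layer: responses are negative L1 distances."""
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size) * 0.1)
        self.kernel_size, self.stride, self.padding = kernel_size, stride, padding

    def forward(self, x):
        n, _, h, w = x.shape
        # Sliding patches: (N, C*k*k, L), where L is the number of output positions.
        patches = F.unfold(x, self.kernel_size, stride=self.stride, padding=self.padding)
        w_flat = self.weight.view(self.weight.size(0), -1)          # (out_ch, C*k*k)
        # Negative L1 distance between every patch and every filter (no multiplications).
        out = -(patches.unsqueeze(1) - w_flat[None, :, :, None]).abs().sum(dim=2)
        h_out = (h + 2 * self.padding - self.kernel_size) // self.stride + 1
        w_out = (w + 2 * self.padding - self.kernel_size) // self.stride + 1
        return out.reshape(n, -1, h_out, w_out)

if __name__ == "__main__":
    layer = Adder2d(3, 8)
    print(layer(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 8, 32, 32])
```

Note that this unfold-based version is memory-hungry; it is meant to show the arithmetic, not to be fast.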
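Finally, for item 6, "outputting a Pareto front" simply means keeping the architectures that are not dominated on the accuracy/latency trade-off. A small self-contained sketch of that selection step (the candidate tuples below are made up for illustration; this is not the CARS search itself):

```python
from typing import List, Tuple

def pareto_front(candidates: List[Tuple[str, float, float]]) -> List[Tuple[str, float, float]]:
    """Keep (name, accuracy, latency_ms) tuples not dominated by any other candidate."""
    front = []
    for name, acc, lat in candidates:
        dominated = any(
            (a >= acc and l <= lat) and (a > acc or l < lat)  # strictly better somewhere
            for _, a, l in candidates
        )
        if not dominated:
            front.append((name, acc, lat))
    return sorted(front, key=lambda t: t[2])  # order by latency for readability

if __name__ == "__main__":
    archs = [("A", 75.2, 40.0), ("B", 76.0, 55.0), ("C", 74.1, 38.0), ("D", 75.9, 70.0)]
    print(pareto_front(archs))  # [('C', 74.1, 38.0), ('A', 75.2, 40.0), ('B', 76.0, 55.0)]
```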

 

OCR

  1. ABCNet: Real-time Scene Text Spotting with Adaptive Bezier-Curve Network Paper address: https://arxiv.org/abs/2002.10200  Code: https://github.com/Yuliang-Liu/bezier_curve_text_spotting, https://github.com/aim-uofa/adet

 

Image classification

  1. Self-training with Noisy Student improves ImageNet classification Paper address: https://arxiv.org/abs/1911.04252

  2. Image Matching across Wide Baselines: From Paper to Practice Paper address: https://arxiv.org/abs/2003.01587

  3. Towards Robust Image Classification Using Sequential Attention Models Paper address: https://arxiv.org/abs/1912.02184

 

Video analysis

  1. Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications Paper address: https://arxiv.org/abs/2003.01455
    Code: https://github.com/bbrattoli/ZeroShotVideoClassification

  2. Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs Paper address: https://arxiv.org/abs/2003.00387

  3. Fine-grained Video-Text Retrieval with Hierarchical Graph Reasoning Paper address: https://arxiv.org/abs/2003.00392

  4. Object Relational Graph with Teacher-Recommended Learning for Video Captioning Paper address: https://arxiv.org/abs/2002.11566

  5. Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution Paper address: https://arxiv.org/abs/2002.11616

  6. Blurry Video Frame Interpolation Paper address: https://arxiv.org/abs/2002.12259

  7. Hierarchical Conditional Relation Networks for Video Question Answering Paper address: https://arxiv.org/abs/2002.10698

  8. Action Modifiers: Learning from Adverbs in Instructional Video Paper address: https://arxiv.org/abs/1912.06617

 

Image Processing

  1. Learning to Shade Hand-drawn Sketches Paper address: https://arxiv.org/abs/2002.11812

  2. Single Image Reflection Removal through Cascaded Refinement Paper address: https://arxiv.org/abs/1911.06634

  3. Generalized ODIN: Detecting Out-of-distribution Image without Learning from Out-of-distribution Data Paper address: https://arxiv.org/abs/2002.11297

  4. Deep Image Harmonization via Domain Verification Paper address: https://arxiv.org/abs/1911.13239  Code: https://github.com/bcmi/Image_Harmonization_Datasets

  5. RoutedFusion: Learning Real-time Depth Map Fusion Paper address: https://arxiv.org/pdf/2001.04388.pdf

 

Update

  1. Visual Commonsense R-CNN

https://arxiv.org/abs/2002.12204

  2. Generalized ODIN: out-of-distribution image detection

https://arxiv.org/abs/2002.11297

  3. Blurry Video Frame Interpolation

https://arxiv.org/abs/2002.12259

  4. Meta-Transfer Learning for Zero-Shot Super-Resolution

https://arxiv.org/abs/2002.12213

  5. Total3DUnderstanding: 3D indoor scene understanding

https://arxiv.org/abs/2002.12212

  6. Unbiased scene graph generation from biased training

https://arxiv.org/abs/2002.11949

  7. Auto-Encoding Twin-Bottleneck Hashing

https://arxiv.org/abs/2002.11930

  8. Social-STGCNN: a social spatio-temporal graph convolutional neural network for human trajectory prediction

https://arxiv.org/abs/2002.11927

  9. Towards Universal Representation Learning for Deep Face Recognition

https://arxiv.org/abs/2002.11841

  10. ClusterFit: improving the generalization of visual representations

https://arxiv.org/abs/1912.03330

  11. Reducing contextual bias

https://arxiv.org/abs/2002.11812

  12. Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

https://arxiv.org/abs/1911.07450

  13. Zooming Slow-Mo: fast and accurate space-time video super-resolution

https://arxiv.org/abs/2002.11616

  14. Object Relational Graph with Teacher-Recommended Learning for Video Captioning

https://arxiv.org/abs/2002.11566

  15. Rethinking the Route Towards Weakly Supervised Object Localization

https://arxiv.org/abs/2002.11359

  16. Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

https://arxiv.org/pdf/2002.10638.pdf

  17. GhostNet: a lightweight neural network

https://arxiv.org/pdf/1911.11907.pdf

  18. AdderNet: Do We Really Need Multiplications in Deep Learning?

https://arxiv.org/pdf/1912.13200.pdf

  19. CARS: Continuous Evolution for Efficient Neural Architecture Search

https://arxiv.org/abs/1909.04977

  20. Single Image Reflection Removal through Cascaded Refinement

https://arxiv.org/abs/1911.06634

  21. Filter Grafting for Deep Neural Networks

https://arxiv.org/pdf/2001.05868.pdf

  22. PolarMask: unifying instance segmentation with the FCN framework

https://arxiv.org/pdf/1909.13226.pdf

  23. Semi-supervised semantic image segmentation

https://arxiv.org/pdf/1811.07073.pdf

  24. Defending against universal attacks through selective feature regeneration

https://arxiv.org/pdf/1906.03444.pdf

  25. Real-time fine-grained sketch-based image retrieval

https://arxiv.org/abs/2002.10310

  26. Probing VQA models with sub-questions

https://arxiv.org/abs/1906.03444

  27. Learning a neural 3D texture space from 2D exemplars

https://geometry.cs.ucl.ac.uk/projects/2020/neuraltexture/

  28. NestedVAE: Isolating Common Factors via Weak Supervision

https://arxiv.org/abs/2002.11576

  29. Multi-future trajectory prediction

https://arxiv.org/pdf/1912.06445.pdf

  30. Towards Robust Image Classification Using Sequential Attention Models

https://arxiv.org/pdf/1912.02184


Source: the official website of CVPR 2021






