CVPR 2020 Tutorial on Visual Recognition for Images, Video, and 3D
Location: Online
The purpose of this tutorial is to discuss popular approaches and recent advancements in the family of visual recognition tasks for different input modalities. We will cover in detail the most recent work on object recognition and scene understanding. Going beyond single images, we will show current progress in video recognition (detection and classification) and 3D visual recognition (multi-object mesh prediction). Our goal is to show the existing connections between the techniques specialized for different input modalities and to provide some insight into the distinct challenges that each modality presents.
In conjunction with the tutorial, we are open-sourcing three new visual recognition systems for images, videos, and 3D, respectively. These PyTorch-based systems contain multiple state-of-the-art methods in the corresponding domains. In our tutorial, we will pair each research talk with a talk that discusses these codebases, sharing best engineering practices and showing implementation details for each domain. We hope that such pairing will help researchers who are interested primarily in visual recognition to build and benchmark their systems more easily. For researchers from other areas, we hope to make SOTA recognition systems easy to incorporate into their frameworks.
Session 1: 2D Recognition
Live Q&A: 3:00 PM - 3:15 PM PDT
Ross Girshick - Object Detection as a Machine Learning Problem [Slides]
Alexander Kirillov - Pixel-Level Recognition [Slides]
Yuxin Wu - Detectron2 [Slides]
Session 2: 3D Vision
Live Q&A: 3:20 PM - 3:35 PM PDT
Justin Johnson - Making 3D Predictions with 2D Supervision [Slides]
Nikhila Ravi - PyTorch3D [Slides]
Session 3: Video Recognition
Live Q&A: 3:40 PM - 3:55 PM PDT
Christoph Feichtenhofer - Efficient Video Recognition [Slides]
Haoqi Fan - PySlowFast [Slides]
FAIR is hiring Research Engineers! See information here.
Contact: Saining Xie