Katerina Fragkiadaki - 3D Vision with 3D View-Predictive Neural Scene representations

September 29th, 2020. MIT - CSAIL

Abstract: Current state-of-the-art CNNs localize rare object categories in internet photos, yet they miss basic facts that a two-year-old has mastered: that objects have 3D extent, that they persist over time despite changes in the camera view, and that they do not intersect in 3D, among others. We will discuss models that learn to map 2D images and videos into amodally completed 3D feature maps of the scene and the objects in it by predicting views. We will show that the proposed models learn object permanence, have objects emerge in 3D without human annotations, support grounding of language in 3D visual simulations, and learn intuitive physics that generalizes across scene arrangements and camera configurations. In this way, they overcome many limitations of 2D CNNs for video perception, model learning, and language grounding.

Bio: Katerina Fragkiadaki is an Assistant Professor in the Machine Learning Department at Carnegie Mellon University. She received her Ph.D. from University
