Google Tech Talks
January 29, 2008

ABSTRACT

We present an algorithm to convert standard digital pictures into 3-d models. This is a challenging problem, since an image is formed by a projection of the 3-d scene onto two dimensions, thus losing the depth information. We take a supervised learning approach to this problem, and use a Markov Random Field (MRF) to model the image depth cues as well as the relationships between different parts of the image. We show that even on unstructured scenes (indoor and outdoor environments, including forests, trees, buildings, etc.), our algorithm is frequently able to recover fairly accurate 3-d models. We use our method to create visually pleasing 3-d flythroughs from the image.

We also present a few extensions of these ideas, such as additionally incorporating triangulation (stereo) cues, and using multiple images to produce large-scale 3-d models. We also apply our methods to two robotics applications: (a) high-speed off-road obstacle avoidance on an autonomously driven remote-controlled car, and (b) having a robot unload items from a dishwasher.

To convert your own image of an outdoor scene, landscape, etc. to a 3-d model, please visit: http://make3d.stanford.edu

Joint work with Min Sun and Andrew Y. Ng.

Speaker: Ashutosh Saxena
Ashutosh is a PhD candidate with Prof. Andrew Y. Ng in the Computer Science department at Stanford University. He received his B.Tech. from the Indian Institute of Technology (IIT) Kanpur in 2004. His research focuses on machine learning approaches to problems in computer vision and in robotic manipulation. Using data-driven machine learning techniques, he developed algorithms for creating 3-d models from a single image, and algorithms for robotic manipulation tasks such as opening doors and grasping previously unseen objects.