KinectFusion and Going Out

KinectFusion: Real-Time 3D Reconstruction and Interaction Using a Moving Depth Camera

KinectFusion uses the Microsoft Kinect device to create real-time 3D reconstructions of indoor scenes using only the depth data from the camera.
The Kinect camera generates point clouds from which a mesh is generated. To avoid holes and reduce measurement noise, several points of view are needed. Starting from that concept, KinectFusion was designed to provide real-time camera tracking together with real-time 3D reconstruction, which lets the user move around the room while visualizing the mesh. Moreover, the system has to be usable in any room: it must scale to the room size, be independent of any infrastructure, and not rely on a specific detection method. Another challenge is to allow modification of the 3D reconstruction: the capture does not assume a static environment, i.e., it allows user interaction, so the indoor scene can evolve over time.
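As an illustration of how the depth data becomes a point cloud, the sketch below back-projects a depth image with a pinhole camera model. The intrinsics (fx, fy, cx, cy) are assumed values, roughly in the range of the original Kinect sensor, not taken from the paper.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (in metres) into a 3D point cloud
    using a pinhole camera model. Pixels with zero depth are dropped."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]

# Hypothetical intrinsics and a random depth map, for illustration only
points = depth_to_points(np.random.rand(480, 640).astype(np.float32),
                         fx=525.0, fy=525.0, cx=319.5, cy=239.5)
```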

KinectFusion also uses CUDA to access the GPU, which is much faster than the CPU for image processing. The GPU implementation consists of a four-phase pipeline whose stages can be parallelized. First, the depth map conversion is performed: 3D points are created from the raw measurements. Then comes camera tracking: the 6DOF pose matrices are computed so that consecutive frames can be aligned. Once the new camera position is known, volumetric integration is processed; this step fuses the points into a surface representation. Finally, raycasting is computed to render the reconstructed surface. The pipeline can be extended to support more features. A step can be added to model the physical behavior of one or more objects in the 3D reconstruction. Another feature is the separation of the foreground from the background. This distinction makes rendering the scene easier, because the foreground is more likely to change, since that is where user interaction happens.
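The volumetric integration step fuses each depth frame into a truncated signed distance function (TSDF) volume. The following is a simplified CPU sketch of that fusion, assuming a known camera pose, a small cubic voxel grid, and hypothetical intrinsics; the real system runs this per-voxel update in parallel on the GPU.

```python
import numpy as np

def integrate_tsdf(tsdf, weights, origin, voxel_size, depth, pose,
                   fx, fy, cx, cy, trunc=0.03):
    """Fuse one depth frame into a truncated signed distance volume.
    tsdf/weights: (N, N, N) arrays; origin: world position of voxel (0, 0, 0);
    pose: 4x4 camera-to-world transform. A simplified, sequential version
    of the GPU update described in the paper."""
    n = tsdf.shape[0]
    idx = np.indices((n, n, n)).reshape(3, -1).T            # voxel indices
    world = origin + (idx + 0.5) * voxel_size                # voxel centres in world space
    cam = (np.linalg.inv(pose) @ np.c_[world, np.ones(len(world))].T).T[:, :3]
    z = cam[:, 2]
    z_safe = np.where(z > 1e-6, z, 1e-6)                     # avoid division by zero
    u = np.round(fx * cam[:, 0] / z_safe + cx).astype(int)   # project into the depth image
    v = np.round(fy * cam[:, 1] / z_safe + cy).astype(int)
    h, w = depth.shape
    valid = (z > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = np.zeros(len(world))
    d[valid] = depth[v[valid], u[valid]]
    sdf = d - z                                              # signed distance along the ray
    valid &= (d > 0) & (sdf > -trunc)
    tsdf_new = np.clip(sdf / trunc, -1.0, 1.0)               # truncate to [-1, 1]
    i, j, k = idx[valid].T
    w_old = weights[i, j, k]
    # Running weighted average of the distance values seen so far
    tsdf[i, j, k] = (w_old * tsdf[i, j, k] + tsdf_new[valid]) / (w_old + 1)
    weights[i, j, k] = w_old + 1
```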


Going out: Robust Model-based Tracking for Outdoor Augmented Reality

This paper presents a system that provides real-time, accurate overlays on a handheld device. Traditional outdoor augmented reality systems use GPS, a magnetic compass, and inertial sensors to achieve localization. Unfortunately, GPS is very sensitive to shadowing, which is common in urban environments. The system described here instead uses a computer-vision-based solution combined with inertial and magnetic field measurements, fused by a Kalman filter. The tracking system is based on a textured 3D model of the urban environment. From this model, a realistic view is rendered from the prior camera pose, and edges are extracted from it; the prior pose, overlaid on the video image, is then updated against the live frame.
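A minimal sketch of the edge extraction applied to such a rendered view, using a simple gradient-magnitude test; the threshold and the plain central-difference gradients are assumptions for illustration, not the paper's actual edge detector.

```python
import numpy as np

def edge_map(rendered, threshold=0.2):
    """Extract edge pixels from a rendered grey-level view of the textured
    model using a gradient-magnitude test (a stand-in for the edge
    extraction stage of the model-based tracker)."""
    gx = np.zeros_like(rendered)
    gy = np.zeros_like(rendered)
    gx[:, 1:-1] = rendered[:, 2:] - rendered[:, :-2]   # central differences in x
    gy[1:-1, :] = rendered[2:, :] - rendered[:-2, :]   # central differences in y
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold                        # boolean edge mask
```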
The inertial and magnetic sensors provide gyroscopic measurements, a 3D acceleration vector, and a 3D magnetic field vector. Combined with the pose estimated by the optical tracking component, the system produces fairly accurate estimates even in the presence of fast motion.
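As a toy illustration of this fusion, the sketch below runs one predict/update cycle of a scalar Kalman filter on a heading angle, with the gyroscope driving the prediction and the optical pose supplying the correction. The 1D state and the noise values are assumptions; the actual system fuses the full pose.

```python
def kalman_fuse_heading(theta, var, gyro_rate, dt, q, vision_theta, r):
    """One predict/update cycle of a scalar Kalman filter on the heading
    angle: the gyroscope drives the prediction, the vision pose supplies
    the correction. A toy stand-in for the full sensor fusion."""
    # Predict: integrate the gyro rate, inflate the uncertainty.
    theta_pred = theta + gyro_rate * dt
    var_pred = var + q
    # Update: blend in the vision measurement according to its noise r.
    k = var_pred / (var_pred + r)                    # Kalman gain
    theta_new = theta_pred + k * (vision_theta - theta_pred)
    var_new = (1.0 - k) * var_pred
    return theta_new, var_new

# Hypothetical values: gyro-integrated heading corrected by an optical fix
theta, var = kalman_fuse_heading(theta=0.10, var=0.01, gyro_rate=0.5, dt=0.02,
                                 q=1e-4, vision_theta=0.12, r=4e-3)
```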
The problem with computer-vision-based tracking is that the system fails when an unmodeled object appears in the urban environment. The recovery subsystem implemented here uses a statistical test to detect when the tracking module does not succeed. In those cases, the recovery module tries to match the current frame with older ones.
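A minimal sketch of the frame-matching idea behind recovery, assuming keyframes are stored as downsampled grey images and compared by normalised cross-correlation; the paper's actual matching procedure may differ.

```python
import numpy as np

def best_keyframe(frame, keyframes):
    """Pick the stored keyframe most similar to the current (failed) frame
    using normalised cross-correlation on downsampled grey images.
    Returns the index of the best match and its correlation score."""
    f = (frame - frame.mean()) / (frame.std() + 1e-8)
    scores = []
    for kf in keyframes:
        k = (kf - kf.mean()) / (kf.std() + 1e-8)
        scores.append(float(np.mean(f * k)))        # correlation score in [-1, 1]
    return int(np.argmax(scores)), max(scores)
```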

To test the system, experiments were conducted at two different locations. The protocol is a game, which basically consists of finding a ladder and the right window. To help the user, these two elements are highlighted on the device's screen. The deployed system uses a cheap camera with a low resolution and a fairly slow frame rate. The observed accuracy is between 0 and 0.2 meters, and the system successfully recovered from several occlusions of the environment. However, when the tracked surface is too close to the camera, the tracking fails.