Abstract
Photogrammetry is a common method for creating 3D representations of real-world scenes. Given camera calibration and at least two images of a scene, it produces a point cloud representing that scene. Adding images from different cameras and positions supplies more information to the point cloud and yields a more robust representation of the scene. Traditional methods include Structure from Motion (SfM) and Multi-View Stereo (MVS). 3D object reconstruction, the creation of 3D representations of objects, has become a crucial technology in fields such as augmented reality, virtual reality, gaming, and autonomous navigation. Traditional methods have shown limitations in handling complex textures, dynamic scenes, and non-static objects. Recent neural methods, such as Neural Radiance Fields (NeRF), have demonstrated superior results in novel view synthesis and high-fidelity reconstruction. This project explores the use of Gaussian Splatting to generate accurate 3D objects from video data while addressing challenges such as background noise, non-static objects, and object orientation. The project employed a systematic design combining the Segment Anything Model (SAM) for object segmentation, COLMAP for point cloud generation, and Gaussian Splatting for model training and rendering. Data was collected using high-resolution video capture, and background removal was performed using AI-based segmentation. Performance improvements were achieved by increasing video length and refining training parameters.
The results demonstrate that objects with distinct textures and varying surface patterns yield higher reconstruction accuracy, while objects with symmetrical features or low contrast remain challenging. This work highlights the potential of Gaussian Splatting in real-world 3D modeling applications and proposes future improvements in object orientation handling, multi-camera integration, and enhanced inpainting algorithms.