Abstract:
This thesis focuses on image-based rendering (IBR) techniques designed to operate in real-world environments, with special attention paid to state-of-the-art end-to-end pipelines for creating and displaying virtual reality (VR) experiences of 360° real-world environments. Head-mounted displays (HMDs) enable users to experience virtual environments freely, but the creation of real-world VR experiences remains a challenging interdisciplinary research problem. VR experiences can differ greatly depending on the underlying scene representation, and the meaning of real-world VR depends heavily on the context, i.e., the system or format at hand. The thesis introduces the terminology and fundamental concepts needed to understand related IBR and learned (neural) IBR approaches, which are surveyed by category in the context of end-to-end pipelines for creating real-world IBR experiences. The applicability of the discussed approaches to real-world VR applications is categorised by practical aspects covering capture, reconstruction, representation, and rendering, which provides an overview of the research landscape to which this thesis contributes. The life cycle of immersive media production depends on both computer vision and computer graphics problems and, taken as a whole, describes end-to-end pipelines for creating 3D photography used to render high-quality real-world VR experiences. Computer vision is needed to obtain viewpoint and scene information for creating scene representations, i.e., 3D photographs, and computer graphics is needed to create high-quality novel viewpoints, for instance by applying IBR techniques to the reconstructed scene representation. The lack of widely available immersive real-world VR content that suits current generations of HMDs motivates research in casual 3D photography. Furthermore, augmenting widely available real-world VR formats, e.g., omnidirectional stereo (ODS), appears to be a promising way to increase the immersion of currently available real-world VR experiences. This thesis contributes three end-to-end IBR pipelines for the creation and display of immersive 360° VR experiences, all outperforming the current de facto standard (ODS) while relying only on moderate computational resources that are commonly available to casual consumers, and one learned IBR approach based on conditional adversarial nets that takes a casually captured video sweep as input to perform high-quality video extrapolation. The ability to casually capture 3D photography might have a profound impact on the way consumers capture, edit, share, and relive personal experiences in the near future.