Aggregating Spatial and Photometric Context for Photometric Stereo

Honzátko, David

doi:10.5075/epfl-thesis-9806

2024

Abstract

Photometric stereo, a computer vision technique for estimating the 3D shape of objects through images captured under varying illumination conditions, has been a topic of research for nearly four decades. In its general formulation, photometric stereo is an ill-posed problem and requires robust prior knowledge of material reflectance properties, light transport, and object shapes, all of which are quite difficult to obtain in many scenarios. We focus on the task of estimating the surface normals of an inspected object given a large, but a priori unknown, number of input images and the illumination directions under which these images were captured. This is also known as far-field dense calibrated photometric stereo, and it is the main topic of this thesis. As in many other computer vision fields, recent advances in photometric stereo have leveraged deep learning. Despite their success, these methods struggle with the large input data dimensionality, the disparity between the spatial domain and the domain of illumination directions, the a priori unknown number of observations provided for a scene, and the general unavailability of extensive real data collections to train them. To tackle these issues, we formulate the problem as a four-dimensional regression and propose novel neural architectures that leverage both the spatial context of individual images and the photometric context captured in the intensity variations of individual pixels under different illumination directions. Our methods work with the concept of observation maps: fixed-size two-dimensional planes that encode pixel intensities together with the associated illumination directions, built separately for each pixel. This framework enabled the design of fully convolutional networks utilizing separable four-dimensional convolutions, which simultaneously process observation maps and image spatial dimensions, thus learning both reflectance and shape prior knowledge.
With this approach, we achieve higher performance than existing work. Additionally, we introduce a fast rendering approach for on-the-fly sample generation during training, which allows for much larger diversity in shape and reflectance properties than existing static datasets offer. Coupled with an efficient training strategy, this approach enables training the four-dimensional neural architectures on standard consumer hardware within a reasonable timeframe. These innovations have culminated in state-of-the-art qualitative performance on all relevant benchmark datasets that feature real images, thus making a significant contribution to the field of photometric stereo.
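
To make the notion of an observation map more concrete, the following is a minimal sketch of how one might be built for a single pixel. It is not taken from the thesis: the function name `observation_map`, the default grid size, the orthographic projection of each unit light direction onto its x and y components, the nearest-cell binning, and the overwrite-on-collision rule are all assumptions made for illustration. The point it shows is that the map has a fixed size regardless of how many input images are available.

```python
def observation_map(intensities, light_dirs, size=32):
    """Build a size x size observation map for ONE pixel (illustrative sketch).

    intensities: one observed intensity per input image.
    light_dirs:  one unit light direction (lx, ly, lz) per input image,
                 assumed to lie on the upper hemisphere (lz > 0).
    """
    if len(intensities) != len(light_dirs):
        raise ValueError("need exactly one light direction per intensity")

    # Cells that receive no observation stay at zero.
    grid = [[0.0] * size for _ in range(size)]

    for value, (lx, ly, _lz) in zip(intensities, light_dirs):
        # Assumption: project the light direction onto the xy-plane and
        # map each coordinate from [-1, 1] to a cell index in [0, size - 1].
        col = int(round((lx + 1.0) / 2.0 * (size - 1)))
        row = int(round((ly + 1.0) / 2.0 * (size - 1)))
        col = min(size - 1, max(0, col))
        row = min(size - 1, max(0, row))
        # Assumption: a later observation overwrites an earlier one
        # that falls into the same cell.
        grid[row][col] = value

    return grid


if __name__ == "__main__":
    # Two images of the same pixel: one lit head-on, one lit from the side.
    m = observation_map([0.5, 0.9], [(0.0, 0.0, 1.0), (0.6, 0.0, 0.8)], size=5)
    for row in m:
        print(row)
```

Stacking one such map per pixel yields a four-dimensional array (two spatial axes and two observation-map axes), which is the kind of input the four-dimensional convolutions mentioned in the abstract would operate on.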

Details

Title Aggregating Spatial and Photometric Context for Photometric Stereo

Author(s) Honzátko, David

Advisor(s)

Fua, Pascal
Türetken, Engin

Pagination 114

Date 2024

Publisher Lausanne, EPFL

Keywords

photometric stereo; synthetic data generation; four-dimensional convolutions; fully convolutional neural architectures; 3D shape reconstruction

Language English

DOI https://doi.org/10.5075/epfl-thesis-9806

Laboratories CVLAB

Record Appears in Scientific production and competences > I&C - School of Computer and Communication Sciences > IINFCOM > CVLAB - Computer Vision Laboratory
Scientific production and competences > Euler Center for Signal Processing
Scientific production and competences > EPFL Theses
Work produced at EPFL
Published
Theses

Record creation date 2024-01-29
