Abstract
In everyday photography, physical limitations of camera sensors and lenses frequently lead to a variety of degradations in captured images, such as saturation or defocus blur. A common approach to overcoming these limitations is image stack fusion, which involves capturing multiple images with different focal distances or exposures. For instance, to obtain an all-in-focus image, a set of multi-focus images is captured; similarly, capturing multiple exposures allows for the reconstruction of high-dynamic-range content. In this paper, we present a novel approach that combines neural fields with an expressive camera model to achieve a unified reconstruction of an all-in-focus, high-dynamic-range image from an image stack. Our approach is composed of a set of specialized implicit neural representations, each tailored to a specific sub-problem along our pipeline: we use neural implicits to predict flow, which overcomes misalignments arising from lens breathing; depth and all-in-focus images, which account for depth of field; and tonemapping, which handles sensor responses and saturation - all trained using a physically inspired supervision structure with a differentiable thin lens model at its core. An important benefit of our approach is its ability to handle these tasks simultaneously or independently, providing flexible post-editing capabilities such as refocusing and exposure adjustment. By sampling the three primary factors in photography (focal distance, aperture, and exposure time) within our framework, we conduct a thorough exploration to gain valuable insights into their significance and impact on overall reconstruction quality. Through extensive validation, we demonstrate that our method outperforms existing approaches in both depth-from-defocus and all-in-focus image reconstruction. Moreover, our approach exhibits promising results along each of these three dimensions, showcasing its potential to enhance captured image quality and provide greater control in post-processing.
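To make the physically inspired supervision concrete, below is a minimal sketch (not the authors' implementation) of the two differentiable building blocks the abstract names: a thin lens model that maps scene depth to a defocus blur size, and a simple exposure model that maps HDR radiance to a saturating LDR observation. The standard circle-of-confusion formula and the fixed gamma curve stand in for the paper's learned components; all parameter names and values are illustrative assumptions.

```python
import jax.numpy as jnp
from jax import grad

def coc_diameter(depth, focus_dist, focal_length, f_number):
    """Thin-lens circle-of-confusion diameter, differentiable in depth.

    c = (A * f / (s - f)) * |d - s| / d, with aperture diameter A = f / N,
    focus distance s, object distance d (all in metres).
    """
    aperture = focal_length / f_number
    return (aperture * focal_length / (focus_dist - focal_length)
            * jnp.abs(depth - focus_dist) / depth)

def ldr_observation(hdr_radiance, exposure_time, gamma=2.2):
    """Scene radiance -> saturating LDR pixel value.

    A fixed gamma curve stands in for a learned tonemapping network;
    clipping models sensor saturation.
    """
    exposed = jnp.clip(hdr_radiance * exposure_time, 0.0, 1.0)
    return exposed ** (1.0 / gamma)

# Gradients flow through both maps, e.g. d(coc)/d(depth) at 2 m for an
# (assumed) 50 mm lens focused at 1.5 m with f/2.8:
d_coc = grad(coc_diameter)(2.0, 1.5, 0.05, 2.8)
```

Because both maps are differentiable, observations captured at varying focus distance, f-number, and exposure time can supervise the underlying depth, all-in-focus, and HDR predictions directly by gradient descent, which is the role the differentiable thin lens model plays in the pipeline described above.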