Abstract
Objects obscured by occluders are considered lost in the images acquired by conventional camera systems, prohibiting both visualization and understanding of such hidden objects. Non-line-of-sight methods (NLOS) aim at recovering information about hidden scenes, which could help make medical imaging less invasive, improve the safety of autonomous vehicles, and potentially enable capturing unprecedented high-definition RGB-D data sets that include geometry beyond the directly visible parts. Recent NLOS methods have demonstrated scene recovery from time-resolved pulse-illuminated measurements encoding occluded objects as faint indirect reflections. Unfortunately, these systems are fundamentally limited by the quartic intensity fall-off for diffuse scenes. With laser illumination limited by eye-safety limits, recovery algorithms must tackle this challenge by incorporating scene priors. However, existing NLOS reconstruction algorithms do not facilitate learning scene priors. Even if they did, datasets that allow for such supervision do not exist, and successful encoder-decoder networks and generative adversarial networks fail for real-world NLOS data. In this work, we close this gap by learning hidden scene feature representations tailored to both reconstruction and recognition tasks such as classification or object detection, while still relying on physical models at the feature level. We overcome the lack of real training data with a generalizable architecture that can be trained in simulation. We learn the differentiable scene representation jointly with the reconstruction task using a differentiable transient renderer in the objective, and demonstrate that it generalizes to unseen classes and unseen real-world scenes, unlike existing encoder-decoder architectures and generative adversarial networks. The proposed method allows for end-to-end training for different NLOS tasks, such as image reconstruction, classification, and object detection, while being memory-efficient and running at real-time rates. We demonstrate hidden view synthesis, RGB-D reconstruction, classification, and object detection in the hidden scene in an end-to-end fashion.
Supplemental Material
Available for Download
Supplemental material.
- Nils Abramson. 1978. Light-in-flight recording by holography. Optics Letters 3, 4 (1978), 121--123.Google ScholarCross Ref
- Victor Arellano, Diego Gutierrez, and Adrian Jarabo. 2017. Fast back-projection for non-line of sight reconstruction. Optics Express 25, 10 (2017), 11574--11583.Google ScholarCross Ref
- Katherine L Bouman, Vickie Ye, Adam B Yedidia, Frédo Durand, Gregory W Wornell, Antonio Torralba, and William T Freeman. 2017. Turning corners into cameras: Principles and methods. In IEEE International Conference on Computer Vision (ICCV). 2289--2297.Google ScholarCross Ref
- Samuel Burri. 2016. Challenges and Solutions to Next-Generation Single-Photon Imagers. Technical Report. EPFL.Google Scholar
- Mauro Buttafava, Jessica Zeman, Alberto Tosi, Kevin Eliceiri, and Andreas Velten. 2015. Non-line-of-sight imaging using a time-gated single photon avalanche diode. Optics express 23, 16 (2015), 20997--21011.Google Scholar
- Piergiorgio Caramazza, Alessandro Boccolini, Daniel Buschek, Matthias Hullin, Catherine F Higham, Robert Henderson, Roderick Murray-Smith, and Daniele Faccio. 2018a. Neural network identification of people hidden from view with a single-pixel, single-photon detector. Scientific reports 8, 1 (2018), 11945.Google Scholar
- Piergiorgio Caramazza, Alessandro Boccolini, Daniel Buschek, Matthias Hullin, Catherine F Higham, Robert Henderson, Roderick Murray-Smith, and Daniele Faccio. 2018b. Neural network identification of people hidden from view with a single-pixel, single-photon detector. Scientific Reports 8, 1 (2018), 11945.Google ScholarCross Ref
- Susan Chan, Ryan E Warburton, Genevieve Gariepy, Jonathan Leach, and Daniele Faccio. 2017. Non-line-of-sight tracking of people at long range. Optics express 25, 9 (2017), 10109--10117.Google Scholar
- Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, and Fisher Yu. 2015. ShapeNet: An Information-Rich 3D Model Repository. Technical Report arXiv:1512.03012 [cs.GR]. Stanford University --- Princeton University --- Toyota Technological Institute at Chicago.Google Scholar
- Wenzheng Chen, Simon Daneau, Fahim Mannan, and Felix Heide. 2019. Steady-state non-line-of-sight imaging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6790--6799.Google ScholarCross Ref
- Javier Grau Chopite, Matthias B. Hullin, Michael Wand, and Julian Iseringhausen. 2020. Deep Non-Line-of-Sight Reconstruction. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).Google Scholar
- Christopher B Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 2016. 3d-r2n2: A unified approach for single and multi-view 3d object reconstruction. In European conference on computer vision. Springer, 628--644.Google ScholarCross Ref
- Özgün Çiçek, Ahmed Abdulkadir, Soeren S Lienkamp, Thomas Brox, and Olaf Ronneberger. 2016. 3D U-Net: learning dense volumetric segmentation from sparse annotation. In International conference on medical image computing and computerassisted intervention. Springer, 424--432.Google Scholar
- PB Coates. 1972. Pile-up corrections in the measurement of lifetimes. Journal of Physics E: Scientific Instruments 5, 2 (1972), 148.Google ScholarCross Ref
- Michael F Cohen and Donald P Greenberg. 1985. The hemi-cube: A radiosity solution for complex environments. ACM Siggraph Computer Graphics 19, 3 (1985), 31--40.Google ScholarDigital Library
- Qi Guo, Iuri Frosio, Orazio Gallo, Todd Zickler, and Jan Kautz. 2018. Tackling 3d tof artifacts through learning and the flat dataset. In Proceedings of the European Conference on Computer Vision (ECCV). 368--383.Google ScholarDigital Library
- Otkrist Gupta, Thomas Willwacher, Andreas Velten, Ashok Veeraraghavan, and Ramesh Raskar. 2012. Reconstruction of hidden 3D shapes using diffuse reflections. Opt. Express 20, 17 (Aug 2012), 19096--19108.Google ScholarCross Ref
- Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.Google ScholarCross Ref
- Felix Heide, Steven Diamond, David B Lindell, and Gordon Wetzstein. 2018. Subpicosecond photon-efficient 3D imaging using single-photon sensors. Scientific reports 8, 1 (2018), 17726.Google Scholar
- Felix Heide, Matthias B Hullin, James Gregson, and Wolfgang Heidrich. 2013. Low-budget transient imaging using photonic mixer devices. ACM Transactions on Graphics (ToG) 32, 4 (2013), 1--10.Google ScholarDigital Library
- Felix Heide, Matthew O'Toole, Kai Zang, David B Lindell, Steven Diamond, and Gordon Wetzstein. 2019. Non-line-of-sight imaging with partial occluders and surface normals. ACM Transactions on Graphics (ToG) 38, 3 (2019), 22.Google ScholarDigital Library
- Felix Heide, Lei Xiao, Wolfgang Heidrich, and Matthias B Hullin. 2014. Diffuse mirrors: 3D reconstruction from diffuse indirect illumination using inexpensive time-of-flight sensors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3222--3229.Google ScholarDigital Library
- Quercus Hernandez, Diego Gutierrez, and Adrian Jarabo. 2017. A Computational Model of a Single-Photon Avalanche Diode Sensor for Transient Imaging. arXiv:physics.insdet/1703.02635Google Scholar
- Julian Iseringhausen and Matthias B Hullin. 2020. Non-line-of-sight reconstruction using efficient transient rendering. ACM Transactions on Graphics (TOG) 39, 1 (2020), 1--14.Google ScholarDigital Library
- Max Jaderberg, Karen Simonyan, Andrew Zisserman, et al. 2015. Spatial transformer networks. In Advances in neural information processing systems. 2017--2025.Google Scholar
- Adrian Jarabo and Victor Arellano. 2018. Bidirectional rendering of vector light transport. In Computer Graphics Forum, Vol. 37. Wiley Online Library, 96--105.Google ScholarDigital Library
- Adrian Jarabo, Julio Marco, Adolfo Munoz, Raul Buisan, Wojciech Jarosz, and Diego Gutierrez. 2014. A Framework for Transient Rendering. ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia) 33, 6 (nov 2014). Google ScholarDigital Library
- Adrian Jarabo, Belen Masia, Julio Marco, and Diego Gutierrez. 2017. Recent advances in transient imaging: A computer graphics and vision perspective. Visual Informatics 1, 1 (2017), 65--79.Google ScholarCross Ref
- Achuta Kadambi, Refael Whyte, Ayush Bhandari, Lee Streeter, Christopher Barsi, Adrian Dorrington, and Ramesh Raskar. 2013. Coded time of flight cameras: sparse deconvolution to address multipath interference and recover time profiles. ACM Transactions on Graphics (ToG) 32, 6 (2013), 167.Google ScholarDigital Library
- Achuta Kadambi, Hang Zhao, Boxin Shi, and Ramesh Raskar. 2016. Occluded imaging with time-of-flight sensors. ACM Transactions on Graphics (ToG) 35, 2 (2016), 15.Google ScholarDigital Library
- Ori Katz, Pierre Heidmann, Mathias Fink, and Sylvain Gigan. 2014. Non-invasive singleshot imaging through scattering layers and around corners via speckle correlations. Nature photonics 8, 10 (2014), 784.Google Scholar
- Ori Katz, Eran Small, and Yaron Silberberg. 2012. Looking around corners and through thin turbid layers in real time with scattered incoherent light. Nature photonics 6, 8 (2012), 549--553.Google Scholar
- A. Kirmani, T. Hutchison, J. Davis, and R. Raskar. 2009. Looking around the corner using transient imaging. In IEEE International Conference on Computer Vision (ICCV). 159--166.Google Scholar
- Ahmed Kirmani, Dheera Venkatraman, Dongeek Shin, Andrea Colaço, Franco NC Wong, Jeffrey H Shapiro, and Vivek K Goyal. 2014. First-photon imaging. Science 343, 6166 (2014), 58--61.Google Scholar
- Jonathan Klein, Christoph Peters, Jaime Martín, Martin Laurenzis, and Matthias B Hullin. 2016. Tracking objects outside the line of sight using 2D intensity images. Scientific reports 6 (2016), 32491.Google Scholar
- Martin Laurenzis and Andreas Velten. 2014. Feature selection and back-projection algorithms for nonline-of-sight laser-gated viewing. Journal of Electronic Imaging 23, 6 (2014), 063003.Google ScholarCross Ref
- David B Lindell, Gordon Wetzstein, and Vladlen Koltun. 2019a. Acoustic non-line-of-sight imaging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6780--6789.Google ScholarCross Ref
- David B. Lindell, Gordon Wetzstein, and Matthew O'Toole. 2019b. Wave-based non-line-of-sight imaging using fast f-k migration. ACM Trans. Graph. (SIGGRAPH) 38, 4 (2019), 116.Google ScholarDigital Library
- Xiaochun Liu, Sebastian Bauer, and Andreas Velten. 2020. Phasor field diffraction based reconstruction for fast non-line-of-sight imaging systems. Nature Communications 11 (2020). Google ScholarCross Ref
- Xiaochun Liu, Ibón Guillén, Marco La Manna, Ji Hyun Nam, Syed Azer Reza, Toan Huu Le, Adrian Jarabo, Diego Gutierrez, and Andreas Velten. 2019. Non-line-of-sight imaging using phasor-field virtual wave optics. Nature (2019), 1--4.Google Scholar
- Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, and Yaser Sheikh. 2019. Neural Volumes: Learning Dynamic Renderable Volumes from Images. ACM Trans. Graph. 38, 4, Article 65 (July 2019), 14 pages.Google ScholarDigital Library
- Julio Marco, Quercus Hernandez, Adolfo Muñoz, Yue Dong, Adrian Jarabo, Min H Kim, Xin Tong, and Diego Gutierrez. 2017. DeepToF: off-the-shelf real-time correction of multipath interference in time-of-flight imaging. ACM Transactions on Graphics (ToG) 36, 6 (2017), 1--12.Google ScholarDigital Library
- Christopher A. Metzler, Felix Heide, Prasana Rangarajan, Muralidhar Madabhushi Balaji, Aparna Viswanath, Ashok Veeraraghavan, and Richard G. Baraniuk. 2020. Deep-inverse correlography: towards real-time high-resolution non-line-of-sight imaging. Optica 7, 1 (Jan 2020), 63--71. Google ScholarCross Ref
- Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. arXiv:cs.CV/2003.08934Google Scholar
- N. Naik, S. Zhao, A. Velten, R. Raskar, and K. Bala. 2011. Single view reflectance capture using multiplexed scattering and time-of-flight imaging. ACM Trans. Graph. 30, 6 (2011), 171.Google ScholarDigital Library
- Frédéric Nolet, Samuel Parent, Nicolas Roy, Marc-Olivier Mercier, Serge Charlebois, Réjean Fontaine, and Jean-Francois Pratte. 2018. Quenching Circuit and SPAD Integrated in CMOS 65 nm with 7.8 ps FWHM Single Photon Timing Resolution. Instruments 2, 4 (2018), 19.Google ScholarCross Ref
- Kyle Olszewski, Sergey Tulyakov, Oliver Woodford, Hao Li, and Linjie Luo. 2019. Transformable Bottleneck Networks. The IEEE International Conference on Computer Vision (ICCV) (Nov 2019).Google Scholar
- Matthew O'Toole, David B Lindell, and Gordon Wetzstein. 2018a. Confocal non-line-of-sight imaging based on the light-cone transform. Nature 555, 7696 (2018), 338.Google Scholar
- Matthew O'Toole, David B. Lindell, and Gordon Wetzstein. 2018b. Confocal Non-line-of-sight imaging based on the light cone transform. Nature (2018), 338--341. Issue 555.Google Scholar
- R. Pandharkar, A. Velten, A. Bardagjy, E. Lawson, M. Bawendi, and R. Raskar. 2011. Estimating motion and size of moving non-line-of-sight objects in cluttered environments. In Proc. CVPR. 265--272.Google Scholar
- Luca Parmesan, Neale AW Dutton, Neil J Calder, Andrew J Holmes, Lindsay A Grant, and Robert K Henderson. 2014. A 9.8 μm sample and hold time to amplitude converter CMOS SPAD pixel. In Solid State Device Research Conference (ESSDERC), 2014 44th European. IEEE, 290--293.Google Scholar
- Adithya Pediredla, Ashok Veeraraghavan, and Ioannis Gkioulekas. 2019. Ellipsoidal Path Connections for Time-gated Rendering. ACM Trans. Graph. (SIGGRAPH) (2019).Google Scholar
- Adithya Kumar Pediredla, Mauro Buttafava, Alberto Tosi, Oliver Cossairt, and Ashok Veeraraghavan. 2017. Reconstructing rooms using photon echoes: A plane based model and reconstruction algorithm for looking around the corner. In IEEE International Conference on Computational Photography (ICCP). IEEE.Google ScholarCross Ref
- Stephan R Richter and Stefan Roth. 2018. Matryoshka networks: Predicting 3d geometry via nested shape layers. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1936--1944.Google Scholar
- Olaf Ronneberger, Philipp Fischer, and Thomas Brox. 2015. U-net: Convolutional networks for biomedical image segmentation. In International Conference on Medical image computing and computer-assisted intervention. Springer, 234--241.Google ScholarCross Ref
- Charles Saunders, John Murray-Bruce, and Vivek K Goyal. 2019. Computational periscopy with an ordinary digital camera. Nature 565, 7740 (2019), 472.Google Scholar
- Nicolas Scheiner, Florian Kraus, Fangyin Wei, Buu Phan, Fahim Mannan, Nils Appenrodt, Werner Ritter, Jurgen Dickmann, Klaus Dietmayer, Bernhard Sick, et al. 2020. Seeing Around Street Corners: Non-Line-of-Sight Detection and Tracking In-the-Wild Using Doppler Radar. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2068--2077.Google ScholarCross Ref
- Vincent Sitzmann, Justus Thies, Felix Heide, Matthias Niessner, Gordon Wetzstein, and Michael Zollhöfer. 2019a. DeepVoxels: Learning Persistent 3D Feature Embeddings. In Proc. CVPR.Google ScholarCross Ref
- Vincent Sitzmann, Michael Zollhöfer, and Gordon Wetzstein. 2019b. Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations. In Advances in Neural Information Processing Systems.Google Scholar
- Robert H Stolt. 1978. Migration by Fourier transform. Geophysics 43, 1 (1978), 23--48.Google ScholarCross Ref
- Shuochen Su, Felix Heide, Gordon Wetzstein, and Wolfgang Heidrich. 2018. Deep end-to-end time-of-flight imaging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 6383--6392.Google ScholarCross Ref
- Matthew Tancik, Guy Satat, and Ramesh Raskar. 2018. Flash Photography for Data-Driven Hidden Scene Recovery. CoRR abs/1810.11710 (2018). arXiv:1810.11710 http://arxiv.org/abs/1810.11710Google Scholar
- Maxim Tatarchenko, Alexey Dosovitskiy, and Thomas Brox. 2015. Single-view to Multi-view: Reconstructing Unseen Views with a Convolutional Network. CoRR abs/1511.06702 (2015). arXiv:1511.06702 http://arxiv.org/abs/1511.06702Google Scholar
- Chia-Yin Tsai, Kiriakos N Kutulakos, Srinivasa G Narasimhan, and Aswin C Sankaranarayanan. 2017. The geometry of first-returning photons for non-line-of-sight imaging. In IEEE International Conference on Computer Vision and Pattern Recognition (CVPR).Google ScholarCross Ref
- Chia-Yin Tsai, Aswin C Sankaranarayanan, and Ioannis Gkioulekas. 2019. Beyond Volumetric Albedo-A Surface Optimization Framework for Non-Line-Of-Sight Imaging. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 1545--1555.Google ScholarCross Ref
- A. Velten, T. Willwacher, O. Gupta, A. Veeraraghavan, M.G. Bawendi, and R. Raskar. 2012. Recovering three-dimensional shape around a corner using ultrafast time-of-flight imaging. Nature Communications 3 (2012), 745.Google ScholarCross Ref
- A. Velten, D. Wu, A. Jarabo, B. Masia, C. Barsi, C. Joshi, E. Lawson, M. Bawendi, D. Gutierrez, and R. Raskar. 2013. Femto-Photography: Capturing and Visualizing the Propagation of Light. ACM Trans. Graph. 32 (2013).Google Scholar
- Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. 2018. Non-local neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 7794--7803.Google ScholarCross Ref
- D. Wu, M. O'Toole, A. Velten, A. Agrawal, and R. Raskar. 2012. Decomposing global light transport using time of flight imaging. In Proc. CVPR. 366--373.Google Scholar
- Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 2015. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1912--1920.Google Scholar
- Feihu Xu, Gal Shulkind, Christos Thrampoulidis, Jeffrey H. Shapiro, Antonio Torralba, Franco N. C. Wong, and Gregory W. Wornell. 2018. Revealing hidden scenes by photon-efficient occlusion-based opportunistic active imaging. OSA Opt. Express 26, 8 (2018), 9945--9962.Google ScholarCross Ref
- Tinghui Zhou, Shubham Tulsiani, Weilun Sun, Jitendra Malik, and Alexei A. Efros. 2016. View Synthesis by Appearance Flow. CoRR abs/1605.03557 (2016). arXiv:1605.03557 http://arxiv.org/abs/1605.03557Google Scholar
Index Terms
- Learned feature embeddings for non-line-of-sight imaging and recognition
Recommendations
A dataset for benchmarking time-resolved non-line-of-sight imaging
SIGGRAPH '19: ACM SIGGRAPH 2019 PostersTime-resolved imaging has made it possible to look around corners by exploiting information from diffuse light bounces. While there have been successive improvements in the field since its conception, so far it has only been proven to work in very ...
Non-line-of-sight Imaging with Partial Occluders and Surface Normals
Imaging objects obscured by occluders is a significant challenge for many applications. A camera that could “see around corners” could help improve navigation and mapping capabilities of autonomous vehicles or make search and rescue missions more ...
Virtual Mirrors: Non-Line-of-Sight Imaging Beyond the Third Bounce
Non-line-of-sight (NLOS) imaging methods are capable of reconstructing complex scenes that are not visible to an observer using indirect illumination. However, they assume only third-bounce illumination, so they are currently limited to single-corner ...
Comments