Abstract
The ability to generate novel, diverse, and realistic 3D shapes along with associated part semantics and structure is central to many applications requiring high-quality 3D assets or large volumes of realistic training data. A key challenge towards this goal is how to accommodate diverse shape variations, including both continuous deformations of parts and structural or discrete alterations which add to, remove from, or modify the shape constituents and compositional structure. Such object structure can typically be organized into a hierarchy of constituent object parts and relationships, represented as a hierarchy of n-ary graphs. We introduce StructureNet, a hierarchical graph network which (i) can directly encode shapes represented as such n-ary graphs, (ii) can be robustly trained on large and complex shape families, and (iii) can be used to generate a great diversity of realistic structured shape geometries. Technically, we accomplish this by drawing inspiration from recent advances in graph neural networks to propose an order-invariant encoding of n-ary graphs, considering jointly both part geometry and inter-part relations during network training. We extensively evaluate the quality of the learned latent spaces for various shape families and show significant advantages over baseline and competing methods. The learned latent spaces enable several structure-aware geometry processing applications, including shape generation and interpolation, shape editing, and shape structure discovery directly from un-annotated images, point clouds, or partial scans.
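To make the idea of an order-invariant encoding of a hierarchy of n-ary graphs more concrete, here is a minimal NumPy sketch. It is not the StructureNet architecture: the feature sizes, the single round of message passing, the sum and max aggregations, the random untrained weights, and all names (`Part`, `encode`, `permute_children`) are assumptions chosen purely for illustration. Each part carries a small geometry vector, its children form a graph whose edges stand in for inter-part relations, and a parent code is obtained by pooling over the children symmetrically, so that reordering the children does not change the result.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # latent feature size (assumed)
G = 6   # per-part geometry size, e.g. box center + extents (assumed)

# Random, untrained weights: this only illustrates the data flow.
W_leaf = rng.normal(scale=0.3, size=(G, D))
W_msg = rng.normal(scale=0.3, size=(2 * D, D))
W_upd = rng.normal(scale=0.3, size=(2 * D, D))
W_parent = rng.normal(scale=0.3, size=(D + G, D))


class Part:
    """A node of the hierarchy: geometry, child parts, and edges among children."""

    def __init__(self, geometry, children=(), edges=()):
        self.geometry = np.asarray(geometry, dtype=float)
        self.children = list(children)
        self.edges = list(edges)  # directed (source, target) indices into children


def encode(part):
    """Bottom-up encoding of a part hierarchy into a single D-dimensional code."""
    if not part.children:
        return np.tanh(part.geometry @ W_leaf)
    feats = np.stack([encode(c) for c in part.children])  # shape (n, D)
    # One round of message passing along the relation edges (sum aggregation).
    msgs = np.zeros_like(feats)
    for src, dst in part.edges:
        msgs[dst] += np.tanh(np.concatenate([feats[src], feats[dst]]) @ W_msg)
    feats = np.tanh(np.concatenate([feats, msgs], axis=1) @ W_upd)
    # Max pooling over children is symmetric, so child order does not matter.
    pooled = feats.max(axis=0)
    return np.tanh(np.concatenate([pooled, part.geometry]) @ W_parent)


def permute_children(part, perm):
    """Return a copy of `part` with children reordered and edges re-indexed."""
    new_index = {old: new for new, old in enumerate(perm)}
    children = [part.children[old] for old in perm]
    edges = [(new_index[s], new_index[t]) for s, t in part.edges]
    return Part(part.geometry, children, edges)


# A toy "chair": seat, back, and two legs, with symmetric adjacency edges.
seat = Part([0.0, 0.5, 0.0, 1.0, 0.1, 1.0])
back = Part([0.0, 1.0, -0.5, 1.0, 1.0, 0.1])
leg_a = Part([-0.4, 0.25, 0.4, 0.1, 0.5, 0.1])
leg_b = Part([0.4, 0.25, 0.4, 0.1, 0.5, 0.1])
adjacency = [(0, 1), (1, 0), (0, 2), (2, 0), (0, 3), (3, 0)]
chair = Part([0.0, 0.5, 0.0, 1.0, 1.5, 1.0], [seat, back, leg_a, leg_b], adjacency)

chair_perm = permute_children(chair, [2, 0, 3, 1])
print(np.allclose(encode(chair), encode(chair_perm)))
```

The invariance follows from using only symmetric aggregations (a sum over incoming messages and a max over children); any encoder that instead concatenated child features in list order would depend on the arbitrary ordering of parts.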
Index Terms
- StructureNet: hierarchical graph networks for 3D shape generation