Lossless Compression of Deep Neural Networks

  • Conference paper
Integration of Constraint Programming, Artificial Intelligence, and Operations Research (CPAIOR 2020)

Abstract

Deep neural networks have been successful in many predictive modeling tasks, such as image and language recognition, where large neural networks are often used to obtain good accuracy. Consequently, it is challenging to deploy these networks under limited computational resources, such as in mobile devices. In this work, we introduce an algorithm that removes units and layers of a neural network without changing the output it produces, which thus implies a lossless compression. This algorithm, which we denote as LEO (Lossless Expressiveness Optimization), relies on Mixed-Integer Linear Programming (MILP) to identify Rectified Linear Units (ReLUs) with linear behavior over the input domain. By using \(\ell_1\) regularization to induce such behavior, we can benefit from training over a larger architecture than the one we would later use in the environment where the trained network is deployed.
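
To make the approach concrete, the sketch below (a simplified stand-in, not the authors' implementation) illustrates the core idea on a small fully connected ReLU network stored as NumPy arrays. The paper decides, via an exact MILP per unit, whether a ReLU behaves linearly over the whole input domain; here interval bound propagation over a box domain plays that role as a cheaper, sound relaxation: any unit it flags as stably active (pre-activation always nonnegative, so the ReLU acts as the identity) or stably inactive (always nonpositive, so the unit always outputs zero and can be removed) really is linear, though the MILP can certify strictly more units. Function names and the toy network are illustrative.

```python
# A minimal sketch, assuming a fully connected ReLU network given as a list
# of NumPy weight matrices and bias vectors for its ReLU layers, and a box
# input domain lb <= x <= ub. Interval bound propagation stands in for the
# paper's exact per-unit MILP test: it is sound but not complete.
import numpy as np

def affine_bounds(W, b, lb, ub):
    """Bound W @ x + b over the box lb <= x <= ub."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    lower = W_pos @ lb + W_neg @ ub + b
    upper = W_pos @ ub + W_neg @ lb + b
    return lower, upper

def stable_units(weights, biases, lb, ub):
    """Per ReLU layer, flag units that are provably linear on the box:
    'active'   -> pre-activation always >= 0, the ReLU is the identity;
    'inactive' -> pre-activation always <= 0, the unit always outputs zero
                  and can be deleted without changing the network's output."""
    report = []
    for W, b in zip(weights, biases):
        pre_lb, pre_ub = affine_bounds(W, b, lb, ub)
        report.append({"active": pre_lb >= 0.0, "inactive": pre_ub <= 0.0})
        # ReLU is monotone, so clipping the pre-activation box gives the
        # post-activation box that feeds the next layer.
        lb, ub = np.maximum(pre_lb, 0.0), np.maximum(pre_ub, 0.0)
    return report

# Toy usage: two ReLU layers (4 -> 8 -> 3) on the box [-1, 1]^4;
# any non-ReLU output head is omitted.
rng = np.random.default_rng(0)
Ws = [rng.standard_normal((8, 4)), rng.standard_normal((3, 8))]
bs = [rng.standard_normal(8), rng.standard_normal(3)]
print(stable_units(Ws, bs, lb=-np.ones(4), ub=np.ones(4)))
```

The \(\ell_1\) penalty mentioned in the abstract enters at training time: adding \(\lambda \sum_i |w_i|\) to the loss drives many weights toward zero, which pushes more pre-activations into these stable regimes and is what makes it worthwhile to train a larger architecture and then compress it losslessly.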

Author information

Corresponding author

Correspondence to Thiago Serra.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Serra, T., Kumar, A., Ramalingam, S. (2020). Lossless Compression of Deep Neural Networks. In: Hebrard, E., Musliu, N. (eds.) Integration of Constraint Programming, Artificial Intelligence, and Operations Research. CPAIOR 2020. Lecture Notes in Computer Science, vol. 12296. Springer, Cham. https://doi.org/10.1007/978-3-030-58942-4_27

  • DOI: https://doi.org/10.1007/978-3-030-58942-4_27

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-58941-7

  • Online ISBN: 978-3-030-58942-4

  • eBook Packages: Computer Science (R0)
