Abstract
Cellular Neural Networks (CeNNs) are considered a powerful paradigm for embedded devices. Their analog and mixed-signal hardware implementations have proven applicable to high-speed image processing, video analysis, and medical signal processing, but their efficiency and popularity are limited by small implementation size and low precision. Recently, digital implementations of CeNNs on FPGAs have attracted researchers from both academia and industry due to their high flexibility and short time-to-market. However, most existing implementations are not optimized well enough to fully exploit the advantages of the FPGA platform: they carry unnecessary design and computational redundancy that prevents speedup. We propose a multi-level-optimization framework for energy-efficient CeNN implementations on FPGAs. In particular, the framework features optimizations at three levels (system, module, and design space), with a focus on reducing computational redundancy and improving attainable performance. Experimental results show that, across various configurations, our framework achieves an energy-efficiency improvement of 3.54× and a speedup of up to 3.88× over existing implementations with similar accuracy.
Index Terms
- A Multi-Level-Optimization Framework for FPGA-Based Cellular Neural Network Implementation