
Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis

  • Original Article
  • Published in Neuroinformatics

Abstract

Reading Uyghur text in biomedical graphic images is a challenging problem because of Uyghur's complex layouts and cursive script. In this paper, we propose a system that extracts text from Uyghur biomedical images and matches the text against a specific lexicon for semantic analysis. The proposed system has the following distinctive properties. First, it is an integrated system: a single fully convolutional neural network detects and crops the Uyghur text lines, and a well-designed matching network then matches keywords from the lexicon. Second, to train the matching network effectively, an online sampling method continually generates synthetic training data. Finally, we propose a GPU acceleration scheme that lets the matching network process a complete Uyghur text line directly rather than a single window. Experimental results on a benchmark dataset show that our method achieves a good F-measure of 74.5%. Moreover, thanks to the GPU acceleration scheme, the system remains efficient, with a running time of 0.5 s per image.
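The keyword-matching step can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual matching network: it assumes hypothetical per-window embeddings of a cropped text line and embeddings of each lexicon keyword, and batches all window–keyword comparisons into a single matrix multiplication, which mirrors the idea of matching a whole line at once instead of one window at a time.

```python
import numpy as np

def match_keywords(line_vecs, lexicon_vecs, threshold=0.8):
    """Match every sliding-window embedding of a text line against a lexicon.

    line_vecs:    (n_windows, d) embeddings of windows along the line (hypothetical)
    lexicon_vecs: (n_keywords, d) embeddings of the lexicon keywords (hypothetical)
    Returns indices of keywords whose best cosine similarity over all
    windows reaches the threshold.
    """
    # L2-normalise so the inner product equals cosine similarity
    a = line_vecs / np.linalg.norm(line_vecs, axis=1, keepdims=True)
    b = lexicon_vecs / np.linalg.norm(lexicon_vecs, axis=1, keepdims=True)
    sims = a @ b.T            # (n_windows, n_keywords) in one batched matmul
    best = sims.max(axis=0)   # best-scoring window for each keyword
    return np.flatnonzero(best >= threshold)

# Toy usage: the first window closely matches keyword 0, nothing matches keyword 1
line = np.array([[1.0, 0.05], [0.9, 0.10]])    # two window embeddings
lexicon = np.array([[1.0, 0.0], [0.0, 1.0]])   # two keyword embeddings
print(match_keywords(line, lexicon, threshold=0.95))  # → [0]
```

On a GPU the single `(n_windows, n_keywords)` matrix product is exactly the kind of operation that parallelises well, which is the intuition behind batching a complete line rather than scoring windows one by one.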


Notes

  1. 512 3 × 3 convolutional filters + ReLU, 512 3 × 3 convolutional filters + ReLU, 6 3 × 3 convolutional filters, where the 6 output channels consist of two text/non-text prediction scores and four predicted coordinate offsets.

  2. http://www.wintone.com.cn/en/

  3. https://github.com/tesseract-ocr/tesseract, 4.0 version
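The prediction layer described in Note 1 can be sketched numerically. The following is a minimal NumPy version with toy sizes (8 channels instead of 512, a 5 × 5 feature map, and random placeholder weights, all purely for illustration); it shows that three stride-1, zero-padded 3 × 3 convolutions preserve the spatial size and emit 6 maps per position.

```python
import numpy as np

def conv3x3(x, filters):
    """3x3 convolution, stride 1, zero padding 1 (spatial size preserved).

    x:       (C_in, H, W) input feature map
    filters: (C_out, C_in, 3, 3) filter bank
    """
    c_out = filters.shape[0]
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))   # zero-pad spatial dims only
    out = np.zeros((c_out, h, w))
    for i in range(h):
        for j in range(w):
            patch = xp[:, i:i + 3, j:j + 3]     # (C_in, 3, 3) receptive field
            out[:, i, j] = np.tensordot(filters, patch, axes=3)
    return out

rng = np.random.default_rng(0)
relu = lambda t: np.maximum(t, 0)

x = rng.standard_normal((8, 5, 5))             # toy input feature map
w1 = rng.standard_normal((8, 8, 3, 3)) * 0.1   # stands in for 512 filters
w2 = rng.standard_normal((8, 8, 3, 3)) * 0.1   # stands in for 512 filters
w3 = rng.standard_normal((6, 8, 3, 3)) * 0.1   # 6 = 2 scores + 4 offsets

h = relu(conv3x3(x, w1))
h = relu(conv3x3(h, w2))
out = conv3x3(h, w3)
print(out.shape)  # → (6, 5, 5): 6 prediction maps, spatial size unchanged
```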


Acknowledgements

This work is supported by the National Nature Science Foundation of China (61771468, 61772526), the Youth Innovation Promotion Association Chinese Academy of Sciences (2017209).

Author information

Correspondence to Hongtao Xie.


Cite this article

Fang, S., Xie, H., Chen, Z. et al. Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis. Neuroinform 16, 445–455 (2018). https://doi.org/10.1007/s12021-017-9350-0
