
Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis

  • Original Article
  • Published in Neuroinformatics

Abstract

Reading Uyghur text in biomedical graphic images is a challenging problem because of Uyghur's complex layouts and cursive script. In this paper, we propose a system that extracts text from Uyghur biomedical images and matches the text against a specific lexicon for semantic analysis. The proposed system has the following distinctive properties. First, it is an integrated system: a single fully convolutional neural network detects and crops the Uyghur text lines, and a well-designed matching network then matches keywords from the lexicon. Second, to train the matching network effectively, an online sampling method continually generates synthetic training data. Finally, we propose a GPU acceleration scheme that lets the matching network process a complete Uyghur text line directly rather than a single window. Experimental results on a benchmark dataset show that our method achieves a good F-measure of 74.5%. Moreover, thanks to the GPU acceleration scheme, the system remains efficient, with a running time of 0.5 s per image.
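The keyword-matching step can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual matching network: it assumes hypothetical per-window embeddings of a cropped text line and embeddings of each lexicon keyword, and batches all window–keyword comparisons into a single matrix multiplication, which mirrors the idea of matching a whole line at once instead of one window at a time.

```python
import numpy as np

def match_keywords(line_vecs, lexicon_vecs, threshold=0.8):
    """Match every sliding-window embedding of a text line against a lexicon.

    line_vecs:    (n_windows, d) embeddings of windows along the line (hypothetical)
    lexicon_vecs: (n_keywords, d) embeddings of the lexicon keywords (hypothetical)
    Returns indices of keywords whose best cosine similarity over all
    windows reaches the threshold.
    """
    # L2-normalise so the inner product equals cosine similarity
    a = line_vecs / np.linalg.norm(line_vecs, axis=1, keepdims=True)
    b = lexicon_vecs / np.linalg.norm(lexicon_vecs, axis=1, keepdims=True)
    sims = a @ b.T            # (n_windows, n_keywords) in one batched matmul
    best = sims.max(axis=0)   # best-scoring window for each keyword
    return np.flatnonzero(best >= threshold)

# Toy usage: the first window closely matches keyword 0, nothing matches keyword 1
line = np.array([[1.0, 0.05], [0.9, 0.10]])    # two window embeddings
lexicon = np.array([[1.0, 0.0], [0.0, 1.0]])   # two keyword embeddings
print(match_keywords(line, lexicon, threshold=0.95))  # → [0]
```

On a GPU the single `(n_windows, n_keywords)` matrix product is exactly the kind of operation that parallelises well, which is the intuition behind batching a complete line rather than scoring windows one by one.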


Notes

  1. 512 3 × 3 convolutional filters + ReLU, 512 3 × 3 convolutional filters + ReLU, 6 3 × 3 convolutional filters, where the 6 output channels consist of two text/non-text prediction scores and four predicted coordinate offsets.

  2. http://www.wintone.com.cn/en/

  3. https://github.com/tesseract-ocr/tesseract, 4.0 version
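The prediction layer described in Note 1 can be sketched numerically. The following is a minimal NumPy version with toy sizes (8 channels instead of 512, a 5 × 5 feature map, and random placeholder weights, all purely for illustration); it shows that three stride-1, zero-padded 3 × 3 convolutions preserve the spatial size and emit 6 maps per position.

```python
import numpy as np

def conv3x3(x, filters):
    """3x3 convolution, stride 1, zero padding 1 (spatial size preserved).

    x:       (C_in, H, W) input feature map
    filters: (C_out, C_in, 3, 3) filter bank
    """
    c_out = filters.shape[0]
    _, h, w = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))   # zero-pad spatial dims only
    out = np.zeros((c_out, h, w))
    for i in range(h):
        for j in range(w):
            patch = xp[:, i:i + 3, j:j + 3]     # (C_in, 3, 3) receptive field
            out[:, i, j] = np.tensordot(filters, patch, axes=3)
    return out

rng = np.random.default_rng(0)
relu = lambda t: np.maximum(t, 0)

x = rng.standard_normal((8, 5, 5))             # toy input feature map
w1 = rng.standard_normal((8, 8, 3, 3)) * 0.1   # stands in for 512 filters
w2 = rng.standard_normal((8, 8, 3, 3)) * 0.1   # stands in for 512 filters
w3 = rng.standard_normal((6, 8, 3, 3)) * 0.1   # 6 = 2 scores + 4 offsets

h = relu(conv3x3(x, w1))
h = relu(conv3x3(h, w2))
out = conv3x3(h, w3)
print(out.shape)  # → (6, 5, 5): 6 prediction maps, spatial size unchanged
```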


Acknowledgements

This work is supported by the National Nature Science Foundation of China (61771468, 61772526), the Youth Innovation Promotion Association Chinese Academy of Sciences (2017209).

Author information

Correspondence to Hongtao Xie.


Cite this article

Fang, S., Xie, H., Chen, Z. et al. Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis. Neuroinform 16, 445–455 (2018). https://doi.org/10.1007/s12021-017-9350-0
