Improving the Accessibility of Mobile OCR Apps Via Interactive Modalities

Authors:
Michael Cutter, University of California, Santa Cruz, Santa Cruz, CA
Roberto Manduchi, University of California, Santa Cruz, Santa Cruz, CA

ACM Transactions on Accessible Computing, Volume 10, Issue 4, Article No. 11, pp. 1–27. https://doi.org/10.1145/3075300

Published: 09 August 2017

Abstract

We describe two experiments with a system designed to facilitate the use of mobile optical character recognition (OCR) by blind people. This system, implemented as an iOS app, enables two interaction modalities (autoshot and guidance). In the first study, augmented reality fiducials were used to track a smartphone’s camera, whereas in the second study, the text area extent was detected using a dedicated text spotting and text line detection algorithm. Although the guidance modality was expected to provide faster text access, this proved true only when certain conditions (involving the user interface and text detection modules) were met. Both studies also showed that our participants, after experimenting with the autoshot or guidance modality, appeared to have improved their skill at taking OCR-readable pictures even without the use of such interaction modalities.

Index Terms

1. Human-centered computing
  1. Accessibility
    1. Accessibility technologies
  2. Ubiquitous and mobile computing
    1. Empirical studies in ubiquitous and mobile computing

Published in
ACM Transactions on Accessible Computing Volume 10, Issue 4
October 2017
129 pages
ISSN: 1936-7228
EISSN: 1936-7236
DOI: 10.1145/3131767
Editors:
Matt Huenerfauth, Rochester Institute of Technology, USA
Kathleen F. McCoy, University of Delaware, USA
Copyright © 2017 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 9 August 2017
- Accepted: 1 March 2017
- Revised: 1 January 2017
- Received: 1 September 2016
Author Tags
OCR
accessibility
blindness
mobile devices
Qualifiers
- research-article
- Research
- Refereed