
Mathematical Content Browsing for Print-disabled Readers Based on Virtual-world Exploration and Audio-visual Sensory Substitution

Published: 11 May 2023


Abstract

Documents containing mathematical content remain largely inaccessible to blind and visually impaired readers because they are predominantly published as untagged PDFs, which do not include the semantic data necessary for effective accessibility. Equations in such documents consist of text interlaced with lines and other graphical elements and cannot be interpreted using a screen reader. We present a browsing approach for print-disabled readers specifically aimed at such mathematical content. This approach combines the navigational mechanisms often used to explore the virtual worlds of text adventure games with audio-visual sensory substitution for graphical content. The relative spatial placement of the elements of an equation is represented as a virtual world so that the reader can navigate between elements. Text elements are announced conventionally using synthesised speech, while graphical elements, such as roots and fraction lines, are rendered using a modification of the vOICe algorithm. The virtual world allows the reader to interactively discover the spatial structure of the equation, while the rendition of graphical elements as sound allows the shape and identity of elements that cannot be synthesised as speech to be discovered and recognised. The browsing approach was evaluated by 11 blind and 14 sighted participants in a user trial that included identifying twelve equations extracted from PDF documents. Overall, equations were identified completely correctly in 78% of cases (74% and 83%, respectively, for blind and sighted subjects). If partial correctness is considered, then the performance is substantially higher. Feedback from the blind subjects indicated that the technique allows spatial information and graphical detail to be discovered. We conclude that the integration of a spatial model represented as a virtual world in conjunction with audio-visual sensory substitution for non-textual elements can be an effective way for blind and visually impaired readers to read currently inaccessible mathematical content in PDF documents.


1 INTRODUCTION

The proliferation of electronic text has made it possible for blind and print-disabled readers to access articles and other documents directly without first requiring conversion from physically printed material. This is achieved by the use of screen readers and other assistive technology, which can convey electronic text to a blind reader via alternative media, such as synthesised speech and electronic braille. However, such media are currently unable to convey graphical information such as diagrams and print mathematics, which are by nature two-dimensional. Although standards exist for encoding accessible representations of mathematical equations, diagrams, and other graphical material, in practice most published work does not adhere to these standards. In particular, most scientific and technical papers are published as electronic documents in the portable document format (PDF) without additional accessibility information. For these documents, screen readers only provide access to the textual content. Equations and diagrams, which are often key to a proper understanding of the work, remain inaccessible.

We present a method that makes typeset equations, as typically found in scientific papers published as PDFs, directly accessible to blind and visually impaired readers by means of interactive textual exploration and audio sensory substitution. Such equations are a mixture of alphanumeric and graphical information, and we consider how they can be explored by a blind reader to discover their constituent parts, both textual and graphical, along with the spatial relationships between these elements. Our work builds on the vOICe algorithm proposed by Meijer [1992], which encodes visual information as sound, and combines this with approaches used in accessible games to interactively navigate spatially structured information. To some extent, the current study builds on our own previous work, in which we proposed extensions to the vOICe that allow it to be used for the interactive exploration of simple diagrams using gestures and a touch screen.

In the following section, we review the current state of the art in terms of accessibility of mathematical material to blind readers. We then describe our proposed approach, and evaluate it by means of a set of user trials. We further discuss our key findings, as well as informal user feedback gathered from the participants. We conclude by describing future directions of our research.


2 BACKGROUND

From an accessibility perspective, technical material in digital format can be broadly categorised into two types of content. The first is textual content like the text of an article, which can already be accessed by blind and visually impaired readers using screen readers and other assistive technology [Evans and Blenkhorn 2008; King 2012]. The second is two-dimensional content with graphical elements, such as equations and diagrams. This type of content can be made accessible via a screen reader if it is encoded using a semantically rich structure, such as MathML for mathematical equations [Frankel et al. 2017; World Wide Web Consortium 1998]. However, in most currently available PDF documents, such two-dimensional content is encoded as vector graphics or as rasterised image data. In this case, the content is not accessible using a screen reader unless additional accessibility annotations have been included in the document, such as alternative text (“alt text”) as specified by the Web Content Accessibility Guidelines [World Wide Web Consortium 2018] or the PDF/UA specification [Adobe 2022; Drümmer 2012; International Organization for Standardization 2014]. The PDF/UA ISO standard prescribes tags that can be added to PDF documents to improve accessibility. These tags are similar to HTML markup and can be used to convey semantic information. However, most current technical and scientific papers continue to be published as untagged PDF documents. It should also be borne in mind that textual and two-dimensional content are often interlaced in a document, for example, when the symbols of an equation or the labels of a diagram are represented as electronic text. Without the surrounding graphical content to provide context, however, this textual information can generally not be interpreted by a blind or visually impaired reader.

Work is currently being conducted under the umbrella of the LaTeX project that promises to make the production of tagged PDF documents possible. When completed, documents compiled using LaTeX should become accessible with minimal effort on the part of the author. However, it is unclear how much existing content will be recompiled to take advantage of these advances. Moreover, not all PDF documents are produced from LaTeX source.

Technical material can be made accessible by presenting the content in an alternative (usually linear) format that can be read using braille or synthesised speech. This approach is appropriate when the graphical content is a visual representation of semantic information that can be represented in an alternative textual form that is accessible via braille or text-to-speech (TTS), as is the case, for example, for mathematical equations and chemical formulas [Karshmer and Bledsoe 2002]. Alternatively, accessibility can be achieved by direct translation of the content into a form that can be interpreted by a human sense other than sight. This approach, known as sensory substitution, is appropriate when the shape of the graphical content itself is important, such as, for example, in the case of plans and geographical maps [King and Evans 2006]. An example of a form of sensory substitution is the use of tactile diagrams, in which the sense of touch is used to convey information that is usually encoded visually [Miller et al. 2010].

2.1 Linear Representations of Graphical Content

Perhaps the most easily understood linear representation for graphical content is a textual description. Current accessibility standards, for example, require that graphical information is described in this way to print-disabled readers [World Wide Web Consortium 2018]. This is usually accomplished by adding a special attribute containing the textual description to the information source, a mechanism that is known as alternative text or simply “alt text.” However, the linear nature of textual descriptions may not be appropriate for the description of complex content that includes structured relationships between its constituent parts. Furthermore, an adequate textual description for complex graphical information might be impractically long [Larkin and Simon 1987].

Several specialised linear formats have been developed for reading mathematical content. The best-known example is arguably braille mathematics. Although different standards exist for braille mathematics, for example, Nemeth braille [Braille Authority of North America 1972] and UK mathematics [Royal National Institute of Blind People 2015], they all make use of additional markup to provide context that is not readily apparent in a linear form. For instance, mathematical braille codes provide markup that indicates the start and end of a fraction. These demarcations are immediately apparent to a sighted reader looking at a graphical representation of the equation, but much harder to determine from a linear textual representation.

Similar markup is utilised by linear textual mathematical formats such as LaTeX and ASCII Math. Although these formats can be read by blind and visually impaired readers, they were designed primarily with typesetting in mind. Their braille representations are therefore not as easy to read as braille mathematics codes, and their direct rendition by a screen reader as synthesised speech is cumbersome [Melfi et al. 2018; Stöger and Miesenberger 2015]. For this reason, a number of ways to provide a more efficient means of reading linear mathematical content with a screen reader have been proposed.
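As a brief illustration of this markup overhead, consider how a fraction that print renders as a two-storey layout must be written linearly:

```latex
% The grouping that print conveys purely by spatial layout must be made
% explicit: \frac{...}{...} marks where the numerator and denominator
% begin and end.
y = \frac{x + 1}{2}
% ASCII Math expresses the same grouping with parentheses: y = (x+1)/2
```

Read symbol by symbol, such a representation forces the reader to track the grouping markup alongside the mathematical content itself.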

One of the first examples of such a system was AsTeR, described by Raman [1994]. AsTeR converts mathematical content in LaTeX format into synthesised speech, making use of different speech attributes to indicate aspects of the context. For instance, AsTeR will read a numerator in a higher pitch, while a denominator will be read in a lower pitch. AsTeR also provides commands for moving forward and backward through the content, as well as exploring structures in more detail. With the development of the MathML standard for displaying mathematics on the web, similar speech browsers have been devised for the exploration of MathML using a screen reader.

The company Design Science Inc. developed the MathPlayer extension for some web browsers. This software provides a hierarchical explorer for mathematical equations published in MathML format for screen reader users [Soiffer 2005, 2007]. Equations are represented as a tree structure, with sub-expressions represented as child nodes of their enclosing expressions. ChromeVox for Chrome OS was the first screen reader to incorporate native accessible equation reading, including a hierarchical explorer [Sorge et al. 2014]. Similar software extensions are provided for popular screen readers, such as the Access8Math extension for the NVDA screen reader [Tseng 2021]. An accessibility mode, which includes a hierarchical explorer, was also recently added to the MathJax library used to render TeX equations on the web [Cervone et al. 2016a, 2016b]. However, all these approaches require the semantic information of the equation to be specified as markup, such as LaTeX or MathML. Although mathematical content published as MathML is becoming more prevalent, a large body of existing mathematical content remains available only in untagged PDF documents without such markup, or even with the equations represented as embedded images.

Studying mathematics in a linear way also presents several challenges to a blind or visually impaired person [Jayant 2006; Karshmer and Bledsoe 2002]. Linear formats require the introduction of additional markup to indicate context that is not readily apparent without the spatial information included in a typeset equation. The result is an increase in the total number of symbols describing the equation. Furthermore, since all these symbols are part of one linear sequence, blind and visually impaired readers must determine the context of a term by reading ahead or backwards through the linear representation to find the relevant contextual markup. This problem is partially addressed by the hierarchical structure representations described above [Stöger and Miesenberger 2015].

The proliferation of several different linear representations for accessing mathematics has also added to the difficulty of communication between blind and sighted readers, as well as between blind readers using different linear representations. For instance, several standards exist for the representation of mathematics in braille, but these are not compatible [Stöger and Miesenberger 2015]. Translators have been developed to allow conversion between print mathematics and several linear formats [Edwards et al. 2006; Gardner 2014; Karshmer et al. 2003; Thompson 2005]. However, these translators require access to the semantic information as captured by MathML, for example, and this is often not available.

Hence, although many approaches to the accessibility of mathematical material have been proposed, a large body of published mathematical content is represented only as graphical layout, and the semantic information required for accessibility is not available. The conversion from graphical formats, such as those in untagged PDF documents, to accessible formats, such as braille or MathML, is usually a manual process. Pattern recognition approaches have been proposed to automate this conversion [Suzuki et al. 2003; Yamaguchi et al. 2008]. However, pattern-based approaches can only recognise elements that were included in the training data. Furthermore, the output may contain recognition errors, which a blind reader has no means to detect without access to the visual representation [Stöger and Miesenberger 2015].

2.2 Sensory Substitution

An alternative and more direct approach to the provision of access to two-dimensional visual material is by direct translation of the visual data to a form that can be perceived by a different human sense. This approach is sometimes referred to as sensory substitution, as it substitutes one sense with another for the perception of information. Two main sensory substitution methods that have been studied for non-visual accessibility are tactile perception and auditory feedback.

2.2.1 Tactile Perception.

Perhaps the best-known example of tactile-visual sensory substitution is the tactile diagram. A tactile diagram can be produced by hand using cardboard and other materials, but can also be produced by braille embossers equipped for this task [View Plus 2021] or printed as ink on material that expands when heat is subsequently applied [American Thermoform 2003]. Tactile diagrams will often require manual intervention in the conversion process, as the resolution of the human tactile sense is lower than that of the visual sense [Loomis et al. 2018]. Tactile diagrams also require labels and other textual elements to be represented in braille, which takes up more space than the printed equivalent [Miller et al. 2010]. Finally, perceiving information via touch necessitates different strategies for accessing and localising information and may be more vulnerable to systematic distortion [Hatwell et al. 2003].

When digital diagrams do include semantic metadata, it is possible to produce tactile diagrams with less manual intervention. This is accomplished by offering textual descriptions of graphical components alongside the tactile diagram, based on the metadata contained in the digital diagram. The company View Plus offers a product called Iveo, which enables a blind reader to interactively study an embossed diagram by placing it on a touch pad [View Plus 2005]. Although the lower resolution of the human tactile sense limits the effectiveness of this approach, it does allow an interactive spatial exploration of the diagram.

A disadvantage of tactile technology is its significant cost to the user. Tactile diagram production requires specialised hardware, and often manual editing, both of which are expensive. Alternative and lower-cost hardware has been proposed for tactile representation. This includes gloves with stimuli on the fingertips [Goncu and Marriott 2011; Manshad and Manshad 2008; Soviak et al. 2016], computer mice incorporating tactile displays [Jansson et al. 2006; Rastogi et al. 2010], force feedback devices [Sjöström et al. 2003; Tornil and Baptiste-Jessel 2004], and friction overlays for touch screens [Xu et al. 2011]. However, even these solutions require specialised hardware, which is often expensive and difficult to obtain.

The vibratory actuators built into consumer mobile devices have been studied as a low-cost alternative to traditional tactile diagrams [Gorlewicz et al. 2020; Klatzky et al. 2014]. One disadvantage of this approach is its limited bandwidth, since these vibratory actuators provide only a single contact point. It has been shown that multiple contact points are necessary for effective path tracing, as the user is then able to judge the curvature of the path and thereby judge the direction to move to continue tracing [Rosenbaum et al. 2006]. The limited bandwidth of commercial vibratory actuators can be partially compensated for by using distinct vibratory patterns and sound to denote different components of the diagram. However, this again requires the diagram to be enriched with semantic information and is therefore not useful for existing technical documents.

2.2.2 Auditory Feedback.

Sensory substitution via the sense of hearing has been studied by Meijer [1992], who described an algorithm called the vOICe (the capital letters meaning “Oh, I see”). The vOICe translates an image to sound by scanning it from left to right while generating a series of tone chords that correspond to the pixel columns of the image. Each chord consists of several sinusoids perceived as tones, where each tone indicates a bright pixel in the original image. The frequency and therefore the pitch of a tone is determined by the vertical position of the pixel in the source image, with higher pixels producing higher-pitched tones. Synthesised chords are also placed in the stereo field to indicate horizontal position [Meijer 2002]. Therefore, the leftmost column of the image would be placed to the left of the stereo field, while the rightmost column would be placed to the right.
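To make this mapping concrete, the following Rust sketch implements a vOICe-style image-to-sound conversion. The frequency range, column duration, brightness threshold, and equal-gain panning are our own assumptions for illustration; they are not parameters of Meijer's implementation.

```rust
use std::f32::consts::PI;

/// Render a grayscale image (row 0 at the top) as stereo samples using a
/// vOICe-style mapping: columns are scanned left to right, every bright
/// pixel adds a sinusoid whose pitch rises with height in the image, and
/// each chord is panned across the stereo field by column position.
fn sonify(image: &[Vec<u8>], sample_rate: f32, col_duration: f32) -> Vec<(f32, f32)> {
    let (rows, cols) = (image.len(), image[0].len());
    let samples_per_col = (sample_rate * col_duration) as usize;
    let (f_lo, f_hi) = (500.0_f32, 5000.0_f32); // assumed frequency range
    let mut stereo = Vec::with_capacity(cols * samples_per_col);
    for c in 0..cols {
        let pan = c as f32 / (cols - 1).max(1) as f32; // 0.0 = left, 1.0 = right
        for s in 0..samples_per_col {
            let t = s as f32 / sample_rate;
            let mut amplitude = 0.0;
            for r in 0..rows {
                if image[r][c] > 128 {
                    // higher rows (smaller r) map to higher frequencies
                    let height = 1.0 - r as f32 / (rows - 1).max(1) as f32;
                    let freq = f_lo * (f_hi / f_lo).powf(height);
                    amplitude += (2.0 * PI * freq * t).sin();
                }
            }
            let amplitude = amplitude / rows as f32; // crude normalisation
            stereo.push((amplitude * (1.0 - pan), amplitude * pan)); // equal-gain pan
        }
    }
    stereo
}
```

A horizontal line therefore produces a constant tone, a rising diagonal produces a tone of increasing pitch, and a vertical line produces a brief chord of many simultaneous tones.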

Alternative image to sound mappings have been described by Capelle et al. [1998] and by Abboud et al. [2014].

Although sensory substitution approaches to graphical accessibility have most often been applied to diagrams where the shape itself needs to be conveyed, they have also been studied as an alternative to the linear approach usually used to represent semantic spatial information like that present in mathematical equations. DotsPlus is a system for representing mathematical content as a tactile diagram interlaced with braille symbols [Gardner 1995]. DotsPlus is intended to be produced directly from visual mathematics on a graphical braille embosser. Mathematical symbols are replaced by braille counterparts, while visual elements are directly translated to a tactile form. In this way, most of the spatial layout and graphical indicators of the original visual representation are preserved.

In our own previous work, we extended the vOICe algorithm to allow the interactive exploration of simple diagrams using gestures and a touch screen [Kruger et al. 2020]. This method was shown to allow blind and visually impaired readers to access diagrams using a process of interactive interrogation that used the vOICe to sonify (that is, to render as sound) local portions of the diagram selected by finger gestures. Importantly, this approach allowed the readers to determine and understand the spatial relationships in the diagram. In this work, we will apply aspects of this technique to the particular case of mathematical equations. For the sake of brevity, we will continue to refer to the rendition of a certain portion of an image by the vOICe as its sonification.

2.3 Accessible Games

The virtual worlds used in accessible games are another setting that requires the representation of complex spatial information in a non-visual way. We will use some techniques employed by accessible games as an alternative method for representing the structured spatial information in mathematical equations [Balan et al. 2015].

Games that are accessible to blind and visually impaired users can be broadly categorised into two types, based on the medium of output. First, there are textual games, the output of which can be accessed by a screen reader [Montfort and Short 2012]. These games are also referred to as text adventures or interactive fiction. Second, there are audio games, which use audio as the primary output modality [Friberg and Gärdenfors 2004].

Most textual virtual worlds employ similar mechanisms to convey spatial layout [Montfort 2005]. Worlds are divided into locations (known as “rooms”), which can conceptually represent any amount of physical space. Relations between rooms are described by means of exits, which are usually indicated by compass directions, such as “north” and “east,” or “up” and “right.” The player is able to explore the environment by issuing textual commands to affect or query the state of the world model. The work by Murillo-Morales and Miesenberger [2020] on accessible diagram exploration via natural queries bears some similarity to the interaction modality used in textual games.

Audio games usually provide a number of tools that players can use to explore the environment [Trewin et al. 2008]. These include simulated sonar (which produces a beep for every object located within a defined range of the listener), footstep sounds that echo in empty spaces, and speech synthesis that describes aspects of the world not easily conveyed by sounds. One common functionality is a command that, when triggered, describes an object in a specific direction relative to the player. Players may, for instance, request the name of the object in front of them or the object to their left or right.

Audio games may also assign additional sounds to objects that may otherwise be silent. Similar to the sonic cues found in audio games, non-speech sounds (also called earcons) have been successfully used to convey navigational context when interactively exploring structure [Bates and Fitzpatrick 2010].

Our interest in the methods employed by accessible games is to explore the use of a direct translation method for reading mathematical equations. We believe that direct translation facilitated by sensory substitution, combined with the relational exploration conventions of accessible games, may provide some attractive advantages when used to supplement linear representations of mathematical equations. First and foremost, direct translation offers a method for accessing content that would otherwise not be accessible to a blind reader, because a linear representation of the content is not available for the overwhelming majority of currently published scientific and technical material. Second, this approach might facilitate communication between blind and sighted colleagues, because it provides blind readers access to the same spatial representation available to their sighted counterparts. Finally, it would allow blind authors to verify the visual representation of their work after typesetting it with a linear markup language such as braille mathematics or LaTeX and also serve as an accessible way of verifying the correctness of linear representations automatically produced by software like InftyReader.


3 PROPOSED APPROACH

Despite many initiatives and standards that aim to improve the accessibility of scientific and technical documents, the reality remains, unfortunately, that most such publications are only available in PDF format without any accessible markup. These documents consist of plain text interleaved with vector drawing instructions (to render equations and vector drawings) and embedded rasterised images (to render bitmap images). While such documents can be read using a screen reader, only the plain text portions are accessible. Reading the text embedded in the equations of these documents is of very limited use to a blind reader, since the text on its own cannot be used to interpret the equations without graphical information to provide the necessary context. Even when PDF documents are tagged, they usually do not include sufficient semantic information. The PDF document of a technical paper may, for instance, include tags denoting tables and headings, but no tags denoting the semantic information of equations. Hence, the current reality is that, although screen readers provide access to the textual material present in technical PDF documents, the information contained in equations remains inaccessible.

To illustrate the inaccessibility of equations from untagged PDF documents, consider the following equation: \(\begin{equation*} y=\frac{x}{2}+x^2.\end{equation*}\)

A current mainstream PDF reader extracts the following textual representation when viewing the equation:

We see that the equation is represented as text displayed over several lines as a result of the vertical placement of terms. It is, however, not possible to determine from this representation which elements are fractions and which are exponents. It is also not possible to determine the extent of the fraction line, and, therefore, which terms are part of the numerator and which are part of the denominator. For example, the following are all legitimate interpretations of the above textual representation:

  • \(y = \frac{2}{x}+\frac{2}{x}\)

  • \(y = \frac{2}{x}+x^2\)

  • \(y = \frac{2}{x+x^2}\)

  • \(y = x+\frac{2}{x^2}\).

As a second example, let us consider the following equation, which contains brackets: \(\begin{equation*} y=\left(\frac{x\sqrt {x}}{x+2}\right)^5.\end{equation*}\)

The plain text representation of this equation is as follows:

This equation is therefore displayed over four lines, due to the spacing of the fraction and the exponent. However, the graphical indicators are absent, and, therefore, it is impossible for a blind reader to distinguish between the exponent and the fraction. If we assume that the equation does not contain brackets, then the following interpretations are possible:

  • \(y = x+\sqrt {\frac{x}{x+2}}^5\)

  • \(y = x^x+\sqrt {\frac{x}{2}}^5\)

  • \(y = x^x+\frac{\sqrt {x}}{2}^5\)

  • \(y = \frac{x\sqrt {x}^5}{x+2}.\)

The above examples are intended to illustrate the inaccessibility of equations in untagged PDF documents.

To address these limitations, we have developed a procedure to convey the graphical information contained in a mathematical equation by way of audio-visual sensory substitution while continuing to allow access to symbols and other textual elements via synthesised speech or braille. Our approach is implemented as a non-visual content browser for graphical information that is interlaced with text, such as a mathematical equation. This browser draws on the techniques used to navigate and map virtual worlds in text adventure games to represent the spatial relationships between elements of the equation. It also conveys graphical information that cannot be identified using synthesised speech, through an adaptation of the vOICe sensory substitution algorithm first described by Meijer [1992] and subsequently adapted in our own work to allow the focused exploration of graphical content in electronic documents [Kruger et al. 2020]. For the current study, we focus specifically on access to mathematical equations that are graphically represented in inaccessible PDF documents. However, our proposed approach is in principle also applicable to other two-dimensional representations of information, such as charts or infographics.

Our browser provides blind readers with two interface modes, both of which allow the two-dimensional structure of an equation to be explored. The elements of the equation are considered to be a kind of map of the virtual world being explored. The reader is always positioned at one of the elements of the equation, referred to as the focus. The focus is always a textual item that can, for example, be spoken as synthesised speech. Because the geometric arrangement of the equation is preserved, the reader is able to discover how items are arranged relative to each other.

Non-textual components of the equation, such as fraction lines, the extents of square roots and most brackets, are considered to be graphical information, and their geometric shape is conveyed using the vOICe algorithm. In practice, the sonification is similar for elements of the same type and may therefore serve the same purpose as earcons, as described by Bates and Fitzpatrick [2010]. Since these non-textual graphical parts of an equation are represented in PDF documents as graphical instructions whose meaning cannot be easily inferred and therefore cannot be rendered as synthesised speech, they can never be in focus and are referred to as “not navigable.” Instead, they must be explored from the vantage of the surrounding navigable elements.

Text mode enables exploration using textual commands in a style similar to that used to explore the virtual worlds of text adventure games. There are commands to examine the focus and also to move to elements adjacent to the focus. Output is provided in the form of verbose textual descriptions, containing information about the elements of the equation currently in focus and their spatial placement.

Graphical mode allows the user to explore the equation by using keys on the keyboard, as well as gestures on a touch screen, if one is available. This interface, therefore, does not rely on explicit textual commands to be typed to move around the equation. Textual content that is encountered is announced by the screen reader, while graphical content is rendered as audio interactively by the adapted vOICe algorithm.

Our approach requires two sources of data: a rasterised image of the document as it would appear on-screen and the set of constituent textual elements with their locations and bounding rectangles. The textual elements and locations are obtained automatically by parsing the PDF document using a PDF parser like Poppler [freedesktop.org 2005]. To facilitate navigation, a document model of the equation under consideration is constructed using all textual elements that are identified by the PDF parser.
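The following Rust fragment sketches these two data sources as they might be represented internally; the type and field names are illustrative, not those of our implementation.

```rust
/// The two inputs required by the browser: a rasterised rendering of the
/// equation and the textual elements extracted from the PDF (e.g., by a
/// parser such as Poppler), each with its location and bounding rectangle.
struct SourceEquation {
    raster: Vec<Vec<u8>>,       // grayscale pixels of the rendered equation
    elements: Vec<TextElement>, // navigable items used to build the DOM
}

struct TextElement {
    text: String, // Unicode text announced by the screen reader
    bbox: Rect,   // location and extent in image coordinates
}

struct Rect {
    x0: f32, y0: f32, // top-left corner (y grows downwards)
    x1: f32, y1: f32, // bottom-right corner
}
```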

3.1 Document Object Model

The document object model (DOM) consists of a graph constructed using the navigable elements that can be identified from the source equation. Non-navigable elements are not contained within the DOM but are extracted from the rasterised rendering of the equation before sonification (described in Section 3.2). For each navigable element, a node is created containing as data the text of the element as well as its location and bounding rectangle. Thereafter, edges are added between neighbouring nodes. Each node may have up to 12 connected edges, representing four primary directions (left, up, right, and down), each with three secondary variants such as up-left, up-centre, and up-right. Although not utilised in this study, the DOM also supports nodes contained within nodes, as parent-child relationships. This might be useful for representing nested expressions in more complex equations or sub-parts of a diagram. However, the PDF extraction functionality will need to be extended to take advantage of such nested structures. This may involve using a pattern recognition approach similar to the algorithm described by Suzuki et al. [2003], which would allow semantic equation exploration, but with the benefit of non-visual layout verification.

To compute the edges, a few simple rules are followed:

  • A node may be connected to another node by only one edge of a certain kind.

  • The possible neighbours of a node are prioritised by shortest distance.

  • An edge between two nodes is only inserted if it does not cross any other nodes.

The pseudocode of the algorithm used to construct the edges of a DOM is given in Figure 1.

Fig. 1. Pseudocode of the algorithm used to construct the connections of the document object model (DOM).
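Since Figure 1 is not reproduced here, the following Rust sketch gives one possible realisation of the three rules above. The direction encoding, the tolerance used to choose a secondary variant, and the sampled segment-crossing test are our own assumptions, not details of the pseudocode in Figure 1.

```rust
use std::collections::HashSet;

#[derive(Clone, Copy)]
struct Rect { x0: f32, y0: f32, x1: f32, y1: f32 } // y grows downwards

impl Rect {
    fn cx(&self) -> f32 { (self.x0 + self.x1) / 2.0 }
    fn cy(&self) -> f32 { (self.y0 + self.y1) / 2.0 }
    /// Coarse test: does the segment from a to b pass through this rectangle?
    fn crossed_by(&self, a: (f32, f32), b: (f32, f32)) -> bool {
        (1..32).any(|i| {
            let t = i as f32 / 32.0;
            let (x, y) = (a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1));
            x >= self.x0 && x <= self.x1 && y >= self.y0 && y <= self.y1
        })
    }
}

/// An edge kind: a primary direction (0 = left, 1 = right, 2 = up, 3 = down)
/// and a secondary variant (-1, 0, +1), e.g. (2, -1) for "up-left".
type Kind = (u8, i8);

fn kind(dx: f32, dy: f32, tol: f32) -> Kind {
    if dx.abs() >= dy.abs() {
        let secondary = if dy < -tol { -1 } else if dy > tol { 1 } else { 0 };
        (if dx > 0.0 { 1 } else { 0 }, secondary)
    } else {
        let secondary = if dx < -tol { -1 } else if dx > tol { 1 } else { 0 };
        (if dy > 0.0 { 3 } else { 2 }, secondary)
    }
}

/// Build the DOM edges following the three rules: candidates are visited
/// nearest first, at most one edge of each kind is kept per node, and an
/// edge is rejected if its segment crosses a third node.
fn build_edges(rects: &[Rect], tol: f32) -> Vec<(usize, usize, Kind)> {
    let dist = |i: usize, j: usize| {
        ((rects[i].cx() - rects[j].cx()).powi(2)
            + (rects[i].cy() - rects[j].cy()).powi(2)).sqrt()
    };
    let mut edges = Vec::new();
    for i in 0..rects.len() {
        // rule 2: consider candidates in order of increasing distance
        let mut order: Vec<usize> = (0..rects.len()).filter(|&j| j != i).collect();
        order.sort_by(|&a, &b| dist(i, a).total_cmp(&dist(i, b)));
        let mut used: HashSet<Kind> = HashSet::new();
        for j in order {
            let k = kind(rects[j].cx() - rects[i].cx(),
                         rects[j].cy() - rects[i].cy(), tol);
            // rule 1: only one edge of each kind per node
            if used.contains(&k) { continue; }
            // rule 3: skip if the segment crosses any other node
            let (a, b) = ((rects[i].cx(), rects[i].cy()), (rects[j].cx(), rects[j].cy()));
            if (0..rects.len()).any(|m| m != i && m != j && rects[m].crossed_by(a, b)) {
                continue;
            }
            used.insert(k);
            edges.push((i, j, k));
        }
    }
    edges
}
```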

As an example, consider the equation \(y = \frac{x^2+4}{3}\), for which the DOM is shown in Figure 2. The DOM contains only textual elements; purely graphical elements are sonified directly from the rasterised rendition of the equation. Our algorithm for producing a DOM is deterministic; any given equation will therefore always result in the same DOM.

Fig. 2. Document object model for \(y = \frac{x^2+4}{3}\).

Note that non-navigable elements like fraction lines and the extent of square roots are not contained within the DOM and are therefore not taken into account when edges are inserted. These elements are extracted from the rasterised rendering of the equation at the time of sonification.

3.2 Sonification and Navigation

Both text mode and graphical mode allow the user to move among the navigable elements of an equation. The focus, which is initially the top left navigable element of the equation, is shared between both interface modes. In both interface modes, the text of the focus is announced by the screen reader and can also be sonified via audio-visual sensory substitution; in graphical mode, this sonification occurs automatically during navigation. The sonification of textual elements is especially useful given that symbols in PDF documents are often represented as custom glyph entries with non-standard character codes. In such instances, blind users may still be able to interpret the character from its visual shape.

Since non-textual graphical elements of the equation are not navigable and can therefore never be the focus, our approach is to allow users to interpret such graphical elements by means of audio-visual sensory substitution only, using the surrounding navigable textual elements as context. Examples of these graphical elements include fraction lines, large brackets, the radicands of roots, and other elements with no or partial symbolic representation. From the current focus, sonification of graphical content can be requested in the four primary directions. For instance, if the current focus is the symbol \(x\), then the reader can request any graphical content directly above this \(x\) to be sonified.

Sonification of non-textual graphical content is accomplished by constructing a bounding box located in the desired direction relative to the focus element. The bounding box extends from the edge of the focus in the requested direction, up to the edge of any element within line of sight in the same direction, or the edge of the screen. For sonification above or below the focus, the left and right edges of the bounding box are aligned with the edges of this element. However, for sonification to the left or right of the focus, the bounding box extends vertically to include lines with non-zero pixels, which, for the purposes of this evaluation, was sufficient to include the entirety of vertical graphical elements. Therefore, sonifying graphical content to the left or right of the focus will usually play the entire graphical element, while sonifying above or below will only play the part directly in line with the focus. This allows the user to hear symbols like brackets in their entirety, but elements like fraction lines and the extents of square roots in discrete segments relative to navigable elements. The reason for this is two-fold:

(1) Initial testing suggested that a bracket is difficult to interpret when this symbol is only partly sonified, while fraction lines and the extents of square roots are not. This is because much of the recognisable shape of brackets is included in the vertical dimension, which is rendered concurrently by the vOICe, while fraction lines contain information only in the horizontal direction, which is rendered as time.

(2) For brackets, it is sufficient for users to know whether they are present next to a navigable element or not, but for fraction lines and the extents of square roots it may be important to know exactly which part is above or below a navigable element. The latter is particularly important when more than one graphical element is above or below a sequence of navigable elements. For example, consider the equation \(y=\frac{3}{\sqrt {x}+2}\). The two elements \(x\) and 2 are both below the fraction line, but \(x\) is also below the extent of the square root. To correctly interpret the equation, the reader must be able to tell that \(x\) has two lines above it, but 2 has only one.

For equations containing more complex vertical extenders, a more robust segmentation algorithm might be necessary, but this issue is beyond the scope of this initial evaluation.
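The bounding-box construction described above can be sketched as follows in Rust. The geometry types are illustrative; the vertical growth of left/right boxes to cover rows containing non-zero pixels requires the rasterised image and is omitted from this sketch.

```rust
#[derive(Clone, Copy)]
struct Rect { x0: f32, y0: f32, x1: f32, y1: f32 } // y grows downwards

enum Side { Left, Right, Up, Down }

/// Compute the sonification region next to `focus`. `others` holds the
/// bounding rectangles of all other navigable elements and `screen` is the
/// full rendered page. The box stops at the nearest element within line of
/// sight, or at the screen edge if none exists.
fn sonify_region(focus: Rect, others: &[Rect], screen: Rect, side: Side) -> Rect {
    match side {
        Side::Up => {
            // nearest element fully above the focus and overlapping it horizontally
            let floor = others.iter()
                .filter(|r| r.x1 > focus.x0 && r.x0 < focus.x1 && r.y1 <= focus.y0)
                .map(|r| r.y1)
                .fold(screen.y0, f32::max);
            Rect { x0: focus.x0, y0: floor, x1: focus.x1, y1: focus.y0 }
        }
        Side::Down => {
            let ceiling = others.iter()
                .filter(|r| r.x1 > focus.x0 && r.x0 < focus.x1 && r.y0 >= focus.y1)
                .map(|r| r.y0)
                .fold(screen.y1, f32::min);
            Rect { x0: focus.x0, y0: focus.y1, x1: focus.x1, y1: ceiling }
        }
        Side::Left => {
            let wall = others.iter()
                .filter(|r| r.y1 > focus.y0 && r.y0 < focus.y1 && r.x1 <= focus.x0)
                .map(|r| r.x1)
                .fold(screen.x0, f32::max);
            // the vertical extent would be grown here to cover all rows with ink
            Rect { x0: wall, y0: focus.y0, x1: focus.x0, y1: focus.y1 }
        }
        Side::Right => {
            let wall = others.iter()
                .filter(|r| r.y1 > focus.y0 && r.y0 < focus.y1 && r.x0 >= focus.x1)
                .map(|r| r.x0)
                .fold(screen.x1, f32::min);
            Rect { x0: focus.x1, y0: focus.y0, x1: wall, y1: focus.y1 }
        }
    }
}
```

The resulting rectangle is then cropped from the rasterised rendering and passed to the vOICe for sonification.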

3.3 Text Mode

Text mode browsing treats the constituent elements of an equation like locations in a text-based adventure game. The equation can be explored by issuing text commands to, for example, inspect the current focus, to move left, right, up, or down from the current focus, or to describe any graphical elements that surround the focus. Each command results in informative textual output that describes a particular aspect of the equation. This output is conveyed to the reader as synthesised speech or braille, facilitated by a screen reader. The structure and spirit of these textual commands are deliberately intended to follow loosely the conventions established by textual virtual worlds. To speed up navigation, the responses to many commands include hyperlinks, which can be activated to trigger follow-on commands relevant to the context.

A typical command consists of a keyword describing the action to be performed, followed by an optional argument, for example, the name of an element or a direction. If a command operates on an element, but no element is specified, then the command is assumed to refer to the focus. For example, the “play” command, when followed by the direction “right,” will sonify the element immediately to the right of the focus. However, when no element is specified, the element currently in focus is sonified.

Movement is facilitated by commands describing the direction, such as “left” or “up.” A secondary direction may be specified to refine the movement. For example, the “right up” command will move the focus diagonally right and up from the current focus. The “look” command allows users to obtain a description of the current focus, or of a specific element if provided as an argument. This command is also issued automatically when navigation begins, to provide users with an immediate overview of the initial focus.
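A minimal Rust sketch of this command grammar is given below. The command names (“look,” “play,” and the direction words) come from the text above; the exact grammar and defaulting rules of our implementation may differ.

```rust
#[derive(Debug, Clone, Copy)]
enum Dir { Left, Right, Up, Down }

#[derive(Debug)]
enum Command {
    Look(Option<String>),   // "look" or "look y"; None defaults to the focus
    Move(Dir, Option<Dir>), // "right" or "right up" (secondary refinement)
    Play(Option<Dir>),      // "play" or "play right"; None sonifies the focus
}

fn dir(word: &str) -> Option<Dir> {
    match word {
        "left" => Some(Dir::Left),
        "right" => Some(Dir::Right),
        "up" => Some(Dir::Up),
        "down" => Some(Dir::Down),
        _ => None,
    }
}

/// Parse a text-mode command: a keyword followed by an optional argument.
/// Commands that operate on an element default to the focus when the
/// argument is omitted.
fn parse(input: &str) -> Option<Command> {
    let mut words = input.split_whitespace();
    let head = words.next()?;
    let arg = words.next();
    match head {
        "look" => Some(Command::Look(arg.map(str::to_owned))),
        "play" => Some(Command::Play(arg.and_then(dir))),
        _ => dir(head).map(|d| Command::Move(d, arg.and_then(dir))),
    }
}
```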

The following information is included in an element’s description:

  • The textual content of the navigable element, for example, a mathematical symbol (like “\(\Omega\)”) or a text string (like “exp”).

  • The location of the element within the equation, as normalised horizontal and vertical Cartesian coordinates between the values of 0 and 100.

  • A list of adjacent elements, with their respective relative directions. These are rendered as links that, when activated, move the focus to that element. An example of such output might be “From this element, the following elements can be reached: Up: ‘\(x\),’ Right: ‘2’.”

  • A list describing whether non-navigable graphical content can be sonified in each possible direction. An example of such output might be, “There is additional graphical content left, up, and right.” The items in this list are hyperlinks that sonify the relevant graphical content when activated.

  • A link that can be activated to sonify the navigable element being described.

In addition, when the “look” command is issued without arguments, the output also contains a list of items that are vertically aligned with the focus. These horizontally sequential elements are often terms of an expression. When the current line changes, for example after a movement command, users are also notified of the number of elements in the new line.

To illustrate the implementation of text mode, let us examine the practical exploration of the equation \(y=\frac{x}{\sqrt {2}}\). When the user first opens the software, a description of the user’s initial location is displayed. In the following transcript, embedded links in the output are indicated in blue and surrounded by square brackets. These links can be used to issue follow-up commands as an alternative to entering commands via the keyboard.

The description of the initial location contains a list of elements in the current line segment, in this case the “\(y\)” and “\(=\).” A line segment is defined as a sequence of elements with the same vertical position. The user also receives a detailed description of the focus (in this case, the element “\(y\)”), and the directions to nearby elements are indicated as well. Notice that the description contains the location of the focus, which consists of the screen coordinates normalised to a value between 0 and 100. Notice also that the description concludes with instructions for sonification.

The user is now able to issue commands to move the focus or to query surrounding elements. For example, the user may move right:

The user may issue subsequent movement commands to explore the equation. The user may also request the sonification of a graphical region next to an element. For example, if the user is focused on the \(x\), then the fraction line may be sonified as follows:

The “play” command allows the user to sonify a specified region on the screen. In the above case, the user chose to have the region below the “\(x\)” sonified. As the region corresponds to the fraction line, the software would have produced a constant tone indicating a horizontal line.

3.4 Graphical Mode

As an alternative to the text-driven exploration described in the previous section, graphical mode allows the user to explore the elements of an equation using the keyboard cursor keys to trigger movement. As the user navigates to a new element, it is announced by the screen reader after which the shape of the element is sonified using the vOICe algorithm. The reader is also informed when the vertical position of the focus changes as a result of a horizontal movement or when the horizontal position changes as a result of a vertical movement. For example, when the reader moves to the right, and the new focus is higher than it was previously, as it might be when moving into the numerator of a fraction, this is announced by the word “raised.” By learning to interpret these announcements, the reader can identify the spatial positions of the elements of an equation, such as exponents and fractions. When the user navigates to an adjacent element, and in the process passes over non-navigable graphical information, that graphical information is sonified before that of the newly focused element.
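The “raised” announcement can be derived directly from the change in the focus’s vertical position, as in this sketch; “lowered” for the opposite case is our assumed counterpart, as the text above names only “raised.”

```rust
/// Announce vertical drift after a horizontal move. Screen coordinates
/// grow downwards, so a smaller centre-y means the new focus is higher.
fn vertical_drift(prev_cy: f32, new_cy: f32, tol: f32) -> Option<&'static str> {
    if new_cy < prev_cy - tol {
        Some("raised")
    } else if new_cy > prev_cy + tol {
        Some("lowered") // assumed counterpart announcement
    } else {
        None
    }
}
```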

For illustration of graphical mode, we again use the example equation \(y = \frac{x}{\sqrt {2}}\). The announcements of the program (also visible in the status bar of the interface) are denoted in the text as follows:

Given that the focus is initially on the “\(y\),” the user may move to the “\(=\)” by pressing the right cursor key. The software will announce the new focus (the “\(=\)”) and also sonify the shape of the symbol, thereby conveying its shape. Pressing any of the other cursor keys will not move the focus; instead, the user will receive an announcement indicating that no items are available in that direction.

From the “\(=\),” the user may press the right cursor key again, and the focus will change to the “\(x\),” which is the numerator of the fraction. The software will announce the “\(x\)” and also sonify its corresponding region. The software will also announce the word “raised,” indicating to the user that the new focus is situated higher on the screen than the previous focus. If the user waits for more than two seconds before pressing another key, then the fraction line below the “\(x\)” will be sonified as a constant tone. Due to the spacing of the equation, the user will also perceive a short segment of the square root below the fraction line. The user may also request the sonification of the region below the current focus (using shift and the down cursor key), which will sonify the fraction line as well. Requesting a sonification of a region in any other direction will result in an announcement indicating that no graphical content in that direction can be found.

As described above, the document model allows each node to have up to 12 edges, denoting the four primary directions, each with three secondary variants. However, since there are only four cursor keys on a standard keyboard, the same fine-grained navigation available in text mode is not possible. Instead, horizontal movement prioritises nodes in the upward direction, while vertical movement prioritises nodes in the leftward direction. This decision was based on initial user feedback that suggested a preference for reading the numerator of a fraction before the denominator when navigating horizontally, as well as a preference for first reading the left-most element in a term when navigating vertically. Although the navigation options in graphical mode may appear more limited than those available in text mode, and as a result sometimes require more steps to arrive at an intended destination, this is mitigated by the much greater speed at which cursor-key driven navigation is possible.
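This prioritisation can be sketched as a lookup over the focus’s outgoing edges, reusing the (primary direction, secondary variant) edge encoding from the DOM sketch above; the encoding of variants as -1, 0, and +1 is our own.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Dir { Left, Right, Up, Down }

type NodeId = usize;

/// Resolve a cursor-key press against the focus's outgoing edges, keyed by
/// (primary direction, secondary variant). With only four arrow keys, the
/// user cannot select the secondary variant, so horizontal movement tries
/// the "up" variant first and vertical movement the "left" variant (both
/// encoded as -1), then the centre (0), then the remaining variant (+1).
fn resolve(pressed: Dir, edges: &HashMap<(Dir, i8), NodeId>) -> Option<NodeId> {
    [-1i8, 0, 1].iter().find_map(|&v| edges.get(&(pressed, v)).copied())
}
```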

When a touch screen is available, graphical mode also allows the user to explore the content using gestures. Each keyboard command has an equivalent touch gesture, but a touch screen has the benefit of making a more spontaneous exploration of the content possible. As the user moves one finger across the screen, the entire column of the image situated beneath the point of contact is sonified as a tone chord using the vOICe algorithm. In addition, the portion of the column immediately under the fingertip is sonified with greater intensity. This is intended to allow targeted and interactive graphical exploration of the equation while still providing context about the geometrically surrounding area. Finally, a two-finger gesture allows the line segment between the fingertips to act as a scanner, with the pixels along this line sonified as a tone chord. By moving the fingertips closer together or further apart or by rotating them around each other, the sonification of shapes with any orientation and size can be achieved. This interactive and localised image exploration is not possible with the classical implementation of the vOICe algorithm.
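The two-finger scanner can be sketched as sampling the segment between the fingertips and collecting one tone per bright sample. How pitch is assigned along a rotated scan line is our own assumption here (position along the segment maps to frequency), and the fingertip coordinates are assumed to lie inside the image.

```rust
/// Sonify the pixels along the segment from fingertip `a` to fingertip `b`
/// as a single chord: each bright sample contributes one tone frequency.
fn scan_chord(image: &[Vec<u8>], a: (f32, f32), b: (f32, f32), n: usize) -> Vec<f32> {
    (0..=n)
        .filter_map(|i| {
            let t = i as f32 / n.max(1) as f32;
            let (x, y) = (a.0 + t * (b.0 - a.0), a.1 + t * (b.1 - a.1));
            let bright = image[y as usize][x as usize] > 128;
            // assumed mapping: position along the segment spans 500-5000 Hz
            bright.then(|| 500.0 * 10.0_f32.powf(t))
        })
        .collect()
}
```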

3.5 Integrated Browsing

When browsing, the user is initially presented with text mode. This mode appears similar to a typical computer command line interface, with output displayed above a field for input. The reader may switch to graphical mode at any time. The command line interface is then replaced with a canvas area in which the graphical form of the equation is rendered and through which the reader can navigate using the cursor keys. Sonification is provided immediately as the focus changes, while textual notifications regarding the focused element are displayed in a status area and presented by the screen reader.


4 EXPERIMENTAL EVALUATION

Our proposed method of interactive equation exploration was evaluated by 25 test subjects, 11 of whom were blind and 14 of whom were sighted. Before evaluation, subjects were provided with a training session in which they could familiarise themselves with the method of exploration and the text and keyboard commands. After training had been completed, each subject completed two evaluation phases. In the first, candidates were asked to consecutively identify a number of different equations using only text mode. In the second phase, candidates were permitted to use both text and graphical modes and to switch freely between them. The exploration algorithm, as well as the training and testing procedures, were implemented using web technologies and were therefore accessible over the internet. This allowed remote evaluation, which also enabled us to comply with Covid-19 safety protocols.

4.1 Equations under Review

Given that our approach provides access to mathematical content by the selective and structured sonification of graphical (non-textual) structures, we identified a number of mathematical conventions that are difficult to read without access to this information. These mathematical structures cannot be correctly interpreted from the plain text extracted from an untagged PDF document, such as would be presented by a screen reader.

The following five print mathematical conventions all depend on graphical structure for their correct interpretation and are not identifiable when rendered as plain text. While these five conventions have been identified for the purposes of our evaluation, the list is not intended to be exhaustive.

Exponents. After conversion to plain text, exponents are at best presented over two lines, with the exponent in the first and the base in the second. Fractions are presented in a similar way, resulting in immediate ambiguity.

Fractions. The fraction line is lost during the conversion to plain text. The numerator and denominator are usually presented on separate lines. This is similar to the presentation of exponents, as described above.

Square roots. The radicand is indicated by a graphical line extending over the argument, which is lost during the conversion to plain text. Hence, it is not possible to determine which terms fall under the root from a plain text representation. Depending on the PDF reader, the square root symbol itself may also be presented on a different line from the radicand.

Large brackets. Depending on the source of the PDF, large brackets are often rendered using fonts with custom glyphs, for which the TTS system cannot identify a corresponding textual description. In print mathematics, the size of a bracket is also often used to convey the relationships between elements of the equation to the reader. Therefore, an accessible two-dimensional view requires the vertical extent of the brackets to be rendered.

Matrices. After conversion to plain text, matrices are presented over several lines, which appear similar to the plain text representation of fractions and exponents. For reasons similar to those described above, the large square brackets delimiting a matrix are also usually absent from the plain text representation.

On the basis of these five types of mathematical content, we developed two sets of six equations with which to evaluate our approach. The first set was used to evaluate text mode interaction only, while the second was used to evaluate both text and graphical modes of interaction. In both cases, an attempt was made to keep the evaluation short to minimise user fatigue. It should also be kept in mind that blind candidates had to be familiarised with all print mathematical conventions contained within our evaluation, and therefore it was not feasible to examine overly complex equations. However, we also endeavoured to choose equations that are not identifiable from a plain text representation. To achieve this, we chose equations that contain at least two of the graphical conventions in the list above. Tables 1 and 2 list the equations we used for the two stages of our evaluation.

Equation | Exponents | Fractions | Roots | Brackets | Matrices
\(y=\frac{x}{2}+x^2\) | 1 | 1 | 0 | 0 | 0
\(y=\sqrt{x}+x^{2}\) | 1 | 0 | 1 | 0 | 0
\(y=\frac{2}{\sqrt{x}}\) | 0 | 1 | 1 | 0 | 0
\(y=\left(\frac{\sqrt{x}}{x-2}\right)^{4}\) | 1 | 1 | 1 | 2 | 0
\(y = \left(\frac{x}{2}+2\right)^2\) | 1 | 1 | 0 | 2 | 0
\(\begin{bmatrix} 2 \quad 4 \\ 2 \quad 4 \end{bmatrix} \times \begin{bmatrix} 1 \\ 2 \end{bmatrix}\) | 0 | 0 | 0 | 4 | 2

Table 1. Equations for Stage 1 - Text Mode

Equation | Exponents | Fractions | Roots | Brackets | Matrices
\(y=x^2+\frac{2}{x}\) | 1 | 1 | 0 | 0 | 0
\(y=x^3+\sqrt{x}\) | 1 | 0 | 1 | 0 | 0
\(y=\frac{\sqrt{x}}{2}\) | 0 | 1 | 1 | 0 | 0
\(y=\left(\frac{x\sqrt{x}}{x+2}\right)^5\) | 1 | 1 | 1 | 2 | 0
\(y = \left(1+\frac{2}{x}\right)^5\) | 1 | 1 | 0 | 2 | 0
\(\begin{bmatrix} 1 \quad 2 \quad 3 \\ 2 \quad 3 \quad 4 \end{bmatrix} \times \begin{bmatrix} 1 \quad 2 \\ 2 \quad 3 \\ 3 \quad 4 \end{bmatrix}\) | 0 | 0 | 0 | 4 | 2

Table 2. Equations for Stage 2 - Text Mode and Graphical Mode

The equations in these two tables were designed to illustrate the current inaccessibility of mathematical content in untagged PDF documents. The high ambiguity of the textual representation that can be extracted from such PDF documents means that this is not a viable means by which blind readers can access mathematical content. For this reason, and to avoid the unnecessary additional fatigue it would cause test subjects, we did not include this plain text format in our evaluation.

4.2 Test Candidates

Evaluation was performed by 11 blind and 14 sighted human subjects. Subjects were recruited by word-of-mouth, and in the case of blind subjects also by means of mailing lists for blind STEM practitioners. The majority of sighted candidates were recruited locally from the Departments of Mathematics and Engineering at the University of Stellenbosch. However, the majority of blind candidates were recruited internationally. Candidates were between the ages of 18 and 70 and included students (undergraduate and postgraduate) as well as candidates who were not students but had an undergraduate or a postgraduate qualification. Both blind and sighted groups included candidates with mathematical and scientific backgrounds, but also candidates from fields outside STEM such as law and journalism. Table 3 lists some of the attributes of the test candidates.

Table 3.
BlindSighted
Undergraduate student11
Postgraduate student17
Undergraduate qualification45
Postgraduate qualification810
STEM background811

Table 3. Some Attributes of the Test Candidates

4.3 Training

Since mathematical material remains largely inaccessible to blind readers, it was a challenge to identify suitable blind test candidates. Even when willing individuals were identified, it was found that many did not read mathematics on a regular basis and therefore required an introduction to remind them of the key concepts, especially the typical graphical layout of equations that sighted readers are accustomed to. Hence, this training phase, which was carried out prior to the evaluation, was essential.

An online tutorial was developed to re-familiarise participants with the key attributes of mathematical equations as well as to practise the use of the browsing software. The tutorial contained a step-by-step introduction and explanation of the functionality of the proposed algorithm. This included the commands available in text mode, the commands and gestures available in graphical mode, and an explanation of the sound and synthesised speech output that could be expected while browsing. The tutorial also contained a section that allowed users to listen to sound renderings of common graphical shapes, including a horizontal line, a diagonal line, and a square root. This was included to familiarise participants with the method of sonification, with which most were also not acquainted.

As many of the blind and visually impaired candidates had no prior experience with print mathematical notation, descriptions of the graphical symbols that form part of the evaluation were also included in the tutorial. These symbols included a fraction line, a square root, and both round and square brackets. Using the tutorial, users were able to practise exploring increasingly complex equations using both text and graphical modes. The tutorial also included a section that allowed users to test their ability to interpret a number of equations.

In addition to the tutorial, interactive training sessions were conducted with candidates via online voice communications platforms. Users were guided through the initial questions of the tutorial and had the opportunity to ask questions.

4.4 Testing Procedure

The evaluation consisted of two sets of six equations, as described in the previous section. The first set was used to evaluate the use of text mode for navigation, while the second was used to evaluate the mixed use of text and graphical modes for interaction, where subjects could switch freely between the two.

Like training, evaluation was designed to function over the internet. To achieve this, each equation was delivered as a web page, and the browsing software was implemented as an interactive web-based application developed using the Rust and JavaScript programming languages. The start page of the evaluation contained links to each equation and also contained instructions for the candidate. Each link opened the software in a new browser window dedicated to the specific equation and allowed the subject to explore it. After exploration, the browser window could simply be closed to return to the starting page.

Equations were typeset using LaTeX and rendered as a PDF, scaled to fill the screen at a horizontal resolution of 300 pixels while preserving the aspect ratio. With the exception of plain text elements that were also announced by screen readers, equations were not visibly displayed, thus ensuring a similar testing environment for both sighted and blind participants. Textual symbols, along with their locations and bounding boxes, were extracted from the PDF documents using the Poppler [freedesktop.org 2005] and PDFMiner [Shinyama 2004] PDF extraction libraries. Although Poppler and PDFMiner are unable to extract completely accurate bounding boxes, they proved sufficient for our evaluation. In the future, more complex equations could be supported by more sophisticated PDF extraction algorithms, for example, the algorithm described by Baker et al. [2009] or the tools described by Nakamura et al. [Kohase et al. 2020; Nakamura et al. 2020]. For each equation, a table of the symbols extracted in this way, as well as an image containing the graphical rendering of the equation, was stored. Equations were rendered in isolation for the purposes of this study, since the aim was to determine the effectiveness of the proposed browsing method. For an integrated solution, equations can be isolated and extracted from the PDF in which they are embedded, either automatically or based on a selection made by the human reader. The latter approach may be especially beneficial for inline equations, whose start and end are difficult to determine by pattern-matching algorithms but easy to determine by a human reader who understands the context.
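As an illustration of this extraction step, the following minimal sketch collects characters and their bounding boxes from an equation PDF using pdfminer.six, the maintained fork of PDFMiner. The function and the flat list of symbols are our own simplification for illustration, not the extraction pipeline used in the study.

    from pdfminer.high_level import extract_pages
    from pdfminer.layout import LTTextContainer, LTChar

    def extract_symbols(pdf_path):
        """Collect (character, bounding box) pairs from an equation PDF.
        Bounding boxes are (x0, y0, x1, y1) in PDF points, with the
        origin at the bottom-left corner of the page."""
        symbols = []
        for page in extract_pages(pdf_path):
            for element in page:
                # Only text boxes contain characters; lines, rectangles and
                # other graphical elements are handled separately.
                if not isinstance(element, LTTextContainer):
                    continue
                for line in element:
                    for obj in line:
                        if isinstance(obj, LTChar):
                            symbols.append((obj.get_text(), obj.bbox))
        return symbols

Graphical elements such as fraction lines and root signs do not appear as LTChar objects and must instead be recovered from the rendered image, which is why the graphical rendering of each equation is also stored.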

After candidates had completed the training tutorials and declared themselves comfortable using the browsing software, they could proceed to the evaluation stages. The first phase of the evaluation required the participants to explore and identify the first set of six equations using text mode only. Since the textual output was displayed on the screen, it could be announced by the screen reader to blind candidates. Most blind candidates used either the JAWS screen reader developed by Freedom Scientific or the open-source NVDA screen reader developed by NV Access, while at least one blind candidate used Linux with the Orca screen reader. The use of screen readers was not required for sighted subjects: in contrast to the blind subjects, who were all familiar with screen readers, the sighted subjects were not, and requiring them to use screen readers merely to enforce an identical test procedure could have degraded their results in a way that is not indicative of the effectiveness of the methods under evaluation.

The second phase of the evaluation required the participants to explore and identify the second set of six equations but allowed them to use both text and graphical exploration modes and to switch between the two freely. For graphical mode, textual output consisted of short notifications that were displayed on a status bar readable by sighted participants and announced by the screen reader of blind participants.

To allow later analysis, a record of the actions performed and time spent during evaluation was kept for each test subject. Candidates were asked to write down, in a separate document and using full English sentences, what they believed each equation to be. Candidates were requested to be as explicit as possible in describing the structure of the equation, for example, noting the start and end of fractions, roots, and exponents. Candidates who were familiar with LaTeX were invited to provide their answers in that notation.


5 SCORING

The transcriptions of the equations provided by the test subjects differed widely in style. Some were provided in LaTeX notation, others as ASCII maths, and many as full and descriptive English sentences. Because of this heterogeneity, all responses were assessed for correctness manually and individually.

Two figures of merit, both expressed as percentages, were used to score the responses made by the test subjects. The first, which we will refer to as “completely correct,” is based on a binary score for each equation in the test indicating whether it was perfectly correct or not. However, even when an equation is not transcribed with perfect accuracy, the degree to which it is incorrect can vary. Therefore, a second figure of merit, which we will refer to as the “correctness score,” was introduced. The correctness score is based on the number of symbols and graphical elements in the equation concerned and is 100% only when all symbols and graphical elements are both correctly identified and correctly placed. The correctness score also had the effect of normalising the responses, allowing them to be compared.

As mentioned previously, the equations used in the evaluations were designed to contain at least two problematic elements each. Problematic elements are those that cannot be unambiguously identified from a plain text representation derived from an untagged PDF document. The correctness score is therefore based on six indicators: the five possible problematic elements and textual symbols. For each equation, we calculated the maximum score based on the number of times each of these elements was correctly identified, as well as the number of times it was correctly placed.

For each equation transcribed by a test subject, the correctness score is calculated by subtracting the number of inserted and deleted elements, as well as the number of incorrectly placed elements, from the maximum possible score; the result is then expressed as a percentage of the maximum possible score.
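To make the arithmetic explicit, the following minimal sketch formalises this calculation. The counting of insertions, deletions, and misplacements was performed manually in our evaluation; the function below illustrates the scoring arithmetic only.

    def correctness_score(max_score, inserted, deleted, misplaced):
        """Correctness score, as a percentage, for one transcribed equation.
        max_score counts one point for each element correctly identified
        plus one point for each element correctly placed; penalties are
        subtracted for inserted, deleted and incorrectly placed elements."""
        raw = max_score - (inserted + deleted + misplaced)
        return 100.0 * max(raw, 0) / max_score

For example, a transcription of an equation with a maximum score of 20 that omits one symbol and misplaces one bracket scores 100 × (20 − 2)/20 = 90%.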


6 RESULTS

Table 4 reports the percentage of the 25 test subjects who transcribed each of the six equations in Stages 1 and 2 without any error, while Table 5 reports the corresponding correctness scores. We see that, across all subjects and both stages of the evaluation, 78.1% of the responses were perfectly correct (73.5% and 82.7%, respectively, for blind and sighted subjects). The corresponding correctness score was 95.4% overall, with 93.3% and 97.6% for the blind and sighted groups, respectively.

                    Stage 1                      Stage 2
  Equation   Blind    Sighted  Overall    Blind    Sighted  Overall
  1          81.8%    78.6%    80.2%      81.8%    92.9%    87.3%
  2          90.9%    78.6%    84.7%     100.0%   100.0%   100.0%
  3          81.8%    78.6%    80.2%      90.9%   100.0%    95.5%
  4          45.5%    50.0%    47.7%      54.5%    64.3%    59.4%
  5          63.6%    78.6%    71.1%      72.7%    85.7%    79.2%
  6          63.6%    92.9%    78.2%      54.5%    92.9%    73.7%
  Average    71.2%    76.2%    73.7%      75.8%    89.3%    82.5%

Table 4. Percentage of Completely Correct Responses per Equation for Both Evaluation Stages

                    Stage 1                      Stage 2
  Equation   Blind    Sighted  Overall    Blind    Sighted  Overall
  1          96.0%    96.0%    96.0%      95.0%    98.4%    96.7%
  2          98.0%    98.8%    98.4%     100.0%   100.0%   100.0%
  3          95.5%    97.0%    96.2%      99.4%   100.0%    99.7%
  4          88.5%    92.6%    90.5%      91.9%    95.7%    93.8%
  5          90.5%    95.8%    93.1%      93.2%    98.9%    96.0%
  6          88.8%    98.4%    93.6%      82.3%    99.4%    90.9%
  Average    92.9%    96.4%    94.6%      93.6%    98.7%    96.2%

Table 5. Correctness Scores per Equation for Both Evaluation Stages


7 DISCUSSION

Table 4 shows that, on average, all equations except Equation (4) were transcribed perfectly in more than 70% of cases. Equation (4) was among the most difficult equations in both stages, containing all the elements identified in Section 4.1 except matrices. As expected, the correctness scores in Table 5 are higher than the corresponding completely correct scores in Table 4, and the overall correctness scores are all above 90%. This indicates that most elements in an equation were transcribed correctly, both in terms of identity and geometric placement.

The tables also show that, even though sighted subjects were not able to see the equation they were exploring, they attained slightly better results on average than blind subjects in both stages of the evaluation. However, this difference was found to be not statistically significant (\(p\lt 0.1\)). It must be borne in mind that, although the equation itself was not displayed on the screen, sighted subjects were able to see the textual output generated during browsing. Blind subjects also had access to this information, since the text was synthesised as speech by the screen reader, but arguably the ability to read the text allowed sighted candidates to pick out relevant information more easily. It should also be highlighted that two blind candidates attained a perfect score on all equations in both stages, which demonstrates the efficacy of our approach even when used entirely non-visually.1
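The specific statistical test is not detailed here. Purely as an illustration of how such a comparison between two independent groups might be carried out, the sketch below applies a two-sided Mann–Whitney U test to per-subject correctness scores; the score lists are placeholders, not the study's data.

    from scipy.stats import mannwhitneyu

    # Placeholder per-subject correctness scores, for illustration only.
    blind_scores = [93.3, 91.0, 95.5, 88.2]
    sighted_scores = [97.6, 98.1, 96.0, 99.2]

    # Two-sided test of whether the two groups' scores differ.
    stat, p = mannwhitneyu(blind_scores, sighted_scores, alternative="two-sided")
    print(f"U = {stat:.1f}, p = {p:.3f}")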

Another advantage that sighted subjects arguably have is their familiarity with visual mathematics. In performing the evaluation, it was discovered that most blind candidates were unfamiliar with the visual appearance of elements such as square roots, brackets, and matrices and had to learn these during the training phase, in addition to the use of the browsing software. Sighted subjects, however, were all familiar with the mathematical notation. Furthermore, sighted subjects were able to form a two-dimensional representation of the equation they were exploring using pen and paper, while blind candidates had to resort to a linear representation such as LaTeX or braille to help them.

Tables 4 and 5 and Figure 3 also show that both blind and sighted groups attained higher accuracies in Stage 2, where they could use both text and graphical modes of exploration, than in Stage 1, in which only text mode could be used. This difference is statistically significant (\(p\lt 0.02\)). In Stage 1, every equation received at least one incorrect response, while in Stage 2, Equation (2) was correctly identified by all candidates. This may indicate that the graphical mode adds to users’ understanding of equations. However, all sighted candidates, and 10 of the 11 blind candidates, also used text mode in Stage 2 for some of the equations, especially for Equations (4) and (6) (the matrix equation). It should also be remembered that candidates always completed Stage 1 before proceeding to Stage 2, which may also have contributed to the higher accuracy attained in the latter.

Fig. 3. Number of perfectly correct responses per question for each stage.

As can be observed in Figure 4, sighted subjects were able to complete Stage 1 more quickly than blind subjects (\(p\lt 0.01\)), while in Stage 2 the average durations for the two groups were similar. We believe that the ability to use graphical mode and to switch between this and text mode provided the blind subjects with greater flexibility in how to construct an understanding of the equation structure. This additional flexibility was less useful to sighted subjects who could make better use of the verbose textual output provided by the browser.

Fig. 4. Time (in seconds) taken by blind and sighted users to complete Stages 1 and 2, respectively. Vertical bars denote 95% confidence intervals.

Sighted candidates also used the links provided in text mode for follow-up commands more often than blind candidates, who preferred to type out the commands (\(p\lt 0.01\)). This may also have contributed to the observed time difference between the two groups.

If we examine the results in terms of the particular type of equation that subjects were asked to identify, then we see that blind subjects attained lower scores than sighted subjects for Equation (6) in both phases (\(p\lt 0.01\)). Equation (6) included a matrix, and this was incorrectly identified by four blind subjects in Stage 1 and five in Stage 2, with three subjects responding incorrectly in both stages. This suggests that some blind candidates were not familiar with matrices, a hypothesis supported by the form of some of the incorrect responses, such as the placement of multiplication signs inside the matrix itself. In addition, one candidate omitted the matrix equation entirely in Stage 2. The intricate two-dimensional structure of a matrix makes it especially difficult to interpret using the linear reading methods commonly available to blind readers. Moreover, most braille displays can render only one line at a time and therefore cannot convey the two-dimensional nature of matrices.

Analysis of the answers submitted by blind subjects shows that most errors are due to missing textual elements. For instance, for the first equation of Stage 1, two subjects missed the “2” in the final exponent. For the matrix equations, all but two of the incorrect responses were due to missing textual elements. Across all blind subjects, missing symbols accounted for 53% and 38% of incorrect answers for Stages 1 and 2, respectively. It was furthermore observed that subjects often did in fact visit all the elements of the equation during exploration, even though they did not include some of them in their answers. This might indicate that candidates missed the elements when writing their answers, rather than when reading them. Nevertheless, there were cases where subjects missed elements because they had in fact never visited them during exploration. Improving the browser to report whether all textual elements have been visited might help in reducing this type of error.
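A minimal sketch of this suggested improvement is given below; it is our own illustration, not a feature of the evaluated browser. The browser would keep a set of visited element identifiers and report, on request, how many elements remain unvisited.

    class VisitTracker:
        """Track which equation elements the reader has navigated to."""

        def __init__(self, element_ids):
            self.all_ids = set(element_ids)
            self.visited = set()

        def visit(self, element_id):
            # Called by the browser whenever the focus moves to an element.
            self.visited.add(element_id)

        def report(self):
            remaining = len(self.all_ids - self.visited)
            if remaining == 0:
                return "All elements have been visited."
            return f"{remaining} element(s) have not yet been visited."

Announcing such a report before the reader closes an equation could prompt a final check for elements that were never explored.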

The second most prevalent type of error made by blind subjects was the incorrect placement of brackets. Brackets were either not placed in the correct position or were not included within the submitted answer at all. As in the case of symbols, brackets were often omitted even though they had been deliberately sonified by the subject during exploration. Across all blind subjects, bracket errors accounted for 32% of incorrect responses for Stage 1 and 31% for Stage 2.

For sighted subjects, the most prevalent errors were bracket errors, followed by missing symbols. This finding is interesting, since, unlike many symbols that are reflected in the textual output provided to the reader during exploration, brackets are not and must therefore be rendered as audio. Hence, for brackets, the sighted subjects do not have the advantage of access to the rendered text. Across all sighted subjects, bracket errors accounted for 31% of errors, while missing symbols accounted for 24%.

The fourth equation was the one most often incorrectly identified by both groups of candidates in both Stages 1 and 2. It was arguably among the most difficult, containing fractions with more than one term in the numerator or denominator, with the entire fraction enclosed in brackets. Most of the incorrect responses to this equation were due to bracket errors.


8 INFORMAL FEEDBACK

In addition to the quantitative results gathered during the two testing phases, informal feedback was gathered from the subjects during the training phase as well as after the two testing phases had been completed. This feedback was gathered during informal conversations and should therefore be considered anecdotal. Nevertheless, it provides a general impression of how the approach was perceived by candidates and may help inform further research initiatives. Questions that were asked include the following:

How long did it take a candidate to become familiar with the method?

Which exploration mode did candidates prefer and why?

Was the sonification understandable?

Elements are currently automatically sonified after a navigational command. Would candidates prefer them to be silent, with sonification only on request?

How do candidates generally visualise mathematics, spatially or semantically?

In addition, candidates were invited to provide any other comments on the approach and the test.

Overall, the browsing approach was positively received, and most subjects were able to use the interface within an hour of beginning training. About half of the blind candidates successfully learned to use the software by following the online tutorial independently. For these candidates, individual meetings were scheduled before the testing phases to gather feedback about their initial perceptions, as well as to answer any remaining questions.

Although candidates noted the similarity in functionality provided by the two exploration modes, most blind candidates expressed a clear preference for the graphical mode. The graphical mode was perceived to allow faster navigation and an improved grasp of the spatial layout of the equation. This might be because the graphical mode uses fewer words, along with single-key navigational commands. One blind candidate noted that the learning curve, and therefore the barrier to entry, was lower for graphical mode.

Three blind candidates also mentioned that the links provided in text mode sped up navigation. The links were found to be particularly useful when viewing the output in text mode as a history of the steps taken through the equation. Specifically, previous output allowed navigational steps to be retraced, and the embedded links could then be used to choose an alternative route through the equation.

Although most candidates (nine of the blind candidates and all of the sighted candidates) had no prior experience with the vOICe algorithm, they found using the sonification to be intuitive after practice. One blind candidate reported difficulty in distinguishing between multiple tones when played simultaneously, for example, two parallel horizontal lines, and specifically noted the case of a fraction with a square root in the denominator. However, the same candidate was able to identify all equations correctly.

Candidates were asked whether they thought textual elements should automatically be sonified on navigation in exploration mode, which is the current behaviour, or whether this should be automatic only for non-textual elements, such as fraction lines. One candidate expressed a clear preference for the latter, with sonification on request. However, several other candidates noted that the sonification contributed to their mental image of the equation and that they preferred automatic sonification. Because the pitch of the sonification is based on the position of elements on the screen and is therefore absolute, removing automatic sonification of textual elements would potentially increase the difficulty in judging whether a fraction line or other element is below or above the current focus. Sonifying the focus along with the graphical element, which may be either above or below, allows users to judge the vertical alignment from the relative pitch.
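The following sketch illustrates the kind of row-to-pitch mapping on which this judgement relies; the frequency range and the exponential spacing are our assumptions for illustration and not necessarily the parameters used by the browser.

    # Higher rows on the screen map to higher frequencies, so an element
    # that sounds lower than the focus lies below it on the screen.
    F_LOW, F_HIGH = 500.0, 5000.0  # assumed frequency range in Hz

    def row_to_frequency(row, n_rows):
        """Map a pixel row (0 = bottom of the image) to a tone frequency."""
        fraction = row / (n_rows - 1)
        return F_LOW * (F_HIGH / F_LOW) ** fraction

    # Sonifying the focus together with a fraction line below it produces
    # two tones whose relative pitch reveals their vertical arrangement.
    focus_hz = row_to_frequency(150, 300)
    line_hz = row_to_frequency(120, 300)
    print(f"focus: {focus_hz:.0f} Hz, line: {line_hz:.0f} Hz (lower pitch = lower on screen)")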

Regardless of their preferred mode of visualisation, 8 of the 11 blind candidates reported that the browser provided them with a clearer understanding of the spatial layout used in mathematical equations. In particular, one candidate noted that they had not previously had any understanding of the shape of print square roots and that, after using the browser, they were for the first time able to visualise the geometric shape of this symbol. Another candidate suggested that more semantic information should be added to the textual output, even though they found the spatial information useful. It should be borne in mind that our approach was specifically designed to address the accessibility of content with no or insufficient accompanying semantic information, such as untagged PDF documents. Nevertheless, it might prove interesting in the future to explore the incorporation of richer semantic textual output by, for example, applying our methods to content in MathML or other semantically orientated markup.


9 SUMMARY AND CONCLUSION

We have introduced a browsing technique specifically aimed at making previously inaccessible mathematical equations, as typically found in untagged PDF documents, accessible to blind or low-vision readers. This browsing technique combines spatial navigational methods commonly used in text-based adventure games with audio-visual sensory substitution for non-textual elements of the equation. Two modes of interaction have been implemented and evaluated: one purely text-based and the other more directly based on the geometric layout of the equation and which accommodates gestures when a touch screen is available. During browsing, the reader can switch freely between these two modes of interaction. For audio-visual sensory substitution, we have adapted and extended the vOICe algorithm, first proposed by Meijer, to allow the interactive exploration of graphical elements within an equation, such as root signs, fraction lines, and many types of brackets. Such elements are represented as graphics in PDF documents and can therefore not easily be rendered as synthesised speech.

Evaluation of our approach by both blind and sighted test subjects showed that, after a training session that in many cases did not exceed one hour, equations that would be inaccessible using currently prevalent screen readers can be identified with a high degree of accuracy. With the exception of one equation in both stages of the evaluation, more than 70% of equations were, on average, perfectly identified. Indeed, subjects from both the blind and the sighted groups obtained perfect scores. While sighted candidates expressed a preference for the text-mode interaction, blind candidates made extensive use of the graphical mode but switched between the two when presented with more complex geometric arrangements such as matrices. Informal discussions with the blind candidates after the evaluation also revealed that in some cases additional insight into the layout conventions of mathematical equations had been gained. This indicates that the browsing approach we propose presents a means by which print-disabled readers can, without additional assistance, interactively explore and thereby decipher mathematical content that was previously unfamiliar to them due to its inaccessible format.

The most common type of error made by blind test subjects was the omission of symbols from the equation. This is an aspect we aim to address in ongoing work, since in some cases the reader can be alerted to the possibility that this type of error is being made. In addition, we aim to consider different input sources. For instance, much older published material is available only in the form of rasterised images embedded in the PDF format, as may be obtained from a document scanner. By including optical character recognition (OCR), it may also be possible to make such content accessible using our approach. Also, we would like to consider other non-textual information present in scientific and technical documents, such as graphs and plots. The overall objective is to develop methods and tools that fill the accessibility gaps in electronic documents that are not currently addressed by screen-reading software and that continue to impede access to blind and sight-disabled readers.


ACKNOWLEDGMENTS

The authors would like to thank the South African Council for Scientific and Industrial Research (CSIR) who supported this research. We would also like to thank Prof. Martin Kidd, from the Centre for Statistical Consultation at the University of Stellenbosch, for assisting with the analysis of the data. Finally, we would like to thank all the test candidates who contributed their time to this study.

Footnotes

1. Although not formally analysed, the responses contributed by blind candidates with a STEM background did not significantly differ from those with other backgrounds. For example, of the three candidates without a STEM background, one attained a near-perfect score. We suspect that a general familiarity with technology was ultimately more important.

REFERENCES

  1. Abboud Sami, Hanassy Shlomi, Levy-Tzedek Shelly, Maidenbaum Shachar, and Amedi Amir. 2014. EyeMusic: Introducing a “visual” colorful experience for the blind using auditory sensory substitution. Restor. Neurol. Neurosci. 32, 2 (2014), 247–257.
  2. Adobe. 2022. PDF Accessibility Overview. Retrieved from https://www.adobe.com/accessibility/pdf/pdf-accessibility-overview.html.
  3. American Thermoform. 2003. Swell Touch Paper. Retrieved from http://www.americanthermoform.com/product/swell-touch-paper/.
  4. Baker Josef B., Sexton Alan P., and Sorge Volker. 2009. A linear grammar approach to mathematical formula recognition from PDF. In International Conference on Intelligent Computer Mathematics. Springer, 201–216.
  5. Balan Oana, Moldoveanu Alin, and Moldoveanu Florica. 2015. Navigational audio games: An effective approach toward improving spatial contextual learning for blind people. Int. J. Disab. Hum. Devel. 14, 2 (2015), 109–118.
  6. Bates Enda and Fitzpatrick Dónal. 2010. Spoken mathematics using prosody, earcons and spearcons. In Computers Helping People with Special Needs, Klaus Miesenberger, Joachim Klaus, Wolfgang Zagler, and Arthur Karshmer (Eds.). Springer, Berlin, Heidelberg, 407–414.
  7. Braille Authority of North America. 1972. Nemeth Code Book. Retrieved from https://nfb.org/Images/nfb/documents/pdf/nemeth_1972.pdf.
  8. Capelle C., Trullemans C., Arno P., and Veraart C. 1998. A real-time experimental prototype for enhancement of vision rehabilitation using auditory substitution. IEEE Trans. Biomed. Eng. 45, 10 (Oct. 1998), 1279–1293.
  9. Cervone Davide, Krautzberger Peter, and Sorge Volker. 2016a. Employing semantic analysis for enhanced accessibility features in MathJax. In 13th IEEE Annual Consumer Communications & Networking Conference (CCNC). IEEE, 1129–1134.
  10. Cervone Davide, Krautzberger Peter, and Sorge Volker. 2016b. Towards universal rendering in MathJax. In 13th Web for All Conference. 1–4.
  11. Drümmer Olaf. 2012. PDF/UA (ISO 14289-1): Applying WCAG 2.0 principles to the world of PDF documents. In International Conference on Computers for Handicapped Persons. Springer, 587–594.
  12. Edwards Alistair D. N., McCartney Heather, and Fogarolo Flavio. 2006. Lambda: A multimodal approach to making mathematics accessible to blind students. In 8th International ACM SIGACCESS Conference on Computers and Accessibility. 48–54.
  13. Evans Gareth and Blenkhorn Paul. 2008. Screen readers and screen magnifiers. In Assistive Technology for Visually Impaired and Blind People. Springer, London, 449–495.
  14. International Organization for Standardization. 2014. Document Management Applications – Electronic Document File Format Enhancement for Accessibility – Part 1: Use of ISO 32000-1 (PDF/UA-1). Standard. International Organization for Standardization, Geneva, CH.
  15. Frankel Lois, Brownstein Beth, and Soiffer Neil. 2017. Expanding audio access to mathematics expressions by students with visual impairments via MathML. ETS Res. Rep. Series 2017, 1 (2017), 1–53.
  16. freedesktop.org. 2005. Poppler PDF Rendering Library. Retrieved from https://poppler.freedesktop.org/.
  17. Friberg Johnny and Gärdenfors Dan. 2004. Audio games: New perspectives on game audio. In ACM SIGCHI International Conference on Advances in Computer Entertainment Technology. 148–154.
  18. Gardner John A. 1995. DotsPlus – Better than Braille? ACM SIGCAPH Comput. Phys. Handic. 52-53 (1995), 45.
  19. Gardner John A. 2014. The LEAN math accessible MathML editor. In International Conference on Computers for Handicapped Persons. Springer, 580–587.
  20. Goncu Cagatay and Marriott Kim. 2011. GraVVITAS: Generic multi-touch presentation of accessible graphics. In IFIP Conference on Human-Computer Interaction. Springer, 30–48.
  21. Gorlewicz Jenna L., Tennison Jennifer L., Uesbeck P. Merlin, Richard Margaret E., Palani Hari P., Stefik Andreas, Smith Derrick W., and Giudice Nicholas A. 2020. Design guidelines and recommendations for multimodal, touchscreen-based graphics. ACM Trans. Access. Comput. 13, 3 (Aug. 2020).
  22. Hatwell Yvette, Streri Arlette, and Gentaz Edouard. 2003. Touching for Knowing: Cognitive Psychology of Haptic Manual Perception. Vol. 53. John Benjamins Publishing.
  23. Jansson Gunnar, Juhasz Imre, and Cammilton Arina. 2006. Reading virtual maps with a haptic mouse: Effects of some modifications of the tactile and audio-tactile information. Brit. J. Vis. Impair. 24, 2 (2006), 60–66.
  24. Jayant Chandrika. 2006. A survey of math accessibility for blind persons and an investigation on text/math separation. Retrieved from https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9287&rep=rep1&type=pdf.
  25. Karshmer Arthur I. and Bledsoe Chris. 2002. Access to mathematics by blind students. In International Conference on Computers for Handicapped Persons. Springer, 471–476.
  26. Karshmer Arthur I., Gupta Gopal, Pontelli Enrico, Miesenberger Klaus, Ammalai N., Gopal Deepa, Batusic Mario, Stöger Bernhard, Palmer B., and Guo Hai-Feng. 2003. UMA: A system for universal mathematics accessibility. In 6th International ACM SIGACCESS Conference on Computers and Accessibility. 55–62.
  27. King Alasdair. 2012. Screenreaders, magnifiers, and other ways of using computers. In Assistive Technology for Blindness and Low Vision. CRC Press, 265–288.
  28. King Alasdair Robin and Evans Gareth. 2006. Re-presenting Visual Content for Blind People. University of Manchester.
  29. Klatzky Roberta L., Giudice Nicholas A., Bennett Christopher R., and Loomis Jack M. 2014. Touch-screen technology for the dynamic display of 2D spatial information without vision: Promise and progress. Multisens. Res. 27, 5-6 (2014), 359–378.
  30. Kohase Kento, Nakamura Shunsuke, and Fujiyoshi Akio. 2020. Layout analysis of PDF documents by two-dimensional grammars for the production of accessible textbooks. In International Conference on Computers Helping People with Special Needs. Springer, 321–328.
  31. Kruger Rynhardt, de Wet Febe, and Niesler Thomas. 2020. Interactive image exploration for visually impaired readers using audio-augmented touch gestures. In 24th International Conference Information Visualisation (IV). IEEE, 544–549.
  32. Larkin Jill H. and Simon Herbert A. 1987. Why a diagram is (sometimes) worth ten thousand words. Cogn. Sci. 11, 1 (1987), 65–100.
  33. Loomis Jack M., Klatzky Roberta L., and Giudice Nicholas A. 2018. Sensory substitution of vision: Importance of perceptual and cognitive processing. In Assistive Technology for Blindness and Low Vision. CRC Press, 179–210.
  34. Manshad Muhanad S. and Manshad Ahmad S. 2008. Multimodal vision glove for touchscreens. In 10th International ACM SIGACCESS Conference on Computers and Accessibility. 251–252.
  35. Meijer Peter B. L. 1992. An experimental system for auditory image representations. IEEE Trans. Biomed. Eng. 39, 2 (1992), 112–121.
  36. Meijer Peter B. L. 2002. Seeing with sound for the blind: Is it vision? In Tucson Conference on Consciousness.
  37. Melfi Giuseppe, Schwarz Thorsten, and Stiefelhagen Rainer. 2018. An inclusive and accessible LaTeX editor. In International Conference on Computers Helping People with Special Needs. Springer, 579–582.
  38. Miller Irene, Pather Aquinas, Milbury Janet, Hasty Lucia, O’Day Allison, Spence Diane, and Osterhaus S. 2010. Guidelines and standards for tactile graphics. Retrieved from http://www.brailleauthority.org/tg/web-manual/index.html.
  39. Montfort Nick. 2005. Twisty Little Passages: An Approach to Interactive Fiction. The MIT Press.
  40. Montfort Nick and Short Emily. 2012. Interactive fiction communities. Dichtung Digit. 41 (2012).
  41. Murillo-Morales Tomas and Miesenberger Klaus. 2020. AUDiaL: A natural language interface to make statistical charts accessible to blind persons. In International Conference on Computers Helping People with Special Needs. Springer, 373–384.
  42. Nakamura Shunsuke, Kohase Kento, and Fujiyoshi Akio. 2020. A series of simple processing tools for PDF files for people with print disabilities. In International Conference on Computers Helping People with Special Needs. Springer, 314–320.
  43. Raman T. V. 1994. AsTeR: Audio system for technical readings. Inf. Technol. Disab. 1, 4 (1994).
  44. Rastogi Ravi, Pawluk Dianne T., and Ketchum Jessica M. 2010. Issues of using tactile mice by individuals who are blind and visually impaired. IEEE Trans. Neural Syst. Rehab. Eng. 18, 3 (2010), 311–318.
  45. Rosenbaum David A., Dawson Amanda M., and Challis John H. 2006. Haptic tracking permits bimanual independence. J. Experim. Psychol.: Hum. Percept. Perform. 32, 5 (2006), 1266.
  46. Royal National Institute of Blind People. 2015. Braille Mathematics Notation. Retrieved from https://www.ukaaf.org/wp-content/uploads/2015/05/Braille-Mathematics-Notation-PDF.pdf.
  47. Shinyama Yusuke. 2004. PDFMiner Python PDF parser and analyzer. Retrieved from https://www.unixuser.org/~euske/python/pdfminer/index.html.
  48. Sjöström Calle, Danielsson Henrik, Magnusson Charlotte, and Rassmus-Gröhn Kirsten. 2003. Phantom-based haptic line graphics for blind persons. Vis. Impair. Res. 5, 1 (2003), 13–32.
  49. Soiffer Neil. 2005. MathPlayer: Web-based math accessibility. In 7th International ACM SIGACCESS Conference on Computers and Accessibility. 204–205.
  50. Soiffer Neil. 2007. MathPlayer v2.1: Web-based math accessibility. In 9th International ACM SIGACCESS Conference on Computers and Accessibility. 257–258.
  51. Sorge Volker, Chen Charles, Raman T. V., and Tseng David. 2014. Towards making mathematics a first class citizen in general screen readers. In 11th Web for All Conference. 1–10.
  52. Soviak Andrii, Borodin Anatoliy, Ashok Vikas, Borodin Yevgen, Puzis Yury, and Ramakrishnan I. V. 2016. Tactile accessibility: Does anyone need a haptic glove? In 18th International ACM SIGACCESS Conference on Computers and Accessibility. 101–109.
  53. Stöger Bernhard and Miesenberger Klaus. 2015. Accessing and dealing with mathematics as a blind individual: State of the art and challenges. Enab. Access Pers. Vis. Impair. 199 (2015), 203.
  54. Suzuki Masakazu, Tamari Fumikazu, Fukuda Ryoji, Uchida Seiichi, and Kanahori Toshihiro. 2003. INFTY: An integrated OCR system for mathematical documents. In ACM Symposium on Document Engineering. 95–104.
  55. Thompson David M. 2005. LaTeX2Tri: Physics and mathematics for the blind or visually impaired. In 20th Conference on Technology and Persons with Disabilities. Citeseer.
  56. Tornil Bertrand and Baptiste-Jessel Nadine. 2004. Use of force feedback pointing devices for blind users. In ERCIM Workshop on User Interfaces for All. Springer, 479–485.
  57. Trewin Shari, Hanson Vicki L., Laff Mark R., and Cavender Anna. 2008. PowerUp: An accessible virtual world. In 10th International ACM SIGACCESS Conference on Computers and Accessibility. 177–184.
  58. Tseng Woody. 2021. Access8Math NVDA Add-on. Retrieved from https://addons.nvda-project.org/addons/access8math.en.html.
  59. ViewPlus. 2005. IVEO 3 Hands-on Learning System. Retrieved from https://viewplus.com/product/iveo-3-hands-on-learning-system/.
  60. ViewPlus. 2021. ViewPlus Tactile Embossers. Retrieved from https://viewplus.com/.
  61. World Wide Web Consortium. 1998. Mathematical Markup Language. Retrieved from https://www.w3.org/TR/WD-math/.
  62. World Wide Web Consortium. 2018. Web Content Accessibility Guidelines (WCAG) 2.1. Retrieved from https://www.w3.org/TR/WCAG21/.
  63. Xu Cheng, Israr Ali, Poupyrev Ivan, Bau Olivier, and Harrison Chris. 2011. Tactile display for the visually impaired using TeslaTouch. In CHI’11 Extended Abstracts on Human Factors in Computing Systems. ACM, 317–322.
  64. Yamaguchi Katsuhito, Komada Toshihiko, Kawane Fukashi, and Suzuki Masakazu. 2008. New features in math accessibility with INFTY software. In International Conference on Computers for Handicapped Persons. Springer, 892–899.
