Virtual reality validation of naturalistic modulation strategies to counteract fading in retinal stimulation

Published 29 March 2022 © 2022 The Author(s). Published by IOP Publishing Ltd
Citation: Jacob Thomas Thorn et al 2022 J. Neural Eng. 19 026016. DOI 10.1088/1741-2552/ac5a5c

Abstract

Objective. Temporal resolution is a key challenge in artificial vision. Several prosthetic approaches are limited by the perceptual fading of evoked phosphenes upon repeated stimulation from the same electrode. Therefore, implanted patients are forced to perform active scanning, via head movements, to refresh the visual field viewed by the camera. However, active scanning is a draining task, and it is crucial to find compensatory strategies to reduce it. Approach. To address this question, we implemented perceptual fading in simulated prosthetic vision using virtual reality. Then, we quantified the effect of fading on two indicators: the time to complete a reading task and the head rotation during the task. We also tested whether stimulation strategies previously proposed to increase the persistence of responses in retinal ganglion cells to electrical stimulation could improve these indicators. Main results. This study shows that stimulation strategies based on interrupted pulse trains and randomisation of the pulse duration allow a significant reduction of both the time to complete the task and the head rotation during the task. Significance. The stimulation strategy used in retinal implants is crucial to counteract perceptual fading and to reduce active head scanning during prosthetic vision. In turn, less active scanning might improve the patient's comfort in artificial vision.

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Retinal prostheses provide artificial vision to blind people affected by photoreceptor degeneration through the electrical stimulation of the retina, inducing the perception of localised bright spots of light, called phosphenes. The appearance of patterns of phosphenes allows blind patients to detect and recognise objects, orient themselves or navigate obstacles [1]. However, so far retinal implants have struggled to provide artificial vision useful in everyday life. There are multiple technological and physiological constraints that currently keep artificial vision far away from natural vision, among which are limitations on the electrode number and density, the size of the restored field of view, the unselective stimulation of retinal circuits or cells, and the temporal resolution of the elicited responses [2, 3]. These limitations lead to exhaustion in users, who have to learn an almost totally new way to see, and are not able to count only on the implant for daily autonomous activities [4].

A key factor in artificial vision is the temporal resolution of phosphenes. The electrical stimulation of the retina with a fixed electrode causes the phosphene to fade in less than a second [5–7]: a phenomenon that makes continuous perception above flicker fusion nearly impossible. The fading of the phosphenes elicited by retinal implants presumably originates in the inner retinal circuits mediating contrast adaptation [6–8]. Similar to the adaptation process to static visual stimuli (Troxler effect), the engagement of these circuits during retinal ganglion cell (RGC) network-mediated stimulation by either epiretinal or subretinal implants generates an adaptive response to repeated static electrical stimuli [5, 6, 9]. The desensitisation rate increases with the stimulation frequency [7], and at the stimulation rates typically used for retinal stimulation, between 5 and 30 Hz, RGCs stop responding within a few hundred milliseconds. The physiological adaptation of the RGC response over repeated static stimulation observed in animal models is frequently associated with the phosphene fading reported by implanted patients during clinical trials [10, 11]. This association is motivated by the similarity in the time courses of the two phenomena, characterised by a rapid drop followed by a slower decay [6, 12, 13].

In vision, ocular micromovements (e.g. microsaccades) refresh the image projected on the retina avoiding the Troxler effect [14]. In artificial vision, avoiding response adaptation is possible only when three conditions are met. First, the electrodes must be activated by natural or artificial light entering the eye through the pupil, such as for photosensitive prostheses (e.g. Alpha AMS [15], PRIMA [16] and POLYRETINA [17, 18]), so that eye micromovements can effectively shift the stimulated area of the retina. Second, the stimulation must be at high spatial resolution. Microsaccades occur with a great variety of amplitudes (from 0.2 to 10°) and frequencies (from 1 Hz to 20 Hz). Under stable vision conditions, most fixational microsaccades occur one to two times per second, and cover around 1° of visual angle [19]. The stimulation resolution of the array must be fine enough so that a shift of a few hundreds of micrometres (1°) is sufficient to move the projected image by one or two electrodes over the prosthesis leading to stimulation of different retinal areas. Third, physiological micromovements must be preserved, such as in age-related macular degeneration [16].

When electrical prostheses fed by an external camera are used (e.g. Argus® II [11]), users must actively perform head movements to refresh the visual field viewed by the camera: a procedure learned during the first weeks of postoperative training. Alternatively, if eye movements are not dysfunctional and an eye tracker is used, large voluntary eye movements could be used to shift the region of interest within the camera's field of view [20]. Although the patient's performance typically improves during the first learning phase [21], it quickly reaches a plateau, and several patients described active scanning as an exhausting task [4]. Recently, our group proposed a naturalistic spatiotemporal modulation strategy for electrical prostheses, which, in retinal explants, reduced retinal desensitisation and prolonged the RGC response upon static stimulation [7].

Simulated prosthetic vision (SPV) is a key tool to understand the minimal requirements of artificial vision [22–32]. However, so far, most SPV studies have excluded the temporal properties of artificial vision, in particular phosphene fading. Only one study recently introduced fading using short video clips presented on a computer screen [33]. In virtual reality (VR), subjects are able to scan the environment using head movements, allowing the systematic investigation of how scanning influences fading. In this work, we introduced perceptual fading into a VR-based SPV, under the hypothesis that fading is solely caused by retinal desensitisation, according to previously reported temporal dynamics in RGCs [7]. We analysed the consequences of fading on a VR task with sighted participants, and simulated fading compensation approaches based on the previously described naturalistic spatiotemporal modulation strategies [7] to evaluate whether they could indeed successfully reduce participants' head scanning.

2. Methods

2.1. Ethical statement

Experiments were approved by the human research ethics committee of École polytechnique fédérale de Lausanne (decision number 042-2018/16.10.2018). Ten sighted volunteers were involved in the study (table 1).

Table 1. List of normally sighted volunteers enrolled in the study.

Participant  Sex  Age  Mother tongue
1            F    28   Italian
2            F    26   English
3            F    27   Italian
4            F    28   Italian
5            F    34   French
6            F    34   English
7            M    23   French
8            F    21   French
9            F    24   French
10           M    30   French

2.2. Prosthetic vision

SPV was generated by adapting a previously described approach [22]. The experiment was performed using a Dell Precision 3630 computer with an Intel Xeon E-2146G CPU (3.50 GHz) and an Nvidia GeForce GTX 1080 GPU. The VIVE Pro Eye head mounted display was used. Tracking of head orientation and position was provided using two VIVE Base Station 2.0 tracking cameras placed in opposite corners of the room (roughly 2.5 m apart) and aimed at the centre of the room where the participant was seated. SPV was developed using Unity and computed using Cg shaders, allowing real-time operation. The code is available online (https://github.com/lne-lab/polyretina_vr). Both the spatial and temporal properties of phosphenes were considered. Images were converted into phosphenes distributed over a visual angle of 45° using the layout of the POLYRETINA prosthesis (80/120, electrode diameter in µm/electrode pitch in µm) [18]. The total number of phosphenes was 9914. For each frame, three sources of random variability were included: the phosphene size was randomly varied between ±30% of the electrode size, the phosphene brightness was randomly varied between 50% (grey) and 100% (white) of the phosphene's default brightness, and 10% of the electrodes were considered not functional. Then, a distortion due to unintended activation of axon fibres was introduced (λ = 2) [22]. The frame rate of SPV was set to 5 Hz with an 11 ms frame duration (1 frame on/17 frames off). SPV was presented to the right eye only.
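
The per-frame variability and the frame timing described above can be illustrated with a short sketch. It is written in Python for readability and is not taken from the Unity/Cg implementation in the repository. The function names, the uniform sampling, the redrawing of non-functional electrodes on every call and the 90 Hz display rate are assumptions: the text gives only the ranges, the 5 Hz rate and the 1 on/17 off frame pattern.

```python
import random

ELECTRODE_DIAMETER_UM = 80.0  # POLYRETINA 80/120 layout, from the text
DISPLAY_HZ = 90               # assumed display refresh rate (not stated in the text)
STIM_HZ = 5                   # SPV frame rate, from the text


def frames_per_cycle(display_hz=DISPLAY_HZ, stim_hz=STIM_HZ):
    """Return (frames on, frames off) for one stimulation cycle."""
    total = display_hz // stim_hz
    return 1, total - 1


def sample_frame_variability(n_phosphenes, rng):
    """Draw (size in um, brightness, is_functional) for every phosphene."""
    n_broken = round(0.10 * n_phosphenes)  # 10 % non-functional electrodes
    broken = set(rng.sample(range(n_phosphenes), n_broken))
    phosphenes = []
    for i in range(n_phosphenes):
        size = ELECTRODE_DIAMETER_UM * rng.uniform(0.7, 1.3)  # +/-30 % of electrode size
        brightness = rng.uniform(0.5, 1.0)                    # 50 % (grey) to 100 % (white)
        phosphenes.append((size, brightness, i not in broken))
    return phosphenes
```

Under the assumed 90 Hz refresh rate, one display frame lasts about 11.1 ms and a 5 Hz cycle spans 18 display frames, which is consistent with the 11 ms frame duration and the 1 on/17 off pattern reported above.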

2.3. Simulation of perceptual fading

A novel aspect of the SPV was the inclusion of perceptual fading, which is the reduction in brightness of phosphenes over a short amount of time due to retinal desensitisation.

Using blind retinal explants, we previously showed that, upon repeated illumination of the same POLYRETINA electrode at 5 Hz, the RGC response persisted for only 0.4 s [7]. In agreement with previous results describing a two-phase desensitisation process in RGCs [6], we found a first rapid decay occurring immediately after the first stimulation pulse followed by a slow decay during prolonged stimulation. On average, the rapid decay occurred over the first 25% of the total response duration (i.e. 0.1 s) and the slow decay over the remaining 75% of the total response duration (i.e. 0.3 s).

In SPV, phosphene fading was simulated via a decrease in brightness (figure 1(a)) whose dynamics matched those of RGC desensitisation. The brightness decay was approximated as a double linear decay with two time constants (tf, fast decay constant; ts, slow decay constant). Brightness (B) decreased linearly over tf until reaching 50% brightness and then decreased linearly over ts until reaching 0% brightness, as in equation (1), where ta is the elapsed time since the phosphene's onset. Brightness values range between 0 and 1. This range was chosen specifically because it corresponds to the values expected by application programming interfaces when rendering pixels on the screen

Equation (1)
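
The expression of equation (1) is not reproduced in this text. A minimal sketch that follows the verbal description (linear drop from 1 to 0.5 over tf, then linear drop from 0.5 to 0 over ts) is given below. The function name `decay_brightness` is hypothetical, and the piecewise form is a reconstruction from the description rather than the published equation.

```python
def decay_brightness(t_a, t_f, t_s):
    """Brightness in [0, 1] after t_a seconds of stimulation (reconstructed form).

    t_f: fast decay constant (s), t_s: slow decay constant (s).
    """
    if t_a <= 0.0:
        return 1.0
    if t_a < t_f:
        # fast phase: 100 % -> 50 % over t_f
        return 1.0 - 0.5 * t_a / t_f
    if t_a < t_f + t_s:
        # slow phase: 50 % -> 0 % over t_s
        return 0.5 - 0.5 * (t_a - t_f) / t_s
    return 0.0
```

For the static 5 Hz case (tf = 0.1 s, ts = 0.3 s), this gives a brightness of 0.5 at 0.1 s and 0.25 at 0.25 s.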

Figure 1. (a) Modelled brightness decay in SPV for a 5 Hz static stimulation. The black line is the double linear decay with two time constants: tf and ts. The grey dashed lines highlight the transition between tf and ts. (b) Modelled exponential brightness recovery. In (a) and (b), the red circles are the values used in SPV with the frame rate set to 5 Hz.

Phosphene recovery was simulated via an exponential increase in brightness (figure 1(b)) whose dynamics matched experimental evidence of recovery of retinal excitability after desensitisation [8, 34]. This recovery is a time-dependent process, during which the neuronal membrane is absolutely refractory immediately after a spike train and then gradually returns to its resting potential [35, 36]. Similarly, in SPV, individual pixel brightness switched from decay to recovery after 2 s without stimulation (absolute refractory period). The brightness recovery followed an exponential growth (equation (2)), according to classical models of neuronal excitability [37–39], where tn is the time the phosphene has been inactive, a is the time for the phosphene to completely recover (set to 2 s) and b defines the arc of recovery (set to 3)

Equation (2)
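
The expression of equation (2) is not reproduced in this text, so its exact form is unknown here. The sketch below shows one possible normalised exponential that is consistent with the stated parameters: it starts at 0, reaches 1 when tn equals a, and b controls the curvature. Both the functional form and the name `recovery_brightness` are assumptions, not the published equation.

```python
import math


def recovery_brightness(t_n, a=2.0, b=3.0):
    """Assumed exponential recovery: 0 at t_n = 0, 1 at t_n >= a.

    a: time to complete recovery (s), b: curvature ("arc") of the recovery.
    """
    if t_n <= 0.0:
        return 0.0
    if t_n >= a:
        return 1.0
    # normalised exponential growth (illustrative form only)
    return (math.exp(b * t_n / a) - 1.0) / (math.exp(b) - 1.0)
```

With this assumed form, recovery is slow at first and accelerates towards the end of the recovery window; whether tn is counted from the last pulse or from the end of the 2 s refractory period is not specified in the text.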

In SPV, the 'normal fading' (F) condition corresponds to the natural desensitisation of the retina upon 5 Hz stimulation from the same electrode (figure 1), while the 'no fading' (NF) condition corresponds to stable brightness over time (B = 1). Concurrently, compensation approaches for fading were simulated based on the naturalistic spatiotemporal modulation strategies previously characterised with retinal explants from blind mice (figure 2) [7]. The 'saccade' (S) strategy is a spatial modulation strategy (figure 2(a)). It simulates natural microsaccades by having the entire image oscillate horizontally by the pitch of one phosphene once every second. This way, new phosphenes become active and refresh at least some part of the prosthetic image, thus counteracting fading. The total response duration in RGCs was measured by alternating the illumination (10 ms light pulses at 5 Hz repetition rate) of two neighbouring electrodes at 1 Hz switching rate. The RGC response persisted for 1.8 s. The 'random' (R) and the 'interrupt' (I) conditions are temporal modulation strategies. In the R condition, the duration of the light pulse was varied, leading to variable inter-pulse intervals (figure 2(b)). This strategy emulates the highly irregular temporal profile of light reaching a steady location on the retina due to both eye and object movements. This dynamic change helps to counteract the desensitisation of bipolar cells to natural stimuli [40]. Similarly, it reduces desensitisation during static electrical stimulation [7, 41]. In the R condition, the RGC response persisted for 0.8 s. In the I condition, for every five pulses, only the first three were delivered (figure 2(c)), since the last two would not have generated an RGC response anyway. The RGC response persisted for 0.4 s, as in the static stimulation. All three strategies were temporally synchronous across all phosphenes.
The remaining conditions were combinations of the three strategies already described: saccade/random (SR, figure 2(d)), saccade/interrupt (SI, figure 2(e)), random/interrupt (RI, figure 2(f)), and saccade/random/interrupt (SRI, figure 2(g)).
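
As an illustration of the I and S strategies described above, the following sketch generates the pulse pattern of the interrupt strategy and the horizontal offset of the saccade strategy. The function names are hypothetical, and the toggle between two positions once per second is an assumed reading of "oscillate horizontally by the pitch of one phosphene once every second". The R strategy is omitted because the range of pulse durations is not given in the text.

```python
def interrupt_mask(n_pulses, delivered=3, block=5):
    """True where a pulse is delivered: the first `delivered` of every `block`."""
    return [(i % block) < delivered for i in range(n_pulses)]


def saccade_offset(t, pitch=1, period_s=1.0):
    """Horizontal image offset in phosphene pitches, toggling once per period."""
    return pitch if int(t // period_s) % 2 else 0
```

For ten pulses at 5 Hz (2 s), `interrupt_mask(10)` delivers pulses 1 to 3 and 6 to 8 and cancels the others, while `saccade_offset` returns 0 during the first second and 1 during the second.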

Figure 2. Sketches of the naturalistic spatiotemporal modulation strategy previously used in blind retinal explants to determine the total response duration in RGCs [7]. In each panel, the green circle is the illuminated POLYRETINA electrode and the black bars represent the stimuli. The grey circle is a neighbouring POLYRETINA electrode which is activated in the 'Saccade' condition. In the 'Random' condition, the line thickness indicates the pulse duration.

The rate of brightness decay in SPV for the compensation approaches was generated from the total response duration measured in RGCs [7]. The durations were split with a ratio of 1:3 to produce the fast (tf) and slow (ts) decay constants used in equation (1). For each approach, the computed time constants are reported in table 2.
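
The 1:3 split described above amounts to simple arithmetic, sketched here for the conditions whose RGC response durations are stated in the text. The names `RGC_RESPONSE_S` and `decay_constants` are hypothetical; the halving for strategies that alternate between two electrodes follows the explanation given in the caption of table 2.

```python
# RGC response durations (s) stated in the text for the single strategies
RGC_RESPONSE_S = {"F": 0.4, "S": 1.8, "R": 0.8, "I": 0.4}


def decay_constants(response_s, alternates_electrodes=False):
    """Return (total duration, t_f, t_s) in seconds using a 1:3 split."""
    # each of two alternating electrodes is active for half of the response
    total = response_s / 2.0 if alternates_electrodes else response_s
    t_f = total / 4.0   # fast phase: first 25 %
    t_s = total - t_f   # slow phase: remaining 75 %
    return total, t_f, t_s
```

For example, the static response of 0.4 s yields tf = 0.1 s and ts = 0.3 s, and the 1.8 s response measured for the S strategy, once halved, yields a total of 0.9 s with tf = 0.225 s and ts = 0.675 s.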

Table 2. Fast and slow time constants for each condition were determined according to the persistence of RGC response observed in blind retinas upon stimulation with POLYRETINA [7]. The S strategy involves the alternation between two neighbouring electrodes. Therefore, the total duration of the RGC response was halved to account for the fading of each phosphene.

Condition  Total duration (s)  tf (s)  ts (s)
F          0.400               0.100   0.300
S          0.900               0.225   0.675
R          0.800               0.200   0.600
I          0.400               0.100   0.300
SR         0.700               0.175   0.525
SI         1.100               0.275   0.825
RI         3.200               0.800   2.400
SRI        2.200               0.550   1.650
NF         -                   -       -

A representative example of image perception at various frames is shown in figure 3 for various conditions.

Figure 3. Example of perception with six strategies: F (a), SR (b), SI (c), RI (d), SRI (e) and NF (f). Empty cells indicate the cancelled stimuli for the I strategy.

The fading logic was applied simultaneously to each phosphene every time there was a stimulus presentation (figure 4). Parameters were updated in real-time on the GPU where the simulation is also being processed. To read/write such an amount of data on the GPU we used multiple render textures and accessed them in a double-buffered fashion to create a read/write data matrix where the values could be stored and updated in real-time. In other words, all processing and necessary data for the simulation remained solely on the GPU, where computations could be completed fast enough to enable real-time SPV.
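
The double-buffered read/write scheme can be sketched on the CPU as follows. This is only an illustration of the ping-pong idea (read the previous state from one buffer, write the new state to the other, then swap), not the shader code used in the study. The per-phosphene state `(t_a, t_n)`, the function names and the update rule are simplifying assumptions.

```python
def update(src, dst, stimulated, dt):
    """Write the next state of every phosphene into dst, reading only from src."""
    for i, on in enumerate(stimulated):
        t_a, t_n = src[i]
        # active phosphenes accumulate stimulation time; inactive ones accumulate rest
        dst[i] = (t_a + dt, 0.0) if on else (t_a, t_n + dt)


def simulate(frames, n_phosphenes, dt):
    """Run the update over a sequence of per-phosphene stimulation flags."""
    buffers = [[(0.0, 0.0)] * n_phosphenes, [(0.0, 0.0)] * n_phosphenes]
    read = 0
    for stimulated in frames:
        update(buffers[read], buffers[1 - read], stimulated, dt)
        read = 1 - read  # swap roles: the buffer just written becomes the one read next
    return buffers[read]
```

Because each step reads from one buffer and writes to the other, no phosphene can observe a partially updated state, which is the property that render-texture ping-ponging provides on the GPU.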

Figure 4. Fading logic. ta is the time since the first pulse, tn is the time since the last pulse, Δt is the time since the fading logic was last executed, tf is the fast decay constant, ts is the slow decay constant.

2.4. Experiment design

The experiment consisted of a single repeated task in which participants had to read aloud three six-letter words. Words were presented in the mother tongue of the participant and on separate lines. Each word was 7° tall and the entire stimulus (i.e. the three words) took up 27° vertically and around 40° horizontally. Participants were informed that phosphenes would appear and fade over a few seconds and that head movements could be used to reduce fading. Participants could rotate their heads in order to adjust their view of the words. However, translational movements did not affect the view of the stimulus. Only rotational movements were used as they are more natural and comfortable for participants to perform. A similar approach was taken to analyse only the angular head movement of patients implanted with the Argus® II [20]. Once a participant had given their answers, the experimenter would mark the trial as finished and the participant would have a 2 s pause until the next trial. As participants were able to move their heads, their virtual head position was reset at the start of each trial.

The task was designed to not be too difficult but to also take time to complete. As the study was focused on the impact of phosphene fading, it was important that participants could not complete the task in such a short amount of time that the fading was not perceived. Using three words instead of one guaranteed that a participant could not complete the task before the image had completely faded, unless they rotated their head to compensate. On the other hand, the words were displayed at a generous size, so that participants could actually complete the task. SPV includes several laborious aspects of artificial vision such as a low spatial resolution, restricted field of view and refresh rate, as well as unintended visual distortions from axon fibre activation and perceptual desensitisation. In pilot tests, making words smaller, in combination with the already unaccommodating SPV, increased the difficulty to unacceptable levels. Another important aspect of the task was that the three words fit entirely within the field of view of the prosthetic vision (45°). In this way, any head rotation from participants could be largely attributed to their purposeful effort to reduce fading and not to participants having to move their heads simply to see the stimulus.

The task was completed under the nine different conditions, presented randomly. Each condition was presented five times for a total of 45 trials per session. Participants were asked to come in for a total of six sessions to control for learning effects. Therefore, each participant completed 270 trials (30 per condition). Participants were given instructions on the task before starting. Most importantly, they were given the information that the SPV would fade over time and that head rotations could be used to counteract the fading.
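
The trial counts above follow directly from the design, as the short sketch below shows. The name `session_schedule` is hypothetical, and shuffling all 45 trials of a session together is an assumption, since the text states only that conditions were presented randomly.

```python
import random

CONDITIONS = ["F", "S", "R", "I", "SR", "SI", "RI", "SRI", "NF"]


def session_schedule(rng, repeats=5):
    """Return one session: every condition `repeats` times, in random order."""
    trials = CONDITIONS * repeats
    rng.shuffle(trials)
    return trials
```

Each session contains 9 × 5 = 45 trials, so six sessions give 270 trials per participant, 30 for each condition.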

2.5. Analysis of head rotation

Data analysis was performed in MATLAB (MathWorks). The head rotation data consisted of a variable number of quaternion samples, depending on the duration of the trial (figure 5(a)). Quaternions are four-dimensional vectors which describe the rotation of an object around three perpendicular axes. To analyse the data, each quaternion was first converted into rotations around the vertical and lateral axes (figure 5(b)), since rotations were almost absent in the longitudinal axis. Then, head rotation data was split into two separate measurements: one focused on the magnitude of head rotations and the other on the spread of head rotations. Magnitude measures the amount that participants had to move their heads during a trial while spread measures the extent of dispersion of head movements away from the stimulus. Here, a head movement away from the stimulus refers to a rotation of the head that produces an angle between the centre of the stimulus (i.e. the three words) and the gaze position. This could be achieved through either lateral or vertical head rotations or a combination of both. Measurements were split this way due to the possibility that participants might have moved their heads a lot, but only in a small area, or alternatively, only had to move their heads a few times, but did so in large arcs.
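
One way to convert a head quaternion into rotations around the vertical and lateral axes is to rotate a forward-pointing gaze vector and read off its yaw and pitch. The sketch below is in Python rather than MATLAB and is not the analysis code of the study. The (w, x, y, z) component order, the y-up and z-forward axes and the sign conventions are assumptions.

```python
import math


def rotate(q, v):
    """Rotate vector v by unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * cross(q_vec, v)
    tx = 2.0 * (y * vz - z * vy)
    ty = 2.0 * (z * vx - x * vz)
    tz = 2.0 * (x * vy - y * vx)
    # v' = v + w * t + cross(q_vec, t)
    return (vx + w * tx + (y * tz - z * ty),
            vy + w * ty + (z * tx - x * tz),
            vz + w * tz + (x * ty - y * tx))


def yaw_pitch_deg(q, forward=(0.0, 0.0, 1.0)):
    """Yaw (about the vertical axis) and pitch (about the lateral axis) in degrees."""
    gx, gy, gz = rotate(q, forward)
    yaw = math.degrees(math.atan2(gx, gz))
    pitch = math.degrees(math.asin(max(-1.0, min(1.0, gy))))
    return yaw, pitch
```

With these conventions, the identity quaternion maps to (0°, 0°) and a 90° rotation about the vertical axis maps to a yaw of 90° with no pitch.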

Figure 5. Representative example of head rotation data in one trial under condition F. (a) Example of the quaternion data. The red dot is the position of the head, while the black dots are the coordinates at which the participant was looking throughout the trial. The vertical axis is the yaw, the lateral axis is the pitch and the longitudinal axis is the roll. (b) Two-dimensional plot of the head rotations along the lateral and vertical axes. Cyan circles are the peaks at the extremities of the segments, while the red triangles represent the angle of the change in head rotation at the peak. The insert shows a magnified view of a detected peak. (c) Example of a gaze vector (blue) looking away from the stimulus due to the participant rotating their head around the lateral and vertical axes. The black vector is rotated by a quaternion which produces a vector looking directly towards the stimulus. The angle between the two vectors is the rotation of the head away from the stimulus.

To measure the magnitude, head rotations from each trial were segmented into individual rotation vectors (figure 5(b)). Segments were identified using a peak analysis of the head's inverted angular velocity, performed with the MATLAB function 'findpeaks', which returns all local maxima of a one-dimensional signal. The angular velocity was inverted so that the peaks corresponded to the instants when participants were slowing their head movement. Segments were then filtered with the k-means algorithm, using the peak's height, width and the turning angle at the point of the peak. The first and last data points of each trial were always marked as peaks. If multiple data points were identified for a single segment then the data point closest to the correct k-means cluster centre was chosen. The segment magnitude was calculated by summing the rotation between each data point in the segment. Finally, the magnitude of these vectors was averaged. This approach was chosen over simpler options such as the total head rotation, as the latter was heavily affected by the time taken to complete the task. Similarly, head velocity proved to be a misleading metric as participants' head rotations were often discontinuous, pausing while participants focused on the stimulus and then subsequently moving as fading began to degrade vision. As a consequence of the stop-start nature of head rotations, the data could be segmented into individual rotation vectors, each with an associated magnitude.
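
The segmentation idea can be sketched as follows. This simplified Python version uses a plain local-maximum search in place of MATLAB's 'findpeaks' and omits the k-means filtering step entirely, so it is an illustration of the principle rather than a reimplementation of the analysis. The function names and input layout are assumptions.

```python
def local_maxima(signal):
    """Indices of strict local maxima of a 1-D sequence (interior points only)."""
    return [i for i in range(1, len(signal) - 1)
            if signal[i - 1] < signal[i] > signal[i + 1]]


def mean_segment_magnitude(step_angles_deg, speed_deg_s):
    """Average summed rotation per segment.

    speed_deg_s[i]     : angular speed at sample i (N samples)
    step_angles_deg[i] : rotation between sample i and i + 1 (N - 1 values)
    """
    inverted = [-s for s in speed_deg_s]  # slow-downs become peaks
    # first and last samples are always treated as segment boundaries
    breaks = [0] + local_maxima(inverted) + [len(speed_deg_s) - 1]
    magnitudes = [sum(step_angles_deg[start:end])
                  for start, end in zip(breaks[:-1], breaks[1:])]
    return sum(magnitudes) / len(magnitudes)
```

For a trial in which the head accelerates, slows down once and accelerates again, the single slow-down splits the trial into two segments whose summed rotations are then averaged.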

To measure the spread of head rotations, we calculated the average rotation of the head away from the stimulus for each trial, using the MATLAB function 'dist'. This function returns the angle between two quaternions in radians, which is analogous to the Euclidean distance between two points, substituting x and y positions for lateral and vertical rotations. Using this function in combination with an identity quaternion angled directly towards the stimulus, an angle was calculated for each head rotation away from the stimulus at each frame of a trial (figure 5(c)). Then, the data was converted from radians into degrees and averaged per trial.
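
The angular distance between two unit quaternions can be computed from their dot product, as in the sketch below. It is written in Python and is not a wrapper around MATLAB's 'dist'. The function names are hypothetical, and the absolute value, used so that q and -q (which encode the same rotation) give the same angle, is a choice made for this illustration.

```python
import math


def quat_angle_deg(q1, q2):
    """Angle in degrees between the rotations encoded by unit quaternions q1 and q2."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return math.degrees(2.0 * math.acos(min(1.0, dot)))


def mean_spread_deg(head_quats, target=(1.0, 0.0, 0.0, 0.0)):
    """Average angular distance from the target orientation over one trial."""
    return sum(quat_angle_deg(q, target) for q in head_quats) / len(head_quats)
```

Here `target` is the identity quaternion, standing in for the orientation that faces the stimulus directly; a head rotated by 90° about any axis is at an angular distance of 90° from it.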

2.6. Statistical analysis and plots

Statistical analysis and graphical representation were performed with Prism 8 (GraphPad). The D'Agostino and Pearson omnibus normality test was performed to justify the use of a non-parametric test. The box plots extend from the 25th to 75th percentiles. The line is the median. The + is the mean. The whiskers extend to the smallest and the largest value. In plots, p-values were reported as: * p < 0.05, ** p < 0.01, *** p < 0.001, and **** p < 0.0001.
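
The analyses in this work were run in Prism. For readers unfamiliar with the Friedman test used in the results, the sketch below shows how its test statistic is obtained from within-subject ranks. It is a plain-Python illustration with hypothetical function names, it applies no tie correction, and it computes neither the p-value nor Dunn's post hoc comparisons.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman_statistic(data):
    """data[s][c]: value for subject s under condition c (no tie correction)."""
    n, k = len(data), len(data[0])
    rank_sums = [0.0] * k
    for row in data:  # rank the conditions within each subject
        for c, r in enumerate(average_ranks(row)):
            rank_sums[c] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
```

In this study, `data` would hold ten rows (subjects) and nine columns (conditions); the statistic is compared against a chi-square distribution with k - 1 degrees of freedom.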

3. Results

We collected two indicators for each trial: the time to complete the task and the head rotation.

3.1. Time to complete the task

First, we evaluated the time taken by each subject to complete the task (figure 6). This parameter provides the first information on the performance for each condition.

Figure 6. Quantification of the time taken by each subject (mean ± s.d.). For each condition, each subject performed 30 trials, and the time was averaged among trials. Individual data points are the ten subjects.

The Friedman one-way repeated measure analysis of variance by ranks reported a significant difference among the conditions (p < 0.0001). Dunn's multiple comparison test showed that three conditions resulted in a time to complete the task significantly lower than the fading condition: RI (p < 0.0001), SRI (p = 0.0393) and NF (p < 0.0001). The remaining conditions (S, R, I, SR, SI) did not result in a statistically significant difference in the time to complete the task compared to the fading condition (table 3), indicating that those strategies might not be as powerful as the others. Interestingly, the RI and SRI strategies were not only significantly better than the F condition, but also not significantly different from the NF condition, which suggests that they are highly effective in counteracting fading. The other strategies (S, R, I, SR, SI) were significantly worse than the NF condition, which confirms their poorer performance in counteracting fading. In summary, the most promising strategies to counteract fading were RI and SRI.

Table 3. P-values from Dunn's multiple comparison test for the time to complete the task. Significant comparisons (p < 0.05) are marked with an asterisk.

      F         S         R         I         SR        SI        RI        SRI
S     0.1981
R     0.1536    >0.9999
I     >0.9999   >0.9999   >0.9999
SR    0.9895    >0.9999   >0.9999   >0.9999
SI    0.6441    >0.9999   >0.9999   >0.9999   >0.9999
RI    <0.0001*  0.4093    0.5150    0.0003*   0.0690    0.1184
SRI   0.0393*   >0.9999   >0.9999   0.5150    >0.9999   >0.9999   >0.9999
NF    <0.0001*  0.0293*   0.0393*   <0.0001*  0.0032*   0.0062*   >0.9999   0.1536

3.2. Head rotation

In order to counteract the fading of phosphenes during prosthetic vision, implanted patients are instructed to continuously move their head. The side-effect is a high cognitive demand and a decline in comfort of the use of the device from constant scanning. We hypothesised that head movements could be reduced by employing strategies to slow down the fading.

3.2.1. Magnitude of head rotations

During trials under normal fading (F), we observed large head rotations away from the stimulus (figures 7(a) and (b)), while head rotations were virtually absent during the NF condition (figures 7(e) and (f)). Spatiotemporal modulation strategies allowed the reduction of head rotations by a variable factor depending on the strategy (figures 7(c) and (d) for RI strategy).

Figure 7. (a), (c), (e) Representative example of absolute angular rotation of the participant's head from the stimulus over time for conditions F (a), RI (c) and NF (e) during one trial in the same session. (b), (d), (f) Corresponding vertical and lateral rotations for conditions F (b), RI (d) and NF (f) during a single trial.

The average magnitude of the head rotations was then calculated for all the conditions to determine which strategy would provide a statistically significant reduction (figure 8).

Figure 8. Quantification of the magnitude of head rotations per subject (mean ± s.d.). For each condition, each subject performed 30 trials, and data was averaged among trials. Individual data points are the ten subjects.

The Friedman one-way repeated measure analysis of variance by ranks reported a significant difference among the conditions (p < 0.0001). Similar to the time to complete the task, Dunn's multiple comparison test showed that four conditions resulted in a magnitude of head rotations significantly lower than the fading condition: SI (p = 0.0293), RI (p < 0.0001), SRI (p = 0.0023) and NF (p < 0.0001). The remaining conditions (S, R, I, SR) did not result in a significant difference in the magnitude of head rotations compared to the fading condition (table 4), indicating that those strategies might not be as powerful as the others. Interestingly, the SI, RI and SRI strategies were not only significantly better than the F condition, but also not significantly different from the NF condition, which suggests that they are highly effective in counteracting fading and reducing the magnitude of head rotations. The S, R and SR strategies were significantly worse than the NF condition, which confirms their poorer performance, whereas the I strategy did not differ significantly from either the F or the NF condition (p = 0.1184 in both cases). In summary, the most promising strategies to reduce the magnitude of head rotations were SI, RI and SRI.

Table 4. P-values from Dunn's multiple comparison test for the magnitude of head rotations. Significant comparisons (p < 0.05) are marked with an asterisk.

      F         S         R         I         SR        SI        RI        SRI
S     >0.9999
R     >0.9999   >0.9999
I     0.1184    0.9895    >0.9999
SR    >0.9999   >0.9999   >0.9999   0.6441
SI    0.0293*   0.3233    0.6441    >0.9999   0.1981
RI    <0.0001*  0.0005*   0.0016*   >0.9999   0.0003*   >0.9999
SRI   0.0023*   0.0393*   0.0907    >0.9999   0.0218*   >0.9999   >0.9999
NF    <0.0001*  <0.0001*  <0.0001*  0.1184    <0.0001*  0.4093    >0.9999   >0.9999

3.2.2. Spread of head rotations

Next, we quantified the spread of the head rotations. Similar to the magnitude, during trials under normal fading (F), we observed a spread of rotations away from the stimulus in both the vertical and lateral directions (figure 9, black). Modulation strategies reduced the spread by a factor that depended on the strategy (figure 9, red for the RI strategy), and the spread was further reduced in the NF condition (figure 9, blue).

Figure 9. (a) Representative plot of the head rotations along the lateral and vertical axes from one participant under F (black), RI (red) and NF (blue) conditions. The circles in the centre are the mean of the data. The dashed ellipsoids indicate three standard deviations of the data for both axes. While the data is not normally distributed, Chebyshev's inequality guarantees that at least 88.89% of values lie within three standard deviations of the mean along each axis. (b) Ellipsoids represent the average across all participants of the three standard deviations of the data distributions.

The average spread of the head rotations was then calculated for all the conditions to determine which strategy would provide a statistically significant reduction (figure 10).

Figure 10. Quantification of the spread of head rotations per subject (mean ± s.d.). For each condition, each subject performed 30 trials, and data was averaged among trials and subjects. Individual data points are the ten subjects. Lateral and vertical rotations were combined to create a single angle as described in section 2.

The Friedman one-way repeated measure analysis of variance by ranks reported a significant difference among the conditions (p < 0.0001). Similar to the previous analysis, Dunn's multiple comparison test showed that five conditions resulted in a spread of head rotations significantly lower than the fading condition: I (p = 0.0393), SI (p = 0.0016), RI (p < 0.0001), SRI (p < 0.0001) and NF (p < 0.0001). The remaining conditions (S, R, SR) did not result in a significant difference in the spread of head rotations compared to the fading condition (table 5), indicating that those strategies might not be as powerful as the others. Interestingly, the I, SI, RI and SRI strategies were not only significantly better than the F condition, but also not significantly different from the NF condition, which suggests that they are highly effective in counteracting fading and reducing the spread of head rotations. The other strategies (S, R, SR) were significantly worse than the NF condition, which confirms their poorer performance. In summary, the most promising strategies to reduce the spread of head rotations were I, SI, RI and SRI.

Table 5.  P-values from the Dunn's multiple comparison test. Significant comparisons are in bold (p < 0.05).

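| | F | S | R | I | SR | SI | RI | SRI |
|---|---|---|---|---|---|---|---|---|
| S | >0.9999 | | | | | | | |
| R | >0.9999 | >0.9999 | | | | | | |
| I | **=0.0393** | >0.9999 | >0.9999 | | | | | |
| SR | >0.9999 | >0.9999 | >0.9999 | >0.9999 | | | | |
| SI | **=0.0016** | =0.6441 | =0.1536 | >0.9999 | =0.5150 | | | |
| RI | **<0.0001** | **=0.0045** | **=0.0005** | =0.8008 | **=0.0032** | >0.9999 | | |
| SRI | **<0.0001** | =0.0907 | **=0.0161** | >0.9999 | =0.0690 | >0.9999 | >0.9999 | |
| NF | **<0.0001** | **<0.0001** | **<0.0001** | =0.0522 | **<0.0001** | =0.6441 | >0.9999 | >0.9999 |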
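
To illustrate how the Friedman test used above compares conditions within subjects, the following Python sketch ranks the conditions for each subject and computes the Friedman statistic from the rank sums. It is a minimal sketch: the toy values are synthetic and are not data from this study, the function names are illustrative, no tie correction is applied, and the p-value (from a chi-square distribution with k − 1 degrees of freedom) and Dunn's post hoc comparisons are left to a statistics package.

```python
def average_ranks(values):
    """Rank values in ascending order (1-based), averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman statistic for a table with one row per subject and one column per condition."""
    n = len(table)
    k = len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)


# Synthetic spreads (arbitrary units) for 4 subjects under 3 hypothetical conditions.
toy = [
    [10, 6, 5],
    [12, 7, 8],
    [9, 5, 4],
    [11, 8, 6],
]
print(friedman_statistic(toy))  # 6.5
```

Because the test works on within-subject ranks, it does not assume normally distributed data, which suits the skewed head-rotation distributions noted in figure 9.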

3.3. Evaluation of the learning effect

Last, we asked whether there was any learning effect during the task. Participants completed six sessions in total, spread over at most two weeks, to see what effect practice would have on performance. The dataset was then separated by session (figure 11). For the time to complete the task, we observed a marked learning effect, shown by a performance improvement after each session (figure 11(a)). When only the last sessions for each condition are compared, all conditions approach a qualitatively similar performance that is not significantly different from the F condition (p > 0.9999 for conditions S, R, I, SR, SI, RI and SRI; Kruskal–Wallis with Dunn's multiple comparison test), except for the NF condition, which remains significantly better than the F condition (p = 0.0006; Kruskal–Wallis with Dunn's multiple comparison test).

Figure 11. Quantification (mean ± s.d.) of the learning effect on the time taken to complete the task (a), the magnitude of head rotations (b) and the spread of head rotations (c). For each condition, each subject performed 30 trials, and data was averaged among trials and subjects.

This result indicates that learning alone helps subjects to reduce the time to complete the task, independently of the use of fading compensation strategies. However, the situation is remarkably different for head rotations. Learning is not observed for either the magnitude (figure 11(b)) or the spread (figure 11(c)) of head rotations. For the magnitude, considering only the last session for each condition, the RI strategy is significantly better than the F condition (p = 0.0049; Kruskal–Wallis with Dunn's multiple comparison test), as is NF (p < 0.0001; Kruskal–Wallis with Dunn's multiple comparison test); also, RI and NF showed a statistically similar magnitude of head rotations (p > 0.9999; Kruskal–Wallis with Dunn's multiple comparison test). A similar result was found for the spread of head rotations. When considering only the last session for each condition, the RI strategy is significantly better than the F condition (p = 0.0084; Kruskal–Wallis with Dunn's multiple comparison test), as is NF (p < 0.0001; Kruskal–Wallis with Dunn's multiple comparison test); also, RI and NF showed a statistically similar spread of head rotations (p > 0.9999; Kruskal–Wallis with Dunn's multiple comparison test).

In summary, this data indicates that practice improves the time to complete the task, but only modulation strategies (in particular the RI strategy) can significantly reduce the burden associated with active head scanning.

4. Discussion

The perceptual fading of phosphenes limits the temporal resolution of artificial vision. The most naturalistic way to overcome fading is to exploit spontaneous ocular micromovements; however, this is possible only for high-resolution implants that are activated by light, and only when physiological micromovements are preserved. Patients implanted with the PRIMA device experienced flicker fusion [16], as these three conditions were met. However, this ideal situation might not be reached, depending on the implant type and the patient's condition. The clinical trial of Alpha IMS showed that the usefulness of natural eye movements for artificial vision varied among patients. The patients who could make the most of their implant and perceive complex objects, like grating patterns or letters, almost continuously were those with the best-preserved eye movements. Eye tracking showed the presence of square waves, small-amplitude drifts and microsaccades during phosphene perception in those patients [42], shifting the input image by one to three electrodes [43]. However, severe visual impairment is frequently associated with aberrant oculomotor functions, such as nystagmus and reduced saccade amplitude [44], which ultimately affect prosthetic vision. Therefore, other patients could not experience stable perception during retinal stimulation. Lastly, artificial vision in patients implanted with devices that cannot exploit eye movements (even if preserved) will always be affected by perceptual fading.

As such, there is wide interest in developing compensatory strategies to reduce, or ideally avoid, fading in artificial vision. It was previously shown that the randomisation of the pulse duration lengthens the response persistence in RGCs [45] and electrically evoked potentials in V1 [41]. Capitalising on these results, we proposed additional strategies to compensate for fading, such as interrupted train sequences and alternation between neighbouring electrodes to mimic artificial saccades [7], and showed that these strategies and their combination could lengthen the response of RGCs to electrical stimulation in explanted blind retinas.

In this article, we tested in SPV the effect of these strategies and their combination on reducing the voluntary head rotation imposed by the fading of artificial vision. The results showed that naturalistic modulation strategies help to reduce the magnitude and spread of head rotation during artificial vision. The use of an interrupted sequence appeared to be the most efficient strategy, in particular when combined with the randomisation of the stimulus duration. The hypothesis underlying the efficacy of the interrupted sequence is that it skips redundant stimuli that do not elicit a retinal response because of desensitisation. In explanted retinas, for a 5 Hz repetition rate, only the first three stimuli can elicit an RGC response [7]. Therefore, the next two stimuli can be skipped, which allows time for excitability to recover.
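
As a minimal sketch of the interrupted sequence described above, the following Python snippet builds the pulse onset times for a 5 Hz train in which three pulses are delivered and the next two are skipped. The function name, its parameters and the assumption that the pattern repeats every five pulse slots are illustrative; they are not taken from the implementation used in this study.

```python
def interrupted_train(duration_s, rate_hz=5.0, n_on=3, n_off=2):
    """Return pulse onset times (s) for a train that delivers n_on pulses, then skips n_off."""
    period_s = 1.0 / rate_hz
    n_slots = int(round(duration_s * rate_hz))  # pulse slots in a continuous train
    cycle = n_on + n_off
    # Keep a slot only if it falls in the delivered part of its cycle.
    return [round(i * period_s, 6) for i in range(n_slots) if i % cycle < n_on]


print(interrupted_train(2.0))  # [0.0, 0.2, 0.4, 1.0, 1.2, 1.4]
```

With these assumed defaults, each 1 s cycle contains 0.6 s of stimulation followed by a pause of two pulse slots, which corresponds to the recovery time mentioned in the text.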

In contrast to the previous study, in which both temporal and spatial modulations were relevant [7], in SPV the implementation of a spatial modulation strategy (S) did not contribute strongly to reducing head rotations. The stimulation of retinal explants was limited to the alternation of two neighbouring POLYRETINA electrodes. Therefore, in SPV the strategy was implemented by having the entire image oscillate horizontally by the pitch of one phosphene once every second. This choice was dictated by the nature of our previous experimental results, from which the time constants were derived. It cannot be excluded that a more complex naturalistic model of ocular micromovements, involving random oscillations of the image in multiple directions, might provide better results.
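
The spatial modulation described above can be sketched as a time-dependent horizontal offset applied to the whole image. The snippet below assumes one possible reading of the description, namely that the image alternates between two positions separated by one phosphene pitch and switches once per second. The function name, the `pitch_deg` parameter and its example value are hypothetical and are not taken from the study.

```python
def horizontal_offset(t_s, pitch_deg, switch_period_s=1.0):
    """Horizontal image offset (deg) at time t_s for a two-position oscillation."""
    # Even-numbered periods keep the image at rest; odd-numbered periods shift it by one pitch.
    shifted = int(t_s // switch_period_s) % 2 == 1
    return pitch_deg if shifted else 0.0


# Example with a hypothetical phosphene pitch of 1.0 degree.
for t in (0.5, 1.5, 2.5):
    print(t, horizontal_offset(t, pitch_deg=1.0))
```

A more naturalistic model, as suggested in the text, would replace this fixed two-position pattern with random offsets in multiple directions.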

Finally, we evaluated the possible effect of learning on our results. Although we found an improvement in the time to complete the task, we did not observe a learning effect in head rotation. This result implies that active scanning due to fading is not expected to improve significantly with experience. Instead, an RI stimulation protocol can reduce the burden of active scanning by limiting the magnitude and spread of head rotations.

In conclusion, this study shows the relevance of stimulation strategies that counteract perceptual fading for reducing active head scanning during prosthetic vision. Also, while this study was specifically designed for retinal implants, the results and their implications could also be translated to other visual implants, such as optic nerve or cortical devices [46, 47]. A question that remains to be answered is what maximum head rotation is acceptable if retinal prostheses are to be more comfortable for patients. Does any reduction represent an improvement for users, or is there a threshold value that must be reached to obtain a meaningful improvement? Future studies will be needed to assess this parameter and provide specifications for new visual prostheses.

Acknowledgments

This project has received funding from École Polytechnique Fédérale de Lausanne, Medtronic plc, the foundation E et G Gelbert, and the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie Grant Agreement No. 861423.

Data availability statement

The data that support the findings of this study are available upon reasonable request from the authors.

Author contributions

J T T designed the experiment, wrote the code and ran the tests. N A L C designed the experiment. S H ran the tests. M C ran the tests. D G designed and led the study, and wrote the manuscript. All the authors read and accepted the manuscript.

Conflict of interest

The authors declare no competing financial interest.
