Abstract
The goal of the See ColOr project is to build a noninvasive mobility aid for blind users that
uses the auditory pathway to represent frontal image scenes in real time. We present and discuss here
two image processing methods that were tested in this work: image simplification by means of
segmentation, and guiding the focus of attention through the computation of visual saliency. A mean shift
segmentation technique gave the best results, but to meet real-time constraints we simply implemented an
image quantisation method based on the HSL colour system. Specifically, we have developed two
prototypes which transform HSL coloured pixels into spatialised classical instrument sounds lasting for
300 ms. Hue is sonified by the timbre of a musical instrument, saturation by one of four possible notes, and
luminosity is represented by bass when it is rather dark and by a singing voice when it is relatively
bright. The first prototype is devoted to static images on the computer screen, while the second is
built on a stereoscopic camera which estimates depth by triangulation. In the audio encoding, distance
to objects was quantised into four duration levels. Six participants with their eyes covered by a dark
cloth were trained to associate colours with musical instruments and then asked to identify, in several
pictures, objects with specific shapes and colours. To simplify the experimental protocol, we
used a tactile tablet, which took the place of the camera. Overall, colour was helpful for the interpretation
of image scenes. Moreover, preliminary results with the second prototype on the recognition of
coloured balloons were very encouraging. Image processing techniques such as saliency could in future
speed up the interpretation of sonified image scenes.
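
To make the colour-to-sound mapping described above concrete, here is a minimal Python sketch of one way an HSL-quantised pixel could be turned into a symbolic sound description. It is an illustration under stated assumptions, not the prototypes' actual implementation: the instrument list, the number of hue bins, the equal-width binning, the 0.5 luminosity threshold, and the names `INSTRUMENTS`, `quantise` and `sonify_pixel` are all hypothetical. Only the four saturation notes, the bass/singing-voice split and the 300 ms duration come from the text. Depth encoding and spatialisation are left out.

```python
import colorsys

# Hypothetical lookup table: the abstract does not list the instruments
# or the number of hue bins, so both are assumptions for illustration.
INSTRUMENTS = ["oboe", "viola", "violin", "flute", "trumpet", "piano", "saxophone"]
NUM_NOTES = 4            # the abstract states four possible notes for saturation
DARK_THRESHOLD = 0.5     # assumed cut between "rather dark" and "relatively bright"
SOUND_DURATION_MS = 300  # sound duration stated in the abstract


def quantise(value, levels):
    """Map a value in [0, 1] to an integer bin in [0, levels - 1]."""
    return min(int(value * levels), levels - 1)


def sonify_pixel(r, g, b):
    """Return a symbolic sound description for an 8-bit RGB pixel."""
    # colorsys.rgb_to_hls returns (hue, lightness, saturation), each in [0, 1].
    h, l, s = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
    return {
        "instrument": INSTRUMENTS[quantise(h, len(INSTRUMENTS))],
        "note_index": quantise(s, NUM_NOTES),
        "luminosity_voice": "bass" if l < DARK_THRESHOLD else "singing voice",
        "duration_ms": SOUND_DURATION_MS,
    }
```

Calling `sonify_pixel(100, 0, 0)` on a dark red pixel, for example, selects the first hue bin, the highest saturation note and the bass. Equal-width bins are the simplest choice here; a real system might prefer perceptually tuned hue boundaries.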