The University of Michigan has developed two new tools designed for visually impaired users. The tools aim to enhance accessibility, with features for reading control-panel labels and for identifying objects in images through touch and audio feedback.
One of the tools, called VizLens, acts as a screen reader for the physical world: users point their smartphone camera at a control panel, touch a button of interest, and the app reads its label aloud. This helps them understand and interact with interfaces in their daily environments, such as home appliances and public kiosks.
According to Anhong Guo, an assistant professor of computer science and engineering at the University of Michigan who led the development of the apps, VizLens captures an image of the interface and uses optical character recognition to detect the text labels, letting users explore the layout on their smartphone screens.
As users move their fingers across the physical control panel, the app audibly announces the button under their finger. Tools like these give visually impaired individuals greater independence and accessibility in their everyday lives.
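The lookup the article describes can be sketched roughly as follows: OCR assigns each detected label a bounding box, and the app announces whichever label's box contains the user's fingertip. The labels and coordinates below are invented for illustration and are not the app's actual data or API.

```python
from typing import NamedTuple, Optional

class Label(NamedTuple):
    text: str
    x0: int; y0: int; x1: int; y1: int  # bounding-box corners

def label_under_finger(labels: list[Label], fx: int, fy: int) -> Optional[str]:
    """Return the text of the label whose box contains the fingertip."""
    for lab in labels:
        if lab.x0 <= fx <= lab.x1 and lab.y0 <= fy <= lab.y1:
            return lab.text
    return None  # finger is not over any recognised label

# Hypothetical OCR output for a microwave control panel
panel = [
    Label("Start", 10, 10, 60, 30),
    Label("Stop", 10, 40, 60, 60),
    Label("Defrost", 10, 70, 80, 90),
]

print(label_under_finger(panel, 30, 50))  # fingertip over "Stop"
```

In the real app the fingertip position would come from camera-based finger tracking; here it is passed in directly to keep the sketch self-contained.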
Furthermore, the University of Michigan’s development team, led by Guo, has created a second app called ImageExplorer, which aims to improve the comprehension of images for visually impaired individuals.
To achieve this, the team integrated advanced object detection and segmentation models. The app lets visually impaired users explore the content of images and understand the relationships between the objects within them.
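One way relationships between detected objects can be derived is purely geometrically, from their bounding boxes. The sketch below is illustrative only, not ImageExplorer's actual code; the object names and coordinates are hypothetical.

```python
def centre(box):
    """Centre point of a (x0, y0, x1, y1) bounding box."""
    x0, y0, x1, y1 = box
    return ((x0 + x1) / 2, (y0 + y1) / 2)

def relate(name_a, box_a, name_b, box_b):
    """Describe where object A sits relative to object B."""
    (ax, ay), (bx, by) = centre(box_a), centre(box_b)
    horiz = "left of" if ax < bx else "right of"
    vert = "above" if ay < by else "below"
    # Use the axis with the larger separation for a single spoken phrase.
    rel = horiz if abs(ax - bx) >= abs(ay - by) else vert
    return f"{name_a} is {rel} {name_b}"

# Hypothetical detections: (label, (x0, y0, x1, y1))
detections = [("person", (50, 20, 150, 220)), ("dog", (180, 120, 260, 220))]

print(relate(*detections[0], *detections[1]))  # "person is left of dog"
```

Picking one axis per relation keeps the spoken output short, which matters for an audio interface.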
Many automated captioning programs exist to help visually impaired people comprehend images, but they frequently contain errors, and users cannot identify or correct those errors because they cannot see the pictures themselves. Guo's team therefore aimed to integrate several AI tools into a more detailed and interactive image exploration experience.
When an image is uploaded to ImageExplorer, the app comprehensively analyses its content. It gives users an overall image description, including detected objects, relevant tags, and a caption.
Additionally, the app incorporates a touch-based interface, enabling users to explore the image’s spatial layout and content by pointing to different areas.
ImageExplorer stands out for the level of detail it provides. It gives a comprehensive description of the objects within an image, including specifics such as the clothing a person is wearing, what they are doing, and where each object sits in the picture.
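The touch interface described above can be sketched as a lookup from a screen coordinate to a rich object description. Everything below, including the object names, attribute strings, and coordinates, is hypothetical and for illustration only.

```python
def area(box):
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)

def describe_at(objs, x, y):
    """Speak-ready description of the object under the touch point."""
    hits = [o for o in objs
            if o["box"][0] <= x <= o["box"][2]
            and o["box"][1] <= y <= o["box"][3]]
    if not hits:
        return "no object here"
    # When boxes overlap, announce the smallest (most specific) object.
    obj = min(hits, key=lambda o: area(o["box"]))
    return f"{obj['name']}: {obj['details']}"

# Hypothetical detections with attribute-level detail
objects = [
    {"name": "person", "box": (40, 10, 160, 230),
     "details": "wearing a red jacket, walking"},
    {"name": "backpack", "box": (60, 60, 120, 140),
     "details": "carried on the person's back"},
]

print(describe_at(objects, 90, 100))  # touch lands on the backpack
```

Preferring the smallest containing box is one plausible design choice for handling nested detections, such as an item worn by a person; the real app may resolve overlaps differently.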
According to Guo, ImageExplorer is a valuable tool for visually impaired users to understand image content despite their inability to see. The development team has received feedback from hundreds of visually impaired individuals who participated in user testing for both VizLens and ImageExplorer.
Based on this feedback, the team continues to refine and improve the tools. ImageExplorer is a newer concept than VizLens, which was introduced academically in 2016, and certain aspects of it still need work. For instance, the tool sometimes gives over-simplified descriptions such as "shirt" for various kinds of tops, and the descriptions produced by its different features occasionally disagree with one another.
Guo acknowledges that the accuracy of ImageExplorer relies on the models used, and as these models improve, so will the tool’s performance. Despite the existing errors, the results presented in 2022 demonstrate that ImageExplorer enables users to make more informed judgments regarding the accuracy of AI-generated captions.
Looking ahead, Guo is eager to receive feedback from the public deployment of these tools. This feedback will help the team observe how people utilise the tools and make necessary adaptations to better suit their needs and daily lives.