The revolutionary power of the internet is hard to quantify, but for some people, accessing even its basic features is difficult. Blind and low-vision people often interact with the web using screen readers, tools that read the contents of a page out loud.
Unfortunately, most screen readers rely on alternative text (or alt text) and other behind-the-scenes information to do their job properly, and developers sometimes overlook these small but significant details.
To help, Microsoft has announced new features for its Edge browser that automatically add alt text to images that lack it, something the company hopes will make the web more accessible.
“When a screen reader finds an image without a label, that image can be automatically processed by machine learning (ML) algorithms to describe the image in words and capture any text it contains,” says Microsoft’s Travis Leithead. “The algorithms are not perfect, and the quality of the descriptions will vary, but for users of screen readers, having some description for an image is often better than no context at all.”
The update aims to solve the problem of screen readers announcing “unlabelled graphic” whenever an image has no alt text.
Microsoft Edge accessibility
Powering the technology is, of course, the company’s Azure cloud platform. Microsoft Edge sends any unlabelled image to Azure’s computer vision systems, which then auto-generate alt text in English, Spanish, Japanese, Portuguese, or Simplified Chinese.
Some images won’t be sent for analysis: those smaller than 50x50 pixels, very large images, and images containing pornographic, gory, or sexually suggestive content.
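To make those exclusion rules concrete, here is a minimal Python sketch of how such a filter might look. It is not Edge's actual implementation: the `Image` class, the `should_send_for_description` function, and the content flag are hypothetical names, and the upper size limit is an invented placeholder because Microsoft's announcement, as reported here, only says "very large". Treating an image as too small when either side is under 50 pixels is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative thresholds only; not taken from Edge's source code.
MIN_SIDE_PX = 50    # the article says images smaller than 50x50 pixels are skipped
MAX_SIDE_PX = 4096  # hypothetical: the article gives no number for "very large"


@dataclass
class Image:
    """Hypothetical stand-in for the facts a browser knows about an image."""
    width: int
    height: int
    alt_text: Optional[str] = None
    flagged_adult_or_gory: bool = False  # assumed result of a separate content check


def should_send_for_description(image: Image) -> bool:
    """Return True if the image would qualify for an automatic description."""
    # Images that already carry a non-empty label are left alone.
    if image.alt_text is not None and image.alt_text.strip():
        return False
    # Assumption: an image is "too small" if either side is under the minimum.
    if image.width < MIN_SIDE_PX or image.height < MIN_SIDE_PX:
        return False
    # Assumption: an image is "very large" if either side exceeds the placeholder limit.
    if image.width > MAX_SIDE_PX or image.height > MAX_SIDE_PX:
        return False
    # Pornographic, gory, or sexually suggestive content is not analysed.
    if image.flagged_adult_or_gory:
        return False
    return True
```

Under these assumptions, `should_send_for_description(Image(width=200, height=100))` would qualify, while a 40-pixel-wide icon or an image that already has alt text would not.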
Microsoft is rolling out the changes to Edge on Windows, macOS, and Linux right now and expects to add them to iOS and Android at a later date.
“This feature is still new, and we know that we are not done,” says Leithead. “We’ve already found some ways to make this feature even better, such as when images do have a label, but that label is not very helpful (e.g., a label of ‘image’ or ‘picture’ or similar synonym that is redundant with the obvious). Continuous image recognition and algorithm improvements will also refine the quality of the service.”
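The unhelpful-label case Leithead describes could be detected with a simple check like the Python sketch below. This is a guess at the general idea, not Microsoft's method: the function name `is_unhelpful_label` is hypothetical, and the word list contains only the two examples from the quote.

```python
# Hypothetical list: the quote names only "image" and "picture" as examples.
GENERIC_LABELS = {"image", "picture"}


def is_unhelpful_label(label: str) -> bool:
    """Return True if the label merely restates that the element is an image."""
    # Normalise case, surrounding whitespace, and a trailing full stop.
    normalized = label.strip().lower().rstrip(".")
    return normalized in GENERIC_LABELS
```

With this check, a label such as " Image " would be flagged, while "A picture of a dog" would pass because it describes the content.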