Firefox Nightly introduces Alt text generation for enhanced web accessibility

This could be beneficial for people with screen readers


Mozilla is testing an experimental feature in Firefox Nightly that will be added to the browser's PDF editor to improve web accessibility for all users.

The feature automatically generates alternative text for images using AI models that run privately on the device. It is slated for inclusion in Firefox 130 and could help people who use screen readers understand images better, making for a more inclusive browsing experience.

The importance of Alt Text

Alt text is an important part of web accessibility. It provides textual descriptions of images, enabling individuals using assistive technologies such as screen readers to comprehend visual content.

Despite its importance, many web pages lack alt text, which makes their images inaccessible to visually impaired users. According to the Web Almanac's 2022 report, nearly 50% of images on the web don't have alt text.

Addressing the issue

To address this problem, Mozilla uses a Transformer-based machine learning model to describe image content. In a recent blog post on Mozilla Hacks, Tarek Ziadé said:

These models are getting good at describing the contents of the image, yet are compact enough to operate on devices with limited resources. While they can't outperform a large language model like GPT-4 Turbo with Vision, or LLaVA, they are sufficiently accurate to provide valuable insights on-device across a diversity of hardware.

Model architectures like BLIP or even ViT that were trained on datasets like COCO (Common Objects in Context) or Flickr30k are good at identifying objects in an image. When combined with a text decoder like OpenAI's GPT-2, they can produce alternative text with 200M or fewer parameters. Once quantized, these models can be under 200MB on disk, and run in a couple of seconds on a laptop – a big reduction compared to the gigabytes and resources an LLM requires.
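To put the figures in that quote in perspective, here is a rough back-of-the-envelope sketch of how parameter count translates into disk size. The quote does not say which precision the models are quantized to, so the 32-bit baseline and the 8-bit quantized format below are assumptions, and the helper name is made up for illustration. The estimate also counts weights only and ignores any file overhead.

```python
def model_size_mb(num_params: int, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in megabytes (10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1_000_000


PARAMS = 200_000_000  # upper bound from the quote: "200M or fewer parameters"

# Assumption: weights are stored as 32-bit floats before quantization
# and as 8-bit integers afterwards. The source does not state either.
fp32_mb = model_size_mb(PARAMS, 32)
int8_mb = model_size_mb(PARAMS, 8)

print(f"float32: {fp32_mb:.0f} MB, int8: {int8_mb:.0f} MB")
# prints "float32: 800 MB, int8: 200 MB"
```

Under those assumptions, a model with fewer than 200M parameters would land below the 200MB mark mentioned in the quote.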

Enhancing performance and integration

The experimental feature is integrated into Firefox Nightly's PDF editor, an important first step towards a broader rollout in general browsing that would improve accessibility for all web users.

By harnessing the capabilities of small open-source models, Mozilla ensures privacy, resource efficiency, and increased transparency. These models work entirely within the device, so users’ data is not transmitted to external servers, and their resource efficiency reduces the environmental impact.

To build the feature, Mozilla extended the inference architecture that already powers Firefox Translations so that it can also handle alt text generation. The ONNX runtime and the Transformers.js library run the model inside the browser, and model caching is optimized for better performance.

What’s in the future?

Mozilla aims to reduce biases and improve alt text accuracy by leveraging ViT (Vision Transformer) + DistilGPT-2 architecture and refining training datasets.

Tarek Ziadé also highlighted that Firefox can add an image to a PDF using PDF.js, its popular open-source PDF library.

A screenshot of the PDF.js alt text modal window

In Firefox 130, PDF.js will automatically generate alt text for images added to PDFs and let users validate it.

Whenever an image is added, the editor gets an array of pixels and passes it to the ML engine. After a few seconds, the engine returns a string describing the image.

The first time a user adds an image, there might be a delay while the model downloads. After that, the process is faster because the model is stored locally.
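The flow described above (pixels in, description out, with a one-time cost to fetch the model) can be illustrated with a small sketch. This is not Firefox's or PDF.js's actual code: the `AltTextEngine` class, the stub model, and the RGBA pixel layout are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Optional, Sequence

# A "model" here is any callable mapping (pixels, width, height) to a caption.
Model = Callable[[Sequence[int], int, int], str]


class AltTextEngine:
    """Hypothetical wrapper: loads the model on first use, then reuses it."""

    def __init__(self, load_model: Callable[[], Model]) -> None:
        self._load_model = load_model
        self._model: Optional[Model] = None
        self.load_count = 0  # how many times the slow load step ran

    def describe(self, pixels: Sequence[int], width: int, height: int) -> str:
        # Assumption: RGBA layout, four values per pixel.
        if len(pixels) != width * height * 4:
            raise ValueError("pixel array does not match width * height * 4")
        if self._model is None:
            # First image only: stands in for downloading and storing the model.
            self._model = self._load_model()
            self.load_count += 1
        return self._model(pixels, width, height)


def load_stub_model() -> Model:
    """Stand-in for a real captioning model; returns a placeholder caption."""
    def caption(pixels: Sequence[int], width: int, height: int) -> str:
        return f"An image of {width} by {height} pixels"
    return caption


engine = AltTextEngine(load_stub_model)
pixels = [0] * (2 * 2 * 4)               # a 2x2 RGBA image, all zeros
first = engine.describe(pixels, 2, 2)    # triggers the one-time load
second = engine.describe(pixels, 2, 2)   # reuses the already loaded model
print(first, engine.load_count)
```

In this sketch only the first call pays the loading cost, which mirrors why the real feature is expected to feel slower on first use and quicker afterwards.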

In the future, Mozilla aims to provide alt text for any image in PDFs except images with just text.

Mozilla also plans to keep improving the alt text generator with input and collaboration from the community. Once it works well in PDF.js, Mozilla hopes to make the feature available in general browsing for users with screen readers.

What do you think about this feature? Share your views with our readers in the comments section below.
