
Exploring AI-generated audio description: can emerging technologies help expand access to broadcast media?

As part of our long-standing commitment to improving media accessibility, RNIB commissioned a research study to evaluate the feasibility of using artificial intelligence (AI) to generate audio description (AD) for television content. The aim: to understand whether today’s advanced AI models—particularly those combining video scene analysis and language generation—could help expand AD provision across genres, formats, and platforms.

Despite regulatory requirements for UK broadcasters to provide AD for at least 10% of their output—and a voluntary commitment by the BBC, ITV, Channel 4, and Sky to reach 20%—large swathes of content remain inaccessible to blind and partially sighted audiences. The gap in content accessibility is likely to widen even further as video-sharing platforms such as YouTube—now dominant among certain demographics, according to the 2025 Ofcom Media Nations Report—and social media platforms like TikTok and Instagram continue to expand their reach. As audiences shift, so too does the demand for accessible content across new formats. At the same time, the need for AD already outpaces the capacity of traditional, manual production workflows. While some have suggested that AI could help deliver AD more widely and efficiently, RNIB believes that robust, evidence-based testing is essential before any such tools are adopted in real-world settings.

This study focused on factual programming—low-risk genres with relatively straightforward visual storytelling. It benchmarked three AI models against professionally written AD across over 160 short video segments drawn from BBC series including Gardeners’ World, DIY SOS, and Bargain Hunt. The evaluation included linguistic analysis, user feedback from blind and partially sighted audiences, and informal consultations with professional audio describers.

Key findings

The results highlight both progress and limitations. While AI-generated descriptions often exceeded moderate expectations for fluency and sentence structure, they frequently lacked accuracy, cohesion, and contextual awareness. Misidentified characters and vague object references were common. Though some AI outputs offered more visual detail than human-authored AD, this often resulted in over-description, unsuitable for the fixed time slots available in broadcast.

Notably, blind and partially sighted participants welcomed the idea of AI-generated AD if it meant expanding coverage. But all agreed that human review remains essential. Audio describers echoed this view, emphasising the importance of editorial judgement, emotional nuance, and audience understanding—areas where current AI tools still fall short.

What comes next?

Rather than viewing AI as a standalone solution, the report recommends exploring hybrid workflows—when the models are sufficiently developed—that complement rather than replace professional describers. Areas of potential development include:

  • Drafting scripts for legacy, low-risk content
  • Suggesting object or character descriptions for human refinement
  • Supporting consistency checks across episodes
  • Optimising output through genre-specific prompt engineering

The report also identifies several priorities for future research and development:

  • Fine-tuning models with AD-specific training data
  • Developing robust evaluation frameworks
  • Testing scalability to long-form content
  • Integrating additional inputs like scripts, subtitles, and audio cues for richer context

RNIB’s position

A core principle for RNIB is that blind and partially sighted audiences should have access to the full spectrum of media content—across platforms, formats, and genres. This research shows that AI has the potential to broaden the reach of audio description in the future, but more work is needed before it can be reliably put into practice.

The study was commissioned by RNIB and carried out by the University of Surrey, with advisory input from both RNIB and the BBC.

The full report is available: