

The Argument for Reaching Non-English-Speaking Audiences with Content

How voice cloning technology is changing content creation

SEAN KING SVP, GM Commercial Enterprise, Veritone


  • Content has predominantly been created in English across channels, leaving the majority of the world's audiences untapped for English-only content creators
  • Two barriers have held content creators back: the investment required to translate, and the loss of the program's voice when the talent doesn't speak multiple languages
  • Synthetic voice solutions can minimize investment while maintaining the brand's voice talent even if they don't speak the target language

Technology has become the bridge connecting the world, further advancing globalization. Content, however, has not kept pace. For decades, content in the media & entertainment space has primarily been produced in English. With Hollywood as the center of entertainment, that’s not a surprise. Dubbing solutions, while doing incredible work to extend reach, have fallen short of creating an authentic experience for audiences around the world.

However, as the world becomes more connected, content creators need to consider reaching new audiences. Considering that only 16.5% of the world’s population speaks English, the opportunity to scale and reach new audiences is considerable. But is it worth it?

Surprising stats that illustrate the opportunity

With 83.5% of the world’s population left untapped after accounting for English, the opportunity is immense. If you add Chinese and Hindi, that’s an additional 23% of the world’s population you could potentially reach. Add Spanish, that’s another 7%. With just those languages, including English, your content can reach nearly half of the world’s population.
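The arithmetic behind these figures can be checked with a short sketch. It takes the percentages quoted above at face value and assumes the language groups do not overlap, which the article's own addition implies. The names `language_share` and `cumulative_reach` are illustrative only.

```python
# Shares of world population by language, in percent, as quoted in the article.
# Assumption: the groups do not overlap, so the shares can simply be added.
language_share = {
    "English": 16.5,
    "Chinese and Hindi": 23.0,
    "Spanish": 7.0,
}


def cumulative_reach(shares):
    """Return (language, running total in percent) for each language added in order."""
    total = 0.0
    steps = []
    for language, share in shares.items():
        total += share
        steps.append((language, total))
    return steps


# Share of the world left untapped by English-only content.
untapped = 100.0 - language_share["English"]
print(f"Untapped after English: {untapped:.1f}%")

for language, reach in cumulative_reach(language_share):
    print(f"+ {language}: {reach:.1f}% of world population")
```

Under these assumptions the running total ends at 46.5%, which is the "nearly half" cited above.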

Of course, the challenge is validating that these audiences care to engage with your content. If you produce audio content, it’s hard to test a new audience without sacrificing quality unless you put budget towards translation and hire a native speaker. For small productions, that’s a gamble with budget that could otherwise go toward growing an already validated audience. Synthetic voice technology changes that.

Testing new audiences without sacrificing content quality

Recently, Veritone partnered with Bryan Barletta, the voice of Sounds Profitable, a weekly newsletter that covers the technical aspects of podcast advertising in layman’s terms. Recognizing the growing Latinx segment, Bryan wanted to expand the newsletter content and the audio narration to Spanish. But here’s the problem—Bryan doesn’t speak Spanish.

Why is that an issue? In any audio production, especially in the podcasting world, the show’s voice is part of the brand. Typically, when you translate and localize the content, you lose the unique voice that fans expect to hear. Before synthetic voice technology came around, you had to give up your voice talent for an actor who speaks the language fluently, creating a significant disconnect and an inauthentic experience for subscribers.

Now, content creators don’t have to worry about that. In Bryan’s case, he worked with Veritone, using Veritone Voice and the expertise of our managed service team, to build a voice model that can speak Spanish while maintaining the unique qualities of his voice.

Creating Bryan’s voice clone

Since Bryan already had hours of audio readily available, the Veritone team could use that as the training data to generate his voice clone. If this audio content were not available, we would use a studio to capture audio of him speaking various phrases. This ensures that we capture all the unique aspects of his voice.

Selecting the best-of-breed translation engine from the outset helps us minimize human intervention in the translation process. Rather than Bryan hiring a translator and a voice actor who speaks the language, this gives him a more cost-effective way to test the waters with his target Spanish-speaking audience—and in his voice!

Initial results and the future of content creation

Within 30 days of publishing the Spanish version of the Sounds Profitable newsletter with Bryan’s custom voice, the Spanish narrated version saw a spike in subscriptions, outpacing the English version by 16%. While it’s still in the early stages, the result is helping Bryan determine whether to commit to a fully localized version of his newsletter.

Sounds Profitable is just one of many use cases Veritone has uncovered with the emerging synthetic voice technology. Since launching Veritone Voice in May of this year, Veritone has helped voice talent scale their opportunities and find new monetization channels as well as audiences. This capability will become more critical as the world continues to shrink and technologies create new avenues that accelerate cross-region commerce and consumption.

Editorial originally published for VOICE21
