Voice assistants are taking over. By 2023, the number of voice assistants in use worldwide is expected to reach 8 billion. And with one survey finding that nearly 70 percent of Americans use their voice assistant every day, it’s no surprise that brands have been working around the clock to find ways to use the technology to increase user engagement.
As one of the most widely used voice assistants, Amazon’s Alexa has introduced a number of new features over the last few years to help brands customize the experience for their consumers. But its newest feature, Brand Voice, could be a game-changer for brand teams.
Brand Voice allows brands to change Alexa’s voice in custom voice skills to better fit the brand’s persona. For example, Kentucky Fried Chicken Canada is using the new feature to design Alexa skills that speak in a voice very similar to that of Colonel Sanders. In another case, the National Australia Bank is using an Australian-English accent to sound more like its local customers. Both voices were built on Amazon Polly, the Amazon Web Services (AWS) service that turns text into human-like speech.
“We extended Amazon Polly, an artificial intelligence cloud service which creates life-like speech, to incorporate a spicy Southern accent and speech patterns that are consistent with the world-famous persona of Colonel Sanders. We think KFC customers will agree, the Colonel never sounded so finger lickin’ good,” said Matt Wood, AWS vice president of artificial intelligence, in a statement.
While a number of other text-to-speech customization services already exist, Amazon’s Brand Voice goes a bit further by making professional engineers a key part of the voice design process. The engineers work with the brand to develop the right sound after several recording sessions with a spokesperson whose voice the brand would like to use for its text-to-speech voice skills. Deep learning algorithms then teach Alexa how to speak like the spokesperson. With this method, rather than playing back pre-recorded responses, the tech can use the new voice to respond with in-the-moment, customized words and phrases, all driven by how the consumer uses the voice skill. The result is a more natural-sounding exchange that makes users feel like they’re genuinely talking with another human and not a device.
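For readers curious what "in-the-moment" responses might look like in practice, here is a minimal Python sketch. It assumes a skill returns the standard Alexa JSON response with SSML output speech and selects a voice with the SSML voice tag. The voice name ExampleBrandVoice and the helper functions are hypothetical placeholders for illustration, not names from Amazon or from any brand mentioned here.

```python
import json
from xml.sax.saxutils import escape

# Hypothetical placeholder: a real brand voice name would be supplied by Amazon.
BRAND_VOICE_NAME = "ExampleBrandVoice"


def build_response(text: str, voice_name: str = BRAND_VOICE_NAME) -> dict:
    """Wrap dynamically generated text in SSML that requests a specific voice."""
    # Escape &, < and > so user-supplied text cannot break the SSML markup.
    ssml = f'<speak><voice name="{voice_name}">{escape(text)}</voice></speak>'
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": True,
        },
    }


def order_status_reply(customer_name: str, minutes: int) -> dict:
    """Hypothetical skill handler: the sentence is assembled at request time."""
    text = f"Hi {customer_name}, your order will be ready in {minutes} minutes."
    return build_response(text)


if __name__ == "__main__":
    print(json.dumps(order_status_reply("Sam", 12), indent=2))
```

The point of the sketch is that nothing is pre-recorded: the sentence is built from whatever the customer asked, and the text-to-speech voice renders it on the fly.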