Amazon is trying to make Alexa simpler and more intuitive for users with a new large language model (LLM). During its annual hardware event Wednesday, Amazon demoed the generative-AI-powered Alexa that users will soon be able to preview on Echo devices. But in all its talk of new features and a generative-AI-fueled future, Amazon barely acknowledged the longstanding elephant in the room: privacy.
Amazon's devices event featured a new Echo Show 8, updated Ring devices, and new Fire TV sticks. But the most interesting part was the look it offered at how the company is trying to navigate generative AI hype and the uncertainty around the future of voice assistants. Amazon said users will be able to start previewing Alexa's new features on any Echo device, including the original, in a few weeks.
Alexa's added features are enabled by a new LLM that Amazon says was fine-tuned for voice conversations and that uses algorithms to recognize body language and intonation. The company made clear that Alexa will focus on generative AI going forward. But the new features are in their early stages, Amazon noted, so bumps, bugs, and errors should be expected at first.
“Alexa, can I finally stop saying Alexa?”
One development with an immediately noticeable impact is Alexa learning to listen without the user needing to say "Alexa" first. A device will be able to use its camera, a user's pre-created visual ID, and prior setup with Alexa to determine when someone is speaking to it.
In a demo, David Limp, SVP of devices and services at Amazon, was able to have a conversation with Alexa, step away for a moment, and then return to the device for further conversation without saying "Alexa" again.
"This is made possible by on-device visual processing and acoustic models working in concert trying to determine whether you're addressing the device or someone else in the room," Rohit Prasad, SVP and head AI scientist at Amazon, explained on-stage at Amazon's event.