Amazon is trying to make Alexa simpler and more intuitive for users with a new large language model (LLM). During its annual hardware event Wednesday, Amazon demoed the generative-AI-powered Alexa that users will soon be able to preview on Echo devices. But in all its talk of new features and a generative-AI-fueled future, Amazon barely acknowledged the longstanding elephant in the room: privacy.
Amazon's devices event featured a new Echo Show 8, updated Ring devices, and new Fire TV sticks. But the most interesting part was the look it offered at how the company is trying to navigate generative AI hype and the uncertainty around the future of voice assistants. Amazon said users will be able to start previewing Alexa's new features on any Echo device, including the original, in a few weeks.
Alexa's added features are enabled by a new LLM that Amazon says was fine-tuned for voice conversations and that uses algorithms to recognize body language and intonation. The company made clear that Alexa will focus on generative AI going forward. But the new features are in their early stages, Amazon noted, so bumps, bugs, and errors should be expected at first.
“Alexa, can I finally stop saying Alexa?”
One development with an immediately noticeable impact is Alexa learning to listen without the user needing to say "Alexa" first. A device will be able to use its camera, a user's pre-created visual ID, and prior setup with Alexa to determine when someone is speaking to it.
In a demo, David Limp, SVP of devices and services at Amazon, was able to have a conversation with Alexa, step away for a moment, and then return to the device for further conversation without saying "Alexa" again.
"This is made possible by on-device visual processing and acoustic models working in concert trying to determine whether you're addressing the device or someone else in the room," Rohit Prasad, SVP and head AI scientist at Amazon, explained on-stage at Amazon's event.