
Amazon is preparing to relaunch its Alexa voice-powered digital assistant as an artificial intelligence “agent” that can complete practical tasks, as tech groups race to solve challenges that have stymied the system’s AI overhaul.
The $2.4tn company has spent the past two years trying to redesign Alexa, its conversational system embedded in 500 million consumer devices worldwide, so the software’s “brain” is replaced with generative AI.
Rohit Prasad, who leads the artificial general intelligence (AGI) team at Amazon, told the Financial Times that the voice assistant still needs to overcome a number of technical hurdles before rollout.
These include addressing “hallucinations” or fabricated answers, response speed or “latencies” and reliability issues. “Hallucinations have to be close to zero,” Prasad said. “It’s still an open problem in the industry, but we’re working extremely hard on it.”
The vision of Amazon’s leaders is to transform Alexa, currently used for a narrow set of simple tasks like playing music and setting alarms, into an “agentic” product that acts as a personalized concierge. This can include anything from suggesting restaurants to configuring bedroom lights based on a person’s sleep cycle.
Alexa’s redesign has been underway since the launch of OpenAI’s ChatGPT, backed by Microsoft, in late 2022. While Microsoft, Google, Meta and others have quickly embedded generative AI into their computing platforms and improved their software services, critics have questioned whether Amazon can address its technological and organizational struggles in time to compete with rivals.
According to multiple employees who have worked on Amazon’s voice assistant teams in recent years, the effort has been beset by complications and follows years of AI research and development.
Several former staffers said the long wait for the rollout was due to the unforeseen problems involved in modifying and combining simpler, predefined algorithms with more powerful but unpredictable large language models.
In response, Amazon said it was “working hard to enable more proactive and capable support” from its voice assistant. It added that a technical implementation of this scale, on a live service and a suite of devices used by customers around the world, was unprecedented and not as simple as overlaying an LLM onto the Alexa service.
Prasad, Alexa’s former chief architect, said last month’s release of Amazon Nova, the company’s in-house model developed by his AGI team, was driven by specific demands for optimal speed, cost and reliability, to help AI apps like Alexa “reach that last mile, which is really hard”.
To act as an agent, Alexa’s “brain” needs to be able to call hundreds of third-party software and services, Prasad said.
“Sometimes we underestimate how many services are integrated into Alexa, and it’s a huge number. These applications receive billions of requests per week, so when you’re trying to make reliable operations happen at speed . . . you have to be able to do it in a very cost-effective way,” he added.
The complexity arises because users expect Alexa to respond quickly, as well as with extremely high levels of accuracy. Such qualities are at odds with the inherently probabilistic nature of today’s generative AI, statistical software that predicts words based on speech and language patterns.
Some former workers also point to a struggle to preserve the assistant’s core features, including its consistency and functionality, while infusing it with new generative features like creativity and free-flowing conversations.
Because of LLMs’ more personalized, chatty nature, the company plans to hire experts to shape the AI’s personality, voice and diction so it remains familiar to Alexa users, according to a person familiar with the matter.
A former senior member of the Alexa team said that while LLMs were highly sophisticated, they came with risks, such as generating answers that were “sometimes completely invented”.
“At the scale at which Amazon operates, this can happen multiple times a day,” they said, adding that its brand and reputation have suffered.
In June, Mihail Eric, a former machine-learning scientist at Alexa and a founding member of its “conversational modeling team”, said publicly that Amazon had “dropped the ball” on becoming “the undisputed market leader in conversational AI” with Alexa.
Despite strong scientific talent and “vast” financial resources, the company was “plagued by technical and bureaucratic problems”, Eric said, suggesting that “data was poorly annotated” and “documentation was either non-existent or stale”.
According to two former employees who worked on Alexa-related AI, the historical technology underpinning the voice assistant was inflexible and difficult to change quickly, with a messy and disorganized code base and an engineering team “spread too thin.”
The original Alexa software, built on technology acquired from British start-up Evi in 2012, is a question-answering machine that works by searching through a defined universe of information to find the right response, such as the day’s weather or a specific song in your music library.
The new Alexa uses a bouquet of different AI models to recognize and translate voice queries and generate responses, as well as to detect policy violations, such as inappropriate responses, and catch hallucinations. Developing software to translate between the legacy systems and the new AI models has been a major hurdle in the Alexa-LLM integration.
The models include Amazon’s own in-house software, including the latest Nova models, as well as Claude, the AI model from start-up Anthropic, in which Amazon has invested more than $8 billion over the past 18 months.
“The most challenging thing about AI agents is making sure they are safe, reliable and predictable,” Anthropic chief executive Dario Amodei told the FT last year.
Agent-like AI software is “where . . . people can actually trust the system”, he added. “Once we reach that point, we will release these systems.”
A current employee said more steps are needed, such as overlaying child safety filters and testing custom integrations with Alexa such as smart lights and Ring doorbells.
“Reliability is the problem — it’s about working close to 100 percent,” the employee added. “So you see us . . . or Apple or Google shipping slowly and incrementally.”
Numerous third parties developing “skills” or features for Alexa said they were unsure when the new generative AI-enabled device would roll out and how new functions would be developed for it.
“We are waiting for details and understanding,” said Thomas Lindgren, co-founder of Swedish content developer Wonderword. “They were much more open when we started working with them . . . then they changed over time.”
Another partner said that the initial “pressure” that Amazon developers had to prepare for the next generation of Alexa had subsided.
An enduring challenge for Amazon’s Alexa team — which was hit by major layoffs in 2023 — is how to make money. Figuring out how to make the assistant “cheap enough to run at scale” will be a major task, said Jared Roesch, co-founder of generative AI group OctoAI.
Options being discussed include creating a new Alexa subscription service or taking a cut of sales of goods and services, a former Alexa employee said.
Prasad said Amazon’s goal was to create a variety of AI models that could serve as “building blocks” for different applications beyond Alexa.
“We’ve always been grounded in customer-focused, practical AI; we’re not doing science for science’s sake,” Prasad said. “We are doing it . . . to deliver customer value and impact, which is becoming more important than ever in this generative AI era as customers want to see a return on investment.”