Amazon is still seen as a bit of a laggard in the race to develop advanced artificial intelligence, but it has quietly created a lab that is now setting records when it comes to AI performance. Amazon’s AGI SF Lab, which is located in San Francisco and dedicated to building artificial general intelligence, or AI that surpasses the capabilities of humans, revealed the first fruits of its work today: A new AI model capable of powering some of the most advanced AI agents available anywhere.

The new model, called Amazon Nova Act, outperforms models from OpenAI and Anthropic on several benchmarks designed to gauge the intelligence and aptitude of AI agents, Amazon says. On the benchmarks GroundUI Web and ScreenSpot, Amazon Nova Act performs better than Anthropic’s Claude 3.7 Sonnet and OpenAI’s Computer Use Agent. A major part of Amazon’s plan to compete in the AI market is to focus on building agents, and the new model’s abilities reflect its efforts to build a generation of tools that can measure up to the very best available.

“I believe that the basic atomic unit of computing in the future is going to be a call to a giant [AI] agent,” says David Luan, who leads Amazon’s AGI SF Lab. He was previously a vice president of engineering at OpenAI and later cofounded Adept, a startup that pioneered work on AI agents, before joining Amazon in 2024 when the ecommerce giant took a stake in the company.

Most of the leading AI labs are now focused on building increasingly capable AI agents. Getting AI to master independent actions, as well as conversation, promises to make the technology more useful and valuable. The shift from chat to action is still very much a work in progress, however.

In the past six months, OpenAI, Anthropic, Google, and others have demonstrated web-browsing agents that take actions in response to a prompt. But for the most part, these agents are still unreliable, and they can easily be tripped up by open-ended requests.

Luan says that Amazon’s goal is building AI agents that are dependable rather than flashy. The thing holding agents back is not the need for “more cool demos of interesting capabilities that work 60 percent of the time, it’s the Waymo problem,” he says, referring to how self-driving cars needed to be trained to deal with unusual edge cases before they could take to the streets unsupervised.

Many so-called agents are built by combining large language models with multiple human-written rules that are designed to prevent the models from veering off course but that also make their behavior brittle. Amazon Nova Act is a version of the company’s most powerful homegrown model, Amazon Nova, that has received additional training to help it decide what actions to take and when. In general, Luan says, AI models struggle to decide when they should intervene in a task.

To improve Nova’s agential abilities, Amazon is using reinforcement learning, a method that has helped other AI models better simulate reasoning.
