Technology Mag


    Waymo wants to use Google’s Gemini to train its robotaxis

    By News Room | October 30, 2024 | 4 Mins Read

    Waymo has long touted its ties to Google’s DeepMind and its decades of AI research as a strategic advantage over its rivals in the autonomous driving space. Now, the Alphabet-owned company is taking it a step further by developing a new training model for its robotaxis built on Google’s multimodal large language model (MLLM) Gemini.

    Waymo released a new research paper today that introduces an “End-to-End Multimodal Model for Autonomous Driving,” also known as EMMA. This new end-to-end training model processes sensor data to generate “future trajectories for autonomous vehicles,” helping Waymo’s driverless vehicles make decisions about where to go and how to avoid obstacles.

    But more importantly, this is one of the first indications that the leader in autonomous driving has designs to use MLLMs in its operations. And it’s a sign that these LLMs could break free of their current use as chatbots, email organizers, and image generators and find application in an entirely new environment on the road. In its research paper, Waymo is proposing “to develop an autonomous driving system in which the MLLM is a first class citizen.” 

    End-to-End Multimodal Model for Autonomous Driving, also known as EMMA

    The paper outlines how, historically, autonomous driving systems have developed specific “modules” for the various functions, including perception, mapping, prediction, and planning. This approach has proven useful for many years but has problems scaling “due to the accumulated errors among modules and limited inter-module communication.” Moreover, these modules could struggle to respond to “novel environments” because, by nature, they are “pre-defined,” which can make it hard to adapt.
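As a rough illustration of why errors can accumulate across a modular stack, the sketch below assumes that each module succeeds independently with a fixed probability, so the reliability of the whole pipeline is the product of the per-module values. The module names come from the description above; the independence assumption, the numbers, and the function name `pipeline_reliability` are hypothetical and are not taken from Waymo's paper.

```python
from math import prod

# Hypothetical per-module success rates; these numbers are illustrative
# assumptions, not figures from Waymo's paper.
MODULE_RELIABILITY = {
    "perception": 0.99,
    "mapping": 0.99,
    "prediction": 0.98,
    "planning": 0.98,
}


def pipeline_reliability(modules: dict[str, float]) -> float:
    """Probability that every module succeeds, assuming independent failures."""
    return prod(modules.values())


if __name__ == "__main__":
    overall = pipeline_reliability(MODULE_RELIABILITY)
    print(f"End-to-end reliability of the modular stack: {overall:.4f}")
```

Under these assumptions the chained result is always lower than the weakest individual module, which is one simple way to picture the scaling problem the paper describes.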

    Waymo says that MLLMs like Gemini present an interesting solution to some of these challenges for two reasons: the models are “generalist” systems trained on vast sets of scraped data from the internet “that provide rich ‘world knowledge’ beyond what is contained in common driving logs”; and they demonstrate “superior” reasoning capabilities through techniques like “chain-of-thought reasoning,” which mimics human reasoning by breaking down complex tasks into a series of logical steps.

    Waymo’s EMMA model.
    Screenshot: Waymo

    Waymo developed EMMA as a tool to help its robotaxis navigate complex environments. The company identified several situations in which the model helped its driverless cars find the right route, including encountering various animals or construction in the road.

    Other companies, like Tesla, have spoken extensively about developing end-to-end models for their autonomous cars. Elon Musk claims that the latest version of Tesla’s Full Self-Driving system (12.5.5) uses an “end-to-end neural nets” AI system that translates camera images into driving decisions.

    The EMMA paper is a clear indication that Waymo, which has a lead on Tesla in deploying real driverless vehicles on the road, is also interested in pursuing an end-to-end system. The company said that its EMMA model excelled at trajectory prediction, object detection, and road graph understanding.

    “This suggests a promising avenue of future research, where even more core autonomous driving tasks could be combined in a similar, scaled-up setup,” the company said in a blog post today.

    But EMMA also has its limitations, and Waymo acknowledges that there will need to be future research before the model is put into practice. For example, EMMA couldn’t incorporate 3D sensor inputs from lidar or radar, which Waymo said was “computationally expensive.” And it could only process a small number of image frames at a time.

    There are also risks to using MLLMs to train robotaxis that go unmentioned in the research paper. Chatbots like Gemini often hallucinate or fail at simple tasks like reading clocks or counting objects. Waymo has very little margin for error when its autonomous vehicles are traveling at 40 mph down a busy road. More research will be needed before these models can be deployed at scale, and Waymo is clear about that.

    “We hope that our results will inspire further research to mitigate these issues,” the company’s research team writes, “and to further evolve the state of the art in autonomous driving model architectures.”
