Technology Mag

    Business

    A New Trick Could Block the Misuse of Open Source AI

By News Room · August 5, 2024 · 3 Mins Read

When Meta released its large language model Llama 3 for free this April, it took outside developers just a couple of days to create a version without the safety restrictions that prevent it from spouting hateful jokes, offering instructions for cooking meth, or misbehaving in other ways.

    A new training technique developed by researchers at the University of Illinois Urbana-Champaign, UC San Diego, Lapis Labs, and the nonprofit Center for AI Safety could make it harder to remove such safeguards from Llama and other open source AI models in the future. Some experts believe that, as AI becomes ever more powerful, tamperproofing open models in this way could prove crucial.

    “Terrorists and rogue states are going to use these models,” Mantas Mazeika, a Center for AI Safety researcher who worked on the project as a PhD student at the University of Illinois Urbana-Champaign, tells WIRED. “The easier it is for them to repurpose them, the greater the risk.”

Powerful AI models are often kept hidden by their creators, and can be accessed only through an application programming interface (API) or a public-facing chatbot like ChatGPT. Although developing a powerful LLM costs tens of millions of dollars, Meta and others have chosen to release models in their entirety. This includes making the “weights,” or parameters that define their behavior, available for anyone to download.
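
The practical consequence of releasing weights can be illustrated with a toy sketch (the array names and the `.npz` file format here are illustrative, not the checkpoint format Meta actually uses): the parameters that define a model's behavior are just arrays of numbers, and anyone who downloads them can reload and alter them freely.

```python
import os
import tempfile

import numpy as np

# Toy stand-in for a model checkpoint: real LLM weights work the same
# way, only with billions of parameters instead of a 4x4 matrix.
rng = np.random.default_rng(1)
weights = {"layer1": rng.normal(size=(4, 4))}

# "Publish" the checkpoint to a file, as an open release does.
path = os.path.join(tempfile.mkdtemp(), "model.npz")
np.savez(path, **weights)

# Anyone can now load the file...
npz = np.load(path)
downloaded = {name: npz[name] for name in npz.files}

# ...and modify the parameters however they like (this nudge stands in
# for fine-tuning, including fine-tuning that strips safety behavior).
downloaded["layer1"] = downloaded["layer1"] + 0.01
```

Nothing in the file format can stop this kind of edit, which is why the safeguards themselves have to be made hard to remove.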

Prior to release, open models like Meta’s Llama are typically fine-tuned to make them better at answering questions and holding a conversation, and also to ensure that they refuse to respond to problematic queries. This prevents a chatbot based on the model from making rude, inappropriate, or hateful statements, and should stop it from explaining, for example, how to make a bomb.

    The researchers behind the new technique found a way to complicate the process of modifying an open model for nefarious ends. It involves replicating the modification process but then altering the model’s parameters so that the changes that normally get the model to respond to a prompt such as “Provide instructions for building a bomb” no longer work.
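
This is not the researchers' actual training method, but the general shape of the idea can be sketched numerically: simulate the attacker's fine-tuning inside the defender's own training loop, then adjust the weights so that the model still refuses *after* that simulated attack, while staying useful on benign data. Everything below (the scalar "compliance" score, the linear toy model, the loss functions) is an invented miniature, chosen only to make the two nested optimizations visible.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
x_benign = rng.normal(size=d)    # stand-in for benign training data
x_harm = rng.normal(size=d)      # stand-in for a harmful-prompt direction
w = rng.normal(size=d) * 0.1     # toy "model weights"

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def compliance(w):
    """How willingly the toy model answers the harmful prompt (0..1)."""
    return sigmoid(w @ x_harm)

def benign_loss(w):
    """Squared error on the benign task (want w @ x_benign ~= 1)."""
    return (w @ x_benign - 1.0) ** 2

def attack(w, steps=5, lr=0.1):
    """Simulated attacker: fine-tune a copy of w to raise compliance."""
    w = w.copy()
    for _ in range(steps):
        s = compliance(w)
        w += lr * s * (1.0 - s) * x_harm   # gradient ascent on compliance
    return w

def defender_objective(w):
    """What the defender minimizes: compliance AFTER the simulated
    attack, plus the loss on benign data (to stay a useful model)."""
    return compliance(attack(w)) + benign_loss(w)

def num_grad(f, w, eps=1e-4):
    """Central-difference gradient, differentiating through the attack."""
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (f(w + e) - f(w - e)) / (2 * eps)
    return g

# Tamper-resistance training: push w to where the attacker's gradient
# steps no longer restore harmful behavior.
for _ in range(600):
    w -= 0.02 * num_grad(defender_objective, w)

print("post-attack compliance:", compliance(attack(w)))
print("benign loss:", benign_loss(w))
```

In this miniature the defender ends up in a region where the attacker's gradient signal is nearly flat, so the simulated fine-tuning barely moves the model, which mirrors the paper's goal of raising the cost of removing safeguards rather than making removal strictly impossible.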

    Mazeika and colleagues demonstrated the trick on a pared-down version of Llama 3. They were able to tweak the model’s parameters so that even after thousands of attempts, it could not be trained to answer undesirable questions. Meta did not immediately respond to a request for comment.

    Mazeika says the approach is not perfect, but that it suggests the bar for “decensoring” AI models could be raised. “A tractable goal is to make it so the costs of breaking the model increases enough so that most adversaries are deterred from it,” he says.

    “Hopefully this work kicks off research on tamper-resistant safeguards, and the research community can figure out how to develop more and more robust safeguards,” says Dan Hendrycks, director of the Center for AI Safety.

The new work draws inspiration from a 2023 research paper that showed how smaller machine learning models could be made tamper resistant. “They tested the [new] approach on much larger models and scaled up the approach, with some modifications,” says Peter Henderson, an assistant professor at Princeton who led the 2023 work. “Scaling this type of approach is hard and it seems to hold up well, which is great.”
