Technology Mag

    Business

    Amazon Is Investigating Perplexity Over Claims of Scraping Abuse

    By News Room | June 28, 2024 | 3 Mins Read

    Amazon’s cloud division has launched an investigation into Perplexity AI. At issue is whether the AI search startup is violating Amazon Web Services rules by scraping websites that attempted to prevent it from doing so, WIRED has learned.

    An AWS spokesperson, who talked to WIRED on the condition that they not be named, confirmed the company’s investigation of Perplexity. WIRED had previously found that the startup—which has backing from the Jeff Bezos family fund and Nvidia, and was recently valued at $3 billion—appears to rely on content from scraped websites that had forbidden access through the Robots Exclusion Protocol, a common web standard. While the Robots Exclusion Protocol is not legally binding, terms of service generally are.

    The Robots Exclusion Protocol is a decades-old web standard that involves placing a plaintext file (like wired.com/robots.txt) on a domain to indicate which pages should not be accessed by automated bots and crawlers. While companies that use scrapers can choose to ignore this protocol, most have traditionally respected it. The Amazon spokesperson told WIRED that AWS customers must adhere to the robots.txt standard while crawling websites.
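
As a rough illustration of how a well-behaved crawler might consult such a file before fetching a page, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the bot names (`ExampleBot`, `OtherBot`), and the `example.com` URLs are hypothetical and invented for this example; they are not taken from any real site's rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# ExampleBot is blocked from the whole site by its own rule group.
print(may_fetch(parser, "ExampleBot", "https://example.com/story"))      # False
# Other bots fall back to the "*" group, which only blocks /private/.
print(may_fetch(parser, "OtherBot", "https://example.com/story"))        # True
print(may_fetch(parser, "OtherBot", "https://example.com/private/x"))    # False
```

Nothing in this check is enforced by the server. The crawler itself has to run it and honor the answer, which is why the protocol depends on voluntary compliance.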

    “AWS’s terms of service prohibit customers from using our services for any illegal activity, and our customers are responsible for complying with our terms and all applicable laws,” the spokesperson said in a statement.

    Scrutiny of Perplexity’s practices follows a June 11 report from Forbes that accused the startup of stealing at least one of its articles. WIRED investigations confirmed the practice and found further evidence of scraping abuse and plagiarism by systems linked to Perplexity’s AI-powered search chatbot. Engineers for Condé Nast, WIRED’s parent company, block Perplexity’s crawler across all its websites using a robots.txt file. But WIRED found the company had access to a server using an unpublished IP address—44.221.181.252—which visited Condé Nast properties at least hundreds of times in the past three months, apparently to scrape Condé Nast websites.

    The machine associated with Perplexity appears to be engaged in widespread crawling of news websites that forbid bots from accessing their content. Spokespeople for The Guardian, Forbes, and The New York Times also say they detected the IP address repeatedly visiting their servers.
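
One way a publisher might spot this kind of repeated access is to tally requests per client address in its server logs. The sketch below is an assumption about how such a count could be done, not a description of the method WIRED or any publisher actually used. The log lines are hypothetical, written in Common Log Format, and use addresses from ranges reserved for documentation.

```python
from collections import Counter

# Hypothetical access-log lines in Common Log Format, invented for illustration.
# The addresses come from ranges reserved for documentation (RFC 5737).
SAMPLE_LOG = """\
203.0.113.7 - - [01/Jun/2024:10:00:00 +0000] "GET /story/a HTTP/1.1" 200 512
198.51.100.4 - - [01/Jun/2024:10:00:05 +0000] "GET /story/b HTTP/1.1" 200 734
203.0.113.7 - - [01/Jun/2024:10:01:00 +0000] "GET /story/c HTTP/1.1" 200 640
"""


def count_requests_by_ip(log_text: str) -> Counter:
    """Count log lines per client address (the first whitespace-separated field)."""
    counts = Counter()
    for line in log_text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


counts = count_requests_by_ip(SAMPLE_LOG)
print(counts["203.0.113.7"])   # 2
print(counts["198.51.100.4"])  # 1
```

A count like this only shows how often an address appeared. Linking the address to a particular operator takes separate evidence.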

    WIRED traced the IP address to a virtual machine known as an Elastic Compute Cloud (EC2) instance hosted on AWS. AWS launched its investigation after we asked whether using its infrastructure to scrape websites that forbade it violated the company’s terms of service.

    Last week, Perplexity CEO Aravind Srinivas responded to WIRED’s investigation first by saying the questions we posed to the company “reflect a deep and fundamental misunderstanding of how Perplexity and the Internet work.” Srinivas then told Fast Company that the secret IP address WIRED observed scraping Condé Nast websites and a test site we created was operated by a third-party company that performs web crawling and indexing services. He refused to name the company, citing a nondisclosure agreement. When asked if he would tell the third party to stop crawling WIRED, Srinivas replied, “It’s complicated.”

    © 2025 Technology Mag. All Rights Reserved.
