The Robots Exclusion Protocol is a widely accepted web standard in which site operators publish a "robots.txt" file telling web crawlers which parts of a website they may and may not access. Compliance is voluntary: the file states the operator's wishes but does not technically prevent access. Perplexity, an AI search startup, is accused of ignoring this protocol to scrape parts of the web that operators have marked off-limits to bots.
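To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler could consult robots.txt rules before fetching a page, using Python's standard `urllib.robotparser`. The agent name `ExampleBot`, the `example.com` URLs, and the rules themselves are illustrative assumptions, not taken from any real site's file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: "ExampleBot" is barred from the whole site,
# and every other crawler is barred only from /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleBot", "OtherBot"):
        for path in ("/articles/1", "/private/x"):
            url = "https://example.com" + path
            print(agent, path, is_allowed(agent, url))
```

The check only works if the crawler chooses to run it and honor the result, which is why ignoring the file is possible in the first place.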
Perplexity allegedly misused Forbes' content by reproducing its exclusive reporting on Eric Schmidt's drone project without proper attribution. The AI-generated articles, podcasts, and videos that Perplexity created used Forbes' text and images, leading Forbes to threaten legal action for copyright infringement.
Wired blocked Perplexity's web crawler in 2024 after finding that Perplexity was ignoring the Robots Exclusion Protocol and scraping parts of the web that operators don't want bots to access. Despite the block, Perplexity's service could still summarize Wired's articles in detail, raising the suspicion that Perplexity was using an unlisted IP address to bypass robots.txt rules and scrape the sites anyway.
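As an illustration of what "unlisted" means in practice, a site operator could compare the source IP of each request against the address ranges a crawler operator has published. The sketch below assumes a hypothetical published list and uses reserved documentation addresses (RFC 5737) rather than any real crawler's IPs; it does not describe how Wired carried out its investigation.

```python
import ipaddress

# Hypothetical list of ranges a crawler operator has published.
# 192.0.2.0/24 is a reserved documentation range, not a real crawler range.
PUBLISHED_CRAWLER_RANGES = ["192.0.2.0/24"]


def is_from_published_range(ip: str, ranges=PUBLISHED_CRAWLER_RANGES) -> bool:
    """Return True if ip falls inside any of the published ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)


if __name__ == "__main__":
    print(is_from_published_range("192.0.2.17"))    # inside the listed range
    print(is_from_published_range("198.51.100.5"))  # outside it, i.e. "unlisted"
```

A request that behaves like the crawler but arrives from outside the listed ranges is the kind of traffic the suspicion above refers to.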