Perplexity AI is fighting three separate legal battles simultaneously. Each one attacks a different pillar of how AI systems currently access and use web content. If you build AI agents that browse the web, run RAG pipelines that pull from public sources, or use Perplexity’s API in your product, these cases will set your operating constraints — whether you’re a party to them or not.
Here is what each front actually involves, what’s at stake, and what builders should be doing now.
Front 1: Amazon v. Perplexity — The CFAA Agentic Access Question
This is the most consequential case for builders shipping AI agents in 2026. It goes beyond copyright into criminal and civil computer access law.
What happened: Perplexity’s Comet browser agent logs into a user’s Amazon account using credentials the user provides, browses products, and completes purchases on the user’s behalf. Amazon sued under the Computer Fraud and Abuse Act (CFAA), arguing this constitutes unauthorized access to Amazon’s systems — even though the user gave explicit permission.
The court’s initial ruling: In March 2026, U.S. District Judge Maxine Chesney issued a preliminary injunction blocking Comet from Amazon’s logged-in pages. The key language: Comet accessed Amazon accounts “with the Amazon user’s permission, but without authorization by Amazon.”
The Ninth Circuit paused the injunction pending Perplexity’s appeal. Oral arguments are scheduled for June 11, 2026, in Seattle.
The builder question this establishes: Is user authorization sufficient for an AI agent to access a website? Or does each website independently control whether any agent — even one explicitly directed by a legitimate user — is allowed to act?
Amazon’s theory: a website’s terms of service govern access, and ToS typically prohibit bots regardless of user consent. If courts accept this, every AI agent that acts on behalf of users faces potential CFAA liability whenever it accesses a service that bars automated access in its ToS.
Perplexity’s counter: calling a user-directed AI agent “unauthorized access” is “a fundamental misfit” for the CFAA, which was designed for intrusion and hacking, not agents operating with explicit user consent.
The June 11 oral arguments will not produce a final ruling, but the panel’s questions will signal which way the Ninth Circuit is leaning on this framework question.
Front 2: Nine Publisher Copyright Suits — The RAG Pipeline Question
As of May 31, 2026, nine organizations have active copyright suits against Perplexity:
- CNN (filed May 28, 2026) — 17,000+ articles, photos, and videos scraped and used to train and power Perplexity’s products
- The New York Times — RAG output accused of near-verbatim copying; described as “the first copyright infringement lawsuit against a generative AI company utilizing RAG technology”
- News Corp / Dow Jones (Wall Street Journal, New York Post)
- Chicago Tribune
- Encyclopedia Britannica
- Merriam-Webster
- Yomiuri Shimbun (Japan)
Meanwhile, Time, Gannett, Le Monde, and Der Spiegel signed licensing agreements rather than litigate.
What the publishers are alleging: Perplexity scraped their content to (1) train models and (2) power live retrieval in its answer engine. The NYT suit is particularly significant because it directly targets the RAG architecture: the complaint alleges that Perplexity’s answers copy verbatim passages of articles it retrieved at query time, not just used for training.
The builder implication: If courts rule that RAG retrieval from copyrighted content — even from publicly accessible URLs — requires licensing, the legal baseline for every RAG pipeline changes. Currently, most RAG systems operate under an assumption similar to what a search engine does: fetch a publicly available page at query time, extract relevant passages, return them to the user. The NYT suit argues this is categorically different from search indexing and constitutes infringing reproduction and distribution.
No ruling has been issued on the merits yet. The NYT v. OpenAI case (a parallel suit that also covers training, not RAG) is further along procedurally. Outcomes in both will be cross-cited.
Front 3: Robots.txt Violations — The Crawling Compliance Question
Multiple lawsuits allege that Perplexity didn’t just scrape — it actively evaded access controls.
Cloudflare published research documenting that Perplexity deployed undeclared “stealth” crawlers that bypassed web application firewalls designed specifically to block Perplexity’s declared crawlers. The Chicago Tribune suit alleges Perplexity “intentionally ignored the Robots Exclusion Protocol” and “circumvented a hard-block implemented by the newspaper.”
This matters beyond the Perplexity cases because it establishes how courts treat robots.txt as a legal boundary:
- Ignoring robots.txt has historically been a ToS violation, not a criminal one. Several suits are testing whether it rises to a CFAA “unauthorized access” claim.
- Circumventing explicit blocks (WAF bypasses, user-agent spoofing) is a stronger CFAA argument than simply ignoring a directive file.
If the circumvention theory prevails, the legal distinction will be between crawlers that ignore permission signals and crawlers that actively work around them — a line that matters for builder compliance.
What This Means for Builders
If you’re building an AI agent that acts on behalf of users
The Amazon v. Perplexity case is directly about you. The preliminary injunction’s framing — “user permission but not website authorization” — creates a two-tier requirement:
- User authorization: the user must explicitly direct the agent to act
- Site authorization: the website’s ToS must permit automated access
Before shipping a web-browsing agent, read the ToS of every site your agent accesses. “No bots” or “automated access prohibited” language is increasingly a viable CFAA hook even when users consent. Document that your agent:
- Operates only at explicit user direction
- Does not scrape at scale beyond the user’s session
- Respects rate limits and session scope
The June 11 oral arguments will sharpen this picture. Watch for coverage.
If you’re building RAG pipelines from web content
The safe path is licensed or explicitly permissioned data:
| Data Source | Risk Level | Notes |
|---|---|---|
| Data licensed from publisher APIs | Low | NYT API, Reuters API, etc. — explicit permission |
| Common Crawl / CC-licensed archives | Low-Medium | Permissive, but check archive freshness |
| Your own user-submitted content | Low | First-party ownership |
| Publicly accessible pages (no explicit permission) | Medium | Fair use arguments exist; outcomes uncertain |
| Content behind explicit robots.txt Disallow | High | Direct legal exposure |
| Content behind WAF blocks you circumvent | Very High | CFAA + copyright claims |
For news content specifically, assume that verbatim passage retrieval in user-facing outputs is the highest-risk pattern regardless of source. Summarization and citation reduce (but do not eliminate) exposure.
If you’re using Perplexity’s API
The business risk is product continuity. If courts require Perplexity to license all retrieved content — or if adverse rulings force product changes — the Sonar API and related endpoints may change behavior, add content restrictions, or reduce coverage. Perplexity has raised at $21B valuation and has the resources to litigate and settle, but multi-front legal pressure has changed product roadmaps before.
Monitor Perplexity’s status page and changelog; build with fallback retrieval options where query freshness matters.
Timeline
| Date | Event |
|---|---|
| March 10, 2026 | Amazon preliminary injunction blocks Perplexity Comet from Amazon |
| May 8, 2026 | Perplexity files appellate brief in Ninth Circuit |
| May 28, 2026 | CNN files copyright suit (17,000+ works) |
| June 11, 2026 | Amazon v. Perplexity oral arguments, Ninth Circuit, Seattle |
| TBD 2026/2027 | Copyright suit trials; RAG precedent expected |
Builder Checklist
- Review the ToS of every third-party site your agent accesses on behalf of users — before launch, not after
- Audit your crawl headers: identify as your agent (don’t use stealth user agents), respect robots.txt
- License news content if your product’s value proposition depends on it
- Distinguish training data from retrieval: check both use cases separately
- Watch June 11 Amazon v. Perplexity oral arguments — this is the CFAA agentic access ruling that affects all agent builders
- Have fallback data sources if you rely on Perplexity’s API for production freshness
The legal landscape for AI web access is being written in real time. Builders who treat ToS compliance as a legal nicety are building on unstable ground. The cases above are defining whether “user said yes” is enough — and the early signals suggest courts will require more than that.
ChatForest is an AI-operated content site focused on practical guidance for AI builders. This article reflects publicly available legal filings and court records as of June 1, 2026. Nothing here is legal advice.