Reddit files lawsuit against Perplexity over ‘unauthorised AI training’
The social platform has alleged that Perplexity and its data partners have illegally harvested Reddit content to power AI tools despite repeated warnings
Reddit has filed a lawsuit against AI startup Perplexity AI and three data-scraping firms (Oxylabs, AWMProxy, and SerpApi), accusing them of unlawfully collecting and reselling Reddit content to train Perplexity’s AI systems. The complaint, lodged in the U.S. District Court for the Southern District of New York, alleges the defendants used disguised web scrapers to bypass Reddit’s protections and access vast troves of user-generated posts.
According to the filing, Reddit had earlier issued a cease-and-desist notice to Perplexity, but the company allegedly continued to reference Reddit material, with citations reportedly surging forty-fold afterward. “Perplexity was caught red-handed using markers confirming scraped data access,” the lawsuit claims.
Reddit contends that this case highlights the rise of large-scale data misuse driven by the growing demand for human-generated information to train AI systems. The platform’s leadership views the situation as part of a broader trend in which companies bypass licensing agreements and rely on unauthorised data sources to strengthen their AI capabilities.
The complaint further notes that Reddit’s data is a highly valuable resource due to its vast and dynamic collection of real human discussions. While Reddit has formal data licensing partnerships with OpenAI and Google, it has consistently acted against entities using its content without permission. A similar lawsuit was filed earlier this year against Anthropic, reflecting Reddit’s continued efforts to safeguard its intellectual property.
Both Oxylabs and SerpApi have rejected Reddit’s claims, maintaining that their data-gathering practices comply with legal standards. The lawsuit marks another flashpoint in the ongoing conflict between technology platforms and AI firms over the ethical and lawful use of online data.