Feed the bots: A fun experiment with LLM scrapers
lol... babble.c... I wonder if my wife got a copy of that piece of code? 😅 Jokes aside — funny little experiment with bot-crawling here, worth a read. 📚👍🏻
LLM scrapers hammer sites nonstop, dodging blocks, ignoring robots.txt, and wasting bandwidth. The smartest defense? Feed them junk: trap pages make them easy to spot and cheap to starve.
Summary
The article argues that a dynamically generated "nonsense maze" did not attract more AI scrapers, but it did make the existing ones easier to identify, revealing that they now account for nearly all of the site's traffic. It describes these bots as aggressive LLM data scrapers that ignore robots.txt, evade blocks by rotating user agents and IPs, and impose ongoing server and bandwidth costs. The author concludes that serving them garbage content is a cheap, practical defense, while IP bans, rate limits, logins, CAPTCHAs, and proof-of-work are either ineffective or punish legitimate users.
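To make the idea concrete, here is a minimal sketch of how such a "nonsense maze" endpoint could work. This is not the author's babble.c, just a hypothetical illustration: each URL deterministically produces a page of junk words plus links to further maze pages, so a crawler that follows links wanders forever while each page stays stable and cacheable. The word list and link count are arbitrary choices for the example.

```python
import hashlib
import random

# Hypothetical junk vocabulary; a real trap would use a larger list
# or a Markov babbler over some corpus.
WORDS = ["lorem", "quux", "flarn", "zibber", "trundle", "wug",
         "snarf", "plonk", "grommet", "fizzle", "bork", "wibble"]

def nonsense_page(path, n_words=80, n_links=5):
    # Seed the RNG from the path so every URL serves the same
    # garbage on each visit (stable, cacheable, cheap to produce).
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)

    body = " ".join(rng.choice(WORDS) for _ in range(n_words))

    # Each page links deeper into the maze, so a link-following
    # crawler never runs out of URLs to fetch.
    links = [
        f"{path.rstrip('/')}/{rng.choice(WORDS)}-{rng.randrange(10**6)}"
        for _ in range(n_links)
    ]
    anchors = "".join(f'<a href="{h}">{h}</a> ' for h in links)
    return f"<html><body><p>{body}</p>{anchors}</body></html>"
```

Because real visitors never land on trap URLs, any client that keeps requesting them identifies itself as a bot; the article's point is that this detection comes almost for free once the maze exists.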