Feed the bots: A fun experiment with LLM scrapers
lol... babble.c... I wonder if my wife got a copy of that piece of code? 😅 Jokes aside — funny little experiment with bot-crawling here, worth a read. 📚👍🏻
LLM scrapers hammer sites nonstop, dodging blocks, ignoring robots.txt, and wasting bandwidth. The smartest defense? Feed them junk: trap pages make them easy to spot and cheap to starve.
Summary
The article argues that a dynamically generated "nonsense maze" did not attract more AI scrapers, but it did make the existing ones easier to identify, revealing that they now account for nearly all of the site's traffic. It describes these bots as aggressive LLM data scrapers that ignore robots.txt, evade blocks by rotating user agents and IPs, and impose ongoing server and bandwidth costs. The author concludes that serving them garbage content is a cheap, practical defense, while IP bans, rate limits, logins, CAPTCHAs, and proof-of-work are either ineffective or punish legitimate users.
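To make the idea concrete, here is a minimal sketch of how such a "nonsense maze" endpoint could work. This is not the author's babble.c, just a hypothetical illustration: each URL deterministically produces a page of junk words plus links to further maze pages, so a crawler that follows links wanders forever while each page stays stable and cacheable. The word list and link count are arbitrary choices for the example.

```python
import hashlib
import random

# Hypothetical junk vocabulary; a real trap would use a larger list
# or a Markov babbler over some corpus.
WORDS = ["lorem", "quux", "flarn", "zibber", "trundle", "wug",
         "snarf", "plonk", "grommet", "fizzle", "bork", "wibble"]

def nonsense_page(path, n_words=80, n_links=5):
    # Seed the RNG from the path so every URL serves the same
    # garbage on each visit (stable, cacheable, cheap to produce).
    seed = int.from_bytes(hashlib.sha256(path.encode()).digest()[:8], "big")
    rng = random.Random(seed)

    body = " ".join(rng.choice(WORDS) for _ in range(n_words))

    # Each page links deeper into the maze, so a link-following
    # crawler never runs out of URLs to fetch.
    links = [
        f"{path.rstrip('/')}/{rng.choice(WORDS)}-{rng.randrange(10**6)}"
        for _ in range(n_links)
    ]
    anchors = "".join(f'<a href="{h}">{h}</a> ' for h in links)
    return f"<html><body><p>{body}</p>{anchors}</body></html>"
```

Because real visitors never land on trap URLs, any client that keeps requesting them identifies itself as a bot; the article's point is that this detection comes almost for free once the maze exists.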