AI haters build tarpits to trap and trick AI scrapers that ignore robots.txt

pelespirit@sh.itjust.works · 8 months ago

vrighter@discuss.tchncs.de · 8 months ago

Yes, but now you’ve shifted the problem again. You went from detecting infinite sites by detecting loops (in an infinite tree that has no loops, or that has infinitely many distinct URLs), to somehow keeping a list of all those infinitely many distinct URLs to avoid visiting one twice (which you wouldn’t anyway, because there are infinitely many links), to assuming you already have a list of which sites these are, so you can avoid them and never have to worry about detecting them (the very thing you started with).

It’s ok to admit that your initial idea was wrong. You did not solve a coding problem. You changed the requirements so it’s not your problem anymore.

And storing a domain whitelist wouldn’t work either, btw. A tarpit entrance is just one URL among lots of legitimate ones, in legitimate domains.
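
For anyone who wants to see the loop-detection point concretely, here is a rough Python sketch. Everything in it is made up for illustration: `tarpit_links` is a hypothetical stand-in for a tarpit page generator (not how any real tarpit is implemented), and `crawl` is a toy crawler with a visited set. The idea is only that when every page links to brand-new URLs, the visited set never catches a repeat, so the one thing that stops the crawl is an arbitrary page budget.

```python
import hashlib


def tarpit_links(url, fanout=3):
    """Hypothetical tarpit page: links only to URLs nobody has seen before.

    Each child is the parent URL plus a new path segment, so the link graph
    is a tree and no URL can ever appear twice.
    """
    digest = hashlib.sha256(url.encode()).hexdigest()
    return [f"{url}/{i}-{digest[:8]}" for i in range(fanout)]


def crawl(start, budget):
    """Toy crawler that tries to detect loops with a visited set."""
    visited = set()
    frontier = [start]
    repeats = 0  # how many times the visited set actually caught something
    while frontier and len(visited) < budget:
        url = frontier.pop()
        if url in visited:
            repeats += 1
            continue
        visited.add(url)
        frontier.extend(tarpit_links(url))
    return len(visited), repeats, len(frontier)


# example.invalid is a placeholder host, nothing is fetched over the network.
pages, repeats, pending = crawl("https://example.invalid/pit", budget=1000)
print(pages, repeats, pending)
```

The crawler uses up its whole budget, the repeat counter stays at zero, and the frontier is larger at the end than at the start. Raising the budget only makes the visited set bigger.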

Lovable Sidekick@lemmy.world · 8 months ago

Okay fine, I 100% concede that you’re right. Bye now.