this post was submitted on 21 Mar 2025
210 points (99.1% liked)

Linux

A community for everything relating to the GNU/Linux operating system

LLM scrapers are taking down FOSS projects' infrastructure, and it's getting worse.

[–] [email protected] 70 points 3 months ago* (last edited 3 months ago) (12 children)

Wow, that was a frustrating read. I did not know it was quite that bad. Just to highlight one quote:

they don’t just crawl a page once and then move on. Oh, no, they come back every 6 hours because lol why not. They also don’t give a single flying fuck about robots.txt, because why should they. [...] If you try to rate-limit them, they’ll just switch to other IPs all the time. If you try to block them by User Agent string, they’ll just switch to a non-bot UA string (no, really). This is literally a DDoS on the entire internet.

[–] jatone 29 points 3 months ago (11 children)

the solution here is to require logins. thems the breaks unfortunately. it'll eventually pass as the novelty wears off.

[–] [email protected] 12 points 3 months ago (4 children)

Next you'll have to invest in preventing automated signups

[–] [email protected] 5 points 3 months ago (1 children)

Signups on most platforms are quite hard. They straight up ask for your phone number and SMS verification, or at least an email address, and to register that email you have to provide a phone number anyway. Captchas have become so hard that even humans struggle with them, and it often takes multiple attempts to get one right.

[–] [email protected] 4 points 3 months ago (1 children)

Providing a phone number just to look at a FOSS project's website? Not too sure about that.

[–] [email protected] 5 points 3 months ago

Honestly if any site demands my phone number it can get fucked.
