this post was submitted on 28 May 2025
676 points (99.7% liked)
Microblog Memes
A place to share screenshots of Microblog posts, whether from Mastodon, tumblr, ~~Twitter~~ X, KBin, Threads or elsewhere.
Created as an evolution of White People Twitter and other tweet-capture subreddits.
Rules:
- Please put at least one word relevant to the post in the post title.
- Be nice.
- No advertising, brand promotion or guerilla marketing.
- Posters are encouraged to link to the toot or tweet etc in the description of posts.
founded 2 years ago
I'd like to play devil's advocate for a sec and ask this question: how is a company scraping information from publicly available sources to train AI models any different from companies scraping that same publicly available data and indexing it for search?
While the search model is helpful to us all, Google isn't doing it out of the kindness of their hearts; they have a whole business model based on selling advertising using the information they have freely indexed. Yet very few people complain about search indexers crawling their data the way they do about AI bots.
Again, just playing devil's advocate for the sake of curiosity.
You likely consented to search crawlers. You didn't consent to having your site slammed by AI bots to regurgitate your site either privately or publicly.
If memory serves me correctly, nobody consented to the search indexes originally either; it took time for those guard rails to be put in place and respected. I would imagine that this new tech will undergo the same growing pains as guard rails get implemented.
Yeah, but the difference is that search engines act in synergy with the site, while AI models usually just extract value from it. One is getting your woodworking shop in the phonebook without consent, the other is taking your lathe out the door.
+1 for a woodworking analogy. :)
Originally, you had to submit your website to search engines.