this post was submitted on 28 May 2025
676 points (99.7% liked)

Microblog Memes

8250 readers
1995 users here now

A place to share screenshots of Microblog posts, whether from Mastodon, tumblr, ~~Twitter~~ X, KBin, Threads or elsewhere.

Created as an evolution of White People Twitter and other tweet-capture subreddits.

Rules:

  1. Please put at least one word relevant to the post in the post title.
  2. Be nice.
  3. No advertising, brand promotion or guerilla marketing.
  4. Posters are encouraged to link to the toot or tweet etc in the description of posts.

founded 2 years ago
[–] [email protected] 24 points 4 weeks ago (17 children)

I'd like to play devil's advocate for a sec and ask this question: how is a company scraping information from publicly available sources to train AI models any different from companies scraping that same publicly available data and indexing it for search?

While the search model is helpful to us all, Google isn't doing it out of the kindness of their hearts. They have a whole business model based on selling advertising using the information they have freely indexed. Yet very few people complain about search indexers crawling their data the way they complain about AI bots.

Again, just playing devil's advocate for the sake of curiosity.

[–] [email protected] 51 points 4 weeks ago (4 children)

This is all true, with one key difference: search results (used to) point you to the actual source. LLMs answer you with that information as if they thought of it, with no attribution. So at least search results have a benefit for the source of indexed content.

[–] [email protected] 6 points 4 weeks ago (2 children)

I don't know about all AI products, but I use the Copilot sidebar built into Edge for work and school questions, and it always provides citations to the source information. In fact, if I ask a question for school and add to the prompt that it should cite all sources with references in APA format, it gives me everything I need in the proper format.

[–] [email protected] 8 points 4 weeks ago (1 children)

Yeah, it's useful, but double-check your sources and never hand in anything, even the citations, by just copying and pasting it without scrutiny. It can make up all kinds of bullshit, pretend cited works say something when they don't, etc.

You don't want it to hallucinate you in front of an academic ethics committee. Again, I'm not against using it, but never base anything on stuff it says; only base stuff on primary sources it helped you find.

[–] [email protected] 6 points 4 weeks ago

Fully agree. Honestly, it's why I like the Copilot branding Microsoft used. It is a Copilot, not the Captain. You still need to be in control and verify and scrutinize.

[–] [email protected] -2 points 4 weeks ago

That's not the same. In that case Copilot is also doing a search. They're talking about the model itself.
