this post was submitted on 11 Jun 2025
244 points (98.4% liked)

Technology

A study from Profound of OpenAI's ChatGPT, Google AI Overviews and Perplexity shows that while ChatGPT mostly sources its information from Wikipedia, Google AI Overviews and Perplexity mostly source their information from Reddit.

[–] [email protected] 35 points 1 week ago (10 children)

Throughout most of my years of higher education, as well as K-12, I was told that citing Wikipedia was forbidden. In fact, many professors and teachers would automatically fail an assignment if they felt you had used Wikipedia. The claim was that the information was often inaccurate, or changed too frequently to be reliable. This reasoning, while irritating at times, always made sense to me.

Fast forward to my professional life today. I've been told on a number of occasions that I should trust LLMs to give me accurate answers. I'm told that I will "be left behind" if I don't use ChatGPT to accomplish things faster. I'm told that my concerns about accuracy and ethics surrounding generative AI are simply "negativity."

These tools are (abstractly) referencing random users on the internet as well as Wikipedia, and treating both as legitimate sources of information. That seems crazy to me. How can we trust a technology that just references flawed sources from our past? I know there are ways to improve accuracy with things like retrieval-augmented generation (RAG), but most people are hitting the LLM directly.
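For anyone unfamiliar with the term: RAG means fetching relevant documents at query time and telling the model to answer from those documents rather than from whatever it absorbed during training. A minimal sketch, with a toy keyword-overlap retriever and an illustrative prompt format (the corpus, scoring, and prompt wording here are assumptions for demonstration, not any product's actual implementation):

```python
# Minimal RAG sketch: retrieve, then ground the prompt in what was retrieved.
# Real systems use embedding search instead of this toy keyword overlap.

def retrieve(query, corpus, k=2):
    """Rank documents by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query, docs):
    """Ask the model to answer only from the retrieved sources."""
    context = "\n".join(f"- {d}" for d in docs)
    return (
        "Answer using ONLY the sources below, and cite them.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}"
    )

corpus = [
    "The mitochondria is the powerhouse of the cell.",
    "Reddit comments are not peer reviewed.",
    "Wikipedia summarizes primary sources with citations.",
]
query = "What does Wikipedia summarize?"
prompt = build_prompt(query, retrieve(query, corpus))
print(prompt)
```

The point of the pattern is that the model's answer is constrained to sources you chose and can audit, which is exactly the accountability that's missing when the model free-associates from its training data.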

The culture around Generative AI should be scientific and cautious, but instead it feels like a cult with a good marketing team.

[–] [email protected] 19 points 1 week ago (1 children)

The common reasons given for why Wikipedia shouldn't be cited often miss the main one. You shouldn't cite Wikipedia because it is not a source of information; it is a summary of other sources, which are referenced.

You shouldn't cite Wikipedia for the same reason you shouldn't cite a library's book report: you should read and cite the book itself. Libraries are a great resource, and their reading lists and summaries of books can be a great starting point for research, just like Wikipedia. But citing the library instead of the book is intellectual laziness and shows any researcher that you are not serious.

Wikipedia itself also says the same thing: https://en.m.wikipedia.org/wiki/Wikipedia:Citing_Wikipedia

[–] [email protected] 5 points 1 week ago

You shouldn’t cite Wikipedia because it is not a source of information, it is a summary of other sources which are referenced.

Right, and if an LLM is citing Wikipedia 47.9% of the time, that means that it's summarizing Wikipedia's summary.

You shouldn’t cite Wikipedia for the same reason you shouldn’t cite a library’s book report, you should read and cite the book itself.

Exactly my point.
