We ran a study to test how truth degrades in LLMs over recursive generations—but instead of measuring hallucinations, we measured semantic drift.
The common assumption is that recursive use of LLM outputs leads to factual degradation. But when we tested this systematically across 10 academic domains over 10 generations of GPT-4o outputs, we found something different:
- Facts are mostly retained: Only a 2% drop in factual accuracy over 10 generations
- Semantic intent collapses: A new metric we introduced, Purpose Fidelity, dropped 42.5%
- That’s a 6.63× higher rate of semantic drift vs factual decay
Examples:
- A Descartes excerpt (“Cogito, ergo sum”) became career advice about leadership and self-awareness
- A history excerpt on the Berlin Wall became a lesson in change management
- Law and medicine were rewritten as “best practices” for business professionals
- Chemistry and CS stayed stable: semantic degradation was domain-specific
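
For anyone who wants to poke at this themselves, here is a minimal sketch of the recursive loop (not the exact pipeline from the paper): each generation is a GPT-4o rewrite of the previous generation's output, and embedding cosine distance is used as a rough stand-in for the Purpose Fidelity metric defined in the paper. The rewrite prompt, embedding model, and drift proxy below are illustrative choices, not the ones we report results for.

```python
# Minimal sketch: recursive GPT-4o rewrites with an embedding-based drift proxy.
# The prompt, models, and drift measure are illustrative, not the paper's protocol.
import numpy as np
from openai import OpenAI

client = OpenAI()

def rewrite(text: str) -> str:
    """Generation N sees only generation N-1's output, never the original source."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "Rewrite the following passage in your own words, keeping its meaning."},
            {"role": "user", "content": text},
        ],
    )
    return resp.choices[0].message.content

def embed(text: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(resp.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

seed = "I think, therefore I am. ..."  # replace with a full source excerpt
anchor = embed(seed)                   # embedding of the original passage

current = seed
for gen in range(1, 11):               # 10 recursive generations
    current = rewrite(current)
    drift = 1.0 - cosine(anchor, embed(current))
    print(f"generation {gen:2d}: drift vs. original = {drift:.3f}")
```

The loop is the easy part; the interesting work is in the scoring, i.e. swapping the cosine proxy for factual-accuracy checks and the Purpose Fidelity rubric described in the paper.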
Why this matters: Most LLM eval frameworks focus on factual accuracy and hallucination rates. But our data suggests the real long-term risk may be subtle, systematic recontextualization. Outputs can look factual and well-structured, while completely losing their intended purpose. This may impact content authenticity, training data curation, and long-term epistemic stability.
📄 Full paper (ResearchGate) - https://www.researchgate.net/publication/392558645_The_Half-Life_of_Truth_Semantic_Drift_vs_Factual_Degradation_in_Recursive_Large_Language_Model_Generation
🧵 Medium summary for general audience - https://medium.com/@maxwell.ian/when-ai-loses-its-mind-but-keeps-the-facts-the-hidden-danger-of-recursive-ai-content-08ae538b745a