In the annals of human ingenuity, steel forged before the nuclear age—untainted by radioactive fallout—holds a revered place. Prized for radiation-sensitive instruments such as Geiger counters, this “low-background steel” is scarce, salvaged from shipwrecks to avoid the contamination of modern alloys. Human-generated data from before the AI era is similarly precious: raw, diverse, and grounded in lived experience, it fueled the internet’s vibrant ecosystem. Yet, as artificial intelligence (AI) proliferates, a troubling parallel emerges—the “cold-steel problem.” AI, increasingly trained on its own synthetic outputs, risks a self-referential spiral that erodes the authenticity and diversity of information. Like steel laced with radiation, AI-generated data threatens to corrode the tools of knowledge, leaving us with a homogenized, unreliable digital landscape.
The pre-AI era offered a rich tapestry of human thought—letters, books, forums, and early websites brimmed with unfiltered perspectives. These were the “cold steel” of data: imperfect, often chaotic, but rooted in reality. Today, AI’s insatiable appetite for content—web-scraped, algorithmically churned—has shifted the balance. A 2024 Nature study warns of “model collapse,” in which models trained on synthetic data progressively lose the “tails” of the original data distribution—the rare, nuanced cases—and converge toward bland, repetitive outputs. Wikipedia, once a bastion of human collaboration, now grapples with AI-generated articles—roughly 5% of new English entries in 2024 bore hallmarks of automation, often shallow and poorly sourced. This isn’t mere noise; it’s a distortion, amplifying errors and biases with each recursive loop, like a photocopy of a photocopy fading into illegibility.
The mechanics of this spiral are insidious. AI models, fed on web data increasingly tainted by their own outputs, risk “Model Autophagy Disorder” (MAD)—a vivid term for systems consuming themselves. The fatal 2016 Tesla Autopilot crash, in which the car’s vision system failed to distinguish a white truck from a bright sky, illustrates what is at stake when a model’s data fails to cover reality: errors compound, and distortions propagate. Posts on X lament search engines returning AI-crafted drivel—slick but soulless—while human voices struggle to break through. The counterargument, that synthetic data fills gaps in niche domains like coding, holds limited weight. Even in verifiable fields, the loss of diverse, human-generated inputs risks outputs that are technically correct but creatively barren, a digital equivalent of bollocks masquerading as insight.
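
To see why the “photocopy of a photocopy” dynamic is so corrosive, consider a toy simulation. It is not drawn from any of the cited papers and rests on deliberately crude assumptions: a fitted Gaussian stands in for a generative model, rare outputs are simply clipped away, and each generation trains only on the previous generation’s synthetic samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generation 0: "human" data -- a wide distribution with rich tails.
data = rng.normal(loc=0.0, scale=1.0, size=5_000)
print(f"gen  0: std = {data.std():.3f}")

for gen in range(1, 11):
    # Fit a simple "model" (here just a Gaussian) to whatever data this
    # generation sees, then let the next generation train only on samples
    # drawn from that model -- synthetic data feeding synthetic data.
    mu, sigma = data.mean(), data.std()
    data = rng.normal(loc=mu, scale=sigma, size=5_000)
    # Generative models also tend to under-sample rare events; mimic that
    # by dropping the most improbable outputs (the distribution's tails).
    data = data[np.abs(data - mu) < 2.0 * sigma]
    print(f"gen {gen:2d}: std = {data.std():.3f}")
```

Over a handful of generations the spread of the data shrinks steadily, and the rare, tail-end cases vanish first. Real models and real data are vastly richer, but this is the same qualitative pattern the model-collapse literature describes.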
The implications are stark: an information ecosystem choked by self-referential sludge threatens not just AI’s utility but society’s capacity for truth-seeking. If unchecked, this spiral could render knowledge a hollow echo chamber, antithetical to the vibrant complexity of human thought. Mitigation demands urgency—prioritizing human-curated datasets, enforcing transparency in data provenance, and developing tools to filter AI’s footprint. Blockchain-based data authentication or crowd-sourced verification could anchor AI in reality, preserving the “cold steel” of human insight. Yet, these solutions require collective will, a resistance to the seductive ease of automation’s churn. Without action, the fallout risks a digital dark age where truth drowns in synthetic noise.
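
The essay does not prescribe a particular provenance mechanism, so the sketch below is only illustrative. The provenance_record helper, its name, and its fields are assumptions rather than an existing standard; the point is simply what a minimal record of human authorship might capture: a content hash plus who, where, and when. A real scheme, whether a content-credential standard, a blockchain ledger, or crowd-sourced review, would add cryptographic signatures and an append-only log.

```python
import hashlib
import json
import time

def provenance_record(text: str, author: str, source_url: str) -> dict:
    """Build a minimal, hypothetical provenance record for human-authored
    text: a content hash plus who/where/when metadata (fields illustrative)."""
    return {
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
        "author": author,
        "source_url": source_url,
        "recorded_at": int(time.time()),
    }

# Hypothetical usage: register a human-written snippet before publication.
record = provenance_record(
    "Letters, books, forums, and early websites brimmed with unfiltered perspectives.",
    author="example-author",
    source_url="https://example.org/posts/123",
)
print(json.dumps(record, indent=2))
```

Content later scraped for training could then be checked against such records before it is allowed to stand in for human-generated data.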
The cold-steel problem is no mere technical glitch; it’s a philosophical reckoning. AI, for all its prowess, cannot replicate the spark of human creativity or the grit of lived experience. As we stand at this precipice, the choice is clear: safeguard the authenticity of human data or surrender to a future where information is a pale shadow of its potential. The shipwrecks of our pre-AI past hold treasures worth salvaging—not just for AI’s sake, but for the soul of our shared knowledge. Act now, or the corrosion of our digital ecosystem will be a legacy of our own making.

Sources
- Shumailov, I., et al. (2024). AI models collapse when trained on recursively generated data. Nature, 631, 755–759. https://www.nature.com/articles/s41586-024-07566-y
- Alemohammad, S., et al. (2024). Self-Consuming Generative Models Go MAD. International Conference on Learning Representations (ICLR). https://news.rice.edu/news/2024/breaking-mad-generative-ai-could-break-internet
- Model collapse. (2024, March 6). Wikipedia. https://en.wikipedia.org/wiki/Model_collapse
- Rice University. (2024, July 30). Breaking MAD: Generative AI could break the internet, researchers find. ScienceDaily. https://www.sciencedaily.com/releases/2024/07/240730134750.htm
- Dohmatob, E., et al. (2024). A Tale of Tails: Model Collapse as a Change of Scaling Laws. International Conference on Machine Learning (ICML). https://nyudatascience.medium.com/overcoming-the-ai-data-crisis-a-new-solution-to-model-collapse-2d36099be53c
- Shumailov, I., et al. (2023). AI-Generated Data Can Poison Future AI Models. Scientific American. https://www.scientificamerican.com/article/ai-generated-data-can-poison-future-ai-models/



