Doggy-Style? Why AI is creating so many Golden Retrievers (A Study by Nature)

Image: Marlies Kloet, own work. Golden Retriever, Power of Gold, Sticker. CC BY-SA 3.0. File: At Sticker Lumis Golden.JPG (created 6 July 2012, uploaded 12 September 2012).

During the last AI hype (the end of which we are currently witnessing), many AI experts believed that bigger, faster, deeper learning would yield better, more correct and faster results. Turns out that isn’t what happened: in our reality, AI-generated data started flooding the internet with useless “slop”, a process experts soon started calling “enshittification” – a nightmare for today’s forum administrators and mailing list maintainers. And now there’s less and less diversity in dogs? How could that happen?

AI Exploiting the Planet and Polluting the Internet

Enshittification is what happens when AI pollutes the internet with nonsensical data, created by an AI that was itself fed mostly on AI output. The more AI models learn from data generated by AI models, the lower the quality of their output. Cory Doctorow coined the term enshittification. In the sense I use it here: bots create meaningless gibberish and post it in your favorite social network or forum, later their scrapers and crawlers come back to learn from their own nonsense, and then the AI creates even worse BS to feed the networks. The process keeps running, making it hard to find substance and content. Doctorow is not the only one who says it can already be seen in the decline of Google search result quality over the last few years. If you read Cory and others, you’ll see that my description above is far from complete, and that it doesn’t even hit the worst issues. Go read Cory himself, it’s complex.

In 2024, several large studies discussed the topic of dogfooding AI in great depth, including comparisons with mad cow disease. Remember? We fed cow remnants to cows, causing weird prion brain diseases and utter madness in the poor cattle’s behaviour. Now there are terms like the MAD AI syndrome and autophagy, a brain disease for AI, but on a digital, kind of “meta”, informational layer. A virus of the mind?

Greedy dogfooding till the models collapse

A few days ago a Nature paper showed the effects of AI being dogfooded on its own outputs – or rather, only the effects of the sloppy manure of enshittification. All the downsides of LLM-based AI tech can be found in there: its bias, partisan selection, simplified statistical modelling and omission of key facts. Currently, there’s a Golden Retriever takeover happening on the web. You don’t believe it? Have a look:

Believe it or not: AI makes Golden Retrievers take over the web. (Nature)

Why? Well, the most popular dog breeds are over-represented in the training data, just as they are on the web. And the learning algorithms tend, no matter how they work, to follow a “winner takes it all” scheme. That leaves no other dog breed in the race (sorry for the pun). And once we have nothing left but Golden Retrievers, the models collapse and dogs won’t be what they used to be: only the “dominant features” of what used to be a retriever face remain prominent in the AI’s output.
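If you want to see that feedback loop in numbers, here is a toy sketch in Python. It is not the method from the Nature paper. The starting shares are invented, and the `sharpen` exponent is just my stand-in for the “winner takes it all” tendency: in each generation the model is retrained on its own output and over-produces whatever was already common.

```python
def retrain_on_own_output(shares, sharpen=1.5):
    """One generation: the model over-produces what was already common.

    `sharpen` > 1 is a made-up stand-in for the "winner takes it all"
    tendency. It is an assumption, not a parameter from the Nature paper.
    """
    powered = {breed: share ** sharpen for breed, share in shares.items()}
    total = sum(powered.values())
    # Renormalise so the shares add up to 1 again.
    return {breed: value / total for breed, value in powered.items()}


# Invented starting shares of dog pictures on the web (illustrative only).
shares = {
    "golden retriever": 0.40,
    "labrador": 0.30,
    "dalmatian": 0.20,
    "pug": 0.10,
}

# history[n] holds the breed shares after n rounds of retraining.
history = [shares]
for generation in range(10):
    shares = retrain_on_own_output(shares)
    history.append(shares)

for generation in (0, 1, 5, 10):
    top = history[generation]["golden retriever"]
    print(f"generation {generation:2d}: golden retriever share = {top:.3f}")
```

In this toy setup the front-runner’s share can only grow from one generation to the next, and the rarest breed fades first. Real models are far messier, but the direction of travel is the same.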

And then, like in real life, a human has to clean up. Some prompt engineer has to come in, some coder has to rebalance the weights and add some fences, just like real-life dog owners have to clean up after their dogs with plastic doggy bags (not the ones you get in restaurants, that’s for sure).

However, the Nature paper has a nice take on the whole story in its last paragraph:

“The companies that sourced training data from the pre-AI Internet might have models that better represent the real world. It will be interesting to see how this plays out, as more companies race to make their mark in the generative-AI space – and, in doing so, populate the Internet with increasing amounts of AI-produced content.”

Looks like an unsolvable conundrum: the better they get, the worse the output; the more new data you train on, the worse the output. If only someone with a minimum understanding of statistics had told us at the start, then the 2 trillion USD might have helped where we really need them more.

“When you let other people tell you what’s right
When you leave your instinct and your own truth behind
That’s a virus of the mind,
That’s a virus of the mind

(…)

And I ended up feeling like I was a freak
So I found some wine and something to eat
And I talked to the dog to pass the time
I told myself, “I’m doing just fine”
It’s just a virus of the mind”

(Heather Nova, “Virus of the Mind”)