The number of questions being asked on StackOverflow is dropping rapidly Like cutting down a forest without growing new trees, the AI corporations seem to be consuming the natural raw material of their money-making machines faster than it can be replenished.
Natural, human-generated information, be they works of art, or conversations about factual things like how to write software, are the source of training data for Large Language Models (LLMs), which is what people are calling “artificial intelligence” nowadays. LLM shops spend untold millions on curating the information they harvest to ensure this data is strictly human-generated and free of other LLM-generated content. If they do not do this, the non-factual “hallucinations” (fictional content) that these LLMs generate may come to dominate the factual human-made training data, making the answers that the LLMs generate increasingly more prone to hallucination.
The Internet is already so full of LLM-generated content that it has become a major problem for these companies. The new LLMs are more and more often trained on fictional LLM-generated content that passes as factual and human-made, which is rapidly making LLMs less and less accurate as time goes on — a viscous downward spiral.
But it gets worse. Thanks to all of the AI hype, everyone is asking questions of LLMs nowadays and not of other humans. So the source of these LLMs training data, web sites like StackOverflow and Reddit, are now no longer recording as many questions from humans to other humans. If that human-made information disappears, so does the source of natural resources that make it possible to build these LLMs.
Even worse still, if there are any new innovations in science or technology, unless humans are asking question to the human innovators, the LLMs can’t learn about these things innovations either. Everyone will be stuck in this churning maelstrom of AI “slop,” asking only questions that have asked by millions of others before, and never receiving any true or accurate answers on new technology. And nobody, neither the humans nor the machines, will be learning anything new at all, while the LLMs become more and more prone to hallucinations with each new generation of AI released to the public.
I think we are finally starting to see the real limitations of this LLM technology come into clear view, the rate at which it is innovating is simply not sustainable. Clearly pouring more and more money and energy into scaling up these LLM project will not lead to increased return-on-investment, and will definitely not lead to the “singularity” in which machine intelligence surpasses human intelligence. So how long before the masses finally realize they have been sold nothing but a bill of goods by these AI corporations?