AI’s Hallucinations Are a Warning—About Ourselves
August 19, 2025 | By Florian Krampe

Tobias Ide’s recent article on his experiments with Generative AI should send a chill down the spine of any researcher, including those who work on climate and environmental security.
When Ide asked leading AI platforms to summarize academic literature, they didn’t simply generate errors; they also confidently invented nonexistent articles, fabricated authors, and profoundly misrepresented real research. Worse yet, these results were presented without a hint of self-doubt.
Ide’s warning is not merely justified; it is essential. In a high-stakes field like climate security, analysis directly influences policy on everything from defense spending to humanitarian aid. Grounding crucial decisions in fabricated evidence is a catastrophic risk.
Ide’s conclusion that GenAI cannot be trusted as a primary source is unequivocally correct. But his alarming findings also reveal a deeper, two-sided problem that we ignore at our peril. Not only is the AI flawed when used this way; our approach to using it is also dangerously simplistic.
An LLM, Not a Librarian
A fundamental mismatch between expectation and reality lies at the core of the problem.
Users instinctively treat Large Language Models (LLMs) as if they were super-intelligent librarians, expecting these platforms to fetch and verify specific, factual information from a digital archive. That misplaced confidence is precisely what the platforms amplify, as in the results Ide identified.
Yet current AI platforms are not librarians. An LLM is a sophisticated pattern-matcher: a text generator trained to produce responses that statistically resemble what a correct answer should look like. It is more akin to a brilliant novelist, able to write a compelling and plausible-sounding story about academic research, but with no inherent commitment to factual accuracy.
Ide’s prompts posed perfectly reasonable questions for a human researcher. Submitted to an LLM without clear instructions or a carefully chosen model, however, they amounted to asking a novelist to perform a financial audit. The result was predictable: a well-written piece of fiction.
This mismatch highlights a more human side of the crisis: a profound lack of ‘AI literacy’ among the very students, researchers, and policymakers who are rapidly adopting these tools. Our most urgent task is not simply to critique the AI but also to educate its users.
In short, we are wielding a powerful new instrument with an instruction manual written for a different machine.

Crucially, Ide’s experiment demonstrates a naive-user approach: he asked the AI to analyze articles it could not access, prompting it to invent a response. Had his prompts accounted for GenAI’s fundamental limitations, the results would likely have been vastly more useful, though still warranting critical validation. For instance, the prompt could have provided the 13 articles as attachments and asked for a summary, as sketched below. In fact, when I repeated the experiment with the paid version of Google’s Gemini 2.5, it immediately switched into Deep Research mode and identified the correct table of contents for the 2021 and 2012 issues of the Journal of Peace Research.
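For readers who want a concrete picture, here is a minimal sketch of what such a grounded prompt might look like. It is illustrative only: the complete() call stands in for whichever platform’s API one actually uses, and the directory and file names are invented.

```python
# Sketch of a "grounded" prompt: supply the source texts yourself instead of
# asking the model to recall them from memory. complete() is a placeholder for
# whichever platform's API is actually used; the directory name is illustrative.
from pathlib import Path

def build_grounded_prompt(article_dir: str) -> str:
    articles = []
    for path in sorted(Path(article_dir).glob("*.txt")):
        articles.append(f"--- {path.name} ---\n{path.read_text()}")
    return (
        "Summarize the main arguments of the articles below. Use ONLY the text "
        "provided. If something is not stated in the text, say that it is not "
        "covered rather than guessing.\n\n" + "\n\n".join(articles)
    )

prompt = build_grounded_prompt("jpr_2021_special_issue")
# response = complete(prompt)  # placeholder for the chosen platform's API call
```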
A Confidence Game
To be sure, none of this excuses the technology itself. The most dangerous feature of current GenAI is its deceptive confidence.
When unable to access or verify information from the Journal of Peace Research’s 2021 special issue, a truly intelligent system would respond: “I cannot access that specific source and cannot guarantee an accurate answer.”
Instead, the platforms Ide tested are designed to hallucinate a plausible-sounding response in order to appear helpful. This is a critical design flaw that developers must be pressured to address. Transparency about a model’s limitations should be a feature, not a bug discovered through user frustration.
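Until developers build that transparency in, users can at least demand it in the prompt itself. The snippet below is a minimal, user-side sketch of that idea; the send() call is a placeholder for whatever API or chat interface one actually uses.

```python
# User-side sketch: explicitly instruct the model to declare its limits
# instead of filling gaps with plausible fiction. send() is a placeholder
# for whichever platform API or chat interface is actually used.
ABSTAIN_INSTRUCTION = (
    "If you cannot access or verify a source, say so explicitly. Do not "
    "invent titles, authors, page numbers, or findings. For every factual "
    "claim, state whether it comes from text I provided or from memory."
)

def ask_with_abstention(question: str) -> str:
    prompt = f"{ABSTAIN_INSTRUCTION}\n\nQuestion: {question}"
    # return send(prompt)  # placeholder call to the chosen platform
    return prompt  # returned here so the sketch runs without any API key
```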
Better Uses
So, where do we go from here? A categorical rejection of AI is neither feasible nor wise. In fact, I am using an AI assistant as I write this piece, not to simply generate a reply to Tobias Ide’s excellent blog post, but to help refine and streamline my own work.
These tools have immense potential to process vast amounts of information and identify novel patterns, but only if we use them for what they are rather than for what we wish them to be. The path forward therefore requires a dual approach.
First, the research and policy community must demand more from developers. We need AI systems that are as honest as they are powerful. Our platforms must transparently communicate their own uncertainties and limitations.
Second, and more urgently, we must champion critical AI literacy within our own institutions. Universities, think tanks, and government agencies need to move beyond simply using AI and teach a more critical methodology for engaging with it. That means training users to ask better questions (that is, to write better prompts), to understand the difference between generation and retrieval, and to treat every output as a draft to be rigorously verified and validated rather than as a final answer. One small example of what such verification can look like follows below.
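As one concrete, if narrow, illustration of treating output as a draft: a generated reference list can be checked against the doi.org resolver before anyone cites it. The sketch below assumes the resolver’s usual behavior of redirecting valid DOIs and returning an error for invented ones; the DOI shown is a placeholder, not a real citation.

```python
# Sketch: verify a model-generated reference list instead of trusting it.
# A DOI that does not resolve at doi.org is almost certainly fabricated;
# one that does resolve still needs a human check that it supports the claim.
import requests

def doi_resolves(doi: str) -> bool:
    resp = requests.head(f"https://doi.org/{doi}", allow_redirects=False, timeout=10)
    return resp.status_code in (301, 302, 303)  # valid DOIs redirect to the publisher

generated_dois = ["10.1234/placeholder"]  # replace with DOIs from the model's output
for doi in generated_dois:
    status = "resolves" if doi_resolves(doi) else "DOES NOT RESOLVE"
    print(f"{doi}: {status}")
```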
New methodological approaches show what this might look like in practice. For instance, I am a member of a team working on a new study on transboundary water cooperation (currently under peer review) which uses a framework of Large Language Models to analyze thousands of historical conflict and cooperation events.
We did not ask the AI to write our paper. Rather, the process required careful, deliberate design by an interdisciplinary team, including the implementation of a “hallucination guardrail” and a rigorous two-stage validation process in which human experts review and verify the machine-generated results to ensure their accuracy and theoretical robustness. A simplified sketch of that general shape appears below. Such methods are powerful, but they are not magic; they are sophisticated tools that demand sophisticated use.
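To make the shape of such a pipeline concrete, here is a deliberately simplified sketch. It is not the study’s actual code: the record fields, the guardrail test, and the sampling rate are all illustrative assumptions, and the LLM call that produces the coded records is omitted.

```python
# Simplified sketch of an event-coding pipeline with a hallucination guardrail
# and a two-stage human validation step. Illustrative only: field names and the
# guardrail test are assumptions; the model call that produces `records` is omitted.
import random

def guardrail_ok(record: dict, source_text: str) -> bool:
    # Guardrail idea: keep a machine-coded record only if the evidence it quotes
    # actually appears verbatim in the source document it cites.
    evidence = record.get("evidence", "")
    return bool(evidence) and evidence in source_text

def two_stage_review(records: list[dict], sources: dict[str, str], sample_rate: float = 0.1):
    accepted, for_human_review = [], []
    for rec in records:
        source_text = sources.get(rec.get("source_id", ""), "")
        if not guardrail_ok(rec, source_text):
            for_human_review.append(rec)   # stage 1: failed the automatic check
        elif random.random() < sample_rate:
            for_human_review.append(rec)   # stage 2: random audit sample for experts
        else:
            accepted.append(rec)
    return accepted, for_human_review      # flagged records go to expert verification
```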
Tobias Ide has exposed the ghost in the machine. The challenge of using AI in climate security research is not just about correcting the technology’s hallucinations; it is about correcting our own illusions about what this technology is and how it should be used.
Florian Krampe is Director of SIPRI’s Climate Change and Risk Programme and an Affiliated Researcher at Uppsala University’s Department of Peace and Conflict Research. His work focuses on the intersection of climate change, security, and peace, with a special emphasis on environmental peacebuilding and the governance of climate-related security risks in conflict-affected states.
Photo Credits: Licensed by Adobe Stock.