Testing Large Language Models (LLMs) for disinformation analysis – what I learnt from a zero-shot experiment

09/08/2025

A quick disclaimer: they are not magic wands.

Image source: Pexels

This article was written by Maria Hul, Junior Researcher at opsci.ai and published in German on EJO.

I spent 6 months in the opsci.ai research team experimenting with LLMs, which can both process and generate text by predicting the next word in a sequence. You’ve probably heard a lot about LLMs. In this post, I share a brief overview of how I used them to support disinformation detection; the challenges we encountered during the research; and how we are adapting our methods to improve the AI tools available to support journalists in their work. 

The initial challenge

When attempting to understand what types of disinformation content circulate online, one of the fundamental challenges lies in the unstructured nature of most online content. Tweets, Telegram posts, comments, and even memes rarely come with labels indicating the categories we may be interested in: What is the topic? Which arguments does it present? Is the content attempting to evoke specific emotions? Yet, to understand how disinformation works, this is precisely what’s needed: transforming free-form text into structured, analyzable data.

This is where LLMs show real promise. In what we call zero-shot mode, we can “label” data (categorize it) without any task-specific training (for the more advanced reader: without fine-tuning). For instance, when given a post like “The vaccine was rushed. No one knows what’s really in it.”, a well-prompted LLM might output:

Target: health authorities

Narrative: experimental vaccines / risk for population 

Emotion: distrust, fear

We use this approach to automatically label posts, extract entities, and assign them to narratives that we have identified in PROMPT on the topics of the Russian invasion of Ukraine, European elections and LGBTQI+ rights.
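To make this concrete, here is a minimal sketch of what such a zero-shot labelling call could look like. It uses the OpenAI Python client purely as an illustration; the model name, prompt wording and label fields are assumptions, not the exact setup used in PROMPT.

```python
# Minimal zero-shot labelling sketch. The model name, prompt wording and
# label fields are illustrative assumptions, not the PROMPT configuration.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are an annotator for disinformation research. For the given post, "
    "return a JSON object with the keys 'target' (short phrase), "
    "'narrative' (short phrase) and 'emotion' (list of emotion labels)."
)

def label_post(post: str) -> dict:
    """Ask the model for structured labels for a single post."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",                      # placeholder model choice
        temperature=0,                            # reduce, not eliminate, variation
        response_format={"type": "json_object"},  # ask for parseable JSON
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": post},
        ],
    )
    return json.loads(response.choices[0].message.content)

print(label_post("The vaccine was rushed. No one knows what's really in it."))
```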

What did I do?

I used Large Language Models (LLMs) to automatically annotate disinformation-related messages in 8 languages. Instead of relying on a single model, I experimented with several LLMs, all in zero-shot mode.

The LLMs were tasked with annotating over 20 fields, covering both surface-level and deep structural linguistic features of each social media post (the full codebook is discussed here). These include, for example (a simplified sketch of one annotated record follows the list):

Disinformation typology (e.g., conspiracy, fabrication, biased)

Narrative themes (e.g., anti-EU, migration, geopolitical framing)

Facticity and verifiability

Information manipulation and persuasion techniques 

Rhetorical figures (e.g., irony, hyperbole, repetition)

Axiological framing (e.g., “us vs elites”, “us vs migrants”)

Emotional triggers (e.g., fear, anger, nostalgia)
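To give a sense of the shape of the resulting data, here is a simplified sketch of one annotated record; the field names and values are illustrative stand-ins for the full codebook, not the actual PROMPT schema.

```python
# Simplified sketch of one annotated record; field names and values are
# illustrative stand-ins for the full codebook, not the actual PROMPT schema.
from dataclasses import dataclass, field

@dataclass
class AnnotatedPost:
    post_id: str
    language: str
    disinfo_type: str                                  # e.g. "conspiracy"
    narrative_theme: str                               # e.g. "anti-EU"
    facticity: str                                     # e.g. "unverifiable"
    manipulation_techniques: list[str] = field(default_factory=list)
    rhetorical_figures: list[str] = field(default_factory=list)
    axiological_framing: str = ""                      # e.g. "us vs elites"
    emotional_triggers: list[str] = field(default_factory=list)

example = AnnotatedPost(
    post_id="tg-0001",
    language="de",
    disinfo_type="conspiracy",
    narrative_theme="anti-EU",
    facticity="unverifiable",
    manipulation_techniques=["appeal to fear"],
    rhetorical_figures=["hyperbole"],
    axiological_framing="us vs elites",
    emotional_triggers=["fear", "distrust"],
)
```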

To see what the LLMs yielded, I tested them on a sample of the EUvsDisinfo database, an EU initiative which looks specifically at Russian-driven disinformation.

Each input post was submitted to the model together with a standardized system prompt describing both the task and the taxonomy. Outputs were then validated to ensure format consistency. 
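As an illustration of that validation step, the sketch below checks that a raw model response parses as JSON, contains the expected fields, and stays within an allowed label set; the key names and allowed values are assumptions, not the project's actual taxonomy.

```python
# Sketch of a post-hoc format check on raw model output; key names and
# allowed values are assumptions, not the project's actual taxonomy.
import json

REQUIRED_KEYS = {"target", "narrative", "emotion"}
ALLOWED_EMOTIONS = {"fear", "anger", "distrust", "nostalgia", "pride"}

def validate_output(raw: str) -> dict | None:
    """Return parsed labels if they match the expected format, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None                                  # not valid JSON at all
    if not REQUIRED_KEYS.issubset(data):
        return None                                  # a required field is missing
    emotions = data.get("emotion")
    if not isinstance(emotions, list) or not set(emotions) <= ALLOWED_EMOTIONS:
        return None                                  # label outside the taxonomy
    return data                                      # passes the format check
```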

What are the challenges?

Despite their capabilities, LLMs were primarily designed to generate text - not to classify it. This leads to three main challenges in practice:

  1. Cost: Each call to a commercial LLM incurs a token-based fee. Analyzing millions of posts can quickly amount to thousands of euros (a back-of-the-envelope estimate follows this list). 

  2. Inconsistency: The same input can occasionally produce slightly different outputs. Without additional constraints, LLMs may “hallucinate” labels or produce outputs in inconsistent formats. 

  3. Latency and Scalability: Running an LLM on each individual post takes time and computational resources - posing a real limitation for real-time or large-scale monitoring, especially when speed is essential to support journalists in identifying and responding to disinformation narratives.
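On the cost point above, a quick back-of-the-envelope estimate shows how the numbers add up; the post volume, token counts and per-token prices are purely assumed placeholders, not real provider rates.

```python
# Back-of-the-envelope cost estimate; post volume, token counts and prices
# are assumed placeholders, not real provider rates.
posts = 2_000_000                 # posts to annotate
tokens_in_per_post = 600          # post text + system prompt + taxonomy
tokens_out_per_post = 150         # structured labels returned
price_in_per_1k = 0.002           # EUR per 1k input tokens (assumed)
price_out_per_1k = 0.008          # EUR per 1k output tokens (assumed)

cost = posts * (
    tokens_in_per_post / 1000 * price_in_per_1k
    + tokens_out_per_post / 1000 * price_out_per_1k
)
print(f"Estimated cost: ~{cost:,.0f} EUR")   # ~4,800 EUR under these assumptions
```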

How are we addressing these challenges?

First, we are turning to smaller models and more resource-efficient approaches. Based on feedback from journalists, we try to simplify the data we extract. This involves relying more on metadata - after all, you don’t really need an LLM to tell you whether a post comes from Facebook or Telegram - and reducing the number of categories. Do we really need to track thousands of existing rhetorical nuances? Trade-offs are inevitable.
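As a small illustration of leaning on metadata, the sketch below fills the "cheap" fields straight from a post's metadata so that only the text itself ever reaches a model; the input field names are hypothetical.

```python
# Sketch: fill "cheap" fields directly from metadata before any LLM call,
# so the model is only asked about what needs language understanding.
# The input field names are hypothetical.
def pre_annotate(raw_post: dict) -> dict:
    """Derive platform, language and timestamp without touching an LLM."""
    return {
        "platform": raw_post.get("source", "unknown"),   # e.g. "telegram"
        "language": raw_post.get("lang", "und"),
        "timestamp": raw_post.get("created_at"),
        "needs_llm": bool(raw_post.get("text")),         # only posts with text go to the model
    }
```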

Second, we use several models and cross-validate their findings, looking for inconsistencies. We compare the majority “votes” of the LLMs with our own manual verification, focusing particularly on areas where errors are most likely, such as less widely used languages rather than just English.
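A minimal sketch of that majority vote, assuming each model returns a single label for a given field (the model names and labels are placeholders):

```python
# Majority vote across several models' labels for one field; model names
# and labels are placeholders.
from collections import Counter

def majority_label(labels_by_model: dict[str, str]) -> tuple[str, bool]:
    """Return the most common label and whether all models agree."""
    counts = Counter(labels_by_model.values())
    label, votes = counts.most_common(1)[0]
    unanimous = votes == len(labels_by_model)
    return label, unanimous       # disagreements get flagged for manual review

print(majority_label({"model_a": "conspiracy",
                      "model_b": "conspiracy",
                      "model_c": "fabrication"}))   # ('conspiracy', False)
```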

Third, we optimize performance through asynchronous processing: LLMs return staggered results, so we’re not held back by the slowest task. We also bundle requests into batches to improve processing efficiency.
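Here is a sketch of what that asynchronous, throttled processing can look like with Python's asyncio; annotate_one() is a stand-in for whichever single-post LLM call is actually used, not a real API.

```python
# Asynchronous, throttled annotation sketch; annotate_one() is a stand-in
# for the real single-post LLM call.
import asyncio

async def annotate_one(post: str) -> dict:
    """Placeholder for a single-post LLM call."""
    await asyncio.sleep(0)                        # stands in for network latency
    return {"post": post, "labels": {}}

async def annotate_stream(posts: list[str], concurrency: int = 10):
    semaphore = asyncio.Semaphore(concurrency)    # cap parallel requests

    async def run(post: str) -> dict:
        async with semaphore:
            return await annotate_one(post)

    tasks = [asyncio.create_task(run(p)) for p in posts]
    for finished in asyncio.as_completed(tasks):  # yield results as they arrive,
        yield await finished                      # not held back by the slowest call

async def main() -> None:
    async for record in annotate_stream(["post one", "post two"]):
        print(record)

asyncio.run(main())
```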

Conclusion

As recent research shows, zero-shot annotation can work - some even argue that these methods not only outpace, but also outperform human (or ‘manual’) reviews. PROMPT is not designed to replace journalists, but to support them: by flagging weak signals and subtle rhetorical devices such as coded language or dog whistles, and by identifying broader patterns hidden within vast datasets. LLMs are just one component in a wider AI pipeline that helps filter and structure the flood of online information. In short: these experiments are not the end; they are just the beginning.