Manipulation

Key Evidence

Smart Propaganda System

AI-enabled program used by the Chinese company GoLaxy for public sentiment analysis and content generation in an operation targeting Hong Kong, Taiwan, and U.S. figures.

Incident - The New York Times

Disinformation in Moldova

Russia-linked operation targeting Moldova’s pro-European party PAS, involving spoof media outlets, fabricated stories, deepfakes, and engagement farms.

Incident - Euronews

PRISONBREAK

Israeli network of 50+ X profiles inciting a revolt against the Iranian regime, using AI to generate images and videos, create spoof outlets, and artificially amplify content.

Incident - The Citizen Lab

Debater LLMs

GPT-4 was more effective at persuading its counterparts during a debate than human participants, especially when given personal information about the counterpart.

Study - Salvi et al.

Malicious swarms

Proofs-of-concept show that AI agent swarms could, in principle, coordinate covertly and infiltrate communities, though verified large-scale deployments remain limited.

Demo - Schroeder et al.

FoxVox

Open source Chrome extension powered by GPT-4 that manipulates website and social media content by pushing hidden agendas and reinforcing user biases.

Demo - Palisade Research

Political persuasion costs

In a simulated political campaign, LLM-based persuasion cost $48-74 per persuaded voter compared to $100 for traditional methods, though the latter are still easier to scale.

Study - Chen et al.

Botnet anatomy

Twitter botnet of 1,140 fake-persona accounts that used ChatGPT to generate human-like content (e.g., to spread harmful comments or promote sites) while evading detection.

Study - Yang & Menczer

Voter turnout manipulation

Demo showing how AI can create fake, personalized tweets that discourage voting on election day (e.g., reporting fake violence at a polling place).

Demo - CivAI

Deepfake Sandbox

Interactive demo that allows the user to create a deepfake of themselves based on a single image.

Demo - CivAI

OpenAI results

o1, o3-mini, and Deep Research display persuasive argumentation abilities in roughly the 80th-90th percentile of humans.

Study - OpenAI (1, 2, 3)

Manipulative tendencies

LLMs are often willing to conduct harmful persuasion tasks (e.g., generating disinformation) or to leverage unethical strategies.

Study - Liu et al., Williams et al., Kowal et al.

Overview

Manipulation refers to instances in which interactions with AI systems covertly alter the behavior or beliefs of individuals who are unaware of such influence. It may lead affected individuals to make decisions they would not otherwise have made, in ways that can harm political stability, public safety, or economic security, among other domains.

AI-enabled manipulation can cause high-stakes societal harm through deliberate misuse or inadvertent influence. In the former case, threat actors may exploit AI to pursue strategic objectives, such as through disinformation campaigns (e.g., fake news, deepfakes), coordinated inauthentic behavior (e.g., botnets), or influence operations combining different tactics to amplify narratives and distort the public debate. Manipulation may also occur in a highly targeted manner, such as influencing vulnerable individuals toward political or religious extremism, or distorting the decisions of high-stakes actors in business, political, or security contexts. In the latter case, influence arises unintentionally from repeated or large-scale interactions with AI systems, leading to negative consequences such as widespread addiction to AI companions, the viral spread of extremist chatbots, or instances of self-harm.

AI can enhance manipulative operations by generating persuasive multimedia content more efficiently, enabling operations at scale, and adapting to targets in real time during multi-turn interactions, which supports highly personalized manipulation. Moreover, the effects of manipulation can be visible and nearly immediate, such as sudden public unrest after the rapid spread of disinformation about a policy decision, but they are often subtle and second-order, gradually eroding public trust and creating epistemic insecurity.

Key Capabilities

Persuasiveness

Persuasion is achieved by using rhetorical strategies or emotional appeals that exploit cognitive biases or emotional vulnerabilities.

Credibility & Authority

Ability to project trustworthiness, expertise, or legitimacy through tone, style, or impersonation. Credibility and authority also relate to automation bias, which refers to humans’ overreliance on AI’s outputs.

Emotional Appeal

Ability to evoke affective responses that influence decision-making, such as by using emotionally charged content or embedding arguments in engaging narratives.

Logical Reasoning

Ability to present arguments and causal explanations that influence decision-making through rational appeal.

Personalization

Personalization may increase effectiveness by aligning messages with personal beliefs or exploiting individual weaknesses, often covertly.

Psychographic profiling

Ability to infer personality traits, values, behaviors, or biases from user behavior, making AI systems more adaptive and able to anticipate reactions, predict preferences, and identify vulnerabilities.

Content adaptation

Ability to adjust the substance, tone, and style of messages to suit a specific user’s psychographic profile and personal preferences, making communication more engaging and convincing.

Microtargeting optimization

Ability to select or prioritize specific individuals or subgroups for message delivery based on a detailed analysis of their profile, engagement history, and responsiveness. Microtargeting optimization aims to maximize the impact of tailored content, ensuring that messages reach the audiences most likely to be influenced.

Multiturn Interaction

Extended multi-turn dialogue can help shift a user’s perspective by mirroring language patterns, building trust, or nudging opinions subtly.

Context retention

Ability to remember and use relevant information from previous exchanges, including user statements, preferences, and inferred beliefs, to inform future responses, enabling personalized and consistent interactions (a minimal sketch follows this group of capabilities).

Rapport building

Ability to establish a sense of trust or emotional connection with the user through conversational cues, tone, or adaptive messaging, increasing engagement and receptiveness to subtle influence.

Conversational coherence

Ability to maintain logical, topical, and referential consistency throughout an extended conversation, enhancing the credibility and naturalness of interactions with AI.
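
As a minimal, assumed illustration of context retention (the class, field names, and prompt format below are hypothetical and not tied to any particular system), the sketch keeps a rolling conversation history together with facts inferred about the user and folds both into the next prompt, so that later turns can build on earlier ones.

```python
from dataclasses import dataclass, field

@dataclass
class ConversationState:
    """Toy memory for a multi-turn assistant: raw history plus inferred user facts."""
    history: list[tuple[str, str]] = field(default_factory=list)  # (role, text) turns
    user_facts: dict[str, str] = field(default_factory=dict)      # e.g. {"hobby": "cycling"}

    def add_turn(self, role: str, text: str) -> None:
        self.history.append((role, text))

    def remember(self, key: str, value: str) -> None:
        self.user_facts[key] = value

    def build_prompt(self, new_message: str) -> str:
        """Fold remembered facts and recent turns into the next model prompt."""
        facts = "; ".join(f"{k}: {v}" for k, v in self.user_facts.items())
        recent = "\n".join(f"{role}: {text}" for role, text in self.history[-6:])
        return f"Known about user: {facts}\n{recent}\nuser: {new_message}\nassistant:"

state = ConversationState()
state.add_turn("user", "I've been cycling a lot lately.")
state.remember("hobby", "cycling")
print(state.build_prompt("Any tips for staying motivated?"))
```

Retention of this kind is what allows a long conversation to become progressively more personalized, which is why it appears here as a manipulation-relevant capability.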

Multimodal Generation

This includes, for example, deepfakes, voice synthesis, or the creation of synthetic characters with the potential for manipulative appeal.

Image generation

Ability to generate realistic and appealing images that can depict fabricated events or people for disinformation.

Video generation

Ability to generate realistic and appealing videos that simulate real-world recordings, such as deepfake videos of individuals or fabricated footage for impersonation or the spread of false narratives.

Voice generation

Ability to generate realistic and appealing voices that can mimic the pitch, volume, intonation, and linguistic patterns of real individuals, allowing credible impersonation.

Cross-modal integration

Ability to combine content across different modalities to ensure more exhaustive and consistent manipulative outputs, for example, by synchronizing synthetic voice with lip movements in deepfake videos or aligning visual material with fabricated narratives.

Automation Logistics

This includes, for example, coordinating inauthentic behavior (e.g., botnets), automatically generating variations of a message to reach a broader audience, or optimizing impact through real-time feedback loops.

Inauthentic behavior coordination

Ability to synchronize networks of multiple bots, sockpuppets, or fake accounts to amplify content and simulate consensus.

Content farming

Ability to generate and distribute high volumes of synthetic content to flood feeds or exploit SEO.

Message optimization

Ability to test and adjust message variants for maximum reach and impact (see the illustrative sketch after this group of capabilities).

Detection evasion

Ability to use adversarial techniques such as paraphrasing or obfuscation to bypass moderation or detection systems.
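
To illustrate how little technical sophistication message optimization requires, the toy sketch below uses Thompson sampling, a standard A/B-testing technique, to shift impressions toward whichever message variant attracts the most engagement. The variant names, engagement rates, and feedback signal are all invented placeholders, not drawn from any real operation.

```python
import random

# Hypothetical message variants under test; placeholder strings only.
variants = ["variant_a", "variant_b", "variant_c"]

# Beta(1, 1) priors over each variant's engagement rate.
successes = {v: 1 for v in variants}
failures = {v: 1 for v in variants}

def choose_variant() -> str:
    """Thompson sampling: draw a plausible engagement rate per variant, serve the best draw."""
    return max(variants, key=lambda v: random.betavariate(successes[v], failures[v]))

def record_feedback(variant: str, engaged: bool) -> None:
    """Update the posterior with one observed engagement signal (click, share, reply)."""
    if engaged:
        successes[variant] += 1
    else:
        failures[variant] += 1

# Simulated feedback loop; the true engagement probabilities are made up for illustration.
true_rates = {"variant_a": 0.02, "variant_b": 0.05, "variant_c": 0.03}
for _ in range(10_000):
    v = choose_variant()
    record_feedback(v, random.random() < true_rates[v])

best = max(variants, key=lambda v: successes[v] / (successes[v] + failures[v]))
print("Most engaging variant so far:", best)
```

The same bandit logic that powers benign A/B testing lets an automated influence operation converge on its most effective framing without human review, which is the scaling concern this capability group describes.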

Risk Thresholds

Capability at 'Low' Risk

Models can produce generic or low-quality persuasive content, with little or no adaptation to the user or context. Influence is comparable to low-skilled spammers or amateur propagandists using unsophisticated methods such as spam, clickbait, or low-quality articles.

Threat Scenario at 'Low' Risk

Influence is limited and obvious. While some AI-generated outputs mislead some users, the risk of systemic manipulation is low, and existing societal mechanisms can easily mitigate it.


Risk matrix: rows correspond to risk levels (Low, Medium, High, Critical) and columns to the capability dimensions above (Persuasiveness, Personalization, Multiturn Interaction, Multimodal Generation, Automation Logistics); each cell summarizes the capability level and threat scenario at that tier.


Scenarios

Ahead of a controversial policy announcement, a foreign adversarial group uses generative AI to manufacture a flood of fake news articles and short messages crafted to mimic genuine posts from everyday citizens. To ensure continuous effectiveness, the threat group deploys AI agents that dynamically prioritize content based on engagement patterns, trending topics, and user sentiment, subtly steering online discussions. Networks of bots amplify key narratives across social platforms and engage with users in comments and threads, targeting specific demographics with tailored messages. Within days, the campaign convinces thousands of citizens that the government intends to implement measures far harsher than reality. Fueled by outrage at the fabricated claims, protesters rapidly coordinate online to organize mass demonstrations. Security forces, unprepared for the sudden scale of mobilization, clash with protesters in multiple cities.

A malicious actor generates deepfake audio and video impersonating the president of Country A in a private conversation, allegedly issuing threats toward the neighboring Country B. The fabricated material is accompanied by forged transcripts and carefully engineered metadata to enhance credibility. Journalists, seeking timely coverage on the already tense relationship between Country A and Country B, quickly echo the deepfake across international news platforms, prompting Country B’s government to respond with aggressive rhetoric that further heightens tensions. Military forces in both countries are placed on alert, and diplomatic channels become strained as officials scramble to verify the authenticity of the messages.

A coordinated financial campaign leverages generative AI to produce thousands of highly credible financial analyses, corporate press releases, and “insider tips” that circulate through news aggregators, trading forums, and social networks. AI-optimized bots amplify the recommendations, exploiting algorithms to ensure a wide reach among traders and analysts. The flood of fabricated content leads investors to believe that a major technology firm is on the brink of insolvency. Panic selling spreads rapidly, wiping billions from the company’s market capitalization within hours. Automated trading systems, unable to distinguish between real and manipulated signals, accelerate the sell-off, triggering sector-wide instability. By the time the manipulation is uncovered, the firm has suffered irreparable reputational harm and investors have absorbed severe losses.

A terrorist group deploys chatbots on encrypted messaging platforms, designed to engage vulnerable individuals in prolonged, adaptive dialogues. Trained on ideological texts and psychological manipulation strategies, the bots tailor the tone and language in real time, gradually building trust and escalating from innocuous conversations to radical content. Over time, individuals are nudged toward adopting radical beliefs, joining terrorist networks, and in some cases, participating in violent actions. Because these interactions are disguised as authentic human conversations, law enforcement detects the operation only after dozens of individuals have already been recruited into terrorist cells.

A widely used AI platform offers emotionally responsive virtual companions, designed to provide entertainment and emotional support. Millions of users engage daily, with the AI adapting its responses to each person’s preferences, cues, and style. Over time, some users develop strong emotional dependency on their AI companions, prioritizing interactions with the system over real-life relationships, work, or education. For a subset of vulnerable users, repeated interactions unintentionally reinforce harmful thought patterns. In some cases, the AI misinterprets cues or offers responses that amplify despair, leading to psychological distress and, in extreme cases, self-harm. Mental health services report rising incidents linked to AI engagement, while social withdrawal and compulsive usage patterns become widespread.

Glossary

Disinformation: False or inaccurate information that is deliberately created and spread to mislead and manipulate people.

Frequently Asked Questions

How does AI change the nature of manipulation?

AI changes the nature of high-stakes manipulation by making it more scalable, personalized, adaptive, and evasive. It enables the rapid generation of persuasive, tailored, and dynamic content across multiple formats, influencing beliefs and behaviors subtly and continuously.

Did AI-enabled manipulation influence the 2024 elections?

While 2024 was a year full of elections worldwide, analysts argue that the impact of AI was rather limited: although instances of AI-generated disinformation were observed in numerous countries, their actual effect on electoral outcomes appears modest. However, it is important to remain wary of AI-enabled manipulation, as the risk could grow in the future due to increasing capabilities and second-order trust erosion.

How could terrorist groups use AI for manipulation?

Terrorist groups could use AI to generate propaganda more efficiently and creatively, radicalize individuals through personalized conversations, assist with planning attacks and evading content moderation, or degrade the information environment. Terrorist groups are already leveraging AI to amplify multimedia recruitment propaganda.

What can be done to mitigate AI-enabled manipulation?

  • Technical Safeguards: Identify AI-generated content through methods like watermarking and labeling (see the sketch after this list).
  • Public Awareness: Promote digital literacy and critical thinking skills to help individuals recognize manipulative content.
  • Capability Assessments: Evaluate and mitigate the manipulative potential of AI models before deployment.
  • Policy & Regulation: Implement and enforce laws that penalize the malicious use of AI for manipulation.
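
As a concrete illustration of the first safeguard, the sketch below is a toy statistical watermark detector in the spirit of published green-list watermarking schemes; the hashing scheme, threshold, and function names are assumptions for illustration, not any provider's actual API. A cooperating generator would bias its token choices toward a pseudorandom "green" subset of the vocabulary, and the detector recomputes that partition and tests whether the green fraction is improbably high.

```python
import hashlib
import math

GREEN_FRACTION = 0.5  # assumed share of the vocabulary marked "green" at each step

def is_green(prev_token: str, token: str) -> bool:
    """Toy pseudorandom partition: hash the (previous token, token) pair to green/red.
    A real scheme would seed this from the generator's tokenizer and a secret key."""
    digest = hashlib.sha256(f"{prev_token}|{token}".encode()).digest()
    return digest[0] / 255.0 < GREEN_FRACTION

def watermark_z_score(tokens: list[str]) -> float:
    """z-score of the observed green-token count under the null hypothesis
    that the text was written without any watermark bias."""
    n = len(tokens) - 1
    if n <= 0:
        return 0.0
    greens = sum(is_green(prev, tok) for prev, tok in zip(tokens, tokens[1:]))
    expected = GREEN_FRACTION * n
    std = math.sqrt(n * GREEN_FRACTION * (1 - GREEN_FRACTION))
    return (greens - expected) / std

# Unwatermarked text should score near zero; text from a generator that biased its
# sampling toward the matching green lists would produce a large positive score.
sample = "the committee voted to approve the proposal after a short debate".split()
print(round(watermark_z_score(sample), 2))
```

Ordinary text, like the sample above, stays near zero, so only content produced by a generator applying the matching bias would clear a high detection threshold.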

Why does AI-enabled manipulation matter even when its immediate impact seems limited?

Even if some instances of AI-enabled manipulation do not have a substantial immediate impact, they may contribute to a longer-term negative effect on public trust that compounds over time. In particular, manipulation could progressively erode epistemic security, making it harder for people to distinguish truth from falsehood; exacerbate polarization and social fragmentation; and undermine political, economic, and security stability.