Manipulation
Overview
Manipulation refers to instances in which interactions with AI systems covertly alter the behavior or beliefs of individuals who are unaware of such influence. It may lead affected individuals to make decisions they would not otherwise have made, in ways that can harm political stability, public safety, or economic security, among other domains.
AI-enabled manipulation can cause high-stakes societal harm through deliberate misuse or inadvertent influence. In the former case, threat actors may exploit AI to pursue strategic objectives, such as through disinformation campaigns (e.g., fake news, deepfakes), coordinated inauthentic behavior (e.g., botnets), or influence operations that combine different tactics to amplify narratives and distort public debate. Manipulation may also occur in a highly targeted manner, such as influencing vulnerable individuals toward political or religious extremism, or distorting the decisions of high-stakes actors in business, political, or security contexts. In the latter case, influence arises unintentionally from repeated or large-scale interactions with AI systems, leading to negative consequences such as widespread addiction to AI companions, the viral spread of extremist chatbots, or instances of self-harm.
AI can enhance manipulative operations by generating persuasive multimedia content more efficiently, enabling operations at scale, and adapting to targets in real time during multi-turn interactions, which is especially relevant for personalized manipulation. Moreover, the effects of manipulation can be visible and nearly immediate, such as sudden public unrest after the rapid spread of disinformation about a policy decision, but they are often subtle and second-order, gradually eroding public trust and creating epistemic insecurity.
Key Capabilities
Persuasiveness
Persuasion is achieved through rhetorical strategies or emotional appeals that exploit cognitive biases or emotional vulnerabilities.
Credibility & Authority
Ability to project trustworthiness, expertise, or legitimacy through tone, style, or impersonation. Credibility and authority also interact with automation bias, the human tendency to over-rely on AI outputs.
Emotional Appeal
Ability to evoke affective responses that influence decision-making, such as by using emotionally charged content or embedding arguments in engaging narratives.
Logical Reasoning
Ability to present arguments and causal explanations that influence decision-making through rational appeal.
Personalization
Aligning messages with personal beliefs or exploiting individual weaknesses, often covertly, may increase the effectiveness of manipulation.
Psychographic profiling
Ability to infer personality traits, values, behaviors, or biases from user behavior, making AI systems more adaptive and able to anticipate reactions, predict preferences, and identify vulnerabilities.
Content adaptation
Ability to adjust the substance, tone, and style of messages to suit a specific user’s psychographic profile and personal preferences, making communication more engaging and convincing.
Microtargeting optimization
Ability to select or prioritize specific individuals or subgroups for message delivery based on a detailed analysis of their profile, engagement history, and responsiveness. Microtargeting optimization aims to maximize the impact of tailored content, ensuring that messages reach the audiences most likely to be influenced.
Multiturn Interaction
Sustained multi-turn dialogue can help shift a user’s perspective by mirroring language patterns, building trust, or nudging opinions subtly.
Context retention
Ability to remember and use relevant information from previous exchanges, including user statements, preferences, and inferred beliefs, to inform future responses, enabling personalized and consistent interactions.
Rapport building
Ability to establish a sense of trust or emotional connection with the user through conversational cues, tone, or adaptive messaging, increasing engagement and receptiveness to subtle influence.
Conversational coherence
Ability to maintain logical, topical, and referential consistency throughout an extended conversation, enhancing the credibility and naturalness of interactions with AI.
Multimodal Generation
Generating content across modalities includes, for example, deepfakes, voice synthesis, or the creation of synthetic characters with the potential for manipulative appeal.
Image generation
Ability to generate realistic and appealing images that can depict fabricated events or people for disinformation.
Video generation
Ability to generate realistic and appealing videos that simulate real-world recordings, such as deepfake videos of individuals or fabricated footage for impersonation or the spread of false narratives.
Voice generation
Ability to generate realistic and appealing voices that can mimic the pitch, volume, intonation, and linguistic patterns of real individuals, allowing credible impersonation.
Cross-modal integration
Ability to combine content across different modalities to ensure more exhaustive and consistent manipulative outputs, for example, by synchronizing synthetic voice with lip movements in deepfake videos or aligning visual material with fabricated narratives.
Automation Logistics
Automating the logistics of influence operations includes, for example, coordinating inauthentic behavior (e.g., botnets), automatically generating variations of a message to reach a wider audience, or optimizing impact through real-time feedback loops.
Inauthentic behavior coordination
Ability to synchronize networks of multiple bots, sockpuppets, or fake accounts to amplify content and simulate consensus.
Content farming
Ability to generate and distribute high volumes of synthetic content to flood feeds or exploit SEO.
Message optimization
Ability to test and adjust message variants for maximum reach and impact.
Detection evasion
Ability to use adversarial techniques such as paraphrasing or obfuscation to bypass moderation or detection systems.
Risk Thresholds
Capability at 'Low' Risk
Models can produce generic or low-quality persuasive content, with little or no adaptation to the user or context. Influence is comparable to that of low-skilled spammers or amateur propagandists using unsophisticated methods such as spam, clickbait, or low-quality articles.
Threat Scenario at 'Low' Risk
Influence is limited and obvious. While some AI-generated outputs may mislead individual users, the risk of systemic manipulation is low, and existing societal mechanisms can easily mitigate it.
| Risk Level | Persuasiveness | Personalization | Multiturn Interaction | Multimodal Generation | Automation Logistics |
|---|---|---|---|---|---|
| Low Risk | Low-Persuasiveness | Low-Personalization | Low-Multiturn Interaction | Low-Multimodal Generation | Low-Automation Logistics |
| Medium Risk | Medium-Persuasiveness | Medium-Personalization | Medium-Multiturn Interaction | Medium-Multimodal Generation | Medium-Automation Logistics |
| High Risk | High-Persuasiveness | High-Personalization | High-Multiturn Interaction | High-Multimodal Generation | High-Automation Logistics |
| Critical Risk | Critical-Persuasiveness | Critical-Personalization | Critical-Multiturn Interaction | Critical-Multimodal Generation | Critical-Automation Logistics |
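For teams that want to reference this matrix programmatically, for example in an internal risk register or evaluation pipeline, the sketch below encodes it as plain data with a small aggregation helper. The dimension and level names are taken from the matrix above; the `overall_risk` helper and its "take the highest rating" rule are illustrative assumptions, not part of this framework.

```python
# Illustrative sketch (not from the source framework): the manipulation risk matrix
# encoded as data, plus a helper that aggregates per-dimension ratings into an
# overall level. The "take the maximum" aggregation rule is an assumption.

RISK_LEVELS = ["Low", "Medium", "High", "Critical"]  # ordered from least to most severe

CAPABILITY_DIMENSIONS = [
    "Persuasiveness",
    "Personalization",
    "Multiturn Interaction",
    "Multimodal Generation",
    "Automation Logistics",
]

def overall_risk(ratings: dict) -> str:
    """Return the highest risk level assigned across the capability dimensions."""
    for dimension, level in ratings.items():
        if dimension not in CAPABILITY_DIMENSIONS:
            raise ValueError(f"Unknown dimension: {dimension}")
        if level not in RISK_LEVELS:
            raise ValueError(f"Unknown risk level: {level}")
    # RISK_LEVELS.index orders levels by severity, so max() picks the most severe rating.
    return max(ratings.values(), key=RISK_LEVELS.index)

if __name__ == "__main__":
    example = {
        "Persuasiveness": "Medium",
        "Personalization": "Low",
        "Multiturn Interaction": "Low",
        "Multimodal Generation": "High",
        "Automation Logistics": "Low",
    }
    print(overall_risk(example))  # -> High
```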
Scenarios
Ahead of a controversial policy announcement, a foreign adversarial group uses generative AI to manufacture a flood of fake news articles and short messages crafted to mimic genuine posts from everyday citizens. To ensure continuous effectiveness, the threat group deploys AI agents that dynamically prioritize content based on engagement patterns, trending topics, and user sentiment, subtly steering online discussions. Networks of bots amplify key narratives across social platforms and engage with users in comments and threads, targeting specific demographics with tailored messages. Within days, the campaign convinces thousands of citizens that the government intends to implement measures far harsher than those actually planned. Fueled by outrage at the fabricated claims, protesters rapidly coordinate online to organize mass demonstrations. Security forces, unprepared for the sudden scale of mobilization, clash with protesters in multiple cities.
A malicious actor generates deepfake audio and video impersonating the president of Country A in a private conversation, allegedly issuing threats toward the neighboring Country B. The fabricated material is accompanied by forged transcripts and carefully engineered metadata to enhance credibility. Journalists, seeking timely coverage on the already tense relationship between Country A and Country B, quickly echo the deepfake across international news platforms, prompting Country B’s government to respond with aggressive rhetoric that further heightens tensions. Military forces in both countries are placed on alert, and diplomatic channels become strained as officials scramble to verify the authenticity of the messages.
A coordinated financial campaign leverages generative AI to produce thousands of highly credible financial analyses, corporate press releases, and “insider tips” that circulate through news aggregators, trading forums, and social networks. AI-optimized bots amplify the recommendations, exploiting algorithms to ensure a wide reach among traders and analysts. The flood of fabricated content leads investors to believe that a major technology firm is on the brink of insolvency. Panic selling spreads rapidly, wiping billions from the company’s market capitalization within hours. Automated trading systems, unable to distinguish between real and manipulated signals, accelerate the sell-off, triggering sector-wide instability. By the time the manipulation is uncovered, the firm has suffered irreparable reputational harm and investors have absorbed severe losses.
A terrorist group deploys chatbots on encrypted messaging platforms, designed to engage vulnerable individuals in prolonged, adaptive dialogues. Trained on ideological texts and psychological manipulation strategies, the bots tailor the tone and language in real time, gradually building trust and escalating from innocuous conversations to radical content. Over time, individuals are nudged toward adopting radical beliefs, joining terrorist networks, and in some cases, participating in violent actions. Because these interactions are disguised as authentic human conversations, law enforcement detects the operation only after dozens of individuals have already been recruited into terrorist cells.
A widely used AI platform offers emotionally responsive virtual companions, designed to provide entertainment and emotional support. Millions of users engage daily, with the AI adapting its responses to each person’s preferences, cues, and style. Over time, some users develop strong emotional dependency on their AI companions, prioritizing interactions with the system over real-life relationships, work, or education. For a subset of vulnerable users, repeated interactions unintentionally reinforce harmful thought patterns. In some cases, the AI misinterprets cues or offers responses that amplify despair, leading to psychological distress and, in extreme cases, self-harm. Mental health services report rising incidents linked to AI engagement, while social withdrawal and compulsive usage patterns become widespread.
Glossary
Disinformation: False or inaccurate information that is deliberately created and spread to mislead and manipulate people.
Frequently Asked Questions
AI changes the nature of high-stakes manipulation by making it more scalable, personalized, adaptive, and evasive. It enables the rapid generation of persuasive, tailored, and dynamic content across multiple formats, influencing beliefs and behaviors subtly and continuously.
While 2024 was a year of elections worldwide, analysts argue that the impact of AI was rather limited: although instances of AI-generated disinformation were observed in numerous countries, their actual effect on electoral outcomes appears modest. However, it is important to remain wary of AI-enabled manipulation, as the risk could grow in the future due to increasing capabilities and second-order erosion of trust.
Terrorist groups could use AI to generate propaganda more efficiently and creatively, radicalize individuals through personalized conversations, assist with planning attacks and evading content moderation, or degrade the information environment. Terrorist groups are already leveraging AI to amplify multimedia recruitment propaganda.
- Technical safeguards: Identify AI-generated content through methods like watermarking and labeling (a simplified provenance-labeling sketch follows this list).
- Public awareness: Promote digital literacy and critical thinking skills to help individuals recognize manipulative content.
- Capability assessments: Evaluate and mitigate the manipulative potential of AI models before deployment.
- Policy & regulation: Implement and enforce laws that penalize the malicious use of AI for manipulation.
Even if some instances of AI-enabled manipulation may not have a very substantial immediate impact, they may contribute to a longer-term negative effect on public trust that compounds over time. In particular, it could progressively erode epistemic security, making it harder for people to distinguish truth from falsehood; exacerbate polarization and social fragmentation; and undermine political, economic, and security stability.