Prompt Injection
Definition
Prompt injection is the class of attacks against large language model systems in which adversarial content is inserted into the model’s context — through user input, retrieved documents, web content, or tool outputs — to manipulate the model into taking actions, revealing information, or generating outputs contrary to the deploying organization’s intent. Direct prompt injection occurs when an attacker controls the user input. Indirect prompt injection occurs when the attacker controls content the model retrieves (a poisoned web page, a malicious email, a document a user uploads). Prompt injection is recognized as a foundational security risk for LLM-integrated systems and is included in security guidance frameworks including the OWASP Top 10 for LLM Applications.
Why it matters for communications
Prompt injection has emerged as one of the highest-stakes enterprise AI security topics — and one of the most frequently misunderstood in public communications. Communications around AI deployment, AI security posture, and AI breach incidents require accurate vocabulary covering the prompt injection attack surface. Crisis communications around prompt-injection-related incidents has become a recurring category in cybersecurity adjacent AI Communications.
Related terms Jailbreak · Red Teaming · Tool Use · Retrieval-Augmented Generation · LLM security
Related entities OWASP · NIST · Anthropic · OpenAI · Google · enterprise security vendors · academic AI security research community
Primary sources OWASP Top 10 for LLM Applications · Anthropic, OpenAI, and Google DeepMind security publications · NIST AI RMF security guidance · academic literature on prompt injection.
