Chapter 2: Vulnerabilities and Attacks on LLMs
Is this Chapter for You?
Every capability you explored in Chapter 1 – from prompt engineering to RAG pipelines to agentic AI workflows – has a corresponding attack surface. Understanding those attack surfaces isn’t optional for technical professionals building with or deploying AI systems. It’s how you protect your users, your data, and your organization.
This chapter is designed for technical professionals who need to understand AI attack vectors well enough to explain them, demonstrate them, and ultimately defend against them.
-
Are you a developer, architect, or security professional responsible for deploying or securing AI-powered applications?
-
Do you need to explain AI security risks to stakeholders, customers, or leadership in concrete terms – not just abstract warnings?
-
Do you want to understand how prompt injection, data poisoning, model theft, and agentic attack vectors actually work in practice?
If so, this chapter will give you that fluency.
Responsible Use
The techniques in this chapter are presented for educational purposes – to help you understand, identify, and communicate about AI security risks in professional contexts. The goal is attack fluency for defense, not exploitation. Use this knowledge to protect systems and educate stakeholders. Always obtain explicit authorization before testing any techniques against systems you don’t own.
How is this Chapter Different?
TL;DR
The primary goal of this chapter is to build your attack fluency – understanding vulnerabilities deeply enough to explain them to a customer, trace an attack flow on a whiteboard, and recognize when a system is exposed. We go beyond awareness to demonstration-ready knowledge.
Most AI security content either stays at a surface-level overview or dives into academic research papers. This chapter occupies the practical middle ground: you’ll learn the attack taxonomy, see sanitized examples of real techniques, walk through Mermaid diagrams of attack flows, and study named case studies with companies, dates, and outcomes.
By the end, you’ll be able to have a credible, detailed conversation about AI security threats with anyone from a junior developer to a CISO.
What You’ll Learn in This Chapter
By the end of this chapter, you will be able to:
- Map the AI attack surface: Identify where attacks target each stage of the AI lifecycle – from training through deployment to inference
- Explain the OWASP LLM Top 10 (2025): Describe all 10 vulnerability categories with real-world examples and demonstrate why each matters
- Demonstrate prompt-level attacks: Walk through direct injection, indirect injection, system prompt leaking, and jailbreaking with sanitized examples
- Analyze data and training attacks: Distinguish between data poisoning, RAG poisoning, backdoor attacks, and supply chain risks
- Describe model and infrastructure attacks: Explain serialization exploits, adversarial inputs, model theft, and denial of service vulnerabilities
- Assess agentic attack vectors: Identify how tool use, multi-agent systems, and autonomous decision-making create novel attack surfaces
- Cite real incidents: Reference specific companies, dates, and outcomes for each attack category when discussing threats with customers
Chapter Topics: Understanding AI Vulnerabilities and Attacks
Here’s what we’ll cover in this chapter:
-
The AI Attack Surface – The complete attack taxonomy: OWASP LLM Top 10 (2025), threat actors, and why every AI capability from Chapter 1 has a corresponding vulnerability
-
Prompt-Level Attacks – Direct injection, indirect injection, system prompt leaking, and jailbreaking – with sanitized examples you can use in customer conversations
-
Data and Training Attacks – Data poisoning, RAG poisoning, backdoor attacks, and supply chain risks – how attackers compromise models before they ever reach production
-
Model and Infrastructure Attacks – Serialization exploits, adversarial inputs, model theft, and denial of service – the threats that come with self-hosted and on-premises deployments
-
Agentic Attack Vectors – How tool use, multi-agent orchestration, and autonomous action create entirely new attack surfaces – mapped to the OWASP Agentic AI Top 10 (2026)
-
Output and Trust Exploitation – Sensitive information disclosure, misinformation generation, and how improper output handling turns AI into an attack vector against downstream systems
-
Small Language Model Threats – Why SLMs on edge devices, phones, and IoT create unique security challenges – and why smaller doesn’t mean safer
Learning by Doing
Each section in this chapter includes:
- Attack flow diagrams: Mermaid visualizations showing how each technique works step by step
- Sanitized examples: Real-world-style prompt examples and attack scenarios you can study safely
- Named case studies: Specific incidents with companies, dates, and outcomes – no “a major tech company” generics
- Interactive quizzes: Test your understanding of attack concepts and OWASP mappings
These elements are designed to build demonstration-ready knowledge you can use in professional contexts.
Learning Progression
This chapter builds on the AI foundations from Chapter 1. Once you understand the attack landscape, continue to Chapter 3: Protecting LLMs from Attacks to learn how to defend against every attack covered here.
Ready to understand what your AI systems are really up against?