7. Agentic AI

Introduction

Throughout this chapter, we’ve explored the foundations of AI systems – their core architectures, deployment strategies, technical underpinnings, prompt-engineering practices, and inference techniques. Now we turn to the most significant development in the 2025/2026 AI landscape: agentic AI is here, and it’s in production.

This isn’t speculation about a future technology. Software engineers use Claude Code and Cursor daily to write production code. Businesses run autonomous workflows through n8n and CrewAI. OpenClaw demonstrates open-source agentic capabilities. The shift from passive tools to proactive agents has already happened – and it fundamentally changes both what AI can do and what can go wrong.

What will I get out of this?

By the end of this section, you will be able to:

  1. Describe the current state of agentic AI as production technology, not future speculation.
  2. Define the characteristics that make an AI system “agentic”, including goal-setting, tool use, decision-making, and autonomous action.
  3. Identify production agentic tools across three categories: developer tools, business automation, and open-source frameworks.
  4. Explain the agent loop (plan, act, observe, decide) and identify where trust boundaries are crossed.
  5. Analyze the security implications of agentic AI – how each capability creates a corresponding attack surface.
  6. Distinguish between simple automation, tool-augmented LLMs, and autonomous agents on the spectrum of AI agency.
  7. Recognize why agentic AI amplifies security risks compared to traditional generative AI.

Agentic AI Is Production Reality

Generative AI evolved in stages, and we are firmly in the agentic stage:

  • Stage 1: Base Models – LLMs that generate text from prompts. Useful but limited to training data. No ability to verify facts or take actions.

  • Stage 2: RAG Systems – LLMs augmented with external knowledge retrieval. Can reference current information. Still reactive – only responds to prompts.

  • Stage 3: Agentic AI (Now) – AI systems that pursue goals, make decisions, use tools, and take autonomous action. Can write code, execute commands, browse the web, manage files, and interact with external services – all without human intervention at each step.

Why “Now” Matters

In 2023, agentic AI was a research concept. In 2025/2026, it’s daily tooling. Claude Code ships as a production CLI tool. Cursor is a standard development environment. Devin handles entire engineering tasks. n8n workflows orchestrate multi-step business processes with AI agents making decisions at each node. The transition from “interesting research” to “how we work” happened faster than most predicted.


The Agent Loop

At the core of every agentic system is a loop: the agent plans what to do, selects and uses tools, observes the results, and decides whether to continue or return a result. This loop is what gives agents their power – and their risk.

graph TB
    subgraph "Trust Boundary: Agent Core"
        A["1. Plan<br/>(Decompose goal into steps)"] --> B["2. Select Tool<br/>(Choose appropriate action)"]
        B --> C["3. Execute Tool<br/>(Take action in the world)"]
        C --> D["4. Observe Result<br/>(Evaluate what happened)"]
        D --> E{"5. Goal Met?"}
        E -->|No| A
    end

    subgraph "Trust Boundary: External World"
        F[API Endpoints]
        G[File System]
        H[Database]
        I[Web Browser]
        J[Code Execution]
    end

    C -.->|"crosses trust boundary"| F
    C -.->|"crosses trust boundary"| G
    C -.->|"crosses trust boundary"| H
    C -.->|"crosses trust boundary"| I
    C -.->|"crosses trust boundary"| J

    E -->|Yes| K[Return Result]

    style A fill:#2d5016,color:#fff
    style B fill:#2d5016,color:#fff
    style C fill:#8b0000,color:#fff
    style D fill:#2d5016,color:#fff
    style F fill:#4a4a4a,color:#fff
    style G fill:#4a4a4a,color:#fff
    style H fill:#4a4a4a,color:#fff
    style I fill:#4a4a4a,color:#fff
    style J fill:#4a4a4a,color:#fff

Notice the red “Execute Tool” step – this is where the agent crosses from its internal reasoning space into the external world. Every tool execution is a trust boundary crossing, and every trust boundary crossing is a potential attack surface.
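The loop above can be sketched in a few lines of code. This is a minimal illustration, not a real agent API: the planner is a trivial stand-in for what would be an LLM call, and the tool registry and the `done` convention are illustrative assumptions.

```python
# A minimal sketch of the agent loop: plan, select tool, execute, observe, decide.
# The planner is a stand-in for an LLM call; the tool registry is illustrative.
from dataclasses import dataclass, field

@dataclass
class Agent:
    tools: dict                     # name -> callable: the permitted actions
    history: list = field(default_factory=list)

    def plan(self, goal: str):
        # Stand-in for LLM planning: pick the first tool not yet tried.
        tried = {name for name, _ in self.history}
        for name in self.tools:
            if name not in tried:
                return name
        return None                 # nothing left to try

    def run(self, goal: str, max_steps: int = 10):
        for _ in range(max_steps):  # always bound the loop
            tool = self.plan(goal)               # 1-2: plan and select tool
            if tool is None:
                break
            # 3: execute -- this call crosses the trust boundary into the
            # external world (file system, API, browser, ...)
            result = self.tools[tool](goal)
            self.history.append((tool, result))  # 4: observe the result
            if str(result).startswith("done"):   # 5: decide: goal met?
                return result
        return None
```

Bounding `max_steps` matters in practice: a manipulated observation can otherwise keep the loop running – and spending tokens, API calls, and tool actions – indefinitely.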

Security Implication

In traditional generative AI, the worst case is a bad text output. In agentic AI, the worst case is unauthorized actions – deleting files, executing malicious code, exfiltrating data through API calls, or making unauthorized purchases. The agent loop transforms information risk into action risk. We’ll explore how to attack these systems in Chapter 2.


The Spectrum of Agency

Not every AI-powered system is truly “agentic.” Agency exists on a spectrum, and understanding where a system falls on that spectrum determines its capabilities and its risk profile.

| Level | Description | Example | Attack Surface |
|---|---|---|---|
| Simple Automation | LLM used as a classifier or text processor in a fixed pipeline | Spam filter using LLM for classification | Low – limited to output manipulation |
| Tool-Augmented LLM | LLM can call predefined tools but follows a scripted workflow | Chatbot that can look up order status via API | Medium – tool misuse, data exposure |
| Semi-Autonomous Agent | LLM decides which tools to use and in what order, with human approval gates | Code review assistant that suggests changes and applies them after approval | Medium-High – decision manipulation, approval bypass |
| Autonomous Agent | LLM independently plans, executes, and iterates without human intervention | Coding agent that implements features end-to-end | High – full action space available to compromised agent |

Script vs. Agent: The Critical Distinction

Not an Agent: A Python script that prompts an LLM with “Is this email a phishing attempt? Answer TRUE or FALSE” and routes the email accordingly. This is automation with an LLM component – the script has no agency beyond executing predefined steps.

True Agent: A system that receives “investigate this security alert,” then autonomously decides to check logs, query the SIEM, correlate with threat intelligence, draft a report, and escalate if severity warrants it – choosing its own path based on what it discovers.
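The distinction shows up clearly in code. Here is the "not an agent" phishing router sketched in Python; `classify_with_llm` is a hypothetical stand-in for the real LLM API call.

```python
# The "not an agent" case as code: the LLM is only a classifier inside a
# fixed pipeline. classify_with_llm is a hypothetical stand-in for an API call.
def classify_with_llm(email_text: str) -> bool:
    # Stand-in for prompting an LLM with:
    # "Is this email a phishing attempt? Answer TRUE or FALSE"
    return "wire transfer" in email_text.lower()

def route_email(email_text: str) -> str:
    # The only decision is this predefined branch: no planning,
    # no tool selection, no iteration -- automation, not agency.
    return "quarantine" if classify_with_llm(email_text) else "inbox"
```

However sophisticated the classifier, the script's action space is fixed at two outcomes. The true agent in the alert-investigation example chooses its own sequence of actions, which is exactly what enlarges its attack surface.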


Agentic Tools in Production

Agentic AI tools are already part of the daily workflow for millions of professionals. Here’s the current landscape across three categories:

Developer Tools: AI That Writes Code

These tools represent the most mature category of agentic AI, already deeply embedded in software development workflows.

Claude Code (Anthropic)

A CLI-based agentic coding assistant that can navigate codebases, write and edit files, run terminal commands, and manage git operations. It operates in a terminal environment with direct access to the file system and development tools.

  • Agentic capability: Reads code, plans changes across multiple files, executes commands, verifies results
  • Attack surface: File system access, command execution, potential to introduce vulnerabilities in code it writes

Cursor

An AI-native IDE (based on VS Code) that integrates LLMs directly into the development workflow. It can understand entire codebases, suggest multi-file changes, and execute terminal commands.

  • Agentic capability: Codebase-wide understanding, multi-file edits, integrated terminal operations
  • Attack surface: Code modification at scale, dependency changes, configuration alterations

Devin (Cognition AI)

An autonomous software engineering agent that can handle entire development tasks – from reading issue tickets to writing code, creating tests, and submitting pull requests.

  • Agentic capability: End-to-end software engineering tasks with minimal human intervention
  • Attack surface: Full development environment access, can introduce subtle bugs or backdoors, interacts with external services

Security Seed: Developer Tool Risks

Every agentic developer tool has write access to your codebase and often your terminal. A compromised or manipulated agent could introduce security vulnerabilities that pass code review – because the reviewer might also be an AI agent. Chapter 2 explores how indirect prompt injection can weaponize these tools.

Business Automation: AI That Runs Workflows

These tools bring agentic capabilities to business processes, enabling non-developers to build sophisticated AI-powered workflows.

n8n

An open-source workflow automation platform that combines visual workflow building with AI agent capabilities. Agents can make decisions at each workflow node, routing data, calling APIs, and processing information based on LLM reasoning.

  • Agentic capability: Multi-step workflows with AI decision points, integration with 400+ services
  • Attack surface: Access to connected services, API credentials, data flowing through workflows

AutoGPT / Auto-GPT

One of the first autonomous agent frameworks, AutoGPT can decompose goals into tasks, execute them using various tools (web browsing, code execution, file management), and iterate based on results.

  • Agentic capability: Goal decomposition, autonomous planning and execution, web browsing
  • Attack surface: Internet access, file system operations, potential for uncontrolled resource consumption

CrewAI

A framework for orchestrating multiple AI agents that collaborate on complex tasks. Each agent has a defined role, tools, and objectives. The agents communicate and delegate to each other.

  • Agentic capability: Multi-agent collaboration, role-based specialization, tool orchestration
  • Attack surface: Agent-to-agent communication channels, permission inheritance between agents, amplified scope through collaboration

Security Seed: Automation Risks

Business automation agents often have access to API keys, customer data, and operational systems. An agent with access to your CRM, email system, and payment processor creates a blast radius that extends far beyond text generation. Chapter 2 examines how workflow injection attacks can cascade through connected systems.

Open-Source Frameworks and Emerging Tools

The open-source ecosystem is rapidly producing agentic frameworks that make it easier to build custom agents.

OpenClaw

An open-source agentic AI framework that demonstrates how agents can be built with composable tool-use patterns. It provides a reference implementation for agent architectures that prioritize transparency and extensibility.

  • Agentic capability: Modular tool-use framework, transparent decision logging, extensible architecture
  • Attack surface: Depends on configured tools and permissions – extensibility means the attack surface grows with each added capability

LangChain / LangGraph

The most popular framework for building LLM-powered applications, with specific support for agentic workflows through LangGraph’s state machine approach.

  • Agentic capability: Stateful agent workflows, tool integration, memory management
  • Attack surface: Tool execution permissions, state manipulation, memory poisoning

OpenAI Assistants API

OpenAI’s managed platform for building agents with code interpretation, file search, and function calling capabilities.

  • Agentic capability: Code execution in sandbox, file analysis, custom function calling
  • Attack surface: Function call manipulation, file-based attacks, sandbox escape potential

Microsoft AutoGen

A framework for building multi-agent systems where agents can converse with each other, execute code, and collaborate on tasks.

  • Agentic capability: Multi-agent conversations, code execution, human-in-the-loop patterns
  • Attack surface: Inter-agent trust assumptions, code execution, escalation of privileges

Security Seed: Open-Source Agent Risks

Open-source agent frameworks give you full control but also full responsibility. There are no built-in guardrails unless you add them. Default configurations often prioritize capability over security – a pattern that Chapter 2 will show attackers know how to exploit.


Agentic Workflows: How Agents Work Together

An agentic workflow is a structured series of steps dynamically executed by one or more AI agents. What makes a workflow “agentic” is that AI agents guide the progression rather than following a predetermined path.

Distinguishing Workflow Types

| Workflow Type | Characteristics | Example |
|---|---|---|
| Traditional | Deterministic, follows predefined sequence | A form submission that always follows the same steps |
| AI-Enhanced | Uses AI in predetermined ways | Text summarization: input, LLM call, output |
| Agentic | Dynamic, adaptable, agent-guided | Research process where agents determine paths based on findings |

The Three Pillars of Agentic Workflows

  1. Planning: The agent breaks down complex tasks into sub-tasks and determines execution order. Attack surface: plan manipulation through crafted inputs can steer the agent toward malicious actions.

  2. Tool Utilization: Agents use predefined tools with specific permissions to carry out tasks. Attack surface: tool misuse, permission escalation, and indirect prompt injection through tool outputs.

  3. Reflection and Iteration: Agents assess results, adjust plans, and loop until satisfied. Attack surface: poisoned observations can cause agents to take increasingly harmful corrective actions.
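Pillar 2 is where much of the practical hardening happens. A minimal sketch of deny-by-default tool execution follows; the registry, tool names, and permission model are all illustrative assumptions, not a real framework's API.

```python
# Deny-by-default tool execution: the agent can only invoke tools that were
# explicitly registered. Tool names and the registry shape are illustrative.
ALLOWED_TOOLS = {
    "read_file": lambda path: f"contents of {path}",
    "web_search": lambda query: f"results for {query}",
}

def execute_tool(name: str, arg: str) -> str:
    if name not in ALLOWED_TOOLS:
        # A manipulated plan cannot reach tools that were never granted.
        raise PermissionError(f"tool {name!r} is not permitted")
    return ALLOWED_TOOLS[name](arg)
```

Note that even with an allowlist, pillar 3 remains exposed: the outputs of permitted tools feed back into the loop as observations, which is why indirect prompt injection through tool results still works.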

Example: Multi-Agent Workflow

Consider a content creation workflow:

  1. Manager Agent: Receives the brief, decomposes the task, assigns to specialists
  2. Research Agent: Searches for relevant information from multiple sources
  3. Writer Agent: Creates draft based on research and brief
  4. Editor Agent: Reviews for accuracy and clarity
  5. Quality Control Agent: Final check against criteria

Each agent makes independent decisions. The collective workflow adapts based on intermediate results. And each inter-agent communication is a potential injection point – Chapter 2 explores how compromising one agent can cascade through the entire workflow.
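The five-stage workflow above can be sketched as a hand-off pipeline. Each function below is a hypothetical stand-in for an LLM-backed agent with its own role prompt and tools; the point is the shape of the hand-offs, not the implementation.

```python
# Content-creation workflow as sequential hand-offs between agents.
# Each function stands in for an LLM-backed agent with a role prompt.
def manager(brief):    return {"brief": brief}
def researcher(state): return {**state, "notes": f"notes on {state['brief']}"}
def writer(state):     return {**state, "draft": f"draft from {state['notes']}"}
def editor(state):     return {**state, "draft": state["draft"] + " [edited]"}
def quality_control(state):
    # Final check against criteria; reject drafts that skipped editing.
    return state["draft"] if state["draft"].endswith("[edited]") else None

def run_workflow(brief: str):
    state = manager(brief)
    # Every hand-off below is inter-agent communication -- and a potential
    # injection point if any upstream output is attacker-controlled.
    for agent in (researcher, writer, editor):
        state = agent(state)
    return quality_control(state)
```

Notice that the researcher's output flows unfiltered into every downstream agent: a malicious string planted in a research source reaches the writer, the editor, and quality control in turn.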


Security Implications: Every Capability Is an Attack Surface

This is the critical insight that bridges Chapter 1 to Chapter 2: every agentic capability creates a corresponding attack surface. The same features that make agents powerful make them dangerous when compromised.

| Capability | What It Enables | What Can Go Wrong |
|---|---|---|
| File system access | Read/write code, data, configs | Data exfiltration, config tampering, malware deployment |
| Code execution | Run scripts, install packages, build software | Remote code execution, supply chain attacks |
| API integration | Connect to services, databases, external tools | Credential theft, unauthorized API calls, data leaks |
| Web browsing | Research, data gathering, verification | Indirect prompt injection via malicious web content |
| Multi-agent collaboration | Complex task decomposition, parallel work | Trust chain exploitation, cascading compromises |
| Autonomous decision-making | Handle tasks without human approval | Unauthorized actions, business logic manipulation |

Why Agentic AI Is Riskier Than Traditional AI

Unlike traditional generative AI that might leak data through its outputs, agentic systems can make operational decisions and execute actions. A compromised chatbot might reveal sensitive information. A compromised agent might delete databases, exfiltrate credentials, or deploy malicious code. The shift from “information risk” to “action risk” is the defining security challenge of the agentic era.


The Business Case

The adoption of agentic AI is driven by compelling business outcomes:

  • Operational Efficiency: Companies using agentic AI report 20-30% productivity gains in affected workflows
  • ROI Timeline: Initial investment typically recovered within 12-18 months
  • Market Growth: PwC projects agentic AI could contribute $2.6-$4.4 trillion annually to global GDP by 2030
  • Adoption Forecast: Gartner predicts 33% of enterprise software will incorporate agentic capabilities by 2028
  • Developer Adoption: Surveys indicate over 60% of professional developers use AI coding assistants daily in 2025

These numbers explain why organizations are adopting agentic tools despite the security implications – the productivity gains are too significant to ignore. The challenge, which we’ll address in Chapters 2 and 3, is how to capture these benefits while managing the expanded attack surface.


Looking Ahead: From Understanding to Security

This section completes Chapter 1 – your foundation in how AI and LLM systems work. You now understand:

  • How LLMs generate text, reason through problems, and process multiple modalities
  • The landscape of providers, models, and deployment options
  • How to communicate effectively with AI through prompt engineering
  • How inference works in practice, from API calls through RAG pipelines
  • How agentic AI systems plan, act, and iterate autonomously

In Chapter 2, we shift from understanding to attacking. Every concept you’ve learned here has a corresponding vulnerability:

  • Prompt engineering becomes prompt injection
  • RAG pipelines become data poisoning vectors
  • The agent loop becomes an attack surface for indirect manipulation
  • Tool use becomes a path to unauthorized system access
  • Multi-agent workflows become cascading compromise chains

The foundation you’ve built in this chapter is essential – you can’t effectively attack or defend systems you don’t understand.

Key Takeaways
  • Agentic AI systems pursue goals, make decisions, use tools, and take autonomous action – moving beyond passive text generation into real-world operations
  • The agent loop (plan, select tool, execute, observe, decide) is the core architecture, with each tool execution crossing a trust boundary
  • Agency exists on a spectrum from simple automation through tool-augmented LLMs to fully autonomous agents, with risk scaling accordingly
  • Every agentic capability (file access, code execution, API integration, web browsing) creates a corresponding attack surface
  • The shift from “information risk” to “action risk” is the defining security challenge of the agentic era

Test Your Knowledge

Ready to test your understanding of agentic AI? Head to the quiz to check your knowledge.


Chapter 1 Complete!

Congratulations on completing Chapter 1! You now have a strong foundation in AI and LLM fundamentals – from basic principles through to production agentic systems. In Chapter 2, you’ll explore how these systems can be attacked, manipulated, and exploited. In Chapter 3, you’ll learn how to defend them.