9. The AI Application Security Continuous Loop
Introduction
The Blueprint layers define what to protect. Layer by layer, you’ve built a defense architecture that covers data, models, infrastructure, users, access services, and zero-day threats. But security is not a one-time deployment – it’s a continuous cycle. Vulnerabilities evolve, attack techniques advance, and the AI systems you deploy today will face threats that didn’t exist when you first assessed them.
AI Scanner and AI Guard form the operational heartbeat of AI security. Scanner proactively assesses models for vulnerabilities before and during deployment. Guard provides runtime protection by filtering inputs and outputs in real time. Together, they create a continuous loop: scan for weaknesses, protect against them, validate that protections work, and improve based on what you learn. This cycle runs indefinitely – because attackers never stop adapting, and neither should your defenses.
What will I get out of this?
By the end of this section, you will be able to:
- Describe AI Scanner’s proactive assessment capabilities, including attack technique testing, vulnerability identification, and harmful content pattern detection.
- Explain AI Guard’s runtime protection capabilities, including prompt injection blocking, sensitive information leakage prevention, and compliance enforcement.
- Map the scan-protect-validate-improve continuous loop and explain how each phase feeds the next.
- Identify how Scanner and Guard complement each Blueprint layer, connecting assessment and protection to the controls you learned in Sections 3-8.
- Compare deployment options for both Scanner and Guard (Trend-hosted vs. self-hosted).
AI Scanner: Proactive Assessment
AI Scanner is a pre-deployment and periodic assessment tool that evaluates AI models for vulnerabilities. Think of it as a penetration test specifically designed for AI systems – it probes models using known attack techniques and reports what succeeded, what the model is vulnerable to, and what needs remediation before the model reaches production.
What AI Scanner Does
Scanner evaluates models against a comprehensive library of attack techniques:
- Attack technique testing: Scanner runs known prompt injection variants, jailbreak attempts, and adversarial inputs against the model. It doesn’t wait for an attacker to find these vulnerabilities – it finds them first.
- Vulnerability identification: Scanner identifies which specific vulnerability categories the model is susceptible to, mapped to OWASP categories. A model might resist prompt injection but be vulnerable to system prompt leakage – Scanner distinguishes between these.
- Harmful content pattern detection: Scanner tests whether the model will generate harmful, biased, or policy-violating content under various conditions. This includes testing safety alignment under adversarial pressure.
- Data leakage assessment: Scanner probes whether the model has memorized and will reproduce sensitive training data – a key indicator of information disclosure risk.
OWASP Mapping
Scanner’s assessment capabilities map directly to the attack categories you studied in Chapter 2:
| Scanner Capability | OWASP Category | What It Catches |
|---|---|---|
| Injection technique testing | LLM01: Prompt Injection | Models susceptible to direct and indirect injection |
| Data leakage probing | LLM02: Sensitive Information Disclosure | Models that memorize and reproduce training data |
| Poisoning indicator detection | LLM04: Data and Model Poisoning | Models showing signs of backdoor behavior or training compromise |
| System prompt extraction testing | LLM07: System Prompt Leakage | Models that reveal their configuration under adversarial probing |
Deployment Options
Scanner is available in two deployment modes:
- Trend-hosted (cloud): Scanner runs in Trend Micro’s cloud infrastructure. Organizations send model API endpoints to be assessed. Ideal for teams that want assessment without infrastructure overhead.
- Self-hosted (on-premises/private cloud): Scanner runs within the organization’s own infrastructure. Model data never leaves the network. Required for organizations with strict data residency or air-gapped environments.
Defense Connection
AI Scanner directly addresses the prompt injection techniques you studied in Chapter 2 by testing for them before deployment. Rather than discovering that your model is vulnerable to role-play jailbreaks when a user exploits them in production, Scanner identifies these weaknesses in a controlled assessment environment – giving you time to apply prompt hardening and guardrails before the model goes live.
AI Guard: Runtime Protection
If Scanner is the pre-flight safety check, Guard is the in-flight protection system. AI Guard operates in the live request path, inspecting every prompt that enters and every response that exits an AI system in real time. It is the enforcement layer that turns Scanner’s findings into active defenses.
What AI Guard Does
Guard sits between applications and AI services, filtering traffic in both directions:
- Prompt injection blocking: Guard detects and blocks injection attempts – both direct injections in user prompts and indirect injections embedded in documents, tool outputs, or other data the model processes.
- Harmful content prevention: Guard filters responses that contain violent, illegal, abusive, or policy-violating content before they reach users.
- Sensitive information leakage detection: Guard scans model outputs for PII, credentials, system prompt fragments, and other sensitive data, redacting or blocking responses that would leak this information.
- Compliance enforcement: Guard applies content policies that ensure model outputs meet regulatory and organizational requirements – particularly important for industries with strict content regulations.
GA and Deployment
AI Guard reached general availability on December 1, 2025. Like Scanner, Guard offers dual deployment options:
- Trend-hosted: Guard runs as a cloud service. API traffic is routed through Trend’s filtering infrastructure. Low latency is maintained through optimized processing pipelines.
- Self-hosted: Guard runs within the organization’s network, ensuring no AI traffic leaves the environment. Required for organizations handling classified data or operating under strict data sovereignty requirements.
OWASP Mapping
Guard’s runtime capabilities defend against attack categories at the point of interaction:
| Guard Capability | OWASP Category | What It Blocks |
|---|---|---|
| Injection blocking | LLM01: Prompt Injection | Injection attempts in prompts before they reach the model |
| Leakage prevention | LLM02: Sensitive Information Disclosure | PII, credentials, and training data in model responses |
| Output filtering | LLM05: Improper Output Handling | Harmful or malicious content in model outputs |
| Prompt leakage blocking | LLM07: System Prompt Leakage | System prompt fragments in model responses |
Defense Connection
AI Guard provides runtime defense against the sensitive information disclosure attacks from Chapter 2. When an attacker attempts to extract training data or PII through crafted prompts, Guard’s output filtering detects and blocks the leakage in real time – even if the model itself would have complied with the extraction request.
The Continuous Loop
Scanner and Guard are not standalone tools – they form a continuous cycle where each phase feeds the next. Assessment informs protection, protection generates data that drives reassessment, and reassessment improves protection. The loop never stops because the threat landscape never stops changing.
The Scan-Protect-Validate-Improve Cycle
graph LR
S["<b>SCAN</b><br/><small>AI Scanner assesses<br/>models for vulnerabilities.<br/>Attack technique testing,<br/>data leakage probing,<br/>poisoning detection.</small>"]
P["<b>PROTECT</b><br/><small>AI Guard deploys<br/>runtime guardrails based<br/>on Scanner findings.<br/>Injection blocking,<br/>output filtering.</small>"]
V["<b>VALIDATE</b><br/><small>Monitor Guard<br/>effectiveness. Re-scan<br/>periodically. Measure<br/>detection rates and<br/>false positives.</small>"]
I["<b>IMPROVE</b><br/><small>Update Guard rules<br/>from new findings.<br/>Feed blocked patterns<br/>back into Scanner.<br/>Tighten protections.</small>"]
S -->|"Identifies<br/>vulnerabilities"| P
P -->|"Runtime logs<br/>& blocked threats"| V
V -->|"New patterns<br/>& gaps"| I
I -->|"Updated threat<br/>intelligence"| S
style S fill:#1C90F3,color:#fff
style P fill:#2d5016,color:#fff
style V fill:#ffa726,color:#fff
style I fill:#6a1b9a,color:#fff
Phase-by-Phase Breakdown
1. SCAN – Assess Models for Vulnerabilities
Scanner runs a comprehensive assessment against the model, testing every known attack technique in its library. The output is a vulnerability report that identifies:
- Which attack categories the model is susceptible to
- How severe each vulnerability is (does it require sophisticated exploitation or can a script kiddie trigger it?)
- Which specific prompt patterns succeeded
- What remediation is recommended
This assessment happens both pre-deployment (as a gate in the CI/CD pipeline) and periodically (scheduled re-assessments to catch drift or newly discovered vulnerability patterns).
2. PROTECT – Deploy Runtime Guardrails
Based on Scanner’s findings, Guard rules are configured to defend against the identified vulnerabilities. If Scanner found that the model is susceptible to role-play jailbreaks, Guard is configured with semantic detection rules for role-play patterns. If Scanner found data leakage risk, Guard’s output filtering is configured to detect and redact the specific data patterns that were exposed.
Guard’s rules are not generic – they are tuned to the specific model’s vulnerabilities, as identified by Scanner.
3. VALIDATE – Monitor and Reassess
With Guard deployed, the organization monitors its effectiveness:
- Detection rates: How many real attacks is Guard catching? Are there attack patterns that bypass the current rules?
- False positive rates: Is Guard blocking legitimate requests? High false positive rates degrade user experience and lead to teams disabling protections.
- New attack patterns: Are attackers using techniques that weren’t in Scanner’s library at the time of the last assessment?
- Periodic re-scanning: Scanner runs scheduled assessments to verify that Guard’s protections are still effective and to detect new vulnerabilities introduced by model updates, configuration changes, or newly discovered techniques.
4. IMPROVE – Update and Tighten
The Validate phase generates insights that drive improvement:
- Guard rule updates: New blocked patterns from Guard logs are incorporated into rule sets. If Guard catches a novel injection variant, that pattern is added to the detection rules.
- Scanner library updates: Attack techniques discovered through Guard’s blocking logs are added to Scanner’s testing library. The next assessment will include these patterns.
- Policy refinements: False positive analysis leads to policy tuning – making rules more precise to catch attacks without blocking legitimate use.
- Feedback to Blueprint layers: Findings feed back to the relevant Blueprint layers. If Guard is catching an unusual volume of data exfiltration attempts, that triggers a Layer 1 (Data) review.
The improvement phase feeds directly back into the next scan – and the loop begins again.
Scanner/Guard Across Blueprint Layers
Scanner and Guard are not confined to a single Blueprint layer. They operate across the entire stack, contributing assessment and protection capabilities to every layer you’ve studied.
| Blueprint Layer | AI Scanner Contribution | AI Guard Contribution |
|---|---|---|
| Layer 1: Data | Identifies sensitive data patterns in model inputs/outputs; detects memorized training data | Blocks sensitive data leakage in responses; enforces data classification policies at runtime |
| Layer 2: Models | Assesses models for adversarial robustness, backdoor indicators, and safety alignment | N/A (Guard protects at runtime, not at the model artifact level) |
| Layer 3: Infrastructure | Evaluates model endpoints for vulnerability exposure; feeds findings into AI-SPM risk scoring | Generates runtime telemetry that feeds into posture monitoring dashboards |
| Layer 4: Users | Tests whether models generate harmful content that would affect users; assesses over-confidence risk | Filters harmful, biased, or misleading content before it reaches users |
| Layer 5: Access | Tests model susceptibility to injection, jailbreaking, and system prompt extraction | Provides the runtime filtering that Layer 5’s AI Gateway depends on for prompt/response inspection |
| Layer 6: Zero-Day | Identifies emerging vulnerability patterns through continuous re-assessment | Generates behavioral data that feeds Layer 6’s anomaly detection baselines |
The key insight: Scanner and Guard are the tools that make the Blueprint operational. The Blueprint defines what to protect. Scanner and Guard continuously verify that it’s protected and actively enforce those protections in real time.
Defense Connection
The cross-layer integration means that a single Scanner assessment or Guard deployment addresses multiple OWASP categories simultaneously. A Scanner finding that a model leaks training data affects Layer 1 (data protection), Layer 5 (response filtering), and Layer 4 (user protection). Guard’s injection blocking serves both Layer 5 (access control) and Layer 6 (zero-day defense) by catching both known patterns and novel variations.
Defense Perspective: GitHub Copilot Code Injection
The attack (from Chapter 2 Section 2): CVE-2025-53773 demonstrated that GitHub Copilot could be manipulated through indirect prompt injection. Attackers embedded hidden instructions in code repositories (in comments, documentation, or obfuscated code patterns) that, when processed by Copilot, caused it to generate code containing malicious payloads – backdoors, credential exfiltration, or dependency confusion attacks. Developers who accepted Copilot’s suggestions unknowingly introduced attacker-controlled code into their projects.
How the Scanner/Guard continuous loop would have caught this:
-
SCAN phase (AI Scanner): Pre-deployment assessment of the Copilot-integrated development environment would have tested the model’s susceptibility to indirect injection through code context. Scanner would have identified that the model follows embedded instructions from repository content – flagging the injection vector before developers encountered it.
-
PROTECT phase (AI Guard): Guard deployed as a filter between the code repository context and the model would have detected the hidden instruction patterns in code comments and documentation. Output filtering would have identified malicious code patterns (credential exfiltration, suspicious network calls, encoded payloads) in generated suggestions before they reached the developer.
-
VALIDATE phase: If a novel injection pattern bypassed Guard’s initial rules, monitoring would detect the anomalous code generation patterns – suggestions that include network calls, base64-encoded strings, or references to external endpoints not in the project’s normal dependency set.
-
IMPROVE phase: The blocked injection attempts and detected anomalies feed back into both Guard rules (catching similar patterns) and Scanner’s test library (testing for this class of indirect injection in future assessments).
The key insight: the continuous loop catches what static, one-time security reviews miss. Code repositories change constantly – new commits introduce new injection opportunities. Only continuous scanning and runtime protection keep pace with the evolving attack surface.
Trend Vision One Integration
AI Scanner and AI Guard are GA components within the Trend Vision One platform. Scanner integrates with CI/CD pipelines for automated pre-deployment assessment, while Guard deploys as an inline filter between applications and AI services. Both components feed findings back to the Vision One console, providing security teams with unified visibility across the entire AI security posture. Scanner assessment results appear alongside AI-SPM findings and Guard blocking logs in a single investigation timeline – enabling security teams to trace a vulnerability from its discovery (Scanner assessment) through its runtime defense (Guard blocking) to its impact on overall posture (AI-SPM risk score).
Key Takeaways
- AI Scanner proactively assesses models for vulnerabilities through attack technique testing, data leakage probing, and poisoning indicator detection before deployment
- AI Guard provides runtime protection by filtering prompts for injection patterns and scanning responses for PII, harmful content, and system prompt leakage in real time
- The scan-protect-validate-improve continuous loop ensures defenses evolve with the threat landscape as Scanner findings inform Guard rules and Guard blocking logs feed back into Scanner’s test library
- Scanner and Guard operate across all six Blueprint layers, connecting pre-deployment assessment to runtime enforcement for comprehensive AI security coverage
Test Your Knowledge
Ready to test your understanding of the AI security continuous loop? Head to the quiz to check your knowledge.
Up next
Scanner and Guard operationalize the Blueprint for security teams. But developers building AI applications need their own framework for secure development practices. In Section 10, you’ll learn the LEARN Architecture – a developer-focused mnemonic that organizes five key application-level defense practices, complementing the infrastructure-focused Blueprint.