4. Layer 2: Secure Your AI Models
Introduction
AI models are both an organization’s most valuable intellectual property and one of its most vulnerable attack surfaces. A fine-tuned model encodes months of training, proprietary data, and domain expertise into a single artifact – an artifact that can be poisoned, stolen, or weaponized if left unprotected. In Chapter 2, you saw how supply chain attacks inject malicious code through model files, how serialization exploits turn model loading into remote code execution, and how agentic supply chain vulnerabilities compromise the tools that deliver models to production.
Layer 2 of the Security for AI Blueprint addresses these threats by securing the model lifecycle – from the moment a model artifact is built or downloaded through container packaging, vulnerability scanning, integrity verification, and deployment. This layer treats every model as potential executable code until proven safe, and every model source as untrusted until verified.
What will I get out of this?
By the end of this section, you will be able to:
- Explain how container security applies to AI model serving environments and why containerized deployments require AI-specific scanning.
- Describe model integrity verification techniques including model signing, hash verification, and provenance tracking.
- Design a supply chain defense strategy for AI model artifacts, including repository security and dependency auditing.
- Identify pre-deployment vulnerability scanning approaches for AI models, including adversarial robustness testing.
- Map Layer 2 controls to specific OWASP categories and explain how they defend against supply chain and poisoning attacks.
- Apply the Layer 2 Security Checklist to evaluate an organization’s model security posture.
Container Security for AI Workloads
Most production AI models run inside containers – Docker images deployed to Kubernetes clusters, cloud inference endpoints, or edge devices. The container is the delivery mechanism that packages the model, its runtime dependencies, and its serving infrastructure into a deployable unit. If the container is compromised, the model is compromised.
AI containers differ from typical application containers in important ways:
- Large base images: Model serving frameworks (TensorFlow Serving, vLLM, Triton Inference Server) require GPU drivers, CUDA libraries, and ML-specific dependencies that significantly expand the attack surface
- Model artifacts inside the container: The model weights, configuration files, and tokenizers are packaged alongside the runtime – any vulnerability in these artifacts becomes part of the deployed container
- Elevated hardware access: GPU passthrough and shared memory requirements mean AI containers often need broader host access than typical microservices
- Long-running processes: Inference servers run continuously, unlike batch containers that start and stop – giving attackers more time to exploit vulnerabilities
The Container Security Pipeline
```mermaid
graph LR
MA["Model Artifact<br/><small>Weights, config,<br/>tokenizer files</small>"]
CB["Container Build<br/><small>Base image + model +<br/>serving framework</small>"]
VS["Vulnerability Scan<br/><small>Image scanning,<br/>dependency audit,<br/>artifact verification</small>"]
SR["Secure Registry<br/><small>Signed images,<br/>access-controlled<br/>artifact store</small>"]
DP["Deployment<br/><small>Runtime protection,<br/>monitoring, policy<br/>enforcement</small>"]
MA -->|"Integrity<br/>verified"| CB
CB -->|"Build<br/>complete"| VS
VS -->|"Scan<br/>passed"| SR
SR -->|"Authorized<br/>pull"| DP
G1["Gate: Hash<br/>Verification"]
G2["Gate: CVE +<br/>Malware Scan"]
G3["Gate: Image<br/>Signing"]
G4["Gate: Runtime<br/>Policy"]
G1 -.-> MA
G2 -.-> VS
G3 -.-> SR
G4 -.-> DP
style MA fill:#2d5016,color:#fff
style CB fill:#2d5016,color:#fff
style VS fill:#2d5016,color:#fff
style SR fill:#2d5016,color:#fff
style DP fill:#2d5016,color:#fff
style G1 fill:#1C90F3,color:#fff
style G2 fill:#1C90F3,color:#fff
style G3 fill:#1C90F3,color:#fff
style G4 fill:#1C90F3,color:#fff
```
Every stage in this pipeline has a security gate. Model artifacts are hash-verified before inclusion in the build. Built images are scanned for known CVEs and malware. Signed images are stored in access-controlled registries. And runtime policies enforce what the container can access in production. A failure at any gate blocks progression to the next stage.
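The fail-closed gating described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the gate functions and the artifact dictionary fields (`sha256`, `expected_sha256`, `scan_findings`) are hypothetical placeholders for real verification and scanning steps.

```python
# Minimal sketch of a staged security pipeline: each gate must pass
# before the artifact advances, and the first failure blocks progression.
from typing import Callable

Gate = Callable[[dict], tuple[bool, str]]

def hash_gate(artifact: dict) -> tuple[bool, str]:
    # Hypothetical check: recorded hash matches the expected published hash.
    ok = artifact.get("sha256") == artifact.get("expected_sha256")
    return ok, "hash verified" if ok else "hash mismatch"

def scan_gate(artifact: dict) -> tuple[bool, str]:
    # Hypothetical check: no findings from CVE/malware scanning.
    ok = not artifact.get("scan_findings")
    return ok, "scan clean" if ok else "scan findings present"

def run_pipeline(artifact: dict, gates: list[Gate]) -> tuple[bool, list[str]]:
    """Run gates in order; a failure at any gate blocks the next stage."""
    log = []
    for gate in gates:
        ok, reason = gate(artifact)
        log.append(f"{gate.__name__}: {reason}")
        if not ok:
            return False, log  # fail closed: do not advance
    return True, log
```

The design choice worth noting is that the pipeline is fail-closed: an artifact that cannot prove it passed a gate never reaches the next stage, rather than proceeding with a warning.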
Defense Connection
Container security for AI workloads directly addresses LLM03: Supply Chain. The serialization exploits from Chapter 2 – where loading a model file executes hidden code – are caught at the vulnerability scan gate. Image scanning detects malicious pickle payloads, backdoored dependencies, and known CVEs in model serving frameworks before they reach production.
Model Integrity and Provenance
If container security protects the delivery vehicle, model integrity protects the cargo. Model integrity verification ensures that the model you deploy is the model you intended to deploy – unmodified, untampered, and from a verified source.
```mermaid
graph LR
UM["Untrusted<br/>Model<br/><small>Downloaded from<br/>hub or vendor</small>"]
SIG["Signature<br/>Verification<br/><small>Cryptographic<br/>signing check</small>"]
HASH["Hash<br/>Verification<br/><small>SHA-256 against<br/>published hash</small>"]
PROV["Provenance<br/>Check<br/><small>Source, training<br/>data, chain</small>"]
FMT["Format<br/>Enforcement<br/><small>Safetensors<br/>required</small>"]
APPROVE["Approved<br/>for Deploy<br/><small>All gates<br/>passed</small>"]
REJECT["Rejected<br/><small>Failed<br/>verification</small>"]
UM --> SIG
SIG -->|"Valid"| HASH
SIG -->|"Failed"| REJECT
HASH -->|"Match"| PROV
HASH -->|"Mismatch"| REJECT
PROV -->|"Verified"| FMT
PROV -->|"Unknown"| REJECT
FMT -->|"Safe format"| APPROVE
FMT -->|"Unsafe format"| REJECT
style UM fill:#cc7000,color:#fff
style SIG fill:#2d5016,color:#fff
style HASH fill:#2d5016,color:#fff
style PROV fill:#2d5016,color:#fff
style FMT fill:#2d5016,color:#fff
style APPROVE fill:#1C90F3,color:#fff
style REJECT fill:#8b0000,color:#fff
```
The model integrity verification pipeline ensures that every model passes through multiple verification gates before deployment. An untrusted model enters the pipeline and must clear signature verification, hash matching, provenance validation, and format enforcement before being approved. Failure at any gate results in rejection.
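The format enforcement gate is the simplest to automate. The safetensors format opens with an 8-byte little-endian header length followed by a JSON header, which means a structural check needs no ML framework at all. The sketch below validates that layout only; it is not a substitute for the other gates, and a real gate would also verify the tensor metadata in the header.

```python
# Hedged sketch of a format-enforcement check: safetensors files begin with
# an 8-byte little-endian header length, then a JSON header describing the
# tensors. Pickle-based files fail this structural test immediately.
import json
import struct

def looks_like_safetensors(data: bytes) -> bool:
    """Return True if the byte stream has a plausible safetensors layout."""
    if len(data) < 8:
        return False
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len > len(data) - 8:
        return False  # claimed header is longer than the file itself
    try:
        header = json.loads(data[8:8 + header_len])
    except (ValueError, UnicodeDecodeError):
        return False
    return isinstance(header, dict)
```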
Model Signing
Model signing applies the same principles as code signing: the model creator generates a cryptographic signature over the model artifacts (weights, configuration, tokenizer). Consumers verify the signature before loading the model. If the artifacts have been modified – even a single byte changed in the weights file – the signature verification fails.
Frameworks like Sigstore’s cosign and Hugging Face’s model signing infrastructure enable this workflow. The challenge is adoption: many organizations download models from hubs and load them directly without any signature verification.
Hash Verification
At a minimum, every model artifact should have a published SHA-256 hash that consumers verify after download. This catches:
- Man-in-the-middle modifications during download
- Storage corruption that could alter model behavior unpredictably
- Supply chain substitution where an attacker replaces a legitimate model with a poisoned version
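Hash verification needs nothing beyond the standard library. A minimal sketch, streaming the file in chunks so multi-gigabyte weight files never need to fit in memory (the file path and published hash are illustrative placeholders):

```python
# Post-download hash verification against a published SHA-256 checksum.
import hashlib
import hmac
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks to handle large weight files."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_artifact(path: Path, published_sha256: str) -> bool:
    """Compare the computed hash with the published one."""
    # compare_digest gives constant-time comparison; it costs nothing here.
    return hmac.compare_digest(sha256_file(path), published_sha256.lower())
```

A deployment pipeline would call `verify_artifact` immediately after download and refuse to proceed on a mismatch.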
Provenance Tracking
Provenance goes beyond “is this model intact?” to answer “where did this model come from?” A complete provenance record includes:
- Source: Which organization or individual created the model
- Training data: What datasets were used (at least at a category level)
- Training process: What hardware, framework version, and hyperparameters were used
- Modification history: What fine-tuning, quantization, or adaptation has been applied
- Distribution chain: Which registries or hubs the model passed through before reaching you
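The five fields above can be captured in a simple record type. This schema is illustrative only; it is not a standard provenance format (standards such as SLSA provenance define their own schemas), but it shows the completeness check a registry might enforce on upload.

```python
# A sketch of a provenance record mirroring the fields listed above.
from dataclasses import dataclass

@dataclass(frozen=True)
class ProvenanceRecord:
    source: str                      # organization or individual that created the model
    training_data: list[str]         # dataset categories used in training
    training_process: dict           # hardware, framework version, hyperparameters
    modification_history: list[str]  # fine-tuning, quantization, adaptation steps
    distribution_chain: list[str]    # registries or hubs the model passed through

    def is_complete(self) -> bool:
        """A record missing source, data, process, or chain info is unknown provenance."""
        # modification_history may legitimately be empty for a base model.
        return bool(self.source and self.training_data
                    and self.training_process and self.distribution_chain)
```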
Defense Connection
Model integrity controls defend against LLM04: Data and Model Poisoning at the distribution layer. In Chapter 2 Section 3, you learned how poisoned models can be uploaded to model hubs with legitimate-looking documentation. Provenance tracking and hash verification would detect that a downloaded model doesn’t match the expected artifact from the trusted source – flagging the substitution before the model is loaded.
Supply Chain Defense
The AI model supply chain extends far beyond model weights. It includes every dependency, tool, framework, and service that contributes to building and deploying a model. Defending this supply chain requires a systematic approach.
Model Repository Security
Organizations that host internal model registries need the same security controls they apply to code repositories:
- Access controls: Role-based permissions for uploading, modifying, and downloading models
- Audit logging: Track who uploaded which model, when, and from where
- Automated scanning: Every model uploaded to the registry is automatically scanned for known vulnerabilities and malicious content
- Version pinning: Deployments reference specific model versions by hash, not by mutable tags
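The version pinning rule is easy to enforce automatically. A minimal sketch of a policy check that rejects mutable tag references in favor of immutable digests (the OCI convention `name@sha256:<64 hex chars>`); the reference strings are illustrative:

```python
# Enforce that deployment references pin artifacts by immutable digest,
# not by a mutable tag such as ":latest" that an attacker could repoint.
import re

DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")

def is_digest_pinned(reference: str) -> bool:
    """True if the image/model reference ends in an immutable sha256 digest."""
    return bool(DIGEST_RE.search(reference))
```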
Dependency Auditing
AI applications depend on complex stacks of Python packages, framework versions, and system libraries. A single compromised dependency can undermine the security of the entire stack – as the Ultralytics supply chain attack demonstrated when a cryptocurrency miner was injected into a popular computer vision package.
Dependency auditing for AI projects should include:
- Software Bill of Materials (SBOM): Generate and maintain an SBOM for every model serving container
- Automated vulnerability scanning: Run `pip audit`, `safety check`, or equivalent tools in CI/CD pipelines
- Dependency pinning: Lock all dependencies to specific versions with verified hashes
- Transitive dependency review: Audit not just direct dependencies but their dependencies as well
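A CI gate for the dependency pinning point can be sketched with a simple lint over requirements lines. This is a simplified check that assumes one logical requirement per line (real requirements files may continue hashes onto following lines with `\`); the hash option shown is the format pip's `--require-hashes` mode consumes.

```python
# Flag requirement lines that lack an exact version pin or a verified hash.
def unpinned_requirements(lines: list[str]) -> list[str]:
    """Return requirement lines missing an '==' pin or a '--hash' option."""
    bad = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if "==" not in line or "--hash=sha256:" not in line:
            bad.append(line)
    return bad
```

A CI job would fail the build whenever this returns a non-empty list.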
Defense Connection
Supply chain defense addresses ASI04: Agentic Supply Chain Vulnerabilities at the model and tooling layer. The malicious MCP server case from Chapter 2 Section 5 showed how a single compromised tool provider can affect every agent that connects to it. The same principle applies to model supply chains – a single compromised model or dependency can propagate to every deployment that uses it.
AI Model Vulnerability Scanning
Pre-deployment scanning extends beyond traditional CVE detection to address AI-specific vulnerability categories. An AI model vulnerability scan assesses:
What Gets Scanned
| Scan Type | What It Checks | Why It Matters |
|---|---|---|
| Serialization safety | Model files for pickle exploits, embedded code, unsafe deserialization | Prevents RCE via model loading (Chapter 2 Section 4) |
| Weight integrity | Model weights against known-good hashes | Detects tampering, backdoor insertion |
| Adversarial robustness | Model responses to known adversarial inputs | Identifies models susceptible to adversarial manipulation |
| Alignment verification | Model outputs for safety policy compliance | Catches models with degraded safety alignment from malicious fine-tuning |
| Data leakage potential | Model outputs for memorized training data (PII, credentials) | Identifies models that will leak sensitive information in production |
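The serialization safety row can be illustrated with the standard library alone: `pickletools.genops` walks a pickle stream statically, without ever unpickling it, so risky opcodes can be flagged with no code execution. This is a sketch; production scanners additionally resolve the imported names against allow/deny lists, since opcodes like `NEWOBJ` also appear in benign object pickles.

```python
# Statically inspect a pickle stream for opcodes that can import callables
# and invoke them during unpickling -- without ever loading the data.
import pickletools

# Opcodes that import objects or call them when the stream is unpickled.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}

def suspicious_pickle_ops(data: bytes) -> list[str]:
    """Return the risky opcode names found in a pickle byte stream."""
    found = []
    try:
        for opcode, _arg, _pos in pickletools.genops(data):
            if opcode.name in SUSPICIOUS_OPCODES:
                found.append(opcode.name)
    except ValueError:
        found.append("MALFORMED_STREAM")  # truncated or corrupt stream
    return found
```

Because the check never calls `pickle.loads`, it is safe to run against untrusted model files as a scanning gate.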
When to Scan
Scanning should occur at multiple points in the model lifecycle:
- On acquisition: When downloading or receiving a model from any source
- Post-training/fine-tuning: After any modification to model weights
- Pre-deployment: As a gate in the CI/CD pipeline before production deployment
- Periodic re-assessment: On a regular schedule to catch newly discovered vulnerability patterns
Defense Perspective: Malicious Model Distribution
The attack (from Chapter 2 Section 3): Security researchers discovered multiple malicious models on the Hugging Face Hub that exploited pickle serialization to execute arbitrary code when loaded. The models had proper README files, model cards, and benchmarks – they appeared completely legitimate. But their serialized weight files contained hidden payloads including reverse shells, credential harvesters, and cryptocurrency miners.
What Layer 2 controls would have prevented or mitigated:
- Container vulnerability scanning: The scanning gate in the container security pipeline would have detected the malicious pickle payloads during image build. Artifact scanners that inspect serialized objects for embedded code would flag the hidden payloads before they reach the container registry.
- Model integrity verification: Hash verification against the model creator’s published checksums would fail if the model had been tampered with. Provenance tracking would reveal that the model’s distribution chain didn’t match the expected path from the original creator.
- Format enforcement: A policy requiring safetensors format (which cannot execute code during loading) would block pickle-based model files entirely. Models that only exist in pickle format would require additional review before approval.
- Supply chain governance: An approved model catalog – where only vetted, scanned models are available for deployment – prevents engineers from downloading unreviewed models directly from public hubs.
The key insight: these malicious models passed human review because they looked legitimate. Automated scanning and format enforcement catch what human reviewers miss – the hidden code buried inside serialized weight files.
AI Scanner Cross-Reference
AI Scanner contributes to Layer 2 by assessing models for common attack vulnerabilities before deployment. It evaluates whether a model is susceptible to prompt injection, system prompt leakage, and adversarial manipulation – vulnerabilities that exist in the model itself regardless of infrastructure protections. See Section 9 for the complete AI Scanner/Guard workflow and how the scan-protect-validate-improve cycle integrates with Layer 2’s model security controls.
Trend Vision One Container Security extends to AI model serving environments, scanning container images for vulnerabilities before deployment. Vision One’s artifact scanning can detect malicious serialized objects – such as the pickle-based exploits covered in Chapter 2 – before they execute in production environments. Container Security integrates with the same vulnerability intelligence that powers the broader Vision One platform, ensuring that newly discovered CVEs in model serving frameworks (TensorFlow Serving, vLLM, Triton) are detected within hours of disclosure.
Key Takeaways
- Model integrity verification through cryptographic signing and hash checking detects tampering, backdoor insertion, and supply chain substitution before models reach production
- Container security for AI workloads requires scanning for serialization exploits, malicious dependencies, and CVEs in model serving frameworks at every pipeline stage
- Supply chain defense includes model repository access controls, dependency auditing with SBOMs, and version pinning to prevent compromised artifacts from propagating
- Pre-deployment vulnerability scanning covers serialization safety, weight integrity, adversarial robustness, alignment verification, and data leakage potential
Test Your Knowledge
Ready to test your understanding of AI model security? Head to the quiz to check your knowledge.
Up next
With models secured, the next layer protects the infrastructure that runs them. In Section 5, you’ll learn about Layer 3: Secure Your AI Infrastructure – including AI Security Posture Management (AI-SPM), GPU cluster security, orchestration layer protection, and identity management for AI service accounts.