4. Layer 2: Secure Your AI Models

Introduction

AI models are both an organization’s most valuable intellectual property and one of its most vulnerable attack surfaces. A fine-tuned model encodes months of training, proprietary data, and domain expertise into a single artifact – an artifact that can be poisoned, stolen, or weaponized if left unprotected. In Chapter 2, you saw how supply chain attacks inject malicious code through model files, how serialization exploits turn model loading into remote code execution, and how agentic supply chain vulnerabilities compromise the tools that deliver models to production.

Layer 2 of the Security for AI Blueprint addresses these threats by securing the model lifecycle – from the moment a model artifact is built or downloaded through container packaging, vulnerability scanning, integrity verification, and deployment. This layer treats every model as potential executable code until proven safe, and every model source as untrusted until verified.

What will I get out of this?

By the end of this section, you will be able to:

  1. Explain how container security applies to AI model serving environments and why containerized deployments require AI-specific scanning.
  2. Describe model integrity verification techniques including model signing, hash verification, and provenance tracking.
  3. Design a supply chain defense strategy for AI model artifacts, including repository security and dependency auditing.
  4. Identify pre-deployment vulnerability scanning approaches for AI models, including adversarial robustness testing.
  5. Map Layer 2 controls to specific OWASP categories and explain how they defend against supply chain and poisoning attacks.
  6. Apply the Layer 2 Security Checklist to evaluate an organization’s model security posture.

Container Security for AI Workloads

Most production AI models run inside containers – Docker images deployed to Kubernetes clusters, cloud inference endpoints, or edge devices. The container is the delivery mechanism that packages the model, its runtime dependencies, and its serving infrastructure into a deployable unit. If the container is compromised, the model is compromised.

AI containers differ from typical application containers in important ways:

  • Large base images: Model serving frameworks (TensorFlow Serving, vLLM, Triton Inference Server) require GPU drivers, CUDA libraries, and ML-specific dependencies that significantly expand the attack surface
  • Model artifacts inside the container: The model weights, configuration files, and tokenizers are packaged alongside the runtime – any vulnerability in these artifacts becomes part of the deployed container
  • Elevated hardware access: GPU passthrough and shared memory requirements mean AI containers often need broader host access than typical microservices
  • Long-running processes: Inference servers run continuously, unlike batch containers that start and stop – giving attackers more time to exploit vulnerabilities

The Container Security Pipeline

graph LR
    MA["Model Artifact<br/><small>Weights, config,<br/>tokenizer files</small>"]
    CB["Container Build<br/><small>Base image + model +<br/>serving framework</small>"]
    VS["Vulnerability Scan<br/><small>Image scanning,<br/>dependency audit,<br/>artifact verification</small>"]
    SR["Secure Registry<br/><small>Signed images,<br/>access-controlled<br/>artifact store</small>"]
    DP["Deployment<br/><small>Runtime protection,<br/>monitoring, policy<br/>enforcement</small>"]

    MA -->|"Integrity<br/>verified"| CB
    CB -->|"Build<br/>complete"| VS
    VS -->|"Scan<br/>passed"| SR
    SR -->|"Authorized<br/>pull"| DP

    G1["Gate: Hash<br/>Verification"]
    G2["Gate: CVE +<br/>Malware Scan"]
    G3["Gate: Image<br/>Signing"]
    G4["Gate: Runtime<br/>Policy"]

    G1 -.-> MA
    G2 -.-> VS
    G3 -.-> SR
    G4 -.-> DP

    style MA fill:#2d5016,color:#fff
    style CB fill:#2d5016,color:#fff
    style VS fill:#2d5016,color:#fff
    style SR fill:#2d5016,color:#fff
    style DP fill:#2d5016,color:#fff
    style G1 fill:#1C90F3,color:#fff
    style G2 fill:#1C90F3,color:#fff
    style G3 fill:#1C90F3,color:#fff
    style G4 fill:#1C90F3,color:#fff

Every stage in this pipeline has a security gate. Model artifacts are hash-verified before inclusion in the build. Built images are scanned for known CVEs and malware. Signed images are stored in access-controlled registries. And runtime policies enforce what the container can access in production. A failure at any gate blocks progression to the next stage.
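
The fail-closed gate progression can be sketched as a small pipeline runner. This is a minimal illustration of the pattern, assuming hypothetical gate callables that raise on failure; `run_pipeline` is not a real product API.

```python
from typing import Callable, List, Tuple

# A gate inspects an artifact and raises an exception to block progression.
Gate = Callable[[str], None]

def run_pipeline(artifact: str, gates: List[Tuple[str, Gate]]) -> None:
    """Run each security gate in order; the first failure blocks all later stages."""
    for name, gate in gates:
        try:
            gate(artifact)
        except Exception as exc:
            raise RuntimeError(f"gate '{name}' failed for {artifact}: {exc}") from exc
```

Because each gate runs only after the previous one succeeds, a failed CVE scan never reaches the signing or deployment stage.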

Defense Connection

Container security for AI workloads directly addresses LLM03: Supply Chain. The serialization exploits from Chapter 2 – where loading a model file executes hidden code – are caught at the vulnerability scan gate. Image scanning detects malicious pickle payloads, backdoored dependencies, and known CVEs in model serving frameworks before they reach production.


Model Integrity and Provenance

If container security protects the delivery vehicle, model integrity protects the cargo. Model integrity verification ensures that the model you deploy is the model you intended to deploy – unmodified, untampered with, and from a verified source.

graph LR
    UM["Untrusted<br/>Model<br/><small>Downloaded from<br/>hub or vendor</small>"]
    SIG["Signature<br/>Verification<br/><small>Cryptographic<br/>signing check</small>"]
    HASH["Hash<br/>Verification<br/><small>SHA-256 against<br/>published hash</small>"]
    PROV["Provenance<br/>Check<br/><small>Source, training<br/>data, chain</small>"]
    FMT["Format<br/>Enforcement<br/><small>Safetensors<br/>required</small>"]
    APPROVE["Approved<br/>for Deploy<br/><small>All gates<br/>passed</small>"]
    REJECT["Rejected<br/><small>Failed<br/>verification</small>"]

    UM --> SIG
    SIG -->|"Valid"| HASH
    SIG -->|"Failed"| REJECT
    HASH -->|"Match"| PROV
    HASH -->|"Mismatch"| REJECT
    PROV -->|"Verified"| FMT
    PROV -->|"Unknown"| REJECT
    FMT -->|"Safe format"| APPROVE
    FMT -->|"Unsafe format"| REJECT

    style UM fill:#cc7000,color:#fff
    style SIG fill:#2d5016,color:#fff
    style HASH fill:#2d5016,color:#fff
    style PROV fill:#2d5016,color:#fff
    style FMT fill:#2d5016,color:#fff
    style APPROVE fill:#1C90F3,color:#fff
    style REJECT fill:#8b0000,color:#fff

The model integrity verification pipeline ensures that every model passes through multiple verification gates before deployment. An untrusted model enters the pipeline and must clear signature verification, hash matching, provenance validation, and format enforcement before being approved. Failure at any gate results in rejection.

Model Signing

Model signing applies the same principles as code signing: the model creator generates a cryptographic signature over the model artifacts (weights, configuration, tokenizer). Consumers verify the signature before loading the model. If the artifacts have been modified – even a single byte changed in the weights file – the signature verification fails.

Frameworks like Sigstore’s cosign and Hugging Face’s model signing infrastructure enable this workflow. The challenge is adoption: many organizations download models from hubs and load them directly without any signature verification.

Hash Verification

At a minimum, every model artifact should have a published SHA-256 hash that consumers verify after download. This catches:

  • Man-in-the-middle modifications during download
  • Storage corruption that could alter model behavior unpredictably
  • Supply chain substitution where an attacker replaces a legitimate model with a poisoned version
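
A minimal sketch of post-download hash verification using Python's standard library; the function names are illustrative, and the file is streamed so multi-gigabyte weight files never load fully into memory.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 of a file by streaming it in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifact(path: str, published_hash: str) -> None:
    """Fail closed: raise unless the file matches the published SHA-256."""
    actual = sha256_of(path)
    if actual != published_hash.lower():
        raise RuntimeError(
            f"hash mismatch for {path}: expected {published_hash}, got {actual}"
        )
```

Run this check immediately after download and again before loading, so that both in-transit tampering and at-rest substitution are caught.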

Provenance Tracking

Provenance goes beyond “is this model intact?” to answer “where did this model come from?” A complete provenance record includes:

  • Source: Which organization or individual created the model
  • Training data: What datasets were used (at least at a category level)
  • Training process: What hardware, framework version, and hyperparameters were used
  • Modification history: What fine-tuning, quantization, or adaptation has been applied
  • Distribution chain: Which registries or hubs the model passed through before reaching you
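
One way to make such a record concrete is a small schema. This is a hypothetical sketch, not a standardized provenance format; real deployments may prefer established formats such as SLSA provenance attestations.

```python
import json
from dataclasses import asdict, dataclass, field

@dataclass
class ProvenanceRecord:
    """Illustrative provenance schema mirroring the fields listed above."""
    source: str                 # organization or individual that created the model
    training_data: list         # dataset categories used in training
    training_process: dict      # hardware, framework version, hyperparameters
    modification_history: list = field(default_factory=list)  # fine-tunes, quantization
    distribution_chain: list = field(default_factory=list)    # registries/hubs traversed

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)
```

Storing this record alongside the model hash lets the integrity pipeline answer both "is this artifact intact?" and "did it arrive by the expected path?".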

Defense Connection

Model integrity controls defend against LLM04: Data and Model Poisoning at the distribution layer. In Chapter 2 Section 3, you learned how poisoned models can be uploaded to model hubs with legitimate-looking documentation. Provenance tracking and hash verification would detect that a downloaded model doesn’t match the expected artifact from the trusted source – flagging the substitution before the model is loaded.


Supply Chain Defense

The AI model supply chain extends far beyond model weights. It includes every dependency, tool, framework, and service that contributes to building and deploying a model. Defending this supply chain requires a systematic approach.

Model Repository Security

Organizations that host internal model registries need the same security controls they apply to code repositories:

  • Access controls: Role-based permissions for uploading, modifying, and downloading models
  • Audit logging: Track who uploaded which model, when, and from where
  • Automated scanning: Every model uploaded to the registry is automatically scanned for known vulnerabilities and malicious content
  • Version pinning: Deployments reference specific model versions by hash, not by mutable tags

Dependency Auditing

AI applications depend on complex stacks of Python packages, framework versions, and system libraries. A single compromised dependency can undermine the security of the entire stack – as the Ultralytics supply chain attack demonstrated when a cryptocurrency miner was injected into a popular computer vision package.

Dependency auditing for AI projects should include:

  • Software Bill of Materials (SBOM): Generate and maintain an SBOM for every model serving container
  • Automated vulnerability scanning: Run pip audit, safety check, or equivalent tools in CI/CD pipelines
  • Dependency pinning: Lock all dependencies to specific versions with verified hashes
  • Transitive dependency review: Audit not just direct dependencies but their dependencies as well
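
As a starting point, the installed-package inventory behind an SBOM can be enumerated with Python's standard library. This is a minimal sketch of that inventory step, not a replacement for full CycloneDX or SPDX SBOM generation; `minimal_sbom` is an illustrative name.

```python
import importlib.metadata as md

def minimal_sbom() -> list:
    """List every installed distribution as a name/version pair,
    sorted case-insensitively. A real SBOM adds hashes, licenses,
    and transitive relationships in CycloneDX or SPDX format."""
    entries = (
        {"name": d.metadata["Name"], "version": d.version}
        for d in md.distributions()
    )
    return sorted(entries, key=lambda e: (e["name"] or "").lower())
```

Emitting this inventory at container build time, and diffing it between builds, surfaces unexpected dependency changes before they reach production.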

Defense Connection

Supply chain defense addresses ASI04: Agentic Supply Chain Vulnerabilities at the model and tooling layer. The malicious MCP server case from Chapter 2 Section 5 showed how a single compromised tool provider can affect every agent that connects to it. The same principle applies to model supply chains – a single compromised model or dependency can propagate to every deployment that uses it.


AI Model Vulnerability Scanning

Pre-deployment scanning extends beyond traditional CVE detection to address AI-specific vulnerability categories. An AI model vulnerability scan assesses several dimensions of model safety.

What Gets Scanned

| Scan Type | What It Checks | Why It Matters |
|---|---|---|
| Serialization safety | Model files for pickle exploits, embedded code, unsafe deserialization | Prevents RCE via model loading (Chapter 2 Section 4) |
| Weight integrity | Model weights against known-good hashes | Detects tampering, backdoor insertion |
| Adversarial robustness | Model responses to known adversarial inputs | Identifies models susceptible to adversarial manipulation |
| Alignment verification | Model outputs for safety policy compliance | Catches models with degraded safety alignment from malicious fine-tuning |
| Data leakage potential | Model outputs for memorized training data (PII, credentials) | Identifies models that will leak sensitive information in production |
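
The serialization-safety check can be approximated with the standard library's `pickletools`, which disassembles a pickle stream without executing it. This sketch flags opcodes that can import or call arbitrary objects; note that legitimate PyTorch checkpoints also use these opcodes, so a production scanner pairs this with an allowlist of known-safe globals rather than rejecting on opcode presence alone.

```python
import pickletools

# Opcodes that can import callables or invoke them during unpickling.
SUSPICIOUS_OPS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def scan_pickle(data: bytes) -> list:
    """Return the names of risky opcodes found in a pickle stream,
    without ever loading (executing) the pickle itself."""
    findings = []
    for op, _arg, _pos in pickletools.genops(data):
        if op.name in SUSPICIOUS_OPS:
            findings.append(op.name)
    return findings
```

A pickle of plain data (lists, ints, strings) produces no findings, while a payload built with `__reduce__` surfaces the import-and-call opcodes that make pickle loading dangerous.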

When to Scan

Scanning should occur at multiple points in the model lifecycle:

  1. On acquisition: When downloading or receiving a model from any source
  2. Post-training/fine-tuning: After any modification to model weights
  3. Pre-deployment: As a gate in the CI/CD pipeline before production deployment
  4. Periodic re-assessment: On a regular schedule to catch newly discovered vulnerability patterns

Defense Perspective: Malicious Model Distribution

The attack (from Chapter 2 Section 3): Security researchers discovered multiple malicious models on the Hugging Face Hub that exploited pickle serialization to execute arbitrary code when loaded. The models had proper README files, model cards, and benchmarks – they appeared completely legitimate. But their serialized weight files contained hidden payloads including reverse shells, credential harvesters, and cryptocurrency miners.

What Layer 2 controls would have prevented or mitigated:

  1. Container vulnerability scanning: The scanning gate in the container security pipeline would have detected the malicious pickle payloads during image build. Artifact scanners that inspect serialized objects for embedded code would flag the hidden payloads before they reach the container registry.

  2. Model integrity verification: Hash verification against the model creator’s published checksums would fail if the model had been tampered with. Provenance tracking would reveal that the model’s distribution chain didn’t match the expected path from the original creator.

  3. Format enforcement: A policy requiring safetensors format (which cannot execute code during loading) would block pickle-based model files entirely. Models that only exist in pickle format would require additional review before approval.

  4. Supply chain governance: An approved model catalog – where only vetted, scanned models are available for deployment – prevents engineers from downloading unreviewed models directly from public hubs.

The key insight: these malicious models passed human review because they looked legitimate. Automated scanning and format enforcement catch what human reviewers miss – the hidden code buried inside serialized weight files.
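
A format-enforcement gate can cheaply reject files that are not structurally valid safetensors. This sketch checks the documented file layout (an 8-byte little-endian header length followed by a JSON header); it is a pre-filter for the policy described above, not a substitute for loading with the official safetensors library, and the size cap is an illustrative assumption.

```python
import json
import struct

def looks_like_safetensors(path: str, max_header: int = 100 * 1024 * 1024) -> bool:
    """Structural check against the safetensors layout:
    u64 little-endian header length, then that many bytes of JSON."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            return False
        (n,) = struct.unpack("<Q", prefix)
        if n == 0 or n > max_header:  # implausible header size; assumed cap
            return False
        header = f.read(n)
        if len(header) != n:
            return False
    try:
        obj = json.loads(header)
    except ValueError:
        return False
    return isinstance(obj, dict)
```

A pickle file fed to this check fails immediately, because its leading bytes do not decode to a plausible header length, which is exactly the fail-closed behavior a format gate needs.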


AI Scanner Cross-Reference

AI Scanner contributes to Layer 2 by assessing models for common attack vulnerabilities before deployment. The scanner evaluates whether a model is susceptible to prompt injection, system prompt leakage, and adversarial manipulation – vulnerabilities that exist in the model itself regardless of infrastructure protections. See Section 9 for the complete AI Scanner/Guard workflow and how the scan-protect-validate-improve cycle integrates with Layer 2’s model security controls.

Trend Vision One Container Security extends to AI model serving environments, scanning container images for vulnerabilities before deployment. Vision One’s artifact scanning can detect malicious serialized objects – such as the pickle-based exploits covered in Chapter 2 – before they execute in production environments. Container Security integrates with the same vulnerability intelligence that powers the broader Vision One platform, ensuring that newly discovered CVEs in model serving frameworks (TensorFlow Serving, vLLM, Triton) are detected within hours of disclosure.


Layer 2 Model Security Checklist

Use this checklist to evaluate your organization’s Layer 2 security posture:

  • Enforce safetensors format – require safetensors for all model files where available; pickle-based formats require explicit security review before approval
  • Verify model integrity – every model download is verified against published SHA-256 hashes before loading
  • Track model provenance – maintain records of source, training data category, modification history, and distribution chain for every deployed model
  • Sign model artifacts – use cryptographic signing (cosign, Hugging Face model signing) for internally produced models
  • Scan container images – all AI serving containers are scanned for CVEs, malware, and malicious artifacts before registry push
  • Maintain approved model catalog – only vetted, scanned models are available for production deployment; public hub models require review
  • Generate model SBOMs – every model serving container has a Software Bill of Materials tracking all dependencies
  • Pin dependencies – all Python packages and framework versions are pinned to specific versions with verified hashes
  • Automate vulnerability scanning – CI/CD pipeline includes automated model and dependency scanning as a deployment gate
  • Monitor for new CVEs – continuous monitoring for vulnerabilities in deployed model serving frameworks with automated alerting
  • Audit model access – logging and monitoring for who downloads, modifies, or deploys models in internal registries

Key Takeaways
  • Model integrity verification through cryptographic signing and hash checking detects tampering, backdoor insertion, and supply chain substitution before models reach production
  • Container security for AI workloads requires scanning for serialization exploits, malicious dependencies, and CVEs in model serving frameworks at every pipeline stage
  • Supply chain defense includes model repository access controls, dependency auditing with SBOMs, and version pinning to prevent compromised artifacts from propagating
  • Pre-deployment vulnerability scanning covers serialization safety, weight integrity, adversarial robustness, alignment verification, and data leakage potential

Test Your Knowledge

Ready to test your understanding of AI model security? Head to the quiz to check your knowledge.


Up next

With models secured, the next layer protects the infrastructure that runs them. In Section 5, you’ll learn about Layer 3: Secure Your AI Infrastructure – including AI Security Posture Management (AI-SPM), GPU cluster security, orchestration layer protection, and identity management for AI service accounts.