<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Section 3 Quiz :: Introduction to AI Security</title>
    <link>https://example.org/chapter1/s3/activity/index.html</link>
<description>Test Your Knowledge: Deployment Considerations&#xA;Let’s see how much you’ve learned! This quiz tests your understanding of deployment patterns, cost trade-offs, serialization security, and safety guardrails for AI systems.&#xA;--- shuffle_answers: true shuffle_questions: false --- ## An organization needs to deploy an AI model in an air-gapped military facility with no internet connectivity. Which deployment pattern is its only viable option? &gt; Hint: Consider which deployment patterns require internet access and which do not. - [ ] Cloud API deployment via OpenAI or Anthropic &gt; Cloud APIs require internet connectivity to send requests to the provider&#39;s servers, which is impossible in an air-gapped environment. - [ ] Serverless inference through AWS Bedrock &gt; Serverless platforms are cloud-based and require internet connectivity. - [x] Self-hosted deployment using open-weight models like Llama or DeepSeek &gt; Correct! Self-hosted deployment is the only option for air-gapped environments. The model weights are downloaded once, then inference runs entirely on local hardware with no external network dependency. Only open-weight models (Llama, DeepSeek, Qwen, Mistral) support this approach. - [ ] Edge deployment on smartphones &gt; While edge deployment doesn&#39;t require internet, military applications typically need more capable models than SLMs on smartphones can provide. ## A startup is building a prototype AI chatbot and expects traffic to be unpredictable -- some days 100 requests, other days 10,000. Which deployment approach makes the most economic sense? &gt; Hint: Think about which pricing model best handles variable workloads. - [x] Cloud API with pay-per-token pricing &gt; Correct! Cloud APIs charge per token used, so the startup only pays for actual usage. During low-traffic days it pays very little, and during spikes the API scales automatically. No idle hardware costs and no upfront investment. 
- [ ] Self-hosted deployment on dedicated GPUs &gt; Dedicated GPUs have fixed costs regardless of usage. On low-traffic days, the startup would be paying for idle hardware. - [ ] Edge deployment on user devices &gt; Edge deployment is designed for on-device use cases, not centralized chatbots. - [ ] Building their own custom model from scratch &gt; This would require enormous investment in training infrastructure and expertise, making it inappropriate for a prototype. ## Apple Intelligence, Google Gemini Nano, and Microsoft Phi-4 running on laptops are all examples of which deployment pattern? &gt; Hint: Think about where the AI computation actually happens. - [ ] Cloud API deployment &gt; Cloud APIs process data on remote servers, not on the user&#39;s device. - [ ] Serverless inference &gt; Serverless still runs in the cloud, just without managing the infrastructure yourself. - [ ] Hybrid deployment &gt; While these could be part of a hybrid system, the specific pattern being described is on-device. - [x] Edge / on-device deployment &gt; Correct! These are all examples of Small Language Models running directly on end-user devices -- smartphones, laptops, and tablets. The AI computation happens locally with zero data transmission, providing complete privacy, offline functionality, and ultra-low latency. ## Why is the Safetensors serialization format considered more secure than Pickle for model deployment? &gt; Hint: Think about what happens when a serialized model file is loaded into memory. - [ ] Safetensors uses stronger encryption for model weights &gt; Neither format encrypts model weights. The security difference is about code execution, not encryption. - [ ] Safetensors files are smaller and harder to tamper with &gt; File size and tamper resistance aren&#39;t the primary security distinctions between these formats. 
- [x] Safetensors prevents arbitrary code execution during deserialization, while Pickle allows it -- meaning a tampered Pickle file can run malicious code the moment it&#39;s loaded &gt; Correct! Pickle&#39;s flexibility comes at a security cost: it can execute arbitrary Python code during deserialization. An attacker who tampers with a Pickle model file can inject code that runs automatically when anyone loads the model. Safetensors was specifically designed to prevent this attack vector. - [ ] Safetensors only works with trusted models from verified sources &gt; Safetensors doesn&#39;t verify model sources. Its security comes from preventing code execution during loading, regardless of source. ## A refusal pathway in an LLM is triggered when a user asks how to create a phishing email. What is a known challenge with this safety mechanism? &gt; Hint: Consider the trade-offs between being too restrictive and too permissive. - [ ] Refusal pathways make the model completely unusable &gt; Refusal pathways only trigger for requests the model judges harmful, not all requests. - [x] Over-censorship -- the model may refuse legitimate requests (such as cybersecurity research) that resemble harmful ones &gt; Correct! This is a major challenge. Refusal mechanisms can be overly broad, blocking legitimate queries about cybersecurity, medical research, or chemistry education because they superficially resemble harmful requests. Studies show refusal rates vary widely across models, with some being far more conservative than others. - [ ] Refusal pathways are only present in open-source models &gt; Both open-source and closed-source models implement refusal mechanisms, though they can be removed from open-source models. - [ ] Refusal pathways are 100% effective at preventing all harmful outputs &gt; No safety mechanism is perfect. Research has shown various techniques to bypass refusal pathways, which we&#39;ll explore in Chapter 2. 
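The Pickle deserialization risk from the serialization question above can be sketched with a minimal, stdlib-only example (the class name is hypothetical, and a harmless callable stands in for an attacker&#39;s payload):

```python
# Minimal sketch of why loading untrusted Pickle data is dangerous:
# __reduce__ lets an object smuggle in an arbitrary callable, and
# pickle.loads invokes it the moment the bytes are deserialized.
import pickle

class TamperedCheckpoint:  # hypothetical stand-in for a model file
    def __reduce__(self):
        # A real attacker would return a dangerous callable here;
        # str.upper is used only to show the mechanism safely.
        return (str.upper, ("code ran during unpickling",))

blob = pickle.dumps(TamperedCheckpoint())
result = pickle.loads(blob)  # the callable executes here, at load time
print(result)  # CODE RAN DURING UNPICKLING
```

Safetensors closes this gap by storing only tensor data and metadata, so loading a file never executes embedded code.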
## A hybrid deployment architecture routes simple queries to a local Phi-4 model and complex analysis to Claude Opus 4 via API. What is the primary benefit of this approach? &gt; Hint: Think about what each tier optimizes for. - [ ] It eliminates all security risks &gt; Hybrid deployment doesn&#39;t eliminate security risks -- each tier has its own considerations. - [ ] It makes the system faster than any single deployment option &gt; Speed depends on the specific query. The cloud API tier may be slower than a local-only approach for simple queries. - [x] It optimizes cost by routing to the cheapest capable model for each request while maintaining access to frontier capabilities when needed &gt; Correct! Hybrid deployment lets you handle the majority of simple requests cheaply and privately on local models, while reserving expensive cloud API calls for complex tasks that genuinely need frontier capabilities. This can reduce costs by 60-80% compared to using a frontier model for everything. - [ ] It requires less total infrastructure than a single deployment option &gt; Hybrid actually requires more infrastructure (both local hardware and API integration) but optimizes the cost-performance trade-off. ## Moderation endpoints serve as external safety tools alongside a model&#39;s built-in refusal pathways. What limitation do they share even when both are implemented together? &gt; Hint: Think about the inherent challenges of automated content assessment. - [ ] They cannot process text in languages other than English &gt; Modern moderation endpoints support multiple languages and content types. - [ ] They make the model too slow for real-time applications &gt; While moderation adds latency, modern implementations are optimized for real-time use. - [x] They may still produce false positives (blocking legitimate content) and false negatives (missing harmful content) because automated content classification remains imperfect &gt; Correct! 
Even with both refusal pathways and moderation endpoints, AI content safety systems struggle with nuance and context. A medical discussion about drug interactions might be flagged as drug-related harmful content (false positive), while a cleverly worded harmful request might slip through (false negative). - [ ] They only work with cloud-deployed models, not self-hosted ones &gt; Moderation endpoints can be implemented for any deployment type, though some are provider-specific services.</description>
    <generator>Hugo</generator>
    <language>en-us</language>
    <atom:link href="https://example.org/chapter1/s3/activity/index.xml" rel="self" type="application/rss+xml" />
  </channel>
</rss>