Secure LLM Usage
Click any control badge to view its details. Download SVG
Key Control Areas
Data Classification for LLM Contexts
Prompt Security
Jailbreaking and Safety Feature Bypass
Output Validation and Hallucination Risk
Provider Security and Data Residency
Privacy Inference Attacks on Fine-Tuned Models
Model Extraction and System Prompt Protection
Audit of LLM Interactions
When to Use
Organisation uses LLMs via API or embedded enterprise products (Microsoft 365 Copilot, GitHub Copilot, Gemini for Workspace) for business-sensitive tasks including data processing, analysis, or decision support. LLMs are integrated into enterprise applications as a component. Organisation has or is planning to fine-tune LLMs on enterprise data including customer records, employee data, or proprietary content. LLMs are used in security tooling: AI-assisted SIEM analysis, AI-augmented code review, AI-based threat intelligence. Employees use consumer LLM tools for work purposes, introducing enterprise data into provider-side systems.
When NOT to Use
Organisation has no AI integration plans and no employee exposure to LLM tools in any operational capacity. AI usage is strictly limited to isolated, air-gapped, non-sensitive tasks with no enterprise data input. Note: this contra-indication is increasingly rare — even organisations that do not formally deploy LLMs may have employees using AI tools informally, and many enterprise SaaS products now embed LLMs by default.
Typical Challenges
Prompt injection defence maturity remains at the equivalent of early SQL injection — the vulnerability class is well-understood but reliable systematic defences do not exist. No framework equivalent to parameterised queries has emerged for prompt injection, and mitigations (input/output filtering, canary tokens, instruction hierarchy) reduce risk but can be bypassed by determined attackers. Provider-side data handling is contractually constrained but practically opaque: what providers log, retain, and potentially use for training is difficult to audit independently. Privacy inference attacks on fine-tuned models (membership inference, model inversion) require ML engineering competence to assess and mitigate that most security teams have not yet developed. Output validation is inherently difficult — there is no deterministic way to verify LLM output correctness, and human review of high-volume LLM output at scale is impractical. Jailbreak techniques evolve continuously as researchers and attackers discover new ways to probe model safety constraints — defences effective today may fail after model updates. Provider lock-in grows as fine-tuned models and proprietary system prompts become tied to specific provider APIs.
Threat Resistance
Prompt injection — direct (system prompt override via user input) and indirect (malicious instructions embedded in fetched content, documents, or tool outputs) — addressed through strict instruction/data separation, output validation before LLM-parsed data drives any action, and boundary protection preventing untrusted content from reaching the system prompt context. Jailbreaking and safety feature bypass defended through adversarial pre-deployment testing that characterises the model's actual safety envelope, application-layer guardrails that complement model-level safety, and anomaly monitoring for adversarial prompt patterns. Hallucination-driven misconfiguration mitigated by output verification requirements scaled to decision stakes and mandatory human review for security-critical outputs — never using LLM output as the sole authority for security decisions. Provider-side data exposure addressed through DPAs, training opt-out agreements, zero data retention where required, and data classification controls governing what enters context windows. Privacy inference attacks on fine-tuned models (membership inference, model inversion) mitigated through privacy impact assessments before fine-tuning, differential privacy in training pipelines, and strict data minimisation in fine-tuning datasets. Model extraction and system prompt leakage addressed through rate limiting, query pattern monitoring, and treating system prompts as classified security policy documents. AI-amplified social engineering addressed through awareness training calibrated to the scale and sophistication of AI-generated phishing and pretexting.
Assumptions
Organisations access LLMs via cloud API services (Anthropic, OpenAI, Google, Azure OpenAI) over TLS-protected network connections. The pattern covers any use of LLMs: direct API integration, embedded enterprise products (Microsoft 365 Copilot, GitHub Copilot, Gemini for Workspace), and fine-tuned models on enterprise data. Some organisations additionally deploy open-weight models (Llama, Mistral) self-hosted on-premises, where provider-side risks shift but model-layer risks remain. The security risks addressed here apply at the model interaction layer regardless of whether the LLM underlies a chat interface, an application, or an agent. For security of AI agents that take autonomous actions and invoke tools, see SP-047 (Secure Agentic AI Frameworks). Model capabilities and the associated attack surface are advancing rapidly — this pattern should be reviewed quarterly.
Developing Areas
- Jailbreak research and defence maturity: jailbreaking techniques are catalogued by researchers (role-play, many-shot, encoding, context manipulation) but defences remain reactive rather than systematic. The community is debating whether model-level training mitigations or application-layer policy engines are the more durable solution. No consensus has formed on standardised adversarial testing scope or pass/fail criteria for LLM safety assessments.
- Privacy inference attack tooling: membership inference and model inversion attack toolkits (LiRA, MI-BENCH) are maturing as research tools but are not yet routinely used in enterprise security assessments. The gap between academic attack capability and enterprise defensive awareness is significant — organisations fine-tuning on personal data may be exposed to GDPR liability risks they have not assessed.
- EU AI Act Article 50 (synthetic content disclosure): the obligation to disclose AI involvement in content generation creates downstream security implications for phishing detection, forensic attribution, and content authenticity verification. The technical implementation of disclosure (watermarking, provenance metadata) is unresolved and the enforcement ecosystem is not yet mature.
- NIST AI 100-2 and adversarial ML standardisation: NIST's Adversarial Machine Learning taxonomy provides the most structured framework for LLM attack classification, but enterprise implementation guidance translating the taxonomy into specific testing requirements and control baselines is still developing.
Related Patterns
Patterns that operate within or alongside this one. Click any to view.