Data Security Pattern
Key Control Areas
- Access Enforcement and Least Privilege
- Data Classification and Labelling
- Information Flow and Boundary Protection
- Cryptographic Protection
- Media Lifecycle Management
- Data Integrity and Quality
- Risk Assessment and Privacy
When to Use
Organisations that process Personally Identifiable Information (PII), operate in regulated sectors (financial services, healthcare, government, defence), process commercially sensitive information, handle payment card data subject to PCI DSS, or manage intellectual property that represents competitive advantage. Any organisation subject to data protection regulations such as GDPR, CCPA, or HIPAA, or to sector-specific requirements. Organisations pursuing ISO 27001 certification, where Annex A controls A.5.12-A.5.14 and A.8.10-A.8.12 require formal data classification and handling. Organisations that have experienced data breaches or near-misses and need to strengthen their data protection posture.
When NOT to Use
Information that is freely available from multiple public sources and carries no confidentiality, integrity, or regulatory obligation. Organisations that exclusively handle public-domain data with no personal information component. Even in these cases, however, data integrity and availability controls typically remain relevant.
Typical Challenges
Management appetite and organisational buy-in are the primary obstacles: data security requires sustained investment and cultural change, not a one-time project. Selling data classification to business units that view labelling as overhead rather than protection requires demonstrating tangible risk reduction and regulatory compliance benefits. Keeping the classification scheme simple enough for consistent application while still capturing meaningful differences in data sensitivity is a constant tension. New data vectors (cloud storage, SaaS applications, AI/ML training data, IoT telemetry) emerge faster than policies can adapt. Shadow IT creates uncontrolled data repositories outside the security architecture. Data sprawl across hybrid and multi-cloud environments makes comprehensive inventory and consistent control application extremely difficult. Encryption key management at scale introduces operational complexity and availability risks. Data sharing with business partners and third parties requires balancing protection with usability. Improving services means greater use of data within organisations and more sharing externally, which continuously expands the attack surface. The level and sophistication of external threats, including targeted data exfiltration by advanced persistent threats and e-crime groups, are increasing.
Threat Resistance
- Data exfiltration by external attackers through network intrusion, application exploitation, or supply chain compromise.
- Insider threats, including malicious data theft by employees or contractors and negligent data exposure through mishandling.
- Ransomware and destructive malware targeting data availability and integrity.
- Man-in-the-middle attacks intercepting data in transit over untrusted networks.
- Unauthorised access to data at rest through stolen credentials, privilege escalation, or physical media theft.
- Data leakage through removable media, cloud synchronisation services, email, or printing.
- Residual data exposure from improperly sanitised media or decommissioned systems.
- Privacy violations from excessive data collection, unauthorised processing, or inadequate retention controls.
- Regulatory enforcement actions resulting from non-compliant data handling practices.
Assumptions
The organisation has established a data classification scheme and personnel understand their obligations for classifying and handling data at each level. Data ownership is defined, with business data owners accountable for classification decisions and IT responsible for implementing technical controls. A cryptographic key management infrastructure exists or is planned. The organisation maintains a data inventory or is working toward one, covering structured and unstructured data across on-premises and cloud environments. Legal and regulatory data protection requirements have been identified and mapped to the classification scheme.
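As an illustration of what an established classification scheme with defined ownership might look like in machine-readable form, the following is a minimal sketch only. The four level names, the handling rules attached to each level, and the `DataAsset` fields are all hypothetical choices made for this example; a real scheme would be set by the organisation's own policy and regulatory mapping.

```python
from dataclasses import dataclass
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical four-level scheme; a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class HandlingRule:
    encrypt_at_rest: bool
    encrypt_in_transit: bool
    external_sharing_allowed: bool


# Illustrative handling obligations per level (assumed values, not a standard).
HANDLING = {
    Classification.PUBLIC: HandlingRule(False, True, True),
    Classification.INTERNAL: HandlingRule(False, True, False),
    Classification.CONFIDENTIAL: HandlingRule(True, True, False),
    Classification.RESTRICTED: HandlingRule(True, True, False),
}


@dataclass(frozen=True)
class DataAsset:
    name: str
    owner: str  # accountable business data owner
    classification: Classification


def can_share_externally(asset: DataAsset) -> bool:
    """Look up the sharing rule for the asset's classification level."""
    return HANDLING[asset.classification].external_sharing_allowed


def effective_classification(assets) -> Classification:
    """A combined data set inherits the highest classification of its parts."""
    return max(a.classification for a in assets)
```

Keeping the levels ordered (here via `IntEnum`) makes the "highest classification wins" rule for aggregated data a one-line comparison, which is one way to keep the scheme simple enough to apply consistently.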
Developing Areas
- Automated data classification accuracy remains a significant gap. Tools such as Microsoft Purview, Titus, and Amazon Macie can identify structured sensitive data patterns (credit card numbers, National Insurance numbers) with high confidence, but classification of unstructured content (strategic documents, intellectual property, commercially sensitive analysis) relies on machine learning models whose accuracy is typically below 80%. Misclassification in either direction is costly: over-classification creates friction and label fatigue, while under-classification leaves sensitive data unprotected. Organisations deploying automated classification must invest in continuous model tuning and human review of edge cases.
- Data residency enforcement in multi-cloud and SaaS environments is becoming a complex operational challenge, driven by proliferating sovereignty requirements. The EU Data Act, the GDPR transfer restrictions as interpreted in Schrems II, China's PIPL, and India's DPDPA each impose different constraints on where data can be stored and processed. Technical enforcement mechanisms (cloud provider region locks, data boundary configurations, and cross-border transfer monitoring) exist but require constant maintenance as cloud providers add new regions and services. Most organisations lack a unified view of where their data actually resides across hundreds of SaaS applications.
- Homomorphic encryption is transitioning from theoretical curiosity to early practical application, enabling computation on encrypted data without decryption. Implementations from IBM (HElib), Microsoft (SEAL), and startups such as Zama and Duality Technologies now support specific use cases including privacy-preserving analytics, encrypted database queries, and secure multi-party computation. However, performance overhead remains in the region of 1,000-10,000x compared to plaintext operations, limiting practical deployment to narrow use cases where the privacy benefit justifies the computational cost.
- Unstructured data protection in AI training pipelines is an urgent and largely unsolved problem. Organisations feeding corporate data into large language models for fine-tuning or retrieval-augmented generation (RAG) risk exposing confidential information, personal data, and intellectual property through model outputs. Data classification must extend to AI pipeline inputs, but existing DLP and classification tools were not designed for the vector embeddings, chunked document stores, and prompt-response patterns that characterise modern AI data flows. The governance frameworks for AI training data are still being defined by regulators and standards bodies.
- Cross-border data transfer mechanisms post-Schrems II remain in flux despite the EU-US Data Privacy Framework adopted in 2023. Legal challenges to the framework are anticipated, and organisations that relied on Privacy Shield (invalidated in 2020) and then scrambled to implement Standard Contractual Clauses are wary of building architecture around transfer mechanisms that may be invalidated again. The technical response (data localisation, encryption with customer-held keys, and processing-in-place architectures) adds significant cost and complexity that most organisations are still evaluating.
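The first item above notes that structured patterns such as credit card numbers can be detected with high confidence. One reason is that such identifiers carry a built-in checksum, so a pattern match can be confirmed arithmetically. The sketch below illustrates the idea with a regular expression plus a Luhn check. It is not how any of the named products work internally; the function names and the candidate pattern (13 to 19 digits, optionally separated by single spaces or hyphens) are assumptions made for this example.

```python
import re

# Candidate card numbers: 13-19 digits, optionally separated by a single
# space or hyphen. This pattern is an assumption for illustration only.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return normalised digit strings that match the pattern and pass Luhn."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

Applied to text containing the widely used test number `4111 1111 1111 1111` and an arbitrary 16-digit reference, only the test number passes the checksum. Unstructured content has no equivalent check, which is part of why its classification depends on statistical models and human review.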