SP-013 Data Security Pattern

Data is the asset that organisations ultimately exist to protect. While perimeter defences, access controls, and monitoring capabilities are all essential, they exist in service of one fundamental objective: preventing unauthorised access to, modification of, or destruction of the organisation's data. The Data Security pattern addresses this directly by establishing a comprehensive framework for classifying data according to its sensitivity and business value, then applying proportionate controls across the entire data lifecycle -- from creation through processing, storage, transmission, and eventual disposal.
The pattern begins with data classification, which is the foundation upon which all other data security controls are built. Without a clear, consistently applied classification scheme, organisations cannot make rational decisions about protection levels. Classification must be practical -- schemes with too many tiers create confusion and inconsistent labelling, while oversimplified schemes fail to differentiate between data that requires fundamentally different protection. A three-to-four tier model (Public, Internal, Confidential, Restricted) is the pragmatic standard for most organisations. Each tier maps to specific handling requirements for storage, transmission, access, and disposal.
Access control is the enforcement mechanism for classification decisions. Data must be protected by role-based access controls implementing least privilege and separation of duties principles. Information flow enforcement ensures that data does not move between security domains without appropriate controls -- for example, preventing confidential data from being copied to uncontrolled external systems or transmitted over unencrypted channels. These controls must operate at the network layer (boundary protection, flow enforcement), the application layer (input validation, output handling), and the physical layer (media controls, facility access).
Cryptography is the technical backbone of data confidentiality and integrity. Data at rest must be encrypted where its classification warrants it, using validated cryptographic algorithms and properly managed keys. Data in transit requires transmission confidentiality and integrity controls -- TLS for application traffic, VPN or dedicated links for network-level protection. The choice of cryptographic controls should be risk-driven and aligned with current best practice; organisations should maintain cryptographic agility to respond to evolving threats, including post-quantum computing considerations.
The pattern also addresses the full media lifecycle: how data is stored on physical and removable media, how media is labelled, how access to media is controlled, how media is transported between locations, and critically, how media is sanitised and disposed of when no longer needed. Data remanence -- the residual representation of data after attempted erasure -- is a persistent risk that requires deliberate sanitisation procedures.
Privacy impact assessments ensure that data processing activities involving personal information are evaluated for compliance with data protection regulations before implementation.

Release: 26.02 Authors: Aurelius, Vitruvius Updated: 2026-02-06

ATT&CK: This pattern addresses 425 techniques across 13 tactics.

Key Control Areas

Access Enforcement and Least Privilege

AC-03 AC-05 AC-06

These controls form the primary barrier preventing unauthorised data access. Access enforcement (AC-03) ensures that every data access request is validated against the organisation's access control policy before being permitted. Separation of duties (AC-05) prevents any single individual from having sufficient access to compromise data integrity -- for example, the person who approves a payment should not be the same person who initiates it. Least privilege (AC-06) restricts each user, process, and system to the minimum access required for their function. Implementation requires a combination of role-based access control (RBAC), attribute-based access control (ABAC) for context-sensitive decisions, and regular access reviews to detect and remediate privilege creep. Privileged access to data stores (database administrators, storage administrators) requires enhanced controls including privileged access management tools, session recording, and just-in-time access provisioning.
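The interaction between deny-by-default enforcement, least privilege, and separation of duties can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the role names, permission strings, and the `Payment` type are hypothetical, and a real deployment would resolve roles from an identity provider or policy engine rather than an in-memory dictionary.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission map (assumption for illustration only).
ROLE_PERMISSIONS = {
    "payments_clerk": {"payment:initiate"},
    "payments_manager": {"payment:approve"},
    "dba": set(),  # no standing data access; would be granted just-in-time
}


@dataclass(frozen=True)
class Payment:
    payment_id: str
    initiated_by: str


def is_permitted(role: str, permission: str) -> bool:
    """AC-03 / AC-06: deny by default, allow only what the role explicitly grants."""
    return permission in ROLE_PERMISSIONS.get(role, set())


def can_approve(user: str, role: str, payment: Payment) -> bool:
    """AC-05: the approver needs the permission and must not be the initiator."""
    return is_permitted(role, "payment:approve") and user != payment.initiated_by
```

An unknown role receives an empty permission set, so a misconfigured account fails closed rather than open.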

Data Classification and Labelling

AC-15 AC-16 RA-02

Classification is the decision framework that drives all other data security controls. Security categorisation (RA-02) establishes the impact levels for data confidentiality, integrity, and availability -- typically using a scheme aligned with business risk appetite and regulatory requirements. Automated marking (AC-15) and automated labelling (AC-16) apply classification metadata to data objects programmatically, reducing reliance on user judgement for routine classification decisions. Modern data classification tools can inspect content, apply rules-based classification, and integrate with data loss prevention systems to enforce handling policies. Classification must be applied at the point of creation and maintained through the data lifecycle, including when data is copied, transformed, or aggregated.
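As a rough sketch of rules-based classification, the example below flags text containing a plausible payment card number and propagates the highest tier when data is aggregated. The four tier names follow the model described in this pattern; the decision to treat card data as Restricted, the default tier, and the function names are assumptions made for illustration. Commercial classification tools use far richer rule sets than a single pattern.

```python
import re
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to cut false positives from arbitrary digit runs."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str, default: Tier = Tier.INTERNAL) -> Tier:
    """Return RESTRICTED if the text holds a Luhn-valid card number, else the default."""
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            return Tier.RESTRICTED
    return default


def classify_aggregate(source_tiers: list[Tier]) -> Tier:
    """Assumed rule: derived data inherits at least the highest tier of its sources."""
    return max(source_tiers, default=Tier.PUBLIC)
```

The checksum step matters in practice: without it, order numbers and other long digit strings would be over-classified, which is the label fatigue problem in miniature.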

Information Flow and Boundary Protection

AC-04 SC-07 AC-20

Data does not stay in one place. It flows between systems, networks, users, and organisations. Information flow enforcement (AC-04) controls these movements, ensuring data does not cross security boundaries without appropriate authorisation and protection. Boundary protection (SC-07) implements network-level controls at the perimeter and between internal security zones to restrict data movement. Use of external information systems (AC-20) addresses the risk of data being accessed from or stored on systems outside the organisation's direct control -- including personal devices, cloud services, and third-party platforms. Implementation requires data loss prevention (DLP) tools, network segmentation, API gateways with content inspection, and contractual controls for external data processing.
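A flow enforcement decision of the kind AC-04 describes can be reduced, in its simplest form, to comparing the data's classification with what the destination is approved to hold and how the data gets there. The sketch below assumes the four-tier model from this pattern; the `Destination` type and its fields are hypothetical, and a real DLP or gateway policy would consider many more attributes.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Destination:
    name: str
    max_tier: Tier           # highest classification the destination is approved to hold
    encrypted_channel: bool  # True if the path to it uses TLS, IPsec, or similar


def flow_allowed(data_tier: Tier, dest: Destination) -> bool:
    """Permit a transfer only if the destination is approved for the tier
    and anything above PUBLIC travels over an encrypted channel."""
    if data_tier > dest.max_tier:
        return False
    if data_tier > Tier.PUBLIC and not dest.encrypted_channel:
        return False
    return True
```

For example, a personal cloud drive modelled as `Destination("personal-drive", Tier.PUBLIC, True)` would reject Confidential data even though the channel is encrypted, because the destination itself is outside the organisation's control.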

Cryptographic Protection

SC-08 SC-09 SC-13

Encryption is the last line of defence when other controls fail. Transmission integrity (SC-08) ensures data is not modified in transit, while transmission confidentiality (SC-09) prevents interception of sensitive data during transmission. Use of cryptography (SC-13) establishes the organisation's cryptographic standards for algorithms, key lengths, key management, and certificate lifecycle. Organisations should mandate TLS 1.2 or higher for all network communications, AES-256 for data at rest, and implement hardware security modules (HSMs) for key management in high-assurance environments. Cryptographic controls must cover all data states: at rest (disk encryption, database encryption, file-level encryption), in transit (TLS, IPsec, SSH), and in use (where feasible, through confidential computing or tokenisation).
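The TLS 1.2 floor can be enforced in application code as well as at the network edge. The following sketch uses Python's standard `ssl` module to build a client context that verifies certificates and refuses older protocol versions; the function name is an assumption. Encryption at rest and HSM-backed key management are outside the scope of this example.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Raise the floor to TLS 1.2 without lowering a stricter system default.
    ctx.minimum_version = max(ctx.minimum_version, ssl.TLSVersion.TLSv1_2)
    return ctx
```

The context can then be passed to standard library clients, for example `http.client.HTTPSConnection(host, context=make_client_context())`.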

Media Lifecycle Management

MP-02 MP-03 MP-04 MP-05 MP-06

Physical and removable media remain a significant data leakage vector. Media access (MP-02) restricts who can read from or write to removable media. Media labelling (MP-03) ensures physical media is marked with its classification level so handlers know what protection it requires. Media storage (MP-04) mandates appropriate physical protection for media based on its classification -- locked cabinets, climate-controlled vaults, or offsite secure storage. Media transport (MP-05) controls how media moves between locations, requiring encryption, tamper-evident packaging, and chain-of-custody documentation for sensitive data. Media sanitisation and disposal (MP-06) is critical: when data is no longer needed, the media must be sanitised to a standard appropriate to the classification level, using clearing, purging, or physical destruction as warranted.
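One way to make the sanitisation requirement enforceable is to encode the minimum method per classification tier as a lookup that disposal tooling can consult. The mapping below is purely illustrative: the tier-to-method assignments and the rule that media leaving organisational control is escalated by one step are assumptions, not requirements stated by this pattern.

```python
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


class Method(IntEnum):
    CLEAR = 1    # logical overwrite of user-addressable storage
    PURGE = 2    # e.g. cryptographic erase or firmware-level erase
    DESTROY = 3  # physical destruction


# Illustrative baseline (assumption); each organisation sets its own.
BASELINE = {
    Tier.PUBLIC: Method.CLEAR,
    Tier.INTERNAL: Method.CLEAR,
    Tier.CONFIDENTIAL: Method.PURGE,
    Tier.RESTRICTED: Method.DESTROY,
}


def required_method(tier: Tier, leaves_org_control: bool) -> Method:
    """Return the minimum sanitisation method, escalating one step when
    non-public media is about to leave the organisation's control."""
    method = BASELINE[tier]
    if leaves_org_control and tier >= Tier.INTERNAL and method < Method.DESTROY:
        method = Method(method + 1)
    return method
```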

Data Integrity and Quality

SI-09 SI-10 SI-12 SC-04

Data security is not only about confidentiality. Data integrity ensures that information is accurate, complete, and unmodified by unauthorised processes. Information input restrictions (SI-09) limit who can input data into systems and under what conditions, preventing unauthorised data modification. Information accuracy, completeness, validity, and authenticity controls (SI-10) validate data at input to prevent corruption, injection attacks, and processing errors. Information output handling and retention (SI-12) governs how data is presented, distributed, and retained -- ensuring outputs are handled according to their classification and that retention periods comply with legal and regulatory requirements. Information remnance protection (SC-04) addresses residual data in shared system resources such as memory, storage, and registers that could be exploited to access data from previous processes.
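Two of the integrity ideas above lend themselves to a short sketch: allowlist validation at input (SI-10) and a keyed integrity tag that reveals unauthorised modification of a stored record. The account identifier format and the function names are hypothetical; key storage and rotation are deliberately left out.

```python
import hashlib
import hmac
import re

# Hypothetical identifier format: two upper-case letters followed by eight digits.
ACCOUNT_ID = re.compile(r"[A-Z]{2}[0-9]{8}")


def validate_account_id(value: str) -> str:
    """Allowlist validation: accept only the exact expected shape."""
    if not ACCOUNT_ID.fullmatch(value):
        raise ValueError("account id does not match the expected format")
    return value


def integrity_tag(key: bytes, record: bytes) -> str:
    """HMAC-SHA256 tag computed over the record with a secret key."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def verify_record(key: bytes, record: bytes, tag: str) -> bool:
    """Constant-time comparison of the stored tag against a fresh one."""
    return hmac.compare_digest(integrity_tag(key, record), tag)
```

Using `fullmatch` rather than a search means trailing content such as an injected SQL fragment causes rejection instead of being silently accepted.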

Risk Assessment and Privacy

RA-01 RA-02 RA-03 RA-04 PL-05

Data security controls must be proportionate to risk. Risk assessment policy (RA-01) establishes the framework for evaluating data security risks. Security categorisation (RA-02) determines the baseline protection level for each data type. Risk assessment (RA-03) identifies specific threats to data and evaluates the likelihood and impact of data compromise. Risk assessment update (RA-04) ensures assessments remain current as the threat landscape, data inventory, and regulatory environment change. Privacy impact assessment (PL-05) is essential for any processing of personal data, evaluating compliance with data protection regulations (GDPR, CCPA, LGPD) and identifying privacy risks before new processing activities begin.
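The likelihood and impact evaluation in RA-03 is often recorded as a simple score so that risks can be ranked and revisited. The sketch below assumes five-point scales and arbitrary rating thresholds; both are illustrative choices rather than anything this pattern prescribes, and many organisations use qualitative matrices instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRisk:
    threat: str
    likelihood: int  # 1 (rare) to 5 (almost certain), assumed scale
    impact: int      # 1 (negligible) to 5 (severe), assumed scale

    def __post_init__(self) -> None:
        for name in ("likelihood", "impact"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5")

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


def rating(risk: DataRisk) -> str:
    """Map a score to a band; the thresholds are illustrative assumptions."""
    if risk.score >= 15:
        return "high"
    if risk.score >= 8:
        return "medium"
    return "low"
```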

When to Use

Organisations that process Personally Identifiable Information (PII), operate in regulated sectors (financial services, healthcare, government, defence), process commercially sensitive information, handle payment card data subject to PCI DSS, or manage intellectual property that represents competitive advantage. Any organisation subject to data protection regulations including GDPR, CCPA, HIPAA, or sector-specific requirements. Organisations pursuing ISO 27001 certification where Annex A controls A.5.12-A.5.14 and A.8.10-A.8.12 require formal data classification and handling. Organisations that have experienced data breaches or near-misses and need to strengthen their data protection posture.

When NOT to Use

Information that can be obtained freely from multiple public sources and carries no confidentiality, integrity, or regulatory obligation. Organisations that exclusively handle public-domain data with no personal information component. However, even in these cases, data integrity and availability controls typically remain relevant.

Typical Challenges

Management appetite and organisational buy-in are the primary obstacles -- data security requires sustained investment and cultural change, not a one-time project. Selling data classification to business units that view labelling as overhead rather than protection requires demonstrating tangible risk reduction and regulatory compliance benefits. Keeping the classification scheme simple enough for consistent application while capturing meaningful differences in data sensitivity is a constant tension. The pace of technological change introduces new data vectors (cloud storage, SaaS applications, AI/ML training data, IoT telemetry) faster than policies can adapt. Shadow IT creates uncontrolled data repositories outside the security architecture. Data sprawl across hybrid and multi-cloud environments makes comprehensive inventory and consistent control application extremely difficult. Encryption key management at scale introduces operational complexity and availability risks. Data sharing with business partners and third parties requires balancing protection with usability. Improving services means greater use of data within organisations and more sharing externally, expanding the attack surface continuously. The level and sophistication of external threats, including targeted data exfiltration by advanced persistent threats and e-crime groups, is increasing.

Threat Resistance

Data exfiltration by external attackers through network intrusion, application exploitation, or supply chain compromise. Insider threats including malicious data theft by employees or contractors and negligent data exposure through mishandling. Ransomware and destructive malware targeting data availability and integrity. Man-in-the-middle attacks intercepting data in transit over untrusted networks. Unauthorised access to data at rest through stolen credentials, privilege escalation, or physical media theft. Data leakage through removable media, cloud synchronisation services, email, or printing. Residual data exposure from improperly sanitised media or decommissioned systems. Privacy violations from excessive data collection, unauthorised processing, or inadequate retention controls. Regulatory enforcement actions resulting from non-compliant data handling practices.

Assumptions

The organisation has established a data classification scheme and personnel understand their obligations for classifying and handling data at each level. Data ownership is defined, with business data owners accountable for classification decisions and IT responsible for implementing technical controls. A cryptographic key management infrastructure exists or is planned. The organisation maintains a data inventory or is working toward one, covering structured and unstructured data across on-premises and cloud environments. Legal and regulatory data protection requirements have been identified and mapped to the classification scheme.

Developing Areas

Automated data classification accuracy remains a significant gap. While tools like Microsoft Purview, Titus, and AWS Macie can identify structured sensitive data patterns (credit card numbers, national insurance numbers) with high confidence, classification of unstructured content -- strategic documents, intellectual property, commercially sensitive analysis -- relies on machine learning models with accuracy rates typically below 80%. Misclassification in either direction is costly: over-classification creates friction and label fatigue, while under-classification leaves sensitive data unprotected. Organisations deploying automated classification must invest in continuous model tuning and human review of edge cases.
Data residency enforcement in multi-cloud and SaaS environments is becoming a complex operational challenge driven by proliferating sovereignty requirements. The EU Data Act, Schrems II transfer mechanisms, China's PIPL, and India's DPDPA each impose different constraints on where data can be stored and processed. Technical enforcement mechanisms -- cloud provider region locks, data boundary configurations, and cross-border transfer monitoring -- exist but require constant maintenance as cloud providers add new regions and services. Most organisations lack a unified view of where their data actually resides across hundreds of SaaS applications.
Homomorphic encryption is transitioning from theoretical curiosity to early practical application, enabling computation on encrypted data without decryption. IBM, Microsoft SEAL, and startup implementations (Zama, Duality Technologies) now support specific use cases including privacy-preserving analytics, encrypted database queries, and secure multi-party computation. However, performance overhead remains 1,000-10,000x compared to plaintext operations, limiting practical deployment to narrow use cases where the privacy benefit justifies the computational cost.
Unstructured data protection in AI training pipelines is an urgent and largely unsolved problem. Organisations feeding corporate data into large language models for fine-tuning or retrieval-augmented generation (RAG) risk exposing confidential information, personal data, and intellectual property through model outputs. Data classification must extend to AI pipeline inputs, but existing DLP and classification tools were not designed for the vector embeddings, chunked document stores, and prompt-response patterns that characterise modern AI data flows. The governance frameworks for AI training data are still being defined by regulators and standards bodies.
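
One partial mitigation for the AI pipeline problem is to carry the source document's classification onto every chunk at ingestion time and to filter retrieved chunks against the requesting user's clearance before they reach the model. The sketch below shows only that label propagation and filtering step; the `Chunk` type, the fixed-size chunking, and the function names are assumptions, and it does not address leakage through fine-tuned model weights.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Chunk:
    text: str
    source_doc: str
    tier: Tier  # inherited from the source document at ingestion


def chunk_document(doc_id: str, text: str, tier: Tier, size: int = 200) -> list[Chunk]:
    """Split a document into fixed-size chunks that each keep the document's tier."""
    return [Chunk(text[i:i + size], doc_id, tier) for i in range(0, len(text), size)]


def filter_for_user(chunks: list[Chunk], clearance: Tier) -> list[Chunk]:
    """Drop retrieved chunks whose tier exceeds the requesting user's clearance."""
    return [c for c in chunks if c.tier <= clearance]
```

Filtering at retrieval time keeps the vector store itself unchanged, at the cost of trusting that every ingestion path sets the tier correctly.
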
Cross-border data transfer mechanisms post-Schrems II remain in flux despite the EU-US Data Privacy Framework adopted in 2023. Legal challenges to the framework are anticipated, and organisations that relied on Privacy Shield (invalidated in 2020) and then scrambled to implement Standard Contractual Clauses are wary of building architecture around transfer mechanisms that may be invalidated again. The technical response -- data localisation, encryption with customer-held keys, and processing-in-place architectures -- adds significant cost and complexity that most organisations are still evaluating.

Related Patterns

Patterns that operate within or alongside this one.

SP-010 Identity Management

SP-014 Awareness and Training

SP-019 Secure Ad-Hoc File Exchange

SP-018 ISMS

SP-025 Advanced Monitoring and Detection

AC: 8, AT: 2, CA: 1, CP: 1, IA: 1, MP: 5, PE: 2, PL: 1, RA: 4, SC: 5, SI: 3

AC-02 Account Management

AC-03 Access Enforcement

AC-04 Information Flow Enforcement

AC-05 Separation Of Duties

AC-06 Least Privilege

AC-15 Automated Marking

AC-16 Automated Labeling

AC-20 Use Of External Information Systems

AT-02 Security Awareness

AT-03 Security Training

CA-07 Continuous Monitoring

CP-09 Information System Backup

IA-02 User Identification And Authentication

MP-02 Media Access

MP-03 Media Labeling

MP-04 Media Storage

MP-05 Media Transport

MP-06 Media Sanitization And Disposal

PE-03 Physical Access Control

PE-19 Information Leakage

PL-05 Privacy Impact Assessment

RA-01 Risk Assessment Policy And Procedures

RA-02 Security Categorization

RA-03 Risk Assessment

RA-04 Risk Assessment Update

SC-04 Information Remnance

SC-07 Boundary Protection

SC-08 Transmission Integrity

SC-09 Transmission Confidentiality

SC-13 Use Of Cryptography

SI-09 Information Input Restrictions

SI-10 Information Accuracy, Completeness, Validity, And Authenticity

SI-12 Information Output Handling And Retention