Overview
Data Security is the set of policies, procedures, and technologies used to protect data from unauthorized access, corruption, theft, and loss throughout its lifecycle. DMBOK2 Chapter 7 covers this knowledge area, emphasizing that data security is not merely an IT function but a business responsibility that requires collaboration between security professionals, data governance teams, legal/compliance, and business stakeholders. Data security must balance the need to protect sensitive information with the need to make data accessible for legitimate business use: overly restrictive security hampers business operations, while insufficient security exposes the organization to breaches, regulatory penalties, and reputational damage.
Data security encompasses multiple dimensions: confidentiality (ensuring data is accessible only to authorized individuals), integrity (ensuring data is accurate and has not been tampered with), and availability (ensuring data is accessible when needed). These three properties form the CIA triad, the foundational model for information security.
Data security activities include classifying data by sensitivity level, implementing access controls, encrypting data at rest and in transit, monitoring data access for anomalies, managing user identities and authentication, conducting security audits, and planning for incident response.
Regulatory compliance is a major driver of data security programs. Organizations must comply with regulations such as GDPR (European data protection), HIPAA (healthcare data in the US), PCI-DSS (payment card data), SOX (financial reporting controls), CCPA/CPRA (California consumer privacy), and industry-specific requirements. Each regulation imposes specific requirements for data classification, access control, encryption, breach notification, and data subject rights.
Data security intersects heavily with Data Governance (which establishes security policies and data ownership), Data Quality (which ensures integrity), Data Architecture (which designs security into system architectures), and all other DMBOK2 knowledge areas that handle sensitive data.
Key Concepts
Data Classification
Data classification is the process of categorizing data based on its sensitivity level and the impact of unauthorized disclosure. Common classification levels (from most to least sensitive): (1) CONFIDENTIAL/RESTRICTED — highest sensitivity, severe business impact if disclosed (trade secrets, PII, health records, financial account numbers). Requires strongest controls: encryption, strict access limits, audit logging. (2) INTERNAL/PRIVATE — moderate sensitivity, intended for internal use only (internal policies, employee directories, non-public financial data). Requires access controls and monitoring. (3) PUBLIC — approved for external disclosure (marketing materials, published reports, public website content). Minimal controls required. Some organizations add additional levels like 'Sensitive' between Confidential and Internal. Classification should be based on: regulatory requirements, contractual obligations, business impact of disclosure, and data content. Classification drives ALL other security decisions — access controls, encryption, retention, and handling procedures depend on the classification level.
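The way classification drives downstream controls can be sketched in a few lines of Python. This is a minimal illustration, not a DMBOK2-prescribed scheme: the tag names, the `classify` rules, and the control names in `REQUIRED_CONTROLS` are all assumptions chosen to mirror the three levels described above.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Sensitivity levels; a higher value means more sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


# Hypothetical control baseline per level, mirroring the text above.
REQUIRED_CONTROLS = {
    Classification.PUBLIC: set(),
    Classification.INTERNAL: {"access_control", "monitoring"},
    Classification.CONFIDENTIAL: {"encryption", "access_control", "audit_logging"},
}

# Hypothetical content tags that force the highest level.
CONFIDENTIAL_TAGS = {"pii", "health", "financial_account", "trade_secret"}


def classify(tags):
    """Assign a level from content tags (tag names are illustrative)."""
    if tags & CONFIDENTIAL_TAGS:
        return Classification.CONFIDENTIAL
    if "approved_public" in tags:
        return Classification.PUBLIC
    # Anything not explicitly approved for release defaults to INTERNAL.
    return Classification.INTERNAL


def missing_controls(level, applied):
    """Return the controls the level requires that are not yet applied."""
    return REQUIRED_CONTROLS[level] - set(applied)
```

For example, `missing_controls(classify({"pii"}), {"access_control"})` reports that encryption and audit logging are still outstanding. Defaulting unlabelled data to INTERNAL rather than PUBLIC is a design choice in this sketch, so that nothing becomes public without explicit approval.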
Access Control Models
Access control models define how permissions to access data are granted and managed. Four primary models: (1) ROLE-BASED ACCESS CONTROL (RBAC) — permissions are assigned to roles (e.g., 'Analyst', 'Manager'), and users are assigned to roles. Most widely implemented model. Simplifies administration but can lead to role explosion. (2) ATTRIBUTE-BASED ACCESS CONTROL (ABAC) — access decisions based on attributes of the user (department, clearance), the resource (classification, owner), the action (read, write), and the environment (time, location). More flexible and granular than RBAC. (3) MANDATORY ACCESS CONTROL (MAC) — access determined by security labels assigned to both users (clearance levels) and data (classification levels). Users cannot change classifications. Used in military/government: a user with 'Secret' clearance can access 'Secret' and below but not 'Top Secret.' (4) DISCRETIONARY ACCESS CONTROL (DAC) — the data owner decides who gets access. Most flexible but least controlled — owners can grant access to anyone. Common in file systems. The exam frequently tests the distinctions between these four models.
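The four models differ mainly in where the access decision comes from. The following Python sketch contrasts them side by side; the users, roles, attributes, and business-hours rule are hypothetical and exist only to make the distinctions concrete.

```python
# RBAC: permissions attach to roles, and users attach to roles.
ROLE_PERMISSIONS = {
    "Analyst": {("sales_report", "read")},
    "Manager": {("sales_report", "read"), ("sales_report", "write")},
}
USER_ROLES = {"ana": {"Analyst"}, "max": {"Manager"}}


def rbac_allows(user, resource, action):
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


# ABAC: the decision combines user, resource, action, and environment attributes.
def abac_allows(user_attrs, resource_attrs, action, env):
    return (
        user_attrs["department"] == resource_attrs["owner_department"]
        and action == "read"
        and 8 <= env["hour"] < 18  # hypothetical business-hours rule
    )


# MAC: system-assigned labels are compared; users cannot change them.
LEVELS = {"Unclassified": 0, "Confidential": 1, "Secret": 2, "Top Secret": 3}


def mac_allows_read(clearance, label):
    return LEVELS[clearance] >= LEVELS[label]


# DAC: the owner of the object decides who is on its access list.
class OwnedFile:
    def __init__(self, owner):
        self.owner = owner
        self.acl = {owner}

    def grant(self, grantor, grantee):
        if grantor != self.owner:
            raise PermissionError("only the owner may grant access")
        self.acl.add(grantee)

    def allows(self, user):
        return user in self.acl
```

Here `mac_allows_read("Secret", "Top Secret")` is false while `mac_allows_read("Secret", "Confidential")` is true, matching the clearance example above. The RBAC tables show why role explosion happens: every new combination of permissions needs another role entry.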
Encryption Methods
Encryption converts readable data (plaintext) into an unreadable format (ciphertext) using algorithms and keys. Encryption protects data in three states: (1) AT REST: data stored in databases, files, backups. Methods: Transparent Data Encryption (TDE) for entire databases, file-system encryption, column-level encryption for specific sensitive fields. (2) IN TRANSIT: data moving across networks. Methods: TLS for web traffic (the successor to SSL, all versions of which are now deprecated), VPN for network tunnels, SSH for secure shell connections. (3) IN USE: data being processed in memory (emerging area). Methods: homomorphic encryption (compute on encrypted data), secure enclaves (Intel SGX, AWS Nitro Enclaves). There are two fundamental encryption types. SYMMETRIC encryption uses the same key to encrypt and decrypt (e.g., AES-256); it is fast and used for bulk data encryption. ASYMMETRIC encryption uses a key pair: the public key encrypts and the private key decrypts, while for digital signatures the private key signs and the public key verifies. It is slower and used for key exchange and digital signatures. RSA and ECC are common asymmetric algorithms. FIELD-LEVEL ENCRYPTION encrypts individual data elements (e.g., SSN, credit card number) and provides the most granular protection.
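For the in-transit case, Python's standard `ssl` module can enforce a minimum protocol version on client connections. The sketch below is one possible configuration, assuming a client that should refuse anything older than TLS 1.2; the function name `make_client_context` is illustrative. Python's standard library does not ship an AES implementation, so at-rest encryption would need a third-party library and is not shown.

```python
import ssl


def make_client_context():
    """Build a client-side TLS context that rejects pre-TLS-1.2 protocols."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse SSLv3, TLS 1.0, and TLS 1.1 handshakes.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_client_context()
```

The resulting context can be passed to standard-library clients that accept one, for example `urllib.request.urlopen(url, context=ctx)`.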
Data Masking and Anonymization
Techniques for protecting sensitive data while preserving its usefulness. DATA MASKING replaces sensitive values with realistic but fictitious data. Types: (1) STATIC MASKING: creates a permanently masked copy of the data for non-production environments (dev, test, training). Irreversible. (2) DYNAMIC MASKING: masks data in real-time as it is queried, based on the user's role/permissions. The underlying data remains unchanged. Different users see different levels of masking. ANONYMIZATION removes or transforms personally identifiable information so individuals cannot be re-identified. Techniques: generalization (replacing exact age '34' with range '30-40'), suppression (removing fields entirely), k-anonymity (ensuring each record is indistinguishable from at least k-1 others on its quasi-identifiers), and differential privacy (adding calibrated random noise to query results). PSEUDONYMIZATION (replacing names with random identifiers) is often listed alongside these techniques, but it is REVERSIBLE with a mapping table and is NOT true anonymization under GDPR; pseudonymized data is still personal data. The key distinction: masked data can look real but is fake; anonymized data has been irreversibly transformed to prevent re-identification.
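A short Python sketch can make the differences between these techniques concrete. Everything here is illustrative: the role name `compliance_officer`, the SSN format, and the record fields are assumptions, and the k-anonymity check only counts group sizes over the chosen quasi-identifiers.

```python
import secrets
from collections import Counter


def mask_ssn(ssn, role):
    """Dynamic masking: the stored value is unchanged, the view depends on role."""
    if role == "compliance_officer":  # hypothetical privileged role
        return ssn
    return "***-**-" + ssn[-4:]


class Pseudonymizer:
    """Replaces names with random tokens and keeps the mapping table.

    Because the table allows re-identification, the output is pseudonymized,
    not anonymized.
    """

    def __init__(self):
        self._forward = {}
        self._reverse = {}

    def pseudonym(self, name):
        if name not in self._forward:
            token = secrets.token_hex(8)
            self._forward[name] = token
            self._reverse[token] = name
        return self._forward[name]

    def reidentify(self, token):
        return self._reverse[token]


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a non-overlapping range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())
```

Note that `generalize_age(34)` yields `"30-39"` because this sketch uses non-overlapping ten-year bins. The `reidentify` method is what distinguishes pseudonymization from anonymization: deleting the mapping table would remove that path, though other fields could still allow re-identification.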
Identity and Access Management (IAM)
IAM is the framework for managing digital identities and controlling user access to data and systems. Core components: (1) IDENTIFICATION: establishing who the user claims to be (username, email). (2) AUTHENTICATION: verifying identity through knowledge factors (passwords), possession factors (tokens, smart cards, mobile devices), or biometric factors (fingerprint, face recognition). Multi-Factor Authentication (MFA) combines two or more different factor types. (3) AUTHORIZATION: determining what the authenticated user is allowed to do (permissions, roles, privileges). (4) ACCOUNTING/AUDITING: logging what users actually do for monitoring and compliance. These four steps form the 'IAAA' framework. Additional IAM concepts: Single Sign-On (SSO), where one authentication grants access to multiple systems; Federated Identity, which shares identity across organizational boundaries using standards like SAML and OpenID Connect (an identity layer built on OAuth 2.0, which by itself is a delegated-authorization framework rather than an authentication protocol); Privileged Access Management (PAM), which adds extra controls for administrative accounts; and Identity Governance, which covers periodic access reviews, certification, and role lifecycle management.
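The four IAAA steps can be traced through a single access request. The Python sketch below is a simplified, in-memory illustration using only the standard library; the user store, role names, permission strings, and iteration count are assumptions, and MFA is deliberately left out to keep it short.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timezone

PBKDF2_ITERATIONS = 100_000  # illustrative work factor, not a recommendation

ROLE_PERMISSIONS = {
    "Analyst": {"report:read"},
    "Admin": {"report:read", "report:write"},
}
USERS = {}      # username -> {"salt": bytes, "digest": bytes, "roles": set}
AUDIT_LOG = []  # accounting: one record per access attempt


def _derive(password, salt):
    """Derive a salted password hash so plaintext passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS)


def register(username, password, roles):
    salt = secrets.token_bytes(16)
    USERS[username] = {"salt": salt, "digest": _derive(password, salt), "roles": set(roles)}


def authenticate(username, password):
    record = USERS.get(username)  # identification: look up the claimed identity
    if record is None:
        return False
    candidate = _derive(password, record["salt"])
    # Authentication: constant-time comparison of the derived hashes.
    return hmac.compare_digest(candidate, record["digest"])


def authorize(username, permission):
    roles = USERS[username]["roles"]
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


def request_access(username, password, permission):
    allowed = authenticate(username, password) and authorize(username, permission)
    AUDIT_LOG.append({
        "user": username,
        "permission": permission,
        "allowed": allowed,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return allowed
```

After `register("ana", "correct horse", ["Analyst"])`, a call to `request_access("ana", "correct horse", "report:write")` authenticates successfully but fails authorization, and both outcomes land in `AUDIT_LOG`. Logging denied attempts as well as granted ones is what makes the accounting step useful for monitoring.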
Data Privacy vs Data Security
While closely related, data privacy and data security address different concerns. DATA SECURITY focuses on protecting data from unauthorized access, modification, and destruction through technical and administrative controls (encryption, access controls, firewalls, monitoring). It asks: 'Is the data protected from threats?' DATA PRIVACY focuses on the proper handling, processing, and use of personal data in accordance with individuals' rights and regulatory requirements. It asks: 'Is personal data being collected, used, and shared appropriately?' You can have security without privacy (data is locked down but used in ways individuals didn't consent to) and privacy requirements without security (policies exist but aren't technically enforced). GDPR enshrines key privacy principles: lawfulness, fairness, and transparency; purpose limitation; data minimization; accuracy; storage limitation; integrity and confidentiality; and accountability. Privacy requires both organizational controls (policies, consent management, DPIAs) and technical controls (security measures). A Data Protection Officer (DPO) is required under GDPR for certain organizations: public authorities, and organizations whose core activities involve large-scale regular and systematic monitoring of individuals or large-scale processing of special categories of data.
Regulatory Compliance Requirements
Organizations must comply with multiple overlapping data security and privacy regulations. KEY REGULATIONS: GDPR (General Data Protection Regulation): EU regulation protecting the personal data of individuals in the EU. Requires: lawful basis for processing, data subject rights (access, erasure, portability), breach notification to the supervisory authority within 72 hours of becoming aware of the breach, DPO appointment for certain organizations, and Data Protection Impact Assessments (DPIAs) for high-risk processing. Penalties of up to EUR 20 million or 4% of global annual turnover, whichever is higher. HIPAA (Health Insurance Portability and Accountability Act): US law protecting Protected Health Information (PHI). Requires: administrative, physical, and technical safeguards, minimum necessary standard, business associate agreements. PCI-DSS (Payment Card Industry Data Security Standard): industry standard for protecting cardholder data. 12 requirements including encryption, access control, monitoring, and vulnerability management. SOX (Sarbanes-Oxley Act): US law requiring internal controls over financial reporting, audit trails, and data integrity for public companies. CCPA/CPRA (California Consumer Privacy Act, as amended by the California Privacy Rights Act): requires disclosure of data collection practices, opt-out rights for data sale, and data deletion upon request.
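The GDPR penalty ceiling is a simple "whichever is higher" calculation. For the most serious infringements, the cap is the greater of EUR 20 million or 4% of total worldwide annual turnover. The helper below is a small arithmetic sketch; the function name is illustrative, and it computes only the statutory ceiling, not the fine a regulator would actually impose.

```python
def gdpr_max_fine_eur(global_annual_turnover_eur):
    """Upper-tier GDPR fine ceiling in whole euros.

    Integer arithmetic avoids floating-point rounding on large amounts.
    """
    four_percent = global_annual_turnover_eur * 4 // 100
    return max(20_000_000, four_percent)
```

For a company with EUR 100 million in turnover, 4% is only EUR 4 million, so the EUR 20 million floor applies; at EUR 2 billion in turnover the 4% figure (EUR 80 million) becomes the ceiling.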
Security Architecture and Defense in Depth
Security architecture designs security controls into the data infrastructure rather than bolting them on afterward. DEFENSE IN DEPTH is the principle of implementing multiple layers of security controls so that if one layer fails, others still protect the data. Layers include: (1) PERIMETER SECURITY — firewalls, intrusion detection/prevention systems, DMZ (demilitarized zone) separating public and private networks; (2) NETWORK SECURITY — network segmentation, VLANs, VPN, encryption of network traffic; (3) HOST/SYSTEM SECURITY — operating system hardening, patch management, endpoint protection; (4) APPLICATION SECURITY — input validation, secure coding practices, vulnerability scanning, web application firewalls; (5) DATA SECURITY — encryption, masking, access controls, classification at the data layer; (6) PHYSICAL SECURITY — data center access controls, environmental protections. The principle of LEAST PRIVILEGE states that users and systems should have only the minimum access necessary to perform their functions — no more.
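Both principles can be illustrated with a toy request pipeline in Python. The layer checks, the request fields, and the internal address prefix below are hypothetical; real layers are separate systems, not functions in one process. The point of the sketch is that a request which slips past one layer is still evaluated by the next.

```python
def perimeter_check(req):
    # Hypothetical rule: only the internal 10.x address range is admitted.
    return req["source_ip"].startswith("10.")


def application_check(req):
    return req.get("mfa_passed", False)


def data_check(req):
    # Confidential data may only be served over an encrypted channel.
    return req["classification"] != "CONFIDENTIAL" or req.get("encrypted_channel", False)


LAYERS = [
    ("perimeter", perimeter_check),
    ("application", application_check),
    ("data", data_check),
]


def first_failing_layer(req):
    """Return the name of the first layer that blocks the request, or None."""
    for name, check in LAYERS:
        if not check(req):
            return name
    return None


def excess_permissions(granted, used):
    """Least privilege: permissions that were granted but never exercised."""
    return set(granted) - set(used)
```

A request from an internal address without MFA passes the perimeter layer but is stopped at the application layer. `excess_permissions({"read", "write", "delete"}, {"read"})` returns the two grants that an access review might consider revoking.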
Security Monitoring and Auditing
Continuous monitoring detects security incidents, policy violations, and anomalous behavior. Key components: (1) AUDIT LOGGING: recording who accessed what data, when, from where, and what action they performed. Logs should be tamper-resistant (or at least tamper-evident, so that any alteration can be detected), retained for compliance periods, and reviewed regularly. (2) SIEM (Security Information and Event Management): aggregates and correlates security events from multiple sources to detect threats. Examples: Splunk, IBM QRadar, Microsoft Sentinel. (3) DATA ACTIVITY MONITORING (DAM): specifically monitors database access patterns and flags suspicious queries (e.g., a user downloading an unusually large dataset or accessing data outside their normal scope). (4) USER BEHAVIOR ANALYTICS (UBA): uses machine learning to establish baselines of normal user behavior and detect deviations; it is also called UEBA when extended to entities such as hosts and service accounts. (5) VULNERABILITY ASSESSMENTS: regular scanning of systems for known security weaknesses. (6) PENETRATION TESTING: simulated attacks to test security controls. Monitoring supports both real-time detection (stopping attacks in progress) and forensic investigation (understanding what happened after an incident).
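One common way to make an audit log tamper-evident is to chain entries with a cryptographic hash, so that changing any earlier entry invalidates every hash after it. The Python sketch below shows the idea under simplifying assumptions: the log is an in-memory list, and the event fields are illustrative. It detects tampering rather than preventing it, and an attacker who can rewrite the whole log could recompute the chain unless the latest hash is also stored somewhere they cannot reach.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, event):
    # sort_keys makes the serialization, and therefore the hash, deterministic.
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "event": event,
        "prev_hash": prev_hash,
        "hash": _entry_hash(prev_hash, event),
    })


def verify_chain(log):
    """Recompute every hash; any edited, removed, or reordered entry fails."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

After appending a few events, `verify_chain(log)` returns true; editing a field of any stored event, for example changing an action from `"export"` to `"read"`, makes it return false.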
Data Breach Response
A data breach response plan outlines how an organization detects, contains, investigates, and recovers from security incidents involving unauthorized access to data. Key phases: (1) PREPARATION: establishing the incident response team, communication templates, legal contacts, and testing the plan through tabletop exercises; (2) IDENTIFICATION: detecting the breach through monitoring, alerts, or external notification; (3) CONTAINMENT: isolating affected systems to prevent further data loss (short-term containment) and implementing temporary fixes (long-term containment); (4) ERADICATION: removing the threat (malware, compromised accounts, vulnerabilities); (5) RECOVERY: restoring systems and data from clean backups, monitoring for recurrence; (6) LESSONS LEARNED: documenting what happened, what worked, what failed, and improving controls. Notification requirements vary by regulation: GDPR requires notification to the supervisory authority within 72 hours of becoming aware of the breach (unless the breach is unlikely to result in a risk to individuals) and to affected individuals without undue delay when the breach is likely to result in a high risk to them; HIPAA requires notifying affected individuals without unreasonable delay and no later than 60 days after discovery; US state breach notification laws vary. The cost of a breach includes regulatory fines, legal fees, remediation costs, and reputational damage.
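The notification clocks are easy to get wrong under pressure, so response plans often precompute them. The Python sketch below derives the GDPR 72-hour and HIPAA 60-day outer limits from the moment of awareness or discovery. The function names are illustrative, and the sketch computes only the latest permissible times; both regimes expect notification sooner where feasible.

```python
from datetime import datetime, timedelta, timezone

GDPR_AUTHORITY_WINDOW = timedelta(hours=72)
HIPAA_INDIVIDUAL_WINDOW = timedelta(days=60)


def gdpr_authority_deadline(aware_at):
    """Latest time to notify the supervisory authority under GDPR."""
    return aware_at + GDPR_AUTHORITY_WINDOW


def hipaa_individual_deadline(discovered_at):
    """Outer limit for notifying affected individuals under HIPAA."""
    return discovered_at + HIPAA_INDIVIDUAL_WINDOW


def hours_remaining(deadline, now):
    """Hours left before the deadline; negative once it has passed."""
    return (deadline - now).total_seconds() / 3600
```

For a breach the organization became aware of at 09:00 UTC on 1 March 2024, the GDPR deadline falls at 09:00 UTC on 4 March and the HIPAA outer limit on 30 April. Using timezone-aware datetimes throughout avoids ambiguity when the response team spans several regions.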
Cloud Data Security
Cloud environments introduce specific security considerations governed by the SHARED RESPONSIBILITY MODEL: the cloud provider secures the infrastructure (physical security, hypervisor, network), while the customer secures their data, access controls, configurations, and applications. The exact split depends on the service model (IaaS, PaaS, SaaS), with the provider taking on more of the stack as the service becomes more managed. Key cloud security concepts: (1) DATA RESIDENCY/SOVEREIGNTY: ensuring data is stored in geographic regions required by regulations; (2) ENCRYPTION KEY MANAGEMENT: deciding whether the cloud provider or customer manages encryption keys (customer-managed keys provide more control); (3) IDENTITY FEDERATION: integrating corporate IAM with cloud IAM; (4) NETWORK ISOLATION: virtual private clouds, security groups, private endpoints; (5) CONFIGURATION MANAGEMENT: cloud misconfigurations (open S3 buckets, excessive permissions) are a leading cause of cloud data breaches; (6) MULTI-TENANT RISKS: ensuring data isolation between customers sharing infrastructure. Cloud Security Posture Management (CSPM) tools continuously monitor cloud configurations for deviations from security best practices.
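At its core, a CSPM tool evaluates resource configurations against a set of rules. The Python sketch below shows that idea on a hypothetical inventory format: the dictionary keys (`public_read`, `encrypted`, `allowed_actions`, `region`) are invented for illustration and do not correspond to any cloud provider's API.

```python
def find_misconfigurations(resources, allowed_regions):
    """Return (resource name, finding) pairs for a hypothetical inventory."""
    findings = []
    for res in resources:
        name = res["name"]
        if res.get("type") == "bucket":
            if res.get("public_read", False):
                findings.append((name, "publicly readable bucket"))
            if not res.get("encrypted", False):
                findings.append((name, "not encrypted at rest"))
        # Data residency: flag resources stored outside approved regions.
        if "region" in res and res["region"] not in allowed_regions:
            findings.append((name, "outside approved regions"))
        # Excessive permissions: a wildcard grants every action.
        if "*" in res.get("allowed_actions", []):
            findings.append((name, "wildcard permissions"))
    return findings
```

Running this over an inventory with one compliant bucket, one public unencrypted bucket in the wrong region, and one role with `"*"` in its actions returns four findings, three for the bucket and one for the role. Under the shared responsibility model, all four fall on the customer side.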
Insider Threats and Social Engineering
Insider threats originate from individuals within the organization who have legitimate access to systems and data. Types: (1) MALICIOUS INSIDERS — employees or contractors who deliberately steal, modify, or destroy data for personal gain, revenge, or espionage; (2) NEGLIGENT INSIDERS — employees who accidentally cause security incidents through carelessness (clicking phishing links, misconfiguring systems, losing devices); (3) COMPROMISED INSIDERS — employees whose credentials have been stolen by external attackers. Insider threats are particularly dangerous because insiders already have authorized access, bypassing perimeter security. SOCIAL ENGINEERING attacks manipulate people into revealing information or performing actions that compromise security. Common techniques: phishing (fraudulent emails), spear phishing (targeted phishing), pretexting (fabricated scenarios), baiting (leaving infected USB drives), and tailgating (following authorized personnel into secure areas). Defenses include: security awareness training, least privilege access, monitoring unusual data access patterns, data loss prevention (DLP) tools, and separation of duties.
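DLP tools typically combine pattern matching with validation to reduce false positives. The Python sketch below scans outbound text for payment card numbers, confirmed with the Luhn checksum, and for strings shaped like US Social Security numbers. The patterns and the `scan_outbound` function are simplified assumptions; production DLP uses far richer detection and context.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(number):
    """Validate a card number candidate with the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 13:
        return False
    checksum = 0
    # Starting from the rightmost digit, double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def scan_outbound(text):
    """Return a list of finding labels for sensitive patterns in the text."""
    findings = []
    for match in CARD_CANDIDATE.finditer(text):
        if luhn_valid(match.group()):
            findings.append("card_number")
    if SSN_PATTERN.search(text):
        findings.append("ssn_like")
    return findings
```

The Luhn check is what keeps an arbitrary 16-digit order number from being flagged as a card number: `4111 1111 1111 1111` (a widely used test number) passes, while `1234567890123456` does not.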
Best Practices
- ✓ Classify all data by sensitivity level (Confidential, Internal, Public) and apply security controls proportional to the classification
- ✓ Implement defense in depth — never rely on a single security control; layer multiple controls so one failure does not expose data
- ✓ Apply the principle of least privilege — grant users only the minimum access required for their job functions
- ✓ Encrypt sensitive data both at rest and in transit — use AES-256 for data at rest and TLS 1.2+ for data in transit
- ✓ Implement multi-factor authentication (MFA) for all privileged and remote access to data systems
- ✓ Maintain comprehensive audit logs of all data access and review them regularly for anomalies and policy violations
- ✓ Conduct regular security awareness training for all employees, especially regarding phishing and social engineering
- ✓ Establish and regularly test a data breach response plan with defined roles, communication procedures, and notification timelines
- ✓ Use data masking for non-production environments — never use real sensitive data in development or testing
- ✓ Perform regular vulnerability assessments and penetration testing to identify and remediate security weaknesses
- ✓ Implement Data Loss Prevention (DLP) tools to detect and prevent unauthorized data exfiltration
- ✓ Review and certify user access rights periodically (at least quarterly for sensitive data) to remove stale permissions
💡 Exam Tips
- ★ Data Security is 6% of the exam — expect approximately 6 questions
- ★ Know the CIA triad: Confidentiality (only authorized access), Integrity (data not tampered with), Availability (accessible when needed)
- ★ Know ALL FOUR access control models: RBAC (role-based), ABAC (attribute-based), MAC (mandatory/labels), DAC (discretionary/owner-controlled) — and when each is used
- ★ Static masking creates a permanent fake copy for non-production; dynamic masking masks in real-time based on user role — know the difference
- ★ Pseudonymization is REVERSIBLE (with a mapping key) and is NOT true anonymization under GDPR — this distinction is frequently tested
- ★ GDPR requires breach notification to the supervisory authority within 72 hours of becoming aware of the breach; memorize this specific timeframe
- ★ Data classification DRIVES all other security decisions — access controls, encryption, retention, and handling procedures depend on classification level
- ★ The shared responsibility model in cloud: provider secures infrastructure, customer secures data, access, and configurations
- ★ Understand encryption types: symmetric (same key, AES, fast for bulk data) vs asymmetric (public/private key pair, RSA, for key exchange)
- ★ Principle of least privilege means minimum necessary access — it applies to users, applications, and systems
- ★ Data privacy focuses on PROPER USE of personal data; data security focuses on PROTECTION from threats — they overlap but are distinct
- ★ Defense in depth means multiple layers of security controls (perimeter, network, host, application, data, physical) — if one fails, others protect