Amazon SNS Data Protection Policies Block, Mask, or Log Sensitive Data with 99% Sample Rate
These articles are AI-generated summaries. Please check the original sources for full details.
Data Protection in Amazon SNS
Amazon SNS data protection policies identify PII/PHI in messages using machine learning and pattern matching. The audit policy in this 2025 example logs findings to CloudWatch Logs at a 99% sample rate.
Why This Matters
Event-driven architectures prioritize speed, but sensitive data leaks (e.g., PII/PHI) expose systems to compliance risks. Manual masking or detection pipelines are error-prone and costly. SNS Data Protection automates this with predefined identifiers (e.g., email, DOB) and three operations: audit, de-identify, or deny. A 2022 study estimated data breach costs at $4.2M per incident, making automated controls critical.
Key Insights
- The Audit policy sets SampleRate to 99, so findings for 99% of messages are logged to CloudWatch Logs.
- De-identify (masking or redacting the matched values) suits PII in healthcare apps where the message still has to be delivered.
- A Lambda function can publish one test message to several topics to compare how each policy behaves.
Working Example
{
  "Description": "Audit sensitive data without blocking delivery",
  "Version": "2021-06-01",
  "Statement": [
    {
      "DataDirection": "Inbound",
      "DataIdentifier": [
        "arn:aws:dataprotection::aws:data-identifier/EmailAddress",
        "arn:aws:dataprotection::aws:data-identifier/DateOfBirth",
        "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
      ],
      "Operation": {
        "Audit": {
          "FindingsDestination": {
            "CloudWatchLogs": {
              "LogGroup": "/aws/vendedlogs/sns-audit/"
            }
          },
          "SampleRate": "99"
        }
      },
      "Principal": ["*"],
      "Sid": "AuditSensitiveData"
    }
  ],
  "Name": "sns-audit-policy"
}
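import json
import logging
import os

import boto3

sns = boto3.client('sns')
logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    # Sample payload containing PII/PHI (name, date of birth, diagnosis).
    message = {
        "patientId": "PAT123456",
        "name": "John Doe",
        "dob": "12-01-2012",
        "diagnosis": "Flu"
    }
    # Names of the environment variables that hold each test topic's ARN,
    # one topic per policy operation.
    topics = ["AUDIT_TOPIC_ARN", "DEIDENTIFY_TOPIC_ARN", "DENY_TOPIC_ARN"]
    results = {}
    for topic_env in topics:
        topic_arn = os.environ.get(topic_env)
        if not topic_arn:
            # Skip unconfigured topics rather than calling publish with None.
            logger.error(f"[{topic_env}] Environment variable not set")
            results[topic_env] = {"status": "failed", "error": "Topic ARN not configured"}
            continue
        try:
            response = sns.publish(
                TopicArn=topic_arn,
                Message=json.dumps(message)
            )
            results[topic_env] = {
                "status": "success",
                "messageId": response.get("MessageId")
            }
            logger.info(f"Published to {topic_env}: {response.get('MessageId')}")
        except sns.exceptions.InvalidParameterException as e:
            # Treated here as the topic's policy rejecting the payload.
            logger.error(f"[{topic_env}] Sensitive data detected: {str(e)}")
            results[topic_env] = {"status": "failed", "error": "Sensitive data not allowed"}
        except Exception as e:
            # Any other failure (permissions, throttling, network, ...).
            logger.error(f"[{topic_env}] Error: {str(e)}")
            results[topic_env] = {"status": "failed", "error": str(e)}
    return {"status": "completed", "results": results}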
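The Lambda handler publishes to three topics, but only the Audit policy is shown. The sketch below suggests how policies for all three operations might be generated from one template. The helper name build_policy, the policy names, and the mask character are illustrative assumptions; the Deidentify and Deny blocks follow the shape of the AWS-documented policy syntax as best understood here and should be checked against the current SNS documentation before use.

```python
ID_PREFIX = "arn:aws:dataprotection::aws:data-identifier/"

# One Operation block per policy type. The Audit block mirrors the policy
# shown above; the Deidentify and Deny blocks are assumptions to verify.
OPERATIONS = {
    "audit": {
        "Audit": {
            "SampleRate": "99",
            "FindingsDestination": {
                "CloudWatchLogs": {"LogGroup": "/aws/vendedlogs/sns-audit/"}
            },
        }
    },
    # Replace each matched character with '#'; the message is still delivered.
    "deidentify": {"Deidentify": {"MaskConfig": {"MaskWithCharacter": "#"}}},
    # Reject the message when a match is found.
    "deny": {"Deny": {}},
}


def build_policy(name, operation, identifiers):
    """Return a policy document (as a dict) for one operation.

    `identifiers` are managed data identifier names such as "DateOfBirth".
    """
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation}")
    return {
        "Name": name,
        "Description": f"{operation} sensitive data",
        "Version": "2021-06-01",
        "Statement": [
            {
                "Sid": f"{operation.capitalize()}SensitiveData",
                "DataDirection": "Inbound",
                "Principal": ["*"],
                "DataIdentifier": [ID_PREFIX + i for i in identifiers],
                "Operation": OPERATIONS[operation],
            }
        ],
    }
```

For example, build_policy("sns-deny-policy", "deny", ["DateOfBirth"]) yields a document with the same layout as the Audit policy, differing only in its name, description, Sid, identifier list, and Operation block.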
Practical Applications
- Use Case: Healthcare apps using De-identify to mask patient DOB before sending to analytics systems
- Pitfall: Audit only records findings. Messages are still delivered unmodified, so without De-identify or Deny the sensitive data reaches subscribers and any logs they keep.
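As a rough illustration of the healthcare use case, the sketch below attaches a De-identify policy that masks dates of birth to a topic and reads it back. It assumes a boto3 SNS client is passed in and relies on its put_data_protection_policy and get_data_protection_policy calls; the helper name apply_policy, the policy name, and the mask character are hypothetical, and the Deidentify block should be verified against the current SNS documentation.

```python
import json

# Hypothetical policy: mask any detected date of birth with '#'.
DOB_MASK_POLICY = {
    "Name": "sns-deidentify-dob-policy",
    "Description": "Mask date of birth before delivery",
    "Version": "2021-06-01",
    "Statement": [
        {
            "Sid": "MaskDateOfBirth",
            "DataDirection": "Inbound",
            "Principal": ["*"],
            "DataIdentifier": [
                "arn:aws:dataprotection::aws:data-identifier/DateOfBirth"
            ],
            "Operation": {
                "Deidentify": {"MaskConfig": {"MaskWithCharacter": "#"}}
            },
        }
    ],
}


def apply_policy(sns_client, topic_arn, policy):
    """Attach `policy` to the topic and return the policy SNS reports back."""
    sns_client.put_data_protection_policy(
        ResourceArn=topic_arn,
        DataProtectionPolicy=json.dumps(policy),  # the API expects a JSON string
    )
    stored = sns_client.get_data_protection_policy(ResourceArn=topic_arn)
    return json.loads(stored["DataProtectionPolicy"])
```

With real credentials this might be called as apply_policy(boto3.client("sns"), os.environ["DEIDENTIFY_TOPIC_ARN"], DOB_MASK_POLICY). Passing the client in as an argument keeps the function easy to exercise with a stub before touching a live topic.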