Is Your S3 Bucket Leaking Customer Data Right Now?

In 2021, security researchers discovered that 47 organizations — including American Airlines, Ford, and a state Department of Health — had left 38 million records containing COVID vaccination data, Social Security numbers, and employee information sitting wide open on the internet. No hacking required. The data was simply there, accessible to anyone who knew how to look.

The cause wasn’t a sophisticated breach. It was a default setting in Microsoft’s Power Apps that nobody had changed. A single checkbox that most developers didn’t know existed.

AWS S3 buckets have the same problem — and it keeps happening. Verizon. Twitch. GoDaddy. Hundreds of smaller companies you never heard about because they quietly patched the leak before a researcher found it. Or didn’t.

The Capital One breach (2019) resulted in 100 million customer records exposed — not through a sophisticated hack, but through a misconfigured web application firewall that had excessive permissions to dozens of S3 buckets. The regulatory fine alone was $80 million.

Why This Keeps Happening to Smart Teams

This isn’t a story about careless companies. Many of these organizations had dedicated security teams and significant budgets. The problem is structural: cloud storage is designed to make sharing easy. That same design makes it easy to accidentally share too much.

Here’s the pattern that plays out constantly:

Development phase — A developer creates an S3 bucket and sets it to public for quick testing. Saves time, avoids the friction of configuring access policies.
Go-live — The app ships. The S3 bucket moves to production. No one remembers — or checks — that the bucket is still public.
Months later — Customer records accumulate. A security researcher — or attacker — runs a bucket enumeration tool. Your data appears in a scan.
Discovery — You find out from a researcher’s disclosure email, a news article, or a regulatory notification. Average time to detect: over 200 days.

How to Check Your Exposure in 10 Minutes

If you have access to your AWS console, here’s a quick audit you can run right now. This isn’t a substitute for a full security assessment — but it will surface the most obvious exposures immediately.

Step 1 — Check your Block Public Access settings

In the AWS Console, navigate to S3 → Block Public Access settings for this account. All four options should be enabled. If any are off, that’s your first priority.

# Via AWS CLI — check account-level Block Public Access
aws s3control get-public-access-block \
  --account-id $(aws sts get-caller-identity --query Account --output text)

# Expected: all four "true"
# BlockPublicAcls, IgnorePublicAcls,
# BlockPublicPolicy, RestrictPublicBuckets

Step 2 — Audit individual bucket ACLs

Even with account-level settings, individual buckets can have their own access policies. Run a scan across all buckets to find any that are publicly accessible:

# List all buckets and check public access
aws s3api list-buckets --query "Buckets[].Name" --output text | \
  tr '\t' '\n' | while read bucket; do
    echo "=== $bucket ===" 
    aws s3api get-bucket-acl --bucket $bucket 2>/dev/null
  done

Step 3 — Enable AWS Macie for automated discovery

AWS Macie uses machine learning to automatically scan your S3 buckets for sensitive data — PII, credentials, financial records — and alerts you when it finds something you should be protecting. It takes about 5 minutes to enable and runs continuously in the background.

Quick win: Enabling S3 Block Public Access at the account level takes under two minutes and prevents any bucket from being accidentally made public — even if someone tries to. This single control would have prevented dozens of major breaches.

The 5-Point S3 Security Baseline

For any organization storing customer data in S3, these five controls are the non-negotiable baseline. Think of them as the locks on your front door — before you worry about the alarm system.

Block Public Access enabled at account level — prevents any bucket from being publicly accessible, even if misconfigured at the bucket level.
Server-side encryption enabled (SSE-KMS) — all data encrypted at rest using AWS Key Management Service, with audit logs of key usage.
Bucket policies reviewed and least-privilege — every bucket policy audited to ensure only the services and roles that genuinely need access have it.
AWS Macie enabled — continuous automated scanning to detect sensitive data and misconfigurations before attackers do.
CloudTrail logging active — every API call logged, so you have a complete audit trail if something goes wrong and need to understand what was accessed.

Don’t forget Azure Blob Storage. The same exposure pattern applies. Azure containers can be set to public access, and the 38 million record incident mentioned above happened entirely within Microsoft’s cloud. The service is different; the misconfiguration risk is identical.

What This Means for Your Business

If you’re a startup or growing business in the Bay Area, you’re probably using AWS or Azure in ways that started small and grew fast. S3 buckets get created for one use case and end up holding customer data three product iterations later. That’s normal. What isn’t normal — and what regulators won’t accept as an excuse — is not knowing what’s in them or who can access them.

GDPR, HIPAA, and California’s CCPA all require you to implement “appropriate technical measures” to protect customer data. An S3 bucket with public access doesn’t meet that bar. The fines for a breach involving exposed cloud storage are no longer theoretical — they’re well-established and growing.

The good news: this is one of the most fixable categories of cloud security risk. The controls are well-understood, the tooling is built into AWS and Azure, and a focused audit can close the most critical gaps in a matter of days — not months.

LineSight Digital offers cloud security assessments specifically for Bay Area startups and small businesses. We’ll audit your S3 and Azure Blob configurations, IAM policies, and monitoring setup — and give you a clear, prioritized remediation roadmap. No obligation.