Deep Root Analytics Exposed 198 Million U.S. Voter Records Through a Public AWS S3 Bucket

Published: Sunday, 24 May 2026

aws
s3
data loss

UpGuard reported that Deep Root Analytics exposed 1.1 TB of downloadable data for up to 198 million potential U.S. voters through an Amazon Web Services S3 bucket with no access protections.

The exposed repository contained names, dates of birth, home and mailing addresses, phone numbers, voter registration details and modelled ethnicity, religion and political preference data linked by persistent RNC identifiers. The incident illustrates a clear cloud governance failure: using an S3 bucket as an open internet-facing distribution endpoint for highly sensitive data is an unsafe architecture unless the content is intentionally public. Public S3 website-style access is trivially discoverable and routinely scanned, making “security by obscurity” ineffective.

Beyond the storage misconfiguration itself, the source report points to broader third-party risk and data concentration problems across Deep Root Analytics, Data Trust and TargetPoint. The core lesson is not just that a bucket was public, but that sensitive, identity-linked analytics data was placed into an architecture with no meaningful access control, weak governance over vendor-operated cloud assets and no effective preventive guardrails.

What went wrong

What Happened	Cause	Action
Sensitive voter data was stored in a publicly accessible S3 bucket	UpGuard found that the `dra-dw` Amazon S3 bucket “lacked any protection against access” and that 1.1 TB was fully downloadable by anyone with the URL. This matches a common S3 public endpoint exposure pattern: content intended to remain private was placed into storage configured for open internet retrieval.	Validate that no S3 buckets containing PII, political data, customer data or internal analytics are publicly accessible through bucket policy, ACLs or static website/public object access settings. SkySiege surfaces buckets that have been configured for public access.
An architecture intended for public distribution was used for private high-risk data	Open S3 website-style access is suitable only for fully public datasets or assets. Here, the bucket held identity-linked voter records and modelled preferences, which is the opposite of an appropriate public distribution use case.	Validate whether any cloud storage service is being used as a direct file-serving endpoint for data that is not explicitly approved for public release. SkySiege flags S3 website endpoints as an anti-pattern for such data, because services such as CloudFront provide much stronger data access controls than a public S3 endpoint.
The bucket was trivially discoverable and externally scannable	UpGuard states the data could be accessed by navigating to a six-character Amazon subdomain. S3 public endpoints are easy to enumerate and are actively scanned with common tools, so discoverability risk was inherent once public access was enabled.	Validate that teams do not assume obscure bucket names provide protection. SkySiege recommends a unique naming convention that helps identify the owning account and the bucket’s usage. Such names can slow scanning, but they are not an access control and do not amount to defence in depth.
Massive amounts of PII and sensitive modelled attributes were concentrated in one exposed repository	The report lists names, birth dates, addresses, phone numbers, voter registration details, modelled ethnicity, modelled religion and 9.5 billion modelled political scores tied back to real identities through RNC IDs. This created a single high-impact failure point.	Validate whether sensitive datasets are unnecessarily centralised, linkable and accessible from the same storage location. Data like this is likely better hosted in a managed database such as Amazon RDS, segregated in private subnets, than in object storage.
Third-party vendor governance appears weak across a shared political data ecosystem	The exposed bucket was operated by Deep Root Analytics but contained data associated with Data Trust and TargetPoint. The report ties multiple contractors to the same broader RNC data operation, showing cross-vendor data handling without effective control over cloud storage security.	Validate contractual, technical and monitoring controls for vendors hosting or processing shared datasets. Separate entities should own their data separately and keep it in separate environments, and any data sharing between them should be explicit.
Basic preventive controls did not stop public exposure of highly sensitive data	UpGuard describes the repository as missing even the simplest protections against public access. The bucket remained exposed for an unknown period until notification prompted remediation.	Validate that preventive controls such as account-level S3 public access blocks, policy guardrails and continuous configuration monitoring are enabled. As standard, SkySiege checks S3 for missing bucket policies, public access controls and other data protection policies.
Identity linkage turned analytics data into directly attributable personal profiles	The report explains that RNC IDs in analytical files could be matched to names and addresses in contact files, making modelled beliefs and voting propensity attributable to identified individuals.	Validate whether pseudonymous identifiers can be re-linked using adjacent datasets in the same environment. Every resource should be labelled with its data classification, whether it holds personal data and any other considerations such as legal obligations.
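
The re-linkage check in the last row can be sketched in a few lines. This is a minimal illustration, assuming the datasets stored together are described by an inventory of column names. The dataset names, column names and the `relinkable_pairs` helper are hypothetical and do not reproduce the exposed files. The idea is that a scores file is only pseudonymous if no neighbouring dataset carries the same identifier alongside direct identifiers.

```python
from itertools import combinations

# Hypothetical column inventory for datasets stored in the same location.
# Names are illustrative only and do not reproduce the exposed files.
DATASETS = {
    "contact_file": {"rnc_id", "name", "address", "phone"},
    "model_scores": {"rnc_id", "score_a", "score_b"},
    "ad_spend": {"market", "week", "spend"},
}

# Columns treated as direct identifiers (an assumption for this sketch).
DIRECT_IDENTIFIERS = {"name", "address", "phone", "date_of_birth"}


def relinkable_pairs(datasets, direct_identifiers):
    """Return (identified, pseudonymous, shared_keys) triples.

    A pair is flagged when one dataset holds direct identifiers, the other
    does not, and the two share at least one non-identifier column that
    could act as a join key.
    """
    findings = []
    for (name_a, cols_a), (name_b, cols_b) in combinations(sorted(datasets.items()), 2):
        shared = (cols_a & cols_b) - direct_identifiers
        if not shared:
            continue
        a_identified = bool(cols_a & direct_identifiers)
        b_identified = bool(cols_b & direct_identifiers)
        if a_identified and not b_identified:
            findings.append((name_a, name_b, sorted(shared)))
        elif b_identified and not a_identified:
            findings.append((name_b, name_a, sorted(shared)))
    return findings


for identified, pseudonymous, keys in relinkable_pairs(DATASETS, DIRECT_IDENTIFIERS):
    print(f"{pseudonymous} can be re-linked to {identified} via {keys}")
```

A column-name check like this only catches obvious shared keys. It would not detect re-identification through combinations of quasi-identifiers, so it is a starting point for classification rather than a guarantee.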

Why this matters

This incident is not just a storage mistake. It is a clear example of cloud architecture being misapplied to sensitive data. Public S3 access is a valid AWS feature, but only for data meant to be openly distributed. When the same pattern is used for private analytics warehouses, exposure becomes immediate, scalable and easy for external parties to discover.
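
To make "public S3 access" concrete, the sketch below inspects a bucket policy document and an ACL grant list for the two most common forms of open read access. It is a simplified illustration, not a complete policy evaluator: it assumes any `Condition` block restricts access (which is not always true), it ignores `NotPrincipal` and `NotAction`, and the example bucket name and helper names are hypothetical. The actual policy or ACL on the `dra-dw` bucket is not known from the report.

```python
from fnmatch import fnmatchcase

# Group URI that S3 ACLs use to mean "anyone on the internet".
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def _as_list(value):
    """Policy fields may hold a single value or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def _principal_is_anyone(principal):
    """True for Principal "*" or {"AWS": "*"} (including list forms)."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return "*" in _as_list(principal.get("AWS", []))
    return False


def public_read_statements(policy):
    """Return Allow statements that let anyone read objects, with no Condition."""
    findings = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        if "Condition" in stmt:
            # Simplifying assumption: a Condition narrows access.
            continue
        if not _principal_is_anyone(stmt.get("Principal")):
            continue
        actions = _as_list(stmt.get("Action", []))
        # Action names are case-insensitive and may use wildcards (s3:*, s3:Get*).
        if any(fnmatchcase("s3:getobject", str(a).lower()) for a in actions):
            findings.append(stmt)
    return findings


def acl_grants_public(grants):
    """True if any ACL grant targets the AllUsers group."""
    return any(g.get("Grantee", {}).get("URI") == ALL_USERS_URI for g in grants)


# Hypothetical policy of the kind used for open static hosting.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        }
    ],
}

print(len(public_read_statements(EXAMPLE_POLICY)), "public read statement(s) found")
```

A policy like `EXAMPLE_POLICY` is appropriate for assets meant for open distribution. The same statement on a bucket holding private analytics data is the exposure pattern described here.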

The operational lesson is that, in practice, internet exposure of cloud storage is close to deterministic rather than a matter of bad luck. Once public access is enabled, scanning and discovery are easy. That makes detection gaps and governance weaknesses more important than naming secrecy or assumptions that nobody will look. If an organisation lacks continuous visibility into public buckets, it may not know that sensitive data is already downloadable.
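
Continuous visibility can start with something as small as the check below, which flags buckets whose S3 Block Public Access settings are missing or incomplete. The four setting names are the ones S3 uses in its `PublicAccessBlockConfiguration`. The inventory contents, bucket names and helper names are hypothetical. In a real audit the configurations would be fetched from the AWS API (for example with boto3's `get_public_access_block`) rather than hard-coded. A static dictionary is used here so the sketch stays self-contained.

```python
# The four S3 Block Public Access settings.
REQUIRED_SETTINGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def missing_public_access_blocks(config):
    """Return the settings that are not explicitly enabled.

    `config` is a dict shaped like PublicAccessBlockConfiguration, or None
    when the bucket has no configuration at all.
    """
    config = config or {}
    return [name for name in REQUIRED_SETTINGS if config.get(name) is not True]


def audit(inventory):
    """Map each bucket with gaps to the list of settings it is missing."""
    report = {}
    for bucket, config in inventory.items():
        missing = missing_public_access_blocks(config)
        if missing:
            report[bucket] = missing
    return report


# Hypothetical inventory: one locked-down bucket, one partial, one unconfigured.
INVENTORY = {
    "acme-prod-analytics": {name: True for name in REQUIRED_SETTINGS},
    "acme-prod-exports": {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": False,
        "RestrictPublicBuckets": False,
    },
    "acme-legacy-share": None,
}

for bucket, missing in audit(INVENTORY).items():
    print(f"{bucket}: missing {', '.join(missing)}")
```

Run on a schedule, a check of this kind turns "we assume nothing is public" into a report that can be reviewed. Enabling the same four settings at the account level acts as the preventive guardrail, with the per-bucket audit as the detective control behind it.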

Business impact is severe here because the repository combined high-volume PII, inferred sensitive attributes and political scoring. That increases downstream risk of fraud, profiling, spam, targeted manipulation and reputational damage. It also raises legal and compliance concerns wherever organisations process sensitive personal data without adequate safeguards.

For enterprise diligence, the broader concern is third-party governance. The source shows multiple firms participating in a shared data ecosystem, yet one vendor’s cloud misconfiguration created systemic exposure. That is exactly the kind of control failure SkySiege surfaces: public cloud storage, missing policies and controls, and cloud resources used in ways that are poorly suited to data protection and management.

References

Original Article

