The data trove was apparently made public by accident by one of the data-mining companies that compiled it. It includes a mix of private information and data gleaned from public voter rolls: “the voter’s date of birth, home and mailing addresses, phone number, registered party, self-reported racial demographic, voter registration status” as well as computer “modeled” speculation about each person’s race and religion, according to an analysis provided to The Intercept.
The leak was discovered by Chris Vickery, an analyst at the U.S. cybersecurity firm UpGuard, who last year discovered an enormous breach of Mexican voter data and in 2015 a 300GB leak of records of 191 million voters. This new incident is more extensive, according to the analysis, written by UpGuard:
UpGuard’s Cyber Risk Team can now confirm that unsecured databases containing the sensitive personal details of over 198 million American voters was left exposed to the internet. The data, which was stored in a publicly accessible cloud server owned by Republican data firm Deep Root Analytics, included 1.1 terabytes of entirely unsecured personal information compiled by DRA and at least two other contractors, TargetPoint Consulting, Inc. and Data Trust. In total, the personal information of nearly all of America’s 200 million registered voters was exposed, including their names, dates of birth, home addresses, phone numbers, and voter registration details, as well as voter ethnicities and religions as “modeled” by the firms’ data scientists.
(DRA, TargetPoint, and Data Trust were not immediately available for comment.)
Two of the firms linked to the database, Deep Root Analytics and TargetPoint, were among three firms hired by the RNC to do most of its data modeling and voter scoring in 2016, according to a December Ad Age story, with a mandate to shore up unconvinced Trump-leaning voters, sway weak Hillary Clinton supporters, and capture undecided voters.
What UpGuard appears to have discovered, sitting on an Amazon cloud storage drive with no password or username required for access by anyone on the internet, was terabytes of the data used to map the voter proclivities and demographics key to finding voters in those buckets. Beyond personal information like religion, age, and probable ethnicity, certain database files among those made public include individual scores for nearly 50 different beliefs, according to UpGuard’s analysis:
Each of the fields under each of the forty-eight columns signifies the potential voter’s modeled likelihood of supporting the policy, political candidate, or belief listed at the top of the column, with zero indicating very unlikely, and one indicating very likely.
Calculated for 198 million potential voters, this adds up to a spreadsheet of 9.5 billion modeled probabilities, for questions ranging from how likely it is the individual voted for Obama in 2012, whether they agree with the Trump foreign policy of “America First,” and how likely they are to be concerned with auto manufacturing as an issue, among others.
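The 9.5 billion figure follows from multiplying the two numbers UpGuard cites. A minimal sketch of that arithmetic, assuming exactly one modeled score per person per column (the constant and function names here are illustrative, not taken from the leaked files):

```python
# Figures as cited in UpGuard's analysis; names are hypothetical.
VOTERS = 198_000_000    # approximate number of scored individuals
MODELED_COLUMNS = 48    # one modeled probability per column


def total_scores(voters: int, columns: int) -> int:
    """Return the number of individual scores, one per voter per column."""
    return voters * columns


total = total_scores(VOTERS, MODELED_COLUMNS)
print(f"{total:,} modeled probabilities (about {total / 1e9:.1f} billion)")
```

The product is 9,504,000,000, which rounds to the 9.5 billion quoted.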
The below screenshot, provided by Vickery, shows just some of the alignments on which 198 million Americans were scored:
Most Americans would likely be disturbed that this kind of information was generated about them in the first place, to say nothing of the fact that it was accidentally made public by the very companies being paid by the Republican Party to make it, with essentially zero security precautions of any kind taken with how it was stored in the cloud.
Update: June 19th, 2017
Bill Daddi, apparently handling public relations for Deep Root Analytics, provided the following message to The Intercept:
As you can understand, we can’t comment on much here as we are not at liberty to discuss the details of work on behalf of any entity that might be a client, nor provide specifics of our proprietary data and analysis.
There is a general statement that has been released, which is below. This hopefully addresses some of your questions.
To help you understand what Deep Root Analytics does, we inform local television ad buys for advertisers. We don’t make the buys, nor engage in any digital marketing or targeting outreach. We help entities understand what local TV ad buys to make.
As indicated in the below, we have engaged Stroz Friedberg to conduct a thorough review, and that process is underway. Based upon this review we have determined that the access that was made without our knowledge happened because of a change that was made in the files’ asset access protocols. We are in the process of determining how that change was made and take full responsibility for the change, but suffice to say we have updated the settings to prevent further access. We believe the change that was made happened post June 1 2017, which was when we last evaluated and updated our security settings. We do not believe that our systems have been hacked. To date, the only entity that we are aware of that had access to the data was Chris Vickery.
“Deep Root Analytics has become aware that a number of files within our online storage system were accessed without our knowledge.
Deep Root Analytics builds voter models to help enhance advertiser understanding of TV viewership. The data accessed was not built for or used by any specific client. It is our proprietary analysis to help inform local television ad buying.
The data that was accessed was, to the best of our knowledge, this proprietary information as well as voter data that is publicly available and readily provided by state government offices. Since this event has come to our attention, we have updated the access settings and put protocols in place to prevent further access. We take full responsibility for this situation.
Deep Root Analytics maintains industry standard security protocols. We built our systems in keeping with these protocols and had last evaluated and updated our security settings on June 1, 2017.
We are conducting an internal review and have retained cyber security firm Stroz Friedberg to conduct a thorough investigation. Through this process, which is currently underway, we have learned that access was gained through a recent change in asset access settings since June 1, 2017. We accept full responsibility, will continue with our investigation, and based on the information we have gathered thus far, we do not believe that our systems have been hacked. To date, the only entity that we are aware of that had access to the data was Chris Vickery.”