ON THE INTERNET, the personal data users give away for free is transformed into a precious commodity. The puppy photos people upload train machines to be smarter. The questions they ask Google uncover humanity’s deepest prejudices. And their location histories tell investors which stores attract the most shoppers. Even seemingly benign activities, like staying in and watching a movie, generate mountains of information, treasure to be scooped up later by businesses of all kinds.
Personal data is often compared to oil: it powers today’s most profitable corporations, just as fossil fuels energized those of the past. But the consumers it’s extracted from often know little about how much of their information is collected, who gets to look at it, and what it’s worth. Every day, hundreds of companies you may not even know exist gather facts about you, some more intimate than others. That information may then flow to academic researchers, hackers, law enforcement, and foreign nations, as well as plenty of companies trying to sell you stuff.
What Constitutes “Personal Data”?
The internet might seem like one big privacy nightmare, but don’t throw your smartphone out the window just yet. “Personal data” is a pretty vague umbrella term, and it helps to unpack exactly what it means. Health records, Social Security numbers, and banking details make up the most sensitive information stored online. Social media posts, location data, and search-engine queries may also be revealing, but they are typically monetized in a way that, say, your credit card number is not. Other kinds of data collection fall into separate categories, some of which may surprise you. Did you know some companies are analyzing the unique way you tap and fumble with your smartphone?
All this information is collected on a wide spectrum of consent: Sometimes the data is forked over knowingly, while in other scenarios users might not understand they’re giving up anything at all. Often, it’s clear something is being collected, but the specifics are hidden from view or buried in hard-to-parse terms-of-service agreements.
Consider what happens when someone sends a vial of saliva to 23andMe. The person knows they’re sharing their DNA with a genomics company, but they may not realize it will be resold to pharmaceutical firms. Many apps use your location to serve up custom advertisements, but they don’t necessarily make it clear that a hedge fund may also buy that location data to analyze which retail stores you frequent. Anyone who has witnessed the same shoe advertisement follow them around the web knows they’re being tracked, but fewer people likely understand that companies may be recording not just their clicks but also the exact movements of their mouse.
In each of these scenarios, the user received something in return for allowing a corporation to monetize their data. They got to learn about their genetic ancestry, use a mobile app, or browse the latest footwear trends from the comfort of their computer. This is the same sort of bargain Facebook and Google offer. Their core products, including Instagram, Messenger, Gmail, and Google Maps, don’t cost money. You pay with your personal data, which is used to target you with ads.
Who Buys, Sells, and Barters My Personal Data?
The trade-off between the data you give and the services you get may or may not be worth it, but another breed of business amasses, analyzes, and sells your information without giving you anything at all: data brokers. These firms compile info from publicly available sources like property records, marriage licenses, and court cases. They may also gather your medical records, browsing history, social media connections, and online purchases. Depending on where you live, data brokers might even purchase your information from the Department of Motor Vehicles. Don’t have a driver’s license? Retail stores sell info to data brokers, too.
The information data brokers collect may be inaccurate or out of date. Still, it can be incredibly valuable to corporations, marketers, investors, and individuals. In fact, American companies alone are estimated to have spent over $19 billion in 2018 acquiring and analyzing consumer data, according to the Interactive Advertising Bureau.
Data brokers are also valuable resources for abusers and stalkers. Doxing, the practice of publicly releasing someone’s personal information without their consent, is often made possible because of data brokers. While you can delete your Facebook account relatively easily, getting these firms to remove your information is time-consuming, complicated, and sometimes impossible. In fact, the process is so burdensome that you can pay a service to do it on your behalf.
Amassing and selling your data like this is perfectly legal. While some states, including California and Vermont, have recently moved to put more restrictions on data brokers, the industry remains largely unregulated. The Fair Credit Reporting Act dictates how information collected for credit, employment, and insurance reasons may be used, but some data brokers have been caught skirting the law. In 2012, the “person lookup” site Spokeo settled with the FTC for $800,000 over charges that it violated the FCRA by advertising its products for purposes like job background checks. And data brokers that market themselves as being more akin to digital phone books don’t have to abide by the regulation in the first place.
There are also few laws governing how social media companies may collect data about their users. In the United States, no modern federal privacy regulation exists, and the government can even legally request digital data held by companies without a warrant in many circumstances (though the Supreme Court recently expanded Fourth Amendment protections to a narrow type of location data).
The good news is, the information you share online does contribute to the global store of useful knowledge: Researchers from a number of academic disciplines study social media posts and other user-generated data to learn more about humanity. In his book, Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are, Seth Stephens-Davidowitz argues there are many scenarios where humans are more honest with sites like Google than they are on traditional surveys. For example, he says, fewer than 20 percent of people admit they watch porn, but there are more Google searches for “porn” than “weather.”
Personal data is also used by artificial intelligence researchers to train their automated programs. Every day, users around the globe upload billions of photos, videos, text posts, and audio clips to sites like YouTube, Facebook, Instagram, and Twitter. That media is then fed to machine learning algorithms, so they can learn to “see” what’s in a photograph or automatically determine whether a post violates Facebook’s hate-speech policy. Your selfies are literally making the robots smarter. Congratulations.
The History of Personal Data Collection
Humans have used technological devices to collect and process data about the world for thousands of years. Greek scientists developed the “first computer,” a complex gear system called the Antikythera mechanism, to trace astronomical patterns as far back as 150 BC. Two millennia later, in the late 1880s, Herman Hollerith invented the tabulating machine, a punch card device that helped process data from the 1890 United States Census. Hollerith founded a company to market his invention; it later merged with other firms to form what is now IBM.
By the 1960s, the US government was using powerful mainframe computers to store and process an enormous amount of data on nearly every American. Corporations also used the machines to analyze sensitive information, including consumer purchasing habits. There were no laws dictating what kind of data they could collect. Worries over supercharged surveillance soon emerged, especially after the publication of Vance Packard’s 1964 book, The Naked Society, which argued that technological change was causing an unprecedented erosion of privacy.