Frances Haugen’s testimony at the Senate hearing today raised serious questions about how Facebook’s algorithms work, and it echoed many findings from our previous investigation.
On Sunday night, the primary source for the Wall Street Journal’s Facebook Files, an investigative series based on internal Facebook documents, revealed her identity in an episode of 60 Minutes.
Frances Haugen, a former product manager at the company, says she came forward after she saw Facebook’s leadership repeatedly prioritize profit over safety.
Before quitting in May of this year, she combed through Facebook Workplace, the company’s internal employee social media network, and gathered a wide swath of internal reports and research in an attempt to conclusively demonstrate that Facebook had willfully chosen not to fix the problems on its platform.
Today she testified in front of the Senate on the impact of Facebook on society. She reiterated many of the findings from the internal research and implored Congress to act.
“I’m here today because I believe Facebook’s products harm children, stoke division, and weaken our democracy,” she said in her opening statement to lawmakers. “These problems are solvable. A safer, free-speech respecting, more enjoyable social media is possible. But there is one thing that I hope everyone takes away from these disclosures, it is that Facebook can change, but is clearly not going to do so on its own.”
During her testimony, Haugen particularly blamed Facebook’s algorithm and platform design decisions for many of its issues. This is a notable shift from the existing focus of policymakers on Facebook’s content policy and censorship—what does and doesn’t belong on Facebook. Many experts believe that this narrow view leads to a whack-a-mole strategy that misses the bigger picture.
“I’m a strong advocate for non-content-based solutions, because those solutions will protect the most vulnerable people in the world,” Haugen said, pointing to Facebook’s uneven ability to enforce its content policy in languages other than English.
Haugen’s testimony echoes many of the findings from an MIT Technology Review investigation published earlier this year, which drew upon dozens of interviews with Facebook executives, current and former employees, industry peers, and external experts. We pulled together the most relevant parts of our investigation and other reporting to give more context to Haugen’s testimony.
How does Facebook’s algorithm work?
Colloquially, we use the term “Facebook’s algorithm” as though there’s only one. In fact, Facebook decides how to target ads and rank content based on hundreds, perhaps thousands, of algorithms. Some of those algorithms tease out a user’s preferences and boost that kind of content up the user’s news feed. Others are for detecting specific types of bad content, like nudity, spam, or clickbait headlines, and deleting or pushing them down the feed.
All of these algorithms are known as machine-learning algorithms. As I wrote earlier this year:
Unlike traditional algorithms, which are hard-coded by engineers, machine-learning algorithms “train” on input data to learn the correlations within it. The trained algorithm, known as a machine-learning model, can then automate future decisions. An algorithm trained on ad click data, for example, might learn that women click on ads for yoga leggings more often than men. The resultant model will then serve more of those ads to women.
And because of Facebook’s enormous amounts of user data, it can
develop models that learned to infer the existence not only of broad categories like “women” and “men,” but of very fine-grained categories like “women between 25 and 34 who liked Facebook pages related to yoga,” and [target] ads to them. The finer-grained the targeting, the better the chance of a click, which would give advertisers more bang for their buck.
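To make the yoga-leggings example above concrete, here is a minimal sketch of "training" on ad click data. It is an illustration only, not Facebook's code: the function names, the group labels, and the toy click log are all made up, and a real system would use a far richer model than a table of click rates.

```python
from collections import defaultdict

def train_click_model(click_log):
    """Learn a click-through rate for each (group, ad) pair from logged impressions."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for group, ad, was_clicked in click_log:
        shown[(group, ad)] += 1
        if was_clicked:
            clicked[(group, ad)] += 1
    # The "model" here is just the observed click rate per (group, ad) pair.
    return {key: clicked[key] / shown[key] for key in shown}

def pick_ad(model, group, ads):
    """Serve the ad with the highest learned click rate for this group."""
    return max(ads, key=lambda ad: model.get((group, ad), 0.0))

# Hypothetical log of (group, ad, was_clicked) impressions.
click_log = [
    ("women", "yoga_leggings", True),
    ("women", "yoga_leggings", True),
    ("women", "yoga_leggings", False),
    ("women", "power_drill", False),
    ("men", "yoga_leggings", False),
    ("men", "yoga_leggings", False),
    ("men", "power_drill", True),
    ("men", "power_drill", False),
]

model = train_click_model(click_log)
ads = ["yoga_leggings", "power_drill"]
chosen_for_women = pick_ad(model, "women", ads)
chosen_for_men = pick_ad(model, "men", ads)
```

Nobody wrote a rule saying which group should see which ad. The preference falls out of the logged clicks, and finer-grained groups would simply mean more keys in the same table.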
The same principles apply for ranking content in news feed:
Just as algorithms [can] be trained to predict who would click what ad, they [can] also be trained to predict who would like or share what post, and then give those posts more prominence. If the model determined that a person really liked dogs, for instance, friends’ posts about dogs would appear higher up on that user’s news feed.
Before Facebook began using machine-learning algorithms, teams used design tactics to increase engagement. They’d experiment with things like the color of a button or the frequency of notifications to keep users coming back to the platform. But machine-learning algorithms create a much more powerful feedback loop. Not only can they personalize what each user sees, they will also continue to evolve with a user’s shifting preferences, perpetually showing each person what will keep them most engaged.
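The ranking idea described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Facebook's actual system: the per-topic affinity scores stand in for a trained engagement model, and the `flags` field and demotion factors are hypothetical stand-ins for the models that push bad content down the feed.

```python
def rank_feed(posts, topic_affinity, demotion_factors):
    """Order posts by predicted engagement, pushing flagged posts down the feed."""
    def score(post):
        # Stand-in for a trained model: how likely this user is to like or share the topic.
        predicted = topic_affinity.get(post["topic"], 0.0)
        # Each flag raised by a (hypothetical) detection model scales the score down.
        for flag in post["flags"]:
            predicted *= demotion_factors.get(flag, 1.0)
        return predicted
    return sorted(posts, key=score, reverse=True)

# Made-up posts, user preferences, and demotion strength.
posts = [
    {"id": "politics_post", "topic": "politics", "flags": []},
    {"id": "dog_clickbait", "topic": "dogs", "flags": ["clickbait"]},
    {"id": "dog_photo", "topic": "dogs", "flags": []},
]
topic_affinity = {"dogs": 0.9, "politics": 0.3}
demotion_factors = {"clickbait": 0.2}

ranked = rank_feed(posts, topic_affinity, demotion_factors)
ranked_ids = [post["id"] for post in ranked]
```

For a user who really likes dogs, the friend's dog photo lands at the top. The feedback loop comes from retraining: as the user's clicks shift, the affinity scores shift with them, and so does the feed.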
Who runs Facebook’s algorithm?
Within Facebook, there’s no one team in charge of this content-ranking system in its entirety. Engineers develop and add their own machine-learning models into the mix, based on their team’s objectives. For example, teams focused on removing or demoting bad content, known as the integrity teams, will only train models for detecting different types of bad content.
This was a decision Facebook made early on as part of its “move fast and break things” culture. It developed an internal tool known as FBLearner Flow that made it easy for engineers without machine-learning experience to develop whatever models they needed. By one data point, it was already in use by more than a quarter of Facebook’s engineering team in 2016.
Many of the current and former Facebook employees I’ve spoken to say that this is part of why Facebook can’t seem to get a handle on what it serves up to users in the news feed. Different teams can have competing objectives, and the system has grown so complex and unwieldy that no one can keep track anymore of all of its different components.
As a result, the company’s main form of quality control is experimentation and measurement. As I wrote:
Teams train up a new machine-learning model on FBLearner, whether to change the ranking order of posts or to better catch content that violates Facebook’s community standards (its rules on what is and isn’t allowed on the platform). Then they test the new model on a small subset of Facebook’s users to measure how it changes engagement metrics, such as the number of likes, comments, and shares, says Krishna Gade, who served as the engineering manager for news feed from 2016 to 2018.
If a model reduces engagement too much, it’s discarded. Otherwise, it’s deployed and continually monitored. On Twitter, Gade explained that his engineers would get notifications every few days when metrics such as likes or comments were down. Then they’d decipher what had caused the problem and whether any models needed retraining.
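The test-and-measure process described above amounts to a simple gate on engagement metrics. Here is a rough sketch, with loud caveats: the function name, the metric names, and the 2% threshold are all assumptions for illustration. The reporting says only that a model is discarded if it reduces engagement "too much," and does not give a number.

```python
def evaluate_model(control, treatment, max_drop=0.02):
    """Decide whether to keep a new model based on engagement in a small test group.

    control and treatment map metric names to counts for users without and with
    the new model. max_drop is a hypothetical threshold, not a real Facebook value.
    Assumes every control metric is nonzero.
    """
    for metric, baseline in control.items():
        relative_change = (treatment[metric] - baseline) / baseline
        if relative_change < -max_drop:
            # Engagement fell further than the allowed threshold: drop the model.
            return "discard"
    # Otherwise the model ships and its metrics keep being monitored.
    return "deploy"

# Made-up engagement counts for illustration.
control = {"likes": 1000, "comments": 200, "shares": 50}
small_dip = {"likes": 990, "comments": 199, "shares": 50}
large_dip = {"likes": 900, "comments": 195, "shares": 50}

decision_small_dip = evaluate_model(control, small_dip)
decision_large_dip = evaluate_model(control, large_dip)
```

Note what the gate measures. A model built to catch harmful content passes or fails on whether likes, comments, and shares hold up, not on how much harm it prevents.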
How has Facebook’s content ranking led to the spread of misinformation and hate speech?
During her testimony, Haugen repeatedly came back to the idea that Facebook’s algorithm incites misinformation, hate speech, and even ethnic violence.
“Facebook … knows—they have admitted in public—that engagement-based ranking is dangerous without integrity and security systems but then not rolled out those integrity and security systems in most of the languages in the world,” she told the Senate today. “It is pulling families apart. And in places like Ethiopia it is literally fanning ethnic violence.”
Here’s what I’ve written about this previously:..
Read The Full Article at MIT Technology Review