Why AI isn’t going to solve Facebook’s fake news problem
Moderation will always be a problem on a platform this huge.
Facebook has a lot of problems right now, but one that’s definitely not going away any time soon is fake news. As the company’s user base has grown to include more than a quarter of the world’s population, it has (understandably) struggled to control what they all post and share. For Facebook, unwanted content can be anything from mild nudity to serious violence, but what’s proved to be most sensitive and damaging for the company is hoaxes and misinformation — especially when it has a political bent.
So what is Facebook going to do about it? At the moment, the company doesn’t seem to have a clear strategy. Instead, it’s throwing a lot at the wall and seeing what works. It’s hired more human moderators (as of February this year it had around 7,500); it’s giving users more on-site information about news sources; and in a recent interview, Mark Zuckerberg suggested that the company might set up some sort of independent body to rule on what content is kosher. (Which could be seen as democratic, an abandonment of responsibility, or an admission that Facebook is out of its depth, depending on your view.) But one thing experts say Facebook needs to be extremely careful about is handing the whole job over to AI.
So far, the company seems to be just experimenting with this approach. During an interview with The New York Times about the Cambridge Analytica scandal, Zuckerberg revealed that for the special election in Alabama last year, the company “deployed some new AI tools to identify fake accounts and false news.” He specified that these were Macedonian accounts (an established hub in the fake-news-for-profit business), and the company later clarified that it had deployed machine learning to find “suspicious behaviors without assessing the content itself.”
This is smart because when it comes to fake news, AI isn’t up to the job.
The challenges of building an automated fake news filter with artificial intelligence are numerous. From a technical perspective, AI fails on a number of levels because it just can’t understand human writing the way humans do. It can pull out certain facts and do a crude sentiment analysis (guessing whether a piece of content is “happy” or “angry” based on keywords), but it can’t understand subtleties of tone, consider cultural context, or ring someone up to corroborate information. And even if it could do all this, which would knock out the most obvious misinformation and hoaxes, it would eventually run up against edge cases that confuse even humans. If people on the left and the right can’t agree on what is and is not “fake news,” there’s no way we can teach a machine to make that judgement for us.
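To see just how shallow that keyword approach is, here is a minimal sketch of crude, keyword-based sentiment scoring. The word lists, the scoring rule, and the example sentence are all assumptions made up for illustration, not any production system, but the failure mode is the real one: the moment irony enters the picture, surface keywords point the wrong way.

```python
# A toy illustration of keyword-based sentiment scoring. The word lists and
# scoring rule are assumptions made up for this sketch, not any real system.
HAPPY_WORDS = {"great", "wonderful", "love", "celebrate", "win"}
ANGRY_WORDS = {"outrage", "disgrace", "furious", "scandal", "attack"}

def crude_sentiment(text: str) -> str:
    """Guess a label by counting keywords; no sense of tone or context."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    happy = sum(w in HAPPY_WORDS for w in words)
    angry = sum(w in ANGRY_WORDS for w in words)
    if happy > angry:
        return "happy"
    if angry > happy:
        return "angry"
    return "neutral"

# Irony defeats keyword matching: this sarcastic sentence comes back "happy".
print(crude_sentiment("Oh great, another wonderful scandal. Love it."))
```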
In the past, efforts to deal with fake news using AI have quickly run into problems, as with the Fake News Challenge, a competition held last year to crowdsource machine learning solutions. Dean Pomerleau of Carnegie Mellon University, who helped organize the challenge, tells The Verge that he and his team soon realized AI couldn’t tackle this alone.
“We actually started out with a more ambitious goal of creating a system that could answer the question ‘Is this fake news, yes or no?’ We quickly realized machine learning just wasn’t up to the task.”
Pomerleau stresses that comprehension was the primary problem, and to understand just why language can be so nuanced, especially online, we can turn to the example set by Tide Pods. As Cornell professor James Grimmelmann explained in a recent essay on fake news and platform moderation, the internet’s embrace of irony has made it extremely difficult to judge sincerity and intent. Facebook and YouTube found this out for themselves when they tried to remove Tide Pod Challenge videos in January this year.
As Grimmelmann explains, when it came to deciding which videos to delete, the companies would have been faced with a dilemma. “It’s easy to find videos of people holding up Tide Pods, sympathetically noting how tasty they look, and then giving a finger-wagging speech about not eating them because they’re dangerous,” he says. “Are these sincere anti-pod-eating public service announcements? Or are they surfing the wave of interest in pod-eating by superficially claiming to denounce it? Both at once?”
Grimmelmann calls this effect “memetic kayfabe,” borrowing the pro-wrestling term for the willing suspension of disbelief shared by audience and wrestlers. He also says this opacity in meaning is not limited to meme culture; it has been embraced by political partisans, who are often responsible for creating and sharing fake news. Pizzagate is the perfect example of this, says Grimmelmann, as it is “simultaneously a real conspiracy theory, a gleeful masquerade of a conspiracy theory, and a disparaging meme about conspiracy theories.”
So if Facebook had chosen to block any Pizzagate articles during the 2016 election, it would likely have been hit not only with complaints about censorship, but also with protests that such stories were “only a joke.” Extremists exploit this ambiguity frequently, as was best shown in the leaked style guide of neo-Nazi website The Daily Stormer.
Founder Andrew Anglin advised would-be writers that “the unindoctrinated should not be able to tell if we are joking or not,” before making it clear that they are not: “This is obviously a ploy and I actually do want to gas kikes. But that’s neither here nor there.”
Considering this complexity, it’s no wonder that Pomerleau’s Fake News Challenge ended up asking teams to complete a simpler task: make an algorithm that can simply spot articles covering the same topic, something the algorithms turned out to be pretty good at.
With this tool a human could tag a story as fake news (for example, claiming a certain celebrity has died) and then the algorithm would knock out any coverage repeating the lie. “We talked to real-life fact-checkers and realized they would be in the loop for quite some time,” says Pomerleau. “So the best we could do in the machine learning community would be to help them do their jobs.”
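As a rough illustration of that matching step, and emphatically not the challenge’s actual models, here is a sketch that uses simple word overlap to flag coverage resembling a claim a human fact-checker has already tagged as false. The tagged claim, the incoming headlines, and the threshold are all made-up values for the example.

```python
# A sketch of the human-in-the-loop matching step: a fact-checker tags one
# claim as false, and the code flags other articles that look like they repeat
# it. Jaccard word overlap and the 0.5 threshold are illustrative assumptions.

def similarity(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

tagged_as_false = "famous actor dies in car crash aged 54"

incoming_headlines = [
    "famous actor dies in shock car crash aged 54",
    "actor spotted filming new movie in london",
]

for headline in incoming_headlines:
    score = similarity(tagged_as_false, headline)
    verdict = "flag for review" if score > 0.5 else "leave alone"
    print(f"{score:.2f}  {verdict}: {headline}")
```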
This seems to be Facebook’s preferred approach. For the Italian elections this year, for example, the company hired independent fact-checkers to flag fake news and hoaxes. Problematic links weren’t deleted, but when shared by a user they were tagged with the label “Disputed by 3rd Party Fact Checkers.” Unfortunately, even this approach has problems, with a recent report from the Columbia Journalism Review highlighting fact-checkers’ many frustrations with Facebook.
The journalists involved said it often wasn’t clear why Facebook’s algorithms were telling them to check certain stories, while sites well-known for spreading lies and conspiracy theories (like InfoWars) never got checked at all.
However, there’s definitely a role for algorithms in all this. And while AI can’t do any of the heavy lifting in stamping out fake news, it can filter it in the same way spam is filtered out of your inbox. Anything with bad spelling and grammar can be knocked out, for example, as can sites that rely on imitating legitimate outlets to entice readers. And as Facebook has shown with its targeting of Macedonian accounts “that were trying to spread false news” during the special election in Alabama, it can be relatively easy to target fake news when it’s coming from known trouble spots.
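For a sense of how simple those spam-style signals can be, here is a hedged sketch of one such heuristic: flagging domains that closely imitate a legitimate outlet’s address. The outlet list, the lookalike examples, and the similarity threshold are assumptions chosen for illustration, not anything Facebook has described using.

```python
# A sketch of a lookalike-domain heuristic, in the spirit of a spam filter.
# The outlet list and the 0.85 threshold are illustrative assumptions only.
from difflib import SequenceMatcher

KNOWN_OUTLETS = ["nytimes.com", "theguardian.com", "washingtonpost.com"]

def looks_like_imitation(domain: str, threshold: float = 0.85) -> bool:
    """Flag domains that closely resemble, but don't exactly match, a known outlet."""
    for outlet in KNOWN_OUTLETS:
        ratio = SequenceMatcher(None, domain, outlet).ratio()
        if domain != outlet and ratio >= threshold:
            return True
    return False

for d in ["nytirnes.com", "theguardian.com", "example-blog.net"]:
    print(d, "->", "suspicious lookalike" if looks_like_imitation(d) else "ok")
```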
Experts say, though, that is the limit of AI’s current capabilities. “This kind of whack-a-mole could possibly help filtering out get-rich-fast teenagers from Tbilisi, but is unlikely to affect consistent but large-scale offenders like InfoWars,” Mor Naaman, an associate professor of information science at Cornell Tech, tells The Verge. He adds that even these simpler filters can create problems. “Classification is often based on language patterns and other simple signals, which may ‘catch’ honest independent and local publishers together with producers of fake news and misinformation,” says Naaman.
And even here, there is a potential dilemma for Facebook. To avoid accusations of censorship, the social network should be open about the criteria its algorithms use to spot fake news; but if it’s too open, people could game the system and work around its filters.
For Amanda Levendowski, a teaching fellow at NYU Law, this is an example of what she calls the “Valley Fallacy.” Speaking to The Verge about Facebook’s AI moderation, she suggests this is a common mistake, “where companies start saying, ‘We have a problem, we must do something, this is something, so we must do this,’ without carefully considering whether this could create new or different problems.” Levendowski adds that despite these problems, there are plenty of reasons tech firms will continue to pursue AI moderation, ranging from “improving users’ experiences to mitigating the risks of legal liability.”
These are surely temptations for Zuckerberg, but even so, leaning too hard on AI to solve Facebook’s moderation problems would be unwise, and not something he would want to explain to Congress next week.