If you searched Google for information on the gunman immediately after the recent mass shooting in Texas, you would have seen what Justin Hendrix, the head of the NYC Media Lab, called a “misinformation gutter.” A spokesperson for Google later gave Gizmodo a statement that placed the blame squarely on an algorithm:
"The search results appearing from Twitter, which surface based on our ranking algorithms, are changing second by second and represent a dynamic conversation that is going on in near real-time. For the queries in question, they are not the first results we show on the page. Instead, they appear after news sources, including our Top Stories carousel which we have been constantly updating. We’ll continue to look at ways to improve how we rank tweets that appear in search."
In other words, it was an algorithm — not a human making editorial decisions — that was responsible for this gaffe. But as Gizmodo’s Tom McKay pointed out, this kind of framing is deliberate, and Google, Twitter and other platforms use it frequently when problems arise.
He writes: “Google, Twitter, and Facebook have all regularly shifted the blame to algorithms when this happens, but the issue is that said companies write the algorithms, making them responsible for what they churn out.”
Algorithms can be gamed, algorithms can be trained on biased information, and algorithms can shield platforms from blame. Mike Ananny puts it this way:
By continually claiming that it is a technology company — not a media company — Facebook can claim that any perceived errors in Trending Topics or News Feed products are the result of algorithms that need tweaking, artificial intelligence that needs more training data, or reflections of users. It claims that it is not taking any editorial position.
Platforms rely on these algorithms to perform actions at scale, but algorithms at scale also become increasingly inscrutable, even to the people who wrote the code. In her recent TED Talk about the complexity of AI, Zeynep Tufekci points out that not even the people behind Facebook’s algorithms truly understand them:
We no longer really understand how these complex algorithms work. We don't understand how they're doing this categorization. It's giant matrices, thousands of rows and columns, maybe millions of rows and columns, and not the programmers and not anybody who looks at it, even if you have all the data, understands anymore how exactly it's operating any more than you'd know what I was thinking right now if you were shown a cross section of my brain. It's like we're not programming anymore, we're growing intelligence that we don't truly understand.
This is problematic for journalism. We cannot write about what we cannot see, but we increasingly write about what we think these algorithms will surface. That surfacing generates eyeballs, eyeballs generate clicks, and clicks generate a share of an ever-shrinking pool of digital ad dollars (the majority of which now go to Facebook and Google).
And despite our (perhaps) growing unease with these platforms, we still rely on them for distribution. In their excellent report on the convergence between publishers and platforms, Emily Bell and Taylor Owen write that “A growing number of news organizations see investing in social platforms as the only prospect for a sustainable future, whether for traffic or for reach,” echoing what Franklin Foer recently wrote in The Atlantic about The New Republic’s increasing dependency on these platforms — and what their algorithms might surface: “Dependence generates desperation — a mad, shameless chase to gain clicks through Facebook, a relentless effort to game Google’s algorithms. It leads media outlets to sign terrible deals that look like self-preserving necessities: granting Facebook the right to sell their advertising, or giving Google permission to publish articles directly on its fast-loading server. In the end, such arrangements simply allow Facebook and Google to hold these companies ever tighter.”
This reliance on algorithmic click-chasing was the basis for a recent essay by Maciej Ceglowski, who runs the bookmarking site Pinboard and frequently writes about socio-technological issues. He traces one story that grew out of Amazon’s “frequently bought together” algorithm and then spread quickly to other media outlets, despite little evidence that it was true. The justification for republishing, he wrote, was often that other news outlets had already reported it. He writes: “Together with climate change, this algorithmic takeover of the public sphere is the biggest news story of the early 21st century. We desperately need journalists to cover it. But as they grow more dependent on online publishing for their professional survival, their capacity to do this kind of reporting will disappear, if it has not disappeared already.”
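For readers unfamiliar with how such a feature works, here is a minimal sketch in Python of one common way to build a “frequently bought together” recommendation: count item pairs that co-occur in the same order. Amazon’s actual system is not public; the approach, the sample baskets and the bought_together helper below are illustrative assumptions, not anything from Ceglowski’s essay.

```python
from collections import Counter
from itertools import combinations

# Hypothetical order data; each set is one customer's basket.
orders = [
    {"camera", "tripod"},
    {"camera", "memory card"},
    {"camera", "tripod", "memory card"},
    {"tripod", "carrying case"},
]

# Count how often each pair of items appears in the same order.
pair_counts = Counter()
for order in orders:
    for a, b in combinations(sorted(order), 2):
        pair_counts[(a, b)] += 1

def bought_together(item, top_n=3):
    """Items most often purchased alongside `item`, best-scoring first."""
    scores = Counter()
    for (a, b), n in pair_counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return scores.most_common(top_n)

print(bought_together("camera"))
# e.g. [('tripod', 2), ('memory card', 2)]
```

The mechanism has no notion of why two items co-occur, which is how innocuous individual purchases can surface pairings that look sinister when a reporter screenshots them.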
Our conversation is below.
You write that "the real story in this mess is not the threat that algorithms pose to Amazon shoppers, but the threat that algorithms pose to journalism." But it seems like there's an additional threat here – an increasing reliance on algorithms and tools that show what’s performing well on competitors’ social media sites as a way to find story ideas.
I'm curious to get your thoughts on how so many news organizations wind up with the same piece on the same topic — and how we move away from click-chasing when so many are reliant on clicks to pay the bills.
It’s important not to confuse individual tools with the broader structural problems in journalism that make people feel those tools are necessary. The tools you mention (and many others like them) are a symptom of the broader problem, which is that social networks have become the primary distribution channel for news. News organizations have become wholly dependent on Facebook and Google in particular for online distribution, and for whatever revenue they still get, but the interests of those two companies do not align with those of journalists or the public. The problem is the casino, not the specific slot machines in it.
Are there ways to build reward mechanisms or incentives into search/social to combat this mentality of click-chasing? If it's not going to come from the publishers, could it come from elsewhere? Tech workers? Who has the leverage here to create change?
There is no technical solution for this problem. Publishers need to find a way to isolate journalists from the immediate pressure of click chasing, while finding a way to keep the lights on that is less reliant on Google and Facebook. In particular, we need some way to insulate working journalists from the pressure to pile on to stories they don’t have adequate time to research, and the pressure to evaluate their success by clicks and virality on a per-story basis.
The only people who have the leverage to change the rules of the casino are tech workers. Even a small group of specialized workers at Google or Facebook, making a concerted effort, would have enough clout to push through policies that would improve the state of online journalism. But they have so far declined to use this power, even for modest workplace changes in their own self-interest.
Who do you think is covering algorithms well and accurately? How would you like to see them covered?
Just as the story of the industrial revolution was not really about valves and pistons, what’s happening around machine learning is not tied to the details of the implementation. The real story of algorithms and machine learning is how these technologies are affecting communities and societies. The algorithm story only makes sense when it’s rooted in a larger context of power relationships between people.
Journalists should not be intimidated by the technical content of machine learning, which despite its name and reputation is not prohibitively complex. The distinguishing feature of machine learning in our era is that its implementation is simple but inscrutable, even to the programmers who build it. You feed massive data sets into a fairly simple mathematical contraption, and get useful results. But you can’t point to why you got those results, any more than a brain surgeon can point to individual thoughts in your head.
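To make that concrete, here is a minimal sketch in Python using scikit-learn (an illustrative choice; Ceglowski names no particular tools). It feeds synthetic data into a small neural network, gets useful results, and then shows that the model’s “knowledge” is just weight matrices, with no entry you can point to as the reason for any single prediction.

```python
# A minimal sketch: simple contraption in, useful results out,
# but the trained model is only matrices of opaque numbers.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a "massive data set".
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500, random_state=0)
model.fit(X, y)

print("accuracy on training data:", model.score(X, y))  # useful results

# The learned "knowledge" is nothing but weight matrices. You can
# print them, but you cannot point to the entry that explains why
# any one example was classified the way it was.
for i, w in enumerate(model.coefs_):
    print(f"layer {i} weight matrix shape: {w.shape}")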
Two key points I wish journalists would remember: first, for machine learning to be effective, it has to train on truly enormous amounts of user data. This drives a dynamic of restless and aggressive surveillance.
Second, when these techniques fail, they fail in ways that are not human. AI is simple math, there's no room in it for ethics, or common sense, or empathy. Some of the egregious errors we will catch, but a lot of them we will not. These systems, which would require an extraordinary level of oversight to run safely, are being deployed across the planet by naive Stanford grads with almost no human supervision, for profit, and have the potential to profoundly reshape our society.
Is it even possible to cover these systems well and accurately? If material cannot be found, it basically doesn't exist — so where's the line, and how do journalists strike that balance?
Journalists have painted themselves into an ethical corner, because their work is hostage to the same surveillance and distribution system that is destroying their profession. They deserve some of the blame for this.
Moreover, Google has shown itself to be aggressive towards criticism that starts to hit home. But journalists are supposed to be good at confronting power. Weaning media outlets off of Facebook, and off of online ads as the sole source of revenue, will create some space for more assertive and skeptical coverage of that world.
It's almost like we need a technical ombudsman to assess these things. I'm curious what you think of that — or what role(s) within a newsroom should be taking this on. It's an ethical question, but usually the people who think about ethics in newsrooms come from a news background, not a technical one.
What you need is just a regular ombudsman, vigilant against the corrosive effects of click chasing, rushed deadlines, and herd behavior.
The technical machinery of the online ad casino is not what's vital here. What’s important is countering its effects. Witness just the last week of news, where a humanitarian crisis in Puerto Rico got almost no front-page coverage because news organizations (correctly) determined that they would get the most clicks by covering a feud between the President and the NFL. At one point, the NYT had eight NFL feud stories on its front page, while millions of Puerto Ricans went without water or power.
It doesn’t take someone steeped in algorithms to point out that this is a shameful dereliction of journalistic duty. It just takes an independent voice within the organization unafraid to speak up. But the New York Times just got rid of its ombudsman role.
How do we effectively cover the technical organizations that sponsor our conferences?
If you want to bite the hand that feeds you, it’s good to have somewhere else to eat. Stop letting them sponsor your conferences! Switch to a cheaper hotel, have the conference in Dayton instead of New York, do whatever you have to do to wean yourselves off that sweet, sweet tech money. I can pass the hat for this among tech workers, if you think it will help.
But stop cozying up to the people destroying your profession. I have been to these news conferences and watched Google baldly assert that it lacks the resources to prune fake news, while prominent journalists nodded along. It was embarrassing to even watch.
Focus less on the technology (where it’s easy to fall prey to bullshit) and much more on the power relationships and which way the money flows. Talk to technical experts and cultivate technical sources. Don’t let the big tech companies intimidate you.
[Disclosure: Poynter receives funding from Google News Lab.]
What's the answer here? Like, what do we do? Reporters aren't going to be given more time to write up stories; they need to file, and they have CrowdTangle and Chartbeat and Twitter Trends and other ways to see what's resonating — what's the way forward?
Recognize that you are going down a path of no return for your profession. The online ad racket thrives on novelty, and you will be forever chasing new formats, with diminishing returns, while upstarts with no ideological scruples like Breitbart eat your lunch. Stop looking at which of your stories ‘resonate’, based on clicks and views, and find an alternative way to evaluate your work that is not so granular. Otherwise you’re doomed. Doooomed.
Evaluate reporters based on their aggregate body of work, not on each individual story. Burn your tracking codes. Give people a way to pay you directly, rather than through ads. Stop enabling surveillance.
Reporters have been busy, harried and on deadline since the first newspaper. That’s part of the professional identity. But you’re also prone to self-pity in a way that doesn’t do credit to your industry. Get out there and bust heads.
Switching gears for a second, I'm curious how you come up with the longform essay topics on your blog. You’ve written about Antarctica and scurvy and flight and infrastructure projects, and they're all incredibly well-researched and varied. What inspires you to take something on?
I can’t walk three steps without having a deep insight. I remember things better when I write about them, and I also enjoy long-windedly explaining things to people. So when I get particularly obsessed about a topic, it's fun for me to write it up.
You talk about algorithms and the need for better reporting — but I also see a need for better coverage of numbers in general. (You pointed this out more than a decade ago.) If you were going to design a curriculum for journalists to follow, what courses would you include and what skills would you want them to have? (And the all-important question: How does this then spread to smaller newsrooms, where resources are even more strapped…?)
First off, journalists need to know how to do practical security. In conjunction with Tech Solidarity, I've been offering a 50-minute security briefing for working journalists, and I’m happy to offer it to any reporter, for free. People need to be safe.
If I could teach journalists three mathy things, they would be these:
1. Basic probability and statistics.
2. The concepts of one-way functions and cryptographic signatures, which are at the heart of secure messaging and e-commerce (see the sketch after this list).
3. The basic concepts that underpin machine learning. I realize this sounds daunting, but I am an art major and absorbed this stuff.
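As a minimal sketch of item 2, using Python’s standard hashlib (my choice of illustration, not Ceglowski’s): a one-way function is trivial to compute forward and infeasible to reverse, which is the property that cryptographic signatures and secure messaging build on.

```python
# A one-way function in action: computing the digest is easy,
# but recovering the message from the digest has no known shortcut.
import hashlib

message = b"Draft filed at 5 p.m."  # hypothetical example input
print(hashlib.sha256(message).hexdigest())

# Change a single character and the digest bears no resemblance
# to the first one -- this is what lets a signature detect tampering.
print(hashlib.sha256(b"Draft filed at 6 p.m.").hexdigest())
```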
We need better online explainers for this, and this is something the tech industry can help with. I’m hoping to get a deeper understanding of machine learning this year so I can write a decent, intellectually honest explainer for journalists and other interested people outside the field.