For the modern media company, each reader represents a potential buffet of user information. IP addresses, browsing preferences and demographic data all are fair game for publishers looking to build detailed profiles of their audiences. Edward Snowden, the former military contractor who laid bare the extent of NSA surveillance, described a bleak state of affairs for Internet security in a recent interview, noting that “information is being stolen” by companies and governments with every click.
News organizations generally rely on user data to tell them which stories are being viewed, who’s viewing them and for how long. But this practice poses a problem for outlets that want to safeguard their readers’ privacy without giving up information that helps them understand their audience.
That’s precisely the dilemma that The Intercept faced throughout much of its infancy. Co-founded by national security journalists Laura Poitras, Glenn Greenwald and Jeremy Scahill, The Intercept has taken steps to ensure the anonymity of its sources. But the outlet still wrestled with how to measure the tendencies of its readers without compromising their identities.
“It’s important because we want to live our values,” said Ryan Tate, the deputy editor of The Intercept. “The Intercept has an editorial voice that’s pro-privacy and very concerned with surveillance. And we just didn’t feel like it would be appropriate for us to install a default tracking system like all the ones that are out there that can be very invasive to people’s privacy. We thought we had an obligation to try to find a solution that would allow us to practice what we preach.”
The solution? Create a workaround where The Intercept receives a picture of audience activity that doesn’t disclose information that could be used against its readers. That idea, which has been in the works for nearly a year, comes to fruition this month as The Intercept rolls out a new audience measurement system in partnership with analytics firm Parse.ly. Tate and Editor-in-Chief Betsy Reed explained the concept’s technical side on The Intercept earlier this month:
Together with Parse.ly, we’ve arrived at a system whereby readers of The Intercept will not directly ping Parse.ly. Instead, they will continue to send web requests to our own servers, which will, in turn, forward some of those requests on to Parse.ly, after stripping out readers’ internet protocol, or IP, addresses. Parse.ly will use these requests to track our readers via random unique identifiers that we generate. It will not be possible for Parse.ly to correlate readers’ visits to The Intercept with their visits to other Parse.ly-enabled sites.
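The forwarding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not The Intercept’s or Parse.ly’s actual code: the function name `build_forwarded_event`, the header list and the event fields are all hypothetical, and a real deployment would need to cover more than this sketch does.

```python
import uuid

# Assumed (not exhaustive) list of headers that can reveal a reader's address.
IP_REVEALING_HEADERS = {"x-forwarded-for", "x-real-ip", "forwarded", "client-ip"}


def build_forwarded_event(url, headers, remote_addr, reader_id=None):
    """Build the record a publisher's own server would pass to an analytics vendor.

    remote_addr is accepted only to make the point that it is never copied
    into the outgoing event. reader_id is a random identifier the publisher
    generates itself (for example, stored in a first-party cookie).
    """
    if reader_id is None:
        # First visit: mint a random identifier with no link to the IP address.
        reader_id = uuid.uuid4().hex

    # Drop any header that could carry the reader's address.
    safe_headers = {
        name: value
        for name, value in headers.items()
        if name.lower() not in IP_REVEALING_HEADERS
    }

    event = {"url": url, "reader_id": reader_id, "headers": safe_headers}
    return event, reader_id
```

Because the identifier is random and issued by the publisher, the vendor can recognize a returning reader on this one site but has nothing in the event that ties that reader to visits elsewhere.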
As a result of the changes, The Intercept can now determine which users have visited the site before, a capability it previously lacked. It can also determine the breakdown of unique visits, regular viewers and “drive-by readers.” Eventually, Tate says, staffers will be able to see how much time readers spend on The Intercept’s various articles.
“But we intentionally know very little beyond that,” Tate said.
In giving its readers additional protections, The Intercept hopes to lead by example and convince other news organizations that analytics can be useful without revelatory information, Tate said. That may prove difficult, he said, because The Intercept couldn’t find any existing models that simultaneously preserved reader privacy and provided the depth of information staffers were looking for. Along the way, Chartbeat, another analytics company, turned the project down.
“There are ways to work with Internet service providers and cloud providers in a way that splits the difference, if you will, between convenience and privacy,” Tate said. “You don’t have to go all the way. We don’t have a perfect setup, but I do hope it serves as inspiration for other providers.”