News organizations have gotten themselves into a tough spot. After years of not valuing page load time, social platforms have begun implementing systems that either host articles directly (Facebook Instant Articles) or dictate technical standards for how story elements are structured (Google AMP). The result is largely the same: news organizations that use these platforms see improvements in load times, but they lose control over how they distribute and present their journalism.
Although performance is an area that the news industry should dedicate more resources to, using these platforms to solve that problem raises ethical questions.
This issue is most visible with interactive stories: articles that ask readers questions about themselves and use that data to personalize the resulting narrative. If platforms host these articles, they could capture reader responses and, for example, add that data to an advertising profile or sell it to a third party. If a news organization knows or suspects this is a possibility, it could chill a newsroom’s output of these types of stories, which raises questions of press freedom.
For example, the New York Times ran an interesting article on jury selection that asked the reader questions about their personal values and then showed how those views might affect whether they’d be stricken from a jury. The story specifically states at the top “Your responses will not be stored.” But if this story ran within, for instance, a future Facebook Instant Articles quiz component, what guarantees would news organizations and readers have that Facebook would honor that promise?
I’ve worked on stories that ask about readers’ experiences with abortion and seen others that ask about trust levels with law enforcement. These are worthwhile pieces of journalism and a format we should keep experimenting with. But with platforms hosting the story code, could answers of “I am pro-life” or “I don’t trust the police” be sold to a political campaign, or collected by law enforcement and added to the predictive “threat score” a jurisdiction keeps on that individual?
Red flags should go up any time one creates structured data around what people believe or value.
Reader answers in these stories are not guaranteed to be truthful, either. Readers could click a button that doesn’t reflect their views simply because they’re curious to see how the interactive reacts. A computer or a law enforcement agent, however, would find it hard, if not impossible, to tell the difference.
Could a platform say it won’t collect data on certain topics? Yes, certainly. And companies such as Google and Facebook already have guidelines on what data they will or will not target against. But a newsroom’s sense of what constitutes sensitive data might not align with a platform’s policies. And even with such a policy in place, the online data-collection ecosystem is opaque, and auditing it would be beyond the resources of almost all newsrooms.
As a news organization, what is our responsibility to run or not run stories that we suspect the platform will mine for information that could be used against the reader’s interests?
How likely is a platform to routinely store this data, given how many thousands of articles are published each day? I’m not sure, but here are a few possible scenarios using some of the examples above:
- The social platform mines answers from only high-traffic pieces. These types of quizzes are some of the most popular content news organizations publish. The Times dialect quiz, for example, was one of the most-visited pieces of content the paper ever ran.
- Law enforcement is interested in a few stories, or in a group of individuals’ specific responses.
- An advertiser is interested in a segment of the population (gun owners, or women who have had abortions, etc.) and hasn’t previously found a reliable way to get that data.
In any of these scenarios, it’s plausible that the platform could assign an engineer for a day to create a simple tagging system. Or, if platforms create “quiz” components, those components could be built to return structured data automatically.
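To make the point concrete, here is a minimal sketch of how little code such a tagging system requires. Everything here is invented for illustration: the function, field names, and identifiers are hypothetical, and no real platform API is depicted.

```python
# Hypothetical sketch: turning a reader's quiz answer into a structured,
# profile-ready record. All names are invented for illustration; this is
# not any platform's actual code.
import json


def tag_response(user_id, article_id, attribute, value):
    """Attach a reader's quiz answer to a structured profile record."""
    return {
        "user": user_id,          # platform account identifier
        "article": article_id,    # which interactive the answer came from
        "attribute": attribute,   # e.g. "trusts_law_enforcement"
        "value": value,           # the reader's clicked answer
    }


# A single click becomes a queryable data point.
record = tag_response("u123", "jury-selection", "trusts_law_enforcement", "no")
print(json.dumps(record))
```

The sketch is the whole argument in miniature: once answers flow through platform-hosted components, a few lines of serialization are all that separates an anonymous click from a stored, attributable belief.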
If page load speed is the problem, promoting technical guidelines and best practices would be a constructive alternative. Common standards would help make the web faster for our readers and do so without changing a core piece of the publishing process.
Many newsrooms are getting on board with these platforms, but this relationship is not just a new distribution model or a business decision. It could affect which stories journalists pursue and how they tell them.
Michael Keller is a data journalist for Bloomberg News and a former Research Fellow at the Tow Center for Digital Journalism at Columbia University. His work has appeared in the Washington Post, TheAtlantic.com, Al Jazeera America and Newsweek. Follow him at @mhkeller.