From transcription to trust: How AI is transforming news production

A conversation with Dr. Nicholas Diakopoulos

Though technological innovation moves at lightning speed, artificial intelligence (AI) remains in a fledgling phase, and news organizations are understandably grappling with what it means for both their operations and their long-term sustainability. Nicholas Diakopoulos, Ph.D., is dedicated to finding those answers. He’s a professor of communication studies and computer science at Northwestern University and the director of the Computational Journalism Lab (CJL). E&P asked Dr. Diakopoulos about his work on AI, automation and algorithms for news production, and about some of the AI questions we hear most often from readers.

E&P: AI has become somewhat synonymous with generative AI and chatbots, but AI technology for journalism and news production is a little broader than that category. What are the myriad ways you’ve observed AI leveraged by Northwestern University students or newsrooms around the country?

Diakopoulos: At this point, it might be easier to talk about where newsrooms are not using AI, such as making ethical decisions, maintaining source relationships or conducting face-to-face interviews. Otherwise, it’s quite pervasive and is used throughout the gathering, production and distribution processes. One use case that folks have almost come to take for granted is automated transcription, yet sophisticated AI is what drives the ability to accurately recognize and transcribe a whole range of human voices into text. Machine learning is used to optimize headlines, guide the placement of stories on homepages and moderate online discussion forums. AI-based pattern recognition and classification have augmented journalists’ abilities, expanding the scope and scale of what’s possible in investigations. AI can also enhance fact-checking efforts and support the creative process by helping to find an angle on a story.
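(Editor’s note: Dr. Diakopoulos does not point to any particular transcription tool here. As a purely illustrative sketch, automated transcription of the kind he describes can be done in a few lines of Python with the open-source Whisper speech-to-text model; the audio file name below is hypothetical.)

```python
# Illustrative sketch only: automated transcription with the open-source
# Whisper model (pip install openai-whisper; requires ffmpeg).
# The audio file name is a placeholder, not something from the interview.
import whisper

model = whisper.load_model("base")          # small, general-purpose speech model
result = model.transcribe("interview.wav")  # speech-to-text in a single call
print(result["text"])                       # the full transcript as plain text
```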

E&P: When newsrooms use AI technologies to produce journalism, how much disclosure to readers/viewers/listeners is ethically required? In your opinion, how should we message this to the public?

Diakopoulos: We are actively researching this area and just published a synthesis of related research on the Generative AI in the Newsroom (GAIN) blog. Ethically, journalists should be able to explain how they know what they know, which may include disclosing the use of AI. There is at least some demand from the public for forthright labeling of AI-generated content, but there’s a lot we don’t know about how to do that effectively. Is it enough to put a label that says “AI,” or does that label need to include a more detailed explanation? How should the label be designed and conveyed differently for text versus image, audio or video? Should a label elaborate on all of the different ways AI was used, or is labeling low-stakes uses unnecessary? Thinking about how human behavior and expectations might evolve could be helpful. If, in 10 years, 98% of content on the internet is AI-generated, will anyone still expect labels? Sure, but maybe they will be labels like “Human Written.”

E&P: Trust in news is an ongoing concern for all news media outlets. From the data you’ve seen, what does the public generally think of AI being used to create news, and could it potentially further harm that level of trust?

Diakopoulos: Based on surveys the Reuters Institute conducted in several countries earlier this year, the public tends to be more pessimistic than optimistic about generative AI for news and journalism. Respondents were mostly uncomfortable with content made entirely by AI but were more comfortable if the AI was overseen by a journalist. Research published by the BBC has corroborated this. Respondents to the Reuters survey also tended to be less comfortable with AI being used for hard news topics, such as international affairs and politics, than for soft news like sports. The public seems to want to know that someone (a person!) oversees what they’re reading, so reassuring the audience on this point could help bolster their trust in the outlet.

E&P: We’ve seen mounting contention between AI developers and news media businesses, which would like to be compensated when their content is used to train large language models (LLMs). A growing list of news publishers has chosen to sign fixed-term licensing agreements, yet the terms are often not disclosed. Do you know how “fair” those agreements may be to publishers, whether in their compensation structures or in how equitably those partnerships are distributed across large and small news publishers, nonprofits and niche publishers?

Diakopoulos: Speaking as someone who hasn’t seen the specifics of any of these agreements, it’s hard to say whether they are fair. If you buy the fair use defense for training on copyrighted content, you might argue it’s nice that news publishers get anything at all. On the other hand, news content is an essential ingredient that tech companies realize they need to license in order to incorporate it into their products. I’m skeptical of agreements that offer technical consulting from AI companies to news media, as that doesn’t seem likely to spur innovation. Smaller publishers will need to band together if they want to get a piece. Personally, I think we’d all be better served if news were treated as a public good with public support, so that less energy was focused on making news organizations profitable. Who could we tax to support that?

E&P: Data journalism has become an important skill set for aspiring journalists, and we’ve seen J-schools modify and evolve their curricula to train students in it. Do you see AI as a subset of the data journalism discipline, or is AI journalism becoming a distinct, dedicated discipline in the field?

Diakopoulos: Developing and using AI effectively in a newsroom context requires a new set of skills, but it builds on core data literacies that are central to data journalism. I like the term “computational journalism” because it encapsulates all of the above. However, I do think there can be specialists within that broad rubric who focus on, for instance, machine learning, generative AI, or data collection and analysis. Modern newsrooms depend not only on traditional roles like editors and reporters but increasingly on other kinds of specialists, including interface designers, data scientists, software developers, product managers and others. Still, reporters and editors need to develop enough expertise to critique and assess how a response from something like ChatGPT could go wrong, and to collaborate with folks in these other specialized roles.

E&P: For our readers who want to smartly but perhaps cautiously embrace AI, what newsroom processes might they consider testing as a jumping-off point?

Diakopoulos: Behind-the-scenes uses of AI with appropriate journalistic oversight are the lowest-risk way to get started. Tasks like transcription, summarizing meeting transcripts or headline ideation are sure to speed up the production process. Remember to revisit your newsroom’s ethics guidelines around the use of AI and explore the range of resources for effectively prompting tools like ChatGPT. Have fun as you experiment, and explore how these tools could help your specific workflows and, more importantly, provide value to your audience and community.
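(Editor’s note: for readers wondering what “summarizing transcripts of meetings” might look like under the hood, here is a minimal illustrative sketch using the OpenAI Python SDK. The model name, file name and prompt wording are assumptions made for the example, not recommendations from Dr. Diakopoulos, and any output would still need journalistic verification.)

```python
# Illustrative sketch only, not a workflow endorsed in the interview:
# summarizing a meeting transcript with the OpenAI API (pip install openai).
# The model name, file name and prompt wording are assumptions for this example.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("city_council_transcript.txt") as f:
    transcript = f.read()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are an assistant helping a reporter. "
                                      "Summarize accurately; do not invent facts."},
        {"role": "user", "content": "Summarize the key decisions and notable quotes "
                                    "in this meeting transcript:\n\n" + transcript},
    ],
)

print(response.choices[0].message.content)  # a draft summary for a journalist to verify
```

The system prompt’s instruction not to invent facts reflects the kind of journalistic oversight discussed above; the output is a starting draft for a reporter to check against the transcript, not publishable copy.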

Gretchen A. Peck is a contributing editor to Editor & Publisher. She's reported for E&P since 2010 and welcomes comments at gretchenapeck@gmail.com.
