It was with a measure of existential dread that I tried out a new artificial intelligence tool that threatens to replace the craft I train human journalists to do.
A new feature in Google’s NotebookLM can turn documents and other files into a “Deep Dive” podcast episode. Users have marveled at its uncanny ability to simulate a real human conversation, even when the bot is confronted with its own inanimateness.
But could it do actual journalism?
I decided to test it with one of the stories I enjoyed reporting the most. Years ago, I filed an NPR audio postcard about the encierro txiki, a running of the bulls for kids in Spain. Youngsters who “dream of one day sprinting a hair’s breadth ahead of a thousand-pound charging toro” get to pretend by running from fake bulls on wheelbarrows pushed by grownups.
Now, I fed the bot the audio interviews I did in Spanish with some of the children and their parents, along with background readings. In minutes, it spit out a dialogue between two voices in English — one that sounded eerily like former “Morning Edition” host David Greene and the other an indeterminate female voice.
Their enthusiasm rivaled the banality of their observations. “It’s wild! This town is trying to hold on to its roots, while also embracing, like, the 21st century!” the David Greene imposter said, adding that a ritual that “seems so small on the surface, opens up this huge conversation, well, about everything, really!”
The hosts, if I can call them that, did have good storytelling transitions (“Speaking of generations …”). They identified quotes I had also used in my piece: a mother contrasting Spanish bullfighting to American gun culture and a 13-year-old fulminating that people trying to ban the bullfights “should be banned themselves.”
There was no tape of those actualities, nor did we hear the fight songs. And I like to think that my turns of phrase were more insightful. But is it just a matter of time before the software can produce a sound-rich piece like mine?
To be sure, generative AI is a powerful tool. It can sift through mounds of information and suggest questions, storylines and different points of view. At NPR, we are actively researching AI and discussing its ethical use. When my colleagues at “Planet Money” had ChatGPT, another bot, come up with interview questions based on an academic paper, one of the paper’s authors thought they were so incisive they couldn’t have been AI-generated.
What distinguishes NotebookLM from other AI tools is its audio feature. If anything, it’s a testament to the secret behind the enduring success of radio and, more recently, podcasting: that listening can be a low-stress and enjoyable way of digesting information. The overwrought dialogue between the fake David Greene and his co-host may have sounded like a parody, but I think it’s a form of flattery.
Still, a tool needs a hand to hold it. After writing a book on the skills required to produce news for the ear — based on more than 80 interviews with NPR reporters, producers and hosts — I realize just how much the foundations rest on things only humans can do. Like interacting with other humans. And gaining their trust.
The way a good reporter spends time with an interviewee who’s been through hardship or tragedy and feels their pain before he turns on the microphone, or the way a host imagines she’s talking to her own mother while presenting the news. In public radio, our aim is to do journalism in a way that elicits empathy as it informs.
And whether hosting in a studio, or reporting on the street, a journalist needs to be skilled in the art of news judgment as well as thoughtful conversation: They need to read the room, or the crowd, or the face of an interviewee, and respond in the moment with the right word or gesture or follow-up question.
A computer might be able to help a reporter come up with angles for a story, and even provide leads. But journalism — especially the kind you hear on the radio and in a wide variety of podcasts these days, produced by both public service and commercial media — hinges on verbal conversations with real people: in their homes, their workplaces or on the street, as well as over Zoom or on the phone. And the human brain is the only neural network that can intuit where such a conversation might take you, not to mention read between the lines of what is said.
Without the interviews I did in the field, NotebookLM could not have produced that episode. Even with them, the artificial intelligence yielded little more than a collection of inflated bromides.
Picture the scene as I did those interviews: me sticking out like a sore thumb with my microphone and headphones, the lone adult surrounded by a pack of tots squealing with delight as they whacked stuffed bullheads. Until someone invents a bot that can hack that, and the rest of what human journalists can do, leave the reporting to the humans.