Above: Illustration by Izanbar/123RF.com
BitDepth#1405 for May 08, 2023
Amid all the conversation about artificial intelligence (AI), there hasn’t been much consideration of what goes into generating its output.
To clarify, AI is not intelligent in the human understanding of the word. AI does not create.
When they aren’t responding to prompts, tools like ChatGPT and Stable Diffusion are feeding on the internet’s store of human knowledge.
Of course, if AI is eating our knowledge, we are consuming what comes out the other end.
GIGO, or Garbage In, Garbage Out, remains true in computing.
AI’s gatekeepers work hard to improve the rubbish-to-reasonable ratio in AI output.
What ChatGPT “writes” is really a recasting of the original thoughts and research of human writers, while AI imaging tools reinterpret and remix images that have been consumed en masse by computers.
That’s caused some untidiness to emerge, as manufactured images show up with stock photo logos and watermarks still visible, and text responses betray their source material.
What happens to all the information that AI gorges on isn’t easily explained. The latent space where all this data is logged, sorted and digested exceeds normal human perception, and AI is still very much in its infancy as an applied technology.
But it’s becoming clear that some perceptual distinction needs to be made between work that is the result of human thought and work generated by AI.
Daisy Veerasingham, CEO of the Associated Press, noted that the datasets that feed AI are sourced from professionally created work, and that mass ingestion offers neither reward nor attribution to the original authors.
“If your biggest monetisation model is driven by advertising and search and now you don’t have to go back to a content originator for those results then that’s a fundamental other pressure on that business,” Veerasingham said at an INMA panel on April 4 in Italy.
“We need the money to keep journalism funded in the world at large.”
And it’s not just media that’s worried. Universal Music Group quickly issued take-down notices to block a song with AI-generated vocals resembling a duet by Drake and The Weeknd.
Take-downs don’t really stick, though, so search for Heart on My Sleeve. Or better yet, Kanye West AI covers.
All this activity has pushed Adobe’s Content Authenticity Initiative (CAI) into higher gear.
There is no reliable way to tag images, web content or text so that an author can opt out of being harvested by AI bots or demand attribution.
Image metadata standards have been around for decades without slowing piracy or encouraging author authentication.
Existing image tags and indicators on social media are controlled by the platforms and applied by users, not creators.
Adobe’s CAI began at the other end of the problem, by attempting to create a framework in which its image editing tools track how an image is made while embedding that provenance information into the file.
The company has also launched its own image generation AI, Firefly, and claims to have trained it on images from its own stock collection with the consent of the creators whose work is offered there.
CAI promises that images tagged with its information will allow a viewer to "understand more about what’s been done to it, where it’s been, and who’s responsible."
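To make the idea concrete, here is a minimal sketch in Python of what file-level provenance could look like in principle. It is a simplified, hypothetical illustration rather than the actual CAI/C2PA manifest format: a record of the editing tool, the edits applied and a cryptographic hash of the image data, which a viewer could later check against the file they received.

```python
import hashlib
import json
from datetime import datetime, timezone

def build_provenance_record(image_bytes: bytes, tool: str, edits: list[str]) -> dict:
    """Build a simplified provenance record for an image.

    Illustrative only: the real CAI/C2PA approach embeds signed manifests
    in the file itself, but the core idea is the same -- bind a description
    of how the image was made to a hash of its exact bytes.
    """
    return {
        "content_sha256": hashlib.sha256(image_bytes).hexdigest(),
        "created_with": tool,
        "edits": edits,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }

# Hypothetical example: a cropped, colour-corrected image from an imaginary editor.
record = build_provenance_record(b"\x89PNG...", "ExampleEditor 1.0", ["crop", "colour-correct"])
print(json.dumps(record, indent=2))
```

A viewer who recomputes the hash of the file and finds that it matches the record can be reasonably confident the described history belongs to that exact image; if the bytes have been changed, the mismatch is immediately visible.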
Beyond the concerns about attribution and unauthorised harvesting of intellectual property is the impact of these tools on the collective understanding of what is real and what is fake.
Humans at least have to think before making up non-existent facts. Tools such as ChatGPT have proven able to produce compelling facsimiles of reporting and research in a matter of seconds, without regard for petty concerns like accuracy and fairness.
At stake for journalism is nothing less than the pact between audience and news packager that the reporting and commentary on offer is true to the best knowledge of its creators.
Until now, you either created work, copied it or remixed it, but what AI is doing isn’t any of those things, exactly.
The presumption of authorship is being tested through the intervention of technology tools, of which AI is only one.
It’s not hard to imagine a future, much closer than we think, in which there is trusted information, clearly unreliable information and a vast wasteland between them where disinformation and uncertainty roam freely.
Journalism cannot rely on respectful tradition and must forge a new pact of trustworthiness with its audience.
AI is only going to get better at what it does, so it shouldn’t be surprising that a business that depends on verifiable and attributable truth as a marketing position must raise its game.
No single media house can do this on its own.
The change will demand a total commitment by media houses to become more transparent about where information comes from (which isn’t the same as revealing sources) and how it was handled to create the reporting or visuals (The News Provenance Project).
The only sensible response to distrust of journalism is for the practice itself to embrace greater transparency and open verification as part of the publication and broadcast process.
Some of these changes will be enabled by technology. Efforts at signing creative work published online may tap into blockchain technology to create immutable links to its provenance (TechNewsTT posts have been written to the blockchain since 2021).
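As a rough sketch of that principle (and not a description of the specific service TechNewsTT uses, which isn’t detailed here), anchoring a post to a blockchain amounts to publishing a cryptographic fingerprint of its text. Anyone can later recompute the fingerprint and confirm that the article hasn’t been altered since it was recorded:

```python
import hashlib

def fingerprint(article_text: str) -> str:
    """Return a SHA-256 fingerprint of an article's text.

    Writing this value to a blockchain (or any public, timestamped,
    append-only log) lets readers recompute the hash later and confirm
    the text is unchanged since publication.
    """
    return hashlib.sha256(article_text.encode("utf-8")).hexdigest()

published = "Amid all the conversation about artificial intelligence (AI)..."
print(fingerprint(published))  # record this value at publication time
# Verification: recompute fingerprint(current_text) and compare with the recorded value.
```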
But some of these changes will be difficult, demanding changes to news-gathering and a greater emphasis on audience involvement in the making of journalism itself.