Posts about journalism

Gibberish from the machine


I’m honored that Germany’s Stern asked me to write about AI and journalism for a 75th anniversary edition. Here’s a version prior to final editing and trimming for print and translation. And I learned a new word: Kauderwelsch (“the variety of Romansch spoken in the Swiss town of Chur (Kauder) in canton Graubünden”) means gibberish.


We have Gutenberg to blame. It is because of his invention, print, that society came to think of public discourse, creativity, and news as “content,” a commodity to fill the products we call publications or lately websites. Journalists believe that their value resides primarily in making content. To fill the internet’s insatiable maw, reporters at some online sites are given content quotas, and their news organizations no longer appoint editors-in-chief but instead “chief content officers.” For the record, Stern still has actual editors, many of them.

And now here comes a machine — generative artificial intelligence or large language models (LLMs), such as ChatGPT — that can create no end of content: text that sounds just like us because it has been trained on all our words. An LLM maps the trillions of relationships among billions of words, turning them and their connections into numbers a computer can calculate. LLMs have no understanding of the words, no conception of truth. They are programmed only to predict the next most likely word to occur in a sentence.

A New York lawyer named Steven Schwartz had to learn his lesson about ChatGPT’s factual fallibility the hard way. In a now-infamous case, attorney Schwartz asked ChatGPT for precedents in a lawsuit involving an errant airline snack cart and his client’s allegedly injured knee. Schwartz needed to find cases relating to highly technical issues of international treaties and bankruptcy. ChatGPT dutifully delivered more than a half-dozen citations.

As soon as Schwartz’s firm filed the resulting legal brief in federal court, opposing counsel said they could not find the cases, and the judge, P. Kevin Castel, directed the lawyers to produce them. Schwartz returned to ChatGPT. The machine is programmed to tell us what we want to hear, so when Schwartz asked whether the cases were real, ChatGPT said they were. Schwartz then asked ChatGPT to show him the complete cases; it did, and he sent them to the court. The judge called them “gibberish” and ordered Schwartz and his colleagues into court to explain why they should not be sanctioned. I was there, along with many more journalists, to witness the humbling of the attorneys at the hands of technology and the media.

“The world now knows about the dangers of ChatGPT,” the lawyers’ lawyer told the judge. “The court has done its job warning the public of these risks.” Judge Castel interrupted: “I did not set out to do that.” The problem here was not with the technology but with the lawyers who used it, who failed to heed warnings about the dubious citations, who failed to use other tools — even Google — to verify them, and who failed to serve their clients. The lawyers’ lawyer said Schwartz “was playing with live ammo. He didn’t know because technology lied to him.”

But ChatGPT did not lie because, again, it has no conception of truth. Nor did it “hallucinate,” in the description of its creators. It simply predicted strings of words, which sounded right but were not. The judge fined the lawyers $5,000 each and acknowledged that they had suffered humiliation enough in news coverage of their predicament.

Herein lies a cautionary tale for news organizations that are rushing to have large language models write stories — because they want to be cool and trendy, or save work, or perhaps to eliminate jobs, and manufacture ever more content. The news companies CNET and G/O Media have gotten into hot water for using AI to produce content that turned out to be less than factual. America’s largest newspaper chain, Gannett, just turned off artificial intelligence that was producing embarrassing sports stories that would call a football game “a close encounter of the athletic kind.” I have heard online editors plead that they are in a war to produce more and more content to attract more likes and clicks so they may earn more digital advertising pennies. Their problem is that they think their mission is only to make content.

My advice to editors and publishers is to steer clear of large language models for writing the news, except in well-proven use cases, such as turning highly structured financial reports into basic news stories, which must be checked before release. I would give the same advice to Microsoft and Google about connecting LLMs with their search engines. Fact-free gibberish coming out of the machine could ruin the authority and credibility of both news and technology companies — and affect the reputation of artificial intelligence overall.

There are good uses for AI. I benefit from it every day in, for example, Google Translate, Maps, Assistant, and autocomplete. As for large language models, they could be useful to augment — not replace — journalists’ work. I recently tested a new Google tool called NotebookLM, which can take a folder filled with a journalist’s research and summarize it, organize it, and allow the writer to ask questions of it. LLMs could also be used in, for example, language education, where what matters is fluency, not facts. My international students use these programs to smooth out their English for school and work. I even believe LLMs could be used to extend literacy, to help people who are intimidated by writing to communicate more effectively and tell their own stories.

Ah, but therein lies the rub for writers, like me. We believe we are special, that we hold a skill — a talent for writing — that few others can boast. We are storytellers and wield the power to tell others’ tales, to decide what tales are told, who shall be heard in them, and how they will begin and neatly end. We think that gives us the ability to explain the world in what journalists like to call the first draft of history — the news.

Now writers and journalists see both the internet and AI as competition. The internet enables the silent mass of citizens who were not heard in media to at last have their say — and to create a lot of content. And by producing credible prose in seconds, AI devalues writing and robs writers of their special status.

This is one reason why I believe we see hostile coverage of technology in media these days. News organizations and their proprietors claim that Google, Facebook, et al. steal away audience, attention, and advertising money (as if God granted publishers those assets in perpetuity). Journalists are engaged in their latest moral panic, another in a long line of panics over movies, television, comic books, rock lyrics, and video games. They warn about the dangers of the internet, social media, our phones, and now AI, claiming that these technologies will make us stupid, addict us, take away our jobs, and destroy democracy under a deluge of disinformation.

They should calm down. A 2020 study found that in the US no age group “spent more than an average of a minute a day engaging with fake news, nor did it occupy more than 0.2% of their overall media consumption.” The issue for democracy isn’t so much disinformation as the willingness, indeed the eagerness, of some citizens to believe lies that stoke their own fears and hatreds. Journalism should be reporting on the roots of bigotry and extremism rather than simplistically blaming technology.

In my book, The Gutenberg Parenthesis, I track society’s entry into the age of print as we now leave it for the digital age that follows. Print’s development as an institution of authority took time. Not until fifty years after Gutenberg’s Bible, around 1500, did the book take the shape we know today, with titles, title pages, and page numbers. It took another century, a few years either side of 1600, before the technology and its technologists — printers — faded into the background, making way for tremendous innovation with print: the birth of the modern novel with Cervantes, the essay with Montaigne, and the newspaper. A business model for print did not arrive until one century more, in 1710, with the advent of copyright. Come the 1800s, the technology of print — which had hardly changed since Gutenberg — evolved at last with the arrival of steam-powered presses and typesetting machines, leading to the birth of mass media. The twentieth century brought print’s first competitors, radio and television. And here we are today, just over a quarter century past the introduction of the commercial web browser. This is to say that we are likely at just the beginning of a long transition into the digital age. It is only 1480 in Gutenberg years.

In the beginning, rumor was trusted more than print because any anonymous printer could produce a book or pamphlet, just as anyone today can make a web site or tweet. In 1470, only fifteen years after Gutenberg’s Bible came off the press, Latin scholar Niccolò Perotti made what is said to be the first call for censorship of print. Offended by a bad translation of Pliny, he wrote to the Pope demanding that a censor be assigned to approve all text before it came off the press. As I thought about this, I realized Perotti was not seeking censorship. Instead, he was anticipating the establishment of the institutions of editing and publishing, which would assure quality and authority in print for centuries.

Like Perotti in his day, media and politicians today demand that something must be done about harmful content online. Governments — like editors and publishers — cannot cope with the scale of speech now, so they deputize platforms to police and censor all that is said online. It is an impossible task.

Journalists must be careful using AI to produce the news. At the same time, there is a danger in demonizing the technology. In the best case, the rise of AI might force journalists to examine their role in society, to ask how they improve public discourse. The internet provides them with many new ways to connect with communities, to build relationships of trust and authority with them, to listen to their needs, to discover and share voices too long not heard in the public sphere, to expand the work of journalism past publishing to the wider canvas of the internet.

Journalists think their content is what makes them valuable, and so publishers and their lawyers and lobbyists are threatening to sue AI companies, dreaming of huge payments for machines that read their content. That is no strategy for the future of journalism. Neither is Axel Springer’s plan to replace journalists in content factories with AI. That is not where the value of journalism lies. It lies with reporting on and serving communities. Like Niccolò Perotti, we should anticipate the creation of new services to help internet users cope with the abundance of content today, to verify the truth and falsity of what we see online, to assess authority, to discover more diverse voices, to nurture new talent, to recommend content that is worth our time and attention. Could such a service be the basis of a new journalism for the online, AI age?

A generation later: What have we learned?

The date sneaked up on me this year, attacking from behind. Every year on 9/11 I reflect, grateful that I survived the attack. This year, though, I find myself angry. Some of that might be my own loss: my father to COVID this year; my imminent unemployment.

But I am angry on this 22nd anniversary at what has fallen since: at the authoritarianism that overtook this country and threatens the world, at racism and bigotry set loose, at the pandemic killing still, at my own field — journalism — failing to meet these challenges. 

A generation has passed since 9/11/01 and what have we learned? Authoritarians attacked us that day and now authoritarians attack from within. My failing field — journalism — elevates the evil as if it is merely another side in a spectator sport.

Since 9/11/01, our only popularly elected presidents succeeded in strengthening the nation. Under Biden, the economy & nation are strong. But journalism fails at informing the public and wants to make jet lag an election issue while normalizing the fascism in the house. WTF. 

It was on 9/11/01, on my way to work through the World Trade Center, that I decided it was time to leave my job. I would teach. Now I leave that role and I ask what I have accomplished. I pray my students will turn around journalism, for we, their elders, have failed. 

I am, of course, still grateful to have survived 9/11/01. The images and lessons of that day are seared into my soul and will never leave me; they define me. I regret that the spirit in the nation was perverted into war in Iraq. I worry about the state of politics everywhere. 

But on this day I will try to rise above my anger and remember the names of the souls lost and the faces of the selfless first responders I saw rushing toward danger and mercy. This is a day for memorial and gratitude to them.

The only suitable memorial to those lost on 9/11/01 is to recognize the evil that took them and for our institutions — government, politics, journalism, education — to protect present and future generations from further fascism.

Moving on

I have news: I am leaving CUNY’s Newmark Graduate School of Journalism at the end of this term. Technically I’m retiring, though if you know me you know I will never retire. I’m looking at some things to do next and I’m open to others. More on that later. Now, I want to recollect — brag — about my time there.

Eighteen years ago, in 2005, I was the first professor hired at our new school. The New York Times was dubious:

For some old-school journalists, blogging is the worst thing to hit the print medium since, well, journalism school. They may want to avert their eyes today, when Stephen B. Shepard, dean of the new Graduate School of Journalism at the City University of New York, is to name Jeff Jarvis director of the new-media program and associate professor.

On my first day on the job, after attending my first faculty meeting, I quit. I had suggested that faculty needed to learn the new tools of online and digital journalism and some of them jumped down my throat: How dare I tell them what to learn? This festered in me, as things do, and I emailed Steve Shepard and Associate Dean Judith Watson saying that we had made a mistake. I’d already quit my job as president of Advance.net. But, oh well. 

Steve emailed me asking WTF I was doing. That curriculum committee was a temporary body. They weren’t on the faculty of the school. I was. Over lunch, Steve and Judy salved my neuroses and said I could teach that entrepreneurial journalism thing the committee had killed. I stayed. 

Steve took a flier on me. It wasn’t just that I was a blogger and a neurotic but that I had only a bachelor’s degree. I’ve always said that I am a poseur in the academy, a fake academic. Nonetheless, I’ve had the privilege of starting three master’s degrees at the school. (Recently, visiting with actual academics at the University of St Andrews, I said I had started three degrees and they looked at me cock-eyed and asked why I hadn’t finished any of them.)

With Steve, I took our entrepreneurial class and turned it into the nation’s first Advanced Certificate and M.A. in Entrepreneurial Journalism, to prepare journalists to be responsible stewards of our field. The program has been run brilliantly ever since by my colleague Jeremy Caplan, a most generous educator. It has evolved into an online program for independent journalists. 

I’m grateful that our next dean, Sarah Bartlett, also took a flier on involving me in her strategy for growth and we built much together. This week, I’m teaching the fourth cohort in our News Innovation and Leadership executive program. I’d long seen the need for such a degree, so news people would not be corrupted getting MBAs, and so our school, dedicated to diversity, would have an impact not just at the entry level in newsrooms but also on their management. I had to wait to recruit the one person who could build this program, Anita Zielina, and she has done a phenomenal job; she is the leaders’ leader. The program is in great hands with her successor, Niketa Patel. (And I plan to stick around to teach with them in this program after I leave.) 

My proudest accomplishment at the school and indeed in my career has been creating the Engagement Journalism degree in 2014, inspired when Sarah read what I’d written about building relationships with communities as the proper basis of journalism. She asked whether we taught that at the school. Not really, I said. How about a new degree? Cool, I said. We scribbled curricula on napkins. By the end of that week in California we had seed funding from Reid Hoffman, and by that fall we had students in class. I had the great good fortune of hiring, once again, the one person who could build the program, Dr. Carrie Brown, with whom I’ve had the privilege of teaching and learning ever since. She is a visionary in journalism. 

The program is, I’m sad to say, on pause right now. But after having just attended preconferences at AEJMC and ONA on Engagement Journalism, I am gratified to report that the movement is spreading widely. Each gathering was filled with journalists, educators, and community leaders dedicated to centering our work on communities, to building trust through listening and collaboration, to valuing the experience-as-expertise of the public over the tired doctrine of journalistic objectivity, and to repairing the damage journalism has done. I have told our Engagement students that they would be Trojan horses in newsrooms and they have been just that, getting important jobs and reimagining and rebuilding journalism from within.

I am proud of those graduates as I am of those from the executive and Entrepreneurial programs. Since arriving at the school, I have said to each class that I am too old to change journalism. Instead, I would watch and try to help students take on that responsibility. It is wonderful to witness their success. Of course, there is much yet to do. 

Lately, I have turned my attention to internet studies and the wider canvas on which journalism should work in our connected world. What interests me most is bringing the humanities into the discussion of this most human enterprise, which has for too long been dominated (as print was in its first half-century) by the technologists. This is work I hope to continue. 

I love starting things. In my career, I have had the honor of founding Entertainment Weekly at Time Inc., and lots of web sites at Advance. Here I had the great opportunity to help start a school. At the Tow-Knight Center, which I direct, we started communities of practice for new roles in newsrooms; two of these organizations have flown the nest to become independent and sustainable: the News Product Alliance and the Lenfest Institute’s Audience Community of Practice. I’m also proud to have had a small role in helping at the start of Montclair State’s Center for Cooperative Media, which is doing amazing work in Engagement under Stefanie Murray and our alum, Joe Amditis. Those are activities I expected from our Center.

What I had not imagined was that the Center would become an incubator for new degrees. That was made possible by funders. I also never thought that I’d be in the business of fundraising. But without funders’ support, none of these programs would have been born. 

Sarah Bartlett taught me much about raising money, because she’s so good at it. I haven’t heard her say it just this way, but from her I learned that fundraising is about friendship. I am grateful for the friendship of so many supporters of the school and of my work there. 

My friend Leonard Tow challenged Steve and me — with a $3 million challenge grant — when we said we wanted to start a center dedicated to exploring sustainability for news. Emily Tow, who heads the family’s foundation, took us under her wise wing and patiently taught us how to tell our story. It worked. Our friend Alberto Ibargüen, CEO of the Knight Foundation, asked Steve what would make his new school stand apart. Steve said entrepreneurial journalism. Alberto matched the Tows’ grant and the Tow-Knight Center was born. Knight’s Eric Newton was the one who insisted we should make our Entrepreneurial Journalism program a degree and later Jennifer Preston supported our work there. 

As time went on, Len Tow also endowed the Tow Chair in Journalism Innovation, which I am honored to hold. 

When my long-time friend Craig Newmark decided to make it his life’s mission to support journalism (and veterans and cybersecurity and women in tech and pigeons), he generously told me to bring him an idea and thus was born Tow-Knight’s News Integrity Initiative, also supported by my Facebook friends (literally), Áine Kerr, Meredith Carden, and Campbell Brown. Next, Craig most generously endowed the school that now proudly carries his name. His endowment has been a life-saver in the crisis years of the pandemic. His friendship, support, and guidance are invaluable to me. And we love nerding about gadgets. 

I have more friends to thank for their support: John Bracken, way back when he was at the MacArthur Foundation, gave me my first grant to support Entrepreneurial Journalism students’ enterprises. Ford, Carnegie, McCormick, and others contributed to what has added up to — I’m amazed to say — about $53 million in support in which I had a hand. 

And I am grateful for the latest support of the Center, thanks to my friend Richard Gingras of Google. (By way of disclosure, I’ll add that I have not been paid by any technology company.)

I must give my thanks to Hal Straus and Peter Hauck, who worked alongside me (that is to say, tolerated my every inefficiency and eccentricity) managing Tow-Knight, as well as other colleagues (especially Jesenia De Moya Correa), who made possible the convenings the Center brought to the school. The latest were a Black Twitter Summit convened by Meredith Clark, André Brock, Charlton McIlwain, and Johnathan Flowers, and a gathering of internet researchers led by Siva Vaidhyanathan. I have learned so much from such scholars, journalists, technologists, and business and community leaders who have lent their time to the school and the Center.

Finally, I’d like to thank my friend Jay Rosen of NYU, who from the start has taught me much about teaching and scholarship. 

Having subjected you to my Oscar speech, I won’t burden you now with valedictory thoughts on the fate of journalism. That, too, awaits another day. But there’s one more thing I’m grateful for: the opportunity teaching has given me to research and write. I didn’t just blog, to the consternation of our neighbors at The Times, but also got to write books: What Would Google Do? (Harper 2009), Public Parts (Simon & Schuster 2011), and Geeks Bearing Gifts: Imagining New Futures for News (published by the CUNY Journalism Press in 2014). 

I spent the last decade digging into and geeking out about Gutenberg and the vast sweep of media history, leading to The Gutenberg Parenthesis: The Age of Print and Its Lessons for the Age of the Internet, recently published by Bloomsbury Academic.

I have another brief work of media history, Magazine, in Bloomsbury’s Object Lessons series, coming out this fall (in which I finally tell my story of the founding of Entertainment Weekly). I have a future book about the internet and media’s moral panic over it, and over AI; I just submitted the manuscript to Basic Books. And I have another few books I want to work on after that. So, yes, I’ll be busy.

I do hope to continue teaching — perhaps internet studies or even book and media history — and to get back out speaking and consulting and helping start more things. I’d like a fellowship and would welcome the chance to return to serving on boards. Feel free to ping me if you have thoughts. 

I am grateful for my time at CUNY and the privilege to teach there and wish nothing but the best future for the Newmark School.

Copyright and AI and journalism

The US Copyright Office just put out a call for comment on copyright and artificial intelligence. It is a thoughtful document based on listening sessions already held, with thirty-four questions on rights regarding inclusion in learning sets, transparency, the copyrightability of generative AI’s output, and use of likeness. Some of the questions — for example, on whether legislation should require assent or licensing — frighten me, for reasons I set forth in my comments, which I offer to the Office in the context of journalism and its history:

I am a journalist and journalism professor at the City University of New York. I write — speaking for myself — in reply to the Copyright Office’s queries regarding AI, to bring one perspective from my field, as well as the context of history. I will warn that precedents set in regulating this technology could impinge on freedom of expression and quality of information for all. I also will share a proposal for an updated framework for copyright that I call creditright, which I developed in a project with the World Economic Forum at Davos.

First, some context from present practice and history in journalism. It is ironic that newspaper publishers would decry AI reading and learning from their text when journalists themselves read, learn from, rewrite, and repurpose each other’s work in their publications every day. They do the same with sources and experts, without remuneration and often without credit. This is the time-honored tradition in the field.

The 1792 US Post Office Act provided for newspapers to send copies to each other for free for the express purpose of allowing them to copy each other, creating a de facto network of news in the new nation. In fact, many newspapers employed “scissors editors” — their actual job title — to cut out stories to reprint. As I recount in my book, The Gutenberg Parenthesis: The Age of Print and Its Lessons for the Age of the Internet (Bloomsbury Academic, 2023, 217), the only thing that would irritate publishers was if they were not credited.

As the Office well knows, the Copyright Act of 1790 covered only books, charts, and maps, and not newspapers or magazines. Not until 1909 did copyright law include newspapers, but even then, according to Will Slauter in Who Owns the News?: A History of Copyright (Stanford University Press, 2019), there was debate as to whether news articles, as opposed to literary features, were to be protected, for they were often anonymous, the product of business interest more than authorship. Thus the definition of authorship — whether by person, publication, or now machine — remains unsettled.

As to Question 1, regarding the benefits and risks of this technology (in the context of news), I have warned editors away from using generative AI to produce news stories. I covered the show-cause hearing for the attorney who infamously asked ChatGPT for citations for a federal court filing. I use that tale as an object lesson for news organizations (and search platforms) to keep large language models far away from any use involving the expectation of facts and credibility. However, I do see many uses for AI in journalism and I worry that the larger technological field of artificial intelligence and machine learning could be swept up in regulation because of the misuse, misrepresentation, factual fallibility, and falling reputation of generative AI specifically.

AI is invaluable in translation, allowing both journalists and users to read news around the world. I have tested Google’s upcoming product, NotebookLM; augmentative tools such as this, used to summarize and organize a writer’s research, could be quite useful in improving journalists’ work. In discussing the tool with the project’s editorial director, author Steven Johnson, we saw another powerful use and possible business model for news: allowing readers to query and enter into dialogue with a publisher’s content. Finally, I have speculated that generative AI could extend literacy, helping those who are intimidated by the act of writing to help tell — and illustrate — their own stories.

In reviewing media coverage of AI, I ask you to keep in mind that journalists and publishers see the internet and now artificial intelligence as competition. In an upcoming book, I assert that media are embroiled in a full-fledged moral panic over these technologies. The arrival of a machine that can produce no end of fluent prose commodifies the content media produce and robs writers of our special status. This is why I teach that journalists must understand that their value is not resident in the commodity they produce, content, but instead in qualities of authority, credibility, independence, service, and empathy.

As for Question 8 on fair use, I am no lawyer, but it is hard to see how reading and learning from text and images to produce transformative works would not be fair use. I worry that if these activities — indeed, these rights — are restricted for the machine as an agent for users, precedent is set that could restrict use for us all. As a journalist, I fear that by restricting learning sets to viewing only free content, we will end up with a problem parallel to that created by the widespread use of paywalls in news: authoritative, fact-based reporting will be restricted to the privileged few who can and choose to pay for it, leaving too much of public discourse vulnerable to the misinformation, disinformation, and conspiracies available for free, without restriction.

I see another potential use for large language models: to provide researchers and scholars with a window on the presumptions, biases, myths, and misapprehensions reflected in the relationships of all the words analyzed by them — the words of those who had the power and privilege of publishing them. To restrict access skews that vision and potentially harms scholarly uses that have not yet been imagined.

The speculation in Question 9, about requiring affirmative permission for any copyrighted material to be used in training AI models, and in Question 10, regarding collective management organizations or legislatively establishing a compulsory licensing scheme, frightens me. AI companies already offer a voluntary opt-out mechanism, in the model of robots.txt. As media report, many news organizations are availing themselves of that option. To legally require opt-in or licensing sets up unimaginable complications.

Such complication raises the barrier to entry for new and open-source competitors and the spectre of regulatory capture — as does discussion in the EU of restricting open-source AI models (Question 25.1). The best response to the rising power of the already-huge incumbent companies involved in AI is to open the door — not close it — to new competition and open development.

As for Questions 18–21 on copyrightability, I would suggest a different framework for considering both the input and output of generative AI: as an intellectual, cultural, and informational commons, whose use and benefits we cannot predict. Shouldn’t policy encourage at least a period of development, research, and experimentation?

Finally, permit me to propose another framework for consideration of copyright in this new age in which connected technologies enable collaborative creation and communal distribution. In 2012, I led a series of discussions with multiple stakeholders — media executives, creative artists, policymakers — for a project with the World Economic Forum in Davos on rethinking intellectual property and the support of creativity in the digital age. In the safe space of the mountains, even entertainment executives would concede that copyright law could be considered outmoded and is due for reconsideration. The WEF report is available here.

Out of that work, I conceived of a framework I call “creditright,” which I write about in Geeks Bearing Gifts (CUNY Journalism Press, 2014) and in The Gutenberg Parenthesis (221–2): “This is not the right to copy text but the right to receive credit for contributions to a chain of collaborative inspiration, creation, and recommendation of creative work. Creditright would permit the behaviors we want to encourage to be recognized and rewarded. Those behaviors might include inspiring a work, creating that work, remixing it, collaborating in it, performing it, promoting it. The rewards might be payment or merely credit as its own reward. I didn’t mention blockchain; but the technology and its automated contracts could be useful to record credit and trigger rewards.” I do not pretend that this is a fully thought-through solution, only one idea to spark discussion on alternatives for copyright.

The idea of creditright has some bearing on your Questions 15–17 on transparency and recordkeeping — what might ledgers of credit in creation look like? — though I am trying to make a larger argument about the underpinnings of copyright. As I have come to learn, 1710’s Statute of Anne was not formulated at the urging of — or to protect the rights of — authors, so much as it was in response to the demands of publishers and booksellers, to create a marketplace for creativity as a tradable asset. Said historian Peter Baldwin in The Copyright Wars: Three Centuries of Trans-Atlantic Battle (Princeton University Press, 2016, 53–6): “The booksellers claimed to be supporting authors’ just and natural right to property. But in fact their aim was to take for themselves what nature had supposedly granted their clients.”

I write in my book that the metaphor of creativity as property — of art as artifact rather than an act — “might be appropriate for land, buildings, ships, and tangible possessions, but is it for such intangibles as creativity, inspiration, information, education, and art? Especially once electronics — from broadcast to digital — eliminated the scarcity of the printed page or the theater seat, one need ask whether property is still a valid metaphor for such a nonrivalrous good as culture.”

Around the world, copyright law and doctrine are being mangled to suit the protectionist ends of those lobbying on behalf of incumbent publishers and producers, who remain flummoxed by the challenges and opportunities of technology, of both the internet and now artificial intelligence. In the context of journalism and news, Germany’s Leistungsschutzrecht or ancillary copyright law, Spain’s recently superseded link tax, Australia’s News Media Bargaining Code, the proposed Journalism Competition and Preservation Act in the US, and lately Canada’s C-18 Online News Act do nothing to protect the public’s interest in informed discourse and, in Canada’s case, will end up harming news consumers, journalists, and platforms alike as Facebook and Google are forced to take down links to news.

I urge the Copyright Office to continue its process of study as exemplified by this request for comments and not to rush into the frenzied discussion in media over artificial intelligence, large language models, and generative AI. It is too soon. Too little is known. Too much is at stake.

A few unpopular opinions about AI

In a conversation with Jason Howell for his upcoming AI podcast on the TWiT network, I came to wonder whether ChatGPT and large language models might give all of artificial intelligence cultural cooties, for the technology is being misused by companies and miscast by media such that the public may come to wonder whether they can ever trust the output of a machine. That is the disaster scenario the AI boys do not account for.

While AI’s boys are busy thumping their chests about their power to annihilate humanity, if they are not careful — and they are not — generative AI could come to be distrusted for misleading users (the companies’ fault more than the machine’s); filling our already messy information ecosystem with the data equivalent of Styrofoam peanuts and junk mail; making news worse; making customer service even worse; making education worse; threatening jobs; and hurting the environment. What’s not to dislike?

Below I will share my likely unpopular opinions about large language models — how they should not be used in search or news, how building effective guardrails is improbable, how we already have enough fucking content in the world. But first, a few caveats:

I do see limited potential uses for synthetic text and generative AI. Watch this excellent talk by Emily Bender, one of the authors of the seminal Stochastic Parrots paper and a leading critic of AI hype, suggesting criteria for acceptable applications: cases where language form and fluency matter but facts do not (e.g., foreign language instruction), where bias can be filtered, and where originality is not required.

Here I explored the idea that large language models could help extend literacy for those who are intimidated by writing and thus excluded from discourse. I am impressed with Google’s NotebookLM (which I’ve seen thanks to Steven Johnson, its editorial director), as an augmentative tool designed not to create content but to help writers organize research and enter into dialog with text (a possible new model for interaction with news, by the way). Gutenberg can be blamed for giving birth to the drudgery of bureaucracy and perhaps LLMs can save us some of the grind of responding to it.

I value much of what machine learning makes possible today — in, for example, Google’s Search, Translate, Maps, Assistant, and autocomplete. I am a defender of the internet (subject of my next book) and, yes, social media. Yet I am cautious about this latest AI flavor of the month, not because generative AI itself is dangerous but because the uses to which it is being put are stupid and its current proprietors are worrisome.

So here are a few of my unpopular opinions about large language models like ChatGPT:

It is irresponsible to use generative AI models as presently constituted in search or anywhere users are conditioned to expect facts and truthful responses. Presented with the empty box on Bing’s or Google’s search engines, one expects at least a credible list of sites relevant to one’s query, or a direct response based on a trusted source: Wikipedia or services providing the weather, stock prices, or sports scores. To have an LLM generate a response — knowing full well that the program has no understanding of fact — is simply wrong.

No news organization should use generative AI to write news stories, except in very circumscribed circumstances. For years now, wire services have used artificial intelligence software to generate simple news stories from limited, verified, and highly structured data — finance, sports, weather — and that works because of the strictly bounded arena in which such programs work. Using LLMs trained on the entire web to generate news stories from the ether is irresponsible, for they only predict words, cannot discern facts, and reflect biases. I endorse experimenting with AI to augment journalists’ work, organizing information or analyzing data. Otherwise, stay away.

The last thing the world needs is more content. This, too, we can blame on Gutenberg (and I do, in The Gutenberg Parenthesis), for printing brought about the commodification of conversation and creativity as a product we call content. Journalists and other writers came to believe that their value resides entirely in content, rather than in the higher, human concepts of service and relationships. So my industry, at its most industrial, thinks its mission is to extrude ever more content. The business model encourages that: more stuff to fill more pages to get more clicks and more attention and a few more ad pennies. And now comes AI, able to manufacture no end of stuff. No. Tell the machine to STFU.

There will be no way to build foolproof guardrails against people making AI do bad things. We regularly see news articles reporting that an LLM lied about — even libeled — someone. First note well that LLMs do not lie or hallucinate because they have no conception of truth or meaning. Thus they can be made to say anything about anyone. The only limit on such behavior is the developers’ ability to predict and forbid everything bad that anyone could do with the software. (See, for example, how ChatGPT at first refused to go where The New York Times’ Kevin Roose wanted it to go and even scolded him for trying to draw out its dark side. But Roose persevered and led it astray anyway.) No policy, no statute, no regulation, no code can prevent this. So what do we do? We try to hold accountable the user who gets the machine to say bad shit and then spread it, just as we would if you printed out nasty shit on your HP printer and posted it around the neighborhood. Not much else we can do.

AI will not ruin democracy. We see regular alarms that AI will produce so much disinformation that democracy is in peril — see a recent warning from John Naughton of The Guardian that “a tsunami of AI misinformation will shape next year’s knife-edge elections.” But hold on. First, we already have more than enough misinformation; who’s to say that any more will make a difference? Second, research finds again and again that online disinformation played a small role in the 2016 election. We have bigger problems to address about the willful credulity of those who want to signal their hatreds with misinformation and we should not let tropes of techno moral panic distract us from that greater peril.

Perhaps LLMs should have been introduced as fiction machines. ChatGPT is a nice parlor trick, no doubt. It can make shit up. It can sound like us. Cool. If that entertaining power were used to write short stories or songs or poems and if it were clearly understood that the machine could do little else, I’m not sure we’d be in our current dither about AI. Problem is, as any novelist or songwriter or poet can tell you, there’s little money in creativity anymore. That wouldn’t attract billions in venture capital and the stratospheric valuations that go with it whenever AI is associated with internet search, media, and McKinsey finding a new way to kill jobs. As with so much else today, the problem isn’t with the tool or the user but with capitalism. (To those who would correct me and say it’s late-stage capitalism, I respond: How can you be so sure it is in its last stages?)

Training artificial intelligence models on existing content could be considered fair use. Their output is generally transformative. If that is true, then training machines on content would not be a violation of copyright or theft. It will take years for courts to adjudicate the implications of generative AI on outmoded copyright doctrine and law. As Harvard Law Professor Lawrence Lessig famously said, fair use is the right to hire an attorney. Media moguls are rushing to do just that, hiring lawyers to force AI companies to pay for the right to use news content to train their machines — just as the publishers paid lobbyists to get legislators to pass laws to get search engines and social media platforms to pay to link to news content. (See how well that’s working out in Canada.) I am no lawyer but I believe training machines on any content that is lawfully acquired so it can be inspired to produce new content is not a violation of copyright. Note my italics.

Machines should have the same right to learn as humans; to say otherwise is to set a dangerous precedent for humans. If we say that a machine is not allowed to learn, to read, to extract knowledge from existing content and adapt it to other uses, then I fear it would not be a long leap to declare that we as humans are not allowed to read, see, or know some things. This puts us in the odd position of having to defend the machine’s rights so as to protect our own.

Stopping large language models from having access to quality content will make them even worse. Same problem we have in our democracy: Paywalls restrict quality information to the already rich and powerful, leaving the field — whether that is news or democracy or machine learning — free to bad actors and their disinformation.

Does the product of the machine deserve copyright protection? I’m not sure. A federal court just upheld the US Copyright Office’s refusal to grant copyright protection to the product of AI. I’m just as happy as the next copyright revolutionary to see the old doctrine fenced in for the sake of a larger commons. But the agency’s ruling was limited to content generated solely by the machine and in most cases (in fact, all cases) people are involved. So I’m not sure where we will end up. The bottom line is that we need a wholesale reconsideration of copyright (which I also address in The Gutenberg Parenthesis). Odds of that happening? About as high as the odds that AI will destroy mankind.

The most dangerous prospect arising from the current generation of AI is not the technology, but the philosophy espoused by some of its technologists. I won’t venture deep down this rat hole now, but the faux philosophies espoused by many of the AI boys — in the acronym of Émile Torres and Timnit Gebru, TESCREAL, or longtermism for short — are noxious and frightening, serving as self-justification for their wealth and power. Their philosophizing might add up to a glib freshman’s essay on utilitarianism if it did not also border on eugenics and if these boys did not have the wealth and power they wield. See Torres’ excellent reporting on TESCREAL here. Media should be paying attention to this angle instead of acting as the boys’ fawning stenographers. They must bring the voices of responsible scholars — from many fields, including the humanities — into the discussion. And government should encourage truly open-source development and investment to bring on competitors that can keep these boys, more than their machines, in check.