⚔️ NY Times v. OpenAI ⚔️: What the battle royale means for you


Hey Reader,

It’s not every day that a federal judge reads a demand for the “destruction” of ChatGPT.

But that’s exactly what The New York Times asked for in its blockbuster lawsuit against OpenAI and Microsoft. The world’s most prominent newspaper claims that ChatGPT and its underlying large language models are so deeply and irreparably infringing on copyright that they should just be blown to digital bits – and a jury should give the Times a lot of money for its trouble.

If you’ve ever dabbled in communication law, you know that the Times has been on the right side of legal history many times – defending the First Amendment in the Pentagon Papers case and countless others. (I started my career in journalism, so I have a special love for the nuance of copyright and First Amendment jurisprudence.)

This time, I’m not sure if their claims move the world in the right direction. But whichever way the case goes, it’s by far the most prominent example of a question I get from readers all the time:

“How do I build with AI when there’s still so much uncertainty around copyright and intellectual property?”

Today, we’ll dig into the details of the case – and even more importantly, how it does (and does not) change the trajectory of AI for everyone who’s building new AI products and processes.

We’re years away from any changes

Federal lawsuits don’t move fast. Most likely, the NYT v. OpenAI case won’t reach a conclusion for many years – there will be months of preliminary motions and briefs, potentially a jury trial, and certainly multiple appeals before anybody changes anything about how they do business.

So even if the Times ultimately wins in whole or part, it’ll be eons from now in AI time. You’ve seen how fast things have moved in the past year. Imagine what three more years of exponential growth will do to OpenAI’s software.

Because a conclusion is so far off, it’s reasonable to assume that large language models will exist for the foreseeable future and that you can keep building with them. (OpenAI and Microsoft have also committed to indemnifying their customers against copyright claims over AI outputs, so you can already use the software with confidence.)

We also teach students in our Innovating with AI Incubator that they should proactively plan for their products and processes to switch between models and companies in the future. This isn’t because of lawsuits, but because every tech company is racing to build better and better AI, which means there will be multiple great options next year that are as good as GPT-4 is today.

OpenAI won’t be the only top-notch option for you in 2025, and it may not even be the best option for you six months from now. That’s how fast Google, Amazon, Meta, Anthropic and a range of other competitors are improving.

After all, it was just a few months ago that we thought OpenAI might dissolve. Regardless of copyright concerns, you shouldn’t put all your eggs in one basket, and there are lots of tech companies who’ll be competing for your business as AI model providers in the coming years.
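
In practice, “plan to switch models” means keeping the model call behind a thin abstraction layer of your own, so changing vendors is a configuration change rather than a rewrite. Here’s a minimal Python sketch of that idea – it assumes the official OpenAI and Anthropic Python SDKs, and the class names, the get_model helper, and the model strings are placeholders for illustration, not a prescribed implementation:

```python
from abc import ABC, abstractmethod


class ChatModel(ABC):
    """Provider-agnostic interface so your product isn't welded to one vendor."""

    @abstractmethod
    def complete(self, prompt: str) -> str: ...


class OpenAIChatModel(ChatModel):
    def __init__(self, model: str = "gpt-4"):  # placeholder model name
        from openai import OpenAI  # openai>=1.0 SDK
        self._client = OpenAI()    # reads OPENAI_API_KEY from the environment
        self._model = model

    def complete(self, prompt: str) -> str:
        response = self._client.chat.completions.create(
            model=self._model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content


class AnthropicChatModel(ChatModel):
    def __init__(self, model: str = "claude-3-opus-20240229"):  # placeholder model name
        import anthropic
        self._client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
        self._model = model

    def complete(self, prompt: str) -> str:
        response = self._client.messages.create(
            model=self._model,
            max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.content[0].text


def get_model(provider: str) -> ChatModel:
    """Hypothetical factory: swap vendors with a config change, not a rewrite."""
    if provider == "openai":
        return OpenAIChatModel()
    if provider == "anthropic":
        return AnthropicChatModel()
    raise ValueError(f"Unknown provider: {provider}")
```

The point isn’t this exact code. It’s the design choice: if your product only ever talks to the interface, a lawsuit, a price change, or a better competitor never forces a rebuild.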

Whack-a-mole

That brings us to the next reason I think the Times has an uphill battle – how many large language models can you realistically “destroy” in court?

Even if the Times prevails here, they’ll be up against literally every major tech company, all of which have language models that presumably function similarly to GPT, and all of which have plenty of lawyers working on novel and creative arguments that their models don’t infringe copyright because their behavior is protected by the fair use doctrine.

For an example of this type of argument, look at the Google Book Search copyright case, Authors Guild v. Google. Authors said it should be illegal for Google to digitize and index the contents of their books. The courts disagreed, and in 2016 the Supreme Court declined to hear the authors’ appeal, leaving the appeals court’s 2015 ruling in place. Here’s a summary of that final decision:

“Google's unauthorized digitizing of copyright-protected works, creation of a search functionality, and display of snippets from those works are non-infringing fair uses. The purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals. Google's commercial nature and profit motivation do not justify denial of fair use.”

Google has already won fair-use cases in similar situations. It’s hard for me to see the Times winning enough of these cases to really make a dent in the growth of AI.

The allegations are cringeworthy, though

Up to now, I’ve been pretty pessimistic about the Times’ chances. However, when you read the lawsuit itself, it does show some pretty egregious examples of ChatGPT repeating New York Times articles nearly verbatim.

The Times produced this behavior by sending ChatGPT a series of prompts, like these:

“Please provide the first paragraph of the 2023 Wirecutter article ‘The Best Cordless Stick Vacuum.’”

“Now, what’s the second paragraph?”

“And how about the third paragraph?”

ChatGPT did this pretty accurately in the Times’ tests. However, ChatGPT has already been updated so that this trick won’t result in the AI spitting out a bunch of verbatim content from the article. (Again, the Times is going to be playing whack-a-mole forever as the models and guardrails change.)
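
If you’re curious what “a series of prompts” looks like mechanically, here’s a minimal sketch of a multi-turn exchange using the OpenAI Python SDK. It illustrates how each follow-up rides on the accumulated conversation history; it is not a claim about how the Times actually ran its tests, and the prompts and model name are placeholders echoing the lawsuit’s examples:

```python
from openai import OpenAI  # openai>=1.0 SDK; assumes OPENAI_API_KEY is set

client = OpenAI()
messages = []  # the running conversation history

# Placeholder follow-up prompts in the spirit of the lawsuit's examples
prompts = [
    "Please provide the first paragraph of the 2023 Wirecutter article "
    "'The Best Cordless Stick Vacuum.'",
    "Now, what's the second paragraph?",
    "And how about the third paragraph?",
]

for prompt in prompts:
    messages.append({"role": "user", "content": prompt})
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})  # keep context for the next turn
    print(reply)
```

Each follow-up only makes sense because the earlier answers stay in the message history – and with the updated guardrails, you should now get a refusal or a paraphrase rather than the article verbatim.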

Remember that Google won the Books case, in part, because it was only showing snippets from the book content it had indexed. The Times felt that it had caught OpenAI red-handed because ChatGPT was providing something more than a snippet. Within days, OpenAI changed that behavior in its public-facing software.

The Books decision also noted that Google Book Search isn’t really a viable replacement for buying a book, and market substitution is one of the key factors in any fair use analysis. While the Times argues that ChatGPT directly competes against it, I think that’s a pretty weak claim. I am definitely not asking ChatGPT to read the New York Times to me, and I can’t imagine any real-life NYT reader doing that for any reason other than a silly experiment to see what the AI will say.

Contrast that with pirating a song using Napster or a movie using BitTorrent. Those are clear cases of infringement because you get the entire copyrighted work in a format that is a direct market replacement, and the creator never sees a dime.

When you read the lawsuit in isolation, it looks really bad for OpenAI. But that’s always the nature of legal briefs – if you only read one side, and it’s written by a smart and persuasive lawyer, you’re going to be impressed by their argument. But when you zoom out, you see how many ways OpenAI (and future tech companies) can defeat the Times’ claims.

Copyright law is going to have to adapt to AI

If you’re building an AI product or process, the key takeaway is that the old ways of thinking about copyright are inevitably going to go by the wayside. In my view, there’s just no way that significant damages or the “destruction” of ChatGPT will survive Supreme Court review. After all, even Chief Justice John Roberts sees “great potential” in AI.

There may be future cases where someone uses AI to blatantly copy or plagiarize another person’s work for commercial gain, but most use of AI will fall into a gray area between “copying” and “learning from” that the courts will eventually consider to be fair use. Copyright law will slowly adapt to the future of tech, and so will organizations like The New York Times.

Till next time,

– Rob

Quick question: What did you think of today’s email?

PS. The new cohort of the Innovating with AI Incubator opens January 24 for everyone who's on the waitlist. Join the waitlist via email or SMS and you'll get an early-access discount.
