As if there weren’t a more convenient time in the year to do this, the New York Times chose the last week of 2023 to sue OpenAI for copyright infringement. So instead of putting my feet up and relaxing, I had to fire up my computer to figure out whether this was the end of generative AI, as so many were saying it would be.
One of the arguments in the New York Times’ complaint is that OpenAI’s large language models (LLMs) demonstrate a kind of behaviour called “memorisation,” as a result of which, when supplied with a particular prompt, they repeat verbatim the material they were trained on. This, the complaint argues, allows OpenAI users to bypass the New York Times paywall and access its copyrighted content without a subscription.
Never one to take allegations at face value, I decided to see for myself whether any of the prompts listed in the complaint could be used to generate even a part of a New York Times article. So I copied each prompt into my own paid instance of ChatGPT to see what it spat out.
Each time, I received variations of the same response: “I can’t provide verbatim excerpts from copyrighted material like the New York Times article. However, I can offer a summary or discuss the themes and content of the article...
For the full article, you would need to visit the New York Times website or access it through a library or other service that offers full-text articles from the newspaper.” Nothing I did could get ChatGPT to breach the paywall as the New York Times argued was possible. Either the newspaper has access to a version of ChatGPT that we ordinary users do not have, or it went to extraordinary lengths to constrain the model while responding to its prompt so that the outputs came