ChatGPT developer OpenAI has published a blog post responding to allegations levelled by the New York Times in a lawsuit filed last month. The news organisation had accused OpenAI of “wide-scale copying” and of using millions of its articles to train chatbots.
OpenAI has rejected some of these allegations and claimed that the fair use doctrine of copyright law applies in its case. ET explains the case and its ramifications.
What are the allegations?
The news publisher has claimed that OpenAI’s chatbots compete with it as a source of information, while having taken much of their content from NYT. It said in the lawsuit that after Google and Wikipedia, NYT was the biggest source in the dataset scraped from the internet by Common Crawl, a non-profit web crawler. Data from Common Crawl has been used to partly train the GPT-3 models, it claimed. Further, NYT claimed that OpenAI’s generative AI tools could “generate output that recites Times content verbatim, closely summarises it, and mimics its expressive style”.
How has OpenAI responded?
OpenAI’s response rests on four main points:
It collaborates with news organisations.
Training generative AI models is fair use, and OpenAI provides an opt-out.
“Regurgitation”, or verbatim reproduction of content, is a bug that it is working to fix.
NYT “is not telling the full