Last month, news agency ANI filed a lawsuit against OpenAI, the maker of ChatGPT, alleging unauthorized use of its content to train the generative artificial intelligence (AI) chatbot. This marks the first such case in India, but the lack of clear policies on AI's use of public data has sparked similar conflicts globally.
At the core of the issue is AI tools’ relentless demand for high-quality data. However, their reliance on readily available online information is nearing its limit. Unless AI companies expand their data sources to include material such as printed novels, personal documents, videos, and copyrighted news content, the growth of AI chatbots could plateau.
This pursuit of data, however, is colliding with copyright concerns and the carefully constructed business models of publishers and media outlets. In the US, publishers, musicians, and authors have taken legal action against AI companies for using copyrighted content to train their models. Last year, Getty Images sued Stability AI, accusing it of using 12 million of its photos to develop its platform.
Similarly, The New York Times filed a lawsuit against OpenAI, alleging the misuse of its content and positioning the AI company as a direct competitor in providing reliable information. Several writers have also initiated lawsuits with similar claims. AI companies, however, largely argue that their language models are built on publicly available data and contend that such use is protected under the fair use doctrine.