

AI training methods do not violate our copyright law—but the output that AI models generate might
In Ex Machina last week (‘AI use of original work: A reverse Robin Hood proposal’), I argued that the working paper issued by the Department for Promotion of Industry and Internal Trade (DPIIT) on copyright and artificial intelligence (AI) falls short of its objective because the mandatory blanket licensing regime it proposes transfers wealth away from the very creators it was supposed to protect. But as bad as this suggestion is, it is not the most egregious conceptual shortcoming of the report.
Far more disconcerting are the assumptions it makes about how AI systems are trained and its suggestion that this process infringes the Copyright Act of 1957.
The verb ‘copy’ lies at the heart of many of the operational activities around which copyright law has been designed. A ‘copy’ has always referred to a reproduction of a work that is clearly identifiable as having been substantially derived from the original.
However, the law has never treated the act of learning from a work as equivalent to reproducing it. In the early days, ‘copies’ meant physical reproductions made by a printing press or other mechanical devices designed for this purpose.
Since then, the term has been extended to cover the many digital duplicates we encounter today, most of which will never physically exist. It is this concept that is now being extended to AI training.
For AI training to qualify as copyright infringement, it must be established that the training process results in the creation and storage of reproductions of copyrighted works in a form that is intelligible, expressive and capable of substitution.
Read on livemint.com