Penguin Random House has taken a firm stance against the unauthorized use of its works for training artificial intelligence (AI) systems. According to The Bookseller, the publisher has decided to change the language on the copyright pages of its books to expressly forbid their use for AI training. This decision contrasts with that of academic publishers such as Taylor & Francis, Wiley and Oxford University Press, which have agreed to license their works to AI companies.
Matthew Sag, an expert in AI and copyright at Emory University, believes that Penguin's message is aimed at the European market, where legislation allows rights holders to decide whether their works can be used for data mining. Although US law provides no equivalent opt-out, leading AI developers tend to avoid using copyrighted material that has been explicitly excluded.
Meanwhile, in the US, several authors and media companies have filed lawsuits against technology companies such as Google, Meta, Microsoft, OpenAI and others, accusing them of breaking the law by using copyrighted works to train language models without permission. The companies defend themselves by arguing that they are acting in accordance with the “fair use” doctrine, which allows the unlicensed use of protected content in certain circumstances, such as criticism, reporting or education.
It has not yet been decided whether the use of books to train AI systems constitutes “fair use” in the US, which leaves the debate open. At the same time, some social media initiatives, in which users post messages to prevent their content from being used in AI training, have not been successful, partly because the platforms' terms and conditions generally allow this content to be used for training purposes.
The position of the world's largest publisher is different: Penguin has the financial and legal backing to enforce its restrictions and protect the rights of its authors. Its ability to stand up to the big tech companies allows it to enforce its copyright policies effectively, in marked contrast to individual users on social media.
Penguin Random House's action is a clear attempt to draw a line in the sand against AI developers and highlight the importance of protecting the content of its authors. Although there is still much to be debated and defined in terms of regulation, the publisher's stance sets a precedent in the sector and may motivate other publishers and creators to follow in its footsteps in defending their intellectual property rights in the face of advancing artificial intelligence.