Nvidia sued for allegedly using copyrighted books to train AI

11 Mar 2024

Nvidia has joined the list of companies facing lawsuits over claims that copyrighted material is being used to train their AI models.

Nvidia is facing legal trouble from a trio of authors, who claim the company used copyrighted books to train one of its AI models.

The US dispute – first reported by Reuters – involves authors Brian Keene, Abdi Nazemian and Stewart O’Nan, who claim their works were included in a dataset used to train NeMo, an Nvidia framework designed to build and customise generative AI models.

This dataset was taken down last October for reported copyright infringement. The three authors are seeking damages for copyrighted works that helped train NeMo’s large language models in the last three years, Reuters reports. The copyright dispute was filed with the US District Court for the Northern District of California.

AI v copyright

The case is the latest in a growing number of copyright infringement disputes facing AI companies. Last year saw a number of authors file suits against both OpenAI and Meta, claiming that the companies’ AI models used the authors’ books as training material.

Those lawsuits claimed the large language models developed by Meta and OpenAI were trained on illegal “shadow libraries” – websites that contain pirated versions of the authors’ books.

Last year also saw thousands sign a letter written by the US Authors Guild, calling on the likes of OpenAI, Alphabet and Meta to stop using their work to train AI models without “consent, credit or compensation”.

Towards the end of 2023, The New York Times stepped into the ring with a high-profile lawsuit against both OpenAI and Microsoft. The media outlet claimed AI models such as ChatGPT have copied and used millions of copyrighted news articles, in-depth investigations and other journalistic work.

In January, OpenAI said it was “surprised and disappointed” by the lawsuit and added the newspaper was “not telling the full story”. It followed up in February with a claim that The New York Times “paid someone to hack OpenAI’s products” to generate “highly anomalous results” used as evidence in its AI copyright case.

In a statement sent to SiliconRepublic.com, Ian Crosby, a partner at Susman Godfrey and lead counsel for The New York Times, noted that OpenAI did not dispute that it copied millions of articles from the media outlet to build its products.

OpenAI is also facing a class-action lawsuit filed last year, which claims the company scraped the internet to train its generative AI chatbot and potentially violated the rights of millions as a result.

Leigh Mc Gowran is a journalist with Silicon Republic
