Sarah Silverman’s ChatGPT Lawsuit Raises Big Copyright Questions for AI Models

One of the tricky questions raised by the generative artificial-intelligence phenomenon is whether comprehensive large language models can be built without violating the rights of content creators.

That issue resurfaced Monday, after the comedian Sarah Silverman and two other authors filed a pair of proposed class-action lawsuits against OpenAI, parent of ChatGPT, and Meta Platforms (ticker: META), which created LLaMA, a widely adopted AI model.

In the suits, filed last week in federal court in San Francisco, the authors assert that the two companies illegally used their works to train their models. The three plaintiffs, Silverman and the authors Christopher Golden and Richard Kadrey, assert that the OpenAI and Meta models were trained in part on vast digital-book collections known as “shadow libraries” that contain unlicensed copyrighted material.

Among other things, the OpenAI suit notes that asking ChatGPT to summarize Silverman’s book “The Bedwetter” returns a detailed summary, and asserts that a summary of that kind wouldn’t be possible unless the full text had been included in the model’s training data.

The lawsuits were filed by the lawyers Joseph Saveri and Matthew Butterick, who together have filed several other lawsuits focused on the use of unlicensed copyrighted material in training sets for large language models. The two previously sued Microsoft’s (MSFT) GitHub and OpenAI over alleged AI-enabled software piracy, and filed another suit against Stability AI, Midjourney and DeviantArt on behalf of a group of artists who assert that their work was included in training data without permission. They also filed a suit against OpenAI on behalf of the authors Paul Tremblay and Mona Awad, making claims similar to those in the Silverman suit.

Meta declined to comment on the lawsuits. OpenAI did not immediately respond to a request for comment.

Most creators of large language models disclose little about the data underlying their models. One exception is Adobe (ADBE), which has said its image-creation model uses only images already licensed to the company in its stock-photography business, or those no longer under copyright protection. Adobe is also developing a method for creators who include their works in Adobe Stock to opt out of inclusion in training models, and to be compensated when the software creates new work based on a particular artist’s creations.

Write to Eric J. Savitz at [email protected]
