AI's Use in Copyright Law: Is It Permissible as Fair Use or an Infringement?

Contested court cases are stirring debate over AI training, focusing on fair use, creators' rights, and the shaping of ethically sound AI development.

In a series of high-profile lawsuits, major entertainment companies like Disney, Universal, and Warner Bros. have accused AI companies of using copyrighted scripts, stills, and character designs without permission to train their AI models. These legal battles, which are testing the boundaries of fair use and creative rights, are reshaping the landscape of AI development and raising important questions about the role of creators in the AI era.

The use of copyrighted material for training AI models is evaluated under the four-factor fair use test codified in Section 107 of the U.S. Copyright Act. The four factors are:

  1. Purpose and character of the use: Courts look favorably on uses that transform the original work's purpose, such as teaching an AI to generate new text or art. AI companies argue that training qualifies because it involves extracting patterns to build new systems, while critics argue that generative models can closely mimic the style of specific authors, artists, or musicians.
  2. Nature of the copyrighted work: Whether the work is creative or factual can influence the fair use analysis. In the AI context, the second factor typically weighs against fair use, as generative AI models are often trained on highly expressive content such as fiction, poetry, visual art, and film scripts.
  3. Amount and substantiality of the portion used: Using large amounts of a work may weigh against fair use, particularly if the use replicates its core creative expression. Many creators, publishers, and rights holders regard the large-scale ingestion of entire texts or datasets required for AI training as unlicensed copying.
  4. Effect on the market: Courts consider whether the use harms the market for the original works or causes market dilution. Creators argue that AI-generated content could substitute for their work, especially when it imitates a distinctive style or is used commercially.

Recent court decisions, particularly in 2025, have found that training AI models on lawfully obtained copyrighted works can qualify as fair use if the use is deemed highly "transformative"—meaning the AI uses the works to learn language patterns and generate new, unrelated content rather than reproducing the original material. However, these rulings are fact-specific and do not create broad absolute safe harbors.

Prominent 2025 cases, such as Bartz v. Anthropic and Kadrey v. Meta, found AI training fair use because the training was transformative and did not demonstrably harm the market for the original works. Yet, courts warned that if concrete evidence of market harm or dilution existed, fair use might not apply. Additionally, courts emphasize that training on pirated or unlawfully obtained materials likely violates copyright and is not protected by fair use. Licensing or purchasing data for training further protects AI developers from liability.

In the UK, officials have released new guidance on AI ethics and creative rights, and efforts to give creators more control, such as opt-out systems, are gaining traction. Notably, Tennessee passed the ELVIS Act to protect people's voices and likenesses from being used without permission in AI-generated content.

One recent decision found that Anthropic's use of copyrighted books to train its Claude AI model could qualify as fair use, even though the company was also found to have downloaded millions of books from pirate websites. This highlights the complexities involved in these cases, as well as the importance of lawfully sourcing data for AI training.

As the legal landscape evolves, some AI companies, such as Adobe, train their models only on licensed or owned content, while rights holders such as Getty Images and the Authors Guild have sued Stability AI and OpenAI, respectively, for using their copyrighted works without permission. President Donald Trump has publicly rejected proposals requiring AI companies to compensate creators for training on their work.

The legal basis for the fair use defense remains uncertain, as the training of generative models is a novel use on a massive scale that does not fit neatly into established fair use precedents. The unresolved questions surrounding each factor of the fair use test, as applied to generative AI, will continue to shape these ongoing legal battles and the future of AI development.