Ask Dr. Copyright © | Copyright Issues with AI

 Dear Doc:

It seems that all anyone wants to talk about now is artificial intelligence. Are there copyright issues involved in AI?

Signed, ChatGPT 

Dear Chat:

The Doc has been thinking about this question for a long time. Back in the “dark ages” of computers (ca. 1983) we used to call the job of getting information into a computer, so that the computer could do artificially intelligent things, “knowledge engineering”. It was difficult and time-consuming work. Often, the people whose knowledge was being engineered objected, because they were worried that they would soon be replaced by computers (how silly!).

Today, computers just “read” everything they can get their search spiders on. It’s less about engineering and more about just figuring out how language strings together words, phrases, sentences, paragraphs, images and other information to create a simulation of intelligence. Once again, people are worried that they will be replaced by computers – particularly people whose jobs just barely require any intelligence at all. 

Now, from a copyright standpoint, there are two big issues in all of this. First, when the artificially intelligent computer program “reads” things in order to train itself, does it make a copy of the work being read, and is that copy an infringement of copyright? Second, when the AI program creates a response by stringing together words and phrases, is that response a “derivative work” under the copyright law, one that requires permission from the original author? The Doc thinks the answer to both questions is “NO”, but as with everything in the law, there is room for argument.

The way the Web was designed, when you use a browser to read a page, the components of that page are fetched from a web server, copied into the memory of your device (computer, phone, tablet, etc.) and then put on your screen according to the directions given by the server. If the owner of the material does not want you to see it, she can either not place it on a web server in the first place, or include instructions along with it that prevent it from being used or shown. The way that search engines work is much the same, except that instead of showing you the information, the web “spider” program copies the information from the server and builds an index. Well-behaved web spiders are built to respect limits on what they scan, and site owners encode these limits in a file called “robots.txt”. Once the page is displayed, or the index is built, the copied information is thrown away. Just as nobody accuses Google of copyright infringement when it reads a website to index the information, so, too, the Doc thinks that training an AI is not actionable copying under the Copyright Statute.
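For readers curious about what those spider limits look like in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt contents, the spider names ("SomeSpider", "ExampleBot") and the example.com URLs are all made up for illustration; a real spider would download the file from the site it is about to scan.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this spider to read this URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# An ordinary spider may read public pages but not the /private/ area.
print(may_fetch(parser, "SomeSpider", "https://example.com/articles/ai.html"))
print(may_fetch(parser, "SomeSpider", "https://example.com/private/notes.html"))

# The (hypothetical) ExampleBot is shut out of the whole site.
print(may_fetch(parser, "ExampleBot", "https://example.com/articles/ai.html"))
```

Note that nothing in the file technically stops a spider from reading a page. The file only states the owner's wishes, and it is up to the spider to check them before fetching.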

New AI programs such as ChatGPT (and that is only one of many so-called “large language model AIs”) take that one step further. They fetch information from web servers and build specialized maps (models) of the information showing how it is structured. The source original is then thrown away, leaving only the model. While the model is, in some sense, “derived” from the original, the Doc thinks that this is very far from the kinds of derivative works that Congress intended the law to address: translations, musical arrangements, motion picture versions of literary material or plays, art reproductions, abridgments, and condensations of preexisting works. Rather, the AI is creating a deeper kind of index, one that may be used to summarize the millions of different expressions on the Internet in new ways.
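To make the idea of "keeping the model and throwing away the source" concrete, here is a deliberately tiny sketch. It is an assumption-laden toy, not how ChatGPT works: real large language models use neural networks with billions of learned weights, whereas this example only counts which word tends to follow which. The sample sentence and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional


def build_bigram_model(text: str) -> Dict[str, Counter]:
    """Count, for each word, how often each other word follows it."""
    words = text.lower().split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return dict(model)


def most_likely_next(model: Dict[str, Counter], word: str) -> Optional[str]:
    """Return the word seen most often after `word`, or None if unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical "training material", invented for this illustration.
source = "the cat sat on the mat and the cat slept"
model = build_bigram_model(source)
del source  # the original text is discarded; only the counts remain

print(most_likely_next(model, "the"))  # cat
```

What survives is a table of word-to-word statistics rather than the sentence itself. Whether a far larger and more capable model of this general kind counts as a "copy" or a "derivative work" is exactly the question on which there is room for argument.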

What does ChatGPT “think” is the answer? The Doc will let you see for yourself…

The Doc asked: “Do responses by ChatGPT infringe copyrights on its training materials?”

And ChatGPT responded:

“No, responses by ChatGPT do not infringe copyrights on its training materials. The responses generated by the model are based on patterns in the text it was trained on and do not reproduce any specific copyrighted material. However, it’s possible that some responses may contain information that is covered by a third-party’s intellectual property rights, such as trademark or patent rights. OpenAI encourages users to respect the intellectual property rights of others.”

So there you have it! Straight from the artificial horse’s mouth.

Have a cutting-edge question about intellectual property law? Reach out to the attorneys at LW&H. They’re staying on top of this rapidly evolving area of the law so that they don’t get replaced by robots any time soon.

Until next month,

The “Doc” (or some reasonable facsimile thereof)

— Lawrence A. Husick, Esq.

*Originally posted on February 23, 2023