Yesterday, the News/Media Alliance published a White Paper and a technical analysis, and submitted comments to the U.S. Copyright Office, on the use of publisher content to power generative artificial intelligence (GAI) technologies. Together, the three publications document the pervasive, unauthorized use of publisher content by GAI developers, the impact this may have on the sustainability and availability of high-quality original content, and the legal implications of such use. GAI systems have been developed by copying massive amounts of the expressive material published by the Alliance’s members, almost always without authorization or compensation, to create new products and services that frequently compete with Alliance member publishers.
The Alliance recognizes the exciting potential of GAI models and applications to improve aspects of our lives and supports the principled development of these systems. But this development must not come at the expense of publishers and journalists who invest considerable time and resources producing material that keeps our communities informed, safe, and entertained, and holds our government officials and other decision makers in check. The Alliance and its members would welcome working with GAI developers to help build and grow these technologies in a sustainable and responsible manner.
While the Copyright Office submission and White Paper discuss the wider publisher landscape in the face of the GAI revolution, including relevant principles of copyright law, the accompanying technical analysis documents the extent to which GAI developers rely on high-quality journalistic content to power their models. In particular, the results show:
Alliance President & CEO Danielle Coffey stated, “The research and analysis we've conducted shows that AI companies and developers are not only engaging in unauthorized copying of our members' content to train their products, but they are using it pervasively and to a greater extent than other sources. This shows they recognize our unique value, and yet most of these developers are not obtaining proper permissions through licensing agreements or compensating publishers for the use of this content. This diminishment of high-quality, human-created content harms not only publishers but the sustainability of AI models themselves and the availability of reliable, trustworthy information.”
The Copyright Office comments and the White Paper offer multiple recommendations to policymakers, including recognizing that unauthorized use of publishers' expressive content for commercial GAI training and development is likely to compete with and harm publisher businesses in a manner that infringes copyright; creating transparency requirements that mandate disclosure of the use of copyright-protected content in training; encouraging and facilitating effective licensing solutions; supporting international cooperation and harmonization on GAI regulations; and adopting legislation to remedy existing market imbalances that prevent publishers from engaging in fair negotiations with dominant platforms for the use of their content.
Coffey continued, "Generative AI systems should be held responsible and accountable, just like any other business. This White Paper demonstrates that these systems rely on journalistic and creative content, which have the benefit of investment in quality on the front end, as well as publishers who are required by law to take responsibility for the content they share with the public. Continued unauthorized use will harm existing markets that acknowledge the value of archived and real-time quality content, and over time the GAI models themselves will deteriorate. You get out what you put in. It is critical that our copyright protections are properly enforced and that high standards of quality and accountability are the foundation of these and other new technologies."