Preventing the Harms of AI-enabled Voice Cloning
By the FTC’s Office of Technology and Division of Marketing Practices
November 16, 2023
Today, the FTC is announcing the Voice Cloning Challenge to address the present and emerging harms of artificial intelligence- or “AI”-enabled voice cloning technologies.
Speech synthesis has been around for several decades.[1] Perhaps one of the most famous examples is CallText 5010, the robotic-sounding speech synthesizer[2] Stephen Hawking used after he lost his voice in 1985. And now, going beyond digital voices like Hawking’s and Apple’s Siri, it is possible to clone people’s voices thanks to improvements in text-to-speech AI engines. Voice cloning systems are generally built on large training sets composed of people’s real voices. Since many of these systems are commercially available or even open source, they can be easy for anyone to access, equipping people with a powerful tool to replicate human voices in a way that is hard to detect by ear.
This progress in voice cloning technology offers promise for Americans. In medical applications, for example, it gives people who have lost their voices due to accident or illness the chance to speak as themselves again. It also poses significant risks: families and small businesses can be targeted with fraudulent extortion scams, and creative professionals, such as voice artists, could have their voices appropriated in ways that jeopardize their reputations and ability to earn income. The FTC previously explored similar topics at our 2020 workshop, where we examined advances in artificial intelligence and text-to-speech (TTS) synthesis.
With these threats in mind, the FTC has made clear that we are prepared to use all of our tools to hold bad actors accountable, including law enforcement actions under the FTC Act, the Telemarketing Sales Rule, and other authorities. In addition, the Commission is considering the adoption of a recently proposed Impersonation Rule that would give us additional tools to deter and halt deceptive voice cloning practices.
To further advance this work, the FTC is launching the Voice Cloning Challenge, a new exploratory challenge, which is another tool in our toolkit. The FTC seeks to encourage the development of multidisciplinary solutions—from products to policies to procedures—aimed at protecting consumers from AI-enabled voice cloning harms including fraud and the broader misuse of biometric data and creative content. We are asking the public to submit ideas to detect, evaluate, and monitor cloned voices.
This isn’t a techno-solutionist approach or a call for self-regulation. We hope to generate multidisciplinary tools to prevent harms, and we will continue to enforce the law.
The risks posed by voice cloning and other AI technology require a multidisciplinary response. That’s why this Challenge, and so much of the FTC’s work, is a joint effort across the agency. In this case, the Office of Technology and Bureau of Consumer Protection are proud to coordinate on this Challenge.
We recognize that this Challenge is just one way to approach this problem. The risks involved with voice cloning and other AI technology cannot be addressed by technology alone. It is also clear that policymakers cannot count on self-regulation alone to protect the public. That’s why, at the FTC, we will continue to use all of our tools—including enforcement, rulemaking, and public challenges like this one—to ensure that the promise of AI can be realized for the benefit, rather than to the detriment, of consumers and fair competition.
We are calling for ideas that are administrable, increase company responsibility and reduce consumer burden, and are resilient to rapid technological change.
This Challenge reflects the reality that, while the private sector is richly rewarding the development of AI-related technology, technology to mitigate its potential harms is not developing as organically or at the pace necessary to address them.
For the Challenge, there are three primary areas of focus for assessment:
Administrability and Feasibility to Execute: How well would the idea work in practice, and how feasible would it be to administer and execute?
Increased Company Responsibility, Reduced Consumer Burden: If implemented by upstream actors, how does the idea place liability and responsibility on companies and minimize the burden on consumers? How does it ensure that the assignment of liability and responsibility matches the resources, information, and power of the relevant actors? How does it mitigate risks at their source or otherwise intervene strategically upstream, before harms occur? And if the idea must be implemented by consumers, how easy is it for them to use?
Resilience: How resilient is the idea to rapid technological change and evolving business practices? How easily can the approach be sustained and adapted as voice cloning technology improves? And how will the idea avoid or mitigate any additional safety and security risks it might itself introduce?
The present state of affairs recalls what we saw in the robocall context a decade ago, when the FTC spurred innovators to develop call-blocking technology. That technology has advanced significantly in the years since. In fact, while robocalls remain a scourge, we are proud that over the last few years the number of robocall complaints to the FTC has steadily declined.
The goal of this Challenge is to foster breakthrough ideas on preventing, monitoring, and evaluating malicious voice cloning. This effort may help push forward ideas to mitigate risks upstream—shielding consumers and creative professionals against the harms of voice cloning. It also may help advance ideas to mitigate risks at the consumer level.
And if viable ideas do not emerge, this will send a critical and early warning to policymakers that they should consider stricter limits on the use of this technology, given the difficulty of preventing the harmful development of applications in the marketplace.