An FCC Ruling on the NPRM Addressing AI-Generated Robocalls/Robotexts Could Hurt Legitimate Businesses
FCC 24-84 proposes a definition of AI-generated calls as well as new rules that would require callers to disclose to consumers when they receive an AI-generated call. A ruling in this area could harm legitimate businesses, both because labeling calls as “AI-generated” is inherently problematic and because the phone user interface (UI) currently lacks an efficient way to disclose the type of call to a consumer.
Regulate Impersonation Only or All Synthetic Voice?
One of the biggest issues with this NPRM is the definition of “AI-generated call” relative to the aims of the FCC. This NPRM was justifiably prompted by the “Deep Fake” of Joe Biden that occurred in conjunction with unlawful political calls in the State of New Hampshire.
These calls were deemed a Deep Fake because they involved synthesized audio to impersonate President Joe Biden, a well-known public figure recognizable by name and voice to millions of citizens. The calls could also be considered a Deep Fake based on the context of the communication, including the call-to-action (telling consumers there is no need to vote) and the telephone number identity (e.g. use of a spoofed telephone number).
The word impersonate is key because the likeness of his voice (e.g. a synthesized version using AI-generated audio) was used without permission. If these had been legal/ethical calls, approved by the Biden administration, the word mimic might be more appropriate. However, the calls were not legal and permission was not granted. It is recommended that the words impersonate and impersonation always inherently mean without permission, and thus denote a form of fraud that is therefore unlawful.
In contrast, mimicking a voice with the person’s permission, such as an actor licensing rights for a movie production, would not be unlawful. AI may also be used to synthesize the voice of a person who is not a public figure, such as Gerry Christensen. AI may also be used to generate audio for a non-person: a completely synthetic identity (e.g. a unique voice and associated name that is unlike any other in the world). In the case of Gerry Christensen, absence of permission to mimic would represent impersonation. In the case of voice generated in association with a synthetic identity, neither impersonation nor mimicry seems to fit, as there is nothing to copy from a real human being, alive or dead.
Deep Fakes (which, again, imply impersonation and are thus unlawful) arguably have the potential for great harm to consumers and society as a whole. As with the Biden impersonation in New Hampshire, Deep Fakes are conducted by bad actors relying on context that may involve two factors: “what you know” (calling registered Democrats in New Hampshire) and “what you have” (a spoofed telephone number). Either or both factors may be employed by bad actors using intelligence gained from public sources, social engineering, and/or data breaches that occasionally occur at commercial and government sources.
The threat of Deep Fakes should not be taken lightly. However, scale and scope should be a consideration. For example, impersonating Joe Biden’s voice could impact millions of people, whereas impersonating Gerry Christensen’s voice would likely only impact family, friends, and business associates. A Deep Fake of a well-known public figure would therefore have bearing on broadcast vishing/smishing attempts, whereas impersonating someone like Gerry Christensen would be limited to highly targeted “spear” vishing/smishing and would not involve wide-scale B2C communications.
It is therefore important to ask: does the FCC want to regulate all impersonation, or only instances involving public figures? If the FCC wants to regulate only for public figures, what is the measure of determination? If the FCC wants to regulate all impersonation, how will the industry know that a real person’s voice is being synthesized, and if so, how would licensing agreements for legal mimicking be tracked? There are also issues to consider if the FCC wants to regulate the use of all synthesized voice, as doing so would be burdensome to legitimate businesses.
Labeling All Pre-recorded Calls as “AI-generated” would cause Undue Burden
Some legitimate businesses use AI as part of their lawful and ethical business practices. For example, some companies use machine learning (ML) and large language models (LLMs) to produce pre-recorded voice for the purpose of providing information and/or a prompt to consumers when they receive a call from a legitimate business. A pre-recorded message using AI to synthesize voice could be, “This is Gerry calling from [XYZ] pharmacy about your prescription that is ready for pickup. Press one or call us at 800-555-1212 if you have any questions. Otherwise, you may pick up your prescription any time.” Note that this example states simply Gerry (not Gerry Christensen), and Gerry Christensen is not a public figure or someone most people would recognize by name or voice. Also note that a random name (e.g. a synthetic identity) could be used with synthesized voice.
Pre-recorded audio is already governed by the Federal Trade Commission (FTC) Telemarketing Sales Rule (TSR) in terms of required disclosures and consent. Consent is also already governed by the FCC for compliance with TCPA rules, including revocation. Best practices also recommend use of the Reassigned Number Database (RND) to help ensure right-party contact. Companies that follow the aforementioned rules and recommendations would already be in compliance with FCC and FTC regulations. Burdening legitimate, law-abiding companies with the need to first label, and then disclose based on, such usage would likely create unnecessary inefficiencies.
Disclosure of “AI-generated Calls” to Consumers is Currently Problematic
Conveying trust to consumers at the UI for inbound calls is problematic. While there are some efforts underway, such as proprietary and standards-based means of providing branded calls, efforts to facilitate a trusted user experience have largely fallen flat. For example, the green checkmark and/or “Verified Caller” indication on the UI may provide a false sense of security, as it conveys only that STIR/SHAKEN has been applied to the call; it does not ensure that the call is from a legitimate entity with lawful intent.
The aforementioned Biden impersonation calls are a perfect example of this issue, as A-level attestation was improperly applied to those calls. In addition, even when A-level attestation is properly applied within the industry (e.g. the network is known and the telephone number is associated with that network), calls carrying A-level attestation may still be observed to have unlawful intent based on their behaviors as measured by content-based analytics.
As mentioned, branded calling is one means of instilling trust in consumers for organizations that have the means and inclination to pay more to convey their logo, name, and even the reason for some calls. Branded calling is effective at the UI level, but there will be a need for increased consumer awareness that these types of calls may be trusted as genuinely coming from the company depicted. This trust stems from industry best practices for enhanced Know Your Customer (KYC) procedures, with the branded calling ecosystem thoroughly vetting organizations before allowing them to brand their calls. In addition, best practices suggest that KYC should include more than just vetting; it should also include ongoing monitoring to ensure compliance with regulations and, of course, to identify the rare case of fraud that may occur despite vetting. However, not every company will have the means to offer branded calling, and for some businesses, such as third-party debt collection, branded calling is not a viable method for increasing effective business-to-consumer engagement.
Need for New Forms of Digital Identity for Consumer Contact
Taking the aforementioned issues into consideration, there is a need to evaluate enhancements at the phone UI with a goal of transparency, accountability, and non-repudiation with respect to organizations taking responsibility for calls to consumers. It is recommended that the FCC consider advanced digital identity management as a means of verifying an entity involved with consumer contact. To date, the industry relies largely upon network and telephone number identification and association. A prime example for voice calls is the previously mentioned STIR/SHAKEN, a framework developed to address unauthorized telephone number spoofing that uses an in-band, signaling-based method of authentication and validation.
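For illustration, the sketch below shows the kind of claims carried in a SHAKEN PASSporT (claim names follow RFC 8588; all values here are hypothetical, not taken from any real call). Note what is absent from these claims: nothing states who the caller actually is, whether the audio is synthetic, or whether the call’s intent is lawful, which is why attestation alone cannot convey trust.

```python
# Illustrative sketch only: the claims a SHAKEN PASSporT carries (per RFC 8588).
# Real tokens are signed JWTs produced by the originating provider; these values
# are hypothetical examples.
import json
import time

shaken_payload = {
    "attest": "A",                        # attestation level: A, B, or C
    "orig": {"tn": "12025550100"},        # calling telephone number
    "dest": {"tn": ["12025550123"]},      # called telephone number(s)
    "iat": int(time.time()),              # issued-at timestamp
    "origid": "example-origid-uuid",      # opaque originating identifier
}

# Nothing here identifies the responsible organization, discloses synthetic
# voice, or speaks to lawful intent.
print(json.dumps(shaken_payload, indent=2))
```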
One recommended form of digital identity is “Organizational Identity,” which would provide traceability to the organization (business, government agency, or NGO) responsible for conduct associated with phone number usage. For example, if a telephone call is found to exhibit unlawful behavior, attribution of an organization to that call and phone number (at the time the call is placed) provides non-repudiation. This is very important for non-real-time, post-call purposes as a means of providing accountability. More advanced implementations of an Organizational ID that involve network and analytics integration could provide a means of real-time call processing.
Another example of digital identity is a means of tracking AI-generated voice. A Proof-of-Life (PoL) ID could be leveraged to identify whether a communications attempt is associated with a real human or a synthesized voice (e.g. a bot). Unlike the Organizational ID, which can represent a virtually infinite number of descriptors, the PoL ID is a Boolean attribute representing either “Human” or “Bot”.
One important aspect of both the Organizational ID and the PoL ID is reliable authentication and validation. Therefore, it is recommended that both leverage cryptographically verifiable credentials. The use of blockchain or some other similar means of verification (e.g. one that cannot be counterfeited and provides traceability and attribution) is strongly recommended.
It is also strongly recommended that the Organizational ID and PoL ID be used in conjunction with Know Your Customer (KYC) onboarding and monitoring best practices. For example, a business may be assigned an Organizational ID upon becoming a customer of a communications service provider. This ID would be associated with all telephone numbers in use at any given point in time, providing a means of KYC attribution and thus accountability to the business itself for compliance with applicable regulations and laws.
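As a rough sketch of how such a verifiable credential might work, the example below signs a combined Organizational ID / PoL claim with an Ed25519 key held by a vetting authority. The field names, the flat JSON structure, and the use of the Python “cryptography” package are illustrative assumptions, not a proposed standard or an existing industry format.

```python
# Minimal sketch of a cryptographically verifiable Organizational ID / PoL claim.
# Field names and the Ed25519 signing scheme are assumptions for illustration.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

issuer_key = Ed25519PrivateKey.generate()   # held by the vetting/KYC authority
issuer_pub = issuer_key.public_key()        # distributed to verifiers

claim = {
    "org_id": "org:example-pharmacy-001",   # hypothetical Organizational ID
    "tn": "18005551212",                    # telephone number the claim covers
    "pol": 0,                               # Proof-of-Life: 1 = Human, 0 = Bot
}
payload = json.dumps(claim, sort_keys=True).encode()
signature = issuer_key.sign(payload)        # credential issued to the organization

# A terminating provider or analytics engine can later verify the claim:
try:
    issuer_pub.verify(signature, payload)
    print("claim verified:", claim)
except InvalidSignature:
    print("claim rejected: signature does not match")
```

Any tamper-evident verification layer (blockchain or otherwise) could sit behind this signing step; the essential property is that the claim cannot be counterfeited and remains attributable to the issuing authority and the named organization.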
For the PoL ID, organizations would similarly be required to register telephone numbers as being associated with “Human”=1 or “Bot”=0. As business needs and/or telephone numbers change, administrative updates may be required to change 1-to-0, or vice versa, from time to time.
From a KYC monitoring perspective, observations will reveal whether PoL claims have been made and, if so, whether the claim is valid (e.g. did the organization claim “Human” when the caller was actually found to be a “Bot”). Organizations found to be making false claims regarding PoL could be treated in a manner similar to those that make other false claims pertaining to KYC, related terms of service, and compliance with applicable regulations and laws.
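A minimal sketch of what PoL registration and monitoring could look like follows, assuming the simple “Human”=1 / “Bot”=0 encoding described above; the registry structure and function names are hypothetical.

```python
# Hypothetical PoL registry plus a monitoring check comparing registered claims
# against analytics observations. Illustrative only.
from dataclasses import dataclass

@dataclass
class PolRecord:
    org_id: str        # Organizational ID responsible for the number
    tn: str            # telephone number
    pol_claim: int     # registered claim: 1 = Human, 0 = Bot

registry: dict[str, PolRecord] = {}

def register(org_id: str, tn: str, pol_claim: int) -> None:
    """Onboarding or administrative update: record or change the PoL claim."""
    registry[tn] = PolRecord(org_id, tn, pol_claim)

def check_claim(tn: str, observed_bot: bool) -> str:
    """Monitoring: compare an analytics observation against the registered claim."""
    record = registry.get(tn)
    if record is None:
        return "no PoL claim on file"
    claimed_bot = record.pol_claim == 0
    if claimed_bot == observed_bot:
        return f"claim consistent for {record.org_id}"
    return f"false claim by {record.org_id}: claimed " + ("Bot" if claimed_bot else "Human")

register("org:example-pharmacy-001", "18005551212", 0)     # registered as Bot
print(check_claim("18005551212", observed_bot=True))        # consistent
print(check_claim("18005551212", observed_bot=False))       # flagged for review
```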
New Digital Identities Require User Interface Enhancements
While there is arguably a need to discern bot vs. human, the industry also needs UI/UX improvements to convey to consumers when an incoming communications attempt will involve interaction with a bot or a human. The specific means of conveyance could be as simple as illuminating either a (to-be-developed) “bot” icon or a “human” icon on a smartphone screen upon receiving a call or text. The latter would allow the called party to have confidence that a human being will be on the other end if they answer the call.
If the “Bot” icon is illuminated, an additional (to be developed) indication on the phone screen (such as a Blue Checkmark) could demonstrate that the calling/texting party is a known, “Verified Organization”. This would provide an improved level of consumer confidence, even if the initiating entity uses synthetic voice.
Consumers seeing the “bot” indicator along with the Verified Organization notification would gain a sense of trust that they are being called by a vetted organization that happens to be using AI for some portion of the call. Therefore, it would be very important for KYC associated with these suggested digital identities to be at least as robust as the KYC employed for branded calling.
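To make the suggested UI behavior concrete, the sketch below shows one way a handset could select indicators from a PoL claim and a “Verified Organization” status. The icon names and decision rules are illustrative assumptions, not an existing handset API or a prescribed design.

```python
# Illustrative indicator-selection logic for an incoming call or text.
from typing import List, Optional

def choose_indicators(pol_claim: Optional[int], org_verified: bool) -> List[str]:
    """Return the hypothetical icons to illuminate for an incoming call.

    pol_claim: 1 = Human, 0 = Bot, None = no verified claim available.
    org_verified: True if the caller is a vetted, Verified Organization.
    """
    icons: List[str] = []
    if pol_claim is None:
        return icons                       # no claim: show nothing extra
    icons.append("human-icon" if pol_claim == 1 else "bot-icon")
    if pol_claim == 0 and org_verified:
        icons.append("verified-organization-checkmark")
    return icons

print(choose_indicators(pol_claim=0, org_verified=True))     # bot from a vetted org
print(choose_indicators(pol_claim=1, org_verified=False))    # human caller
print(choose_indicators(pol_claim=None, org_verified=False)) # no claim available
```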
Recommended: Regulating AI-generated Calls for Conversational Dialog
Conversational AI refers to the use of artificial intelligence to enable natural and fluid interactions between humans and machines through conversation. It involves the development of systems that can understand and respond to human language in a way that mimics natural-sounding human conversation. While replicating a specific human’s speech patterns is more difficult, conversational AI provides a mechanism for interactive dialog that lends plausibility to the claim that the voice actually belongs to the person it purports to be.
Unlike the Biden Deep Fake in New Hampshire, which merely employed static audio to generate pre-recorded voice, conversational AI is interactive, dynamically changing based on consumer interaction. It would therefore be especially deceptive and a potentially greater threat to consumers than static audio. Accordingly, it is recommended that the FCC consider prioritizing regulations that address calls leveraging AI to replicate conversational speech through dynamic synthetic voice generation.
Some legitimate businesses will want to be pioneers in leveraging AI-generated calls involving conversational dialog (e.g. use conversational AI to engage with consumers). This could have great benefit to businesses, consumers and society as a whole once efficient and effective KYC, traceability and disclosure methods are put into place and governed by regulatory bodies including the FCC and FTC. Conversely, bad actors will also seek to abuse conversational AI for purposes of defrauding the public as well as legitimate organizations.
Accordingly, it is highly recommended that the FCC prioritize regulations involving conversational AI, including methods for tracing responsible organizations and their use of AI technology, as well as a means of conveying said instances to consumers at the UI, allowing an informed decision when answering a call. Consumers would learn that calls including such information at the UI are more trustworthy than those without such disclosures, which over time would diminish the effectiveness of bad actor campaigns involving conversational AI.
Recommended: Make Organizational ID an Urgent Priority
Taking all of the aforementioned into consideration, catalyzing the development and implementation of Organizational ID is of the utmost importance. Harm to consumers in New Hampshire could have been prevented if Organizational ID, along with the means to disclose it via the phone UI, had been available. Had these been in place when the Biden Deep Fake calls occurred, consumers would have been much less inclined to heed the call-to-action, as they would have had little reason to believe the calls were actually authorized by the Biden administration.
It is highly recommended that the FCC consider efforts to accelerate development and implementation of foundational trust technologies and solutions involving the aforementioned digital identities, along with the means of conveying said information to consumers via the phone UI. Putting this in place as soon as possible is arguably a much greater priority than requirements for AI usage that would only burden legitimate companies, as bad actors, by definition, will not comply.
Once these foundational technologies and solutions are in place, additional consumer and business benefits may be realized. For example, a “Consent Claim” may also be tracked and conveyed to consumers. A person receiving a call may see an icon indicating that they are receiving a call from an organization that claims to have consent to call them. This type of information could be extremely helpful for consumers to make an informed decision as many calls may never be branded. Even for those calls that are branded, and/or carry the aforementioned suggested “Verified Organization” designation, consumers may not know/remember that they have provided consent.
Just as organizational responsibility (e.g. traceability, accountability, and non-repudiation) would be enabled via the Organizational ID, the same would be true of a Consent Claim designation. While not every Consent Claim may be verifiable (e.g. some do not employ robust opt-in tracking*), claims that utilize robust verification would enable proof of lawful contact on the part of the organization engaging with consumers. Accordingly, many organizations that engage in consumer contact would strive to utilize verifiable Consent Claim solutions. This would have many benefits, including improved consumer engagement and provable compliance with applicable regulations such as consent revocation and, in the case of telemarketing/telesales, express written consent on a one-to-one basis.
*Note: Examples of solutions that maintain verifiable consent include ActiveProspect’s “TrustedForm” solution and Verisk’s “Jornaya” solution. These solutions are designed especially for telemarketing/telesales campaigns that employ “lead lists” to ensure compliance with applicable regulations, such as the FCC’s requirement for express written consent on a one-to-one basis.
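To illustrate the Consent Claim concept described above, the sketch below (placed after the footnote) shows a hypothetical record that could accompany a call when an organization asserts consent. All field names are assumptions for illustration and do not reference any existing solution or standard.

```python
# Hypothetical Consent Claim record, assuming a verifiable opt-in log exists.
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class ConsentClaim:
    org_id: str          # Organizational ID asserting consent
    consumer_tn: str     # number the consumer consented to be reached on
    obtained_at: datetime
    method: str          # e.g. "web-form-opt-in", "ivr-confirmation"
    evidence_ref: str    # pointer to the stored opt-in record, if verifiable

claim = ConsentClaim(
    org_id="org:example-pharmacy-001",
    consumer_tn="12025550123",
    obtained_at=datetime(2024, 5, 1, tzinfo=timezone.utc),
    method="web-form-opt-in",
    evidence_ref="optin-record-0001",
)

# A handset could illuminate a "consent claimed" icon when such a record is
# attached to a call; a verifiable evidence_ref enables post-call audit.
print(claim)
```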
About the Author
In his current role, Gerry Christensen is responsible for regulatory compliance, serving as an internal advisor to Caller ID Reputation® and its customers as well as externally with respect to policy-making, industry solutions, and standards. In this capacity, Gerry relies on his knowledge of regulations governing B2C communications engagement, including the Truth in Caller ID Act, the Telephone Consumer Protection Act of 1991, state “mini-TCPA” laws and statutes governing consumer contact, various Federal Communications Commission rules, and the Federal Trade Commission’s Telemarketing Sales Rule (FTC TSR).
Christensen coined the term “Bad Actor’s Dilemma,” which conveys the notion that unlawful callers often (1) do not self-identify and/or (2) commit brand impersonation (explicit or implied) when calling consumers. These requirements are addressed explicitly in the FTC TSR (see 310.3 and 310.4) and implicitly in the Truth in Caller ID Act. Christensen has expertise in VoIP, messaging, and other IP-based communications. Gerry is also an expert in solutions for identifying unwanted robocalls as well as enabling wanted business calls, including authentication, organizational identity, and use of various important data resources such as the DNO, DNC, and RND. His expertise also includes digital forensics (audio/text) to identify unlawful communications.
Gerry is also an expert in technologies and solutions to facilitate accurate and consistent communications identity. This includes authentication and validation methods such as STIR/SHAKEN as well as various non-standard techniques. His expertise also includes non-network/telephone number methods such as cryptographically identifiable means of verifying organizational identity. In total, Christensen’s knowledge and skills make him uniquely qualified as an industry expert in establishing a trust framework for supporting wanted business communications.