Agentic AI cannot have escaped your attention. You are probably thinking about what it can do for you: automating workflows, reducing costs, improving customer experience. But are you thinking about what it will do to you? The phone channel is not inherently hard to defend; we have simply never needed to defend it. Historically, it was very difficult to attack at scale. You needed a human on every call, so the economics did not work for most adversaries, except in the highest-value scenarios I have spent the last decade defending.
That's changed. Large language models and improvements in synthetic speech make the phone almost perfectly suited to agentic callers: a slow, low-bandwidth channel where callers take turns using natural language. The delay between turns gives them time to process, the limited signal available gives defenders very little to work with, and the metadata that comes with a call can't be trusted.
Three Types of Agentic Caller
Most people hear "synthetic voice" and think fraud. For some organisations that's clearly the biggest risk, but many of them have been defending against it for years; they just need to do it at a different scale. What I'm most worried about is all the organisations, perhaps yours, for whom fraud isn't the risk and who don't have the benefit of that experience. I see three distinct categories, and at least one of the other two should probably concern you more.
- Malicious actors are the familiar ones. Fraudsters have targeted contact centres for years, particularly in financial services. LLMs and synthetic speech give them scale. Instead of one call at a time, they can make thousands simultaneously, probing authentication workflows, socially engineering your customers, validating stolen credentials, conducting reconnaissance of security processes, all autonomously. This is the industrialisation of phone-based social engineering.
- Parasitic actors surprised me. These are mostly technology firms using your infrastructure, your agents, and your data to train their models. They navigate your IVRs, interact with your agents, and capture information about your processes, pricing, and policies. Organisations I have spoken to have reported bots that are purely calling to practise. Your contact centre is a free interactive training ground.
- Shadow actors are the category I find most interesting, because it has the broadest applicability. These are organisations that insert themselves between you and your customers, submitting claims, comparing pricing, checking balances, moving funds. Most often your customer has explicitly (but sometimes implicitly) requested them to do this, which can be somewhat telling about the experience you are delivering. These intermediaries now hold your customers' credentials, and who knows how well they're protecting them.
None of them are interested in your cost model. They'll wait forever in a queue and tie up a human agent for as long as they need to get the job done. In industries with a low cost to switch, shadow actors can remove the friction that may have been essential to your revenue model, shopping around for better rates before your customer even thinks to ask and quietly eroding your direct relationship with your customers.
The Scale
During a recent webinar, Reality Defender told me about a B2B insurance contact centre where 15 to 20 percent of inbound call volume at peak times now comes from AI agents. These are not fraudulent calls; they are legitimate businesses whose customers have replaced human callers with voice agents. The problem is that they all spin up at 8am when the centre opens. The bots will wait; the humans will not. At five to six pounds per agent-handled call, even a modest synthetic volume adds up quickly across a large operation.
There is also the possibility of denial-of-service by accident. It is not difficult to make a thousand simultaneous calls over Twilio if you miscode a loop. That could take down a contact centre for a day.
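The accidental flood is also straightforward to guard against. Below is a minimal sketch, in plain Python, of the kind of rate limit any bulk-dial loop should sit behind; `place_call` here is a stand-in for whatever your telephony provider's dial request looks like, not a real Twilio function.

```python
import time

class CallThrottle:
    """Token-bucket limiter for outbound calls.

    A burst of `burst` calls is allowed immediately; after that,
    calls are permitted at `rate` per second. Even a guard this
    simple turns an accidental thousand-call loop into a non-event.
    """

    def __init__(self, rate: float, burst: int = 5):
        self.rate = rate                  # refill rate, calls per second
        self.capacity = burst             # maximum burst size
        self.tokens = float(burst)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

def place_call(number: str) -> None:
    # Stand-in for a real telephony API request (e.g. a Twilio REST call).
    print(f"dialing {number}")

def dial_batch(numbers, throttle, dial=place_call):
    """Dial what the throttle permits now; defer the rest for retry."""
    placed, deferred = [], []
    for n in numbers:
        if throttle.allow():
            dial(n)
            placed.append(n)
        else:
            deferred.append(n)
    return placed, deferred
```

With `rate` set near zero, a buggy loop that tries to dial a thousand numbers places only the initial burst; everything else lands in the deferred list instead of in someone's IVR.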
I should say that even for a two-person consulting outfit, we have an agentic caller. We use it to map client IVRs (with their permission) and understand the experience their customers face. We have trained it to recognise when it reaches a human agent and hang up. It sometimes fails, particularly when queue times are short. Most of the humans it accidentally reached had no idea it wasn't a real person. If a two-person firm can build that, the barrier to entry is essentially zero.
It Will Not Be Neat
Don't expect this to arrive through standards bodies and API specifications. The first agentic callers won't be following protocols. They'll be scrappy startups trying to find product-market fit, or hyperscalers shipping features without consulting the industries they disrupt. Apple shipped Siri call screening last September and overnight broke outbound call workflows across the industry. Nobody planned for it. There was no transition period.
I look back to the early days of Mint.com and Yodlee when financial aggregators did not negotiate with banks for API access; they asked users for their usernames and passwords and scraped the data. The messy version comes first. Standards follow later, driven by the largest enterprises with the most to lose. For everyone else, the phone channel provides a pragmatic shortcut for third parties selling their agentic AI vision with you as their product.
Focus on the Bottom of the Pyramid
There's considerable bias towards the top of the threat spectrum: the sophisticated deepfake clone used against voice biometrics. That is a risk. But relatively few customers are protected by voice biometric authentication. The biggest risk by volume, cost, and operational impact is at the bottom: high-volume synthetic bots using generic, off-the-shelf voices against your automation and your agents.
The challenge is differentiation. A shadow actor calling on behalf of a genuine customer looks like synthetic speech and can be detected as such. But so does a malicious actor probing your authentication. The voice is the same. The intent is completely different. Your response needs to reflect that. Blocking all synthetic callers is a blunt instrument when some of them are acting on behalf of your own customers with their explicit consent. Detection tells you what is on the line. It doesn't tell you why. That second question, intent, is an organisational policy decision, not a technology problem.
Detection First, Then Decide
Without the ability to detect whether a caller is using synthetic speech you can't decide how to treat the call. Detection technology exists that can operate on as little as three seconds of audio, early enough in the IVR to inform routing decisions before you have committed agent capacity.
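As a sketch of what "early enough to inform routing" can look like: accumulate the caller's audio as it streams in and hand it to the detector the moment roughly three seconds are buffered. The detector interface and constants below are assumptions for illustration; real detection products have their own APIs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

SAMPLE_RATE = 8000        # narrowband telephony audio, samples per second
BYTES_PER_SAMPLE = 2      # 16-bit PCM
WINDOW_SECONDS = 3        # detectors can work on roughly this much audio

@dataclass
class DetectionResult:
    synthetic: bool
    confidence: float

def gate_call(
    chunks: Iterable[bytes],
    detector: Callable[[bytes], DetectionResult],
) -> Optional[DetectionResult]:
    """Buffer inbound IVR audio and return a verdict as soon as the
    decision window fills, before agent capacity is committed."""
    needed = SAMPLE_RATE * BYTES_PER_SAMPLE * WINDOW_SECONDS
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) >= needed:
            return detector(bytes(buf))
    return None  # caller hung up before enough audio arrived
```

The point of structuring it this way is that the verdict exists while the call is still in the IVR, so it can feed whatever routing decision your policy calls for.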
What you do with that signal is not a technology question. It is a business policy question, and the answer will be different for a private bank, a B2B insurance platform, and a consumer utility. It'll also need to be adaptable as actors' strategies evolve. I think of the response options as a spectrum:
- Log and analyse, understand the scale before acting. If you have never measured this, start here.
- Catch and release, let calls through but flag which customers are using AI intermediaries.
- Deprioritise, route human customers to agents first. Make the bots wait.
- Bespoke automation, route detected AI callers to an optimised path. If it's not a human, can we get things done more quickly? The phone channel may actually serve as a cheap intermediate automation layer before proper machine-to-machine standards emerge.
- Deny service, block AI callers entirely. I'd generally caution against this. It tips off actors about your detection capability, and for shadow actors who are often acting on behalf of real customers, it doesn't resolve the underlying issue.
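Expressed as configuration rather than prose, the spectrum above is just a policy table: the same detection signal maps to a different treatment per organisation. A minimal sketch, where the business lines and their assigned treatments are illustrative examples, not recommendations:

```python
from enum import Enum

class Treatment(Enum):
    STANDARD = "standard routing"        # human caller, business as usual
    LOG = "log and analyse"
    CATCH_AND_RELEASE = "catch and release"
    DEPRIORITISE = "deprioritise"
    BESPOKE = "bespoke automation"
    DENY = "deny service"

# Illustrative policy table: each organisation maps the same
# detection signal to a different point on the spectrum.
POLICY = {
    "private_bank": Treatment.DENY,
    "b2b_insurance": Treatment.BESPOKE,
    "consumer_utility": Treatment.DEPRIORITISE,
}

def choose_treatment(business_line: str, detected_synthetic: bool) -> Treatment:
    if not detected_synthetic:
        return Treatment.STANDARD
    # Default to the least disruptive option when no policy is defined:
    # measure first, act later.
    return POLICY.get(business_line, Treatment.LOG)
```

Keeping the mapping in data rather than code matters because, as noted above, the policy will need to change as actors' strategies evolve.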
It is also important to monitor throughout the call, not just at entry. We are already seeing man-in-the-middle patterns, where a real customer starts the call and a synthetic voice takes over partway through. Arrival, first connection with an agent, point of authentication, you need a current view at each.
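A sketch of that continuous view: re-run detection at each checkpoint and flag calls that arrive human but turn synthetic later, the man-in-the-middle pattern. The checkpoint names and both callable interfaces are assumptions for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical checkpoints: arrival, first connection with an agent,
# and the point of authentication.
CHECKPOINTS = ("arrival", "agent_connect", "authentication")

def monitor_call(
    audio_at: Callable[[str], bytes],
    is_synthetic: Callable[[bytes], bool],
) -> Tuple[Dict[str, bool], bool]:
    """Run detection on the audio captured at each checkpoint.

    Returns per-checkpoint results, plus a flag for the
    man-in-the-middle pattern: a call that arrived human
    but tested synthetic at a later checkpoint."""
    results = {cp: is_synthetic(audio_at(cp)) for cp in CHECKPOINTS}
    switched = not results["arrival"] and any(
        results[cp] for cp in CHECKPOINTS[1:]
    )
    return results, switched
```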
What This Means
The conversation about agentic AI in contact centres has been almost entirely about what it can do for you. The conversation about what it will do to you is just beginning. For most organisations, the operational and commercial impacts of shadow and parasitic callers will be larger than the fraud risk, but very few have the fraud, security and operations teams in the same room talking about it.
The organisations that navigate what comes next will be those with detection in place, treatment strategies defined for each type of caller, and the flexibility to adapt when the next unplanned disruption lands. The ones that don't will find out the hard way that their contact centre has become someone else's product.
I recently delivered a webinar on this topic with Brian Levin and Aphrodite Brinsmead from Reality Defender. I work with organisations navigating contact centre security challenges, including synthetic speech detection and response strategy. If you'd like to discuss your specific situation, you can book a time here or contact me directly.