What are the implications of synthetic speech attacks? Do consumers prefer biometrics and more?

Matt Smallman
Last updated
20 Apr 2024

Modern Security Newsletter #002 – February 2023

Welcome to the second edition of the Modern Security Community newsletter (did you spot the rename? 🔍). Yes, Modern Customer Security Community was a bit of a mouthful.

This newsletter aims to provide members with a monthly summary of news, ideas, insight, and analysis in customer security based on my hours of reading and consideration so that you don’t have to. In this edition:

  • Community News – Including new website, recent and upcoming events
  • In the News – My usual roundup of relevant and interesting news
  • Just for fun – Here are some of the more pertinent amusing stories I’ve come across
  • Emerging Synthetic Speech Threat to Voice Biometrics – This month’s featured analysis follows the reported breach of Lloyds Bank’s Voice ID service. The synthetic speech threat is no longer hypothetical.

Community News

  • Events – It’s been a busy month, with two well-attended live events now available to watch and the key highlights tagged to help you find them quickly. I’ve also been investing time and effort in improving the production quality of each event, so I hope you notice the improvement.
  • New website – I launched our new website at symnexconsulting.com/community with a (hopefully) easy-to-use registration and sign-in process. Here you can watch replays of the live events with helpful features like quick links to interesting sections, plus links and files provided by our guests. I’ll be adding the newsletter and event registration shortly; until then, event registration remains on lu.ma/modernsecurity. As always, any feedback is appreciated.
  • New layout and email address – I’m using a new email provider, so you should get this directly from community@symnex.co. Please add it to your safe senders list and let me know your thoughts about the format.

📰 In the News

  • Consumers Prefer Biometrics – An interesting survey by PYMNTS (https://www.pymnts.com/wp-content/uploads/2023/01/PYMNTS-Consumer-Authentication-Preferences-January-2023.pdf) focused primarily on online authentication habits. It’s great to see how consumers have moved from accepting biometrics to demanding them, despite still using passwords the most: almost 52% of consumers prefer using biometrics for authentication. I’m not convinced the survey asked all the right questions, as the subtext is that consumers want both security and ease of use, but they were only really asked about their security preferences.
  • Online banking is really secure, but is it actually usable? – Which? rated the security of online banking (https://www.which.co.uk/money/banking/banking-security-and-payment-methods/online-banking-security/how-safe-is-online-banking-aPdmC5M5Emnj). If you take a step back from Which?’s evaluation of deprecated TLS standards and the inclusion of details in notifications, you will see that in practice all the banks are really secure and that the weak link today is the user (see the prevalence of APP fraud) and the bank’s other channels. The review also didn’t consider whether banks had effectively balanced usability and security, as some pretty horrendous security schemes are featured as good practice (who else has a drawer full of card readers but never one on them when they need it?).
  • Fake Authenticator Apps – As the weaknesses of SMS for One-Time Passcodes have become more apparent, more and more organisations are relying on Time-based One-Time Passcodes (TOTP) generated on a device. These use a QR code or long string to prime the generator, but there have been increasing reports (https://9to5mac.com/2023/02/21/scam-authenticator-app/) of fake apps stealing this QR code, which would enable a fraudster to generate exactly the same codes as you at the same time. Make sure you only use genuine authenticator apps. Google and Microsoft have their own, and one is built into the latest versions of Apple’s iOS. I recommend 1Password (https://1password.com/) and a physical security key (https://www.yubico.com/products/yubikey-5-overview/).
  • Phone number recycling – Network Authentication and Fraud Prevention ensures that a call originates from the number it claims to come from. Still, we must never forget that the network owns the phone number, not the customer. This gives rise to the problem of SIM swap attacks, but in this case (https://go.theregister.com/feed/www.theregister.com/2023/02/21/accidental_whatsapp_account_takeover/) the network simply recycled a number that another customer had stopped using and, as a result, someone else started getting their WhatsApp spam. Something to bear in mind when looking at authentication schemes relying on mobile numbers.
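
To make the fake authenticator app risk above concrete, here is a minimal sketch of how a time-based one-time passcode is derived, following the published TOTP standard (RFC 6238) with its default SHA-1 settings. The function name `totp` and the `SHARED_SECRET` value are illustrative assumptions (the secret is the RFC's published test key, not anything from a real service). It is not how any particular authenticator app is implemented, but it shows why a stolen QR code is so damaging: the code depends only on the shared secret and the clock.

```python
import base64
import hashlib
import hmac
import struct


def totp(secret_b32: str, at_time: int, step: int = 30, digits: int = 6) -> str:
    """Derive a TOTP code (RFC 6238, HMAC-SHA-1) for a given Unix time."""
    key = base64.b32decode(secret_b32, casefold=True)
    # The moving factor is the number of whole time steps since the epoch.
    counter = at_time // step
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low 4 bits of the last byte pick an offset.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Hypothetical secret for illustration: the ASCII test key from RFC 6238,
# Base32-encoded the way it would typically be carried inside a QR code.
SHARED_SECRET = base64.b32encode(b"12345678901234567890").decode()

if __name__ == "__main__":
    now = 1_677_000_000  # any fixed moment; both parties only need the same clock
    genuine_user_code = totp(SHARED_SECRET, now)
    fraudster_code = totp(SHARED_SECRET, now)  # same secret, copied by a fake app
    print(genuine_user_code == fraudster_code)  # True
```

Nothing in the calculation is tied to the device, so whoever holds the secret can produce valid codes for as long as it stays enrolled.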

Analysis – The synthetic speech threat to Voice Biometrics

  • Journalist launches proof of concept attack – A journalist from Motherboard claims (https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice) he successfully defeated Lloyds Bank’s Voice ID (https://www.lloydsbank.com/contact-us/voice-id.html) and accessed his bank account using a synthetic voice. The story was reasonably balanced, with the author admitting that Voice Biometrics was still better than the alternatives for most consumers.
  • Feasibility – The journalist created a voice using ElevenLabs (https://beta.elevenlabs.io) and claimed to use less than 5 minutes of audio to create it. He acknowledged that getting the cadence and tonality right took work, but the voice does sound very realistic. Capturing that amount of high-quality audio will not be trivial for fraudsters, but as I reported last month, both Microsoft (https://arxiv.org/abs/2301.02111) and Google (https://arxiv.org/abs/2210.15868) have tools that require only a few seconds in their labs, so we have to expect that this will become increasingly feasible for fraudsters in the future.
  • Defence in depth – In the associated video (https://youtu.be/kqYSIU70N68), the journalist appears to use his own mobile phone and only got as far as hearing his balance, so it’s unclear how impactful this claimed breach was. Lloyds’s statement says that they have a layered approach to security, so it’s doubtful that Voice ID was the only fraud prevention method in use, and additional authentication may well have been required for higher-risk transactions. Strengthening with Depth is one of the five principles of Modern Security in “Unlock Your Call Centre” (https://www.unlockyourcallcentre.com), and this case highlights its value.
  • Human in the loop – The Lloyds Banking Group (LBG) implementation uses text-dependent or active Voice Biometrics in its IVR, so it requires only short durations of predictable speech for authentication. As we saw, the journalist spent some time getting the speech to sound just right. Similar risks will apply to using text-independent or passive forms of the technology in the IVR or Natural Language Understanding (NLU) solutions because the utterances are short, and fraudsters will have many opportunities to test and adjust. However, the complexity of using this approach against agent-based passive Voice Biometrics is significantly greater because not just the sound of the voice but also its coherence in a conversation with an agent will need to be realistic. That said, organisations should continue to employ defence in depth to ensure that the person they are speaking to at the end of the call, and the person making the riskiest requests, is the same person they authenticated at the start.
  • Text to Speech (TTS) and Synthetic Speech Detection (SSD) – Most synthetic speech platforms are optimised to “sound like” rather than “be like” human voices when evaluated by another human. This means that many features of synthetic voices differ from how real humans speak. Every credible Voice Biometrics vendor should have TTS detection or SSD capabilities, but it is difficult to prove their efficacy given the limited real-world data. Whether they should be enabled, and at what sensitivity levels, needs to be balanced against the cost of the inevitable false alarms as part of an organisation’s threat assessment.
  • Synthetic Voice Watermarks – Watermarking is a process by which Synthetic Speech platforms can introduce inaudible code into their voices that can be detected and attributed to both the platform that produced it and the user. This is helpful as an additional layer of protection for Voice Biometric systems on top of the above detection methods. Unfortunately, there are no public standards yet, and platform providers are cagey with the details despite their public commitments. If you have more luck than me at getting answers, then please let me know.
  • Public figures (and YouTubers) – Some of us already have enough high-quality audio in the public domain to create a synthetic voice, and several public figures have been the victims of malicious videos featuring their voices (https://www.vice.com/en/article/dy7mww/ai-voice-firm-4chan-celebrity-voices-emma-watson-joe-rogan-elevenlabs). This case also highlights the importance of respecting users’ preferences for different authentication methods and of a holistic assessment of customer vulnerability.
  • Conclusion – Voice Biometrics remains a significant security improvement over traditional authentication methods like PINs and passwords, with the added advantage of exceptional usability in the voice channel. This example merely underlines the importance of deploying the technology as part of a coherent modern security strategy.
  • Learn More – We will be digging into the threat landscape for Voice Biometrics, including the most significant vulnerabilities (hint: it’s not Synthetic Speech), with experts in the field at a community event on 4 May (https://lu.ma/ldfmuyca) and into synthetic speech specifically on 18 May. Both are available to book now at lu.ma/modernsecurity.
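
The sensitivity trade-off described under Synthetic Speech Detection above can be sketched in a few lines. This is an illustration only: the scores below are invented for the example and do not come from any vendor or real system, and the names `rates_at`, `GENUINE_SCORES` and `SYNTHETIC_SCORES` are hypothetical. It assumes a detector that outputs a score between 0 and 1, where higher means "more likely synthetic", and a call is flagged when the score meets the threshold.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    threshold: float
    false_alarm_rate: float  # share of genuine callers wrongly flagged
    miss_rate: float         # share of synthetic calls not flagged


def rates_at(threshold, genuine_scores, synthetic_scores):
    """Count errors of both kinds if calls scoring >= threshold are flagged."""
    false_alarms = sum(score >= threshold for score in genuine_scores)
    misses = sum(score < threshold for score in synthetic_scores)
    return Rates(
        threshold,
        false_alarms / len(genuine_scores),
        misses / len(synthetic_scores),
    )


# Invented detector scores, for illustration only (not real measurements).
GENUINE_SCORES = [0.05, 0.10, 0.20, 0.35, 0.60]
SYNTHETIC_SCORES = [0.40, 0.70, 0.80, 0.90, 0.95]

if __name__ == "__main__":
    for threshold in (0.3, 0.5, 0.7):
        print(rates_at(threshold, GENUINE_SCORES, SYNTHETIC_SCORES))
```

With these made-up numbers, the most sensitive setting catches every synthetic call but flags two in five genuine callers, while the least sensitive setting flags no genuine callers but lets one in five synthetic calls through. Where to sit on that curve depends on what a false alarm costs the organisation compared with a missed attack.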

Just for fun

  • How Fingerprints Get their Swirls – The best way to help customers understand how biometrics work is to refer them to fingerprints. We are all, seemingly, born with an innate understanding and acceptance that they are different. But why are they different? This article (https://www.nature.com/articles/d41586-023-00357-x) from Nature discusses how they form during fetal development, in a similar way to a cheetah’s spots or a zebra’s stripes.
  • Dating recordings using electrical hum – We spend much time and effort removing the hum of mains electricity from audio recordings. Still, in this incredibly detailed analysis (https://robertheaton.com/enf/), the author shows how it can be used to date recordings and was used in a prosecution in the UK.
  • A Hacker’s Mind – One of the fathers of modern security thinking, Bruce Schneier, who coined my favourite phrase, “Security Theatre”, has released a new book: “A Hacker’s Mind” (https://www.schneier.com/books/a-hackers-mind/). It focuses on how the hacker mindset is applied to daily life and what “white hat” hackers can do about it.

That’s all for this time. Thanks for being a member of the Modern Security Community.
