Company Description

Spitch is a global provider of B2B and B2C Conversational AI solutions. It has been headquartered in Switzerland since 2014 and has a presence in many countries across Europe and North America. Spitch helps enterprises better understand and serve their customers through Natural Language Processing (NLP), Artificial Intelligence (AI) and Machine Learning. Spitch owns its core technology and develops it continuously, taking it to market as end-to-end products such as virtual assistants, voice biometrics and speech analytics. Delivered from one central technology stack, Spitch provides a unique and truly omnichannel experience: voice and text chat are automatically synchronised in solutions that provide both customer and employee support services in a flexible and seamless way. Spitch’s growing client portfolio includes Tier 1 Swiss banks renowned for a tradition of quality service and security.

Products and Services

The text-independent and hybrid voice biometrics product from Spitch uses live, spontaneous speech to identify and authenticate callers within a few seconds. It provides voice identification, continuous identity verification and speaker change detection throughout the conversation. Over 500 biometric parameters of a speaker’s voice are measured, which allows the Spitch system to differentiate even between the voices of identical twins. The system uses behavioral, emotional and semantic statistical models to ensure the highest precision. A combination of methods helps prevent fraud associated with identity theft and can also be used to scan digitized audio archives for more effective analysis and investigations. Voice biometrics is part of the Spitch omnichannel conversational platform and works seamlessly with other products, such as virtual assistants and speech analytics. Where required, Spitch fuses voice biometrics with face recognition to fulfil specific customer needs.

Key Differentiators

Spitch uses a cross-channel voice biometrics (VB) approach that is either text-independent (based on free speech) or hybrid (VB combined with one-off phrases and speech-to-text, STT). The approach includes continuous authentication, speaker change detection, voice identification, and behavioral, emotional and semantic statistical models. The Spitch engine also has text-dependent VB capability. For text-dependent verification, the proprietary Spitch engine uses anti-spoofing techniques.

  • Spitch uses modern x-vector embeddings and unsupervised adaptation of PLDA (probabilistic linear discriminant analysis) models, which makes its models significantly more robust to channel variation and changes in the acoustic environment.
  • DNN modeling allows building language-independent but phonetically aware models that can be used in text-independent, text-dependent or hybrid approaches. A neural network embedding architecture is used to learn voice biometric features (512 in total). The x-vectors are extracted from the embedding layers of the network. Augmenting the training data with noise and reverberation improves the performance of the embedding architecture. The embeddings outperform traditional i-vectors on both short and long speech segments.
  • The bottleneck-features approach guarantees extraction of all biometric-related features.
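
As a rough illustration of how fixed-length speaker embeddings such as the 512-dimensional x-vectors above can be compared, the sketch below scores an enrolled voiceprint against a probe using cosine similarity. This is a deliberate simplification: the text describes PLDA-based modeling, which is not reproduced here, and the function names, toy vectors and threshold value are hypothetical and do not represent Spitch's implementation.

```python
import math
import random

EMBEDDING_DIM = 512      # matches the 512 features mentioned in the text
TOY_THRESHOLD = 0.5      # hypothetical value; real thresholds are tuned per use case


def cosine_score(a, b):
    """Return the cosine similarity between two equal-length embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def verify(enrolled, probe, threshold=TOY_THRESHOLD):
    """Accept the probe if its score against the enrolled voiceprint reaches the threshold."""
    return cosine_score(enrolled, probe) >= threshold


# Toy data only: random vectors stand in for embeddings from a trained network.
rng = random.Random(0)
enrolled = [rng.gauss(0.0, 1.0) for _ in range(EMBEDDING_DIM)]
same_speaker = [x + rng.gauss(0.0, 0.3) for x in enrolled]   # enrolled voice plus noise
other_speaker = [rng.gauss(0.0, 1.0) for _ in range(EMBEDDING_DIM)]

if __name__ == "__main__":
    print("same speaker accepted:", verify(enrolled, same_speaker))
    print("other speaker accepted:", verify(enrolled, other_speaker))
```

In a production system the scoring backend, score normalisation and threshold would all be calibrated on real enrollment and test data rather than fixed as they are here.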

Where required, Spitch fuses VB with face recognition to fulfil specific customer needs. Spitch works with partners that add complementary biometric technologies.

  • Minimum Authentication Net Speech Requirement – 5-7 seconds
  • Minimum Enrollment Net Speech Requirement – 30-40 seconds
  • To recognize known fraudsters, the system administrator creates a separate database of fraudster voices and stores the voiceprints of known and suspected fraudsters there. The front-end system then sends an identification-mode request to this database with the voice sample from every call. If the solution returns a “match” response for one of the voices stored in the fraudster database, an alarm is raised on the agent’s or security officer’s desktop screen.
  • The solution accuracy in identification mode does not depend on the database size.
  • The solution uses deep learning algorithms (artificial neural networks) and is capable of learning voice recognition, behavioral features, and the discrimination between noise and speech signals from raw data.
  • To detect unknown fraudsters, we use a real-time speech analytics solution based on Spitch’s speech-to-text (STT) and NLU modules. In this case, we train the NLU modules to detect fraudsters’ speech patterns.
    Spitch real-time speech analytics can also take into account acoustic characteristics of the customer’s voice, such as tone and volume change patterns, speech rate and pauses. These acoustic parameters are also used to detect unknown fraudsters.
  • Spitch tunes the false acceptance rate (FAR), false rejection rate (FRR) and confidence threshold for each client and use case individually.
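
The per-client tuning of error rates and thresholds can be sketched as follows. The example computes the false acceptance rate (FAR) and false rejection rate (FRR) at a given threshold from two sets of scores, then picks the lowest threshold that keeps FAR within a target. The score lists, the function names and the selection rule are assumptions made for illustration; the source does not describe how Spitch performs this tuning.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) when scores greater than or equal to the threshold are accepted."""
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))


def pick_threshold(genuine_scores, impostor_scores, max_far):
    """Return (threshold, FAR, FRR) for the lowest observed score whose FAR <= max_far.

    FAR never increases as the threshold rises, so the first candidate that
    meets the target also has the lowest FRR among the candidates.
    Returns None if no observed score meets the target.
    """
    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    for threshold in candidates:
        far, frr = far_frr(genuine_scores, impostor_scores, threshold)
        if far <= max_far:
            return threshold, far, frr
    return None


# Hypothetical scores: higher means "more likely the same speaker".
genuine = [0.91, 0.85, 0.78, 0.88, 0.62, 0.95]    # true-speaker trials
impostor = [0.12, 0.35, 0.41, 0.66, 0.20, 0.05]   # impostor trials

if __name__ == "__main__":
    print("FAR/FRR at 0.5:", far_frr(genuine, impostor, 0.5))
    print("strict use case (FAR = 0):", pick_threshold(genuine, impostor, 0.0))
```

A security-sensitive use case would choose a low FAR target and accept a higher FRR, while a convenience-oriented one might do the opposite, which is why the threshold is set per client and use case.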