Imagine being able to talk to your bank account and perform transactions without a computer or phone. This is the reality of voice-based banking, an industry that is rapidly expanding. According to Allied Market Research, the voice banking market is projected to reach $3.7 billion by 2031. However, a security issue threatens the industry.
In 2020, I strongly advocated for voice banking and believed it would become the next big thing. I even considered integrating voice payment technology into SDK.finance’s digital banking solution. We went as far as developing a demo and utilizing Voiceflow to create a voice-based experience, but eventually, the project was halted.
Recently, I have grown deeply concerned about the future of voice-based interfaces after hearing about Microsoft’s announcement of VALL-E. Microsoft claims that it can clone a voice from an original recording of just 3 seconds. While this may seem unbelievable, if true, it could alter the landscape of voice-based remote communication.
Voice cloning technology enables the creation of counterfeit audio clips or voice commands that sound exactly like the original voice, raising many concerns about the potential misuse of this technology. This poses a significant threat to the security of voice-enabled remote banking systems, not only for financial institutions that support this feature but also for customers whose financial data could be compromised. Think about someone cloning your voice and communicating with your bank on your behalf, such as checking your balance or transferring money to an external account. This scenario does not bode well for voice-based interfaces worldwide.
Given these concerns, the decision not to pursue voice-based interfaces, which I was initially bitter about, now appears to be the correct strategy. “Doing nothing” has proven to be more prudent than investing heavily in development.
So, in this article, I aim to explore the dangers of voice cloning and its implications for the voice banking industry.
What is the market size of the voice banking industry?
The voice banking sector is a modern and fast-growing industry, driven by the usage of AI and increased demand for voice-based services and technologies. According to Markets and Markets, voice banking is a segment of the broader voice technology market that is projected to surpass $127 billion by 2024.
Figure: Voice banking market by technology. Source: Allied Market Research
Allied Market Research valued the worldwide voice banking market at $984.6 million in 2021 and expects it to maintain its upward trend. The market is projected to reach $3.7 billion by 2031, growing at a CAGR of 14.5% from 2022 to 2031.
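As a quick sanity check on these figures, the growth rate they imply can be recomputed with a few lines of Python. This is only a sketch: it assumes 2021 is the base year and that growth compounds over ten annual periods, which may differ from the convention used in the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` periods."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` once a year for `years` periods."""
    return start_value * (1 + rate) ** years


# Figures quoted above, in millions of US dollars.
value_2021 = 984.6
value_2031 = 3700.0

# Assumption: ten compounding periods between 2021 and 2031.
implied_rate = cagr(value_2021, value_2031, years=10)
projected_2031 = project(value_2021, 0.145, years=10)

print(f"Implied CAGR 2021-2031: {implied_rate:.1%}")
print(f"2021 value compounded at 14.5% for 10 years: ${projected_2031:,.0f}M")
```

Under these assumptions the implied rate comes out slightly above 14%, and compounding the 2021 value at 14.5% lands at roughly $3.8 billion, so the quoted numbers are broadly consistent with each other.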
This exceptional growth can be ascribed to several factors, including the rising popularity of voice-based services for financial transactions and banking, growing demand for personalized voice experiences, and widespread acceptance of voice-activated chatbots and virtual assistants.
The main players in the voice banking arena
Although Fiserv reports that a significant portion of consumers remain sceptical or unaware of voice banking, numerous financial institutions have recognized its potential and become key players in the industry.
For instance, ING was an early adopter, introducing a voice control feature in its app in 2014, powered by Nuance’s voice recognition software and featuring an assistant named Inge. Other banks created their own digital assistants and chatbots, while some integrated their services with established voice assistants like Siri, Google Assistant, and Alexa.
Capital One, for example, lets customers use Alexa to make hands-free payments, check balances, and track expenses. Similarly, Barclays has integrated with Siri to enable swift mobile payments to contacts without the need to open or log into the banking app. Santander in the UK has updated its technology to allow customers to use voice commands to perform transactions, transfer money, and report lost or stolen cards. Visa, in collaboration with Abu Dhabi Islamic Bank, is introducing biometric voice and voice-based authentication for e-commerce, utilizing biometric sensors installed on a standard smartphone.
Moreover, some banks have introduced their own digital assistants, such as Bank of America’s Erica, a financial management assistant that can aid customers in monitoring their loans. However, the latest voice cloning advancements may pose significant challenges for voice banking and require companies utilizing this technology to search for ways to prevent potential fraud.
What is voice cloning technology?
The recent news about Microsoft developing an AI called VALL-E that can clone a person’s voice from a 3-second audio clip has caused quite a stir. Not only can VALL-E clone a voice, but it can also synthesize other words from the same clip, process text in 23 languages, and capture the context and meaning of a sentence instead of just translating it word for word. The technology behind voice cloning is based on deep-learning algorithms that are trained on large datasets of recorded speech samples from a particular speaker.
These algorithms analyze various aspects of the speaker’s voice, such as speech patterns and intonation, and use this information to create synthetic speech that sounds like the original voice. Voice cloning can be applied in various ways, such as creating digital voice assistants and chatbots, adding voiceovers to multimedia content, and generating synthetic voices for individuals with speech disabilities.
A growing threat to the security of banking and payments
Although voice cloning technology has potential benefits, there are significant risks associated with its use in the banking and payments industry. Malicious actors could use voice cloning to create synthetic voices of public figures or individuals without their consent, potentially leading to fraudulent activities, such as payment fraud through voice banking. In one widely reported case, cybercriminals used voice cloning to impersonate a CEO and have $243,000 transferred to their account, highlighting how vulnerable voice-based interactions are to this technology.
Attackers with cloned voices could bypass bank voice authentication systems, voice-activated locks, or other systems that rely on voice recognition. They could also use cloned voices to make fraudulent phone calls, or to back up phishing emails that appear to be from a trusted source. In a $35 million heist investigated in Dubai, cybercriminals used voice-altering technology to pose as a company executive who needed a money transfer for a takeover, demonstrating the sophisticated schemes possible with this technology.
Many banks use voice authentication to enable customers to access their accounts via telephone, and some promote voice identification as a secure and user-friendly option. However, biometric security based on voice is not always reliable, particularly in a world where synthetic voices can be generated quickly. One experiment showed that voice cloning technology allowed a user to trick Lloyds Bank’s security system and log into an account. The Consumer Financial Protection Bureau, which regulates the financial industry, has expressed concern about data security and expects firms to comply with the law regardless of the technology used.
Therefore, financial institutions must implement robust security measures and additional authentication methods to prevent voice cloning attacks and protect sensitive financial data.
How to prevent voice banking fraud?
To mitigate the risks of voice hacking, it is crucial to combine voice-based interfaces with multi-factor authentication and other security measures. Fintech companies should incorporate various forms of identification, such as voice biometrics, passwords, and other authentication methods. Multi-factor authentication makes voice-based transactions more secure and reduces the likelihood of fraud, because a cloned voice alone is no longer enough to pass verification.
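The idea of treating voice as only one factor among several can be sketched as a simple policy check. This is a minimal illustration rather than a production design: the `AuthAttempt` fields, the `VOICE_THRESHOLD` value, and the high-risk rule are all hypothetical and would depend on the biometric engine and the institution’s risk policy.

```python
from dataclasses import dataclass


@dataclass
class AuthAttempt:
    voice_score: float      # similarity score from a voice biometric engine, 0.0 to 1.0
    pin_ok: bool            # knowledge factor, e.g. a PIN or password
    device_confirmed: bool  # possession factor, e.g. approval in the banking app


VOICE_THRESHOLD = 0.85  # hypothetical value; real systems tune this per engine


def is_authorized(attempt: AuthAttempt, high_risk: bool) -> bool:
    """Require at least two factors, so a matching voice alone never authorizes."""
    voice_ok = attempt.voice_score >= VOICE_THRESHOLD
    factors = sum([voice_ok, attempt.pin_ok, attempt.device_confirmed])
    if high_risk:
        # Hypothetical rule for actions such as transfers to a new payee:
        # a non-voice possession factor must be among the factors presented.
        return attempt.device_confirmed and factors >= 2
    return factors >= 2


# A near-perfect voice match without any other factor is rejected.
print(is_authorized(AuthAttempt(0.99, pin_ok=False, device_confirmed=False), high_risk=False))
```

With a rule like this, even a flawless voice clone fails unless the attacker also holds the customer’s PIN or device.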
Robust voice biometric systems
Deploying strong voice biometric systems that use sophisticated algorithms to detect voice cloning attempts is crucial for organizations to safeguard against fraud. These systems can also detect other forms of voice manipulation, such as replay attacks or impersonation. Many fintech apps already use biometrics like fingerprints or facial recognition for verification.
Continuous authentication is a technique that involves the constant monitoring of user behavior and activity to identify any suspicious activities. This can involve monitoring the user’s voice pattern and behavior during a call or transaction to detect any deviations from their usual behavior.
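One way to picture continuous authentication is a monitor that scores each utterance during a call and escalates when the recent average drifts too low. The sketch below assumes a hypothetical upstream engine that already produces a match score between 0 and 1 for every utterance; the `CallMonitor` class, window size, and threshold are illustrative choices only.

```python
from collections import deque
from statistics import mean


class CallMonitor:
    """Tracks per-utterance voice match scores during a single call."""

    def __init__(self, window: int = 3, min_average: float = 0.80):
        self.scores = deque(maxlen=window)  # keeps only the most recent scores
        self.min_average = min_average      # hypothetical escalation threshold

    def observe(self, score: float) -> bool:
        """Record a score; return True if the session should be escalated."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        return window_full and mean(self.scores) < self.min_average


monitor = CallMonitor()
# Made-up scores: the voice matches well at first, then degrades mid-call.
flags = [monitor.observe(s) for s in [0.93, 0.91, 0.90, 0.62, 0.58]]
print(flags)
```

In this example only the last observation triggers escalation, because a single weak score is smoothed out by the rolling window while a sustained drop is not.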
Conversational AI for transaction verification
Conversational AI empowers automated systems to generate interactive and lifelike conversations by responding to human speech in real-time. This technology plays a critical role in fraud detection by serving two key purposes: first, it boosts customers’ trust in voice bots, and second, it allows voice bots to obtain the required information to verify authentic transactions and detect and reject fraudulent ones.
Education and awareness
It’s essential for financial institutions to educate their customers and employees about the dangers of voice cloning and the steps they can take to safeguard themselves against these attacks. This includes providing guidance on verifying the identity of the caller or the legitimacy of the transaction before disclosing sensitive information.
Regular testing and evaluation
To prevent voice cloning attacks, financial institutions should regularly assess and strengthen their voice recognition systems to ensure they can accurately detect fraudulent attempts. This includes testing the system with various cloned voice samples and regularly updating it with the latest security patches.
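Testing with cloned samples usually comes down to two numbers: how often clones are accepted and how often genuine customers are rejected. The following sketch computes both for a given decision threshold. The scores are made up for illustration, and the function names are not taken from any particular biometric product.

```python
def false_acceptance_rate(clone_scores: list[float], threshold: float) -> float:
    """Share of cloned-voice samples the system would wrongly accept."""
    accepted = sum(score >= threshold for score in clone_scores)
    return accepted / len(clone_scores)


def false_rejection_rate(genuine_scores: list[float], threshold: float) -> float:
    """Share of genuine-customer samples the system would wrongly reject."""
    rejected = sum(score < threshold for score in genuine_scores)
    return rejected / len(genuine_scores)


# Made-up match scores, for illustration only.
clone_scores = [0.42, 0.88, 0.67, 0.91, 0.55]
genuine_scores = [0.95, 0.89, 0.92, 0.78, 0.97]

for threshold in (0.80, 0.90):
    far = false_acceptance_rate(clone_scores, threshold)
    frr = false_rejection_rate(genuine_scores, threshold)
    print(f"threshold={threshold:.2f}  FAR={far:.0%}  FRR={frr:.0%}")
```

Raising the threshold from 0.80 to 0.90 halves the acceptance of clones in this toy data set but doubles the rejection of genuine customers, which is the trade-off a regular test cycle is meant to keep visible.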
Preventing voice cloning fraud in fintech therefore requires a multi-layered approach that combines advanced technology, user training, and regular testing and evaluation. It is also important for both companies and individuals to stay informed about the latest developments in voice cloning technology and security threats, and to take appropriate protective measures against them.
- Allied Market Research. (2022). Voice Banking Market by Technology: Global Opportunity Analysis and Industry Forecast, 2021–2031.
- Fiserv. (2021). Voice Banking: Understanding the Benefits and Limitations of This Emerging Technology.
- Markets and Markets. (2019). Voice Technology Market by Component (Hardware and Software), Application (Assistance and Access, and Authentication and Verification), Deployment Mode, Organization Size, Vertical, and Region – Global Forecast to 2024.
- Merchant Savvy. (2022). Voice-Activated Payments and Banking: 5 Benefits and 5 Risks.
- Sandhya, N. (2022, January 31). Voice Cloning: A New Frontier in Cybersecurity Threats. The Economic Times.
- Visa. (2021). Abu Dhabi Islamic Bank, Visa and Globee® Launch World’s First Biometric Voice and Voice-Based Authentication for E-commerce.
What is voice banking?
Voice banking is a technology that allows users to perform banking transactions using their voice. It enables users to conduct banking activities, such as checking account balances, transferring money, paying bills, and more, by simply using voice commands.
What is the difference between voice biometrics and voice cloning?
Although voice biometrics and voice cloning both rely on a person's voice, they are two distinct technologies that serve different purposes.
Voice biometrics is a technology that uses a person's unique voice characteristics to verify their identity. It is used in many applications, such as secure authentication for accessing bank accounts, making payments, or unlocking a phone. Voice cloning, by contrast, creates a digital replica of someone's voice: it uses machine learning algorithms to analyze a person's voice and generate a synthetic voice that sounds like the original person.
How does voice cloning work?
Voice cloning is a technology that involves creating a digital replica of a human voice. There are different methods used to create a voice clone, but the most common one involves deep learning models known as neural networks.