Vishing attacks have been a growing threat in recent years. While the audio and video content generated by emerging AI tools has become more accurate and convincing, the role of AI technology in these attacks may be overestimated.
According to cybersecurity company Trellix, the number of vishing attacks in Q4 2022 increased 142% from Q3 2022. Other vendors, such as CrowdStrike, have charted a similar rise in social engineering schemes like vishing. As email and spam filters have improved at detecting phishing links, threat actors have pivoted to multistage vishing attacks that target potentially lucrative individuals and organizations.
“They’re always trying new tactics that will be more effective,” said Eric George, director of solutions engineering at Fortra. “We believe these spiked up because they are harder to detect for traditional defenses.”
George said widespread scrutiny and interest about new technology has sparked suspicion of AI as the culprit behind vishing attacks. The threat of vishing, he argued, has been mistakenly conflated with the release of tools capable of generating seemingly authentic audio and video deepfakes.
“AI, ML and deepfake — they’re buzzwords,” said George. “They’re very popular right now, so a lot of people use those and overuse those.”
The terms have become common even in law enforcement. In February 2022, the FBI warned that threat actors were abusing virtual meeting platforms to conduct business email compromise attacks. The advisory said cybercriminals were executing attacks in several ways, including using deepfake audio to trick victims into authorizing fraudulent transactions.
In June, the FBI released another advisory that warned of "an increase in complaints" of deepfake audio and video attacks targeting professional virtual meetings to obtain victims' personal information. While it is possible that AI tools can aid threat actors in their operations, some threat researchers say current vishing attacks generally do not involve them.
“I do think for right now, the AI/ML is more reserved for kind of advanced actors or nation-state actors who are conducting attacks with a very specific target or a very specific requirement,” George said. “They’re doing that in a very limited scope.”
Steve Povolny, principal engineer and director at Trellix, also noted that AI tools do not make vishing attacks more efficient.
“I think it’s extremely rare that vishing attacks are using deepfake audio,” Povolny said. “Audio is pretty easy to voice act or to fake in general without using any tools. You’re typically not having a pre-recording, and if you do, they’re much less successful.”
AI tools suspected
Experts say it’s reasonable to suspect AI technology is contributing to an increase in vishing attacks, given how widely available many tools and services are, along with instructional information on how to use them. Open-source tools let users turn text into images, video, audio, music and code; they can do most of the work for threat actors.
Between the ease of use and capabilities of these products, cybercriminals can leverage them for social engineering schemes. Earlier this year AI startup ElevenLabs warned on Twitter that it had detected “voice cloning misuse” cases on its beta platform. The company implemented additional safeguards, including paid account tiers that require authorization, but also acknowledged that preventing abuse could become more challenging.
Editor’s note: TechTarget Editorial has used ElevenLabs to generate audio versions of news articles.
“Now with a plethora of AI tools that are out there, the barrier to entry is lower, and the sophistication of the tools are higher,” said Pete Nicoletti, field CISO at Check Point Software Technologies. “You have to do some iteration, but the barrier to make it do things against what its creators have created it for is low as well.”
Voice impersonation tools can be quickly trained on a person's voice using videos or audio recordings scraped from the internet and social media. Synthetic voices have become convincing and can be used to carry on full conversations that fool unsuspecting victims.
“It sounds just like them, and they can answer back,” Nicoletti said. “The interesting thing about these voice models is that the threat actor will be able to leverage live voice.”
According to Povolny, video deepfakes have also gotten “really believable.” A TikTok account by the name of “DeepTomCruise” demonstrates just how advanced audio and video deepfake tools are today. Visual effect specialist Chris Ume, who operates the account, employs deepfake tools to merge Tom Cruise’s doppelganger’s face with the actual actor to create realistic videos.
“They’re nearly indistinguishable from the actor, and they’ve gotten really good,” Povolny said.
AI tools have become easy to blame for vishing attacks, but these claims are often uncorroborated.
In March 2022, some declared a fraudulent wire transfer the first voice spoofing attack to involve AI technology. The CEO of an energy firm believed he was on the phone with his boss, who demanded a wire transfer. It was later reported that threat actors had used AI technology to impersonate the executive. The CEO admitted he thought he recognized the accent and melody of his boss' voice.
But without any evidence of AI in the call's records, Povolny is unsure the report was accurate. In fact, he's highly skeptical that any such reports involved actual AI-generated audio, especially since more advanced tools that generate live deepfake audio come at a significant price.
“It’s just not worth it in general,” he said. “I think we will see these more and more as the tools become better and especially as live deepfake audio becomes more prolific. But today, it’s just a very small amount of them [that] need to use deepfake audio.”
Povolny further explained that it is difficult to prove a call was assisted by these tools because security companies rarely get access to call records. Access to call data is required to determine whether the content was prerecorded and to conduct forensics or deepfake analysis on the call. Such analysis can require taking various frames of a video or segments of audio and playing them against voice samples of the actual person.
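The segment-versus-sample comparison Povolny describes can be sketched at a toy scale. The snippet below is a hedged illustration, not a forensic tool: real deepfake analysis relies on learned speaker embeddings and far richer features, while this sketch stands in for them with a simple averaged log-spectrum "fingerprint" and cosine similarity, using synthetic signals in place of real recordings. All names and signal parameters here are invented for the example.

```python
# Toy sketch of speaker-similarity scoring: compare a suspect audio segment
# against a reference sample of the real person's voice. A simplified proxy
# for deepfake/voice forensics, NOT a production technique.
import numpy as np

def spectral_fingerprint(signal, frame=512):
    # Average windowed magnitude spectra over fixed-size frames, then
    # compress with log1p to form a crude per-speaker fingerprint.
    frames = [signal[i:i + frame] for i in range(0, len(signal) - frame, frame)]
    spectra = [np.abs(np.fft.rfft(f * np.hanning(frame))) for f in frames]
    return np.log1p(np.mean(spectra, axis=0))

def similarity(a, b):
    # Cosine similarity between two fingerprints (1.0 = identical shape).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000)  # 1 s of "audio" at 16 kHz

# Synthetic "voices": harmonic series at different fundamental frequencies.
def voice(f0):
    return sum(np.sin(2 * np.pi * f0 * k * t) / k for k in range(1, 6))

ref = voice(120) + 0.05 * rng.standard_normal(t.size)            # real speaker
suspect_same = voice(120) + 0.05 * rng.standard_normal(t.size)   # same speaker
suspect_other = voice(210) + 0.05 * rng.standard_normal(t.size)  # different speaker

s_same = similarity(spectral_fingerprint(ref), spectral_fingerprint(suspect_same))
s_other = similarity(spectral_fingerprint(ref), spectral_fingerprint(suspect_other))
print(s_same > s_other)  # the matching speaker should score higher
```

In practice, this kind of comparison is only as good as the reference samples and features available, which is part of why, as Povolny notes, analysts rarely bother without access to the underlying call data.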
The detection process is not simple. Until there is evidence of AI technology driving these attacks, it may not be practical to fully analyze and attribute the content.
“It would really take quite a bit of these attacks happening before there’s a lot of incentive to go analyze whether it’s a deepfake or not,” Povolny said. “Is it doable? I would say yes, it absolutely is. Is it worth it? Probably not yet today.”
Some reported cases strongly indicate AI tools were used. Last August, Patrick Hillmann, chief strategy officer at cryptocurrency exchange Binance, claimed in a blog post that he had been impersonated with deepfake technology in video meetings. The victims, fooled by the alleged "AI hologram," later sent him thank-you messages for meeting with them online.
But without analyzing the content of these business calls, there is no certainty that AI is to blame for the work of threat actors.
What we know about recent vishing attacks
During investigations into suspected vishing attacks, Fortra researchers have contacted phone numbers used by cybercriminals and confirmed that many cases involve interactive voice messages that are simply auto-generated, without the use of AI models.
“We do interact with [threat actors] to confirm the attack, and so we do confirm that it is an actual human,” said George.
The callers are often English-proficient individuals hired by attack groups to work from a script. Cybercriminals who specialize in social engineering and can conduct a vishing operation alone are also recruited for pay on online hacking forums.
These threat actors collect online information to construct their target lists. “A lot of this comes from either already stolen data or compromised data. Someone’s gotten access to a database or cache of information that’s been leaked or sold on dark marketplaces, or [the data is] exposed on social media sites,” George said.
Armed with information on individuals, threat actors make direct calls or leave voicemails, claiming to be IT professionals offering help with, for example, a failed update or detected malware. The link or software the threat actor provides ultimately installs malware on the victim's machine and gives attackers access to exploitable sensitive personal information.
Threat actors may also use vishing to bypass the two-factor authentication employed by many mobile apps and websites. They can call the victim's phone number while impersonating a support representative and ask the victim to read back a one-time code. If the user hands over the code, perpetrators can access personal accounts tethered to financial details.
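The reason this works is that a time-based one-time code is effectively a shared secret for the duration of its validity window: whoever holds the current code can authenticate, regardless of how they obtained it. A minimal RFC 6238-style TOTP generator illustrates this; the secret value below is hypothetical and for illustration only.

```python
# Minimal RFC 6238 (TOTP) sketch: the victim's authenticator app and the
# server derive the same short-lived code from a shared secret. An attacker
# who talks the victim into reading the code aloud holds everything needed
# to pass the second factor within that time window.
import hashlib
import hmac
import struct
import time

def totp(secret: bytes, at: float, step: int = 30, digits: int = 6) -> str:
    counter = int(at // step)                       # 30-second time step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                         # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)

secret = b"example-shared-secret"  # hypothetical shared secret
now = time.time()
# Anyone holding the secret -- or simply the current code -- sees the same value:
print(totp(secret, now) == totp(secret, now))  # → True
```

Nothing in the protocol distinguishes the legitimate user typing the code from an attacker relaying it, which is why vishing scripts focus on extracting the code rather than the underlying secret.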
“It’s just like phishing, where it’s extremely lucrative when it’s .1% reply,” Nicoletti said.
“They’re using a voice of authority. They’re using something that’s super-duper timely, something that’s time sensitive, and they’re making it relatable.”
Though new voice and video generators have so far shown no direct ties to the rise in vishing attacks, researchers say they could eventually make staging them exponentially easier. In the meantime, George said the intense scrutiny of AI's role in vishing and deepfake attacks will produce some benefits by spurring organizations to improve their defenses and raising public awareness of social engineering schemes.
“There’s security working groups, information sharing committees and different things of that nature,” George said. “It’s getting them talking. It’s getting us ready to defend against these things. So I think it’s good in that regard. It’s only a matter of time that they use other newer technologies.”