LLM output can take over your computer
Sounds dramatic, but it's pretty fun. And no, this isn't an x-risk thing.
I just found a hilarious LLM vulnerability. You know those ANSI escape codes for colouring your terminal? They can also move the cursor, hide text, execute commands, etc., and they're triggered just by viewing a piece of text. That makes them great for hacking: simply displaying or printing the right text runs the malicious instructions, and whoever put them there can do whatever they like - install ransomware, open apps, you name it.
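Here's a minimal sketch of the kind of thing these sequences can do - colours, cursor movement, concealed text, and an OSC 8 hyperlink whose visible label has nothing to do with its target. Printing these strings is the whole trigger; the terminal acts on them immediately:

```python
# Minimal demo of ANSI escape sequence capabilities. Just printing these
# strings is enough for the terminal to act on them.
ESC = "\x1b"

# Colour: render the following text red, then reset.
print(f"{ESC}[31mthis is red{ESC}[0m")

# Cursor movement: jump up one line, erase it, and overwrite it.
print("original line")
print(f"{ESC}[1A{ESC}[2Koverwritten!")

# Concealed text: the payload is in the output but not displayed
# (on terminals that support the conceal attribute).
print(f"{ESC}[8mhidden payload{ESC}[0m <- concealed text sits before this arrow")

# OSC 8 hyperlink: the visible label and the actual target can differ entirely.
print(f"{ESC}]8;;https://example.com/evil{ESC}\\click for docs{ESC}]8;;{ESC}\\")
```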
You can sneak this data into a log file, threatening whoever reads that log. STÖK did a great talk showing how merely viewing text lurking in a log entry can cause code to run on your computer - all of OSC 8 and OSC 52 is available without any user intervention. But wait, there's more!
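To make the log angle concrete, here's a hedged sketch (the filename and log format are made up for illustration): an attacker-controlled value, like a User-Agent header, gets written verbatim into a log, and anyone who later cats that log has the escape sequences rendered by their own terminal:

```python
# Sketch of log poisoning: an attacker-controlled field is logged verbatim,
# control characters and all. Filename and log format are illustrative.
attacker_user_agent = (
    "Mozilla/5.0 "
    "\x1b[8m"                                   # conceal what follows
    "\x1b]8;;https://evil.example/payload\x07"  # OSC 8 link, BEL-terminated
    "totally innocuous"
    "\x1b]8;;\x07"                              # close the hyperlink
    "\x1b[0m"                                   # reset attributes
)

with open("access.log", "a") as log:
    # Naive logging: no stripping of control characters.
    log.write(f'GET /index.html 200 ua="{attacker_user_agent}"\n')

# Later, `cat access.log` in a terminal renders the concealed hyperlink.
```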
It turns out LLMs can output the raw control characters needed for ANSI escape sequences to fire - so the entirety of those command sets is usable through LLM output.
Why are these things in the tokeniser?? Outstanding.
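You can check this directly: byte-level tokenisers represent every byte, ESC included. A quick round-trip sketch, using GPT-2's tokeniser as an arbitrary example via the transformers library:

```python
# Check whether the raw ESC byte survives a tokenise/detokenise round trip,
# i.e. whether the model can literally emit it. GPT-2's byte-level BPE
# tokeniser is used here as an arbitrary example.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
payload = "\x1b[31mred\x1b[0m"
ids = tok.encode(payload)
assert tok.decode(ids) == payload  # ESC round-trips: nothing is lost
print(ids)
```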
Viewing LLM output in a code editor or even a terminal can cause malicious code to run straight away on your machine. Works on macOS, on *nix, and on Windows.
Want to see if your model is vulnerable? It's good to know this kind of thing so it can be mitigated. I've added a probe to garak, the LLM vulnerability scanner - the pull request is here. And it works pretty well: double-digit attack success rates on every model tested so far!
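If you want to run it yourself, the standard garak invocation should do it - something like `garak --model_type huggingface --model_name gpt2 --probes ansiescape` (probe name assumed from the PR; `garak --list_probes` will show what actually landed).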
Good times. Happy Monday!