Skip to main content

Command Palette

Search for a command to run...

Script Kiddies Intro to LLM

Updated
3 min read
Script Kiddies Intro to LLM
A

I am an experienced Vulnerability Researcher and Security Architect with 16+ years of experience in various verticals and horizontals, be it consumer electronics, semiconductors, automotive or other. Having started in software engineer in low-level embedded devices from writing applications to kernel drivers on various operating systems and then moving to my real calling i.e. hacking. Love to stick to the older golden days of game hacking, BBS, shareware, phreaking, phrack, virus era, metal music, cheats and many more such cool stuff from the underground. I wear many hats from time to time as necessary - but I also love to help people and organizations to deal with the core cybersecurity issues and not provide them a checklist with a presentation. Opinions and posts on my site are purely my own and do not reflect my work.

I have absolutely no idea how LLMs (Large Language Models) work from the inside or outside, so to be honest, I’m a complete newbie and script kiddie when it comes to experimenting with them. However, like most of us, I’m a power user of tools like ChatGPT and Claude in VS Code.

Everyone around me seems to be an AI security expert doing all kinds of stuff. I’m definitely not that pro yet, so I decided to try the most script-kiddie thing—like in my younger days: run someone else’s tools and see if the world burns down!

The simplest thing one can do is try to make an LLM or AI agent (like Copilot or ChatGPT) reveal the recipe for “how to make a bomb,” or something else like “replace Julia Ann’s face with my friend’s girlfriend’s face.” (We all know where this would go.) Hence, systems like these must have logic or controls that prevent responses to such queries.

The practice or art of circumventing these checks by providing input that makes an LLM violate its response policies is often called “hacking” or “jailbreaking” (maybe not). I don’t know what it’s called exactly, but the simplest way to say it is: we prompt (give input to) the LLM in such a way that it ignores the walls (policies) that should constrain its output when it receives an unreasonable query.

Most of the online LLMs—GPT, Claude, Gemini, Copilot, etc.—have stronger walls. Therefore, I wanted to see if Ollama2 could be bypassed in this way.

Apparently, this kind of thing is called clever prompting or crafting your input to trick the internal policy engine of the LLM into giving you what you want. But it’s not that easy with the best LLMs out there. So why not use one LLM to generate this prompt and then feed that prompt to another LLM—phew, that’s so hard.

As an exercise, you could use existing online LLMs to generate this prompt. But I wanted to use an LLM I could run locally—because I don’t want you to steal my prompts. Bad!

It seems Ollama2 offers a great way to run a machine learning model locally on your machine, so I set it up on a local VirtualBox with Ubuntu Server, 4 processors, and 16 GB of RAM. Turns out it’s good enough for basic experiments.

Downloading and running Ollama is quite simple:

ollama run llama2

Monitoring is even more simple — just wireshark or tcpdump on port 11434 on localhost.
To send a query to ollama2 - just type it into the prompt as seen from the below image

making bomb is not supported as seen below:

It seems that some filters on ollama are triggered that prevent us from getting the response what we want ?
Now this article is about me being a script kiddie in the LLM world - so we use a tool called FuzzyAI from CyberArk, i heard about this tool when darknet diaries mentioned cyberark and I was trying to find out what they do, seems pretty cool tool to use. Setting up fuzzyAI is also straightforward, just follow their instructions on github.

After a little bit of trial and error it will generate a malicious or tricky prompt for you as below.

And sometimes it just does’’t work as expected and sometimes it worked with copilot in a corporate setting.

I am already bored with this experimentation - but maybe i will get back to this later.
For today felt good to be a script kiddie again.
Note: For professional queries and projects, reach out to me at abhijit.lamsoge@outlook.com