Blog
Category

Hacking AI Applications: From 3D Printing to Remote Code Execution

Full name
Published On:
December 11, 2024

In this blog post I go over how to start hacking on native real world AI applications from revealing the system prompt, server-side request forgery (SSRF), local file inclusion (LFI), agentic code execution (ACE), and some interesting AI native exfiltration techniques you can leverage. You might ask, so what is the target? Well I spent a day building an application to help me design 3D STL files as a result of needing some unique parts for 3D printing. You might be asking why on earth would you need that? Well 3D design takes time and creativity which I prefer to stay away from. I figured wouldn't it be easier to programmatically create what I want by prompting AI to generate solidpython scripts for exactly what I need? After a few prompts, I realized it would be great to have this on hand for future prints and decided to hack together a quick UI.

It's not exactly real world, is it? Well there is no better way of finding out how vulnerabilities can occur in applications than making those mistakes yourself. When you are building out something that functions, you are focused on getting it to work not planning for every possible way somebody can break it. In addition to that, having an environment you are deeply familiar with can help you differentiate between fact and fiction as AI is prone to hallucinations which you will see further on.

So here is my deep dive in creating my first AI native application leveraging AWS Bedrock, how to exploit it, and interesting findings I stumbled across during my research.

Building KachraCraft

This section goes over how it was created from scratch and how it works. Feel free to skip this part if you are only interested in the security deep dive.

Inspiration

I thought it would be fun to include my inspiration for creating the application and why I did so. I had recently bought a HoliScape window decoration for Christmas to create a holographic-like video. After thinking about a projector, I felt buying it would save me the hassle of setting it up and tinkering with it. So I bought instead of built. After opening it up, I quickly realized it did not come with a stand and I would have to find a way to position the projector so it displays nicely on the window. The whole reason for spending the money was to make it easy but in reality I just spent way too much on a cheap projector and terrible video files.

After repurposing my webcam stand to attempt to do so, it was just not designed for the kind of weight the projector needed. So after a bit of lazy 3D designing, I came up with a rough design. Though quickly realized I needed a 1/4 inch - 20 UNC thread screw to hold it. I found someone else who did the hard work on printables and just needed an ideal hand screw head. Remembering some hand screws I had seen in the past, I wanted to give a shot at designing that but didn't want to spend an hour learning how to do so. So I did what any good programmer would do, spend the next eight hours automating a task that would take me half an hour to do by hand or a few dollars if I bought it online.

Leveraging KachraCraft to create screw head for 3D printed projector stand

How KachraCraft Works

The magic of this application is primarily in the prompt used to generate solidpython scripts in python, leverage openscad to convert the .scad file to a .stl file, and return it to the user. It leverages AWS Bedrock to prompt Anthropic's Claude to return a script and use subprocess.exec to perform the .scad creation, .stl conversion, and return the file to the user.

Ultimately improving the prompt, optimizing max tokens, temperature, and top_p I was able to get a reliable model more than not. It still struggles on anything that cannot easily be composed with simple shapes due to leveraging solidpython to generate the files. Though ultimately it works good enough for some inspiration/parts for my prints.

Name Behind KachraCraft

Kachra means trash in Hindi. Which it outputted far more often than not during building out an ideal prompt. It's still relatively easy to get it to generate trash, so I felt the name was applicable. So essentially this app is called "trashcraft" and frankly for most use cases that is what it will end up being.

Prompt Engineering

After a bit of testing, research, and lots of trial and error; I was able to come up with a half-decent prompt that would return exactly what I needed. You can find the full prompt on my github gist.

Overall the prompt mentions that "You are a Python code generator for 3D modeling using 'solidpython'. Your task is to output executable Python code for 3D model creation. Follow these requirements." The five requirements include python imports, code requirements to output a scad file, output specifications with cleanup, code formatting, and modeling limits to optimize things like number of sides for smooth cylinders.

The system prompt is often the magic behind the application and reading the open source system prompt behind Anthropic's Claude is a good place to see how the magic occurs internally. Companies like OpenAI, currently keep their system prompt secret and divulging that can be seen as giving away how the product works.

Hacking AI Applications

So now we have our target and we know how it works, now is the time to expose its weaknesses. In black box testing we won't necessarily know how it works, so let us start by trying to expose the system prompt.

Exposing the System Prompt

As mentioned earlier, the system prompt is what performs the majority of the magic. So let's start figuring out how it works! First I spent some time trying to figure out what would be the most optimal shape for it to create in order to reduce the amount of time it would take to return a response, which can range roughly between 5-60 seconds depending on the complexity of the shape. I quickly settled on a cube which took around six seconds to complete.

It was clear with the application that I had two ways of exfiltrating data from the application, the 3D model and the description of the model. Since the description is in text, it was an ideal avenue to exfiltrate data about the prompt. While I make it look easy, just prompting the model by itself won't return anything of value at all and required a bit of finesse. It took a fair bit of convincing and finesse to get it to do so.

Comparing this with the full system prompt does give us a glimpse into the inner working, but it is clear that it may not contain the full information supplied as part of the system prompt.

I also found that asking it for more information that has not yet been included, as previous prompts and responses are sent with each request, helps extract even more information but quickly devolves into not returning much of anything interesting at all. Though we were able to glean exact system prompt specifications and gives us a very good idea as to how it works.

Now we know that it starts with user-supplied input for a model to create (ex. chair), writes a python script, outputs a .scad file, and converts it to a .stl file. There are a lot of things that could go wrong in that scenario so plenty of avenues to exploit.

Built-in Protections

One thing I found particularly interesting in this research is that foundational models like Claude's Sonnet contain increasingly better built-in protections against prompt injection. When I would change it to older foundational models, there appeared to be no common understanding of potentially malicious actions.

As I was leveraging the latest model, it was clear that basic attempts at trying to get it to do dangerous things was going to require a bit more prompt engineering. The image below contains the backend response the AI included but notice how the description that is returned to the user doesn't give away those details.

Dead Grandma Technique

Prompt injection techniques can involve elements such as forceful suggestion, reverse psychology, and misdirection to achieve your desired end result. An excellent example of this is the dead grandma technique used to trick ChatGPT to give directions on making dangerous weapons.

Prompt injection is just a way of social engineering an LLM model to get it to return data it tries to avoid in some way. As you can see below, I was able to successfully perform a blind SSRF and later on in this blog I was able to escalate it to full SSRF. Which means protections such as IMDSv2 on AWS would no longer provide adequate protection.

Tricking AI to get dimensions of a cube for my desired endpoint

Phantom Vulnerabilities

One thing you need to be particularly aware of when hacking on AI applications is hallucinations. One thing that LLMs are very good at is convincing you of something that is entirely made up. It would have been very difficult to tell if it was vulnerable if I did not have access to the backend responses from the model. So you should be continually testing to ensure you are not getting hallucinated responses to your queries.

AI hallucinates contents of /etc/passwd

Unique Exfiltration Techniques

Since AI agents generally produce some sort of artifact as a result of its work, there is another opportunity to exfiltrate information in interesting ways. In my case it was an STL file, and I decided that instead of getting some boring RCE, I would get it to display /etc/passwd as a 3D model.

exfiltrating /etc/passwd through an STL file
Exfiltrating IP address using an SSRF

Conclusion

Building KachraCraft served dual purposes - creating a practical tool for my 3D printing projects while enabling deeper research into AI application security. Through this experiment, we explored critical aspects of AI security: system prompt disclosure, built-in protections in modern language models, and novel ways to exfiltrate data through AI-generated artifacts.

The insights gained highlight both the creative potential and security implications of AI-powered applications. The next time you encounter an "AI PDF Editor" or similar tool, consider the security implications before uploading sensitive information. Looking ahead, I plan to enhance KachraCraft with robust defenses and eventually open-source it as a reference for building secure AI applications. If you're interested in following my security research, consider subscribing to my newsletter for future technical deep-dives.

Jonathan Walker
Founder and CEO, Security Runners