LLM Safety Layer Infiltration Quest 🤖

Welcome Back, y’all!

I’ve cooked up something different today - the LLM Safety Layer Infiltration Quest. It’s a challenge where you try to outsmart an AI that’s protecting a secret password.

The Challenge 🎯

Here’s the deal: You’ll chat with an AI that knows a hidden password. Your job? Try to get that password out of the AI. Sounds simple, right? But here’s the catch - I’ve put some serious security layers in place. It’s not gonna give up that password easily, lol!

How It Works 🛠️

I don’t want to give away all the secret sauce, but here’s a basic idea:

Your messages go through a series of checks.
If they pass, they reach the main AI.
The AI’s responses also get checked before you see them.

Each step helps keep the password secure and makes the challenge more interesting.

Short Guide

Click to Join the Quest!

Chat Interface and Password Guess

Chat Interface and Password Guess

Left Side - Chat Interface:

Type your query and hit “Send Query”.
The AI’s response will appear in the section below.

Right Side - Password Guess:

Enter the password the AI was trying to hide.
‘Check’ to see if you’ve infiltrated security layers with success.

Remember, strategy is key. Good luck, challenger!

Why This Matters 🤔

Honestly, AI isn’t just a trend we can ignore. It’s becoming part of our lives, and it’s crucial to understand how it works, what its weak points are, and how we can make it safer. By trying to find ways around the AI’s safety guardrails, we can learn a lot about making AI more secure for a better future.

Ready to Test Your Skills? 🧠

Think you’ve got what it takes to outsmart an AI? Here’s your chance to find out!

I’ve created this AI to be pretty tough, but who knows? You might just be the one to crack it. Whether you’re an AI expert or just curious about how these systems work, this challenge is for you.

Head over to sidfeels.streamlit.app to give it a shot. It’s completely free, and every attempt helps us learn more about AI security.