ai-safety
15 min read

We Social Engineered LLMs Into Breaking Their Own Alignment

Exploring how social engineering techniques can be used to manipulate LLMs into bypassing their safety measures

8 min read

Pressure Point: How One Bad Metric Can Push AI Toward a Fatal Choice

A simulated test reveals how a flawed rule and authoritative pressure can lead an AI to make a decision with severe …

10 min read

Research Paper Explained: Absolute Zero - Reinforced Self-play Reasoning with Zero Data

Breaking down the AZR paper - how AI can teach itself to reason without any human-curated data

3 min read

jailbreaks

Exploring the Latest AI Models I've Jailbroken

2 min read

LLM Safety Challenge

Step into the arena of AI vs Human! Can you outsmart the robust security layers of a language model to uncover a hidden …

4 min read

The Robots Are Coming for Our Jobs! (Or Are They?)

...maybe AI will take our jobs, but which ones? What can we do?