About · Siddhant Panpatil

Hey, I’m Siddhant. I’m 23.

Got into AI back in 2021 when diffusion models started picking up steam. Spent way too many hours tinkering with Stable Diffusion, fine-tuning models, and just seeing what breaks. What started as a hobby turned into an obsession.

Now that I’ve finished my four-year electronics degree, I’m going into this full-time: deep dives into model behaviour, adversarial attacks, and safety research. The stuff that actually interests me.

My work sits at the intersection of adversarial machine learning and AI safety. I believe understanding how systems fail is the first step toward building systems that don’t.

Focus

Current research

Red Teaming LLMs/VLMs
Developing novel jailbreak techniques to expose vulnerabilities in LLMs, Vision-Language Models, and multimodal systems.
Model Behaviour
Studying how models internalize instructions, how they respond under pressure or conflict, and where alignment breaks down.
Interpretability
Relatively new to this. I’m using interpretability tools to build a concrete understanding of the red teaming results I uncover.
Agentic Attack Pipelines
Building automated workflows that systematically probe model vulnerabilities at scale.
Guardrail Engineering
Generating safety datasets and stress-testing guardrails to make them more robust against adversarial inputs.
Robotics Safety
Extending my knowledge into embodied AI safety and the failure modes of robotic systems.

Why this work

Philosophy

I’m not afraid of superintelligence. I’m not afraid to live in a world among superintelligent systems. What I fear is a world where a small group controls that intelligence and I have no access to it.
