Hey, I’m Siddhant. I’m 23.

Got into AI back in 2021 when diffusion models started picking up steam. Spent way too many hours tinkering with Stable Diffusion, fine-tuning models, and just seeing what breaks. What started as a hobby turned into an obsession.

Now that I’ve finished my four-year electronics degree, I’m going full-time into this. Deep dives into model behavior, adversarial attacks, safety research. The stuff that actually interests me.

My work sits at the intersection of adversarial machine learning and AI safety. I believe understanding how systems fail is the first step toward building systems that don’t.

Focus

Current research

  1. Red Teaming LLMs/VLMs

    Developing novel jailbreak techniques to expose vulnerabilities in LLMs, Vision-Language Models, and multimodal systems.

  2. Model Behavior

    Studying how models internalize instructions, how they respond under pressure or conflict, and where alignment breaks down.

  3. Interpretability

    Relatively new to this. I’m using interpretability tools to build a concrete understanding of the red teaming results I uncover.

  4. Agentic Attack Pipelines

    Building automated workflows that systematically probe model vulnerabilities at scale.

  5. Guardrail Engineering

    Generating safety datasets and stress-testing guardrails to make them more robust against adversarial inputs.

  6. Robotics Safety

    Extending my knowledge toward embodied AI safety and understanding failure modes in robotic systems.

Why this work

Philosophy

I’m not afraid of superintelligence. I’m not afraid to live in a world among superintelligent systems. What I fear is a world where a small group controls that intelligence and I have no access to it.