

Hi, I'm Deep.
I'm an AI Engineer based in Bengaluru, currently focused on LLM training (GRPO, RLHF, reward modeling), agentic systems with LangChain & LangGraph, and enterprise-grade RAG architectures.
Most of my recent work sits at the intersection of training and orchestration — building RL gyms for tool-use, post-training pipelines for reasoning, and multi-agent graphs that ship to production.
I write here because the AI space moves fast, and the surface-level takes you see elsewhere rarely capture the trade-offs that matter. This site is where I think out loud about the architecture decisions, training details, and hard-won lessons behind real systems.
How I work
Architecture before models
Most AI projects fail at the system design layer, not the model layer. I start with constraints, retrieval, and evaluation — then pick the model.
Eval is a feature
If you can't measure quality, you can't ship safely. Every system I build comes with an eval harness baked in from day one.
Production-first
Notebook code is fine for exploration. Real systems need cost control, low latency, observability, and graceful degradation. I optimise for the second day.
Write what you know
Sharing the messy details — what broke, what I tried, what worked — compounds over time. This blog is that practice in public.