Building intelligent systems
that act, adapt, and evolve.
Long-form notes and architecture case studies on LLM training, agentic systems, and enterprise-grade RAG — the engineering decisions behind real production AI.
From gradients to autonomous agents.
The full stack of building modern AI systems — training, orchestration, retrieval, and the evaluation work that makes them trustworthy in production.
LLM Training & Post-Training
From base model to production-ready system, across the full post-training pipeline: SFT, preference data, policy optimisation, and the eval-driven loop that ties them together.
Agentic Systems
Stateful multi-agent graphs with planner / executor / critic loops, tool routing, memory, and human-in-the-loop checkpoints — built in LangGraph.
Enterprise RAG
Hybrid retrieval, smart chunking, and eval-tested pipelines.
Evaluation & Reward Modeling
Pass@k, scenario-based eval, and reward-model training.
Production Deployment
FastAPI, vLLM, Azure, Databricks — getting models from notebook to traffic with cost, latency, and observability handled end-to-end.
Writing
Long-form essays and technical deep-dives on agentic AI, training, and the engineering trade-offs behind production systems.
Notes from the field.
In-depth writing on training, agentic systems, retrieval, and the messy engineering details that decide whether AI ships.
Building something ambitious in AI?
I take on a small number of consulting and advisory engagements each quarter — architecture reviews, training-pipeline design, and agentic systems. If something here resonates, let's talk.

