Available for select AI consulting & advisory engagements

Building intelligent systems
that act, adapt, and evolve.

Long-form notes and architecture case studies on LLM training, agentic systems, and enterprise-grade RAG — the engineering decisions behind real production AI.

LLM Training, GRPO, PPO, DPO, RLHF, Reward Modeling, Agentic AI, LangGraph, LangChain, RAG, Hugging Face, PyTorch, vLLM, FastAPI

What I work on

From gradients to autonomous agents.

The full stack of building modern AI systems — training, orchestration, retrieval, and the evaluation work that makes them trustworthy in production.

LLM Training & Post-Training

From base model to production-ready system — across the full post-training pipeline. SFT, preference data, policy optimisation, and the eval-driven loop that ties them together.

GRPO, PPO, DPO, RLHF, RLAIF, Reward Modeling, SFT

Agentic Systems

Stateful multi-agent graphs with planner / executor / critic loops, tool routing, memory, and human-in-the-loop checkpoints — built in LangGraph.

Enterprise RAG

Hybrid retrieval, smart chunking, and eval-tested pipelines.

Evaluation & Reward Modeling

Pass@k, scenario-based eval, and reward-model training.

Production Deployment

FastAPI, vLLM, Azure, Databricks — getting models from notebook to traffic with cost, latency, and observability handled end-to-end.

Writing

Long-form essays and technical deep-dives on agentic AI, training, and the engineering trade-offs behind production systems.

Latest Writing

Notes from the field.

In-depth writing on training, agentic systems, retrieval, and the messy engineering details that decide whether AI ships.

May 3, 2026 · 13 min read

60% of your dataset is doing the work — a GRPO reward-variance analysis

Why most curated RL datasets carry no learning signal on a third or more of their prompts — and how a 30-minute pre-flight check cuts GRPO compute without losing uplift.

GRPO, RL, Post-Training
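
The post's own pre-flight check is not reproduced here. As a rough sketch of the general idea: GRPO scores each completion relative to the mean reward of its sampling group, so a prompt whose sampled completions all receive the same reward yields zero advantages and contributes no gradient. The function name `zero_signal_prompts`, the `eps` threshold, and the sample rewards below are all hypothetical, chosen only for illustration.

```python
from statistics import pvariance


def zero_signal_prompts(rewards_by_prompt, eps=1e-8):
    """Return ids of prompts whose sampled rewards are (near-)identical.

    rewards_by_prompt maps a prompt id to the rewards of the completions
    sampled for that prompt (one GRPO group per prompt).
    """
    return [
        prompt_id
        for prompt_id, rewards in rewards_by_prompt.items()
        if pvariance(rewards) <= eps  # no spread -> all advantages are zero
    ]


# Hypothetical rewards from a handful of rollouts per prompt.
sample = {
    "p1": [1.0, 1.0, 1.0, 1.0],  # always solved: nothing to learn
    "p2": [0.0, 1.0, 0.0, 1.0],  # mixed outcomes: useful signal
    "p3": [0.0, 0.0, 0.0, 0.0],  # never solved: nothing to learn
}

dead = zero_signal_prompts(sample)
print(dead, f"{len(dead) / len(sample):.0%} of prompts carry no signal")
```

Running a cheap pass like this with the starting policy before training gives an estimate of how much of a dataset would actually produce updates, under the assumption that a few rollouts per prompt are representative.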

Building something ambitious in AI?

I take on a small number of consulting and advisory engagements each quarter — architecture reviews, training-pipeline design, and agentic systems. If something here resonates, let's talk.

Start a conversation: deep.chokshi@outlook.com
