Deploying Contextual Bandits: Production Guide and Offline Evaluation
Systems design, offline evaluation, and monitoring strategies for running contextual bandits safely in production.
Writing & Research
This is the hub for my long-form writing on data science, machine learning, and the engineering patterns that support them in production.
Each article aims to balance clarity with rigor: you will find walkthroughs of algorithms, postmortems from experiments, and the practical guardrails that emerge from shipping systems.
Use the filters to surface the topics, stacks, and case studies that match your current problem.
When linear models fail, neural networks step in. Learn when to use neural bandits, how to quantify uncertainty with bootstrap ensembles, and how to handle high-dimensional action spaces with embeddings and two-stage selection.
Complete Python implementations of ε-greedy, UCB, LinUCB, and Thompson Sampling. Learn which algorithm to use for your problem with default hyperparameters and practical tuning guidance.
Understand the theory behind contextual bandits: regret bounds, the exploration-exploitation tradeoff, reward models, and why certain algorithms work. Math that directly informs practice.
Stop running month-long A/B tests that leave value on the table. Learn when contextual bandits are the right choice for adaptive, personalized optimization, and when to stick with simpler alternatives.
Stop relying on gut feelings to evaluate LLM outputs. Learn systematic approaches to build trustworthy evaluation pipelines with measurable metrics, proven methods, and production-ready practices. A practical guide covering faithfulness vs helpfulness, LLM-as-judge techniques, bias mitigation, and continuous monitoring.
I pulled together notes on the Differential Transformer and its take on attention.
I wrote about OpenELM and how Apple approaches efficient language models.
The blog is a living lab notebook. Some essays are polished deep dives, others capture lessons while they are still fresh — both have a place in the learning loop.
Perfection slows the feedback cycle, so I share drafts, return with new data, and document the missteps alongside the breakthroughs.
If something sparks a question or disagreement, please reach out. Dialogue keeps the writing honest and ensures the next revision is better informed.