Experience
Independent AI Researcher (2024 – Present)
I lead self-directed foundational AI research across the full model lifecycle — from pretraining mechanisms to post-training alignment, optimization, and distributed systems.
My work combines first-principles understanding with implementation-level depth. Rather than calling fine-tuning APIs, I build and analyze core components from scratch, focusing on how modern intelligence systems actually work under the hood.
Focus Areas:
- Pretraining: Implemented architectures like DeepSeek’s Multi-Head Latent Attention (MLA) and Mixture-of-Experts (MoE) from first principles.
- Post-training: Recreated and visualized alignment and reinforcement methods such as DPO, GRPO, and ReTool, including hands-on implementations for strategic tool use in LLMs.
- Optimization: Authored “Understanding the Muon Optimizer” — a deep dive into the theory, math, and implementation of a frontier optimizer used in record-breaking training runs.
- Distributed Systems: Currently investigating FSDP and ZeRO architectures — decoding how synchronization, communication, and memory define the economics of large-scale model training.
Impact:
My annotated codebases, tutorials, and visual guides have been widely used by practitioners learning foundational AI from scratch.
This line of work culminates in my upcoming PyData Global 2025 talk:
“I Built a Transformer from Scratch So You Don’t Have To.”
Industry Work — Building at the Intersection of Data and Systems (2020 – Present)
- Principal Data Scientist – InsurTech Startup (2025 – Present)
  Designing and deploying pricing models for next-generation insurance risk modeling. Bridging classical statistics and ML-driven automation to bring transparency and speed to underwriting.
- Senior Machine Learning Engineer – R&D Team (2021 – 2024)
  Founding member of a 3D perception team, building a LiDAR-based object detection SDK from zero to alpha release. Designed training, packaging, and inference workflows for high-performance deployment.
- Data Science Technical Lead – Consulting Startup (2020)
  Led a small data science team delivering predictive modeling for energy and utilities clients. Introduced reproducible workflows and mentored junior analysts to accelerate project delivery.
Senior Quantitative Analyst – Energy & Utilities (2011 – 2019)
Developed quantitative models for electricity trading, pricing, and renewable investment strategy. Translated between policy, regulation, and math to help clients make multi-million-dollar infrastructure decisions.
Education & Foundations
- PhD in Applied Mathematics (Dual Degree with MS in Statistics), Michigan State University (2012)
- Master Certificate in Strategic Organizational Leadership & Management, Michigan State University (2015)