Bird of Paradise

Experience

Independent AI Researcher (2024 – Present)

I lead self-directed foundational AI research across the full model lifecycle — from pretraining mechanisms to post-training alignment, optimization, and distributed systems.

My work combines first-principles understanding with implementation-level depth. Rather than relying on fine-tuning APIs, I build and analyze core components from scratch, focusing on how modern AI systems actually work under the hood.

Focus Areas:

  • Pretraining: Implemented architectures like DeepSeek’s Multi-Head Latent Attention (MLA) and Mixture-of-Experts (MoE) from first principles.
  • Post-training: Recreated and visualized alignment and reinforcement methods such as DPO, GRPO, and ReTool, including hands-on implementations for strategic tool use in LLMs.
  • Optimization: Authored “Understanding the Muon Optimizer” — a deep dive into the theory, math, and implementation of a frontier optimizer used in record-breaking training runs.
  • Distributed Systems: Currently investigating FSDP and ZeRO architectures — decoding how synchronization, communication, and memory define the economics of large-scale model training.

Impact:
My annotated codebases, tutorials, and visual guides have been widely used by practitioners learning foundational AI from scratch.
This line of work culminates in my upcoming PyData Global 2025 talk:
“I Built a Transformer from Scratch So You Don’t Have To.”


Industry Work — Building at the Intersection of Data and Systems (2020 – Present)

  • Principal Data Scientist – InsurTech Startup (2025 – Present)
    Designing and deploying next-generation pricing models for insurance risk. Bridging classical statistics and ML-driven automation to bring transparency and speed to underwriting.

  • Senior Machine Learning Engineer – R&D Team (2021 – 2024)
    Founding member of a 3D perception team, building a LiDAR-based object detection SDK from zero to alpha release. Designed training, packaging, and inference workflows for high-performance deployment.

  • Data Science Technical Lead – Consulting Startup (2020)
    Led a small data science team delivering predictive modeling for energy and utilities clients. Introduced reproducible workflows and mentored junior analysts to accelerate project delivery.


Senior Quantitative Analyst – Energy & Utilities (2011 – 2019)

Developed quantitative models for electricity trading, pricing, and renewable investment strategy. Translated between policy, regulation, and math to help clients make multi-million-dollar infrastructure decisions.


Education & Foundations

  • PhD in Applied Mathematics (Dual Degree with MS in Statistics), Michigan State University (2012)

  • Master Certificate in Strategic Organizational Leadership & Management, Michigan State University (2015)