About

Hi there, my name is Jen Wei.

I’m a T-shaped AI researcher — a generalist by curiosity, a specialist by discipline.
My work sits at the intersection of deep technical research and applied problem-solving: from model architectures and optimization to distributed training and real-world deployment.

I earned a PhD in Applied Mathematics and an MS in Statistics through a dual-degree program — training that taught me to think from first principles, set high standards, and always ask why.
Years of building real systems from scratch taught me the other side of the equation: how to turn those ideas into working code.

In many ways, I’m a hybrid between scientist and engineer — I ask “why” more deeply than most engineers, and I think “how” more concretely than most scientists. That balance drives how I approach every project: analytical in reasoning, pragmatic in execution.

Over the past decade, I’ve moved between three worlds:

Enterprise data science, where I led quantitative modeling for energy and utilities — translating between regulators, traders, and algorithms.
Startup R&D, where I was the first hire on a 3D vision SDK team and built the system from zero lines of code to alpha release.
Independent research, where I focus on the hard engineering behind modern AI — implementing optimizers like Muon, experimenting with distributed frameworks (FSDP, ZeRO), and teaching others how these systems actually work.

I’m currently a speaker at PyData Global 2025, presenting my talk “I Built a Transformer from Scratch So You Don’t Have To.”
Beyond code and research, I’m passionate about helping others connect the dots between theory, architecture, and real impact.

Whether you’re scaling AI infrastructure, exploring reinforcement learning for reasoning, or looking for someone who can go from first principles to production, I’m always up for tackling the next hard problem.

