Louis Wang

ML engineer at Netflix, previously at Snap. I build reasoning recommender systems and AI agents — from generative retrieval and semantic IDs to autonomous agents, multi-agent systems, and LLM-powered applications.

Recent posts

March 29, 2026 · 17 min read

Two Bets on Generative Recommendation: Semantic IDs vs. Fine-Tuned LLMs

A head-to-head comparison of the two paradigms remaking recommendation — semantic ID autoregressive models and fine-tuned LLMs — with trade-off analysis and a look at how they're converging.

March 28, 2026 · 16 min read

The Attention Bottleneck: How Modern LLMs Solved a Problem That Nearly Broke the Transformer

From vanilla multi-head attention to Flash Attention 3 — the engineering bottlenecks that drove every major attention variant and the math behind each fix.

March 25, 2026 · 22 min read

The Harness Is the Moat: Why Autonomous AI Agents Live or Die by Their Architecture

Model quality is commoditising. The durable competitive advantage in 2026 is harness architecture — the deterministic enclosures that make probabilistic agents reliable. A deep analysis of the four architectural primitives every production harness must implement, and how Autoresearch, Ralph Loop, Superpowers, and GSD each solve them differently.

March 23, 2026 · 14 min read

Generative Recommendation in Production: HSTU, OneRec, and What Every Major Platform Is Building

From semantic IDs to OneRec Think — how Meta, Kuaishou, Google, Alibaba, ByteDance, and LinkedIn are replacing two-stage retrieval pipelines with generative models. What's in production and where the field is heading.

March 23, 2026 · 10 min read

From Vibe Coding to Harness Engineering: How to Actually Ship AI-Assisted Software

Vibe coding gets you a working prototype in 10 minutes. Harness engineering is how you ship it to production. Here's the difference, why it matters, and how to make the transition.

All posts →
