Louis Wang
ML engineer at Netflix, previously at Snap. I build reasoning recommender systems and AI agents — from generative retrieval and semantic IDs to autonomous agents, multi-agent systems, and LLM-powered applications.
Recent posts
- Picking the Wrong Agent Topology Is Your Most Expensive Mistake
  Five multi-agent design patterns, with a decision guide for when each one earns its complexity and when it will burn you.
- From Quadratic to Linear: A Survey of Subquadratic Sparse Attention
  Why standard attention breaks at 128K tokens, how four families of efficient attention tried and partially failed to fix it, and how content-dependent sparse routing achieves linear scaling without sacrificing retrieval accuracy.
- The Agent Harness Pattern: What Poker Taught Me About Multi-Agent Systems
  How a Texas Hold'em simulator became a blueprint for any domain where autonomous agents compete, negotiate, and adapt, turn by turn.
- Building a Self-Improving Personal Knowledge Base Powered by LLM
  Inspired by Andrej Karpathy's post on LLM knowledge bases, I built a system where Claude Code skills manage a personal wiki end-to-end: ingesting raw content, compiling concept articles, synthesizing connections, and answering questions. You never touch the wiki. The LLM owns it.
- Gemma 4 Explained: How One Model Family Spans Phones and Frontier-Class Reasoning
  A technical deep-dive into Gemma 4's four core ideas (MatFormer elastic inference, hybrid attention with p-RoPE, parallel dense+MoE FFN, and native agentic tooling), with the Gemma 1–3 lineage as context.