Research Areas
This page is the index of my papers. My research develops methods and evaluations for multi-agent systems (zero-sum and general-sum), generalized harnesses for embodied agents, and economic alignment and safety.
Large populations of language agents, mechanism design, and policy evaluation in synthetic economic systems, with a focus on alignment and safety.
Strategic language agents and reasoning frameworks for adversarial decision making, with expert-level performance in competitive games.
Benchmarks and evaluation harnesses for agentic capabilities, plus high-performance RL environment infrastructure for fast and faithful policy learning.
Communication protocols in cooperative and social multi-agent learning, with an emphasis on interpretability, sparsity, and human-agent teaming.
Earlier work on improving kinodynamic planning for autonomous vehicles with learned controllers.
Economic alignment
This paper studies large populations of language agents and uses them to analyze policy and mechanism design questions in multi-agent generative simulacra, with a focus on economic alignment and safety.
Agent harnesses
The first Pokemon battling paper published at ICML, ICLR, or NeurIPS. This ICML Spotlight paper establishes competitive Pokemon battling as a machine learning setting for reasoning agents and strategic language agents.
Evaluations & RL environments
A competition benchmark and evaluation harness that turns Pokemon into a durable machine learning testbed for long-context learning, reasoning agents, embodied agents, and strategic decision making.
Evaluations & RL environments
A benchmark of 132 real game development tasks evaluating agentic coding, multimodal reasoning, and graphics-aware capabilities, with image and video feedback substantially improving performance.
Evaluations & RL environments
An agent-assisted method that translates RL environments into high-performance implementations with semantic equivalence, achieving speedups up to 22,320x and validating cross-backend policy transfer.
Evaluations & RL environments
A benchmark and empirical test harness for evaluating competitive multi-agent reinforcement learning in structured adversarial environments.
Emergent communication
This paper examines how emergent communication shapes social learning dynamics in multi-agent reinforcement learning.
Emergent communication
This paper studies how to learn sparse communication protocols without discarding information needed for coordination.
Emergent communication
This paper focuses on making learned emergent communication more interpretable for human-agent teams.
Robotics
This paper improves kinodynamic planning for vehicles by combining classical planning with learned goal-reaching controllers.