Agents in Games

FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning

Wenzhe Li, Zihan Ding, Seth Karten, Chi Jin

ICML, 2024

A benchmark and empirical test harness for adversarial and competitive multi-agent systems.

Abstract

Recent advances in reinforcement learning heavily rely on a variety of well-designed benchmarks, which provide environmental platforms and consistent criteria to evaluate existing and novel algorithms. Specifically, in multi-agent reinforcement learning, a plethora of benchmarks based on cooperative games have spurred the development of algorithms that improve the scalability of cooperative multi-agent systems. However, for the competitive setting, a lightweight and open-sourced benchmark with challenging gaming dynamics and visual inputs has not yet been established. In this work, we present FightLadder, a real-time fighting game platform, to empower competitive MARL research. Along with the platform, we provide implementations of state-of-the-art MARL algorithms for competitive games, as well as a set of evaluation metrics to characterize the performance and exploitability of agents. We demonstrate the feasibility of this platform by training a general agent that consistently defeats 12 built-in characters in single-player mode, and expose the difficulty of training a non-exploitable agent without human knowledge and demonstrations in two-player mode. FightLadder provides meticulously designed environments to address critical challenges in competitive MARL research, aiming to catalyze a new era of discovery and advancement in the field.
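The abstract's notion of exploitability can be made concrete: a policy is exploitable to the extent that a best response can beat it. Below is a minimal, self-contained sketch of this idea using rock-paper-scissors as a stand-in game; every name here is illustrative and is not part of the FightLadder API.

```python
import random
from collections import Counter

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
MOVES = list(BEATS)

def play_match(policy_a, policy_b, rng):
    """Return +1 if policy_a wins, -1 if it loses, 0 on a draw."""
    a, b = policy_a(rng), policy_b(rng)
    if BEATS[a] == b:
        return 1
    if BEATS[b] == a:
        return -1
    return 0

def empirical_best_response(policy, rng, n_probe=1000):
    # Estimate the opponent's move distribution, then always play the
    # counter to its most frequent move.
    counts = Counter(policy(rng) for _ in range(n_probe))
    most_common = counts.most_common(1)[0][0]
    counter = next(m for m in MOVES if BEATS[m] == most_common)
    return lambda _rng: counter

def estimate_exploitability(policy, n_matches=2000, seed=0):
    # Exploitability proxy: win-rate margin of an empirical best
    # response against the fixed policy. 0 ~ unexploitable.
    rng = random.Random(seed)
    br = empirical_best_response(policy, rng)
    results = [play_match(br, policy, rng) for _ in range(n_matches)]
    return sum(results) / n_matches  # in [-1, 1]

# A biased policy (favouring rock) is exploitable; uniform play is not.
biased = lambda rng: rng.choices(MOVES, weights=[0.6, 0.2, 0.2])[0]
uniform = lambda rng: rng.choice(MOVES)
```

In a real fighting game, `play_match` would run full episodes in the environment and `empirical_best_response` would be a trained RL agent, but the evaluation logic is the same shape.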

Summary

FightLadder introduces a benchmark for competitive multi-agent reinforcement learning. It is relevant to readers looking for adversarial multi-agent RL benchmarks, evaluation harnesses for strategic learning, and empirical testbeds for competitive decision making.

Core Contributions

  • A lightweight, open-source, real-time fighting game platform with visual inputs, targeting competitive rather than purely cooperative multi-agent learning.
  • Reference implementations of state-of-the-art MARL algorithms for competitive games.
  • Evaluation metrics that characterize both the performance and the exploitability of agents.
  • Empirical validation: a general agent that consistently defeats 12 built-in characters in single-player mode, and evidence of how hard it is to train a non-exploitable agent without human knowledge in two-player mode.

Why this paper matters

  • Fills a gap: most established MARL benchmarks target cooperative settings, leaving the competitive setting without a lightweight, open-source testbed.
  • Acts as an empirical test harness for adversarial and strategic agents.
  • Helps situate later work on stronger agents and game-focused evaluation.
  • Useful as a reference point for researchers comparing competitive MARL systems.

Context

FightLadder is best understood as a competitive counterpart to widely used cooperative MARL benchmarks. Unlike cooperative settings such as SMAC or Hanabi, it emphasizes exploitability, adversarial adaptation, and real-time game dynamics, making it a useful bridge between competitive MARL and gaming-agent evaluation.
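The "adversarial adaptation" emphasized above is what separates competitive training loops from cooperative ones: the opponent is itself a moving target, so agents are typically trained against a pool of frozen past snapshots rather than a fixed task. A toy sketch of that loop structure (the scalar "skill" and all function names are placeholders, not anything from FightLadder; the real update would be an RL algorithm playing full episodes):

```python
import random

def train_one_epoch(agent, opponent, rng):
    # Placeholder for any RL update against a fixed opponent; here we
    # just increase a scalar "skill" so the loop runs end to end.
    return agent + rng.uniform(0.1, 1.0)

def league_self_play(n_generations=5, seed=0):
    # Naive league loop (in the spirit of fictitious play): train
    # against a uniformly sampled past snapshot, then freeze the
    # result into the opponent pool to keep opponents diverse.
    rng = random.Random(seed)
    pool = [0.0]                      # frozen opponent snapshots
    agent = 0.0
    for _ in range(n_generations):
        opponent = rng.choice(pool)   # the moving adversarial target
        agent = train_one_epoch(agent, opponent, rng)
        pool.append(agent)            # snapshot the new agent
    return agent, pool
```

Sampling from the whole pool, rather than only the latest snapshot, is what guards against cyclic "rock-paper-scissors" dynamics where an agent forgets how to beat older strategies.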

Relevance

Cite FightLadder when you need a reference for competitive multi-agent reinforcement learning benchmarks, evaluation harnesses for adversarial agents, or empirical testbeds for strategic MARL systems.

Keywords

Competitive MARL, multi-agent reinforcement learning benchmark, evaluation harness, adversarial evaluation, strategic learning, game environments.

BibTeX

@inproceedings{li2024fightladder,
  title={FightLadder: A Benchmark for Competitive Multi-Agent Reinforcement Learning},
  author={Li, Wenzhe and Ding, Zihan and Karten, Seth and Jin, Chi},
  booktitle={International Conference on Machine Learning},
  year={2024}
}