Evaluations & RL Environments

Automatic Generation of High-Performance RL Environments

Seth Karten, Rahul Dev Appapogu, Chi Jin

arXiv preprint, 2026

A method for automatically translating reinforcement learning environments into high-performance implementations that preserve semantic equivalence, with speedups of up to 22,320x.

Abstract

We present a method for automatically translating reinforcement learning environments into high-performance implementations at minimal cost. Our approach uses a generic prompt template, hierarchical verification, and iterative agent-assisted repair to achieve semantic equivalence with the reference implementation. We demonstrate the approach on five environments, including EmuRust, PokeJAX, and TCGJax, achieving speedups ranging from 1.5x to 22,320x over the reference implementations. Semantic equivalence is verified through property, interaction, and rollout testing, and we validate cross-backend policy transfer.
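The rollout-testing idea from the abstract can be sketched as follows. This is an illustrative toy, not the paper's code: both environment classes, the action space, and the `rollouts_match` helper are hypothetical stand-ins. The core check is replaying identical seeded action sequences through the reference environment and its translated port and requiring identical trajectories.

```python
# Illustrative sketch (not the paper's code): rollout testing compares
# full trajectories of a reference environment and its translated port.
import random

class ReferenceEnv:
    """Toy reference environment: a 1-D walk with a fixed step budget."""
    def reset(self, seed):
        self.pos, self.t = 0, 0
        return self.pos
    def step(self, action):          # action in {-1, +1}
        self.pos += action
        self.t += 1
        reward = -abs(self.pos)      # reward staying near the origin
        done = self.t >= 10
        return self.pos, reward, done

class TranslatedEnv(ReferenceEnv):
    """Stand-in for a high-performance port; must match ReferenceEnv exactly."""
    pass

def rollouts_match(ref, fast, seeds, horizon=10):
    """Replay identical seeded action sequences through both envs and
    compare every (observation, reward, done) tuple."""
    for seed in seeds:
        action_rng = random.Random(seed)
        if ref.reset(seed) != fast.reset(seed):
            return False
        for _ in range(horizon):
            a = action_rng.choice([-1, 1])   # same action fed to both envs
            out_ref, out_fast = ref.step(a), fast.step(a)
            if out_ref != out_fast:
                return False
            if out_ref[2]:                   # both terminated at this step
                break
    return True

print(rollouts_match(ReferenceEnv(), TranslatedEnv(), seeds=range(20)))  # True
```

Property and interaction testing would layer on top of this: checking invariants of individual states and transitions rather than whole trajectories.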

Why this paper matters

  • Removes a major bottleneck in RL research by automating high-performance environment ports.
  • Combines hierarchical verification with iterative agent repair to preserve semantic equivalence.
  • Validates that policies learned on translated backends transfer back to the reference, closing the loop.
  • Releases EmuRust, PokeJAX, and TCGJax as concrete artifacts for the community.

Keywords

Reinforcement learning, high-performance environments, agent-assisted code generation, JAX, Rust, semantic verification, policy transfer.

BibTeX

@article{karten2026autorlenv,
  title={Automatic Generation of High-Performance RL Environments},
  author={Karten, Seth and Appapogu, Rahul Dev and Jin, Chi},
  journal={arXiv preprint arXiv:2603.12145},
  year={2026}
}