
About

Balatro-rs implements a simplified version of Balatro in Rust, providing both a game engine and an exhaustive move generator. The library constrains the full game into a manageable state space, making it feasible to apply reinforcement learning techniques. Python bindings via PyO3 enable integration with machine learning frameworks like Gymnasium for training agents.

Features

The implementation includes a subset of Balatro's mechanics, focusing on the core gameplay loop:

Implemented:

  • Poker hand identification and scoring
  • Card playing, discarding, and reordering
  • Blind pass/fail conditions and game progression
  • Money and interest generation
  • Ante progression through ante 8
  • Blind types (small, big, boss)
  • Stage transitions (pre-blind, blind, post-blind, shop)
  • Basic joker buying, selling, and usage

Not Implemented:

  • Tarot, planet, and spectral cards
  • Boss blind modifiers
  • Skip blinds and tags
  • Card enhancements, foils, and seals
  • Alternative decks and stakes

The goal is not full feature parity with Balatro but rather a simplified implementation suitable for reinforcement learning experiments.
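Of the implemented mechanics, poker hand identification is the central scoring primitive. As a rough illustration of what that classification involves — a hypothetical Python sketch, not the crate's actual code, and ignoring Balatro-specific chips/mult scoring:

```python
from collections import Counter

def classify_hand(cards):
    """Classify a list of (rank, suit) tuples into a poker hand name.

    Ranks are 2-14 (14 = ace). Simplified sketch: no Balatro scoring,
    just the hand category for a played selection of up to five cards.
    """
    ranks = sorted(c[0] for c in cards)
    suits = [c[1] for c in cards]
    counts = sorted(Counter(ranks).values(), reverse=True)
    is_flush = len(cards) == 5 and len(set(suits)) == 1
    is_straight = (len(cards) == 5 and len(set(ranks)) == 5
                   and ranks[4] - ranks[0] == 4)
    if set(ranks) == {14, 2, 3, 4, 5}:  # ace-low straight
        is_straight = True
    if is_straight and is_flush:
        return "Straight Flush"
    if counts[0] == 4:
        return "Four of a Kind"
    if counts[:2] == [3, 2]:
        return "Full House"
    if is_flush:
        return "Flush"
    if is_straight:
        return "Straight"
    if counts[0] == 3:
        return "Three of a Kind"
    if counts[:2] == [2, 2]:
        return "Two Pair"
    if counts[0] == 2:
        return "Pair"
    return "High Card"
```

The real engine additionally maps each hand category to chips and mult values before scoring.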

Design

The engine is built in Rust for performance and provides two distinct APIs for action generation.

The dynamic action API returns a variable-length list of all valid actions for the current game state. This approach is straightforward but less suited for fixed-size neural network inputs.

let actions: Vec<Action> = game.gen_actions().collect();
let action = actions[random_index].clone(); // random_index: any in-bounds index
game.handle_action(action);

The action space API returns a fixed-size masked array where valid actions are marked with 1 and invalid actions with 0. This bounded representation is designed for reinforcement learning frameworks that expect consistent input dimensions.

let space = game.gen_action_space();
let space_vec = space.to_vec(); // fixed-size mask of 0/1 flags
let action = space.to_action(valid_index, &game)?; // valid_index: any index marked 1
game.handle_action(action);
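On the consuming side, an agent typically samples only among indices where the mask is 1. A small Python sketch of that selection step (the mask values below are illustrative, not produced by the library):

```python
import random

def sample_valid_action(mask, rng=random):
    """Pick a uniformly random index whose mask entry is 1.

    `mask` is a fixed-size list of 0/1 flags, one entry per possible action.
    """
    valid = [i for i, m in enumerate(mask) if m == 1]
    if not valid:
        raise ValueError("no valid actions in mask")
    return rng.choice(valid)

# Illustrative fixed-size mask: only actions 1 and 3 are legal.
mask = [0, 1, 0, 1, 0]
idx = sample_valid_action(mask)
assert mask[idx] == 1
```

Because the mask length never changes, a neural network policy can emit one logit per index and simply zero out (or heavily penalize) the masked-off entries.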

Python bindings expose both APIs through PyO3, allowing the Rust game engine to integrate with Python’s machine learning ecosystem. A Gymnasium environment wrapper translates game states into observations and handles the step/reset interface expected by RL algorithms.
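The shape of such a wrapper can be sketched without the real engine. The following is a hypothetical, duck-typed outline of the step/reset contract — a real wrapper would subclass `gymnasium.Env` and delegate to `pylatro.GameEngine`; the field names and reward scheme here are assumptions for illustration:

```python
class BalatroEnvSketch:
    """Minimal Gymnasium-style wrapper sketch (hypothetical names).

    The engine is injected via a factory so the step/reset contract
    is visible on its own, independent of pylatro.
    """

    def __init__(self, engine_factory):
        self._engine_factory = engine_factory
        self._engine = None

    def reset(self):
        # Fresh engine per episode; return (observation, info).
        self._engine = self._engine_factory()
        return self._observe(), {}

    def step(self, action):
        # Apply the action, then report the standard 5-tuple:
        # (observation, reward, terminated, truncated, info).
        self._engine.handle_action(action)
        terminated = self._engine.is_over
        reward = 1.0 if terminated and self._engine.is_win else 0.0
        return self._observe(), reward, terminated, False, {}

    def _observe(self):
        # Reduce engine state to a hashable tuple (fields illustrative).
        s = self._engine.state
        return (s.score, s.money)
```

The 5-tuple returned by `step` matches the interface Gymnasium-based algorithms expect, so standard training loops work unchanged.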

Reinforcement Learning

The constrained state space enables training agents with tabular Q-learning. Game observations are reduced to a tuple of key metrics: score, target, stage, round, plays remaining, discards remaining, money, and counts of cards in various positions.
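As a sketch, that reduction might look like the following (field names are hypothetical; the real pylatro state exposes its own):

```python
def observe(state):
    """Collapse a game state into a small hashable tuple.

    Tabular Q-learning needs hashable, low-cardinality states, so the
    observation keeps only the metrics listed above. Field names are
    illustrative, not the actual pylatro attributes.
    """
    return (
        state.score,
        state.target,
        state.stage,
        state.round,
        state.plays,       # plays remaining
        state.discards,    # discards remaining
        state.money,
        len(state.hand),   # cards currently held
    )
```

Keeping the tuple small is deliberate: every extra field multiplies the number of distinct states the Q-table must visit before its estimates become useful.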

A basic Q-learning agent uses epsilon-greedy exploration to balance trying new actions with exploiting learned values. The agent maintains a dictionary mapping state-action pairs to Q-values and updates them based on temporal difference errors.
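The agent described above can be sketched generically — this is standard tabular Q-learning with epsilon-greedy exploration, not the repository's exact `BalatroAgent` implementation, and the constructor arguments are illustrative:

```python
import random
from collections import defaultdict

class QLearningAgentSketch:
    """Tabular Q-learning with epsilon-greedy exploration (generic sketch)."""

    def __init__(self, n_actions, learning_rate=0.01, discount=0.95,
                 initial_epsilon=1.0, epsilon_decay=0.01, final_epsilon=0.1):
        # Q-values default to 0.0 for unseen states.
        self.q = defaultdict(lambda: [0.0] * n_actions)
        self.n_actions = n_actions
        self.lr = learning_rate
        self.discount = discount
        self.epsilon = initial_epsilon
        self.epsilon_decay = epsilon_decay
        self.final_epsilon = final_epsilon

    def get_action(self, obs):
        # Explore with probability epsilon, otherwise act greedily.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        values = self.q[obs]
        return values.index(max(values))

    def update(self, obs, action, reward, terminated, next_obs):
        # Temporal difference update: bootstrap unless the episode ended.
        future = 0.0 if terminated else max(self.q[next_obs])
        td_error = reward + self.discount * future - self.q[obs][action]
        self.q[obs][action] += self.lr * td_error

    def decay_epsilon(self):
        self.epsilon = max(self.final_epsilon, self.epsilon - self.epsilon_decay)
```

The training loop below drives an agent of this shape through repeated episodes: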

agent = BalatroAgent(
    env=env,
    learning_rate=0.01,
    initial_epsilon=1.0,
    epsilon_decay=epsilon_decay,
    final_epsilon=0.1,
)

for episode in range(n_episodes):
    obs, info = env.reset()
    done = False

    while not done:
        action = agent.get_action(obs)
        next_obs, reward, terminated, truncated, info = env.step(action)
        agent.update(obs, action, reward, terminated, next_obs)
        done = terminated or truncated
        obs = next_obs

    agent.decay_epsilon()

Training tracks episode rewards, lengths, and temporal difference errors. The implementation is experimental and results are limited, but the infrastructure provides a foundation for exploring different RL approaches on Balatro.

Examples

Random game simulation in Rust:

use balatro_rs::{action::Action, game::Game};
use rand::Rng;

fn main() {
    let mut game = Game::default();
    game.start();

    while !game.is_over() {
        let actions: Vec<Action> = game.gen_actions().collect();
        if actions.is_empty() {
            break;
        }

        let i = rand::thread_rng().gen_range(0..actions.len());
        let action = actions[i].clone();
        game.handle_action(action);
    }

    let result = game.result();
}

Random game simulation in Python:

import pylatro
import random

config = pylatro.Config()
config.ante_end = 1
game = pylatro.GameEngine(config)

while not game.is_over:
    actions = game.gen_actions()
    if not actions:
        break
    action = random.choice(actions)
    game.handle_action(action)

print(f"Win: {game.is_win}, Score: {game.state.score}")

The repository includes a CLI for manual testing where players select actions interactively to step through games.

Building

Clone and build the Rust library:

git clone https://github.com/evanofslack/balatro-rs
cd balatro-rs
cargo build --release

Build Python bindings with maturin:

cd pylatro
maturin develop

See the pylatro directory for reinforcement learning experiments and Gymnasium environment setup.