Chess as a living proof of the OpenVerb protocol
This is not a chess product. It is a demonstration that any deterministic environment can be turned into a verb-driven control surface where AI agents operate through structured, validated, and replayable actions.
Why chess is the perfect sandbox
Chess has everything OpenVerb needs to shine: strict turn-based flow, deterministic state, finite legal actions, and a complete audit trail. The verb surface is deliberately tiny, which makes it dead simple to demonstrate how the verb-to-executor-to-state-change loop works in practice.
The architecture
The system follows the OpenVerb mental model: a truth source (the chess engine), a set of typed verbs (the action language), pluggable agents (human or AI), and an immutable action log that records every interaction.
1. Engine: it's White's turn
2. Agent calls chess.get_state → reads FEN, status
3. Agent calls chess.get_legal_moves → gets valid actions
4. Agent calls chess.get_history → reviews past moves
5. Agent chooses → calls chess.make_move with UCI notation
6. Engine validates → applies move → updates state
7. Action is logged with reasoning + timestamp
8. Repeat for next agent

The verb surface
Every interaction between an agent and the game goes through exactly one of these four verbs. Read verbs are safe and idempotent. Write verbs mutate state and are validated by the engine before execution.
chess.get_state: Returns the current board position (FEN), whose turn it is, and the game status. Safe to call any time.
chess.get_legal_moves: Returns all legal moves for the current position in UCI format. The agent must choose from this list.
chess.get_history: Returns the complete move history with SAN notation, timestamps, and ply numbers.
chess.make_move: Submits a move in UCI format. The engine validates turn order, legality, and game status before applying.
A logged make_move action looks like this:
{
"verb": "chess.make_move",
"agent": "ai-white",
"args": {
"gameId": "game_1706000000",
"moveUCI": "e2e4"
},
"reasoning": "Controls the center and opens lines for the bishop and queen.",
"timestamp": 1706000001234
}

The truth source
The chess engine is the single source of truth. AI agents never edit the board directly; they can only call verbs. The engine validates every action and maintains the canonical game state. This is the core safety guarantee.
interface GameState {
board: Square[][] // 8x8 grid of pieces or null
turn: "w" | "b" // Whose move
castling: { // Castling availability
K: boolean // White kingside
Q: boolean // White queenside
k: boolean // Black kingside
q: boolean // Black queenside
}
enPassant: string | null // Target square for en passant
halfmoveClock: number // For 50-move draw rule
fullmoveNumber: number // Increments after Black moves
}

How the AI actually plays
LLMs are not chess engines. They are pattern matchers, not brute-force evaluators. Neither GPT nor Claude will play anywhere near Stockfish level. In practice, both play human-like, imperfect chess at roughly 1200-1800 Elo, depending on the model and prompt design.
What to expect
- Games are competitive. Both models play imperfect but interesting chess. No model consistently dominates.
- Different styles emerge. Some models play more aggressively, others more positionally. This makes for great viewing.
- Humans can win. In Human vs AI mode, games feel fair. Much better UX than getting crushed by Stockfish.
- Reasoning is visible. Every move includes the AI's reasoning in the action log, so you can see what it was thinking.
API cost analysis
The key insight: do not ask the model to analyze the whole board. You already computed the legal moves and the FEN. Just ask the model to choose from the list. This keeps token counts tiny and costs negligible.
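The paragraph above translates into a very small prompt. A sketch of what that could look like (the exact wording is an assumption, not the project's actual prompt):

```typescript
// Build a compact prompt: just the position plus the precomputed legal moves.
// Keeping board analysis out of the prompt is what keeps token counts small.
function buildMovePrompt(fen: string, legalMoves: string[]): string {
  return [
    `You are playing chess. Current position (FEN): ${fen}`,
    `Legal moves (UCI): ${legalMoves.join(" ")}`,
    "Reply with exactly one move from the list, then one sentence of reasoning.",
  ].join("\n");
}

const prompt = buildMovePrompt(
  "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
  ["e2e4", "d2d4", "g1f3"],
);
```

A prompt like this stays well within a few hundred tokens per call, which is what makes the cost figures below possible.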
| Mode | API Calls | Tokens/Call | Est. Cost |
|---|---|---|---|
| Human vs AI | 20-40 | ~300-500 | < $0.02 |
| AI vs AI | 40-80 | ~300-500 | < $0.05 |
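The estimates above are simple arithmetic. A sketch of the calculation; the per-token price is an illustrative assumption, not a quote for any specific model:

```typescript
// Rough per-game cost: calls * tokens per call * price per million tokens.
function estimateCostUSD(calls: number, tokensPerCall: number, pricePerMTokens: number): number {
  return (calls * tokensPerCall * pricePerMTokens) / 1_000_000;
}

// AI vs AI at the upper bound: 80 calls * 500 tokens = 40k tokens.
// At an assumed $1 per 1M tokens, that is $0.04 per game.
const worstCase = estimateCostUSD(80, 500, 1.0);
```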
The real point
The chess game is not the product. It is the proof. People ask "What does OpenVerb actually do?" and you point at a live chess match where every move is a logged, validated, replayable verb action. That is infinitely more convincing than a diagram.
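Replayability falls out of the log directly: filter to the write verbs and re-apply them in order. A sketch, using an entry shape modeled on the logged action shown earlier (`applyMove` is a stand-in for the engine's validated move application):

```typescript
interface LoggedAction {
  verb: string;
  agent: string;
  args: { gameId: string; moveUCI?: string };
  reasoning?: string;
  timestamp: number;
}

// Replay: keep only make_move actions, sort by time, re-apply in order.
// Returns the number of moves replayed.
function replay(log: LoggedAction[], applyMove: (uci: string) => void): number {
  const moves = log
    .filter(a => a.verb === "chess.make_move" && a.args.moveUCI)
    .sort((a, b) => a.timestamp - b.timestamp);
  for (const m of moves) applyMove(m.args.moveUCI!);
  return moves.length;
}
```

Because the read verbs are idempotent, they can be dropped from replay entirely; only the validated writes matter for reconstructing the game.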
Three demo angles that land
1. Deterministic sandbox + audited actions. Every action is logged, replayable, and attributable to an agent with a policy version.
2. Swappable agents. Plug in an LLM agent, a Stockfish agent, a random agent, or a human. They all talk through the same verb surface.
3. Policy + paywall hooks. Rate-limit per game, per user. Analysis depth tiers. This is where monetization fits naturally into the protocol.
Swappable agents
Because every agent talks through the same verb surface, swapping one agent for another is trivial. The engine does not care who is calling the verbs, only that the verbs are valid. This is the core extensibility of OpenVerb.
interface Agent {
id: string // Unique identifier
name: string // Display name
type: "human" | "ai"
color: "w" | "b" // Which side they play
model?: string // AI model (e.g. "openai/gpt-4o")
}
// The engine doesn't care about the agent type.
// It only validates:
// 1. Is it this agent's turn?
// 2. Is the move legal?
// 3. Is the game still playing?

See it in action
Go back to the game, select AI vs AI, pick two different models, and watch them play through the OpenVerb protocol. Every move is logged.
Play a game