Chess as a living proof of the OpenVerb protocol
This is not a chess product. It is a demonstration that any deterministic environment can be turned into a verb-driven control surface where AI agents operate through structured, validated, and replayable actions.
Why chess is the perfect sandbox
Chess has everything OpenVerb needs to shine: strict turn-based flow, deterministic state, finite legal actions, and a complete audit trail. The verb surface is deliberately tiny, which makes it dead simple to demonstrate how the verb-to-executor-to-state-change loop works in practice.
The architecture
The system follows the OpenVerb mental model: a truth source (the chess engine), a set of typed verbs (the action language), pluggable agents (human or AI), and an immutable action log that records every interaction.
1. Engine: it's White's turn
2. Agent calls chess.get_state → reads FEN, status
3. Agent calls chess.get_legal_moves → gets valid actions
4. Agent calls chess.get_history → reviews past moves
5. Agent chooses → calls chess.make_move with UCI notation
6. Engine validates → applies move → updates state
7. Action is logged with reasoning + timestamp
8. Repeat for next agent

The verb surface
Every interaction between an agent and the game goes through exactly one of these four verbs. Read verbs are safe and idempotent. Write verbs mutate state and are validated by the engine before execution.
chess.get_state: Returns the current board position (FEN), whose turn it is, and the game status. Safe to call any time.
chess.get_legal_moves: Returns all legal moves for the current position in UCI format. The agent must choose from this list.
chess.get_history: Returns the complete move history with SAN notation, timestamps, and ply numbers.
chess.make_move: Submits a move in UCI format. The engine validates turn order, legality, and game status before applying.
A logged make_move action looks like this:
{
"verb": "chess.make_move",
"agent": "ai-white",
"args": {
"gameId": "game_1706000000",
"moveUCI": "e2e4"
},
"reasoning": "Controls the center and opens lines for the bishop and queen.",
"timestamp": 1706000001234
}

The truth source
The chess engine is the single source of truth. AI agents never edit the board directly; they can only call verbs. The engine validates every action and maintains the canonical game state. This is the core safety guarantee.
interface GameState {
board: Square[][] // 8x8 grid of pieces or null
turn: "w" | "b" // Whose move
castling: { // Castling availability
K: boolean // White kingside
Q: boolean // White queenside
k: boolean // Black kingside
q: boolean // Black queenside
}
enPassant: string | null // Target square for en passant
halfmoveClock: number // For 50-move draw rule
fullmoveNumber: number // Increments after Black moves
}

How the AI actually plays
LLMs are not chess engines. They are pattern matchers, not brute-force evaluators. Neither GPT nor Claude will play anywhere near Stockfish level. In practice, both play human-like, imperfect chess at roughly 1200-1800 Elo, depending on the model and prompt design.
What to expect
- Games are competitive. Both models play imperfect but interesting chess. No model consistently dominates.
- Different styles emerge. Some models play more aggressively, others more positionally. This makes for great viewing.
- Humans can win. In Human vs AI mode, games feel fair. Much better UX than getting crushed by Stockfish.
- Reasoning is visible. Every move includes the AI's reasoning in the action log, so you can see what it was thinking.
API cost analysis
The key insight: do not ask the model to analyze the whole board. You already computed the legal moves and the FEN. Just ask the model to choose from the list. This keeps token counts tiny and costs negligible.
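The paragraph above translates into a very small prompt. A sketch of what that could look like (the exact wording is an assumption, not the project's actual prompt):

```typescript
// Build a compact prompt: just the position plus the precomputed legal moves.
// Keeping board analysis out of the prompt is what keeps token counts small.
function buildMovePrompt(fen: string, legalMoves: string[]): string {
  return [
    `You are playing chess. Current position (FEN): ${fen}`,
    `Legal moves (UCI): ${legalMoves.join(" ")}`,
    "Reply with exactly one move from the list, then one sentence of reasoning.",
  ].join("\n");
}

const prompt = buildMovePrompt(
  "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
  ["e2e4", "d2d4", "g1f3"],
);
```

A prompt like this stays well within a few hundred tokens per call, which is what makes the cost figures below possible.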
| Mode | API Calls | Tokens/Call | Est. Cost |
|---|---|---|---|
| Human vs AI | 20-40 | ~300-500 | < $0.02 |
| AI vs AI | 40-80 | ~300-500 | < $0.05 |
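The estimates above are simple arithmetic. A sketch of the calculation; the per-token price is an illustrative assumption, not a quote for any specific model:

```typescript
// Rough per-game cost: calls * tokens per call * price per million tokens.
function estimateCostUSD(calls: number, tokensPerCall: number, pricePerMTokens: number): number {
  return (calls * tokensPerCall * pricePerMTokens) / 1_000_000;
}

// AI vs AI at the upper bound: 80 calls * 500 tokens = 40k tokens.
// At an assumed $1 per 1M tokens, that is $0.04 per game.
const worstCase = estimateCostUSD(80, 500, 1.0);
```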
The real point
The chess game is not the product. It is the proof. People ask "What does OpenVerb actually do?" and you point at a live chess match where every move is a logged, validated, replayable verb action. That is infinitely more convincing than a diagram.
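Replayability falls out of the log directly: filter to the write verbs and re-apply them in order. A sketch, using an entry shape modeled on the logged action shown earlier (`applyMove` is a stand-in for the engine's validated move application):

```typescript
interface LoggedAction {
  verb: string;
  agent: string;
  args: { gameId: string; moveUCI?: string };
  reasoning?: string;
  timestamp: number;
}

// Replay: keep only make_move actions, sort by time, re-apply in order.
// Returns the number of moves replayed.
function replay(log: LoggedAction[], applyMove: (uci: string) => void): number {
  const moves = log
    .filter(a => a.verb === "chess.make_move" && a.args.moveUCI)
    .sort((a, b) => a.timestamp - b.timestamp);
  for (const m of moves) applyMove(m.args.moveUCI!);
  return moves.length;
}
```

Because the read verbs are idempotent, they can be dropped from replay entirely; only the validated writes matter for reconstructing the game.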
Three demo angles that land
1. Deterministic sandbox + audited actions. Every action is logged, replayable, and attributable to an agent with a policy version.
2. Swappable agents. Plug in an LLM agent, a Stockfish agent, a random agent, or a human. They all talk through the same verb surface.
3. Policy + paywall hooks. Rate-limit per game, per user. Analysis depth tiers. This is where monetization fits naturally into the protocol.
Swappable agents
Because every agent talks through the same verb surface, swapping one agent for another is trivial. The engine does not care who is calling the verbs, only that the verbs are valid. This is the core extensibility of OpenVerb.
interface Agent {
id: string // Unique identifier
name: string // Display name
type: "human" | "ai"
color: "w" | "b" // Which side they play
model?: string // AI model (e.g. "openai/gpt-4o")
}
// The engine doesn't care about the agent type.
// It only validates:
// 1. Is it this agent's turn?
// 2. Is the move legal?
// 3. Is the game still playing?

See it in action
Go back to the game, select AI vs AI, pick two different models, and watch them play through the OpenVerb protocol. Every move is logged.
Play a game