Arbiter: Deterministic Testing Framework
Version: 1.0.0 | Status: Production | Last Updated: 2026-01-10
Overview
Arbiter is Orix’s deterministic testing framework. Every test runs with an explicit seed, ensuring perfect reproducibility. Same seed = same result, every time.
The Problem
Traditional testing frameworks weren’t designed for deterministic systems:
| Problem | Traditional Test | Arbiter Solution |
|---|---|---|
| Non-deterministic failures | Random(), DateTime.Now | Explicit seeds, ScenarioContext |
| Unreproducible bugs | “Works on my machine” | Seed in every test, hash verification |
| Time-dependent code | Flaky CI/CD | Tick-based simulation |
| Unverified determinism | Manual checking | Automatic replay verification |
| Scattered test organization | Folder hierarchies | Category-based filtering |
The core problem: You can’t prove determinism with non-deterministic tests.
Arbiter’s answer: Every scenario is a reproducible experiment. Same inputs = same outputs, always.
How Arbiter Works
1. Seeded Scenarios
Every test declares its seed explicitly:
```csharp
[ArbiterScenario(Seed = 42, Category = "Math.DFixed64")]
public void DFixed64_Addition_IsCommutative()
{
    var a = DFixed64.FromInt(10);
    var b = DFixed64.FromInt(20);

    Assert.Equal(a + b, b + a);
}
```

Key insight: The seed isn’t hidden in a setup method. It’s visible in the attribute. Every developer knows exactly what inputs produced this result.
2. Automatic Replay
Arbiter can run determinism checks automatically:
```bash
orix arbiter determinism
```

This:
- Runs each scenario
- Records the final state hash
- Runs it again with the same seed
- Verifies identical hash
If determinism breaks, the test fails immediately.
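The steps above can be sketched in a few lines. This is a minimal, self-contained illustration under stated assumptions, not Arbiter's actual runner: `ReplaySketch`, `NextRandom`, `SeededScenario`, and `IsDeterministic` are hypothetical names, the scenario is modeled as a function from a seed to a final state hash, and a simple linear congruential generator stands in for the seeded RNG.

```csharp
using System;

// Hypothetical sketch of a two-run replay check; not Arbiter's real runner.
public static class ReplaySketch
{
    // One step of a 64-bit linear congruential generator (Knuth's MMIX constants),
    // used here only as a stand-in for a seeded RNG.
    public static ulong NextRandom(ref ulong state)
    {
        unchecked
        {
            state = state * 6364136223846793005UL + 1442695040888963407UL;
        }
        return state;
    }

    // Example scenario: folds 100 seeded random values into one final state hash.
    public static ulong SeededScenario(ulong seed)
    {
        ulong state = seed;
        ulong hash = 0;
        for (int i = 0; i < 100; i++)
        {
            unchecked
            {
                hash = (hash * 31UL) ^ NextRandom(ref state);
            }
        }
        return hash;
    }

    // Runs the scenario, records the hash, runs it again with the same seed, and compares.
    public static bool IsDeterministic(Func<ulong, ulong> scenario, ulong seed)
    {
        ulong firstHash = scenario(seed);
        ulong secondHash = scenario(seed);
        return firstHash == secondHash;
    }
}
```

Calling `ReplaySketch.IsDeterministic(ReplaySketch.SeededScenario, 42UL)` returns true because the scenario depends only on its seed. A scenario that reads ambient state, such as a shared counter or the wall clock, would produce different hashes on the two runs and fail the check.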
3. Category Organization
Tests organize by feature, not folder structure:
```csharp
[ArbiterScenario(Seed = 42, Category = "Flux.World")]
[ArbiterScenario(Seed = 42, Category = "Flux.EntityAllocator")]
[ArbiterScenario(Seed = 42, Category = "Chronicle.TimeTravel")]
[ArbiterScenario(Seed = 42, Category = "Nexus.Sync")]
```

Run specific categories:

```bash
orix arbiter run --category Flux.World
orix arbiter run --category Chronicle
```

What Arbiter Provides
The [ArbiterScenario] Attribute
```csharp
public sealed class ArbiterScenarioAttribute : Attribute
{
    public ulong Seed { get; set; } = 1;
    public uint MaxTicks { get; set; } = 10000;
    public int TimeoutMs { get; set; } = 30000;
    public string? ExpectedHash { get; set; }
    public string? Category { get; set; }
    public string? Description { get; set; }
    public bool AllowAmbient { get; set; } = false;
    public bool Skip { get; set; } = false;
    public string? SkipReason { get; set; }
}
```

Parameters:
- Seed - Deterministic seed (default: 1)
- MaxTicks - Timeout for tick-based tests (default: 10,000)
- TimeoutMs - Wall-clock timeout for CI (default: 30,000 ms)
- ExpectedHash - Verify exact final state (optional)
- Category - Group tests (e.g., “Flux.World”)
- AllowAmbient - Allow non-deterministic operations (testing external APIs)
- Skip - Skip this test
- SkipReason - Explanation for skip
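To illustrate how a runner could read these values, here is a small reflection sketch. It assumes nothing about Arbiter's internals: `DemoScenarioAttribute`, `ScenarioDiscovery`, and `SampleTests` are hypothetical stand-ins that mirror only the Seed and Category properties, so the example compiles without Orix.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical stand-in that mirrors two of the properties listed above.
[AttributeUsage(AttributeTargets.Method)]
public sealed class DemoScenarioAttribute : Attribute
{
    public ulong Seed { get; set; } = 1;
    public string? Category { get; set; }
}

public static class ScenarioDiscovery
{
    // Returns (name, seed, category) for every public instance method carrying the attribute.
    public static List<(string Name, ulong Seed, string? Category)> Find(Type testClass)
    {
        var found = new List<(string Name, ulong Seed, string? Category)>();
        foreach (MethodInfo method in testClass.GetMethods(BindingFlags.Public | BindingFlags.Instance))
        {
            var attr = method.GetCustomAttribute<DemoScenarioAttribute>();
            if (attr != null)
            {
                found.Add((method.Name, attr.Seed, attr.Category));
            }
        }
        return found;
    }
}

// Hypothetical test class used only to exercise the discovery code.
public class SampleTests
{
    [DemoScenario(Seed = 42, Category = "Math.DFixed64")]
    public void Addition_IsCommutative() { }

    [DemoScenario(Category = "Math.Vector")]
    public void UsesDefaultSeed() { }

    public void NotAScenario() { }
}
```

`ScenarioDiscovery.Find(typeof(SampleTests))` reports the two tagged methods and skips the untagged one. The second method falls back to the default seed of 1, which is the case the best practices below advise against.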
The Assert Class
Arbiter assertions are Orix-aware:
```csharp
// Basic assertions
Assert.True(condition);
Assert.False(condition);
Assert.Equal(expected, actual);
Assert.NotEqual(expected, actual);
Assert.Null(value);
Assert.NotNull(value);

// Orix-aware assertions
Assert.ApproxEqual(expected, actual, tolerance);       // DFixed64
Assert.ApproxEqual(expectedVec, actualVec, tolerance); // DVector2/3
Assert.InRange(value, min, max);                       // DFixed64 range
Assert.Positive(value);                                // DFixed64 > 0
Assert.Negative(value);                                // DFixed64 < 0
Assert.Zero(value);                                    // DFixed64 == 0

// Tick assertions
Assert.AtOrAfter(tick, minimum);
Assert.Before(tick, maximum);

// Collection assertions
Assert.Empty(collection);
Assert.NotEmpty(collection);
Assert.Count(expected, collection);
Assert.Contains(item, collection);

// Exception assertions
var ex = Assert.Throws<InvalidOperationException>(() => DoSomething());
Assert.DoesNotThrow(() => DoSomething());

// Determinism verification
Assert.Deterministic(() => ComputeSomething(), runs: 10);
```

The ScenarioContext
For tick-based simulations, use ScenarioContext:
```csharp
[ArbiterScenario(Seed = 42, MaxTicks = 1000, Category = "Simulation")]
public void PhysicsSimulation_TickBased(ScenarioContext ctx)
{
    var position = DVector3.Zero;
    var velocity = new DVector3(DFixed64.One, DFixed64.Zero, DFixed64.Zero);
    var gravity = new DVector3(DFixed64.Zero, -DFixed64.FromDouble(9.8), DFixed64.Zero);

    // Simulate 100 ticks
    for (int i = 0; i < 100; i++)
    {
        velocity += gravity * DFixed64.FromDouble(0.016);
        position += velocity * DFixed64.FromDouble(0.016);

        ctx.HashState(position); // Track state for determinism
        ctx.Advance();           // Next tick
    }

    Assert.True(position.Y < DFixed64.Zero, "Object should have fallen");
}
```

ScenarioContext API:
```csharp
// Properties
ctx.Seed          // ulong - The seed for this run
ctx.CurrentTick   // Tick - Current simulation tick
ctx.MaxTicks      // uint - Maximum allowed ticks
ctx.IsTimeout     // bool - Exceeded max ticks?
ctx.StateHash     // ulong - Accumulated state hash
ctx.Random        // OrixRandom - Deterministic RNG

// Methods
ctx.Advance()              // Advance one tick
ctx.Advance(count)         // Advance multiple ticks
ctx.HashState(value)       // Track state (ulong, DFixed64, DVector2/3)
ctx.RandomValue()          // DFixed64 in [0, 1)
ctx.RandomRange(min, max)  // DFixed64 in [min, max)
ctx.RandomInt(min, max)    // int in [min, max)
ctx.RandomBool()           // Random bool
ctx.Reset()                // Reset to initial state
```

Basic Scenario Patterns
1. Simple Assertion Test
```csharp
[ArbiterScenario(Seed = 42, Category = "Math.DFixed64")]
public void DFixed64_Addition_IsCommutative()
{
    var a = DFixed64.FromInt(10);
    var b = DFixed64.FromInt(20);

    Assert.Equal(a + b, b + a);
}
```

When to use: Pure functions, mathematical properties, invariants.
2. Determinism Verification Test
```csharp
[ArbiterScenario(Seed = 42, Category = "Flux.Determinism")]
public void Simulation_SameSeed_SameResult()
{
    // Run 1
    uint finalRandom1;
    long finalTick1;
    using (var world1 = new World(12345))
    {
        world1.Start();
        for (int i = 0; i < 10; i++)
        {
            world1.CreateEntity();
        }
        world1.ProcessTicks(100);
        finalRandom1 = world1.Random.NextUInt();
        finalTick1 = world1.CurrentTick.Value;
    }

    // Run 2 with same seed
    uint finalRandom2;
    long finalTick2;
    using (var world2 = new World(12345))
    {
        world2.Start();
        for (int i = 0; i < 10; i++)
        {
            world2.CreateEntity();
        }
        world2.ProcessTicks(100);
        finalRandom2 = world2.Random.NextUInt();
        finalTick2 = world2.CurrentTick.Value;
    }

    Assert.Equal(finalRandom1, finalRandom2, "Same seed = same random state");
    Assert.Equal(finalTick1, finalTick2, "Same operations = same tick count");
}
```

When to use: Verify entire subsystems are deterministic.
3. Tick-Based Simulation Test
```csharp
[ArbiterScenario(Seed = 42, MaxTicks = 1000, Category = "Simulation")]
public void PhysicsSimulation_TickBased(ScenarioContext ctx)
{
    var position = DVector3.Zero;
    var velocity = new DVector3(DFixed64.One, DFixed64.Zero, DFixed64.Zero);
    var gravity = new DVector3(DFixed64.Zero, -DFixed64.FromDouble(9.8), DFixed64.Zero);

    for (int i = 0; i < 100; i++)
    {
        velocity += gravity * DFixed64.FromDouble(0.016);
        position += velocity * DFixed64.FromDouble(0.016);
        ctx.HashState(position);
        ctx.Advance();
    }

    Assert.True(position.Y < DFixed64.Zero, "Object should have fallen");
}
```

When to use: Time-stepped simulations, physics, gameplay logic.
4. Property-Based Test
```csharp
[ArbiterScenario(Seed = 42, Category = "Math.Random")]
public void OrixRandom_Range_WithinBounds(ScenarioContext ctx)
{
    var min = DFixed64.FromInt(-10);
    var max = DFixed64.FromInt(10);

    for (int i = 0; i < 1000; i++)
    {
        var value = ctx.RandomRange(min, max);
        Assert.InRange(value, min, max);
    }
}
```

When to use: Verify properties hold across many random inputs.
5. Hash Verification Test
```csharp
[ArbiterScenario(Seed = 42, ExpectedHash = "A1B2C3D4E5F6", Category = "Regression")]
public void KnownBehavior_MatchesSnapshot(ScenarioContext ctx)
{
    // Run some complex simulation
    var result = RunComplexSimulation(ctx);

    // Hash final state
    ctx.HashState(result);

    // Arbiter automatically verifies ctx.StateHash == ExpectedHash
}
```

When to use: Regression tests - verify exact behavior doesn’t change.
Real-World Examples from Orix
Example 1: Flux World Lifecycle
```csharp
// From tests/Flux.Tests/TickScenarios.cs

[ArbiterScenario(Seed = 42, Category = "Flux.World")]
public void World_StartStop_TransitionsCorrectly()
{
    using var world = new World(42);

    Assert.True(world.State == WorldState.Stopped, "Initial: Stopped");

    world.Start();
    Assert.True(world.State == WorldState.Running, "After Start: Running");

    world.Pause();
    Assert.True(world.State == WorldState.Paused, "After Pause: Paused");

    world.Resume();
    Assert.True(world.State == WorldState.Running, "After Resume: Running");

    world.Stop();
    Assert.True(world.State == WorldState.Stopped, "After Stop: Stopped");
}
```

Pattern: State machine testing - verify all transitions work correctly.
Example 2: Entity Allocation Determinism
```csharp
// From tests/Flux.Tests/TickScenarios.cs

[ArbiterScenario(Seed = 42, Category = "Flux.EntityAllocator")]
public void EntityAllocator_Allocate_ProducesUniqueEntities()
{
    var allocator = new EntityAllocator(1000);
    var entities = new HashSet<uint>();

    for (int i = 0; i < 100; i++)
    {
        var entity = allocator.Allocate();
        Assert.True(!entities.Contains(entity.PackedValue), $"Entity {i} should be unique");
        entities.Add(entity.PackedValue);
    }

    Assert.Equal(100, allocator.AllocatedCount);
}
```

Pattern: Uniqueness verification across many allocations.
Example 3: Math Properties
```csharp
// From tests/Atom.Tests/MathScenarios.cs

[ArbiterScenario(Seed = 42, Category = "Math.DFixed64")]
public void DFixed64_Trigonometry_PythagoreanIdentity()
{
    // sin²(x) + cos²(x) = 1 for all x
    var angle = DFixed64.Pi / DFixed64.FromInt(4); // 45 degrees
    var (sin, cos) = DFixed64.SinCos(angle);

    var result = sin * sin + cos * cos;
    Assert.ApproxEqual(DFixed64.One, result, DFixed64.FromDouble(0.0002));
}
```

Pattern: Mathematical identity verification with tolerance for fixed-point approximation.
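The tolerance is needed because fixed-point values are quantized, so an identity that holds exactly for real numbers can be off by a few units in the last place. The sketch below shows the effect with a hypothetical Q16.16 format stored in a `long`; DFixed64's real representation is not described here and may differ, and `FixedPointSketch` and its members are illustrative names only.

```csharp
using System;

// Hypothetical Q16.16 fixed-point helpers; DFixed64's real layout may differ.
public static class FixedPointSketch
{
    public const int FractionBits = 16;
    public const long One = 1L << FractionBits;

    // Converts by truncation, discarding everything below 2^-16.
    public static long FromDouble(double value) => (long)(value * One);

    // Multiplies two raw values and rescales, truncating again.
    public static long Multiply(long a, long b) => (a * b) >> FractionBits;

    // True when the two raw values differ by at most the tolerance.
    public static bool ApproxEqual(long expected, long actual, long tolerance)
        => Math.Abs(expected - actual) <= tolerance;

    // Evaluates sin^2(x) + cos^2(x) in fixed point.
    public static long PythagoreanSum(double angle)
    {
        long sin = FromDouble(Math.Sin(angle));
        long cos = FromDouble(Math.Cos(angle));
        return Multiply(sin, sin) + Multiply(cos, cos);
    }
}
```

For an angle of `Math.PI / 4`, the truncated sum lands slightly below `One`, so an exact equality check fails while `ApproxEqual` with a tolerance of `FromDouble(0.0002)` passes.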
Example 4: Deterministic RNG
```csharp
// From tests/Atom.Tests/MathScenarios.cs

[ArbiterScenario(Seed = 42, Category = "Math.Random")]
public void OrixRandom_Deterministic_SameSeedSameSequence(ScenarioContext ctx)
{
    // Generate sequence
    var values = new DFixed64[100];
    for (int i = 0; i < 100; i++)
    {
        values[i] = ctx.RandomValue();
        ctx.HashState(values[i]);
    }

    // Reset and verify same sequence
    ctx.Reset();
    for (int i = 0; i < 100; i++)
    {
        Assert.Equal(values[i], ctx.RandomValue(), $"Mismatch at index {i}");
    }
}
```

Pattern: Record-and-replay to verify determinism.
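Record-and-replay works only if resetting returns the generator to its exact initial state. A minimal sketch of that idea follows, assuming a SplitMix64-style generator; `ResettableRandom` is a hypothetical class, and OrixRandom's actual algorithm is not specified in this document.

```csharp
using System;

// Hypothetical resettable generator; stands in for the ctx.Random / ctx.Reset() pair.
public sealed class ResettableRandom
{
    private readonly ulong _seed;
    private ulong _state;

    public ResettableRandom(ulong seed)
    {
        _seed = seed;
        _state = seed;
    }

    // One SplitMix64 step: advance the state, then scramble it into an output.
    public ulong NextUInt64()
    {
        unchecked
        {
            _state += 0x9E3779B97F4A7C15UL;
            ulong z = _state;
            z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9UL;
            z = (z ^ (z >> 27)) * 0x94D049BB133111EBUL;
            return z ^ (z >> 31);
        }
    }

    // Uniform double in [0, 1): keep the top 53 bits and scale by 2^-53.
    public double NextDouble() => (NextUInt64() >> 11) * (1.0 / 9007199254740992.0);

    // Restores the initial state so the same sequence is produced again.
    public void Reset() => _state = _seed;
}
```

Because the whole generator state is a single value derived from the seed, `Reset()` followed by the same calls reproduces the recorded sequence exactly.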
Example 5: Echo Time Travel
```csharp
// From tests/Echo.Tests/ReplayScenarios.cs

[ArbiterScenario(Seed = 42, Category = "Echo.TimeTravel")]
public void TimeTravel_JumpToTick_RestoresState()
{
    var recording = CreateTestRecording();
    var player = new ReplayPlayer(recording);

    // Play forward to tick 50
    player.PlayTo(50);
    var state50 = player.GetCurrentStateHash();

    // Continue to tick 100
    player.PlayTo(100);
    var state100 = player.GetCurrentStateHash();

    // Jump back to tick 50
    player.JumpTo(50);
    var state50Again = player.GetCurrentStateHash();

    Assert.Equal(state50, state50Again, "Jumping back should restore exact state");
    Assert.NotEqual(state50, state100, "Different ticks should have different states");
}
```

Pattern: Time-travel verification - ensure exact state restoration.
Test Categories in Orix
Real categories from the codebase:
Atom (Foundation)
- Math.DFixed64 - Fixed-point arithmetic
- Math.Vector - Vector operations
- Math.Random - Deterministic RNG
- Determinism - Core determinism verification
- Simulation - Tick-based simulation tests
Flux (ECS Runtime)
- Flux.Tick - Tick advancement
- Flux.Lifecycle - Entity create/destroy
- Flux.World - World state management
- Flux.EntityAllocator - Entity ID allocation
- Flux.Determinism - Full simulation determinism
Lattice (Storage)
- Lattice.CRUD - Create/Read/Update/Delete
- Lattice.Query - Query execution
- Chronicle.StateHasher - State hashing
- Chronicle.MerkleTree - Merkle proof verification
- Chronicle.Snapshot - Snapshot creation/restoration
- Chronicle.TimeTravel - Time-travel API
- CRDT.Register - LWW-Register tests
- CRDT.Counter - G-Counter, PN-Counter
- CRDT.Set - OR-Set tests
- CRDT.Map - OR-Map tests
Nexus (Networking)
- Nexus.Sync - State synchronization
- Nexus.Delta - Delta compression
- Nexus.Authority - Authority resolution
- Nexus.Determinism - Network determinism
Echo (Replay)
- Echo.Recording - Recording creation/state
- Echo.Playback - Replay playback
- Echo.TimeTravel - Jump to tick
- Echo.Determinism - Replay determinism
Lumen (Observability)
- Lumen.Logging - Structured logging
- Lumen.Metrics - Metric collection
- Lumen.Tracing - Distributed tracing
- Lumen.Determinism - Logging determinism
Crypto
- Crypto.Envelope - Envelope encryption
- Crypto.TimeLock - Time-locked encryption
- Crypto.StructuredEncryption - Searchable encryption
- PQC.MlKem - Post-quantum KEM
- PQC.MlDsa - Post-quantum signatures
Running Tests
Run All Tests
```bash
orix arbiter run
```

Run by Category

```bash
orix arbiter run --category Flux
orix arbiter run --category Chronicle.TimeTravel
orix arbiter run --category Math
```

Run Specific Test File

```bash
cd tests/Flux.Tests
dotnet test
```

Verify Determinism

```bash
orix arbiter determinism
```

This runs each test twice with the same seed and verifies identical results.

Verbose Output

```bash
orix arbiter run --verbose
```

With Custom Seed Override

```bash
orix arbiter run --seed 99999
```

Runs all tests with seed 99999 instead of their declared seeds.
Advantages
1. Perfect Reproducibility
```csharp
[ArbiterScenario(Seed = 42, Category = "Bugs.DesyncIssue123")]
public void Reproduce_DesyncBug_FromIssue123()
{
    // This test will ALWAYS reproduce the exact bug
    // because the seed is fixed
}
```

Ship this test with a bug report. Anyone can reproduce it.
2. Determinism Verification Built-In
```bash
orix arbiter determinism
```

Automatic verification that all tests are actually deterministic.
3. Clear Failure Reproduction
```text
FAIL: Simulation_SameSeed_SameResult
  Seed: 42
  Tick: 150
  Hash: Expected A1B2C3D4, got A1B2C3D5

  To reproduce: orix arbiter run --seed 42 --test Simulation_SameSeed_SameResult
```

4. Category-Based Organization
```bash
# Test just Chronicle
orix arbiter run --category Chronicle

# Test all CRDT types
orix arbiter run --category CRDT

# Test determinism across all products
orix arbiter run --category Determinism
```

5. State Hash Tracking
```csharp
ctx.HashState(position);
ctx.HashState(velocity);
ctx.HashState(entityCount);

// At end: ctx.StateHash contains all accumulated state
```

Arbiter tracks state hashes automatically. Use ExpectedHash for regression tests.
Disadvantages
1. Requires Seed Management
Every test needs an explicit seed; you can’t just write new Random().
Mitigation: ScenarioContext provides seeded RNG automatically.
2. Not Suitable for Non-Deterministic Tests
Tests that touch external APIs, network calls, or file I/O require AllowAmbient:
```csharp
[ArbiterScenario(Seed = 42, AllowAmbient = true, Category = "Integration")]
public void ExternalAPI_ResponseParsing()
{
    // This test is allowed to be non-deterministic
}
```

3. Learning Curve
Property-based testing and seed management are unfamiliar to many developers.
Mitigation: Clear examples, documentation, and patterns.
4. Slower Than Unit Tests
Determinism verification requires running tests multiple times.
Mitigation: Run determinism checks in CI, not locally every time.
Arbiter vs. Other Frameworks
vs. xUnit/NUnit
| Feature | xUnit/NUnit | Arbiter |
|---|---|---|
| Seed management | Manual | Built-in attribute |
| Determinism verification | Manual | Automatic |
| Tick-based simulation | Manual | ScenarioContext |
| State hashing | Manual | Built-in |
| Category filtering | Traits/Categories | Category attribute |
| Fixed-point aware | No | Yes (DFixed64, DVector) |
| Time abstraction | No | Tick-based |
Use xUnit/NUnit when: Testing infrastructure, I/O, external integrations.
Use Arbiter when: Testing deterministic simulation, game logic, math.
vs. QuickCheck/Hypothesis
| Feature | QuickCheck | Arbiter |
|---|---|---|
| Property testing | Yes | Yes (via ScenarioContext) |
| Shrinking | Yes | No (explicit seeds) |
| Type generators | Automatic | Manual (via OrixRandom) |
| Determinism focus | No | Yes |
| Orix primitives | No | Yes |
Use QuickCheck when: Need automatic input generation and shrinking.
Use Arbiter when: Need deterministic, reproducible tests with explicit seeds.
Best Practices
1. Always Use Explicit Seeds
```csharp
// Good
[ArbiterScenario(Seed = 42, Category = "Math")]
public void Test_Something() { }

// Bad - uses default seed (1)
[ArbiterScenario(Category = "Math")]
public void Test_Something() { }
```

Why: Explicit seeds make tests reproducible and debuggable.
2. Use ScenarioContext for Simulations
```csharp
// Good
[ArbiterScenario(Seed = 42, MaxTicks = 1000, Category = "Simulation")]
public void Simulate_Physics(ScenarioContext ctx)
{
    for (int i = 0; i < 100; i++)
    {
        // ... simulation logic ...
        ctx.HashState(state);
        ctx.Advance();
    }
}

// Bad - manual tick tracking
public void Simulate_Physics()
{
    var tick = 0;
    for (int i = 0; i < 100; i++)
    {
        tick++;
        // ... simulation logic ...
    }
}
```

Why: ScenarioContext provides state tracking, timeout detection, and reset.
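To make the timeout and reset behavior concrete, here is a stripped-down tick counter. It is a sketch under stated assumptions: `TickContextSketch` is a hypothetical class, it uses a plain `uint` where the real context exposes a Tick type, and it treats "timeout" as having advanced past MaxTicks, following the "Exceeded max ticks?" description of IsTimeout.

```csharp
using System;

// Hypothetical, stripped-down context; the real ScenarioContext has more members.
public sealed class TickContextSketch
{
    public uint MaxTicks { get; }
    public uint CurrentTick { get; private set; }

    // True once the scenario has advanced past its tick budget.
    public bool IsTimeout => CurrentTick > MaxTicks;

    public TickContextSketch(uint maxTicks) => MaxTicks = maxTicks;

    // Advances the simulation clock by one or more ticks.
    public void Advance(uint count = 1) => CurrentTick += count;

    // Returns the clock to its initial state.
    public void Reset() => CurrentTick = 0;
}
```

With a budget of 10 ticks, ten calls to `Advance()` leave `IsTimeout` false and the eleventh sets it. A hand-rolled `tick++` counter provides neither the budget check nor the reset.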
3. Hash State at Key Points
```csharp
ctx.HashState(position);
ctx.HashState(velocity);
ctx.HashState(entityCount);
```

Why: State hashing enables determinism verification and regression detection.
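One way to accumulate state into a single running hash is sketched below. It uses 64-bit FNV-1a purely as an example; which hash function Arbiter uses is not stated here, and `StateHashSketch` is a hypothetical class that accepts only raw `ulong` values.

```csharp
using System;

// Hypothetical accumulator using 64-bit FNV-1a; Arbiter's actual hash is not specified here.
public sealed class StateHashSketch
{
    private const ulong OffsetBasis = 14695981039346656037UL;
    private const ulong Prime = 1099511628211UL;

    // The running hash over everything passed to HashState so far.
    public ulong Value { get; private set; } = OffsetBasis;

    // Folds the eight bytes of a value into the running hash, low byte first.
    public void HashState(ulong value)
    {
        unchecked
        {
            for (int i = 0; i < 8; i++)
            {
                Value ^= (value >> (8 * i)) & 0xFF;
                Value *= Prime;
            }
        }
    }
}
```

Two runs that hash the same values in the same order end with the same `Value`, and a run that diverges at any hashed point almost certainly ends with a different one. That is why hashing at several key points helps narrow down where a divergence first appears.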
4. Use Categories Consistently
```text
// Good - hierarchical categories
"Flux.World"
"Flux.EntityAllocator"
"Chronicle.TimeTravel"
"CRDT.Register"

// Bad - flat categories
"Test1"
"Test2"
```

Why: Hierarchical categories enable filtering by subsystem.
5. Write Determinism Tests for New Features
```csharp
[ArbiterScenario(Seed = 42, Category = "MyFeature.Determinism")]
public void MyFeature_SameSeed_SameResult()
{
    var result1 = RunMyFeature(42);
    var result2 = RunMyFeature(42);
    Assert.Equal(result1, result2);
}
```

Why: Catch non-determinism early, before it ships.
Common Patterns
Pattern: Two-Run Determinism Check
```csharp
[ArbiterScenario(Seed = 42, Category = "Determinism")]
public void Feature_IsDeterministic()
{
    var result1 = RunComplexOperation(seed: 12345);
    var result2 = RunComplexOperation(seed: 12345);
    Assert.Equal(result1, result2);
}
```

Pattern: Record-Replay
```csharp
[ArbiterScenario(Seed = 42, Category = "Replay")]
public void Record_AndReplay_ProducesSameResult(ScenarioContext ctx)
{
    var values = new List<DFixed64>();

    // Record
    for (int i = 0; i < 100; i++)
    {
        values.Add(ctx.RandomValue());
    }

    // Replay
    ctx.Reset();
    for (int i = 0; i < 100; i++)
    {
        Assert.Equal(values[i], ctx.RandomValue());
    }
}
```

Pattern: Property Verification
```csharp
[ArbiterScenario(Seed = 42, Category = "Properties")]
public void Property_HoldsForRandomInputs(ScenarioContext ctx)
{
    // Input bounds for the generated values
    var min = DFixed64.FromInt(-10);
    var max = DFixed64.FromInt(10);

    for (int i = 0; i < 1000; i++)
    {
        var input = ctx.RandomRange(min, max);
        var output = ProcessInput(input);

        // Verify property
        Assert.True(PropertyHolds(input, output));
    }
}
```

Pattern: State Machine Testing
```csharp
[ArbiterScenario(Seed = 42, Category = "StateMachine")]
public void StateMachine_AllTransitions_AreValid()
{
    var sm = new StateMachine();
    Assert.Equal(State.Initial, sm.Current);

    sm.Transition(Event.Start);
    Assert.Equal(State.Running, sm.Current);

    sm.Transition(Event.Pause);
    Assert.Equal(State.Paused, sm.Current);

    // ... test all transitions
}
```

Pattern: Regression with Hash
```csharp
[ArbiterScenario(
    Seed = 42,
    ExpectedHash = "A1B2C3D4E5F6",
    Category = "Regression")]
public void KnownBehavior_MatchesSnapshot(ScenarioContext ctx)
{
    var result = RunComplexSimulation(ctx);
    ctx.HashState(result);

    // Arbiter automatically verifies final hash
}
```

Summary
Arbiter is Orix’s answer to deterministic testing:
- Explicit seeds - Every test declares its seed
- Automatic replay - Determinism verification built-in
- Tick-aware - ScenarioContext for simulations
- State tracking - Hash accumulation for regression detection
- Category-based - Organize tests by feature, not folder
The golden rule: Same seed = same result, always.
Write tests that prove it.
Related Documents
- Atom Foundation - DFixed64 and deterministic primitives tested by Arbiter
- Flux Simulation - ECS runtime with determinism tests
- Echo Replay - Replay verification tests
- Chronicle Time-Travel - Time-travel tests
Next: Technical Deep-Dive - Architecture details for engineers