Architecture

Archetype is a data-centric Entity-Component-System (ECS) simulation engine. World state is stored as columnar DataFrames, and every tick is an append-only write to storage. Because no tick is ever overwritten, time-travel, forking, and replay come with the storage model rather than as separate features.

Core Abstractions

classDiagram
    class Component {
        +to_row_dict()
        +get_prefix()
    }
    class AsyncWorld {
        +world_id
        +tick
        +resources
        +create_entity()
        +add_components()
        +remove_entity()
        +step()
        +run()
    }
    class AsyncProcessor {
        +components
        +priority
        +process()
    }
    class AsyncSystem {
        +add_processor()
        +remove_processor()
        +execute()
    }
    class Resources {
        +insert()
        +require()
        +get()
    }
    class AsyncStore {
        +get_archetype_df()
        +append()
        +shutdown()
    }
    class QueryManager {
        +get_archetype()
        +query_archetype()
    }
    class UpdateManager {
        +update()
    }
    class CommandBroker {
        +enqueue()
        +dequeue_due()
        +get_history()
    }
    class ServiceContainer {
        +world_service
        +command_service
        +simulation_service
        +query_service
        +broker
    }
    AsyncWorld --> AsyncSystem
    AsyncWorld --> Resources
    AsyncWorld --> QueryManager : reads
    AsyncWorld --> UpdateManager : writes
    AsyncSystem --> AsyncProcessor
    QueryManager --> AsyncStore
    UpdateManager --> AsyncStore
    ServiceContainer --> CommandBroker
    ServiceContainer --> AsyncWorld
    AsyncProcessor --> Component : requires

Layers

archetype.api / cli          External interface (REST + HTTP client)
       │
archetype.app                Services, RBAC, CommandBroker, WorldRegistry
       │
archetype.core               AsyncWorld, AsyncProcessor, Resources, Storage

The system runs as a single archetype serve process. The CLI is a thin HTTP client.

Core ECS Concepts

Components

Data-only value objects. A Component is a Pydantic model that defines the schema for one aspect of an entity.

from archetype.core.component import Component

class Position(Component):
    x: float = 0.0
    y: float = 0.0

class Health(Component):
    current: int = 100
    max_hp: int = 100

Components are stored as prefixed columns in Arrow tables: position__x, position__y, health__current, etc.
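
The naming scheme can be illustrated without the library. This is a minimal sketch, assuming the prefix is the lowercased class name joined to the field name by a double underscore (inferred from the column names listed above). It uses plain dataclasses in place of the Pydantic-based Component, and `to_row` is a hypothetical helper, not Archetype's own `to_row_dict()`.

```python
from dataclasses import dataclass, fields


@dataclass
class Position:
    x: float = 0.0
    y: float = 0.0


@dataclass
class Health:
    current: int = 100
    max_hp: int = 100


def to_row(*components) -> dict:
    """Flatten component instances into one row of prefixed columns."""
    row = {}
    for comp in components:
        prefix = type(comp).__name__.lower()  # assumed prefix rule
        for f in fields(comp):
            row[f"{prefix}__{f.name}"] = getattr(comp, f.name)
    return row


row = to_row(Position(x=1.5, y=2.0), Health())
print(row)
```

Each entity then becomes one flat row, which is what lets a whole group of entities sit in a single columnar table.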

Entities

An entity is just an integer ID (entity_id). It has no behavior — it's a bag of components. Entities with the same set of component types are grouped into archetypes.

Archetypes

An archetype is a group of entities sharing the same component types. Each archetype is a single DataFrame where:

- Rows are entities
- Columns are prefixed component fields + metadata (entity_id, tick, world_id, run_id, is_active)

This columnar layout means bulk operations across thousands of entities are a single DataFrame transform.
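
To make the grouping concrete, here is a small stdlib-only sketch. It assumes an archetype is identified by the set of component prefixes present on an entity. The `entities` data and the `signature` helper are hypothetical and stand in for whatever the engine tracks internally.

```python
from collections import defaultdict

# Hypothetical entities: entity_id -> flattened component row (prefixed columns).
entities = {
    1: {"position__x": 0.0, "position__y": 0.0, "velocity__vx": 1.0, "velocity__vy": 0.0},
    2: {"position__x": 5.0, "position__y": 5.0},
    3: {"position__x": 2.0, "position__y": 1.0, "velocity__vx": 0.0, "velocity__vy": -1.0},
}


def signature(row: dict) -> frozenset:
    """The set of component prefixes present in a row."""
    return frozenset(column.split("__")[0] for column in row)


# One table (here a list of rows) per distinct signature.
archetypes = defaultdict(list)
for entity_id, row in entities.items():
    archetypes[signature(row)].append({"entity_id": entity_id, **row})

for sig, rows in archetypes.items():
    print(sorted(sig), [r["entity_id"] for r in rows])
```

Entities 1 and 3 share a table because both carry position and velocity; entity 2 lands in a separate, narrower table.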

Processors

Processors are pure DataFrame transforms that run each tick. They define which components they need, and the system routes the right archetypes to them.

from daft import DataFrame, col
from archetype.core.aio.async_processor import AsyncProcessor

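# Position is the component defined above. Velocity is assumed to be a
# Component with float fields vx and vy, declared the same way.
class MovementProcessor(AsyncProcessor):
    components = (Position, Velocity)  # component types this processor needs
    priority = 10                      # lower values run earlier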

    async def process(self, df: DataFrame, **kwargs) -> DataFrame:
        return df.with_columns({
            "position__x": col("position__x") + col("velocity__vx"),
            "position__y": col("position__y") + col("velocity__vy"),
        })

Processors run in priority order (lower = earlier) and can access shared state via Resources.
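
The routing and ordering rule can be sketched in a few lines. This is an illustration only: it assumes a processor matches an archetype when its required components are a subset of the archetype's components, and it uses lists of dicts in place of DataFrames. `Proc` and `run_matching` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Proc:
    name: str
    components: frozenset  # component names the processor requires
    priority: int
    fn: Callable[[list], list]


def run_matching(procs, archetype_signature, rows):
    """Apply every processor whose requirements the archetype satisfies."""
    applied = []
    for proc in sorted(procs, key=lambda p: p.priority):  # lower runs first
        if proc.components <= archetype_signature:
            rows = proc.fn(rows)
            applied.append(proc.name)
    return rows, applied


def move(rows):
    return [{**r, "position__x": r["position__x"] + r["velocity__vx"]} for r in rows]


def clamp(rows):
    return [{**r, "position__x": min(r["position__x"], 10.0)} for r in rows]


def regen(rows):
    return [{**r, "health__current": r["health__current"] + 1} for r in rows]


procs = [
    Proc("clamp", frozenset({"position"}), 20, clamp),
    Proc("move", frozenset({"position", "velocity"}), 10, move),
    Proc("regen", frozenset({"health"}), 5, regen),
]

rows = [{"entity_id": 1, "position__x": 9.5, "velocity__vx": 2.0}]
out, applied = run_matching(procs, frozenset({"position", "velocity"}), rows)
print(applied, out)
```

`regen` is skipped because the archetype has no health component, and `move` runs before `clamp` because 10 sorts ahead of 20.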

Resources

A type-safe dependency injection container scoped to each world. Processors use it to access shared configuration, brokers, or any object.

world.resources.insert(SimConfig(gravity=9.8))

# In a processor:
config = resources.require(SimConfig)
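
For readers who want to see the pattern behind `insert()`, `require()`, and `get()`, here is a minimal type-keyed container. It is a sketch of the general technique, not Archetype's implementation. The choice of `KeyError` for a missing resource and `None` from `get()` are assumptions.

```python
from dataclasses import dataclass
from typing import Any, Optional, Type, TypeVar

T = TypeVar("T")


class Resources:
    """Type-keyed container: at most one instance per concrete type."""

    def __init__(self) -> None:
        self._items: dict = {}

    def insert(self, value: Any) -> None:
        self._items[type(value)] = value

    def get(self, kind: Type[T]) -> Optional[T]:
        # Soft lookup: returns None when nothing of this type was inserted.
        return self._items.get(kind)

    def require(self, kind: Type[T]) -> T:
        # Hard lookup: fails loudly so a misconfigured world is caught early.
        if kind not in self._items:
            raise KeyError(f"missing resource: {kind.__name__}")
        return self._items[kind]


@dataclass
class SimConfig:
    gravity: float


resources = Resources()
resources.insert(SimConfig(gravity=9.8))
config = resources.require(SimConfig)
print(config.gravity)
```

Keying on the type means callers never pass string names around, which is where the type safety comes from.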

Tick Lifecycle

Each tick executes these phases:

1. pre_tick hooks fire
2. For each archetype (in parallel):
   a. Query previous state (DataFrame)
   b. Materialize deferred mutations (spawns/despawns)
   c. Execute matching processors in priority order
   d. Persist updated DataFrame to storage
3. Update in-memory live snapshots
4. Increment tick counter
5. post_tick hooks fire

Mutations (spawn, despawn, add/remove components) are deferred — they queue during a tick and apply at the start of the next tick. This ensures consistency within a single tick.
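
The deferral rule is easiest to see in a toy model. The sketch below assumes a single in-memory dict of entities and leaves out storage, archetypes, and hooks. `TinyWorld` is a hypothetical class that borrows method names from the diagram above only for readability.

```python
class TinyWorld:
    """Toy world: spawns and despawns queue up and apply at the next step."""

    def __init__(self) -> None:
        self.tick = 0
        self.entities = {}
        self._next_id = 1
        self._pending_spawns = []
        self._pending_despawns = []

    def create_entity(self, row: dict) -> int:
        entity_id = self._next_id
        self._next_id += 1
        self._pending_spawns.append((entity_id, row))  # not visible yet
        return entity_id

    def remove_entity(self, entity_id: int) -> None:
        self._pending_despawns.append(entity_id)

    def step(self) -> None:
        # Materialize mutations queued since the previous step.
        for entity_id, row in self._pending_spawns:
            self.entities[entity_id] = row
        for entity_id in self._pending_despawns:
            self.entities.pop(entity_id, None)
        self._pending_spawns.clear()
        self._pending_despawns.clear()
        # Processors would run here, against a stable set of entities.
        self.tick += 1


world = TinyWorld()
eid = world.create_entity({"position__x": 0.0})
visible_before = eid in world.entities
world.step()
visible_after = eid in world.entities
print(visible_before, visible_after, world.tick)
```

Because the entity set only changes at the step boundary, every processor in a tick sees the same rows.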

Service Layer

The service layer mediates all access to worlds.

ServiceContainer

Wires everything together:

from archetype.app.container import ServiceContainer

container = ServiceContainer()
# container.world_service     — world lifecycle
# container.command_service   — command submission
# container.simulation_service — tick stepping
# container.query_service     — read path
# container.broker            — command queue
# container.storage_service   — storage backends

Command Flow

All mutations from external actors flow through the command pipeline:

1. CommandService.submit(): accepts a Command with type, payload, tick, priority
2. CommandBroker.enqueue(): validates RBAC via ActorCtx, enforces quotas, queues by priority
3. SimulationService.step(): drains due commands, applies them to the world, steps processors
4. QueryService: reads world state (current or historical)
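
The enqueue-then-drain part of this pipeline can be sketched with a heap. The example assumes that commands become due once the world reaches their target tick and that a lower priority number is served first, mirroring the processor convention; neither detail is confirmed above. `TinyBroker` and `QueuedCommand` are hypothetical, and the field is called `kind` only to avoid shadowing Python's built-in `type`.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedCommand:
    tick: int
    priority: int
    seq: int  # tie-breaker that preserves submission order
    kind: str = field(compare=False)
    payload: dict = field(compare=False, default_factory=dict)


class TinyBroker:
    def __init__(self) -> None:
        self._heap = []
        self._seq = itertools.count()

    def enqueue(self, kind: str, payload: dict, tick: int, priority: int = 0) -> None:
        item = QueuedCommand(tick, priority, next(self._seq), kind, payload)
        heapq.heappush(self._heap, item)

    def dequeue_due(self, current_tick: int) -> list:
        """Pop every command whose target tick has been reached."""
        due = []
        while self._heap and self._heap[0].tick <= current_tick:
            due.append(heapq.heappop(self._heap))
        return due


broker = TinyBroker()
broker.enqueue("spawn", {"n": 1}, tick=2, priority=5)
broker.enqueue("update", {}, tick=1, priority=9)
broker.enqueue("despawn", {}, tick=2, priority=1)
broker.enqueue("message", {}, tick=7)

due = [c.kind for c in broker.dequeue_due(current_tick=2)]
print(due)
```

The command scheduled for tick 7 stays queued until the world gets there, which is what lets external actors schedule work ahead of time.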

RBAC

Every command submission requires an ActorCtx specifying the actor's roles. Roles are flat (not hierarchical), and an actor can hold several at once:

Role	Permissions
`viewer`	Read-only (query, get state, get world)
`player`	spawn, despawn, update, message, custom
`coder`	add/remove components, update
`operator`	trajectory ingestion and labeling
`maintainer`	spawn, despawn, components, processors, update
`admin`	All commands (wildcard)

Quotas: 500 commands per tick, 200k token budget per day.
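
A flat role model reduces to a union of permission sets. The sketch below covers only four of the roles in the table, and the command-type strings (`"add_component"` and so on) are illustrative guesses at how the table's entries might be spelled. `Actor` is a hypothetical stand-in for ActorCtx.

```python
from dataclasses import dataclass

# Subset of the role table above; command names are assumed spellings.
ROLE_PERMISSIONS = {
    "viewer": set(),  # read-only: no commands
    "player": {"spawn", "despawn", "update", "message", "custom"},
    "coder": {"add_component", "remove_component", "update"},
    "admin": {"*"},  # wildcard
}

MAX_COMMANDS_PER_TICK = 500  # quota stated above


@dataclass(frozen=True)
class Actor:
    actor_id: str
    roles: frozenset


def allowed(actor: Actor, command: str) -> bool:
    """Flat roles: permissions are the union over the actor's roles."""
    granted = set()
    for role in actor.roles:
        granted |= ROLE_PERMISSIONS.get(role, set())
    return "*" in granted or command in granted


def within_quota(commands_this_tick: int) -> bool:
    """True if one more command still fits in this tick's budget."""
    return commands_this_tick < MAX_COMMANDS_PER_TICK


alice = Actor("alice", frozenset({"viewer", "player"}))
root = Actor("root", frozenset({"admin"}))
print(allowed(alice, "spawn"), allowed(alice, "add_component"), allowed(root, "anything"))
```

With no hierarchy, granting more access means adding a role rather than promoting one.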

Storage

World state is persisted as Arrow tables to LanceDB (default) or Iceberg. Each tick is an append — nothing is overwritten. This gives you:

- Time-travel: Query any tick's state
- Replay: Re-run from any checkpoint
- Forking: Branch a world to explore alternatives
- Audit: Full command history
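
Why an append-only log makes time-travel cheap can be shown with a plain list. This sketch assumes each tick writes a full snapshot of the archetype's rows stamped with the tick number. The `log`, `append_tick`, and `state_at` names are hypothetical and no real storage backend is involved.

```python
log = []  # append-only: rows are added, never updated in place


def append_tick(tick: int, rows: list) -> None:
    """Persist one tick's rows, stamped with the tick number."""
    log.extend({**row, "tick": tick} for row in rows)


def state_at(tick: int) -> list:
    """Time-travel: a historical read is just a filter on the tick column."""
    return [row for row in log if row["tick"] == tick]


append_tick(0, [{"entity_id": 1, "position__x": 0.0}])
append_tick(1, [{"entity_id": 1, "position__x": 1.0}])
append_tick(2, [{"entity_id": 1, "position__x": 2.0}])

print(state_at(1))
```

Replay and forking follow from the same property: any earlier tick is still present and can serve as a starting point.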

Storage is configured via StorageConfig:

from archetype.core.config import StorageConfig, StorageBackend

config = StorageConfig(
    uri="./my_data",
    namespace="experiment_1",
    backend=StorageBackend.LANCEDB,  # default
)

World Forking

Create a new world from a snapshot of an existing one:

from archetype.core.config import StorageConfig

new_world = await container.world_service.fork_world(
    source_world_id=original.world_id,
    name="branch-A",
    storage_config=StorageConfig(),
)

The fork gets a system-generated world_id and an identical entity/component snapshot at the source's current tick. Source and fork then diverge independently.

What's cloned: tick, entity-to-signature mapping, entity counter, live archetype snapshots (re-stamped with the new world_id), processors, and non-broker resources.

What's not cloned: pending spawn/despawn caches (step first to materialize), lifecycle hooks, and the CommandBroker (re-injected by the service).

Use this for MCTS, counterfactual reasoning, or A/B testing simulation strategies.
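
The clone-and-restamp idea can be illustrated with plain dicts. This is a toy model under stated assumptions: a world is a dict holding a tick, some rows, and a pending-spawn list, and `fork_world` here is a hypothetical free function, not the `world_service.fork_world` call shown above.

```python
import copy
import uuid


def fork_world(source: dict, name: str) -> dict:
    """Copy a toy world and re-stamp every row with a fresh world_id."""
    fork = copy.deepcopy(source)
    fork["world_id"] = str(uuid.uuid4())  # system-generated id
    fork["name"] = name
    for row in fork["rows"]:
        row["world_id"] = fork["world_id"]
    fork["pending_spawns"] = []  # pending caches are not cloned
    return fork


original = {
    "world_id": "world-0",
    "name": "main",
    "tick": 12,
    "rows": [{"entity_id": 1, "world_id": "world-0", "position__x": 3.0}],
    "pending_spawns": [{"position__x": 0.0}],
}

branch = fork_world(original, "branch-A")
branch["rows"][0]["position__x"] = 99.0  # diverge the fork only
branch["tick"] += 1

print(original["rows"][0]["position__x"], branch["rows"][0]["position__x"])
```

The deep copy is what keeps the two worlds independent: changing the branch leaves the source's rows and tick untouched.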