An archetype is the fundamental grouping mechanism in the ECS.
- Entities that share the same set of component types share an archetype.
- Each archetype maps directly to one table schema definition.
class Archetype:
    BASE_SCHEMA = pa.schema([
        pa.field("world_id", pa.string(), nullable=False),
        pa.field("run_id", pa.string(), nullable=False),
        pa.field("entity_id", pa.int32(), nullable=False),
        pa.field("tick", pa.int32(), nullable=False),
        pa.field("is_active", pa.bool_(), nullable=False),
    ])
    PARTITION_KEYS = ["world_id", "run_id", "tick"]

    def __init__(self, components: list["Component"]):
        self.components = components
        self.sig: ArchetypeSignature = self.sig_from_components(components)
        self.name = self.get_name(self.sig)
        self.schema = self.get_archetype_schema(self.sig)

    @staticmethod
    def sig_from_components(components: list["Component"]) -> ArchetypeSignature:
        component_types = [type(c) for c in components]
        return tuple(sorted(component_types, key=lambda t: t.__name__))

    @staticmethod
    def get_name(sig: ArchetypeSignature) -> str:
        combined_schema = Archetype.get_archetype_schema(sig)
        schema_hash = hashlib.sha256(str(combined_schema).encode()).hexdigest()[:16]
        return f"a_{len(sig)}c_s{schema_hash}"

    @staticmethod
    def get_archetype_schema(sig: ArchetypeSignature) -> pa.Schema:
        archetype_schema = Archetype.BASE_SCHEMA
        for component_type in sig:
            component_schema = component_type.get_prefixed_schema()
            archetype_schema = pa.unify_schemas([archetype_schema, component_schema])
        return archetype_schema

    @staticmethod
    def to_row_dict(
        entity_id: int, tick: int, components: list[Component], world_id: str, run_id: str
    ) -> dict[str, Any]:
        row_dict = {
            "world_id": str(world_id), "run_id": str(run_id),
            "entity_id": entity_id, "tick": tick, "is_active": True,
        }
        for c in components:
            row_dict.update({c.get_prefix() + k: v for k, v in c.model_dump().items()})
        return row_dict
Signatures
An ArchetypeSignature is a tuple of component types, sorted alphabetically by class name:
from archetype.core.archetype import Archetype
from archetype.core.interfaces import ArchetypeSignature
# ArchetypeSignature = tuple[type[Component], ...]
sig = Archetype.sig_from_components([Position(x=0, y=0), Velocity(vx=1, vy=0)])
# sig == (Position, Velocity) -- sorted by __name__
Sorting ensures signatures are deterministic regardless of the order components are passed in.
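The order-independence can be checked with a minimal, self-contained sketch. `Position` and `Velocity` here are bare stand-in classes (an assumption for illustration, not the library's `Component` base), and `sig_from_components` repeats the sorting logic from the class listing as a free function.

```python
# Stand-in component classes; the real ones derive from the library's Component base.
class Position:
    def __init__(self, x: float, y: float) -> None:
        self.x, self.y = x, y


class Velocity:
    def __init__(self, vx: float, vy: float) -> None:
        self.vx, self.vy = vx, vy


def sig_from_components(components: list) -> tuple[type, ...]:
    # Same idea as Archetype.sig_from_components: sort the types by class name.
    return tuple(sorted((type(c) for c in components), key=lambda t: t.__name__))


sig_a = sig_from_components([Position(0, 0), Velocity(1, 0)])
sig_b = sig_from_components([Velocity(1, 0), Position(0, 0)])

# Both orderings collapse to the same tuple, so either can serve as a lookup key.
assert sig_a == sig_b == (Position, Velocity)
```

Because the result is a plain tuple of types, it is hashable and can be used directly as a dictionary key.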
Naming
Each archetype gets a compact, filesystem-safe table name:
a_2c_s9f3a1b2c4d5e6f7
|  |  |
|  |  +-- SHA-256 hash of the PyArrow schema (first 16 hex characters)
|  +----- number of component types
+-------- "a" prefix (archetype)
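Names are stable -- the same set of component types always produces the same name, regardless of the order in which the components are passed. Because the hash is computed over the combined schema, changing a component's fields also changes the name. This stability lets multiple simulations and runs share the same catalog.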
name = Archetype.get_name(sig) # "a_2c_s9f3a1b2c4d5e6f7"
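The naming scheme can be sketched without PyArrow by hashing a plain string in place of `str(combined_schema)`. The helper `sketch_name` and the schema text below are hypothetical stand-ins; only the `hashlib` usage and the name format are taken from the class listing.

```python
import hashlib


def sketch_name(n_components: int, schema_text: str) -> str:
    # Mirrors get_name: hash the schema's string form, keep the first 16 hex characters.
    schema_hash = hashlib.sha256(schema_text.encode()).hexdigest()[:16]
    return f"a_{n_components}c_s{schema_hash}"


# Hypothetical stand-in for str(combined_schema); the real text comes from PyArrow.
schema_text = "entity_id: int32\nposition__x: double\nposition__y: double"

name = sketch_name(1, schema_text)
same = sketch_name(1, schema_text)
changed = sketch_name(1, schema_text + "\nposition__z: double")

assert name == same      # identical schema text gives an identical name
assert name != changed   # any change to the schema text gives a different name
```

The name contains only letters, digits, and underscores, which is what makes it safe to use as a table or directory name.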
Schema
An archetype schema combines a base set of housekeeping columns with prefixed component fields:
schema = Archetype.get_archetype_schema(sig)
Base columns (present in every archetype):
| Column | Type | Description |
|---|---|---|
| world_id | string | Which world this entity belongs to |
| run_id | string | Which run produced this row |
| entity_id | int32 | Unique entity identifier |
| tick | int32 | Simulation tick when this row was written |
| is_active | bool | Whether the entity is alive |
Component columns are prefixed with the lowercase class name followed by a double underscore. A Position(x=5, y=10) component adds the columns position__x and position__y.
The full schema for an archetype with (Health, Position) would be:
world_id | run_id | entity_id | tick | is_active | health__current | health__max_hp | position__x | position__y
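The column layout for (Health, Position) can be reproduced with a small sketch. It uses standard-library dataclasses in place of the library's components, and the helpers `prefixed_columns` and `archetype_columns` are hypothetical names; the prefix rule is inferred from the column names shown above.

```python
from dataclasses import dataclass, fields

BASE_COLUMNS = ["world_id", "run_id", "entity_id", "tick", "is_active"]


@dataclass
class Health:
    current: float
    max_hp: float


@dataclass
class Position:
    x: float
    y: float


def prefixed_columns(component_type: type) -> list[str]:
    # Assumed prefix rule: lowercase class name plus a double underscore.
    prefix = component_type.__name__.lower() + "__"
    return [prefix + f.name for f in fields(component_type)]


def archetype_columns(sig: tuple[type, ...]) -> list[str]:
    # Base columns first, then each component's fields in signature order.
    columns = list(BASE_COLUMNS)
    for component_type in sig:
        columns.extend(prefixed_columns(component_type))
    return columns


columns = archetype_columns((Health, Position))
assert columns[5:] == ["health__current", "health__max_hp", "position__x", "position__y"]
```

The prefix keeps field names from different components apart, so two components may both define a field such as `x` without colliding in the combined table.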
Partition Keys
Archetype tables are partitioned by ["world_id", "run_id", "tick"] so that reads can be filtered at the storage level. This lets the querier skip irrelevant partitions when reading a specific world at a specific tick.
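The effect of partitioning can be illustrated with an in-memory stand-in. This is only a sketch of the idea, assuming a dictionary keyed by the partition tuple; it does not reflect how the store actually lays out its files.

```python
from collections import defaultdict

PARTITION_KEYS = ["world_id", "run_id", "tick"]

rows = [
    {"world_id": "w1", "run_id": "r1", "tick": 0, "entity_id": 1},
    {"world_id": "w1", "run_id": "r1", "tick": 1, "entity_id": 1},
    {"world_id": "w2", "run_id": "r1", "tick": 0, "entity_id": 7},
]

# Group rows into partitions keyed by (world_id, run_id, tick).
partitions: dict[tuple, list[dict]] = defaultdict(list)
for row in rows:
    partitions[tuple(row[k] for k in PARTITION_KEYS)].append(row)

# A read for one world, run, and tick touches a single partition;
# rows in the other partitions are never scanned.
wanted = ("w1", "r1", 1)
hits = partitions.get(wanted, [])

assert hits == [rows[1]]
```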
Composing Signatures
Add or remove component types from an existing signature:
# Add a component type
new_sig = Archetype.add_components(sig, [Health])
# (Health, Position, Velocity)
# Remove a component type
new_sig = Archetype.remove_components(sig, [Velocity])
# (Position,)
Both return a new sorted tuple -- signatures are immutable.
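One way such helpers could work is sketched below as free functions. This is an assumption about the behavior, not the library's implementation; in particular, de-duplicating types through a set is a choice made for this sketch, and the component classes are empty stand-ins.

```python
# Empty stand-in component types for illustration.
class Health: ...
class Position: ...
class Velocity: ...


def _sorted_sig(types) -> tuple[type, ...]:
    # De-duplicate (sketch assumption), then sort by class name.
    return tuple(sorted(set(types), key=lambda t: t.__name__))


def add_components(sig: tuple[type, ...], new_types: list[type]) -> tuple[type, ...]:
    return _sorted_sig([*sig, *new_types])


def remove_components(sig: tuple[type, ...], removed_types: list[type]) -> tuple[type, ...]:
    removed = set(removed_types)
    return _sorted_sig(t for t in sig if t not in removed)


sig = (Position, Velocity)
added = add_components(sig, [Health])
removed = remove_components(sig, [Velocity])

assert added == (Health, Position, Velocity)
assert removed == (Position,)
assert sig == (Position, Velocity)  # the original tuple is left untouched
```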
Row Serialization
Convert an entity's components to a flat dictionary for storage:
row = Archetype.to_row_dict(
    entity_id=1,
    tick=0,
    components=[Position(x=5, y=10), Velocity(vx=1, vy=0)],
    world_id="abc-123",
    run_id="run-001",
)
# {
#     "world_id": "abc-123",
#     "run_id": "run-001",
#     "entity_id": 1,
#     "tick": 0,
#     "is_active": True,
#     "position__x": 5.0,
#     "position__y": 10.0,
#     "velocity__vx": 1.0,
#     "velocity__vy": 0.0,
# }
Entities
An entity is an integer ID (entity_id). It carries no logic; its state is the union of its component fields. The world tracks each entity's current archetype signature via an internal _entity2sig mapping.
When you add or remove components from an entity, it migrates to a different archetype: the old row is marked inactive, and a new row is spawned in the target archetype's table with the updated component set. See Worlds -- Entity Migration for the full algorithm.
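The migration step can be sketched with in-memory tables. Everything here is a hypothetical stand-in: `tables`, `entity2sig`, `spawn`, and `migrate` are illustrative names, and mutating the old row in place is a simplification of "marked inactive". The actual algorithm is described in Worlds.

```python
from collections import defaultdict


# Empty stand-in component types for illustration.
class Position: ...
class Velocity: ...


# One row list per signature, plus the entity-to-signature map.
tables: dict[tuple, list[dict]] = defaultdict(list)
entity2sig: dict[int, tuple] = {}


def spawn(entity_id: int, sig: tuple, tick: int) -> None:
    tables[sig].append({"entity_id": entity_id, "tick": tick, "is_active": True})
    entity2sig[entity_id] = sig


def migrate(entity_id: int, new_sig: tuple, tick: int) -> None:
    old_sig = entity2sig[entity_id]
    # Mark the entity's live row in the old table inactive...
    for row in tables[old_sig]:
        if row["entity_id"] == entity_id and row["is_active"]:
            row["is_active"] = False
    # ...then spawn a fresh row in the target archetype's table.
    spawn(entity_id, new_sig, tick)


spawn(1, (Position,), tick=0)
migrate(1, (Position, Velocity), tick=1)

assert entity2sig[1] == (Position, Velocity)
assert tables[(Position,)][0]["is_active"] is False
```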
Further Reading
- Components -- field types, Arrow serialization, the column prefixing contract
- System Execution -- how signatures drive the subset rule for processor matching
- Worlds -- tick lifecycle, spawn/despawn caches, entity migration
- Stores -- how archetype tables are persisted
Source Reference
The archetype system is defined in src/archetype/core/archetype.py.