euca-dataset is the data face of the engine: it extracts the world’s exact state as structured data, hashes it for reproducibility, and (via the render path) emits aligned ground-truth modalities. Because the truth is read from the authoritative CPU state, generated appearance never changes it — which is what lets Euca double as a world-model answer key.

Structured state

  • WorldStateGraph — the observable world as { tick, entity_count, entities }, serialized to canonical, deterministic JSON. This is the structured form behind observe; the HTTP /observe route returns the flattened RichEntityData projection.
  • state_digest (FNV-1a) — a 64-bit hash of the observable state over a canonical ordering (entities by id, components by name, fields in declaration order). Equal observable state always yields an equal digest, and a differing digest proves the state differs; equal digests imply equal state only up to 64-bit hash collisions. This is the reproducibility oracle behind Determinism. Two AI-generated visual skins of the same world produce a byte-identical digest.
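
The canonical ordering and digest described above can be illustrated with a small sketch. This is not the euca-dataset implementation: the dict layout, the helper names canonical_json and state_digest as Python functions, and the choice to hash the UTF-8 bytes of the JSON text are all assumptions made for illustration. Only the FNV-1a 64-bit constants and the ordering rules (entities by id, components by name, fields in declaration order) come from the hash definition and the text above.

```python
import json

# Standard FNV-1a 64-bit parameters.
FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64
    return h


def canonical_json(tick: int, entities: list) -> str:
    """Serialize { tick, entity_count, entities } in a canonical order.

    Hypothetical layout: each entity is {"id": int, "components": {name: fields}}.
    Entities are sorted by id and components by name. Field order inside a
    component is left as given, standing in for "declaration order".
    """
    ordered = []
    for entity in sorted(entities, key=lambda e: e["id"]):
        components = {
            name: entity["components"][name]
            for name in sorted(entity["components"])
        }
        ordered.append({"id": entity["id"], "components": components})
    graph = {"tick": tick, "entity_count": len(ordered), "entities": ordered}
    # Fixed separators keep the byte output independent of formatting defaults.
    return json.dumps(graph, separators=(",", ":"), ensure_ascii=False)


def state_digest(tick: int, entities: list) -> int:
    """Assumption: the digest is taken over the canonical JSON bytes."""
    return fnv1a_64(canonical_json(tick, entities).encode("utf-8"))


if __name__ == "__main__":
    world = [
        {"id": 2, "components": {"pos": {"x": 1, "y": 2}, "hp": {"value": 5}}},
        {"id": 1, "components": {"pos": {"x": 0, "y": 0}}},
    ]
    print(canonical_json(3, world))
    print(f"{state_digest(3, world):016x}")
```

Because the ordering is applied before hashing, the same world handed over in a different entity or component order gives the same digest in this sketch, while any change to tick or field values changes the hashed bytes.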

Ground-truth modalities

When the world is rendered, the render path emits aligned ground-truth channels into euca-dataset. These are exact labels, not the color image:
  • Entity-id segmentation — a per-pixel entity index (index + 1; 0 = background), so a pixel maps deterministically to an entity_id.
  • Metric depth — a per-pixel depth in world units.
These ship today. The broader spatial suite — semantic segmentation, surface normals, camera pose, optical flow, multi-camera — is the Tier-1 build target and is not all present yet; don’t assume the full modality set.
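
As a rough illustration of how the two shipped channels might be consumed together, the sketch below decodes the index + 1 encoding and reads depth per entity. The nested-list image layout, the entity_ids lookup table from entity index to entity_id, and both function names are hypothetical; the actual channel format and API are not specified here.

```python
from typing import Dict, Optional, Sequence

BACKGROUND = 0  # segmentation value reserved for "no entity"


def pixel_to_entity_id(
    seg: Sequence[Sequence[int]],
    entity_ids: Sequence[int],
    x: int,
    y: int,
) -> Optional[int]:
    """Map one pixel to an entity_id, or None for background.

    seg holds (entity index + 1) per pixel, row-major (seg[y][x]).
    entity_ids is an assumed table: entity index -> entity_id.
    """
    value = seg[y][x]
    if value == BACKGROUND:
        return None
    return entity_ids[value - 1]  # undo the +1 offset


def nearest_depth_per_entity(
    seg: Sequence[Sequence[int]],
    depth: Sequence[Sequence[float]],
    entity_ids: Sequence[int],
) -> Dict[int, float]:
    """Smallest depth (world units) seen for each visible entity."""
    nearest: Dict[int, float] = {}
    for seg_row, depth_row in zip(seg, depth):
        for value, d in zip(seg_row, depth_row):
            if value == BACKGROUND:
                continue
            entity_id = entity_ids[value - 1]
            if entity_id not in nearest or d < nearest[entity_id]:
                nearest[entity_id] = d
    return nearest


if __name__ == "__main__":
    # Toy 3x2 frame; values are made up for the example.
    seg = [[0, 1, 1], [2, 2, 0]]
    depth = [[0.0, 4.0, 3.5], [7.0, 6.5, 0.0]]
    entity_ids = [42, 7]
    print(pixel_to_entity_id(seg, entity_ids, 1, 0))
    print(nearest_depth_per_entity(seg, depth, entity_ids))
```

Reserving 0 for background is what makes the offset necessary: entity index 0 is stored as 1, so a lookup always subtracts one before indexing.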

Aligned bundles

The offline data face exports a deterministic rollout as aligned layers — video plus object/field/graph/causal projections, actions, and counterfactuals — each example addressable by (episode_id | stream_id, tick, entity_id), with deterministic tick↔frame and pixel→entity mappings. The structured projection, causal projection, counterfactual harness, action logging, and Parquet/manifest export are in place; the full GT modality suite and a few adapters are still being built.
Dataset extraction is primarily an offline/in-process capability (see the world_model_capabilities and experiment examples in the repo), not a plain :3917 HTTP route. The online path — a live world an agent learns against step by step — is covered in Evaluation.
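
To make the addressing scheme concrete, here is a minimal sketch of a key built from (episode_id | stream_id, tick, entity_id) and of joining layers on it. ExampleKey, join_layers, and the fixed TICKS_PER_FRAME ratio are hypothetical names and assumptions; the text only states that the tick↔frame mapping is deterministic, not what form it takes, and the real export goes through Parquet/manifest files rather than in-memory dicts.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass(frozen=True, order=True)
class ExampleKey:
    """Address of one example: (episode_id | stream_id, tick, entity_id)."""
    stream: str      # an episode_id or a stream_id
    tick: int
    entity_id: int


# Assumption for illustration only: a fixed number of ticks per video frame.
TICKS_PER_FRAME = 2


def tick_to_frame(tick: int) -> int:
    """Frame index that covers the given simulation tick."""
    return tick // TICKS_PER_FRAME


def frame_to_tick(frame: int) -> int:
    """First simulation tick covered by the given frame."""
    return frame * TICKS_PER_FRAME


def join_layers(*layers: Dict[ExampleKey, Any]) -> Dict[ExampleKey, Tuple[Any, ...]]:
    """Align layers by keeping only keys present in every layer."""
    if not layers:
        return {}
    shared = set(layers[0]).intersection(*layers[1:])
    return {key: tuple(layer[key] for layer in layers) for key in sorted(shared)}


if __name__ == "__main__":
    k1 = ExampleKey("ep-0", 4, 7)
    k2 = ExampleKey("ep-0", 4, 9)
    objects = {k1: {"hp": 5}, k2: {"hp": 1}}
    actions = {k1: "jump"}
    print(join_layers(objects, actions))
    print(tick_to_frame(k1.tick))
```

A frozen, ordered dataclass is used so keys are hashable for the join and sort into a stable order, which keeps the joined output deterministic across runs.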

Status

  • WorldStateGraph, canonical JSON, state_digest (FNV-1a), entity-id segmentation + metric depth channels, aligned-bundle export (object/causal projection, counterfactuals, actions, Parquet/manifest) — shipped.
  • 🟡 The full spatial GT suite (normals, semantic-seg, camera pose, optical flow, multi-camera) and some scoring adapters are in progress.

Evaluation

How exact state + the answer key become a model score.