Data Model
Records are stored in LelielStore, an in-memory columnar store backed by DuckDB for write-through durability. There is no graph database: the factor graph is built in memory from the ingested records and updated on every ingest event.
Record
A record is the canonical unit of data in Leliel. Every record is one node in the factor graph.
| Field | Type | Required | Description |
|---|---|---|---|
| `record_id` | string | Yes | Unique identifier; primary key within the store |
| `source_id` | string | Yes | Producer identifier; used to scope list and query results |
| `signal_value` | float | Yes | Normalised quality signal in [0.0, declared_upper_bound] |
| `declared_upper_bound` | float | No | Upper bound for signal_value; defaults to 1.0 |
| `timestamp` | string | No | ISO 8601 timestamp; used for recency fallback ordering |
| `extra_data` | JSON object | No | Arbitrary metadata; field-value pairs drive the IDF semantic index |
Re-ingesting an existing record_id updates the columnar metadata in place. The accumulated
mass is not reset: mass is preserved across upserts.
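The upsert rule can be illustrated with a minimal sketch. `Record` mirrors the field table above; `TinyStore` is a hypothetical stand-in, not the LelielStore API, and the zero initial mass for new records is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Record:
    record_id: str
    source_id: str
    signal_value: float
    declared_upper_bound: float = 1.0   # documented default
    timestamp: Optional[str] = None     # ISO 8601 string
    extra_data: dict[str, Any] = field(default_factory=dict)


class TinyStore:
    """Hypothetical stand-in for the store; not the LelielStore API."""

    def __init__(self) -> None:
        self.index: dict[str, int] = {}   # record_id -> node index
        self.records: list[Record] = []   # metadata, one slot per node
        self.mass: list[float] = []       # utility mass, aligned with records

    def ingest(self, rec: Record) -> int:
        if not 0.0 <= rec.signal_value <= rec.declared_upper_bound:
            raise ValueError("signal_value outside [0.0, declared_upper_bound]")
        i = self.index.get(rec.record_id)
        if i is None:
            # New record: append a node with zero initial mass (assumption).
            i = len(self.records)
            self.index[rec.record_id] = i
            self.records.append(rec)
            self.mass.append(0.0)
        else:
            # Upsert: overwrite metadata in place, leave mass untouched.
            self.records[i] = rec
        return i
```

Ingesting the same `record_id` twice returns the same node index, and whatever mass the node had accumulated in between is still there afterwards.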
Factor graph
The factor graph is the in-memory graph over which the quantum walk evolves.
Nodes: one node per record. Node index i aligns exactly with the columnar metadata
vectors so all field reads are O(1) index accesses with no hash lookup.
Co-occurrence edges: within each source, a sliding window of width K=5 connects each record to its K nearest neighbours by ingest order. Edges represent temporal proximity in the ingested signal stream.
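One way to read the sliding-window rule is that each newly ingested record is linked to the previous K records from the same source. The sketch below takes that reading as an assumption; `co_occurrence_edges` and its input shape are hypothetical.

```python
from collections import defaultdict, deque

K = 5  # window width from the text


def co_occurrence_edges(stream):
    """Build undirected co-occurrence edges.

    stream: iterable of (node_index, source_id) pairs in ingest order.
    Returns a set of edges (i, j) with i < j.
    """
    recent = defaultdict(lambda: deque(maxlen=K))  # per-source window
    edges = set()
    for node, source in stream:
        # Link the new node to everything still inside its source's window.
        for prev in recent[source]:
            edges.add((min(prev, node), max(prev, node)))
        recent[source].append(node)
    return edges
```

Because the window is kept per source, records from different sources never receive a co-occurrence edge, even when they are adjacent in the global ingest order.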
Cross-source co-failure edges: when a record is ingested with a negative signal (signal_value low relative to declared_upper_bound), up to 3 edges are drawn to the nearest-mass records from other sources. These edges capture cross-source failure correlation.
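A rough sketch of the co-failure rule follows. It assumes "nearest-mass" means the smallest absolute difference in mass, and it leaves the decision of what counts as a negative signal to the caller; both are assumptions, and `co_failure_edges` is a hypothetical name.

```python
MAX_CO_FAILURE_EDGES = 3  # "up to 3 edges" from the text


def co_failure_edges(new_node, source_of, mass, is_negative):
    """Return up to 3 edges from new_node to nodes in other sources.

    source_of[i] and mass[i] describe node i. Candidates are ranked by
    absolute mass difference to new_node (assumed meaning of "nearest-mass").
    """
    if not is_negative:
        return []
    candidates = [
        j for j in range(len(mass))
        if j != new_node and source_of[j] != source_of[new_node]
    ]
    # Ties are broken by node index to keep the result deterministic.
    candidates.sort(key=lambda j: (abs(mass[j] - mass[new_node]), j))
    return [(new_node, j) for j in candidates[:MAX_CO_FAILURE_EDGES]]
```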
Wormhole edges (structural): after each ingest, the Fiedler eigenvector v2 of the graph
Laplacian is cached. Any two high-mass nodes whose Fiedler components differ by less than
WORMHOLE_ALPHA * std(v2) receive a wormhole edge. Wormhole edges are injected into the
walk snapshot at query time. They represent structural proximity in the ER=EPR sense:
nodes that are topologically equivalent in the Laplacian embedding may exhibit correlated
behaviour even when not directly connected.
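The wormhole criterion can be sketched with NumPy on a dense adjacency matrix. The value of `WORMHOLE_ALPHA` and the `HIGH_MASS` cutoff are placeholders, since the text states neither, and the function names are hypothetical.

```python
import numpy as np

WORMHOLE_ALPHA = 0.1   # placeholder value; not stated in the text
HIGH_MASS = 1.0        # hypothetical cutoff for "high-mass"


def fiedler_vector(adj):
    """Eigenvector of L = D - A for the second-smallest eigenvalue."""
    lap = np.diag(adj.sum(axis=1)) - adj
    _, vecs = np.linalg.eigh(lap)   # eigh returns eigenvalues in ascending order
    return vecs[:, 1]


def wormhole_edges(adj, mass):
    """Pairs of high-mass nodes whose Fiedler components nearly coincide."""
    v2 = fiedler_vector(adj)
    tol = WORMHOLE_ALPHA * np.std(v2)
    heavy = [i for i in range(len(mass)) if mass[i] >= HIGH_MASS]
    return [
        (i, j)
        for a, i in enumerate(heavy)
        for j in heavy[a + 1:]
        if abs(v2[i] - v2[j]) < tol
    ]
```

On two triangles joined by a single bridge edge, the two non-bridge nodes of each triangle have identical Fiedler components, so they pair up with each other but not with nodes on the far side of the bridge.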
Mass field
Every node carries a scalar utility mass value. Mass accumulates from REINFORCE feedback and decays via Hawking radiation. Three sources contribute to the effective mass used by the walk Hamiltonian:
| Component | Source | Description |
|---|---|---|
| Utility mass | REINFORCE feedback | Accumulates on negative signal; decays on positive signal and over time |
| Structural mass | Fiedler proximity | Wormhole edges encode structural co-location (ER=EPR) |
| Semantic mass | IDF feature similarity | Cold-start bias toward semantically similar records via extra_data IDF vectors |
Utility mass is the primary component at query time. Structural and semantic components operate via edge injection and walk seed bias respectively rather than as additive mass terms.
Hawking decay: mass decays continuously with a half-life of 86400 seconds (one day).
The decay is time-proportional and applied lazily on each write; the decay constant is
derived from the one-day half-life via alpha_H = ln(2) / 86400.
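With that constant, a mass m left untouched for dt seconds becomes m * exp(-alpha_H * dt). A minimal sketch of lazy application on write is shown below; the function names and the additive feedback `delta` are illustrative assumptions.

```python
import math

HALF_LIFE_S = 86400.0                  # one day, from the text
ALPHA_H = math.log(2) / HALF_LIFE_S    # decay constant alpha_H


def decayed_mass(mass, last_updated, now):
    """Mass after exponential decay over (now - last_updated) seconds."""
    elapsed = max(0.0, now - last_updated)
    return mass * math.exp(-ALPHA_H * elapsed)


def reinforce(mass, last_updated, now, delta):
    """Lazy decay: catch up on elapsed time, then apply the feedback delta.

    Returns the new mass and the new last_updated timestamp.
    """
    return decayed_mass(mass, last_updated, now) + delta, now
```

For example, a mass of 8.0 that receives no writes for one day decays to 4.0, and to 2.0 after two days.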
Schwarzschild threshold: M_s = lambda_2(L) * log(1/epsilon) / (G * K). Derived from
the Fiedler eigenvalue of the live graph; not a configurable constant. Nodes whose mass
exceeds M_s are black holes: the walk's Born-rule amplitude suppresses them without
explicit exclusion logic.
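The threshold formula translates directly into code. In the sketch below, `lambda_2` is passed in rather than computed, `epsilon` and `G` are left as caller-supplied parameters because the text does not give their values, and K is assumed to be the same window width K=5 used for co-occurrence edges.

```python
import math


def schwarzschild_threshold(lambda_2, epsilon, G, K=5):
    """M_s = lambda_2 * log(1/epsilon) / (G * K)."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie in (0, 1)")
    if G <= 0 or K <= 0:
        raise ValueError("G and K must be positive")
    return lambda_2 * math.log(1.0 / epsilon) / (G * K)


def black_holes(mass, m_s):
    """Indices of nodes whose mass exceeds the threshold."""
    return [i for i, m in enumerate(mass) if m > m_s]
```

Since lambda_2 changes as the graph changes, the threshold would need to be recomputed whenever the cached Fiedler eigenvalue is refreshed.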
DuckDB schema
DuckDB provides write-through durability. The hot query path reads exclusively from the in-memory columnar store; no SQL is issued during query execution.
records
Durable record store. One row per record.
| Column | Type | Description |
|---|---|---|
| `record_id` | text | Primary key; matches the in-memory node identifier |
| `source_id` | text | Producer identifier |
| `signal_value` | double | Quality signal |
| `declared_upper_bound` | double | Declared amplitude upper bound |
| `timestamp` | text | ISO 8601 timestamp string, nullable |
| `extra_data` | text | JSON-serialised extra metadata, nullable |
| `ingested_at` | timestamp | UTC time this record was written to DuckDB |
record_masses
Durable mass index. One row per record. Updated on REINFORCE writes.
| Column | Type | Description |
|---|---|---|
| `record_id` | text | Foreign key to records |
| `mass` | double | Current utility mass value |
| `last_updated` | timestamp | UTC time of most recent mass write |
analysis_summaries
Source-level analysis summaries computed by the background analysis worker.
| Column | Type | Description |
|---|---|---|
| `source_id` | text | Source identifier |
| `record_count` | integer | Number of records in this source |
| `mean_signal` | double | Mean signal_value across records |
| `mean_mass` | double | Mean mass across records |
| `black_hole_count` | integer | Records above M_s at time of summary |
| `computed_at` | timestamp | UTC time this summary was computed |
mesh_snapshots
Append-only SLEM trend log written by the analysis worker each cycle.
| Column | Type | Description |
|---|---|---|
| `slem` | double | Second Largest Eigenvalue Modulus of the walk weight matrix |
| `spectral_gap` | double | 1 - slem |
| `node_count` | integer | Graph size at snapshot time |
| `computed_at` | timestamp | UTC time of this snapshot |
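As an illustration of the `slem` and `spectral_gap` columns, the sketch below computes both from a small dense matrix with NumPy. It assumes the walk weight matrix is square with a leading eigenvalue of modulus 1 (for example, row-stochastic); `slem_and_gap` is a hypothetical helper, not the analysis worker's code.

```python
import numpy as np


def slem_and_gap(weights):
    """Return (slem, spectral_gap) for a square weight matrix.

    SLEM is the second-largest eigenvalue modulus; the gap is 1 - SLEM.
    """
    moduli = np.sort(np.abs(np.linalg.eigvals(weights)))[::-1]  # descending
    slem = float(moduli[1])
    return slem, 1.0 - slem
```

For the two-state matrix [[0.9, 0.1], [0.2, 0.8]] the eigenvalues are 1 and 0.7, so the SLEM is 0.7 and the spectral gap is 0.3, up to floating-point error.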