Mork theory
How to Use This Source
Articulating Some Conditions Where ZAM/MORK Yield Benefit: A Selectivity Theorem and a Hierarchical Corollary
Ben Goertzel, October 6, 2025. ~14K chars. Internal working paper (GPT-5-Pro production, lightly edited).
This is the formal mathematical foundation for MORK's performance claims. It proves that under mild independence conditions and hierarchical data distributions, ZAM/MORK's path-indexed intersection strategy reduces multi-leg join candidates to \(O(1)\) — constant work fed to the unifier — while WAM-style backtracking remains \(\Omega(N)\) per leg. The paper then argues that real-world Atomspaces (proofs, programs, NL parses, knowledge graphs) satisfy these conditions.
Reading strategy: For the core result, read §2–3 (definitions + Theorem 1). For the generative model that justifies real-world applicability, read §4 (Hierarchical Corollary). §7 is the most important section for understanding why the theorem matters in practice — it argues that five domains all satisfy the hierarchical assumptions.
Section Map — Selectivity Theorem (Goertzel, 2025)
| § | Section | Key Content | Cards Grounded |
|---|---|---|---|
| 1 | Introduction (what ZAM/MORK are) | MORK = content-addressed store (Merkle DAG) + path-indexed trie (PathMap). ZAM = execution model exploiting path algebra for streaming candidate generation. Central question: under what data distributions and query shapes does this yield advantage? | MORK, MORK Full |
| 2 | Model and Definitions | Selectivity exponent: \(\gamma(p) = -\log_N(|P(p)|/N)\), where \(P(p)\) is the posting list for prefix \(p\). Higher \(\gamma\) = more selective prefix. \(\varepsilon\)-independence: \(\Pr[\bigcap_i p_i] \in [(1-\varepsilon)\prod_i \Pr[p_i],\, (1+\varepsilon)\prod_i \Pr[p_i]]\). Cost model: PathMap enumerates and intersects postings in near-linear time; unifier cost \(c_u\) per candidate. | MORK Full |
| 3 | Main Theorem (Theorem 1) | Under \(\varepsilon\)-independence: \(\mathbb{E}|P_\cap| = \Theta(N^{1-\sum_i \gamma(p_i)})\). Three regimes: • \(\sum_i \gamma(p_i) > 1\): \(\mathbb{E}|P_\cap| = O(1)\) — constant candidates to unifier • \(\sum_i \gamma(p_i) = 1\): \(\mathbb{E}|P_\cap| = \Theta(1)\) • \(\sum_i \gamma(p_i) < 1\): sublinear \(\Theta(N^{1-\sum_i \gamma(p_i)})\) Expected ZAM work: \(O(\sum_i N^{1-\gamma(p_i)} + c_u \cdot N^{1-\sum_i \gamma(p_i)})\) | MORK Full |
| 4 | Hierarchical Model Corollary (Corollary 1) | Generative model: rooted tree of symbols, Dirichlet-distributed children (subuniform concentration), exponential prefix probability decay \(\Pr[p] \leq C a^{-d}\). Corollary: if \(\sum_i d_i \geq (\log N / \log a)(1+\delta)\), then \(\sum_i \gamma(p_i) > 1\) and \(\mathbb{E}|P_\cap| = O(1)\). In words: even moderately deep bindings across a few legs suffice. | MORK Full |
| 5 | Comparison to WAM-style Backtracking | WAM without path algebra: \(\Omega(N)\) candidates per leg, \(\Omega(N^k)\) worst case across \(k\) legs. ZAM under selectivity conditions: constant candidates, near-linear work in short posting lists. | MeTTa Implementations (PeTTa/metta-wam comparison context) |
| 6 | Practical Guidance and Measurement | Estimate \(\hat{\gamma}(p)\) empirically from PathMap. Plot \(\sum_i \hat{\gamma}(p_i)\) across workload. Small \(\varepsilon\) suffices. Hash-consing complements trie (RAM + \(O(1)\) equality) but does not replace selectivity algebra. | MORK Full |
| 7 | Why Real-World Atomspaces Are Approximately Hierarchical | Five domains analyzed: (1) logic proofs and tactics, (2) program ASTs, (3) NL semantic parses, (4) scene/instance graphs, (5) heterogeneous knowledge graphs. All satisfy: compositional generativity (multiplicative path probabilities), heavy-tailed conditionals (Zipf/Heaps subuniform priors → exponential prefix decay), reuse/normalization (effective depth \(D = O(\log N)\)), weak dependence across legs (distinct positions → \(\varepsilon\)-independence). | MORK Full, PLN Full, Semantic Parsing Full |
Key Concepts Index
| Concept | Section | Definition / Role |
|---|---|---|
| Selectivity exponent \(\gamma(p)\) | §2 | Normalized information carried by prefix \(p\); governs posting list size |
| \(\varepsilon\)-independence | §2 | Mild, checkable condition for joint posting list estimation |
| Selectivity sum \(\sum_i \gamma(p_i)\) | §3 | Determines join regime: >1 → constant candidates |
| Hierarchical generative model | §4 | Tree-structured Dirichlet conditionals with exponential prefix decay |
| Compositional generativity | §7 | Structures built by repeated conditioned expansion → multiplicative path probabilities |
| PathMap / posting list | §1–2 | Trie-indexed set of atoms matching a prefix; enumerable in near-linear time |
| ZAM streaming | §1, §3 | Exploits selective prefixes to prune before unification |
Companion MORK Papers
The Selectivity Theorem provides the foundational efficiency argument. These companion papers build specific algorithms on the MORK substrate:
| Paper | Source Card | Relationship |
|---|---|---|
| MORK Tensor Networks (Goertzel, Oct 2025) | MORK Tensor Networks | Bridge from path algebra to GPU tensor logic. ShardZipper workflow. Hierarchical Resolution Transformer example. ~19K chars. |
| MORK Slots (Goertzel, Oct 2025) | MORK Slots | Slot-centric indexing vs. permutation explosion. Probabilistic space-time tradeoff for materialized views. ~6K chars. |
| AdaptiMORK v8 (Goertzel, Sep 2025) | AdaptiMORK v8 | Adaptive streaming compression with parameter elevation and template reuse. WILLIAM integration. ~43K chars. |
| MORK Miner | MORK Miner | Pattern mining on MORK substrate (posting-list seed+grow pipeline). |
| Mork William | Mork William | WILLIAM incremental compression on MORK data structures. |
| PLN Chaining On MORK | PLN Chaining On MORK | PLN forward/backward chaining exploiting MORK path algebra. |
| PLN FactorGraph MORK | PLN FactorGraph MORK | Factor-graph message passing for PLN on MORK. Multiple versions (v1, v2). |
| Moses Mork | Moses Mork | MOSES evolutionary learning on MORK substrate. |
| Ebt Pc Mork | Ebt Pc Mork | Evidence-based predictive coding on MORK. |
| Chaining Mork Cache | Chaining Mork Cache | Caching strategies for chaining queries on MORK. |
Enrichment Targets
| Gap | Source | Target Card | Priority |
|---|---|---|---|
| Tensor logic bridge (ShardZipper → GPU kernels, semiring generalization) | MORK Tensor Networks §1–5 | MORK Full (ShardZipper section) | High |
| Slot-centric indexing tradeoff formula | MORK Slots §2–4 | MORK Full | Medium |
| AdaptiMORK parameter elevation and template library | AdaptiMORK v8 §1–3 | WILLIAM Full or MORK Full | Medium |
| PLN factor-graph message passing on MORK | PLN FactorGraph MORK | PLN Full | Medium |
| Hierarchical Resolution Transformer (HRT) example | MORK Tensor Networks §6 | No dedicated card — possible new subcard | Low |
| MOSES on MORK integration specifics | Moses Mork | MOSES Full | Low |
Best-Source-For Shortcuts
- "Why is MORK faster than a WAM?" → Selectivity Theorem §3 (Theorem 1) + §5 (WAM comparison)
- "When does ZAM give constant-time joins?" → §3: when \(\sum_i \gamma(p_i) > 1\)
- "Does the selectivity theorem apply to my data?" → §7 (five domain arguments) + §6 (empirical measurement guide)
- "How does MORK use GPUs?" → MORK Tensor Networks §1–5 (ShardZipper, tensor logic, Viterbi)
- "How does pattern mining work on MORK?" → MORK Miner + MORK Slots (slot-centric seed+grow)
- "How does PLN run on MORK?" → PLN Chaining On MORK + PLN FactorGraph MORK
- "How does WILLIAM integrate with MORK?" → Mork William + AdaptiMORK v8
Source Card
- Selectivity Theorem (Goertzel, Oct 2025) — ID 3803, ~14K chars
Tags
Discussion