Mork theory

Draft — This content has not been approved for publication.

How to Use This Source

Articulating Some Conditions Where ZAM/MORK Yield Benefit: A Selectivity Theorem and a Hierarchical Corollary
Ben Goertzel, October 6, 2025. ~14K chars. Internal working paper (GPT-5-Pro production, lightly edited).

This is the formal mathematical foundation for MORK's performance claims. It proves that under mild independence conditions and hierarchical data distributions, ZAM/MORK's path-indexed intersection strategy reduces multi-leg join candidates to \(O(1)\) — constant work fed to the unifier — while WAM-style backtracking remains \(\Omega(N)\) per leg. The paper then argues that real-world Atomspaces (proofs, programs, NL parses, knowledge graphs) satisfy these conditions.

Reading strategy: For the core result, read §2–3 (definitions + Theorem 1). For the generative model that justifies real-world applicability, read §4 (Hierarchical Corollary). §7 is the most important section for understanding why the theorem matters in practice — it argues that five domains all satisfy the hierarchical assumptions.

Section Map — Selectivity Theorem (Goertzel, 2025)

§	Section	Key Content	Cards Grounded
1	Introduction (what ZAM/MORK are)	MORK = content-addressed store (Merkle DAG) + path-indexed trie (PathMap). ZAM = execution model exploiting path algebra for streaming candidate generation. Central question: under what data distributions and query shapes does this yield advantage?	MORK, MORK Full
2	Model and Definitions	Selectivity exponent: \(\gamma(p) = -\log_N(\|P(p)\|/N)\), where \(P(p)\) is the posting list for prefix \(p\). Higher \(\gamma\) = more selective prefix. \(\varepsilon\)-independence: \(\Pr[\bigcap_i p_i] \in [(1-\varepsilon)\prod_i \Pr[p_i],\, (1+\varepsilon)\prod_i \Pr[p_i]]\). Cost model: PathMap enumerates and intersects postings in near-linear time; unifier cost \(c_u\) per candidate.	MORK Full
3	Main Theorem (Theorem 1)	Under \(\varepsilon\)-independence: \(\mathbb{E}\|P_\cap\| = \Theta(N^{1-\sum_i \gamma(p_i)})\). Three regimes: • \(\sum_i \gamma(p_i) > 1\): \(\mathbb{E}\|P_\cap\| = O(1)\) — constant candidates to unifier • \(\sum_i \gamma(p_i) = 1\): \(\mathbb{E}\|P_\cap\| = \Theta(1)\) • \(\sum_i \gamma(p_i) < 1\): sublinear \(\Theta(N^{1-\sum_i \gamma(p_i)})\) Expected ZAM work: \(O(\sum_i N^{1-\gamma(p_i)} + c_u \cdot N^{1-\sum_i \gamma(p_i)})\)	MORK Full
4	Hierarchical Model Corollary (Corollary 1)	Generative model: rooted tree of symbols, Dirichlet-distributed children (subuniform concentration), exponential prefix probability decay \(\Pr[p] \leq C a^{-d}\). Corollary: if \(\sum_i d_i \geq (\log N / \log a)(1+\delta)\), then \(\sum_i \gamma(p_i) > 1\) and \(\mathbb{E}\|P_\cap\| = O(1)\). In words: even moderately deep bindings across a few legs suffice.	MORK Full
5	Comparison to WAM-style Backtracking	WAM without path algebra: \(\Omega(N)\) candidates per leg, \(\Omega(N^k)\) worst case across \(k\) legs. ZAM under selectivity conditions: constant candidates, near-linear work in short posting lists.	MeTTa Implementations (PeTTa/metta-wam comparison context)
6	Practical Guidance and Measurement	Estimate \(\hat{\gamma}(p)\) empirically from PathMap. Plot \(\sum_i \hat{\gamma}(p_i)\) across workload. Small \(\varepsilon\) suffices. Hash-consing complements trie (RAM + \(O(1)\) equality) but does not replace selectivity algebra.	MORK Full
7	Why Real-World Atomspaces Are Approximately Hierarchical	Five domains analyzed: (1) logic proofs and tactics, (2) program ASTs, (3) NL semantic parses, (4) scene/instance graphs, (5) heterogeneous knowledge graphs. All satisfy: compositional generativity (multiplicative path probabilities), heavy-tailed conditionals (Zipf/Heaps subuniform priors → exponential prefix decay), reuse/normalization (effective depth \(D = O(\log N)\)), weak dependence across legs (distinct positions → \(\varepsilon\)-independence).	MORK Full, PLN Full, Semantic Parsing Full

Key Concepts Index

Concept	Section	Definition / Role
Selectivity exponent \(\gamma(p)\)	§2	Normalized information carried by prefix \(p\); governs posting list size
\(\varepsilon\)-independence	§2	Mild, checkable condition for joint posting list estimation
Selectivity sum \(\sum_i \gamma(p_i)\)	§3	Determines join regime: >1 → constant candidates
Hierarchical generative model	§4	Tree-structured Dirichlet conditionals with exponential prefix decay
Compositional generativity	§7	Structures built by repeated conditioned expansion → multiplicative path probabilities
PathMap / posting list	§1–2	Trie-indexed set of atoms matching a prefix; enumerable in near-linear time
ZAM streaming	§1, §3	Exploits selective prefixes to prune before unification

Companion MORK Papers

The Selectivity Theorem provides the foundational efficiency argument. These companion papers build specific algorithms on the MORK substrate:

Paper	Source Card	Relationship
MORK Tensor Networks (Goertzel, Oct 2025)	MORK Tensor Networks	Bridge from path algebra to GPU tensor logic. ShardZipper workflow. Hierarchical Resolution Transformer example. ~19K chars.
MORK Slots (Goertzel, Oct 2025)	MORK Slots	Slot-centric indexing vs. permutation explosion. Probabilistic space-time tradeoff for materialized views. ~6K chars.
AdaptiMORK v8 (Goertzel, Sep 2025)	AdaptiMORK v8	Adaptive streaming compression with parameter elevation and template reuse. WILLIAM integration. ~43K chars.
MORK Miner	MORK Miner	Pattern mining on MORK substrate (posting-list seed+grow pipeline).
Mork William	Mork William	WILLIAM incremental compression on MORK data structures.
PLN Chaining On MORK	PLN Chaining On MORK	PLN forward/backward chaining exploiting MORK path algebra.
PLN FactorGraph MORK	PLN FactorGraph MORK	Factor-graph message passing for PLN on MORK. Multiple versions (v1, v2).
Moses Mork	Moses Mork	MOSES evolutionary learning on MORK substrate.
Ebt Pc Mork	Ebt Pc Mork	Evidence-based predictive coding on MORK.
Chaining Mork Cache	Chaining Mork Cache	Caching strategies for chaining queries on MORK.

Enrichment Targets

Gap	Source	Target Card	Priority
Tensor logic bridge (ShardZipper → GPU kernels, semiring generalization)	MORK Tensor Networks §1–5	MORK Full (ShardZipper section)	High
Slot-centric indexing tradeoff formula	MORK Slots §2–4	MORK Full	Medium
AdaptiMORK parameter elevation and template library	AdaptiMORK v8 §1–3	WILLIAM Full or MORK Full	Medium
PLN factor-graph message passing on MORK	PLN FactorGraph MORK	PLN Full	Medium
Hierarchical Resolution Transformer (HRT) example	MORK Tensor Networks §6	No dedicated card — possible new subcard	Low
MOSES on MORK integration specifics	Moses Mork	MOSES Full	Low

Best-Source-For Shortcuts

"Why is MORK faster than a WAM?" → Selectivity Theorem §3 (Theorem 1) + §5 (WAM comparison)
"When does ZAM give constant-time joins?" → §3: when \(\sum_i \gamma(p_i) > 1\)
"Does the selectivity theorem apply to my data?" → §7 (five domain arguments) + §6 (empirical measurement guide)
"How does MORK use GPUs?" → MORK Tensor Networks §1–5 (ShardZipper, tensor logic, Viterbi)
"How does pattern mining work on MORK?" → MORK Miner + MORK Slots (slot-centric seed+grow)
"How does PLN run on MORK?" → PLN Chaining On MORK + PLN FactorGraph MORK
"How does WILLIAM integrate with MORK?" → Mork William + AdaptiMORK v8

Source Card

Selectivity Theorem (Goertzel, Oct 2025) — ID 3803, ~14K chars