Approved by Ursula Addison on 2026-05-08

Responsible: Ben Goertzel, Arthur Franz (original WILLIAM concept)

Papers: Goertzel (2025), Hyperon for AGI⇒ASI Whitepaper, §5.12, §7.6; Franz, A., WILLIAM: Adaptive Compression for AGI

Status: Experimental. Core compression-based pattern detection is under development. MORK trie instrumentation is being implemented. Full integration with PLN scheduling, ECAN attention, and neural acceleration remains a research goal.

This card provides technical depth beyond the concise WILLIAM index card. WILLIAM is an adaptive-compression-based approach to cognitive pattern discovery, now being integrated into MORK's trie infrastructure to serve as a real-time guide for both symbolic reasoning and neural processing.

Core Principle

WILLIAM embodies a fundamental insight: the patterns worth remembering are those that compress experience most effectively. It acts as a cognitive feature detector that continuously asks "what's the simplest explanation that captures what I'm seeing?" The weakness prior provides the theoretical foundation — simpler patterns generalize better, and compression naturally identifies what matters without hard-coded heuristics.

Compression Gain

Each pattern application is scored by the description-length reduction it achieves:

Formal definition:

\[\text{gain}(r, S) = L(S) - L(S') - C(r)\]
  • Variables: \(r\) = template/rewrite rule; \(S\) = current state; \(S'\) = state after applying \(r\); \(L(\cdot)\) = description length; \(C(r)\) = dictionary cost of template \(r\)
  • Domain: \(\text{gain} \in \mathbb{R}\); positive values indicate worthwhile compression
  • Assumptions: Under the Compositional Description of Data (CoDD) framework, each template application strictly strengthens the hypothesis — weakness decreases monotonically, so MDL gain is also monotone
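  • Meaning: Keep a search branch only while its cumulative gain \(G_t = \sum_{i \leq t} \text{gain}(r_i, S_i)\) stays positive, where \(r_i\) is the \(i\)-th template applied along the branch and \(S_i\) is the state it is applied to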
  • Source: Franz, A., WILLIAM on MORK clarifications
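
To make the gain formula concrete, here is a minimal Python sketch. It assumes a toy setting that is not part of WILLIAM itself: states are strings, \(L(\cdot)\) is the character count, a template replaces every occurrence of a substring with a short symbol, and \(C(r)\) is the length of the substring plus the symbol. The names `Rule`, `description_length`, `gain`, and `cumulative_gains` are hypothetical.

```python
from dataclasses import dataclass
from itertools import accumulate
from typing import List, Tuple


@dataclass(frozen=True)
class Rule:
    """Toy template r: rewrite every occurrence of `pattern` as `symbol`."""
    pattern: str
    symbol: str

    @property
    def cost(self) -> int:
        # C(r): toy dictionary cost, the characters needed to store the rule.
        return len(self.pattern) + len(self.symbol)

    def apply(self, state: str) -> str:
        return state.replace(self.pattern, self.symbol)


def description_length(state: str) -> int:
    # L(.): toy description length (assumption: one unit per character).
    return len(state)


def gain(rule: Rule, state: str) -> Tuple[int, str]:
    """Return (gain(r, S), S') with gain(r, S) = L(S) - L(S') - C(r)."""
    after = rule.apply(state)
    g = description_length(state) - description_length(after) - rule.cost
    return g, after


def cumulative_gains(rules: List[Rule], state: str) -> List[int]:
    """Apply rules in order and return G_1, G_2, ... along the branch."""
    gains = []
    for rule in rules:
        g, state = gain(rule, state)
        gains.append(g)
    return list(accumulate(gains))


totals = cumulative_gains([Rule("abc", "X"), Rule("XX", "Y")], "abcabcabcabc")
print(totals)  # [4, 3]
```

The first rule shortens the state from 12 to 4 characters at a dictionary cost of 4, so its gain is 4. The second rule saves 2 characters but costs 3, so its gain is -1. The cumulative gain stays positive, so under the criterion above this branch would be kept.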

Theoretical Guarantees (Hierarchical Priors)

When data follows a hierarchical generative process with bounded-size reusable templates, reuse probability \(\rho > 0\), and heavy-hitter separability margin \(\gamma\):

\[K(x) = \sum_i \ell(f_i) + K(r_s) + O\bigl((\log \ell(x))^2\bigr)\]
  • Variables: \(K(x)\) = Kolmogorov complexity of input \(x\); \(\ell(f_i)\) = length of template \(f_i\); \(K(r_s)\) = complexity of residual; \(\ell(x)\) = input length
  • Meaning: WILLIAM-on-MORK with top-\(k\) beam search (where \(k \in \{2, 3\}\) suffices) achieves near-optimal compression in \(O(\log \ell(x))\) steps, each costing \(O(T_{\text{lookup}})\) via MORK trie traversal
  • Source: Franz, A., WILLIAM on MORK clarifications, Theorem 1
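
The search procedure named above can be illustrated with a small top-\(k\) beam search. This is a toy sketch, not the WILLIAM-on-MORK algorithm: candidate templates are found by counting repeated substrings (a global scan that the trie instrumentation is meant to avoid), states are lowercase strings, and the gain uses the same toy costs as a character-count description length. The names `candidates`, `step_gain`, and `beam_search` are hypothetical, and the sketch does not demonstrate the guarantee of Theorem 1.

```python
from collections import Counter
from typing import List, Tuple

FRESH = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # replacement symbols (input must be lowercase)

Beam = Tuple[int, str, List[str]]  # (cumulative gain, state, templates used)


def candidates(state: str, max_len: int = 4) -> List[str]:
    """Substrings of length 2..max_len seen at least twice (overlaps counted)."""
    counts = Counter(
        state[i:i + n]
        for n in range(2, max_len + 1)
        for i in range(len(state) - n + 1)
    )
    return sorted(p for p, c in counts.items() if c >= 2)


def step_gain(state: str, pattern: str, symbol: str) -> Tuple[int, str]:
    """Toy gain: L(S) - L(S') - C(r), with character counts as lengths."""
    after = state.replace(pattern, symbol)
    cost = len(pattern) + len(symbol)
    return len(state) - len(after) - cost, after


def beam_search(state: str, k: int = 2, max_steps: int = 10) -> Beam:
    """Keep the k branches with the highest cumulative gain at each step."""
    beams: List[Beam] = [(0, state, [])]
    best = beams[0]
    for step in range(min(max_steps, len(FRESH))):
        symbol = FRESH[step]
        expanded: List[Beam] = []
        for total, s, templates in beams:
            for p in candidates(s):
                g, after = step_gain(s, p, symbol)
                if g > 0:  # only extend a branch with worthwhile compression
                    expanded.append((total + g, after, templates + [p]))
        if not expanded:
            break
        expanded.sort(key=lambda b: (-b[0], b[1]))  # deterministic tie-break
        beams = expanded[:k]
        best = max(best, beams[0], key=lambda b: b[0])
    return best


result = beam_search("abcabcabcabc", k=2)
print(result)  # (4, 'AAAA', ['abc'])
```

On this input the search stops after one step, because no further template has positive gain once the state is `AAAA`.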

MORK Trie Instrumentation

The key implementation step is adding instrumentation directly to MORK's trie nodes. Each node carries:

Field | Purpose
Local occurrence counts | How often this exact node is accessed
Subtree totals | Aggregate weight counters across all descendants
Compression-gain sums | Cumulative compression benefit when this pattern is applied
Top-\(k\) children rankings | Ranked lists of most important children by various metrics

This instrumentation enables weighted iterators that return heavy subpatterns directly from any point in the graph — no global scans required.
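
As an illustration of the per-node fields, the following Python sketch builds a small in-memory trie that maintains them on every access. It is a sketch under stated assumptions, not MORK's data layout: `Node`, `InstrumentedTrie`, `record_access`, `record_gain`, and `top_children` are hypothetical names, the subtree total here includes the node's own count, and the top-\(k\) ranking is computed on demand by subtree total rather than maintained incrementally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple


@dataclass
class Node:
    count: int = 0           # local occurrence count for this exact node
    subtree_total: int = 0   # aggregate weight of this node and its descendants
    gain_sum: float = 0.0    # cumulative compression gain recorded at this node
    children: Dict[str, "Node"] = field(default_factory=dict)

    def top_children(self, k: int) -> List[Tuple[str, int]]:
        """Rank children by subtree total (one possible metric), ties by key."""
        ranked = sorted(self.children.items(),
                        key=lambda kv: (-kv[1].subtree_total, kv[0]))
        return [(key, child.subtree_total) for key, child in ranked[:k]]


class InstrumentedTrie:
    def __init__(self) -> None:
        self.root = Node()

    def record_access(self, path: Sequence[str], weight: int = 1) -> None:
        """Bump the local count at the end of `path` and totals along the way."""
        node = self.root
        node.subtree_total += weight
        for key in path:
            node = node.children.setdefault(key, Node())
            node.subtree_total += weight
        node.count += weight

    def record_gain(self, path: Sequence[str], gain: float) -> None:
        """Add an observed compression gain to an existing pattern node."""
        node = self.root
        for key in path:
            node = node.children[key]  # KeyError if the pattern was never seen
        node.gain_sum += gain


trie = InstrumentedTrie()
for path, weight in [
    (("likes", "alice", "bob"), 3),
    (("likes", "alice", "carol"), 1),
    (("likes", "dave", "erin"), 2),
    (("owns", "alice", "car"), 1),
]:
    trie.record_access(path, weight)
trie.record_gain(("likes", "alice", "bob"), 2.5)

likes = trie.root.children["likes"]
print(trie.root.top_children(1))  # [('likes', 6)]
print(likes.top_children(2))      # [('alice', 4), ('dave', 2)]
```

Because each node already knows its subtree total, the heaviest branch under any prefix can be found by looking only at that node's children.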

API Surface

The API remains minimal:

  • iter_prefix_topk(prefix, k) — returns the \(k\) most important patterns under a given prefix
  • iter_any_topk(k) — finds globally significant patterns across the entire graph
  • A validation API that records actual compression gains when patterns are applied, closing the feedback loop between prediction and outcome
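
A hedged sketch of how these calls could behave is shown below. Only the names `iter_prefix_topk(prefix, k)` and `iter_any_topk(k)` come from this card; the class `WeightedTrie`, the methods `add` and `record_gain`, and the choice of occurrence count as the importance measure are assumptions made for illustration. The iterator uses best-first traversal: a subtree total is an upper bound on every count beneath it, so whole subtrees can be skipped without a global scan.

```python
import heapq
from typing import Dict, Iterator, Optional, Sequence, Tuple

Path = Tuple[str, ...]


class Node:
    def __init__(self) -> None:
        self.count = 0          # local occurrence count
        self.subtree_total = 0  # count of this node plus all descendants
        self.gain_sum = 0.0     # observed compression gain (validation feedback)
        self.children: Dict[str, "Node"] = {}


class WeightedTrie:
    def __init__(self) -> None:
        self.root = Node()

    def add(self, path: Sequence[str], weight: int = 1) -> None:
        node = self.root
        node.subtree_total += weight
        for key in path:
            node = node.children.setdefault(key, Node())
            node.subtree_total += weight
        node.count += weight

    def _find(self, prefix: Sequence[str]) -> Optional[Node]:
        node: Optional[Node] = self.root
        for key in prefix:
            node = node.children.get(key)
            if node is None:
                return None
        return node

    def iter_prefix_topk(self, prefix: Sequence[str], k: int) -> Iterator[Tuple[Path, int]]:
        """Yield up to k (pattern, count) pairs under `prefix`, heaviest first."""
        start = self._find(prefix)
        if start is None or k <= 0:
            return
        # Heap entries: (-bound, path, is_pattern, node). For a subtree entry the
        # bound is its subtree total; for a pattern entry it is the exact count.
        heap = [(-start.subtree_total, tuple(prefix), False, start)]
        emitted = 0
        while heap and emitted < k:
            neg, path, is_pattern, node = heapq.heappop(heap)
            if is_pattern:
                yield path, -neg
                emitted += 1
                continue
            if node.count > 0:
                heapq.heappush(heap, (-node.count, path, True, node))
            for key, child in node.children.items():
                heapq.heappush(heap, (-child.subtree_total, path + (key,), False, child))

    def iter_any_topk(self, k: int) -> Iterator[Tuple[Path, int]]:
        """Global variant: the same traversal started at the root."""
        return self.iter_prefix_topk((), k)

    def record_gain(self, path: Sequence[str], observed_gain: float) -> None:
        """Hypothetical validation hook: store the gain actually achieved."""
        node = self._find(path)
        if node is None:
            raise KeyError(tuple(path))
        node.gain_sum += observed_gain


trie = WeightedTrie()
trie.add(("likes", "alice", "bob"), 3)
trie.add(("likes", "alice", "carol"), 1)
trie.add(("likes", "dave", "erin"), 2)
trie.add(("owns", "alice", "car"), 4)

top_global = list(trie.iter_any_topk(2))
top_likes = list(trie.iter_prefix_topk(("likes",), 2))
trie.record_gain(("owns", "alice", "car"), 1.5)
print(top_global)  # [(('owns', 'alice', 'car'), 4), (('likes', 'alice', 'bob'), 3)]
print(top_likes)   # [(('likes', 'alice', 'bob'), 3), (('likes', 'dave', 'erin'), 2)]
```

A real implementation could rank by recorded gain or a blend of metrics instead of raw counts; the traversal stays the same as long as each node stores an upper bound for its subtree.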

Concurrency Model

The implementation handles concurrent access through per-core write buffers for counts, with wait-free readers using snapshot/RCU (Read-Copy-Update) techniques. This ensures that pattern discovery does not block the main reasoning pipeline.
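
The structure of this scheme can be sketched in Python, with the caveat that Python cannot show true wait-free behaviour or kernel-style RCU. The sketch assumes one buffer per worker thread in place of per-core buffers, a single publisher that merges the buffers into a new immutable snapshot, and readers that only dereference the current snapshot. The names `BufferedCounts`, `increment`, `publish`, and `read` are hypothetical.

```python
import threading
from collections import Counter
from types import MappingProxyType


class BufferedCounts:
    def __init__(self, n_workers: int) -> None:
        # One private buffer per worker; each lock is contended only by that
        # worker and the publisher, never by other workers or by readers.
        self._buffers = [Counter() for _ in range(n_workers)]
        self._locks = [threading.Lock() for _ in range(n_workers)]
        self._snapshot = MappingProxyType({})  # read-only view for readers

    def increment(self, worker: int, key: str, weight: int = 1) -> None:
        """Writer path: touch only this worker's buffer."""
        with self._locks[worker]:
            self._buffers[worker][key] += weight

    def publish(self) -> None:
        """Single publisher: copy, merge drained buffers, swap the snapshot."""
        merged = Counter(dict(self._snapshot))
        for i, lock in enumerate(self._locks):
            with lock:
                drained, self._buffers[i] = self._buffers[i], Counter()
            merged.update(drained)
        # Readers see either the old or the new snapshot, never a partial merge.
        self._snapshot = MappingProxyType(dict(merged))

    def read(self, key: str) -> int:
        """Reader path: no locks, just a lookup in the current snapshot."""
        return self._snapshot.get(key, 0)


counts = BufferedCounts(n_workers=4)


def work(worker: int) -> None:
    for _ in range(1000):
        counts.increment(worker, "pattern")


threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

before = counts.read("pattern")  # buffered writes are not visible yet
counts.publish()
after = counts.read("pattern")
print(before, after)  # 0 4000
```

The trade-off illustrated here is that readers may see slightly stale counts between publications, in exchange for never waiting on writers.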

Consumer Integration

WILLIAM's iterators serve multiple cognitive processes simultaneously:

Consumer | How It Uses WILLIAM
PLN | Prioritizes inference on high-value subgraphs; follows "heavy edges" during backward chaining
Schedulers | Allocates resources based on compression-adjusted priorities
ECAN | Receives compression-driven importance signals for attention allocation
Pattern mining | Uses heavy subpatterns as seeds for deeper structural discovery
Symbolic Heads | Template library creation: mines frequent subgraphs from training text for key-value template stores
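
As one illustration of the scheduler row, the sketch below splits a fixed compute budget across tasks in proportion to their positive compression-gain sums. The proportional rule, the largest-remainder rounding, and the name `allocate` are assumptions chosen for illustration; this card does not specify how schedulers combine compression signals with other priorities.

```python
from typing import Dict


def allocate(budget: int, gain_sums: Dict[str, float]) -> Dict[str, int]:
    """Split `budget` units across tasks proportionally to positive gain."""
    shares = {task: 0 for task in gain_sums}
    positive = {task: g for task, g in gain_sums.items() if g > 0}
    total = sum(positive.values())
    if total == 0 or budget <= 0:
        return shares  # nothing worth funding

    exact = {task: budget * g / total for task, g in positive.items()}
    for task, x in exact.items():
        shares[task] = int(x)  # floor of the exact proportional share

    # Hand out the units lost to rounding, largest fractional part first.
    leftover = budget - sum(shares.values())
    by_remainder = sorted(exact, key=lambda t: (-(exact[t] - int(exact[t])), t))
    for task in by_remainder[:leftover]:
        shares[task] += 1
    return shares


plan = allocate(8, {"a": 6.0, "b": 3.0, "c": -2.0, "d": 1.0})
print(plan)  # {'a': 5, 'b': 2, 'c': 0, 'd': 1}
```

Tasks with non-positive gain receive nothing in this sketch; a real scheduler would likely reserve some budget for exploration.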

Neural Network Acceleration

WILLIAM's integration with neural networks — particularly transformers and predictive coding networks — uses compression metrics to guide computation allocation. Applied to neural internals (especially networks using local learning, which have greater propensity toward compositional representations), WILLIAM finds "heavy-hitter features" to:

  • Prioritize attention heads and tokens: Route computation to where it matters most, guided by compression metrics rather than ad-hoc sparsification rules
  • Trigger on-demand refinement: Expand computation only where uncertainty justifies the cost
  • Prune low-value frequency bands: Remove computation that contributes little to compression, while preserving model capability

The weakness prior provides the theoretical foundation for all three operations — the same quantale-based framework that guides symbolic pattern selection also guides neural sparsification, ensuring a uniform simplicity bias across both domains.
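
The prioritization idea can be sketched as a simple selection rule: keep the smallest set of attention heads whose compression scores cover a target fraction of the total positive score. This is a hypothetical illustration only. The per-head scores are assumed to be supplied by some upstream measurement, and the coverage rule and the name `select_heads` are not taken from the whitepaper.

```python
from typing import Dict, List


def select_heads(scores: Dict[str, float], coverage: float = 0.9) -> List[str]:
    """Return the heaviest heads that together reach `coverage` of total gain."""
    positive = {head: s for head, s in scores.items() if s > 0}
    total = sum(positive.values())
    if total == 0:
        return []

    kept: List[str] = []
    running = 0.0
    # Visit heads from highest to lowest score, breaking ties by name.
    for head, score in sorted(positive.items(), key=lambda kv: (-kv[1], kv[0])):
        kept.append(head)
        running += score
        if running >= coverage * total:
            break
    return sorted(kept)


head_scores = {"h0": 5.0, "h1": 0.5, "h2": 3.0, "h3": -1.0, "h4": 1.5}
active = select_heads(head_scores, coverage=0.75)
print(active)  # ['h0', 'h2']
```

Heads outside the returned set would be candidates for skipping or for cheaper computation, with on-demand refinement able to bring them back when needed.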

Role in the PRIMUS Cognitive Cycle

In the whitepaper's §5.13 integration picture, WILLIAM occupies a specific position:

  1. Weakness-based control selects regions needing attention
  2. WILLIAM identifies compression-worthy patterns in those regions
  3. Fluid-dynamic ECAN routes attention optimally
  4. PLN and ActPC-Chem refine understanding
  5. MetaMo/SubRep evaluate discovered options
  6. TransWeave determines what can be reused

Patterns discovered by WILLIAM feed directly into ActPC-Chem as chemical rules, and WILLIAM-discovered structures can be transferred to new domains via TransWeave.

Key References

  • Goertzel, B. (2025). Hyperon for AGI⇒ASI Whitepaper, §5.12: WILLIAM-on-MORK, §7.6: WILLIAM-Guided Neural Efficiency
  • Franz, A. WILLIAM: Adaptive Compression for AGI
  • Franz, A. WILLIAM on MORK clarifications (Theorem 1: hierarchical prior guarantees)

Related cards: PRIMUS Full · MORK Full · ECAN Full
