WILLIAM Full

Approved by Ursula Addison on 2026-05-08

Responsible: Ben Goertzel, Arthur Franz (original WILLIAM concept)

Papers: Goertzel (2025), Hyperon for AGI⇒ASI Whitepaper, §5.12, §7.6; Franz, A., WILLIAM: Adaptive Compression for AGI

Status: Experimental. Core compression-based pattern detection is under development. MORK trie instrumentation is being implemented. Full integration with PLN scheduling, ECAN attention, and neural acceleration remains a research goal.

This card provides technical depth beyond the concise WILLIAM index card. WILLIAM is an adaptive-compression-based approach to cognitive pattern discovery, now being integrated into MORK's trie infrastructure to serve as a real-time guide for both symbolic reasoning and neural processing.

Core Principle

WILLIAM embodies a fundamental insight: the patterns worth remembering are those that compress experience most effectively. It acts as a cognitive feature detector that continuously asks "what's the simplest explanation that captures what I'm seeing?" The weakness prior provides the theoretical foundation: simpler patterns generalize better, and compression identifies what matters without hard-coded heuristics.

Compression Gain

Each pattern application is scored by the net description-length reduction it achieves, that is, the reduction in description length minus the dictionary cost of the template used:

Formal definition:

\[\text{gain}(r, S) = L(S) - L(S') - C(r)\]

Variables: \(r\) = template/rewrite rule; \(S\) = current state; \(S'\) = state after applying \(r\); \(L(\cdot)\) = description length; \(C(r)\) = dictionary cost of template \(r\)
Domain: \(\text{gain} \in \mathbb{R}\); positive values indicate worthwhile compression
Assumptions: Under the Compositional Description of Data (CoDD) framework, each template application strictly strengthens the hypothesis: weakness decreases monotonically, so MDL gain is also monotone
Meaning: Keep branches where cumulative gain \(G_t = \sum_{i \leq t} \text{gain}(r_i, S_i) > 0\)
Source: Franz, A., WILLIAM on MORK clarifications
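
The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not WILLIAM's implementation: the `Step` record, the function names, and the idea of passing description lengths in as plain numbers are assumptions made for the example, and the branch test here checks only the most recent \(G_t\).

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One template application (hypothetical record for this sketch)."""
    len_before: float   # L(S): description length before applying r
    len_after: float    # L(S'): description length after applying r
    dict_cost: float    # C(r): dictionary cost of the template

def gain(step: Step) -> float:
    """gain(r, S) = L(S) - L(S') - C(r)."""
    return step.len_before - step.len_after - step.dict_cost

def cumulative_gains(steps):
    """Return the running totals G_t along one branch."""
    total, totals = 0.0, []
    for step in steps:
        total += gain(step)
        totals.append(total)
    return totals

def keep_branch(steps) -> bool:
    """Keep the branch if its latest cumulative gain is positive."""
    totals = cumulative_gains(steps)
    return bool(totals) and totals[-1] > 0

branch = [Step(100, 80, 12), Step(80, 70, 15)]
```

For `branch`, the per-step gains are 8 and -5, so the running totals are 8 and 3 and the branch is kept even though its second step lost ground on its own.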

Theoretical Guarantees (Hierarchical Priors)

When data follows a hierarchical generative process with bounded-size reusable templates, reuse probability \(\rho > 0\), and heavy-hitter separability margin \(\gamma\), the Kolmogorov complexity of the input decomposes as:

\[K(x) = \sum_i \ell(f_i) + K(r_s) + O\bigl((\log \ell(x))^2\bigr)\]

Variables: \(K(x)\) = Kolmogorov complexity of input \(x\); \(\ell(f_i)\) = length of template \(f_i\); \(K(r_s)\) = complexity of residual; \(\ell(x)\) = input length
Meaning: WILLIAM-on-MORK with top-\(k\) beam search (where \(k \in \{2, 3\}\) suffices) achieves near-optimal compression in \(O(\log \ell(x))\) steps, each costing \(O(T_{\text{lookup}})\) via MORK trie traversal
Source: Franz, A., WILLIAM on MORK clarifications, Theorem 1
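
To make the top-\(k\) beam search concrete, here is a toy sketch. Everything in it is an assumption chosen for illustration: states are strings, templates are hypothetical `(pattern, symbol)` pairs applied by replace-all, description length is character count, and dictionary cost is the pattern length. It shows only how a beam of width \(k\) is carried forward by cumulative gain, and says nothing about the bounds in Theorem 1.

```python
def apply_rule(rule, state):
    """Replace every occurrence of the rule's pattern with its symbol."""
    pattern, symbol = rule
    return state.replace(pattern, symbol)

def beam_search(state, rules, k=2, max_steps=10):
    """Top-k beam search over rule applications, ranked by cumulative gain."""
    best = (0, state, [])          # (cumulative gain, state, rules used)
    beam = [best]
    for _ in range(max_steps):
        candidates = []
        for total, s, used in beam:
            for rule in rules:
                s2 = apply_rule(rule, s)
                if s2 == s:
                    continue       # rule does not apply to this state
                # Toy stand-ins: L(.) = string length, C(r) = pattern length.
                g = len(s) - len(s2) - len(rule[0])
                candidates.append((total + g, s2, used + [rule]))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:k]      # keep only the k best branches
        if beam[0][0] > best[0]:
            best = beam[0]
    return best

rules = [("ab", "A"), ("cd", "C")]
total, compressed, used = beam_search("ababababcdcdcd", rules, k=2)
```

On this input the first step scores gains of 2 (for `ab`) and 1 (for `cd`). The second step brings both branches to `AAAACCC` with a cumulative gain of 3, after which no rule applies and the search stops.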

MORK Trie Instrumentation

The key implementation step is adding instrumentation directly to MORK's trie nodes. Each node carries:

Field	Purpose
Local occurrence counts	How often this exact node is accessed
Subtree totals	Aggregate weight counters across all descendants
Compression-gain sums	Cumulative compression benefit when this pattern is applied
Top-\(k\) children rankings	Ranked lists of most important children by various metrics

This instrumentation enables weighted iterators that return heavy subpatterns directly from any point in the graph — no global scans required.
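
A minimal sketch of such an instrumented node and a weighted iterator follows. It is not MORK's data structure: the `Node` class, its field names, and the choice to rank children by subtree total are assumptions for illustration. A real implementation would presumably maintain the top-\(k\) rankings incrementally, whereas this sketch recomputes them on demand.

```python
import heapq
from dataclasses import dataclass, field

@dataclass
class Node:
    count: int = 0            # local occurrence count for this exact node
    subtree_total: int = 0    # accesses at this node plus all descendants
    gain_sum: float = 0.0     # cumulative compression gain recorded here
    children: dict = field(default_factory=dict)

    def top_children(self, k):
        """Top-k (symbol, child) pairs ranked by subtree total."""
        return heapq.nlargest(
            k, self.children.items(), key=lambda kv: kv[1].subtree_total
        )

def insert(root, path):
    """Record one access to the pattern spelled out by `path`."""
    node = root
    node.subtree_total += 1
    for symbol in path:
        node = node.children.setdefault(symbol, Node())
        node.subtree_total += 1
    node.count += 1

def heavy_paths(node, k, prefix=()):
    """Yield (pattern, count) pairs, descending only into top-k children."""
    if node.count:
        yield prefix, node.count
    for symbol, child in node.top_children(k):
        yield from heavy_paths(child, k, prefix + (symbol,))

root = Node()
for path, times in [(("a", "b"), 3), (("a", "c"), 1), (("d",), 2)]:
    for _ in range(times):
        insert(root, path)
```

Because each level follows only its `k` heaviest children, the traversal touches a small part of the trie. With `k=1` on the data above it visits only the `a` then `b` branch.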

API Surface

The API remains minimal:

iter_prefix_topk(prefix, k) — returns the \(k\) most important patterns under a given prefix
iter_any_topk(k) — finds globally significant patterns across the entire graph
A validation API that records actual compression gains when patterns are applied, closing the feedback loop between prediction and outcome
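
The call shapes above might be exercised as follows. This is a behavioural sketch only: `ToyPatternIndex` is a flat dictionary that scans all patterns, not the instrumented trie, and `observe`, `record_gain`, and the ranking rule (validated gain first, raw count as tie-breaker) are hypothetical names and choices not taken from the source.

```python
import heapq

class ToyPatternIndex:
    """Flat stand-in for the instrumented trie; illustrates call shapes only."""

    def __init__(self):
        self.count = {}   # pattern (tuple of symbols) -> occurrence count
        self.gain = {}    # pattern -> summed observed compression gain

    def observe(self, pattern):
        """Record one occurrence of a pattern."""
        self.count[pattern] = self.count.get(pattern, 0) + 1

    def record_gain(self, pattern, observed_gain):
        """Hypothetical validation call: log the gain actually achieved."""
        self.count.setdefault(pattern, 0)
        self.gain[pattern] = self.gain.get(pattern, 0.0) + observed_gain

    def _score(self, pattern):
        # Assumed ranking: validated gain first, raw count as tie-breaker.
        return (self.gain.get(pattern, 0.0), self.count[pattern])

    def iter_prefix_topk(self, prefix, k):
        """Iterate over the k best-scoring patterns that start with `prefix`."""
        prefix = tuple(prefix)
        matches = (p for p in self.count if p[:len(prefix)] == prefix)
        return iter(heapq.nlargest(k, matches, key=self._score))

    def iter_any_topk(self, k):
        """Iterate over the k best-scoring patterns overall."""
        return self.iter_prefix_topk((), k)

index = ToyPatternIndex()
for pattern, times in [(("a", "b"), 3), (("a", "c"), 1), (("d",), 2)]:
    for _ in range(times):
        index.observe(pattern)
index.record_gain(("a", "c"), 5.0)   # feedback: this pattern paid off
```

After the feedback call, `("a", "c")` outranks the more frequent `("a", "b")`, which is the kind of correction the validation loop is meant to provide.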

Concurrency Model

The implementation handles concurrent access with per-core write buffers for counts, while readers remain wait-free through snapshot/RCU (Read-Copy-Update) techniques. Pattern discovery therefore does not block the main reasoning pipeline.
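
The general shape of this scheme can be sketched in Python, with the caveat that Python threads cannot demonstrate true wait-freedom or per-core placement. In this sketch, each writer thread stands in for a core and owns a private buffer, a `publish` step merges the buffers into a fresh immutable snapshot, and readers only ever dereference the current snapshot. The class and method names are hypothetical, and the small per-buffer lock is a simplification needed to drain buffers safely here.

```python
import threading
from collections import Counter
from types import MappingProxyType

class _Buffer:
    def __init__(self):
        self.lock = threading.Lock()   # contended only while being drained
        self.counts = Counter()

class BufferedCounts:
    def __init__(self):
        self._local = threading.local()          # one buffer per writer thread
        self._buffers = []                       # every buffer, for the merger
        self._registry_lock = threading.Lock()   # guards registration and publish
        self._snapshot = MappingProxyType({})    # read-only view for readers

    def _my_buffer(self):
        buf = getattr(self._local, "buf", None)
        if buf is None:
            buf = _Buffer()
            self._local.buf = buf
            with self._registry_lock:
                self._buffers.append(buf)
        return buf

    def increment(self, key, n=1):
        """Writer path: touch only the calling thread's own buffer."""
        buf = self._my_buffer()
        with buf.lock:
            buf.counts[key] += n

    def publish(self):
        """Drain all buffers into a new snapshot and swap it in."""
        with self._registry_lock:
            merged = Counter(dict(self._snapshot))
            for buf in self._buffers:
                with buf.lock:
                    drained, buf.counts = buf.counts, Counter()
                merged.update(drained)
            self._snapshot = MappingProxyType(dict(merged))

    def read(self, key):
        """Reader path: no locks, just the current snapshot."""
        return self._snapshot.get(key, 0)
```

Readers see counts as of the last `publish`, so they trade a little staleness for never waiting on writers.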

Consumer Integration

WILLIAM's iterators serve multiple cognitive processes simultaneously:

Consumer	How It Uses WILLIAM
PLN	Prioritizes inference on high-value subgraphs; follows "heavy edges" during backward chaining
Schedulers	Allocate resources based on compression-adjusted priorities
ECAN	Receives compression-driven importance signals for attention allocation
Pattern mining	Uses heavy subpatterns as seeds for deeper structural discovery
Symbolic Heads	Build template libraries by mining frequent subgraphs from training text for key-value template stores
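
As one illustration of the PLN row, following "heavy edges" during backward chaining can be read as a best-first expansion ordered by edge weight. The sketch below is an assumption about how a consumer might use such weights, not PLN's actual control strategy. The graph encoding (`premises_of`, `weight`) and the node budget are hypothetical.

```python
import heapq

def heavy_first_order(premises_of, weight, goal, budget):
    """Visit at most `budget` nodes backward from `goal`, heaviest edge first."""
    frontier = [(0.0, goal)]          # min-heap of (negated edge weight, node)
    seen, order = {goal}, []
    while frontier and len(order) < budget:
        _, node = heapq.heappop(frontier)
        order.append(node)
        for premise in premises_of.get(node, ()):
            if premise not in seen:
                seen.add(premise)
                # Negate so the heaviest edge is popped first.
                heapq.heappush(frontier, (-weight[(node, premise)], premise))
    return order

premises_of = {"goal": ["p", "q"], "p": ["r"], "q": ["s"]}
weight = {("goal", "p"): 0.2, ("goal", "q"): 0.9,
          ("p", "r"): 0.5, ("q", "s"): 0.1}
order = heavy_first_order(premises_of, weight, "goal", budget=3)
```

With a budget of three nodes the search expands `goal`, then `q` (weight 0.9), then `p` (weight 0.2). The light edge to `s` is never followed.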

Neural Network Acceleration

WILLIAM's integration with neural networks, particularly transformers and predictive coding networks, uses compression metrics to guide computation allocation. Applied to neural internals (especially in networks that use local learning, which have a greater propensity toward compositional representations), WILLIAM finds "heavy-hitter features" that are used to:

Prioritize attention heads and tokens: Route computation to where it matters most, guided by compression metrics rather than ad-hoc sparsification rules
Trigger on-demand refinement: Expand computation only where uncertainty justifies the cost
Prune low-value frequency bands: Remove computation that contributes little to compression, while preserving model capability

The weakness prior provides the theoretical foundation for all three operations: the same quantale-based framework that guides symbolic pattern selection also guides neural sparsification, ensuring a uniform simplicity bias across both domains.
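
The first two operations can be pictured with a very small sketch. It assumes that some per-head compression score is already available. How WILLIAM would compute such scores is not specified here, so the numbers, the function names, and the simple threshold rule are all hypothetical.

```python
def head_mask(scores, keep):
    """Return a 0/1 mask that keeps the `keep` highest-scoring heads."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:keep])
    return [1 if i in kept else 0 for i in range(len(scores))]

def should_refine(uncertainty, cost):
    """Trigger extra computation only when uncertainty exceeds its cost."""
    return uncertainty > cost

scores = [0.1, 0.7, 0.3, 0.9]   # made-up per-head compression scores
mask = head_mask(scores, keep=2)
```

Here `mask` keeps heads 1 and 3. The point of the sketch is only that the routing decision is driven by a single ranked score rather than by a per-layer sparsification rule.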

Role in the PRIMUS Cognitive Cycle

In the whitepaper's §5.13 integration picture, WILLIAM occupies a specific position:

Weakness-based control selects regions needing attention
WILLIAM identifies compression-worthy patterns in those regions
Fluid-dynamic ECAN routes attention optimally
PLN and ActPC-Chem refine understanding
MetaMo/SubRep evaluate discovered options
TransWeave determines what can be reused

Patterns discovered by WILLIAM feed directly into ActPC-Chem as chemical rules, and WILLIAM-discovered structures can be transferred to new domains via TransWeave.

Key References

Goertzel, B. (2025). Hyperon for AGI⇒ASI Whitepaper, §5.12: WILLIAM-on-MORK, §7.6: WILLIAM-Guided Neural Efficiency
Franz, A. WILLIAM: Adaptive Compression for AGI
Franz, A. WILLIAM on MORK clarifications (Theorem 1: hierarchical prior guarantees)

Related cards: PRIMUS Full · MORK Full · ECAN Full