Weakness Theory
Responsible: Ben Goertzel
Papers: Goertzel (2025), Weakness Is All You Need: Quantale Weakness as a Unifying Generalized-Occam Principle (v9 draft); Hyperon for AGI⇒ASI Whitepaper (2025), §5.1; Goertzel (2025), PLN AND Confidence Framework (technical note)
Status: Proposed. Theoretical framework described in the 2025 whitepaper and developed at length in the Weakness Theory monograph (v9, ~350 pages, 35 chapters). The PLN-quantale integration is under active theoretical development. System-wide deployment across all cognitive modules remains a research goal.
Weakness theory proposes a single mathematical framework that makes "simplicity" precise and computable across all cognitive processes. Rather than a one-size-fits-all minimum description length metric, it defines a family of Occam's razors — one per algorithmic context — each native to the algebra that algorithm already uses.
Core Idea
Weakness measures how much two items "fail to be distinguished" in a precise quantale-theoretic sense. A quantale is an enriched algebraic structure \((V, \otimes, \bigoplus, \leq)\) that captures how a particular domain combines and compares things. The general weakness of a subgraph \(H\) in an enriched category with measure \(\mu\) is:
\[w(H) = \bigoplus_{(u,v) \in H} \bigl[\mu(u) \otimes \mu(v)\bigr]\]
Intuitively, a high-status relationship is one that joins two valuable entities, and a collection is weak to the extent that it contains high-status (obvious, non-discriminating) relationships. Weakness thus captures the idea of "restricting the least": a weak hypothesis is one that is true in more possible worlds.
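As a minimal sketch of the formula above, the fold below computes \(w(H)\) for a small edge list with the quantale operations passed in as plain functions. The function name `weakness`, the toy measure `mu`, and the two example instances (product with probabilistic sum, and min with max) are illustrative assumptions, not definitions taken from the monograph.

```python
from functools import reduce
from typing import Callable, Dict, Iterable, Tuple

def weakness(
    edges: Iterable[Tuple[str, str]],
    mu: Dict[str, float],
    otimes: Callable[[float, float], float],
    oplus: Callable[[float, float], float],
    zero: float,
) -> float:
    """Fold oplus over mu(u) (x) mu(v) for every edge (u, v) in H."""
    return reduce(oplus, (otimes(mu[u], mu[v]) for u, v in edges), zero)

# Hypothetical measure on three nodes and a two-edge subgraph H.
mu = {"a": 0.5, "b": 0.4, "c": 0.1}
H = [("a", "b"), ("b", "c")]

# Instance 1 (assumed): product as (x), probabilistic sum as (+).
w_prob = weakness(H, mu, lambda x, y: x * y, lambda x, y: x + y - x * y, 0.0)

# Instance 2 (assumed): min as (x), max as (+).
w_minmax = weakness(H, mu, min, max, 0.0)

print(w_prob, w_minmax)
```

Swapping the two operations changes the resulting simplicity measure without changing the fold, which is the sense in which each context gets its own razor.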
By defining weakness within each algorithm's native quantale, one obtains simplicity metrics natural to each context:
- In logic (PLN): prefer weaker rules — those true in more possible worlds — via set/Heyting operations
- In neural networks: prefer smooth, generalizable solutions via commutativity constraints
- In evolutionary search (MOSES): prefer shorter, more transferable programs
- In Schrödinger bridge dynamics: prefer lower-disturbance paths through information space
The key insight from the whitepaper: if one has multiple hypotheses explaining the same data, the "weaker" one — true in more possible worlds — will generalize better.
Weakness and Logical Entropy
A foundational result in the monograph (Theorem 2) connects weakness to logical entropy: for any finite partition \(\Pi\),
\[w(H_\Pi) + h(\Pi) = 1\]
where \(h(\Pi) = 1 - \sum_j p(B_j)^2\) is the logical entropy, i.e. the probability that two independent draws land in different blocks \(B_j\) of \(\Pi\). Maximally weak hypotheses collapse everything into one block (zero logical entropy); maximally strong hypotheses isolate each element (maximum logical entropy). This provides a smooth, interpretable simplicity bias that can be estimated in statistical settings.
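The identity can be checked numerically on a toy distribution. The sketch below assumes that \(H_\Pi\) is the set of ordered pairs lying in the same block and that \(\bigoplus\) is ordinary addition in this setting. This is one reading that makes the stated identity hold, not necessarily the monograph's exact construction. The function names and the example distribution are hypothetical.

```python
from itertools import product

def logical_entropy(mu, blocks):
    """h(Pi) = 1 - sum_j p(B_j)^2, with p(B_j) the total mass of block j."""
    return 1.0 - sum(sum(mu[x] for x in block) ** 2 for block in blocks)

def partition_weakness(mu, blocks):
    """Sum of mu(u) * mu(v) over ordered pairs (u, v) in the same block."""
    return sum(
        mu[u] * mu[v]
        for block in blocks
        for u, v in product(block, repeat=2)
    )

# Hypothetical distribution over three elements.
mu = {"a": 0.5, "b": 0.25, "c": 0.25}

partitions = {
    "one block": [["a", "b", "c"]],
    "two blocks": [["a"], ["b", "c"]],
    "singletons": [["a"], ["b"], ["c"]],
}

for name, blocks in partitions.items():
    w = partition_weakness(mu, blocks)
    h = logical_entropy(mu, blocks)
    print(f"{name}: w={w}, h={h}, w+h={w + h}")
```

The one-block partition is the maximally weak case (\(h = 0\)), and the singleton partition is the maximally strong case.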
Weakness Composition
In the probabilistic quantale \(([0,1], \times, \tilde{+})\), where \(\otimes = \times\) is multiplication and \(\bigoplus = \tilde{+}\) is probabilistic union, the composition of two independent weaknesses follows the inclusion-exclusion formula:
\[w_1 \oplus w_2 = 1 - (1 - w_1)(1 - w_2)\]
This arises because the combined weakness of two independent hypotheses \(H_1, H_2\) is the probability that a random pair falls into either indistinguishability set: \(w(H_1 \cup H_2) = w_1 + w_2 - w_1 w_2\). The formula ensures that composing weak (non-discriminating) components yields a result that is itself weak — weakness composes monotonically under independence. More complex composition rules emerge when working in non-probabilistic quantales (logic quantales, quantum-operator quantales, SVM-margin quantales).
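A short numerical sketch of this composition rule follows. The helper names `compose` and `compose_all` are hypothetical, and independence of the components is assumed throughout.

```python
from functools import reduce

def compose(w1: float, w2: float) -> float:
    """Probabilistic union of two independent weaknesses in [0, 1]."""
    return 1.0 - (1.0 - w1) * (1.0 - w2)

def compose_all(ws):
    """Fold compose over any number of weaknesses; 0.0 is the unit."""
    return reduce(compose, ws, 0.0)

w1, w2 = 0.3, 0.5
combined = compose(w1, w2)

# Same value as the inclusion-exclusion form w1 + w2 - w1 * w2.
print(combined, w1 + w2 - w1 * w2)

# Composition never decreases weakness: the result is at least max(w1, w2).
print(combined >= max(w1, w2))

print(compose_all([0.5, 0.5, 0.5]))
```

Because each factor \((1 - w_i)\) lies in \([0, 1]\), the combined weakness is at least as large as any single component, which is the monotonicity property described above.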
Composability Across Paradigms
Because weaknesses in different algorithms live in different but compatible quantales, the monograph proposes composing them via functorial maps. This would let a planner, learner, or reasoner carry a single simplicity preference across neural, symbolic, and evolutionary submodules without flattening them into one metric.
The PLN-Quantale Framework
The monograph (§3.3) constructs a dedicated quantale for PLN Simple Truth Values. Each PLN statement carries a pair \((n^+, n^-)\) of positive and negative evidence counts, with operations:
\[(n^+, n^-) \oplus (m^+, m^-) = (n^+ + m^+,\; n^- + m^-)\]
\[(n^+, n^-) \otimes (m^+, m^-) = (n^+ m^+,\; n^- m^-)\]
where \(\oplus\) models union of independent micro-situations by adding evidence counts, and \(\otimes\) models independent conjunction by multiplying proportions. A strict monoidal homomorphism connects this to the Constructible Duality Logic (CDL) quantale, demonstrating that CDL and PLN are different truth-value attachments to the same underlying logic quantale of possible-world predicates (§3.4). Both preserve weakness ordering and mutual information under their respective embeddings.
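The two operations on evidence-count pairs can be sketched as a small value type. The class name `Evidence` and its method names are hypothetical. The `strength` helper uses the positive fraction \(n^+ / (n^+ + n^-)\), which is included here as an assumption for illustration and is not taken from §3.3.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    """A pair (n+, n-) of positive and negative evidence counts."""
    pos: float
    neg: float

    def oplus(self, other: "Evidence") -> "Evidence":
        # Union of independent micro-situations: add counts componentwise.
        return Evidence(self.pos + other.pos, self.neg + other.neg)

    def otimes(self, other: "Evidence") -> "Evidence":
        # Independent conjunction: multiply counts componentwise.
        return Evidence(self.pos * other.pos, self.neg * other.neg)

    def strength(self) -> float:
        # Assumed summary statistic: fraction of positive evidence.
        total = self.pos + self.neg
        if total == 0:
            raise ValueError("strength is undefined with no evidence")
        return self.pos / total

a = Evidence(3, 1)
b = Evidence(2, 2)

print(a.oplus(b), a.oplus(b).strength())
print(a.otimes(b), a.otimes(b).strength())
```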
Unified PLN-NARS Perspective and Jeffrey's Rule
The PLN AND Confidence Framework (technical note) demonstrates the unification concretely for the conjunction rule: PLN and NARS differ primarily in their handling of context — PLN maintains explicit term probabilities \(P(A|C)\), while NARS uses direct projection without them. The NARS AND formula can be derived from PLN by setting uniform priors and a specific structural weakness. The framework recommends different computational strategies depending on data availability:
- Few samples (\(n < 10\)): NARS-style simple product with overlap — fast, robust to sparse data
- Moderate data (\(10 \leq n \leq 100\)): Hybrid via Jeffrey's Rule combination — balances distributional information with tolerance for mixed evidence quality
- Sufficient data (\(n > 100\)): PLN with full information-theoretic estimation — precise confidence calibration via entropy and mutual information
This positions NARS and PLN not as competing systems but as endpoints of a data-regime continuum within the same quantale structure, with Jeffrey's Rule (Jeffrey conditionalization) providing the principled interpolation for moderate-data situations where some distributional information is available but sample sizes are limited.
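The regime thresholds above can be expressed as a simple dispatch, shown here with a generic Jeffrey conditionalization step, \(P'(A) = \sum_i P(A \mid E_i)\, P'(E_i)\). The function names and strategy labels are hypothetical, and the sketch does not reproduce how the technical note wires Jeffrey's Rule into the hybrid estimator. It only illustrates the textbook update.

```python
from typing import Sequence

def choose_strategy(n: int) -> str:
    """Map a sample count to the regime suggested by the thresholds above."""
    if n < 0:
        raise ValueError("sample count must be non-negative")
    if n < 10:
        return "nars_product"      # few samples
    if n <= 100:
        return "jeffrey_hybrid"    # moderate data
    return "pln_full"              # sufficient data

def jeffrey_update(cond_probs: Sequence[float], new_weights: Sequence[float]) -> float:
    """P'(A) = sum_i P(A | E_i) * P'(E_i) over a partition {E_i}."""
    if len(cond_probs) != len(new_weights):
        raise ValueError("one conditional probability per partition cell")
    if abs(sum(new_weights) - 1.0) > 1e-9:
        raise ValueError("new partition weights must sum to 1")
    return sum(p * q for p, q in zip(cond_probs, new_weights))

for n in (5, 10, 100, 101):
    print(n, choose_strategy(n))

# Hypothetical numbers: P(A|E1)=0.9, P(A|E2)=0.2, revised P'(E1)=0.7, P'(E2)=0.3.
print(jeffrey_update([0.9, 0.2], [0.7, 0.3]))
```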
Neural Network Applications
For neural networks using predictive coding (monograph Part IV, Ch 27–31), weakness manifests as commutativity and monotonicity constraints: different layer updates should commute (order shouldn't matter much) and compose monotonically (adding layers shouldn't cause wild behavioral changes). The monograph shows how predictive coding can be reformulated as weakness-minimizing learning, with the McBride derivative of weakness enabling differentiable proof search and gradient descent for neural-symbolic hierarchical models.
Relationship to Description Length
Weakness and MDL are inversely coupled via a Kraft-type bound (§4.1): for any prefix-free description scheme,
\[L(h) \geq \log_2 N - \log_2 w(h)\]
In other words, a short description cannot rule out too many worlds: the smaller \(w(h)\), the larger the lower bound on \(L(h)\). In the CoDD (Combinatory Decision DAGs) computational model, this correlation is especially tight, making CoDD a natural language for weakness-oriented program search.
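To make the bound concrete, the sketch below evaluates its right-hand side under one assumed reading: \(N\) is the total number of possible worlds and \(w(h)\) is the number of those worlds in which \(h\) holds. The source does not spell out these definitions at this point, so both the reading and the function name `min_description_bits` are assumptions.

```python
import math

def min_description_bits(n_worlds: int, weakness: int) -> float:
    """Lower bound log2(N) - log2(w(h)) on the description length L(h)."""
    if not 0 < weakness <= n_worlds:
        raise ValueError("expected 0 < w(h) <= N")
    return math.log2(n_worlds) - math.log2(weakness)

N = 1024  # hypothetical number of possible worlds

for w in (1024, 256, 1):
    print(w, min_description_bits(N, w))
```

Under this reading, a hypothesis admitting every world needs no bits, while one that singles out a single world among 1024 needs at least 10.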
Intended Role in PRIMUS
Weakness is designed as one of two architecture-wide formal controls (alongside geodesic effort), intended to govern PLN inference scheduling, MOSES program evolution, WILLIAM pattern selection, TransWeave transfer filters, neural network regularization, and self-modification admission bounds. The monograph (Ch 33) explicitly presents weakness as a unifying bias across all PRIMUS components, and Ch 22 formalizes cognitive synergy itself in weakness-theoretic terms via lax-monoidal weakness functors.
Key References
- Goertzel, B. (2025). Weakness Is All You Need: Quantale Weakness as a Unifying Generalized-Occam Principle for Cognitive Science and AGI (v9 draft, ~350pp).
- Goertzel, B. (2025). Hyperon for AGI⇒ASI Whitepaper, §5.1: Weakness Theory.
- Goertzel, B. (2025). PLN AND Confidence Framework (technical note). §6–7: unified PLN-NARS perspective, Jeffrey's Rule for moderate-data regimes.
- Bennett, M.T. (2025). Set-theoretic weakness as satisfaction-set size (foundational formulation).