close fullscreen

About Hyperon+AtomSpace+AtomSpace Full+Design Evolution and Performance

help edit space_dashboard

Design Evolution: What Was Tried and Why (mailing-list-backed, opencog-ml 2014–2023)

The current AtomSpace design was shaped by a decade of experimentation with alternatives, each abandoned for specific technical reasons:

  • Atoms immutable by design (2014): "Easiest and best way to support multi-threading — making them mutable would require locks and crazy logic in all sorts of obscure places" (Linas Vepstas). Identical atoms deduplicate to single instance automatically. The formal argument (Vepstas 2023) is stronger than convenience: because a metatree may be shared as a subtree of many larger trees, editing any node requires deciding what happens to all containing trees — the only consistent solution is copy-on-write, making immutability necessary, not merely desirable. Immutable metatrees can be traversed lock-free even while other threads create or delete. The mutable form (the top-level master index over immutable subtrees) is the "database" — in OpenCog, this is the AtomSpace. (Provenance: publication, Vepstas — "Graphs, Metagraphs, RAM, CPU" v2.1.1, 2023)
  • IPFS backend abandoned (2019): Code-complete, 6/7 tests passing, but fundamentally unsuitable — centralized index, DHT queries taking minutes, only hundreds of atoms/sec vs. 100K+ in-RAM. IPFS is "surprisingly terrible" for this use case.
  • OpenDHT (Kademlia) abandoned (2020): Hashing atoms across the planet destroys locality of reference. Solution: use DHT for indexes only, serve actual atoms via "seeders" (BitTorrent-style).
  • UUID-based identity rejected (2021): Requires central authority (bottleneck), creates ~30% RAM overhead for lookup tables. Solution: use atom name directly (globally unique, easy to compute).
  • Serialization overhead is the primary bottleneck (2020): Postgres with ZeroMQ/protobuf: ~100 atoms/sec. Neo4j: 95% CPU spent serializing. ASCII file reader: ~100K atoms/sec. Raw in-RAM: 700K nodes/sec. "Converting 12-byte objects into other representations has just a huge overhead." Conclusion: "Placing atoms into a database is pointless and useless" for active reasoning.
  • Fractional indexing at O(1) (2020): AtomSpace maintains per-atom incoming/outgoing sets rather than global indexes. Adding one atom updates O(1) fractional indexes, vs. commercial DBs' O(N log K). Three index entries per binary link. Cost: ~632 bytes/atom in RAM (MOZI dataset: 7M atoms = 4.3 GB) vs. 55 bytes as s-expressions.
  • Natural chunking via recursive incoming sets (2020): "Given atom X, the natural chunk is the entire recursive incoming set of X." Hypergraphs have natural boundaries unlike regular graphs which snowball. This insight eventually informed MORK's ShardZipper partitioning.
  • Automatic alpha-conversion was contentious (2017): Silently renaming bound variables on scope-link insertion caused practical problems for URE and PLN developers. The eventual conclusion: the chainer should do alpha-conversion on the fly, not the AtomSpace on insertion. (Is-automatic-alpha-conversion-evil)
  • Contextual AtomSpaces proposed (2014): An AtomSpace could have an associated context atom, so all contents would implicitly be in that context. This foreshadowed Hyperon's multi-Space architecture but was not implemented in OpenCog Classic. (Contextual Atomspaces)

What survived: immutable atoms, name-based identity, fractional indexes, pattern matcher as core query engine, s-expression serialization, no eventual consistency requirement. AtomSpace Frames (2022) added snapshot changesets for inference context, implemented atop RocksDB.

Production validation: The classical AtomSpace has "been used in production systems, pumping through tens of billions of Atoms in dozens of threads, with run-times extending into weeks, without crashing." (Provenance: official-site, wiki.opencog.org— AtomSpace design notes)

Rejected serialization formats: RESTful APIs, ZeroMQ, Neo4J, Protocol Buffers, and JSON were all evaluated and rejected because "Atoms are tiny, and converting them from native Atomese to other formats is a giant waste of CPU time." RocksDB succeeded by storing bare s-expression strings directly — lossless compression achieves "a few dozen bytes" per atom, making 100M-atom databases only "a few GBytes." (Provenance: official-site, wiki.opencog.org— AtomSpace design notes)

Threading scaling (classical): Thread-safety via C++ std::shared_ptr<> with atomic reference counting, constrained by CPU cache-line availability for hardware atomic locks. Observed scaling: AMD Opteron 12-core achieved only 3× speedup (4 hardware locks); AMD Ryzen 5 3400G achieved 8×; AMD Ryzen 9 3900X achieved only 7× on 24 threads — illustrating the diminishing returns of cache-line contention. (Provenance: official-site, wiki.opencog.org— AtomSpace design notes)

Hyperon scalability targets: Current baseline is ~100 million atoms per live instance and ~1 billion atoms storable via StorageNode (~50GB file), with half-a-dozen networked AtomSpaces via ProxyNode. The Hyperon redesign targets "going beyond these current limits" with static pattern matching using free variables in both queries and knowledge base entries — "substantially different from the current query engine" and enabling "efficient distributed implementation." A key unresolved design question: whether to implement only distributed AtomSpace, only distributed episodic memory (via grounded atoms), or both as separate container types. (Provenance: official-site, wiki.opencog.org— Hyperon:Atomspace design notes)

Performance Architecture (mailing-list-backed)

Performance observations from the classical AtomSpace that informed Hyperon's design:

  • Atom add/remove throughput: ~500K atoms/second in-RAM. The callback-notification mechanism for atom changes was "in the critical performance path" capable of 10-100× slowdown if misused. (Atomspace-dynamics-visualisation, 2014)
  • Pattern Matcher O(N²) on numerical domains: PM works "very slowly" for NumberNode/GreaterThanLink queries because it enumerates all pairs for virtual links. Proposed solutions: SMT-style delegation to specialized solvers, SpaceServer integration for spatial queries, and cover trees for numerical indexing. (Pattern-Matching-performance, 2018)
  • No systematic benchmarks existed: An accidental 5-10× performance regression went unnoticed for months. Linas revealed "I accidentally slowed down the atomspace performance by maybe 5x or 10x and basically, no one noticed." (Performance-benchmarks, 2018)
  • Parallelization: most time in callbacks, not the matcher: "The time actually spent in the pattern matcher tends to be small, compared to the time spent in the callback." Running large independent queries in parallel is more effective than micro-parallelizing the matcher itself. This insight directly influenced MORK's parallel architecture. (Contributing-to-Parallelizing-Pattern-Matcher, 2018)
  • Pattern mining vs. pattern matching: Using the PM for mining (estimated 96,351 hours for 2-gram mining of a 2M-atom Wikipedia corpus) was a fundamental algorithmic mismatch — PM is designed for "find X such that P(X)" while mining requires "find P such that count(P) is large." (A-disappointing-evaluation, 2014)
  • Sparse graph memory bottleneck: A 5.3M node graph with Zipfian edge distribution requires one hash-table or btree access per float multiply during matrix operations. The 4KB paging MMU granularity is a fundamental bottleneck — the insight that motivates MORK's locality-optimized triemap layout. (Perfect-architecture-for-OpenCog, 2018)
  • TinkerPop/Gremlin comparison: A Gremlin traversal equals a single-clause pattern match; AtomSpace's multi-clause queries, typed links, and hypergraph support give it fundamental advantages. Marketing was identified as the core adoption barrier: Neo4j had 466K Google results vs. AtomSpace's 4K. (Graph-Traversal-Machine-Close-Encounters, 2018)