Knowledge Substrates
Scope
All repositories implementing knowledge storage, retrieval, and distribution: the AtomSpace concept and its backends. For theoretical foundations, see AtomSpace Full, MORK Full, and DAS Full.
Active Repositories
| Repo | Language | Role | Maturity | Purpose |
|---|---|---|---|---|
| atomspace | C++ | Classical implementation | Operational | Production-grade in-RAM hypergraph with pattern matching, ~150 atom types, Scheme/Python bindings. Requires cogutil. |
| MORK | Rust (nightly) | High-performance kernel | Operational | Triemap-based engine with ZAM, bidirectional matching, MM2. 7-crate workspace. Requires sibling PathMap checkout. |
| das | C++ + Python | Distributed layer | Operational | Distributed AtomSpace with Redis/MongoDB backends, Attention Broker, and cognitive query agents. Bazel/Docker build. |
| atomspace-storage | C++ | Base storage API | Operational | StorageNode interface (v4.3.0). File, JSON, Prolog, MeTTa, CSV I/O. ProxyNodes for mirroring/caching. Required by all storage backends. |
| atomspace-rocks | C++ | Local persistence | Operational | RocksDB backend (v1.6.1). Single-user, single-host. Two modes: full DAG (RocksStorageNode) and simple (MonoStorageNode). |
| atomspace-cog | C++ | Network distribution | Operational | Network client (v1.2.0) for CogServer. Multi-threaded (4 sockets). Frame support incomplete. |
| atomspace-bridge | C++ | SQL bridge | Operational | Bidirectional PostgreSQL-to-AtomSpace bridge (v0.2.1). Motivated by FlyBase genome database use case. |
| das-metta-parser | C (Flex/Bison) | DAS ingestion | Operational | Parses MeTTa files into MongoDB/Redis for DAS. Docker-based build. |
| das-toolbox | Python | DAS CLI tooling | Operational | das-cli for infrastructure management (containers, OpenFaaS, MeTTa operations). |
| mork_ffi | Rust + C | MORK-Prolog bridge | Operational | ~150 lines exposing MORK to SWI-Prolog: add-atoms, remove-atoms, match, mm2-exec. Used by PeTTa. |
| CZ2 | Scala 3 | Triemap research | Experimental | Prefix-compressed triemap toolkit (v0.2.17). Cross-compiles to JVM/JS/Native. Inspired by Peyton Jones's paper. |
| MM2_Structuring_Code | Rust / MM2 | MM2 tutorial | Experimental | 30+ progressive examples for MORK's MM2 dataflow language. Requires MORK binary. |
| mork-rust-sdk | Rust | MORK API client | Experimental | Rust client SDK for MORK API. iCog Labs. |
| mork-ts-sdk | TypeScript | MORK API client | Operational | TypeScript client SDK for MORK HTTP API. iCog Labs. |
| faiss_ffi | Rust + C | Vector similarity bridge | Operational | FAISS vector similarity FFI for Prolog/MeTTa. Creates atom-indexed vector spaces for similarity-based retrieval. |
| generate | C++ / Guile | Graph generation | Experimental | Constraint-guided network synthesis using sheaf theory and jigsaw-puzzle connector semantics. Generates parse trees, deduction chains, pathways. Requires cogutil + atomspace. Independently maintained. |
| opencog-cycl | Python | KB ingestion | Experimental | CycL-to-Atomese translator mapping OpenCyc knowledge base entries into AtomSpace. Script-based pipeline. Early-stage research. |
| atomese-simd | C++ / OpenCL / CUDA | GPU compute bridge | Experimental | Bridges Atomese symbolic descriptions to GPU/SIMD hardware via sensory-motor agency model. Generates Atomese IDL for GPU kernel introspection. Built on the sensory system. Independently maintained. |
| cogserver | C++ | Network service layer | Independently maintained | Network server providing telnet/WebSocket/HTTP access to AtomSpace. Server half of atomspace-cog. Independently maintained by original OpenCog contributors outside the Hyperon project. |
| atomspace-pgres | C++ | SQL persistence (deprecated) | Legacy / Deprecated | PostgreSQL persistent backend for AtomSpace. Superseded by atomspace-rocks. Independently maintained. |
How They Fit Together
This family has a clear layered architecture with two parallel lineages:
Classical lineage (OpenCog C++): These repos are independently maintained by original OpenCog contributors, separate from the Hyperon project's active development.
cogutil → atomspace → atomspace-storage → atomspace-rocks (local)
                                        → atomspace-cog (network)
                                        → atomspace-bridge (SQL)
Build order matters: cogutil must be installed first, then atomspace, then atomspace-storage, then any specific backend. All use CMake with the same mkdir build && cd build && cmake .. && make -j pattern.
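The build order above can be sketched as a small script. This is a minimal illustration, not a script shipped by any of the repos: it assumes every repo is checked out as a directory under a common ROOT (a hypothetical layout), and the helper names run, build_one, and build_all are made up for the example. It uses cmake -S/-B (CMake 3.13 or later) and make -C in place of the cd steps, which should be equivalent to the documented pattern. DRY_RUN defaults to 1, so the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: build the classical lineage in dependency order.
# Assumption: each repo lives in $ROOT/<repo> (hypothetical layout).
ROOT="${ROOT:-$PWD}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Each entry depends on the ones before it. Swap the last entry for
# atomspace-cog or atomspace-bridge to build a different backend.
BUILD_ORDER="cogutil atomspace atomspace-storage atomspace-rocks"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Configure, build, and install one repo out of tree in <repo>/build.
build_one() {
  dir="$ROOT/$1/build"
  run mkdir -p "$dir" &&
    run cmake -S "$ROOT/$1" -B "$dir" &&
    run make -C "$dir" -j &&
    run sudo make -C "$dir" install &&
    run sudo ldconfig
}

# Walk the list in order and stop at the first failure, since later
# repos cannot build without the earlier ones installed.
build_all() {
  for repo in $BUILD_ORDER; do
    build_one "$repo" || { echo "build failed: $repo" >&2; return 1; }
  done
}

build_all
```

Running it with DRY_RUN=0 executes the same commands instead of printing them. Stopping at the first failure is deliberate: a failed cogutil install would otherwise surface later as a confusing CMake error in atomspace.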
Hyperon lineage (Rust/distributed):
MORK (+ sibling PathMap) → mork_ffi → PeTTa (Prolog compiler)
                         → CZ2 (Scala research prototype)
                         → MM2_Structuring_Code (tutorial)
das → das-metta-parser (ingestion)
    → das-toolbox (CLI management)
    → MORK (planned high-performance backend)
The two lineages are bridged by the Space API abstraction: MeTTa code can target either lineage via named Spaces.
Extensions and ingestion tools: generate extends AtomSpace with constraint-guided graph synthesis (sheaf theory). opencog-cycl converts external knowledge bases (CycL/OpenCyc) into Atomese. atomese-simd extends AtomSpace to GPU hardware via Atomese IDL, building on the sensory system. These are independently maintained research extensions rather than core infrastructure.
Quick Start
# Classical AtomSpace (requires cogutil installed first)
cd atomspace && mkdir build && cd build && cmake .. && make -j
sudo make install && sudo ldconfig
# MORK (requires nightly Rust + sibling PathMap checkout)
cd MORK && cargo +nightly build --workspace --release
cargo +nightly test --workspace
# DAS (Docker-based)
cd das && make build-all # Builds all components
make test-all # Requires running AtomDB services
# MORK FFI for PeTTa
cd mork_ffi && RUSTFLAGS="-C target-cpu=native" cargo build -p mork_ffi --release
./build.sh
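Several of the steps above fail late when a prerequisite is missing, so a pre-flight check can save a long build. The sketch below is illustrative only: it assumes the sibling checkout is a directory literally named PathMap next to MORK, that Rust is managed with rustup, and that the docker CLI is how the DAS build is driven. All function names are hypothetical.

```shell
#!/bin/sh
# Sketch: pre-flight checks for the MORK and DAS quick-start steps.
# Assumptions: sibling directory is named "PathMap"; Rust is installed
# via rustup; DAS builds need the docker CLI on PATH.

# Succeeds if a PathMap directory sits next to the given MORK checkout.
has_sibling_pathmap() {
  [ -d "$1/../PathMap" ]
}

# Succeeds if rustup reports an installed nightly toolchain.
has_nightly_rust() {
  command -v rustup >/dev/null 2>&1 &&
    rustup toolchain list 2>/dev/null | grep -q '^nightly'
}

# Succeeds if the docker CLI is available.
has_docker() {
  command -v docker >/dev/null 2>&1
}

# Print one line per check. Returns non-zero if a MORK prerequisite is
# missing; the Docker line is informational and matters only for DAS.
preflight() {
  status=0
  if has_sibling_pathmap "$1"; then
    echo "ok: PathMap checkout found"
  else
    echo "missing: PathMap checkout next to $1"
    status=1
  fi
  if has_nightly_rust; then
    echo "ok: nightly Rust toolchain"
  else
    echo "missing: nightly Rust toolchain"
    status=1
  fi
  if has_docker; then
    echo "ok: docker (DAS build)"
  else
    echo "note: docker not found (needed only for DAS)"
  fi
  return "$status"
}

preflight "${MORK_DIR:-MORK}" || echo "preflight: resolve the items above before building MORK"
```

Set MORK_DIR to the path of the MORK checkout if it is not in the current directory.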
Living Documentation
Active development decisions for MORK, DAS, and the broader substrate layer are discussed in weekly team calls. These transcripts capture the why behind implementation choices (design trade-offs, performance benchmarks, integration priorities) that the code and commit history alone do not preserve.
- MORKification Weekly: Primary development log for MORK. Covers triemap architecture decisions, ZAM evolution, MM2 dataflow design, PathMap integration, and performance trade-offs.
- Magi Weekly: Broader project coordination touching DAS integration, substrate boundary decisions, and infrastructure planning.
- MeTTa Study Group: Language-level discussions that inform substrate API requirements (Space semantics, type system interactions, and grounding patterns).
For agents and future contributors: these transcripts are the best source for understanding why the current architecture looks the way it does, especially for decisions that predate the current codebase state.
Current State vs. Whitepaper
- MORK as primary substrate (whitepaper §2.3): Operational for local in-RAM processing via PeTTa. The reported 500M+ atom scale was obtained on powerful development hardware with PeTTa/MORK integration.
- DAS + MORK integration (whitepaper §2.5): Under development. DAS handles distributed persistence; MORK handles hot compute. The boundary definition is an active research question.
- Neural Spaces (whitepaper §2.2): Proposed; no implementation exists that wraps DNNs as queryable AtomSpaces.
- ShardZipper (whitepaper): Proposed Merkle-based distributed state management for MORK. Not yet implemented.
- ByteFlow GPU acceleration (whitepaper): Proposed adaptive block packing for dense tensors in MORK. atomese-simd represents an earlier, independent approach to GPU integration via Atomese descriptions, but differs from the ByteFlow vision.
Forks and Mirrors
- MORK forks: trueagi-io/MORK is canonical. A local mirror tracks ngeiswei/MORK (experimental fork).
- atomspace-pgres: Deprecated PostgreSQL backend, superseded by atomspace-rocks. Still exists in the reference collection.
- atomspace-gpu: Experimental OpenCL/CUDA AtomSpace, a related but distinct effort from atomese-simd. Neither is actively developed.
- atomspace mirrors: A local mirror tracks a contributor fork of opencog/atomspace.
Explicitly excluded: Visualization tools (atomspace-viz, atomspace-typescript, atomspace-explorer) are developer debugging aids, not storage substrates. They may warrant a future "Developer Tools" family card.
Recommended Entry Points
- Learning AtomSpace concepts: Start with the classical C++ atomspace README, which has the clearest explanation of the Atom/Value distinction.
- High-performance MeTTa: Use PeTTa with mork_ffi for MORK-backed execution.
- Distributed deployment: Use das with das-toolbox for infrastructure management.
- Learning MM2: Work through MM2_Structuring_Code's progressive examples.
- Triemap research: CZ2 provides a clean Scala 3 implementation without MORK's Rust nightly requirements.
Gaps and Consolidation Opportunities
- No unified Space API test suite: The Space API is conceptual: no conformance tests verify that MORK, DAS, and hyperon-experimental implement the same interface.
- atomspace-cog frame support incomplete: Network-distributed frames don't fully work yet.
- MORK requires sibling PathMap checkout: This external dependency isn't documented in all places and can surprise new developers.
- DAS Bazel build is Docker-only: No native build path is documented for DAS outside Docker containers.
- GPU integration fragmented: atomspace-gpu and atomese-simd represent two different approaches to GPU acceleration; neither is active, and neither aligns with the whitepaper's ByteFlow vision.