Applications+Bioinformatics

Approved by Ursula Addison on 2026-06-10

ai generated

Bioinformatics

Ecosystem

human approved

Authors: Ben Goertzel

Contributors: iCog Labs

Papers: Hyperon for AGI⇒ASI Whitepaper (2025), §9.3

GitHub: agi-bio (genomics/proteomics), biochatter-metta (NL-to-MeTTa queries), pubchem2metta (PubChem conversion), bio-semantic-parser (biological data parsing), bio-data-semantic-parsing (bio data semantic parsing pipeline)

Status: Active pilot. Data ingestion pipelines operational (agi-bio, biochatter-metta, pubchem2metta). End-to-end hypothesis generation on longevity datasets is a near-term milestone.

Biological knowledge is naturally graph-shaped: genes connect to proteins, proteins form pathways, pathways influence phenotypes, and drugs modulate these relationships. Hyperon's graph-native architecture is well suited to four tasks in this setting: combining noisy biological graphs, mining meaningful motifs, running chains of reasoning under uncertainty, and proposing ranked hypotheses with clear rationales.

Data Pipeline

Data flows into AtomSpace through BioSpace adapters. These transform omics matrices into node attributes, protein-protein interactions and pathways into edges, literature triples into assertions with provenance, and clinical outcomes into noisy links with confidence scores. Every ingested item receives a CID, which makes merges auditable and reproducible. Existing tools include:

agi-bio: legacy genomic and proteomic data exploration (C++/Scheme/Python), extended by MOZI.AI as SingularityNET services
biochatter-metta: converts natural language biomedical questions into MeTTa queries against the Human BioAtomspace knowledge graph, using LLMs with a BioCypher schema
pubchem2metta: converts PubChem RDF chemical data into MeTTa format via BioCypher adapters
bio-semantic-parser: iCog Labs tool for extracting structured representations from biological datasets
bio-data-semantic-parsing: iCog Labs pipeline for semantic parsing of biological data into knowledge-graph-compatible formats
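To make the adapter idea concrete, here is a minimal Python sketch of how one protein-protein interaction record might become an edge-like atom with a confidence score, provenance, and a content-derived identifier. Everything in it is an assumption for illustration: the atom layout, the `interacts_with` expression form, the source names, the confidence values, and the merge policy are hypothetical, and the SHA-256 hash is only a stand-in for a real CID. It is not the BioSpace adapter API or the Human BioAtomspace schema.

```python
import hashlib
import json


def content_id(payload: dict) -> str:
    """Stand-in for a CID: SHA-256 over canonical JSON (not a real multihash CID)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ppi_to_atom(protein_a: str, protein_b: str, confidence: float, source: str) -> dict:
    """Turn one protein-protein interaction record into an edge-like atom."""
    # Sort the endpoints so that (A, B) and (B, A) map to the same atom.
    a, b = sorted([protein_a, protein_b])
    payload = {"type": "interacts_with", "args": [a, b]}
    return {
        "cid": content_id(payload),  # identity depends on content only
        "expr": f"(interacts_with (protein {a}) (protein {b}))",  # hypothetical MeTTa-style form
        "confidence": confidence,
        "provenance": [source],
    }


def merge(store: dict, atom: dict) -> None:
    """Merge an atom into the store, keyed by its content ID."""
    existing = store.get(atom["cid"])
    if existing is None:
        store[atom["cid"]] = atom
        return
    # Assumed policy: keep the higher confidence and take the union of sources.
    existing["confidence"] = max(existing["confidence"], atom["confidence"])
    existing["provenance"] = sorted(set(existing["provenance"]) | set(atom["provenance"]))


# Two hypothetical sources report the same interaction in opposite order.
store = {}
merge(store, ppi_to_atom("TP53", "MDM2", 0.9, "db_one"))
merge(store, ppi_to_atom("MDM2", "TP53", 0.7, "db_two"))
```

Because the identifier is derived from content rather than from ingestion order, the two records collapse into a single entry whose provenance lists both sources. That is the property that makes a merge repeatable and easy to audit.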

Proposed Hypothesis Generation Pipeline

The whitepaper describes a pipeline with four stages. Pattern Miner identifies motifs (e.g., "gene A ↔ pathway P ↔ phenotype Y with drug D evidence") and ranks them by I-surprisingness. WILLIAM promotes frequent subgraphs to reusable templates. PLN factor graphs propagate graded truth over ontologies and experimental results. MOSES/GEO-EVO evolves predictive programs. TransWeave would then move mechanism components across cohorts or omics platforms when the matches between them are strong.
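The ranking step can be illustrated with a deliberately simplified score. The Python sketch below treats each record as a set of features and scores a motif by the gap between its observed frequency and the frequency expected if its features were independent. This is a stand-in chosen for illustration, not the Pattern Miner's actual I-surprisingness computation. The function names and the toy records are hypothetical.

```python
from math import prod


def support(records, features):
    """Fraction of records that contain every feature in `features`."""
    hits = sum(1 for r in records if features <= r)
    return hits / len(records)


def surprisingness(records, motif):
    """Gap between observed motif frequency and the independence estimate."""
    observed = support(records, motif)
    expected = prod(support(records, frozenset([f])) for f in motif)
    return abs(observed - expected)


# Toy records: each is the set of features observed together in one sample.
records = [
    frozenset({"gene:A", "pathway:P", "phenotype:Y"}),
    frozenset({"gene:A", "pathway:P", "phenotype:Y"}),
    frozenset({"gene:B", "pathway:P"}),
    frozenset({"gene:B"}),
]
motifs = [
    frozenset({"gene:A", "phenotype:Y"}),
    frozenset({"gene:B", "pathway:P"}),
]

# Most surprising motif first.
ranked = sorted(motifs, key=lambda m: surprisingness(records, m), reverse=True)
```

In this toy data, gene A and phenotype Y co-occur in half the records although independence would predict a quarter, so that motif ranks first. Taking the absolute value means a motif that is rarer than expected also counts as surprising.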

The proposed output would be ranked hypothesis packs: auditable CID bundles containing mechanism graphs, predictors, expected biomarkers, and counter-evidence. Experiment selection would be guided by geodesic f·g control.
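One way to picture a hypothesis pack is as a small immutable record whose fields point at the underlying atoms by CID. The Python sketch below is an assumption about shape only: the class name, field names, placeholder CID strings, and scores are hypothetical, and the whitepaper does not specify how the ranking score is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HypothesisPack:
    """Hypothetical container mirroring the four parts of a hypothesis pack."""
    mechanism_graph: tuple      # CIDs of the atoms forming the mechanism graph
    predictor: str              # CID of the evolved predictive program
    expected_biomarkers: tuple  # biomarkers that should move if the hypothesis holds
    counter_evidence: tuple     # CIDs of atoms that argue against the hypothesis
    score: float                # ranking score; how it is computed is left open


def rank(packs):
    """Order packs by score, best first; ties keep input order (sorted is stable)."""
    return sorted(packs, key=lambda p: p.score, reverse=True)


# Placeholder CIDs and scores, for illustration only.
packs = [
    HypothesisPack(("cid:m1", "cid:m2"), "cid:p1", ("marker_x",), (), 0.4),
    HypothesisPack(("cid:m3",), "cid:p2", ("marker_y",), ("cid:c1",), 0.8),
]
ranked_packs = rank(packs)
```

Keeping counter-evidence in the same record as the mechanism graph means a reviewer sees the case against a hypothesis alongside the case for it, which matches the auditability goal described above.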

Key References

Goertzel, B. (2025). Hyperon for AGI⇒ASI Whitepaper, §9.3: Bioinformatics

