
Frame Codex: A Public Digital Garden for AI

November 15, 2024 · 7 min read · By Frame Team

Frame Codex started as a simple idea: what if we treated knowledge the same way developers treat code? Version-controlled, peer-reviewed, openly accessible, and structured for machines to consume.

The Problem with Traditional CMSs

Most knowledge bases are built on CMSs designed for human readers. They have rich UIs, WYSIWYG editors, and complex databases. But when you want to feed that knowledge to an LLM or build a RAG system, you hit walls:

  • Locked in databases – Content lives in MySQL/Postgres, not accessible via simple HTTP
  • No version control – Changes aren't tracked, no git history, no diffs
  • Opaque structure – Relationships between content are implicit, not machine-readable
  • Heavy infrastructure – Requires servers, auth, APIs just to read static content

Enter the Digital Garden

The "digital garden" movement treats content as living documents that grow and evolve over time. Instead of polished blog posts frozen in time, you cultivate interconnected notes that improve continuously.

Frame Codex takes this concept and adds structure for AI consumption:

1. Git-Native

Every piece of content is a markdown file in a GitHub repo. This gives us:

  • Full version history (who changed what, when, why)
  • Pull request workflow (peer review before merge)
  • Branch-based experimentation
  • Distributed collaboration (fork, modify, PR)

2. Recursive Hierarchy

We organize content into three nested tiers, Weaves → Looms → Strands, which together form the Fabric:

  • Strand: Atomic knowledge unit (one .md file)
  • Loom: Curated collection (folder + loom.yaml manifest)
  • Weave: Complete universe (top-level folder + weave.yaml)
  • Fabric: The entire graph when all weaves are materialized together

This isn't arbitrary. The three-tier structure maps naturally onto graph databases and enables graph algorithms over the content (more on this in our next post).
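To make the hierarchy concrete, here is a minimal Python sketch that walks a folder tree and collects weaves, looms, and strands. The class names, the `load_fabric` function, and the rule that any folder below a weave containing a loom.yaml counts as a loom are assumptions made for illustration, not the Codex tooling itself.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Loom:
    name: str
    strands: list[str] = field(default_factory=list)  # strand file names


@dataclass
class Weave:
    name: str
    looms: list[Loom] = field(default_factory=list)


def load_fabric(root: Path) -> list[Weave]:
    """Walk `root` and collect every weave, loom, and strand found (hypothetical helper)."""
    weaves = []
    for weave_dir in sorted(root.iterdir()):
        # A weave is a top-level folder that carries a weave.yaml manifest.
        if not (weave_dir.is_dir() and (weave_dir / "weave.yaml").is_file()):
            continue
        weave = Weave(name=weave_dir.name)
        # Assumption: any folder below the weave with a loom.yaml is a loom.
        for manifest in sorted(weave_dir.rglob("loom.yaml")):
            loom_dir = manifest.parent
            strands = sorted(f.name for f in loom_dir.glob("*.md"))
            weave.looms.append(Loom(name=loom_dir.name, strands=strands))
        weaves.append(weave)
    return weaves
```

The returned list of weaves plays the role of the fabric: every weave, with its looms and strands, materialized in one structure.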

3. Machine-Readable Metadata

Every strand has YAML frontmatter with rich metadata:

---
id: uuid-here
slug: intro-to-recursion
title: "Introduction to Recursion"
difficulty: intermediate
taxonomy:
  subjects: [technology, knowledge]
  topics: [algorithms, computer-science]
tags: [recursion, algorithms, tutorial]
relationships:
  references: [other-strand-id]
  prerequisites: [basics-strand-id]
---

LLMs can parse this instantly. No API calls, no database queries, just raw text.
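As a rough illustration of how little machinery this needs, the sketch below splits a strand into frontmatter and body using only the Python standard library. Both function names are hypothetical, and the field parser is deliberately limited: it reads top-level `key: value` pairs and inline `[a, b]` lists and skips nested blocks such as `taxonomy`, which a full YAML parser would be needed for.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body); frontmatter is '' when none is present."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text  # no closing delimiter: treat everything as body


def parse_simple_fields(frontmatter: str) -> dict[str, object]:
    """Parse top-level scalars and inline lists only; indented lines are skipped."""
    fields: dict[str, object] = {}
    for line in frontmatter.splitlines():
        if not line.strip() or line[0].isspace() or ":" not in line:
            continue
        key, _, raw = line.partition(":")
        raw = raw.strip()
        if raw.startswith("[") and raw.endswith("]"):
            # Inline list such as: tags: [recursion, algorithms]
            fields[key.strip()] = [v.strip() for v in raw[1:-1].split(",") if v.strip()]
        elif raw:
            # Plain scalar; drop surrounding quotes if present.
            fields[key.strip()] = raw.strip("\"'")
    return fields
```

Run against the example strand above, this would yield the `slug`, `title`, `difficulty`, and `tags` fields, while `taxonomy` and `relationships` would be left for a real YAML parser.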

Why OpenStrand Loves This

OpenStrand is our personal knowledge management system. It ingests files of any type (PDFs, videos, code, images) and serializes them to markdown strands.

Frame Codex is the public subset of that knowledge. When you create a strand in OpenStrand, you can choose to:

  1. Keep it private (local SQLite/PGlite)
  2. Share with team (self-hosted PostgreSQL)
  3. Publish to Frame Codex (open PR to this repo)

Same schema, same tooling, zero friction. Your personal notes can become humanity's knowledge with one click.

SQL Caching: The Secret Sauce

We index thousands of markdown files on every commit. Naively, this would take 30+ seconds. We brought it down to 2-5 seconds using SQL-cached incremental indexing:

  • Store SHA-256 hash of each file in .cache/codex.db
  • On next run, compute diff: only re-process changed files
  • Cache persists in GitHub Actions via actions/cache
  • 85-95% cache hit rate on typical PRs
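
The hash-and-diff steps above can be sketched in a few lines of Python using only `hashlib` and `sqlite3`. The table name `file_hashes`, its columns, and the `changed_files` function are assumptions for illustration; the real schema of .cache/codex.db may differ.

```python
import hashlib
import sqlite3
from pathlib import Path


def changed_files(root: Path, db_path: Path) -> list[Path]:
    """Return markdown files whose SHA-256 differs from the cached value."""
    db_path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(db_path)
    # Hypothetical schema: one row per file, keyed by its path relative to root.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_hashes "
        "(path TEXT PRIMARY KEY, sha256 TEXT NOT NULL)"
    )
    cached = dict(conn.execute("SELECT path, sha256 FROM file_hashes"))

    changed = []
    for md in sorted(root.rglob("*.md")):
        rel = md.relative_to(root).as_posix()
        digest = hashlib.sha256(md.read_bytes()).hexdigest()
        if cached.get(rel) != digest:
            changed.append(md)
            # Upsert so the next run sees this file as unchanged.
            conn.execute(
                "INSERT OR REPLACE INTO file_hashes (path, sha256) VALUES (?, ?)",
                (rel, digest),
            )
    conn.commit()
    conn.close()
    return changed
```

Only the files returned here would need re-processing. This sketch does not handle deleted files; a fuller version would also drop cache rows whose paths no longer exist.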

Read the full technical breakdown in our SQL caching post.

Static NLP Pipeline

We auto-categorize content using TF-IDF, n-gram extraction, and vocabulary matching. No LLM calls, no API keys, runs in CI for free:

  • TF-IDF: Extract important keywords per document
  • N-grams: Find common phrases (2-3 word sequences)
  • Vocabulary matching: Map to controlled taxonomy
  • Readability scoring: Flesch-Kincaid grade level

Output: codex-index.json (searchable) and codex-report.json (analytics).
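
For a sense of what the keyword step involves, here is a minimal standard-library sketch of TF-IDF ranking plus bigram extraction. The function names, the tokenizer, and the exact weighting (term frequency times the log of inverse document frequency) are assumptions for illustration; the actual pipeline may tokenize and weight terms differently.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def top_keywords(docs: dict[str, str], k: int = 3) -> dict[str, list[str]]:
    """Rank each document's terms by TF-IDF and keep the top k."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokens.values() for term in set(toks))
    n_docs = len(docs)
    result = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        scores = {
            term: (count / len(toks)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by score descending, then alphabetically for stable output.
        result[name] = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return result


def bigrams(tokens: list[str]) -> list[str]:
    """Return every adjacent two-word sequence."""
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]
```

Terms that appear in every document score zero under this weighting, which is why corpus-wide filler words drop out of the keyword lists without a stop-word table.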

Try It Yourself

Frame Codex is fully open source.

Next in Series

In our next post, we'll explore the mathematical elegance of recursive knowledge structures and how they enable powerful graph algorithms.

Read: Recursive Knowledge Schema →