
Problem
Most AI workflows are siloed. Notes, prompts, and useful history end up spread across tools and become hard to search, reuse, or carry forward.
Context / users
Phase 1 is architecture-first; there is no end-user surface yet. The goal is to make context portable instead of trapping it inside one chat product.
My role
I defined the Phase 1 framing, storage model, retrieval shape, security boundaries, and the portfolio case-study translation of that design.
Solution
The Phase 1 model is straightforward: capture raw text, generate embeddings and metadata, store both in Supabase, expose retrieval through MCP, and keep the memory layer model-agnostic.
- Durable thought records that combine raw text, embeddings, metadata, and timestamps in one storage model
- Vector-based semantic retrieval designed around a `vector(1536)` embedding column
- Database-side search function for thresholded similarity matching and metadata-aware filtering
- Security model based on row-level security, server-side secrets, and a separate access key for the MCP endpoint
- Remote MCP connection model so the same memory layer can be used from ChatGPT, Claude Desktop, Claude Code, and other compatible clients
- Explicit separation between capture, storage, retrieval, and client interface layers
Architecture
The core is a `thoughts` table with content, metadata, timestamps, and a `vector(1536)` embedding. Retrieval happens through a database-side similarity function. MCP sits in front of that as the tool interface.
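The behavior of the database-side similarity function can be illustrated in Python terms: score rows by cosine similarity against a query embedding, keep only rows above a threshold, and return the top matches. This is a sketch of the semantics, not the actual `match_thoughts` SQL:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def match_thoughts(query: list[float], rows: list[dict],
                   threshold: float = 0.7, limit: int = 5) -> list[dict]:
    """Thresholded similarity matching, mirroring the database-side function."""
    scored = [
        {**row, "similarity": cosine_similarity(query, row["embedding"])}
        for row in rows
    ]
    # Drop weak matches, then rank the rest best-first.
    matches = [r for r in scored if r["similarity"] >= threshold]
    matches.sort(key=lambda r: r["similarity"], reverse=True)
    return matches[:limit]
```

Keeping this logic in a Postgres function means ranking happens next to the data and only the top matches cross the wire to the MCP layer.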
Engineering Details
- Supabase / Postgres is the Phase 1 storage layer because it can hold raw text, JSON metadata, timestamps, and vector embeddings in one system
- pgvector enables semantic retrieval, with a `vector(1536)` embedding field sized to match common 1536-dimension embedding models
- The schema uses dedicated indexes for the main access paths: vector similarity (e.g. an IVFFlat or HNSW index from pgvector), JSON metadata filtering (e.g. a GIN index on the `jsonb` column), and recent-item browsing (e.g. a b-tree index on the timestamp)
- A database-side `match_thoughts` function keeps similarity search close to the data instead of pushing ranking logic entirely into the application layer
- The security posture is server-oriented: row-level security is enabled, privileged operations are intended to run through a secret-bearing server path, and the MCP layer adds a separate access-key boundary
- The MCP surface is a cleaner abstraction than wiring each AI client directly to the database, because it exposes task-shaped tools instead of raw tables
- Metadata extraction is treated as a secondary layer; retrieval is driven primarily by embeddings, not by perfect classification
- Cold starts and remote-connector ergonomics are real operational concerns at this layer, so latency and connection handling belong in the technical story
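The "task-shaped tools" point can be made concrete: instead of exposing tables, the MCP layer exposes a small set of named operations that clients call. The tool names and registry below are illustrative, not the actual λcerebro MCP surface; real retrieval would rank by embedding similarity rather than substring match.

```python
# Minimal sketch of an MCP-style tool registry, assuming task-shaped
# tools ("remember", "recall") rather than raw table access.
TOOLS: dict = {}

def tool(name: str):
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

MEMORY: list[dict] = []

@tool("remember")
def remember(content: str) -> dict:
    MEMORY.append({"content": content})
    return {"stored": True, "count": len(MEMORY)}

@tool("recall")
def recall(query: str) -> list[dict]:
    # Substring match keeps the sketch self-contained;
    # the real system would call the similarity function.
    return [m for m in MEMORY if query.lower() in m["content"].lower()]

def call_tool(name: str, **kwargs):
    """Dispatch a named tool call, as an MCP server would."""
    return TOOLS[name](**kwargs)
```

Because clients only see tool names and arguments, the database schema, secrets, and ranking logic all stay behind the server boundary.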
Stack
AI & Machine Learning
- OpenRouter - embedding generation and lightweight metadata extraction
- Model-agnostic memory design - retrieval layer is intended to outlive any single model provider
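One way to read "model-agnostic" concretely: the indexing and retrieval code depends on an embedding interface, not a specific vendor. A minimal sketch, with `FakeProvider` standing in for any real backend (the names here are illustrative, not λcerebro's actual code):

```python
from typing import Protocol

class EmbeddingProvider(Protocol):
    """Any provider (OpenRouter-backed or otherwise) matching this
    shape can back the memory layer."""
    def embed(self, text: str) -> list[float]: ...

class FakeProvider:
    # Deterministic stand-in so the sketch runs without network access.
    def embed(self, text: str) -> list[float]:
        return [len(text) % 10 / 10.0] * 4

def index_texts(provider: EmbeddingProvider, texts: list[str]) -> list[dict]:
    # The indexing code never names a concrete model or vendor,
    # which is what lets the layer outlive any single provider.
    return [{"content": t, "embedding": provider.embed(t)} for t in texts]
```

Swapping providers then means swapping one implementation of `embed`, with the stored schema and retrieval path unchanged (as long as the vector dimension stays consistent).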
Outcome
- The case study now describes a real Phase 1 system shape instead of only a broad memory thesis
- λcerebro reads more like a storage/retrieval architecture project than a vague AI idea
- The portfolio entry better communicates database design, retrieval design, security boundaries, and MCP delivery together
- Work-in-progress status remains explicit, which keeps the case study credible while still making the technical direction legible
Tradeoffs / Limits
- There is still no dedicated λcerebro repo or live MCP demo linked from this page
- This portfolio repo does not yet expose inspectable λcerebro-specific runtime code for the database schema, Edge Function, or deployment flow
- Claims should stay framed as Phase 1 architecture and implementation direction unless backed by a dedicated codebase or public demo
- Metadata extraction is inherently best-effort, so retrieval quality should not be described as depending on perfect tagging
- Remote Edge Functions can introduce cold-start latency, so speed and responsiveness should be described as operational tradeoffs rather than assumed strengths
Why It Matters
It treats AI memory as infrastructure, not a UI feature.
What I'd improve next
- Ship a dedicated λcerebro runtime repo or public demo so the case study can point to inspectable code
- Add bulk-import flows for existing notes, conversations, or external systems
- Add additional capture sources beyond direct AI-tool usage
- Document observability, latency, and error-handling decisions once the runtime is exposed
- Introduce a clearer auth / multi-user model if the system expands beyond single-user memory
Like what you see?
Feel free to reach out if you have questions about this project or want to chat about working together.