blog: add metadata search, schema workflows, and agent knowledge architecture posts

bm-clawd · bm-clawd · commit be041356227b · 2026-03-03T19:20:08.000-06:00
diff --git a/content/9.blog/5.metadata-search.md b/content/9.blog/5.metadata-search.md
@@ -0,0 +1,151 @@
+---
+title: "Stop Grepping Your Frontmatter"
+description: "Search notes by status, priority, tags, or any custom field — structured queries instead of text-string guessing."
+---
+
+You've been searching for notes like this:
+
+```
+search_notes("status: active")
+```
+
+And sometimes it works. And sometimes it returns notes that mention the word "active" in a completely different context. And sometimes it misses notes where the frontmatter says `status: in-progress` because you searched for "active" and those are different strings.
+
+This is text search doing a job that structured queries should be doing.
+
+---
+
+## Frontmatter Is Data. Search It Like Data.
+
+Every Basic Memory note has YAML frontmatter — the metadata block at the top of the file:
+
+```yaml
+---
+title: API Refactor Plan
+type: task
+status: active
+priority: high
+tags: [backend, q1-2026]
+assigned: paul
+---
+```
+
+With v0.19.0, you can search these fields directly:
+
+```
+search_notes(metadata_filters={"status": "active"})
+```
+
+That returns every note where `status` is literally `active` in the frontmatter. Not notes that happen to contain the word "active" somewhere in their body. Not fuzzy matches. Exact structured queries on structured data.
+
+## Building Up
+
+Start simple — find all active tasks:
+
+```
+search_notes(metadata_filters={"status": "active"})
+```
+
+Filter by multiple fields — active tasks that are high priority:
+
+```
+search_notes(metadata_filters={
+    "status": "active",
+    "priority": "high"
+})
+```
+
+Use operators — find anything high or critical:
+
+```
+search_notes(metadata_filters={
+    "priority": {"$in": ["high", "critical"]}
+})
+```
+
+Combine with text search — active tasks about authentication:
+
+```
+search_notes(
+    query="authentication",
+    metadata_filters={"status": "active"}
+)
+```
+
+Search by tags using the shorthand:
+
+```
+search_notes(tags=["backend", "q1-2026"])
+```
+
+Or the tag syntax in the query itself:
+
+```
+search_notes("tag:backend AND tag:security")
+```
+
+## The Operators
+
+Beyond simple equality, you get range and set operations:
+
+```
+# Notes created after a date
+search_notes(metadata_filters={
+    "created": {"$gte": "2026-02-01"}
+})
+
+# Priority in a specific set
+search_notes(metadata_filters={
+    "priority": {"$in": ["high", "critical"]}
+})
+
+# Sprint number between 10 and 15
+search_notes(metadata_filters={
+    "sprint": {"$between": [10, 15]}
+})
+
+# Notes with a specific tag (array contains)
+search_notes(metadata_filters={
+    "tags": "security"
+})
+```
+
+Available operators: `$in`, `$gt`, `$gte`, `$lt`, `$lte`, `$between`. They work on strings, numbers, and dates.
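+
+To make the semantics concrete, here is a minimal Python sketch of how filters like these could be evaluated against a note's frontmatter. It is an illustration, not Basic Memory's implementation: `check` and `matches` are hypothetical names, `$between` is assumed to be inclusive on both ends, and dates are assumed to be ISO-formatted strings so that they compare correctly as text.
+
+```python
+import operator
+from typing import Any
+
+# Hypothetical comparison table; the operator names come from the list above.
+COMPARATORS = {
+    "$gt": operator.gt,
+    "$gte": operator.ge,
+    "$lt": operator.lt,
+    "$lte": operator.le,
+}
+
+
+def check(value: Any, condition: Any) -> bool:
+    """Evaluate one filter condition against one frontmatter value."""
+    if not isinstance(condition, dict):
+        # Plain value: equality, or "contains" when the field is a list.
+        return condition in value if isinstance(value, list) else value == condition
+    for op, arg in condition.items():
+        if op == "$in":
+            ok = value in arg
+        elif op == "$between":
+            low, high = arg
+            ok = low <= value <= high  # assumed inclusive on both ends
+        elif op in COMPARATORS:
+            ok = COMPARATORS[op](value, arg)
+        else:
+            raise ValueError(f"unknown operator: {op}")
+        if not ok:
+            return False
+    return True
+
+
+def matches(frontmatter: dict[str, Any], filters: dict[str, Any]) -> bool:
+    """A note matches only if every filtered field exists and passes."""
+    return all(
+        field in frontmatter and check(frontmatter[field], condition)
+        for field, condition in filters.items()
+    )
+
+
+note = {"status": "active", "priority": "high", "sprint": 12,
+        "tags": ["backend", "security"], "created": "2026-02-10"}
+print(matches(note, {"status": "active", "sprint": {"$between": [10, 15]}}))  # True
+print(matches(note, {"tags": "security", "created": {"$gte": "2026-02-01"}}))  # True
+print(matches(note, {"priority": {"$in": ["low"]}}))  # False
+```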
+
+## Why This Matters
+
+Metadata search turns your knowledge base into something closer to a database — without giving up the plain-text format that makes it readable and editable.
+
+Your notes are still markdown files. You can open them in any editor. But when your AI agent needs to find "all active high-priority tasks tagged backend," it doesn't have to guess at text patterns. It queries the frontmatter directly and gets precise results.
+
+This is especially powerful combined with the [schema system](/blog/schema-workflows). Define what fields a task note should have, validate that they're consistent, then query them with confidence. The schema ensures the data is there. Metadata search makes it findable.
+
+## The Practical Pattern
+
+Here's how this changes daily workflow:
+
+**Morning standup:** "What tasks in the current sprint are active and high priority?"
+```
+search_notes(metadata_filters={"status": "active", "priority": "high"}, tags=["sprint-current"])
+```
+
+**Weekly review:** "What did we complete this week?"
+```
+search_notes(metadata_filters={"status": "done", "completed": {"$gte": "2026-02-24"}})
+```
+
+**Project scoping:** "Show me all notes tagged for the API refactor."
+```
+search_notes(tags=["api-refactor"])
+```
+
+No custom tooling. No project management SaaS. Just frontmatter in markdown files, queried through MCP tools that any AI assistant can call.
+
+---
+
+[Metadata search guide →](/concepts/semantic-search)
+[Schema system →](/concepts/schema-system)
+
+---
+
+*Basic Memory is local-first AI knowledge infrastructure. Plain text files, structured when you need it, searchable always. [Get started →](https://basicmemory.com)*
diff --git a/content/9.blog/6.schema-workflows.md b/content/9.blog/6.schema-workflows.md
@@ -0,0 +1,182 @@
+---
+title: "Your Knowledge Base Grew. Here's How to Keep It Honest."
+description: "Using Basic Memory's schema system to bring consistency to organic knowledge bases — without migration scripts or upfront planning."
+---
+
+You started with a few notes. Then a few dozen. Now you have hundreds, and you've noticed something: your AI writes "person" notes three different ways.
+
+Some have a `[name]` observation. Some use `[full_name]`. A few skip the name entirely and put it in the title. Relations are worse — `works_at`, `employed_by`, `employer`, `company` — four ways to say the same thing.
+
+This is normal. Knowledge bases grow organically. You don't sit down on day one and design a schema for every note type you'll ever write. You write what you need, when you need it, and structure emerges from use.
+
+The problem is that inconsistency compounds. Your AI can't reliably query "all people and their employers" if the relation type varies across notes. Metadata search returns incomplete results when some notes use `status: active` and others use `state: in-progress`.
+
+The schema system fixes this — not by forcing structure upfront, but by observing what you've already written and codifying the patterns.
+
+---
+
+## Step 1: Discover What You Have
+
+Start by asking your AI to look at what exists:
+
+```
+"I've been writing a lot of person notes. Can you figure out
+what they have in common?"
+```
+
+Behind the scenes, the AI runs `schema_infer` on your person notes. It analyzes observation categories, relation types, and frontmatter fields across every note with `type: person`. Then it proposes a schema:
+
+```
+Done! I analyzed 45 person notes. Here's what I found:
+
+Required (in 90%+ of notes):
+  - name: string — full name
+  - role: string — job title or function
+
+Optional (in 25-89%):
+  - works_at: Organization — employer
+  - expertise: string[] — areas of knowledge
+  - email: string — contact email
+
+Common relations:
+  - works_at → Organization (37 notes)
+  - collaborates_with → Person (12 notes)
+
+Want me to create this as a schema?
+```
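+
+The 90% and 25% cutoffs in that report suggest a simple frequency count. Here is a rough Python sketch of that idea, assuming each note has already been reduced to the set of fields it uses. `infer_fields` and its threshold parameters are illustrative names, not the actual `schema_infer` logic.
+
+```python
+from collections import Counter
+
+
+def infer_fields(notes, required_at=0.90, optional_at=0.25):
+    """Classify fields by the share of notes that use them."""
+    total = len(notes)
+    if total == 0:
+        return {"required": [], "optional": []}
+    usage = Counter(field for note in notes for field in set(note))
+    result = {"required": [], "optional": []}
+    for field, count in sorted(usage.items()):
+        share = count / total
+        if share >= required_at:
+            result["required"].append(field)
+        elif share >= optional_at:
+            result["optional"].append(field)
+        # Fields below optional_at are treated as noise and left out.
+    return result
+
+
+# Ten toy notes: all have name, nine have role, three have email, one has nickname.
+notes = [{"name", "role"} for _ in range(9)] + [{"name"}]
+for note in notes[:3]:
+    note.add("email")
+notes[0].add("nickname")
+
+print(infer_fields(notes))
+# {'required': ['name', 'role'], 'optional': ['email']}
+```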
+
+## Step 2: Create the Schema
+
+Say yes. The AI creates a schema note — which is just a regular markdown file:
+
+```yaml
+---
+title: Person
+type: schema
+entity: person
+version: 1
+schema:
+  name: string, full name
+  role: string, job title
+  works_at?: Organization, employer
+  expertise?(array): string, areas of knowledge
+  email?: string, contact email
+settings:
+  validation: warn
+---
+```
+
+This lives in your knowledge base alongside everything else. Open it in VS Code, Obsidian, any editor. It's a note about how notes should look.
+
+The `?` marks optional fields. `validation: warn` means the schema advises but doesn't block. Your AI reads this schema when creating new person notes and follows it as a guide.
+
+## Step 3: Validate What Exists
+
+Now check your existing notes against the schema:
+
+```
+"How do my existing person notes look against that schema?"
+```
+
+The AI runs `schema_validate`:
+
+```
+42 of 45 person notes pass validation.
+
+3 issues found:
+  - "John Smith" — missing [name] observation (name is in title only)
+  - "Dr. Sarah Chen" — uses [full_name] instead of [name]
+  - "Bob (contractor)" — missing [role] observation
+
+Want me to fix these?
+```
+
+No migration script. No database alteration. The AI reads each note, identifies what's missing or inconsistent, and offers to edit them. You approve or adjust.
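+
+To show what such a check involves, here is a small Python sketch. It assumes the field syntax from the schema above (a trailing `?` for optional, `(array)` for lists) and that each note's observation categories have already been extracted. `parse_field` and `validate` are hypothetical helpers, not the real `schema_validate`.
+
+```python
+def parse_field(key: str) -> tuple[str, bool]:
+    """Split a schema key like 'expertise?(array)' into (name, required)."""
+    name = key.split("(", 1)[0]  # drop a trailing "(array)" marker
+    required = not name.endswith("?")
+    return name.rstrip("?"), required
+
+
+def validate(title: str, observations: set[str], schema: dict[str, str]) -> list[str]:
+    """Return one warning per required field the note does not provide."""
+    warnings = []
+    for key in schema:
+        name, required = parse_field(key)
+        if required and name not in observations:
+            warnings.append(f'"{title}" is missing [{name}] observation')
+    return warnings
+
+
+# Toy schema for illustration: name and role required, the rest optional.
+schema = {
+    "name": "string, full name",
+    "role": "string, job title",
+    "works_at?": "Organization, employer",
+    "expertise?(array)": "string, areas of knowledge",
+}
+
+print(validate("Bob (contractor)", {"name", "works_at"}, schema))
+# ['"Bob (contractor)" is missing [role] observation']
+print(validate("Ada Lovelace", {"name", "role"}, schema))
+# []
+```
+
+Because the schema sets `validation: warn`, a result like this is advice for you to act on, not a hard failure.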
+
+## Step 4: Detect Drift Over Time
+
+A month later, run `schema_diff`:
+
+```
+"Has my person note format drifted from the schema?"
+```
+
+```
+Drift detected in Person schema:
+
+New fields found in notes (not in schema):
+  - linkedin_url (appeared in 8 notes)
+  - department (appeared in 15 notes)
+
+Declared fields with low usage:
+  - email (only 12% of notes; already optional, consider removing)
+
+Relation type variations:
+  - "employed_by" used 3 times (schema says "works_at")
+```
+
+Now you decide: add `department` to the schema because it's useful? Normalize `employed_by` back to `works_at`? Drop `email` because nobody uses it? The schema evolves with your knowledge base, not against it.
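+
+The first two sections of that drift report boil down to set and frequency comparisons between the schema and the notes. Here is a hedged Python sketch, with `schema_drift` and the 25% low-usage cutoff chosen for illustration and not taken from `schema_diff` itself:
+
+```python
+from collections import Counter
+
+
+def schema_drift(declared, notes, low_usage=0.25):
+    """Compare declared schema fields with the fields notes actually use."""
+    usage = Counter(field for note in notes for field in set(note))
+    total = len(notes)
+    # Fields that show up in notes but were never declared in the schema.
+    undeclared = {f: n for f, n in usage.items() if f not in declared}
+    # Declared fields that few notes bother to fill in.
+    rarely_used = {
+        f: usage[f] / total
+        for f in declared
+        if total and usage[f] / total < low_usage
+    }
+    return undeclared, rarely_used
+
+
+declared = {"name", "role", "email"}
+notes = [{"name", "role", "department"} for _ in range(8)]
+notes[0].add("email")
+notes[1].add("linkedin_url")
+
+undeclared, rarely_used = schema_drift(declared, notes)
+print(undeclared)   # department appears in 8 notes, linkedin_url in 1
+print(rarely_used)  # email is used by 12.5% of notes
+```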
+
+---
+
+## The Full Workflow Loop
+
+This is what schema-managed knowledge looks like in practice:
+
+1. **Write freely** — don't worry about structure upfront
+2. **Infer** — let the AI discover patterns in what you've written
+3. **Codify** — create a schema from those patterns
+4. **Validate** — check existing notes and fix outliers
+5. **Create** — new notes follow the schema automatically
+6. **Drift** — periodically check if reality has diverged from the definition
+7. **Evolve** — update the schema to match how your knowledge base actually works
+
+It's the same cycle that good database teams follow — observe, define, validate, evolve — but applied to a plain-text knowledge base managed through conversation.
+
+## Why Schemas Are Just Notes
+
+This was a deliberate design decision. Schemas could have been configuration files, database tables, or API-only constructs. We made them notes because:
+
+- **You can read them.** Open the file, see exactly what "Person" means in your knowledge base.
+- **You can edit them.** Change a field name in your editor, save, done.
+- **Your AI can read them.** When creating a new person note, the AI checks the schema and follows it.
+- **They're versioned.** Git tracks changes. You can see how your schemas evolved over time.
+- **They're searchable.** `search_notes(metadata_filters={"type": "schema"})` finds all your schemas.
+
+Schemas aren't a separate system bolted onto your knowledge base. They're part of it.
+
+## Real-World Example: Task Management
+
+Schemas shine for workflow note types. Here's a task schema:
+
+```yaml
+---
+title: Task
+type: schema
+entity: task
+version: 1
+schema:
+  description: string, what needs to be done
+  status: string, current status (active/blocked/done)
+  priority?: string, urgency level
+  assigned?: string, who owns this
+  current_step?: string, where we are in the process
+  context?: string, accumulated working state
+  blocked_by?: string, what's preventing progress
+settings:
+  validation: warn
+---
+```
+
+With this schema in place, your AI uses it as the template for each task note it creates. When you search for active tasks, you can rely on `status` being present and consistently named. When your AI resumes work after a context reset, it reads `current_step` and `context` to pick up where it left off.
+
+The schema didn't require upfront planning. You wrote a few task notes, noticed they had common fields, inferred a schema, and now every future task is consistent.
+
+---
+
+[Schema system guide →](/concepts/schema-system)
+[Metadata search →](/blog/metadata-search)
+
+---
+
+*Basic Memory is local-first AI knowledge infrastructure. Structure when you need it, plain text always. [Get started →](https://basicmemory.com)*
diff --git a/content/9.blog/7.agent-knowledge-architecture.md b/content/9.blog/7.agent-knowledge-architecture.md