A file-native approach to LLM knowledge bases that pre-compiles semantic relationships at ingest time — delivering faster, higher-quality responses without vector databases, embedding models, or infrastructure.
Karpathy's LLM Wiki pattern is a genuine improvement over RAG: instead of retrieving and forgetting, it compiles knowledge into structured files loaded directly into context. The heavy reasoning happens at write time, not at query time.
But even compiled wikis have a structural gap: when you load a knowledge base, you load all of it. The LLM receives 20 files to answer a question that needs 3. Context is noisy. Response quality drops. Speed suffers.
And the connections between concepts — the "see also" links — are plain text. The LLM must infer the strength and type of every relationship on every query, from scratch, every time.
CKP (Compiled Knowledge Pattern) is a lightweight routing layer that tells the LLM which files to load and pre-declares how concepts relate. Both are computed once at ingest and reused on every query.
"Load less. Reason more. Compute once, reuse forever."
CKP adds five structured fields to the header of each knowledge file. No new tools. No new infrastructure. The same folder, the same Git workflow, the same "load into context" approach — just richer files.
The five fields are TLDR, ANSWERS_WHEN, SIMILAR_HIGH, SIMILAR_MID, and VALIDATED.
There is no SIMILAR_LOW field. Weak relationships add noise, not signal; the LLM infers them from content.
LLMs tend to be more consistent when classifying into categories than when assigning decimal scores: a score of 0.73 in one session may come back as 0.61 in another. HIGH vs MID, with a fixed rubric and anchor examples, gives much more stable, reproducible results across models and sessions.
Anchor examples: jwt ↔ oauth2 is HIGH, jwt ↔ http-headers is MID, and jwt ↔ database is NONE (not stored).
The CKP header is format-agnostic. The example below uses a plain text block that works in Markdown, HTML, or any other format. The body below the header is unchanged; write it however you normally would.
---
CONCEPT: JWT Authentication
TLDR: Stateless token-based auth — server signs a token, client stores and sends it.
ANSWERS_WHEN: authentication, token, login, bearer, stateless, jwt, auth
SIMILAR_HIGH: oauth2:2025-03, refresh-token:2025-03, bearer-token:2025-03
SIMILAR_MID: http-headers:2025-03, cors:2025-03, api-security:2025-03
CONFIDENCE: high
VALIDATED: 2025-03
---
# JWT Authentication
Your normal content here. Nothing changes below the header.
Write in Markdown, HTML, plain text — whatever fits your workflow.
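To make the header layout concrete, here is a minimal Python sketch of a reader for it. It assumes the header is delimited by two `---` lines and holds one `KEY: value` pair per line, as in the example above. The function names `parse_ckp_header` and `parse_similar` are hypothetical and not part of CKP itself.

```python
def parse_ckp_header(text: str) -> tuple[dict[str, str], str]:
    """Split a knowledge file into (header fields, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no header block: treat everything as body
    fields: dict[str, str] = {}
    for i in range(1, len(lines)):
        line = lines[i].strip()
        if line == "---":  # closing delimiter ends the header
            return fields, "\n".join(lines[i + 1:])
        if line.startswith("#") or ":" not in line:
            continue  # skip comment lines and anything that is not KEY: value
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return {}, text  # header never closed: treat everything as body


def parse_similar(value: str) -> list[tuple[str, str]]:
    """Turn 'oauth2:2025-03, cors:2025-03' into [('oauth2', '2025-03'), ...]."""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, stamp = entry.partition(":")
        pairs.append((name.strip(), stamp.strip()))
    return pairs


sample = """---
CONCEPT: JWT Authentication
SIMILAR_HIGH: oauth2:2025-03, refresh-token:2025-03
VALIDATED: 2025-03
---
# JWT Authentication
Body text.
"""
fields, body = parse_ckp_header(sample)
print(fields["CONCEPT"])
print(parse_similar(fields["SIMILAR_HIGH"]))
```

Splitting on the first colon keeps values that contain colons themselves (such as the `concept:YYYY-MM` entries) intact.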
The date on each SIMILAR entry (oauth2:2025-03) records when that specific relationship was last validated — not when the file was last touched. This makes staleness detectable at the relationship level, not just the file level.
If oauth2 was updated after 2025-03, that specific relationship is potentially stale and should be re-evaluated at next ingest. The rest of the file's relationships remain valid.
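The staleness rule can be sketched in a few lines of Python. This is an illustration under two assumptions: the related file's VALIDATED date stands in for "when it was last updated", and the helper name `stale_relationships` is hypothetical.

```python
def stale_relationships(
    similar: list[tuple[str, str]],
    validated: dict[str, str],
) -> list[str]:
    """Return related concepts whose file was validated after the relationship was."""
    stale = []
    for name, relationship_date in similar:
        target_date = validated.get(name)
        # YYYY-MM strings sort chronologically, so string comparison is enough.
        if target_date is not None and target_date > relationship_date:
            stale.append(name)
    return stale


# Relationships declared in the jwt file, as (concept, date) pairs.
jwt_high = [("oauth2", "2025-03"), ("refresh-token", "2025-03"), ("bearer-token", "2025-03")]
# Hypothetical VALIDATED dates of the related files.
validated_dates = {"oauth2": "2025-06", "refresh-token": "2025-03", "bearer-token": "2025-01"}

print(stale_relationships(jwt_high, validated_dates))  # ['oauth2']
```

Only the oauth2 relationship is flagged for re-evaluation; the other two entries stay valid.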
A single index file at the root of your knowledge base aggregates only the headers — no body content. It is the only file always loaded into context. Everything else is loaded on demand.
---
CONCEPT: jwt-authentication
TLDR: Stateless token-based auth — server signs a token, client stores and sends it.
ANSWERS_WHEN: authentication, token, login, bearer, stateless, jwt
---
---
CONCEPT: oauth2
TLDR: Delegated authorisation framework. Issues access tokens on behalf of users.
ANSWERS_WHEN: oauth, authorization, delegate, scope, grant, flow
---
... one header block per file. No body content.
The index stays small regardless of knowledge base size because it contains only the routing fields from each header (CONCEPT, TLDR, and ANSWERS_WHEN in the example above), never the concept body. With 50 files at about five lines per block, it remains a few hundred lines.
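A minimal sketch of how such an index could be assembled from already-parsed headers. It assumes the index keeps the three fields shown in the example blocks above (CONCEPT, TLDR, ANSWERS_WHEN). The names `INDEX_FIELDS`, `index_block`, and `build_index` are hypothetical.

```python
INDEX_FIELDS = ("CONCEPT", "TLDR", "ANSWERS_WHEN")  # assumption: fields kept in the index


def index_block(header: dict[str, str]) -> str:
    """Render one header as an index block, dropping every non-index field."""
    lines = ["---"]
    for key in INDEX_FIELDS:
        if key in header:
            lines.append(f"{key}: {header[key]}")
    lines.append("---")
    return "\n".join(lines)


def build_index(headers: list[dict[str, str]]) -> str:
    """Concatenate one block per file, sorted by concept name for stable diffs."""
    ordered = sorted(headers, key=lambda h: h.get("CONCEPT", ""))
    return "\n".join(index_block(h) for h in ordered)


headers = [
    {"CONCEPT": "oauth2", "TLDR": "Delegated authorisation framework.",
     "ANSWERS_WHEN": "oauth, authorization, scope", "SIMILAR_HIGH": "jwt-authentication:2025-03"},
    {"CONCEPT": "jwt-authentication", "TLDR": "Stateless token-based auth.",
     "ANSWERS_WHEN": "authentication, token, jwt"},
]
print(build_index(headers))
```

Sorting the blocks is a design choice rather than a CKP requirement; it keeps Git diffs of the index small when a single file changes.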
Ingest is the only moment where semantic work occurs. At query time, the LLM reads pre-computed relationships — it never re-derives them.
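The ingest rule is declared in AGENT.md. Each ingest rewrites four parts of the file header: SIMILAR_HIGH, SIMILAR_MID, the relationship timestamps, and VALIDATED.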
You are updating the header of a knowledge file.
Classify the relationship between this concept and each existing concept
using exactly these three tiers:
HIGH — one concept requires understanding the other.
Direct dependency, extension, or prerequisite.
Anchor: jwt ↔ oauth2 = HIGH
MID — same domain, frequently used together.
No direct dependency.
Anchor: jwt ↔ http-headers = MID
NONE — loosely related or rarely co-relevant. Do not store.
Anchor: jwt ↔ database = NONE
Rules:
- SIMILAR_HIGH: select at most 3. If fewer qualify, write fewer.
- SIMILAR_MID: select at most 5. If fewer qualify, write fewer.
- Do not store NONE relationships.
- Append today's date (YYYY-MM) to each entry: concept:YYYY-MM
- Only update the header. Do not modify the body content.
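The rubric's mechanical rules (the caps, the date suffix, dropping NONE) can be checked in code after the LLM has done the classification. The Python sketch below assumes the LLM's output arrives as a mapping from concept name to tier. The function name `format_similar` and the choice to reject an over-long list instead of truncating it are assumptions.

```python
from datetime import date

MAX_HIGH = 3  # cap from the rubric
MAX_MID = 5   # cap from the rubric


def format_similar(classified: dict[str, str], today: date) -> dict[str, str]:
    """Build the SIMILAR_HIGH / SIMILAR_MID header values from tier labels.

    `classified` maps concept name -> "HIGH" | "MID" | "NONE".
    """
    stamp = today.strftime("%Y-%m")
    high = [name for name, tier in classified.items() if tier == "HIGH"]
    mid = [name for name, tier in classified.items() if tier == "MID"]
    # NONE relationships are never stored, so they are simply ignored here.
    if len(high) > MAX_HIGH:
        raise ValueError(f"SIMILAR_HIGH has {len(high)} entries, max is {MAX_HIGH}")
    if len(mid) > MAX_MID:
        raise ValueError(f"SIMILAR_MID has {len(mid)} entries, max is {MAX_MID}")
    return {
        "SIMILAR_HIGH": ", ".join(f"{name}:{stamp}" for name in high),
        "SIMILAR_MID": ", ".join(f"{name}:{stamp}" for name in mid),
    }


tiers = {"oauth2": "HIGH", "http-headers": "MID", "database": "NONE"}
print(format_similar(tiers, date(2025, 3, 15)))
```

Raising an error sends the selection back to the classifier, which is better placed than a script to decide which relationship to drop.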
For coding agents like Claude Code, add this rule to your AGENT.md. The agent treats it as a mandatory step after every file write inside your knowledge folder.
## Knowledge Base — Mandatory Rule
Whenever you create or modify a file inside /knowledge/:
1. Re-read the file you just wrote.
2. Apply the CKP ingest rubric (see below) to classify
relationships against the current index.
3. Update the file header: SIMILAR_HIGH (max 3),
SIMILAR_MID (max 5), timestamps, VALIDATED.
4. Update /knowledge/index with the new header block.
This step is part of the task. It is not optional.
[paste the rubric block here]
At query time the LLM performs no semantic computation. It reads pre-compiled structure and routes accordingly.
A query that previously loaded 20 files now loads 2–4. Context is denser, response quality is higher, and no vector database or embedding model was invoked at any point.
Query time is read-only. The LLM never writes during a query. If it notices a relationship timestamp is older than the target file's VALIDATED date, it notes this internally but still loads the most recent version of the file. Re-evaluation happens at the next ingest — not now.
Index scan:
"refresh" + "token" → matches jwt-authentication
Load:
jwt-authentication ← query match
oauth2:HIGH ← always load with jwt
refresh-token:HIGH ← always load with jwt
http-headers:MID ← relevant to token transport
Skip:
cors, api-security, database, cryptography ← not relevant
Result: 4 files loaded instead of 18.
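The routing walk-through above can be approximated with plain keyword overlap. The Python sketch below is illustrative, not the reference behaviour. It assumes the matched file is the one with the most ANSWERS_WHEN hits, that HIGH files always load, and that a MID file loads only when its own ANSWERS_WHEN also matches the query. The `route` function, the in-memory `INDEX` layout, and its keyword lists are hypothetical.

```python
import re


def route(query: str, index: dict[str, dict[str, list[str]]]) -> list[str]:
    """Pick the files to load for a query using only pre-compiled header data."""
    words = set(re.findall(r"\w+", query.lower()))
    scores = {
        name: len(words & {keyword.lower() for keyword in entry["answers_when"]})
        for name, entry in index.items()
    }
    best = max(scores, key=scores.get, default=None)
    if best is None or scores[best] == 0:
        return []  # out-of-domain query: load nothing
    load = [best]
    for name in index[best]["high"]:  # HIGH relationships always load
        if name not in load:
            load.append(name)
    for name in index[best]["mid"]:  # MID loads only if it matches the query too
        if scores.get(name, 0) > 0 and name not in load:
            load.append(name)
    return load


INDEX = {
    "jwt-authentication": {
        "answers_when": ["authentication", "token", "login", "bearer", "stateless", "jwt"],
        "high": ["oauth2", "refresh-token"],
        "mid": ["http-headers", "cors", "api-security"],
    },
    "oauth2": {"answers_when": ["oauth", "authorization", "scope"], "high": [], "mid": []},
    "http-headers": {"answers_when": ["header", "token", "transport"], "high": [], "mid": []},
    "cors": {"answers_when": ["cors", "origin", "preflight"], "high": [], "mid": []},
}

print(route("How do I refresh a JWT token after login?", INDEX))
print(route("What is the weather in Rome?", INDEX))
```

The first call selects four files (the match, its two HIGH entries, and http-headers); the second returns an empty list, the load-nothing outcome for a query outside the knowledge base.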
---
CONCEPT: sicurezza-territoriale
TLDR: Crime statistics by geographic area via ISTAT SDMX API, with trend and anomaly detection.
ANSWERS_WHEN: crime, security, reati, statistics, ISTAT, territory, area, criminalità
SIMILAR_HIGH: istat-sdmx-api:2025-04, geographic-filtering:2025-04, crime-statistics:2025-04
SIMILAR_MID: sentiment-analysis:2025-04, civis-reporting:2025-04, anomaly-detection:2025-04
CONFIDENCE: high
VALIDATED: 2025-04
---
Index scan:
"reati" + "Fiumicino" → matches sicurezza-territoriale
Load:
sicurezza-territoriale ← query match
istat-sdmx-api:HIGH ← data source, always needed
geographic-filtering:HIGH ← "Fiumicino" signals location
civis-reporting:MID ← civic context, conditionally relevant
Skip:
rag-chat, multilingual-support, social-analytics
The following benchmark was run on a real NestJS codebase (Mentat — a civic intelligence platform) using Claude Sonnet. Two environments were compared: Environment A loaded all knowledge files with no routing (no CKP); Environment B used full CKP routing. Each of 5 queries was run 3 times per environment.
On out-of-domain queries, No-CKP hallucinates connections between unrelated modules. CKP loads nothing and answers honestly. Reduced context is not just a cost saving — it is a hallucination risk reduction.
Based on 3,182 tokens saved per query. Model: Claude Sonnet (Thinking). Date: 2026-05-20.
If you already use a structured memory bank — with files like activeContext.md, projectbrief.md, progress.md, and per-project files — CKP does not replace it. It layers on top, adding one thing the memory bank pattern is missing: explicit semantic relationships between files.
Apply CKP headers to your projects/ files. Leave activeContext.md, dailyLog.md, and decisionLog.md unchanged — they are session-scoped, not knowledge-scoped.
---
# CKP fields
CONCEPT: prescient
TLDR: Predictive analytics engine with ISTAT SDMX data and social sentiment.
ANSWERS_WHEN: analytics, predictive, ISTAT, trend, forecast, anomaly, dati
SIMILAR_HIGH: mentat:2025-04, istat-sdmx:2025-04
SIMILAR_MID: civis:2025-04, social-analytics:2025-04, rag-chat:2025-04
CONFIDENCE: high
VALIDATED: 2025-04
# Memory bank metadata (unchanged)
updated_at: 2025-04-01T10:00:00
updated_by: agente
ttl: 30d
project_id: prescient
---
Files involved: _index.md (TLDR + ANSWERS_WHEN included), projects/[detected].md, projects/[name].md, projectbrief.md, activeContext.md, dailyLog.md, decisionLog.md.
## CKP LLM: Compiled Knowledge Pattern (mandatory)
### ABSOLUTE RULE: zero questions to the user
NEVER ask questions during the CKP flow.
Make reasonable assumptions. State them in one line.
---
### BOOT: ALWAYS runs on the first message of every session
1. Read PROJECT_ROOT/memory-bank/_index.md.
2. If it exists → ROUTING. If it does not exist → INIT.
---
### ROUTING: selective loading
1. Compare the query's keywords with ANSWERS_WHEN.
2. Load the file with the most matches + all of its SIMILAR_HIGH entries.
3. Load SIMILAR_MID only if ANSWERS_WHEN matches the query.
4. If the TLDR already answers the question, do not load the full file.
Declare: [CKP: loaded X/Y files - match: [name] via [keyword]]
---
### UPDATE: updating the knowledge base
Whenever you create or modify a file in memory-bank/projects/:
- HIGH: direct dependencies. Max 3. concept:YYYY-MM
- MID: same domain. Max 5. concept:YYYY-MM
1. Update the file's CKP header (header only, not the body).
2. Update its row in memory-bank/_index.md.
This step is part of the task. Do not finish without completing it.
- SIMILAR_HIGH contains at most 3 entries. Write fewer if fewer qualify.
- SIMILAR_MID contains at most 5 entries. Write fewer if fewer qualify.
- There is no SIMILAR_LOW. Weak relationships are noise, not signal.
- Every entry is written as concept:YYYY-MM.
- If oauth2 is HIGH for both jwt and pkce, both files declare it independently. At query time, whichever file is matched first will trigger loading oauth2.
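The last rule implies that a shared HIGH target should be loaded only once, even when several matched files declare it. A minimal Python sketch of that de-duplication, assuming matched files are processed in match order (the function name `collect_load_set` is hypothetical):

```python
def collect_load_set(matched: list[str], similar_high: dict[str, list[str]]) -> list[str]:
    """Expand matched files with their HIGH relationships, loading each file once."""
    load: list[str] = []
    for name in matched:
        for candidate in [name, *similar_high.get(name, [])]:
            if candidate not in load:  # first declaration wins, later ones are no-ops
                load.append(candidate)
    return load


declared_high = {"jwt": ["oauth2"], "pkce": ["oauth2"]}
print(collect_load_set(["jwt", "pkce"], declared_high))  # ['jwt', 'oauth2', 'pkce']
```

Because each file carries its own declaration, either jwt or pkce can be matched alone and oauth2 is still loaded.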