Open Knowledge Format (OKF): Turning Docs into Agent Memory

Updated on Jul 8, 2026

• 6 Mins Read

Large Language Models (LLMs) gain capabilities with every release cycle. Despite these advancements, AI agents still produce wrong answers, not because of poor reasoning but because of missing context. An AI agent fills the gaps in its context with assumptions. It does not know your metric definitions, the join paths between your database tables, or the deprecation notices in your documentation. Because an agent processes information in several steps within your agentic application, errors from missing context compound and degrade its performance.

AI agents lack access to much of the internal knowledge that technical writers produce. The context an agent needs is spread across many articles in the knowledge base. Many AI vendors are independently building their own harnesses so that agents work well within their own suite of products. Karpathy’s LLM Wiki and Google’s Open Knowledge Format (OKF) are two independent proposals for delivering content effectively to AI agents. Google’s OKF is a vendor-neutral, technology-agnostic way to expose your entire knowledge base, packaging the information AI agents need.

What is the Open Knowledge Format (OKF)?

The Open Knowledge Format (OKF) is an open specification from Google Cloud for storing organizational knowledge as a plain directory of markdown files that any AI agent can read directly. Google published it on 12 June 2026 as version 0.1.

Each file in an OKF bundle represents one concept, such as a metric definition, a database table, or a runbook. The file path is the concept’s identity, and concepts link to one another with ordinary markdown links, turning the directory into a graph of related knowledge.
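To make the "directory as a graph" idea concrete, here is a minimal Python sketch that treats each file path as a node and each relative markdown link as an edge. The file names and contents are hypothetical, and the link handling is an assumption for illustration, not something taken from the OKF specification.

```python
import posixpath
import re

# Matches ordinary markdown links such as [customers](customers.md).
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def build_link_graph(files: dict[str, str]) -> dict[str, set[str]]:
    """Map each concept path to the set of concept paths it links to."""
    graph: dict[str, set[str]] = {}
    for path, text in files.items():
        base = posixpath.dirname(path)
        targets: set[str] = set()
        for href in LINK_RE.findall(text):
            if "://" in href or href.startswith("#"):
                continue  # skip external URLs and same-page anchors
            href = href.split("#", 1)[0]  # drop any heading anchor
            target = posixpath.normpath(posixpath.join(base, href))
            if target in files:  # keep only links to files in the bundle
                targets.add(target)
        graph[path] = targets
    return graph


# Hypothetical two-file bundle held in memory for illustration.
bundle = {
    "sales/tables/orders.md": "Joins to [customers](customers.md) on customer_id.",
    "sales/tables/customers.md": "One row per customer. See [orders](orders.md).",
}
graph = build_link_graph(bundle)
```

Because the path is the identity, no separate ID registry is needed: resolving a relative link against the linking file's directory is enough to find the neighbor.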

Every file opens with a short block of YAML front matter that tells an agent what the file is before it reads it. The specification requires only one field, type. OKF is vendor-neutral by design, with no SDK or proprietary tool, and it formalizes the “LLM wiki” pattern Andrej Karpathy described in April 2026.
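As a rough illustration, the sketch below splits a concept file into its front matter and body and checks for the one required field, type. The parser is deliberately simplified: it handles only flat key: value lines rather than full YAML, and the sample values ("metric", the description text) are hypothetical.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a concept file into (front matter fields, markdown body).

    Simplified on purpose: handles flat `key: value` lines, not full YAML.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    fields: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":  # closing delimiter of the front matter
            return fields, "\n".join(lines[i + 1:])
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    return {}, text  # no closing delimiter: treat as having no front matter


def has_required_type(text: str) -> bool:
    """True when the front matter carries a non-empty `type` field."""
    fields, _ = split_front_matter(text)
    return bool(fields.get("type"))


# Hypothetical concept file for illustration.
sample = """---
type: metric
description: Weekly active users
---
# Weekly active users
"""
fields, body = split_front_matter(sample)
```

A production reader would more likely use a real YAML parser; the point here is only that an agent can learn what a file is from a few lines before reading the body.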

How Scattered Knowledge Breaks AI Agents

In many organizations, the knowledge a model needs lives in four incompatible places: catalogs, wikis, code comments, and tacit knowledge. AI agents need to pull information from all of them. For example, suppose an agent must compute weekly active users from your event stream:

The definition of active users lives inside the metric documentation
The event table schema lives inside the catalog
Join paths for user tables live inside your code base
Some fields in the user table may be deprecated, and the notice was shared only in Slack channels

The AI agent must find all of this information to get the right context. Every vendor ships its own SDK, APIs, and knowledge graph schema. Information about the data model spans different platforms, such as help center articles, code comments, Slack threads, and your own mental model. The same knowledge also comes in different shapes, such as YAML files, markdown files, and Slack messages. This is structural fragmentation. Because the same fact is written in slightly different ways, it is harder for AI agents to infer that the different pieces refer to the same thing. This is a semantic problem. Adding another information-consolidation layer does not solve it. A shared format lets AI agents access the information without making multiple calls to different knowledge systems.


From the LLM Wiki Pattern to the Open Knowledge Format

Andrej Karpathy published LLM Wiki, a pattern in which an LLM, rather than technical writers or information developers, builds and maintains the knowledge base. This is a very different approach from RAG. Instead of uploading content and chunking it so that passages can be retrieved at response time, LLM Wiki builds and maintains a persistent wiki: a structured set of interlinked markdown files. The LLM reads each new source once, extracts what matters, and folds that information into existing pages. It also does the bookkeeping automatically by updating cross-references, keeping summaries current, and flagging ambiguities. This shifts the maintenance cost of the wiki away from human writers.

LLM Wiki also produces knowledge islands. Your wiki, my wiki, and your vendor’s export all look alike, but still cannot read each other. This is the gap that the Open Knowledge Format addresses. OKF takes the LLM Wiki pattern and formalizes it into a portable, vendor-neutral format. OKF version 0.1 represents knowledge as a directory of markdown files with YAML front matter and a list of agreed-upon conventions. No heavy tooling is needed to read OKF. A bundle can be hosted as a tarball, stored in a git repository, or mounted in a filesystem.
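Since a bundle is just a directory of markdown files, reading one needs nothing beyond the standard library. The following Python sketch loads every markdown file under a root directory, keyed by its relative path. The directory layout and file contents are hypothetical, and load_bundle is an illustrative helper, not part of any OKF tooling.

```python
import tempfile
from pathlib import Path


def load_bundle(root: Path) -> dict[str, str]:
    """Read every markdown file under root, keyed by its path relative to root."""
    return {
        p.relative_to(root).as_posix(): p.read_text(encoding="utf-8")
        for p in sorted(root.rglob("*.md"))
    }


# Build a tiny hypothetical bundle in a temporary directory, then load it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sales").mkdir()
    (root / "index.md").write_text("# Bundle index\n", encoding="utf-8")
    (root / "sales" / "orders.md").write_text(
        "---\ntype: table\n---\n# orders\n", encoding="utf-8"
    )
    bundle = load_bundle(root)

print(sorted(bundle))
```

The same function would work on a checked-out git repository or an unpacked tarball, since both end up as ordinary directories on disk.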

The simplicity of OKF comes from three properties:

Portability, because all files are in Markdown
Discoverability, because the queryable fields are a short, reserved set, such as type, description, resource, tags, and timestamp
Information standardization, because each concept is described in its own concept file
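The discoverability property can be sketched as a simple filter over front matter that has already been parsed. The ConceptMeta class, the select helper, and the sample values below are all hypothetical; only the field names type, description, and tags come from the list above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConceptMeta:
    """Front matter of one concept file (a hypothetical subset of fields)."""
    path: str
    type: str
    description: str = ""
    tags: list[str] = field(default_factory=list)


def select(
    concepts: list[ConceptMeta],
    kind: Optional[str] = None,
    tag: Optional[str] = None,
) -> list[str]:
    """Return the paths of concepts matching the given type and/or tag."""
    return [
        c.path
        for c in concepts
        if (kind is None or c.type == kind) and (tag is None or tag in c.tags)
    ]


# Hypothetical catalog for illustration.
catalog = [
    ConceptMeta("sales/metrics/weekly_active_users.md", "metric",
                "Users active in a 7-day window", ["engagement"]),
    ConceptMeta("sales/tables/orders.md", "table",
                "One row per order", ["sales"]),
    ConceptMeta("sales/tables/customers.md", "table",
                "One row per customer", ["sales", "pii"]),
]
tables = select(catalog, kind="table")
pii = select(catalog, tag="pii")
```

With a small, fixed set of fields, an agent can narrow a large bundle down to a handful of candidate files without opening any bodies.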

An index.md enumerates the contents of its directory so that the AI agent can survey the OKF bundle before reading individual pages. A log.md records changes in dated entries like a changelog. Thus, OKF ensures that knowledge is portable and AI agent-friendly.
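One way a producer might generate these two files is sketched below in Python. The exact layout of index.md and log.md (a heading plus one bullet per file, and a dated heading per log entry) is an assumption made for illustration; the article does not show the layout the specification prescribes.

```python
from datetime import date


def render_index(title: str, entries: dict[str, str]) -> str:
    """Render an index listing each file in a directory with its description."""
    lines = [f"# {title}", ""]
    for name in sorted(entries):
        lines.append(f"- [{name}]({name}): {entries[name]}")
    return "\n".join(lines) + "\n"


def render_log_entry(day: date, message: str) -> str:
    """Render one dated changelog entry for log.md."""
    return f"## {day.isoformat()}\n\n- {message}\n"


# Hypothetical directory contents for illustration.
index_text = render_index(
    "sales",
    {"orders.md": "One row per order", "customers.md": "One row per customer"},
)
log_text = render_log_entry(date(2026, 6, 12), "Added orders table")
print(index_text)
print(log_text)
```

Generating the index from the descriptions already present in front matter keeps the survey page from drifting out of date as concept files change.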

How an Open Knowledge Format Bundle Is Structured

To understand the anatomy of an OKF bundle, let’s look at a sample from a Sales domain. It is a set of markdown files grouped into directories by category.

Figure: Example OKF bundle of sales domain

A sample order concept file looks like this:

Figure: Sample order concept file in the sales domain

This file shows that the orders table is linked to the customers table and how to join them. That is the high-level view of an OKF structure. It is now the responsibility of technical writers to write the descriptions, link each concept to the right neighbors, and so on. This helps AI agents understand how a knowledge bundle is organized and maintained. OKF follows the same discipline as CLAUDE.md and SKILL.md files. The best way to test a bundle is to point an AI agent at it and start asking questions.
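Before handing a bundle to an agent, writers could run a small check for the two responsibilities mentioned above: descriptions and links to neighbors. The Python sketch below reports concept files without a description and links that point to files missing from the bundle. The concept files are hypothetical stand-ins, not the ones shown in the figure, and lint_bundle is an illustrative helper rather than an OKF tool.

```python
import posixpath
import re

FRONT_MATTER_RE = re.compile(r"\A---\n(.*?)\n---", re.DOTALL)
DESCRIPTION_RE = re.compile(r"^description:[ \t]*\S", re.MULTILINE)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def lint_bundle(files: dict[str, str]) -> list[str]:
    """Report missing descriptions and links to files not in the bundle."""
    problems: list[str] = []
    for path in sorted(files):
        text = files[path]
        front = FRONT_MATTER_RE.match(text)
        if not front or not DESCRIPTION_RE.search(front.group(1)):
            problems.append(f"{path}: missing description")
        for href in LINK_RE.findall(text):
            if "://" in href or href.startswith("#"):
                continue  # external URLs and page anchors are not checked
            href = href.split("#", 1)[0]
            target = posixpath.normpath(posixpath.join(posixpath.dirname(path), href))
            if target not in files:
                problems.append(f"{path}: broken link to {target}")
    return problems


# Hypothetical concept files for illustration.
orders = """---
type: table
description: One row per order placed.
---
# orders

Join to [customers](customers.md) on customer_id.
Refunds are described in [refunds](refunds.md).
"""
customers = """---
type: table
---
# customers
"""
bundle = {"sales/tables/orders.md": orders, "sales/tables/customers.md": customers}
problems = lint_bundle(bundle)
for line in problems:
    print(line)
```

A check like this fits naturally into a review step or CI job, so that a missing description or dangling link is caught before an agent tries to follow it.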

OKF v0.1: A Starting Point, Not the Finish Line

LLM Wiki provided a pattern for letting LLMs maintain a wiki. If this pattern spreads without a shared format, it may lead to knowledge islands. Google’s OKF is an attempt to standardize how AI agents discover and use your knowledge base without losing context. Because Google has open-sourced this standard, you can write a producer for your own knowledge system and a consumer that reasons over the resulting bundle without being tied to a proprietary tool. As a v0.1 release, it leaves plenty of room for improvement in upcoming releases.

Selvaraaju Murugesan

Selvaraaju (Selva) Murugesan received the B.Eng. degree in Mechatronics Engineering (Gold medalist) from Anna University in 2004 and the M.Eng. degree from La Trobe University, Australia, in 2008. He received his Ph.D. degree in Computational Mathematics from La Trobe University. He is currently a Senior Director of Data Science at the SaaS startup Kovai.co. His interests include business strategy, data analytics, artificial intelligence, and technical documentation.