How to Use the Graphify Skill to Build a Queryable Knowledge Graph for AI Agents
Graphify converts codebases and notes into a queryable knowledge graph that reduces token costs by letting agents query structure instead of raw files.
Why Querying a Knowledge Graph Beats Feeding Raw Files to Your Agent
AI agents that work with large codebases, documentation sets, or research notes run into the same wall: context limits. If you’re feeding an agent an entire repository or a folder full of markdown files every time it needs to answer a question, you’re burning tokens unnecessarily — and often getting worse results because the model is drowning in irrelevant content.
A queryable knowledge graph changes this. Instead of dumping raw files into the prompt, your agent queries a structured graph of entities and relationships, pulling only what’s relevant. The Graphify skill makes this accessible to any developer working with AI agents, converting unstructured inputs — source code, notes, documentation — into a graph your agent can interrogate precisely.
This guide covers how the Graphify skill works, how to set it up, and how to build workflows where your agent queries structure instead of noise.
What a Knowledge Graph Actually Is (and Why Agents Need One)
A knowledge graph is a way of representing information as nodes (entities) and edges (relationships). Instead of storing “the UserAuth class calls validateToken() which checks the Redis cache,” you store three nodes — UserAuth, validateToken, Redis cache — connected by edges labeled calls and checks.
This matters for agents because:
- Queries are precise. Ask “what functions does UserAuth call?” and you get exactly that, not a dump of the entire auth module.
- Traversal is cheap. Follow the edges from one node to related nodes without re-reading files.
- Context is controlled. The agent gets relevant subgraphs, not entire documents.
For codebases specifically, knowledge graphs capture the semantic relationships between files, classes, functions, imports, and dependencies. For notes or research, they map concepts, references, and connections between ideas.
The alternative — vector search over chunked text — is useful for semantic similarity but can miss structural relationships. If you want to know what calls what, or which concept links to which, graph queries beat embedding search.
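The nodes-and-edges model described above can be sketched in a few lines. This is an illustrative in-memory structure, not Graphify's storage format; the type names and the `neighbors` helper are assumptions made for the example.

```typescript
// Illustrative only: a tiny in-memory graph, not Graphify's internal format.
type GraphNode = { id: string; kind: "class" | "function" | "service" };
type GraphEdge = { from: string; to: string; label: string };

const nodes: GraphNode[] = [
  { id: "UserAuth", kind: "class" },
  { id: "validateToken", kind: "function" },
  { id: "Redis cache", kind: "service" },
];

const edges: GraphEdge[] = [
  { from: "UserAuth", to: "validateToken", label: "calls" },
  { from: "validateToken", to: "Redis cache", label: "checks" },
];

// Return every node reachable from `from` over a single edge with the given label.
function neighbors(from: string, label: string): GraphNode[] {
  const targetIds = edges
    .filter((e) => e.from === from && e.label === label)
    .map((e) => e.to);
  return nodes.filter((n) => targetIds.includes(n.id));
}

// "What does UserAuth call?" touches one edge instead of re-reading a file.
const calledByUserAuth = neighbors("UserAuth", "calls");
console.log(calledByUserAuth.map((n) => n.id));
```

The query inspects only the edges that start at UserAuth, which is the property that keeps graph lookups cheap compared with scanning source text.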
What the Graphify Skill Does
Graphify is a skill available through the MindStudio Agent Skills Plugin, the npm SDK that gives any AI agent — whether you’re building with Claude Code, LangChain, CrewAI, or a custom agent — access to over 120 typed capabilities as simple method calls.
The Graphify skill takes your input (a codebase, a collection of notes, a set of documents) and:
- Parses the content — extracts entities like functions, classes, files, concepts, and terms
- Infers relationships — maps dependencies, references, calls, and conceptual links
- Builds a graph structure — stores entities as nodes and relationships as labeled edges
- Exposes a query interface — lets your agent ask specific questions against the graph
The output isn’t a summary or a flat list. It’s a queryable structure your agent can traverse repeatedly without reprocessing the original source material.
This is particularly useful for:
- Code agents that need to understand a repository’s architecture without reading every file
- Research agents that work with large note collections and need to find conceptual connections
- Documentation agents that answer questions about system design or API structure
- Refactoring agents that need to trace what would break if a function changes
Prerequisites Before You Start
Before building with Graphify, you’ll need:
- Node.js 18 or later installed in your development environment
- The MindStudio Agent Skills SDK installed (npm install @mindstudio-ai/agent)
- A MindStudio API key, available from your MindStudio workspace settings
- Your source material — a codebase directory, a folder of markdown files, or a structured document set
- Basic familiarity with async JavaScript or TypeScript — the SDK is fully typed, so TypeScript is recommended
If you haven’t set up the SDK yet, the MindStudio Agent Skills documentation walks through authentication and initialization.
Step 1: Install and Initialize the SDK
Start by installing the package and setting up your agent instance.
npm install @mindstudio-ai/agent
Then initialize the agent in your project:
import { MindStudioAgent } from '@mindstudio-ai/agent';

const agent = new MindStudioAgent({
  apiKey: process.env.MINDSTUDIO_API_KEY,
});
Store your API key in an environment variable — never hardcode it. If you’re using a .env file locally, a package like dotenv handles loading it.
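One way to make a missing key obvious is to check for it at startup. The sketch below is a hypothetical helper (`requireApiKey` is not part of the SDK); it takes the environment as a plain object so it can be exercised without Node.js globals.

```typescript
// Hypothetical helper, not part of the SDK: fail fast when the key is missing
// instead of discovering the problem on the first API call.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env, name: string = "MINDSTUDIO_API_KEY"): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing environment variable ${name}`);
  }
  return value.trim();
}

// In a Node.js project you would pass process.env here.
const exampleKey = requireApiKey({ MINDSTUDIO_API_KEY: "example-key" });
console.log(exampleKey.length > 0);
```

Passing the result to the agent constructor instead of reading process.env inline means a misconfigured environment stops the script immediately with a readable message.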
Step 2: Prepare Your Source Material
Graphify works best when your input is organized. Before calling the skill, spend a few minutes on structure:
For codebases:
- Include the full directory path, not just individual files
- Make sure file extensions are consistent (.ts, .py, .js)
- Remove generated files, node_modules, build artifacts, and lock files; these add noise without semantic value
- Include a README or architecture notes if they exist; they help Graphify infer higher-level concepts
For notes and documentation:
- Use consistent heading hierarchies in markdown files
- Name files descriptively — the filename becomes part of the entity metadata
- Group related notes in subdirectories — Graphify uses directory structure as a signal for conceptual proximity
For mixed inputs:
- Separate code and prose into distinct directories if you can
- Pass them as separate inputs or specify the content type explicitly in your call
The cleaner your input, the more accurate the resulting graph.
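The cleanup advice for codebases can be applied mechanically before the graph is built. The following is a minimal sketch, assuming a conventional JavaScript project layout; the exclusion lists and the `isGraphWorthy` name are illustrative choices, not Graphify settings.

```typescript
// Assumed exclusion rules; adjust them to your project's layout.
const EXCLUDED_DIRS = ["node_modules", "dist", "build", ".git"];
const EXCLUDED_FILES = ["package-lock.json", "yarn.lock", "pnpm-lock.yaml"];

// Keep a path only if no directory segment and no file name is on an exclusion list.
function isGraphWorthy(path: string): boolean {
  const segments = path.split("/");
  const fileName = segments[segments.length - 1];
  const inExcludedDir = segments.some((s) => EXCLUDED_DIRS.includes(s));
  return !inExcludedDir && !EXCLUDED_FILES.includes(fileName);
}

const candidates = [
  "src/auth/UserAuth.ts",
  "node_modules/lodash/index.js",
  "dist/bundle.js",
  "package-lock.json",
  "README.md",
];

const kept = candidates.filter(isGraphWorthy);
console.log(kept);
```

Only the source file and the README survive the filter here. Matching on whole path segments avoids accidentally excluding a file such as src/rebuild.ts just because its name contains "build".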
Step 3: Build the Knowledge Graph
With your material ready, call the Graphify skill to build the graph:
const graph = await agent.graphify({
  source: './src',          // path to your codebase or note directory
  contentType: 'code',      // 'code', 'notes', or 'mixed'
  language: 'typescript',   // optional: helps parser accuracy for codebases
  includeMetadata: true,    // attach file paths, line numbers, and timestamps to nodes
});

console.log(`Graph built: ${graph.nodeCount} nodes, ${graph.edgeCount} edges`);
The skill returns a graph object with a unique ID, summary statistics, and the query interface. You don’t need to manage the underlying storage — the graph is persisted server-side and accessible through the ID for subsequent queries.
For large codebases (thousands of files), expect this step to take 30–90 seconds. For note collections under a few hundred files, it’s typically under 10 seconds.
Step 4: Query the Graph
Once the graph is built, your agent can query it in two ways: natural language queries and structured traversal queries.
Natural Language Queries
This is the simplest approach. Pass a question in plain English:
const result = await agent.graphify.query({
  graphId: graph.id,
  query: 'What functions call the validateToken method?',
});

console.log(result.answer);
console.log(result.nodes);     // the specific nodes returned
console.log(result.reasoning); // how the graph was traversed
Natural language queries work well for exploratory questions — “what does this module depend on?” or “which notes reference the concept of retrieval-augmented generation?”
Structured Traversal Queries
For precision, use structured queries that specify exactly what you’re looking for:
const result = await agent.graphify.traverse({
  graphId: graph.id,
  startNode: 'UserAuth',
  nodeType: 'class',
  relationship: 'calls',
  depth: 2, // follow edges up to 2 hops
});
This is useful when your agent already knows what entity it’s working with and needs to map its relationships — for example, a refactoring agent checking the blast radius of a change before modifying a function.
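To make the meaning of a hop limit concrete, here is a depth-limited breadth-first walk over a small in-memory edge list. It is a sketch of the general technique only; how Graphify traverses its own graph is not documented here, and the `traverse` function and sample node names below are made up for illustration.

```typescript
type Edge = { from: string; to: string; label: string };

// Breadth-first walk that follows only edges with `label`, up to `depth` hops.
function traverse(edges: Edge[], start: string, label: string, depth: number): string[] {
  const visited = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const edge of edges) {
      if (edge.label === label && frontier.includes(edge.from) && !visited.has(edge.to)) {
        visited.add(edge.to);
        next.push(edge.to);
      }
    }
    frontier = next; // the next hop starts from the nodes discovered in this one
  }
  visited.delete(start); // report only the nodes reached, not the starting point
  return Array.from(visited);
}

const sampleEdges: Edge[] = [
  { from: "UserAuth", to: "validateToken", label: "calls" },
  { from: "validateToken", to: "readCache", label: "calls" },
  { from: "readCache", to: "connect", label: "calls" },
];

const oneHop = traverse(sampleEdges, "UserAuth", "calls", 1);
const twoHops = traverse(sampleEdges, "UserAuth", "calls", 2);
console.log(oneHop, twoHops);
```

With a depth of 1 the walk stops at validateToken; with a depth of 2 it also reaches readCache but never connect. Raising the depth widens the blast radius you see, at the cost of more nodes in the result.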
Subgraph Extraction
You can also pull a subgraph — a connected cluster of nodes — and pass it as context to your LLM:
const subgraph = await agent.graphify.getSubgraph({
  graphId: graph.id,
  focus: 'authentication',
  maxNodes: 20,
});

// Use subgraph.summary as context in your LLM prompt
const prompt = `Given this system context:\n${subgraph.summary}\n\nAnswer: ...`;
This keeps your prompt context tight. Instead of 50,000 tokens of raw source files, you’re passing a 2,000-token structured summary of exactly the relevant portion of the codebase.
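If you ever need to build compact context yourself from a list of relationships, the idea can be sketched as follows. This is not how subgraph.summary is produced; the `Triple` type, the arrow notation, and `renderContext` are assumptions chosen to show how a node budget bounds prompt size.

```typescript
type Triple = { from: string; label: string; to: string };

// Render one line per relationship and stop before the node budget is exceeded.
function renderContext(triples: Triple[], maxNodes: number): string {
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const t of triples) {
    // Nodes this relationship would introduce that are not in the context yet.
    const added = Array.from(new Set([t.from, t.to])).filter((id) => !seen.has(id));
    if (seen.size + added.length > maxNodes) break;
    added.forEach((id) => seen.add(id));
    lines.push(`${t.from} -[${t.label}]-> ${t.to}`);
  }
  return lines.join("\n");
}

const triples: Triple[] = [
  { from: "UserAuth", label: "calls", to: "validateToken" },
  { from: "validateToken", label: "checks", to: "Redis cache" },
  { from: "Redis cache", label: "configured by", to: "cache.yaml" },
];

const context = renderContext(triples, 3);
console.log(context);
```

With a budget of three nodes, the first two relationships fit and the third is dropped, so the prompt stays the same size no matter how large the underlying graph grows.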
Step 5: Integrate Graph Queries into Your Agent Loop
The real value of Graphify comes when it’s embedded into your agent’s reasoning loop — not called once and forgotten.
Here’s a pattern for a code-understanding agent that uses the graph iteratively:
async function codeAgent(question: string, graphId: string) {
  let context = '';

  // First pass: identify what entities are relevant
  const entities = await agent.graphify.query({
    graphId,
    query: `What entities are relevant to: ${question}`,
  });

  // Second pass: get relationships for those entities (capped at five)
  for (const node of entities.nodes.slice(0, 5)) {
    const related = await agent.graphify.traverse({
      graphId,
      startNode: node.name,
      depth: 1,
    });
    context += related.summary + '\n';
  }

  // Final pass: answer with focused context
  const answer = await agent.complete({
    prompt: `Context:\n${context}\n\nQuestion: ${question}`,
    model: 'claude-opus-4',
  });

  return answer;
}
This three-pass pattern — identify, expand, answer — keeps each LLM call focused and avoids the context-bloat problem that makes raw-file approaches expensive.
Step 6: Update the Graph When Source Material Changes
Knowledge graphs go stale when the underlying content changes. Graphify handles this with incremental updates:
// Update specific files without rebuilding the entire graph
await agent.graphify.update({
  graphId: graph.id,
  changedFiles: ['./src/auth/UserAuth.ts', './src/utils/tokenUtils.ts'],
});

// Or rebuild the full graph if changes are widespread
await agent.graphify.rebuild({
  graphId: graph.id,
  source: './src',
});
For codebases under active development, hook the update call into your CI pipeline or a file-watcher process so the graph stays current without manual intervention.
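In a CI job, the list of changed files usually comes from Git. The sketch below turns the text printed by `git diff --name-only` into a list suitable for an incremental update. The function name, the extension filter, and the "./" prefix (added to match the path style used in the update example) are assumptions.

```typescript
// Convert `git diff --name-only` output into changed source files under one directory.
function changedSourceFiles(diffOutput: string, sourceDir: string, extensions: string[]): string[] {
  const prefix = sourceDir.endsWith("/") ? sourceDir : sourceDir + "/";
  return diffOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith(prefix))
    .filter((line) => extensions.some((ext) => line.endsWith(ext)))
    .map((line) => "./" + line);
}

// Sample output, as Git would print it relative to the repository root.
const diff = "src/auth/UserAuth.ts\nREADME.md\nsrc/utils/tokenUtils.ts\nsrc/styles/site.css\n";
const changed = changedSourceFiles(diff, "src", [".ts"]);
console.log(changed);
```

Here the README and the stylesheet are ignored, leaving the two TypeScript files. If the resulting list is empty, the job can skip the update call entirely.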
Real-World Use Cases for Graphify
Codebase Q&A for Developer Agents
A developer agent equipped with a Graphify knowledge graph can answer architecture questions without reading source files directly. “Where is rate limiting applied?” returns the relevant nodes and their relationships — not a wall of code.
This is especially useful for onboarding workflows, where a new developer (or an agent acting on behalf of one) needs to understand an unfamiliar codebase quickly.
Research and Literature Mapping
Researchers working with large note collections can build a concept graph that maps how ideas relate. An agent querying this graph can answer “what papers does this concept connect to?” or “what’s the chain of reasoning between these two ideas?” — tasks that keyword search handles poorly.
Impact Analysis for Refactoring
Before changing a function, an agent can query the graph to find everything that depends on it — two hops, three hops, across the whole codebase. This kind of blast-radius analysis is tedious to do manually and error-prone with grep-based search.
API Documentation Agents
Technical writers and support agents can build a knowledge graph from API documentation and answer questions like “what endpoints require authentication?” or “which parameters are shared across these three methods?” without scanning every doc page.
How MindStudio Fits Into This Workflow
If you want to go beyond a single agent script and build a full workflow around Graphify — with a UI, scheduling, or integration with tools like Notion, GitHub, or Slack — MindStudio’s no-code builder is the fastest way to do it.
You can build a visual workflow that:
- Watches a GitHub repository for changes (via webhook)
- Triggers a Graphify rebuild when new commits land
- Exposes a chat interface where developers ask questions about the codebase
- Logs query results to Notion or posts summaries to Slack
This kind of multi-step, multi-tool workflow would take significant infrastructure work to build from scratch. In MindStudio’s visual builder, it’s a matter of connecting blocks — the average workflow takes 15 minutes to an hour to set up.
The Agent Skills SDK and the no-code builder work together. Developers can call agent.graphify() directly in code for maximum control, while non-technical teammates can build wrapper workflows in the visual interface without touching a line of JavaScript.
You can try MindStudio free at mindstudio.ai — no API keys or separate accounts needed for the 200+ models available on the platform.
Common Mistakes and How to Avoid Them
Including Generated or Vendored Code
Feeding node_modules, compiled output, or third-party libraries into Graphify bloats the graph with entities your agent doesn’t need to reason about. Filter these out before calling the skill.
Querying Too Broadly on Large Graphs
A query like “give me everything related to authentication” on a large enterprise codebase might return hundreds of nodes. Use maxNodes limits and follow up with targeted traversal queries to stay focused.
Rebuilding the Full Graph Too Often
Full rebuilds are expensive. Use incremental updates (graphify.update()) when only a few files changed. Reserve full rebuilds for major refactors or after adding large new modules.
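The choice between an incremental update and a full rebuild can be written down as a small decision function. This is a sketch of one possible heuristic; the 30% threshold and the `planRefresh` name are assumptions, so tune the ratio to your own codebase.

```typescript
type RefreshPlan =
  | { action: "none" }
  | { action: "update"; files: string[] }
  | { action: "rebuild" };

// Assumed heuristic: rebuild once more than `rebuildRatio` of all files have changed.
function planRefresh(changedFiles: string[], totalFiles: number, rebuildRatio: number = 0.3): RefreshPlan {
  if (changedFiles.length === 0) return { action: "none" };
  if (totalFiles > 0 && changedFiles.length / totalFiles > rebuildRatio) {
    return { action: "rebuild" };
  }
  return { action: "update", files: changedFiles };
}

const smallChange = planRefresh(["src/a.ts", "src/b.ts"], 100);
const largeChange = planRefresh(new Array(40).fill("src/file.ts"), 100);
console.log(smallChange.action, largeChange.action);
```

Two changed files out of a hundred yields an update, while forty out of a hundred crosses the threshold and yields a rebuild. Returning a plan object rather than calling the skill directly keeps the decision easy to test.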
Skipping Metadata
Setting includeMetadata: false speeds up graph construction slightly but removes file paths and line numbers from nodes. This makes it much harder for your agent to cite sources or navigate to specific locations in the codebase. Keep metadata on.
Not Versioning Graph IDs
If you’re running multiple environments (dev, staging, production), maintain separate graphs with separate IDs. Mixing environments leads to confusing query results where staging code bleeds into production agent responses.
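A simple guard against mixing environments is to resolve the graph ID through one lookup that refuses unknown names. The IDs below are placeholders and the `graphIdFor` helper is hypothetical; in practice each ID would come from that environment's own build step.

```typescript
type Environment = "dev" | "staging" | "production";

// Placeholder ids for illustration only.
const graphIds: Record<Environment, string> = {
  dev: "graph-dev-001",
  staging: "graph-staging-001",
  production: "graph-prod-001",
};

// Type guard: narrows an arbitrary string to a known environment name.
function isEnvironment(value: string): value is Environment {
  return value === "dev" || value === "staging" || value === "production";
}

function graphIdFor(envName: string): string {
  if (!isEnvironment(envName)) {
    throw new Error(`Unknown environment "${envName}"; refusing to guess a graph id`);
  }
  return graphIds[envName];
}

const stagingGraphId = graphIdFor("staging");
console.log(stagingGraphId);
```

Throwing on an unrecognized name is deliberate: falling back to a default graph is exactly how staging data ends up in production answers.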
Frequently Asked Questions
What types of content can Graphify process?
Graphify handles source code in most major languages (JavaScript, TypeScript, Python, Go, Rust, Java, and others), markdown and plain text notes, and mixed inputs that combine prose and code. JSON and YAML configuration files are supported as supplementary inputs — they’re parsed as structured data rather than treated as prose.
How much does using Graphify reduce token costs?
The reduction depends on what you’re replacing. If your agent was previously loading entire source files into context for each query, switching to subgraph queries typically reduces input tokens by 70–90% per agent turn. For a codebase with thousands of files, the savings compound significantly over many agent interactions.
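The arithmetic behind a savings estimate is straightforward. The sketch below uses the illustrative figures from the subgraph section of this guide (50,000 raw tokens versus a 2,000-token summary); the 100-turn count is an assumption, and the one-time cost of building the graph is not included.

```typescript
// Fraction of input tokens avoided on a single agent turn.
function tokenReduction(rawTokens: number, graphTokens: number): number {
  if (rawTokens <= 0) throw new Error("rawTokens must be positive");
  return (rawTokens - graphTokens) / rawTokens;
}

// Total input tokens avoided across a number of turns.
function tokensSaved(rawTokens: number, graphTokens: number, turns: number): number {
  return (rawTokens - graphTokens) * turns;
}

const reduction = tokenReduction(50000, 2000);
const savedOver100Turns = tokensSaved(50000, 2000, 100);
console.log(`${Math.round(reduction * 100)}% fewer input tokens per turn`);
console.log(`${savedOver100Turns} input tokens avoided over 100 turns`);
```

For these particular numbers the per-turn reduction is 96%, or 4,800,000 input tokens over 100 turns. Plugging in your own measured prompt sizes will give a more realistic figure than any general range.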
Does Graphify work with non-English content?
The natural language query interface works in any language supported by the underlying model. Graph construction from source code does not depend on the human language of the project, since programming-language syntax is the same everywhere. For prose notes in languages other than English, accuracy depends on the model’s multilingual capabilities, which vary by language.
How long does a Graphify knowledge graph last?
Graphs are persisted server-side and remain accessible via their graph ID until you explicitly delete them or they expire based on your plan’s retention policy. For active projects, plan to refresh graphs regularly rather than assuming they stay current.
Can multiple agents share the same knowledge graph?
Yes. A graph ID can be used by multiple agents running in parallel. This is useful for teams where several agents — a code reviewer, a documentation writer, and a refactoring agent — all need to reason about the same codebase without each maintaining their own copy.
Is Graphify suitable for very large codebases (millions of lines of code)?
At very large scales, you’ll want to partition the graph by module or service rather than building a single monolithic graph. Graphify supports this pattern — build separate graphs for separate subsystems, and have your agent decide which graph to query based on the task at hand. This keeps query performance fast and avoids graphs that are too large to traverse efficiently.
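Choosing which partition to query can be as simple as matching a file path against known prefixes. This is a minimal sketch of that routing step; the partition list, the graph IDs, and the `graphForPath` function are hypothetical.

```typescript
type Partition = { graphId: string; pathPrefix: string };

// Hypothetical partitions: one graph per service plus one for shared libraries.
const partitions: Partition[] = [
  { graphId: "graph-auth", pathPrefix: "services/auth/" },
  { graphId: "graph-billing", pathPrefix: "services/billing/" },
  { graphId: "graph-shared", pathPrefix: "libs/" },
];

// Pick the partition with the longest matching prefix; undefined when nothing matches.
function graphForPath(filePath: string, all: Partition[]): string | undefined {
  let best: Partition | undefined = undefined;
  for (const p of all) {
    const matches = filePath.startsWith(p.pathPrefix);
    if (matches && (best === undefined || p.pathPrefix.length > best.pathPrefix.length)) {
      best = p;
    }
  }
  return best === undefined ? undefined : best.graphId;
}

const target = graphForPath("services/billing/invoice.ts", partitions);
console.log(target);
```

Preferring the longest prefix lets you add a more specific partition later (for example, one nested service) without reordering the list. An undefined result is a signal to ask a broader question or to check whether the path belongs to an unindexed module.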
Key Takeaways
- Knowledge graphs let AI agents query structure instead of raw files, reducing token costs and improving response precision.
- The Graphify skill converts codebases and note collections into queryable graphs with nodes (entities) and edges (relationships).
- The core workflow has three parts: building the graph, querying it via natural language or structured traversal, and updating it incrementally as content changes.
- The biggest practical gains come from embedding graph queries into the agent’s reasoning loop — identify, expand, answer — rather than treating the graph as a one-shot lookup.
- Avoid common mistakes like including vendored code, querying too broadly, or skipping node metadata.
- MindStudio’s no-code builder lets you wrap Graphify-powered agents in full workflows with triggers, integrations, and UIs — without additional infrastructure work.
If you’re building agents that work with complex, interconnected information, a queryable knowledge graph is one of the most practical upgrades you can make. Start building on MindStudio to see how quickly you can move from raw files to structured, agent-ready knowledge.


