Custom Knowledge Graphs
Add your own knowledge graph to ARK CLI.
ARK CLI auto-discovers knowledge graphs from the data/ directory. You can add any knowledge graph that follows the expected format without any code changes.
Directory Structure
Each knowledge graph lives in its own subdirectory under data/ (for example, data/my-graph/) and must contain three files: graph.json, nodes.parquet, and edges.parquet.
ARK CLI scans for directories containing all three files at startup. Any directory missing one of them is skipped.
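To see which directories would pass this check before starting the CLI, the rule can be approximated in a few lines of Python. This is a minimal sketch, not ARK CLI's actual discovery code: the function name find_graph_dirs is hypothetical, and it assumes the three required files are graph.json, nodes.parquet, and edges.parquet as named on this page.

```python
from pathlib import Path

# File names taken from this page. The check itself is an illustrative
# sketch, not ARK CLI's actual discovery logic.
REQUIRED_FILES = ("graph.json", "nodes.parquet", "edges.parquet")


def find_graph_dirs(data_dir):
    """Split the subdirectories of data_dir into complete and incomplete graphs.

    Returns (complete, incomplete): complete is a sorted list of directory
    names, incomplete maps a directory name to the files it is missing.
    """
    complete, incomplete = [], {}
    for child in sorted(Path(data_dir).iterdir()):
        if not child.is_dir():
            continue  # loose files under data/ are ignored
        missing = [name for name in REQUIRED_FILES if not (child / name).is_file()]
        if missing:
            incomplete[child.name] = missing
        else:
            complete.append(child.name)
    return complete, incomplete
```

Calling find_graph_dirs("data") from the project root returns the directories that contain all three files, plus a mapping that shows what each skipped directory lacks.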
The graph.json Manifest
Each graph directory must contain a graph.json file that describes the graph's metadata:
{
  "id": 4,
  "name": "My Custom Graph",
  "description": "A detailed description of your knowledge graph. This text is included in the AI agent's system prompt, so be descriptive about what entities and relationships the graph contains.",
  "shortDescription": "Brief one-liner for the selection UI.",
  "color": "#ff6b6b",
  "order": 4,
  "category": "Custom"
}

The description field is included in the AI agent's system prompt to help it understand the graph's contents and purpose. Write it as if you're explaining the graph to a researcher.
For the full field reference, see Graph Metadata Schema.
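A manifest can be sanity-checked before the CLI reads it. The sketch below only looks for the keys shown in the example manifest and their JSON types. Treating every one of those keys as required is an assumption made for illustration; the Graph Metadata Schema remains the authoritative reference. The helper name check_manifest is hypothetical.

```python
import json

# Keys and types copied from the example manifest on this page. Treating all
# of them as required is an assumption, not a statement of the real schema.
EXPECTED_KEYS = {
    "id": int,
    "name": str,
    "description": str,
    "shortDescription": str,
    "color": str,
    "order": int,
    "category": str,
}


def check_manifest(text):
    """Return a list of problems found in graph.json text (empty if none)."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(manifest, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for key, expected_type in EXPECTED_KEYS.items():
        if key not in manifest:
            problems.append(f"missing key: {key}")
            continue
        value = manifest[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    return problems
```

For example, check_manifest(open("data/my-graph/graph.json", encoding="utf-8").read()) returns an empty list when every key from the example is present with the expected type.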
Parquet File Schemas
The nodes.parquet and edges.parquet files must follow specific column schemas. See Node Schema and Edge Schema in the Data Model reference for the required columns and types.
Creating Parquet Files
You can generate Parquet files from common data formats using Python:
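from pathlib import Path

import pandas as pd

# Create the graph directory if it does not exist yet; to_parquet will not create it.
Path("data/my-graph").mkdir(parents=True, exist_ok=True)

# Node table: one row per node, with properties stored as a JSON string.
nodes = pd.DataFrame({
    "id": ["node_1", "node_2", "node_3"],
    "name": ["Aspirin", "Headache", "COX-2"],
    "type": ["drug", "disease", "gene/protein"],
    "properties": [
        '{"synonyms": ["ASA", "Acetylsalicylic acid"]}',
        '{"icd10": "R51"}',
        '{"full_name": "Cyclooxygenase-2"}'
    ]
})
nodes.to_parquet("data/my-graph/nodes.parquet", index=False)

# Edge table: "from" and "to" hold node ids from the node table above.
edges = pd.DataFrame({
    "from": ["node_1", "node_1"],
    "to": ["node_2", "node_3"],
    "type": ["treats", "targets"],
    "properties": ['{}', '{"mechanism": "inhibition"}']
})
edges.to_parquet("data/my-graph/edges.parquet", index=False)

Verifying Your Graph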
After adding your graph directory, restart ARK CLI:
pnpm cli
Your graph should appear in the selection list with its own dedicated agent. Select it and try a simple query like:
What types of nodes are in this graph?
Ensure your id field in graph.json does not conflict with existing graph IDs. The bundled graphs use IDs 1, 2, and 3.
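ID collisions can be caught before restarting by reading every manifest under data/. The sketch below assumes the bundled graphs' graph.json files live under the same data/ directory as yours. If they do not, compare your id against 1, 2, and 3 by hand. The helper name find_duplicate_ids is hypothetical.

```python
import json
from pathlib import Path


def find_duplicate_ids(data_dir):
    """Map each graph id used by more than one directory to those directory names."""
    seen = {}
    # Assumption: every graph, bundled or custom, has data_dir/<name>/graph.json.
    for manifest_path in sorted(Path(data_dir).glob("*/graph.json")):
        with manifest_path.open(encoding="utf-8") as fh:
            graph_id = json.load(fh)["id"]
        seen.setdefault(graph_id, []).append(manifest_path.parent.name)
    # Keep only the ids that appear in more than one directory.
    return {gid: dirs for gid, dirs in seen.items() if len(dirs) > 1}
```

An empty result from find_duplicate_ids("data") means no two manifests share an id.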
Tips
- Keep properties as a JSON string, not a nested object. DuckDB parses it at query time.
- Use descriptive type values for nodes and edges. The AI agent uses these to filter and reason about the graph.
- The name field on nodes is what the agent searches with findNodesByName, so use clear, recognizable names.
- Include common synonyms in the properties JSON if your entities have multiple names. The searchInSurroundings tool searches properties as well.
- Snappy compression is recommended for Parquet files. It is the default in pandas; polars defaults to zstd, so pass compression="snappy" to write_parquet there.
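The first and fourth tips can be combined in a small helper that turns a Python dict into the JSON string stored in the properties column. This is a sketch only: the helper name encode_properties is hypothetical, and the "synonyms" key follows the node example on this page rather than a documented requirement.

```python
import json


def encode_properties(props=None, synonyms=None):
    """Serialize node or edge properties to a JSON string for the properties column."""
    merged = dict(props or {})
    if synonyms:
        # "synonyms" mirrors the key used in the node example on this page.
        merged["synonyms"] = list(synonyms)
    # ensure_ascii=False keeps non-ASCII names readable; sort_keys keeps output stable.
    return json.dumps(merged, ensure_ascii=False, sort_keys=True)
```

For instance, encode_properties(synonyms=["ASA", "Acetylsalicylic acid"]) produces a string suitable for the properties value of the Aspirin node, and encode_properties() yields the empty object string for rows with no properties.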