Python SDK (v0.2.0)
The GibRAM Python SDK provides a GraphRAG-style workflow for indexing documents and querying knowledge graphs with minimal code.
What is the Python SDK?
A high-level library that automates the GraphRAG pipeline.
Instead of manually handling each step, you write:
from gibram import GibRAMIndexer
indexer = GibRAMIndexer(session_id="my-project")
stats = indexer.index_documents(["Your document text here"])
result = indexer.query("Your question")
The SDK handles:
- ✅ Text chunking (with overlap)
- ✅ Entity & relationship extraction (via LLM)
- ✅ Embedding generation (via OpenAI)
- ✅ Graph storage (via GibRAM protocol)
- ✅ Community detection (hierarchical clustering)
- ✅ Hybrid query (vector + graph traversal)
When to Use the SDK
Use the Python SDK when:
- You want a GraphRAG workflow out of the box
- You're building RAG applications
- You need automatic entity extraction
- You want customizable components (chunker, extractor, embedder)
Don't use the SDK when:
- You need custom protocol operations (use the low-level client)
- You're building non-Python applications (use the Go client or implement the protocol)
- You have pre-extracted entities (use the protocol directly)
Architecture
High-Level vs Low-Level
High-Level (GibRAMIndexer):
# All-in-one workflow
indexer = GibRAMIndexer(session_id="my-project")
stats = indexer.index_documents(documents)
Low-Level (_Client):
# Manual protocol operations
from gibram._client import _Client
client = _Client(host="localhost", port=6161, session_id="test")
client.connect()
doc_id = client.add_document(external_id="doc-001", filename="file.txt")
entity_id = client.add_entity(
    external_id="ent-001",
    title="EINSTEIN",
    entity_type="person",
    description="...",
    embedding=[0.1, 0.2, ...]  # You provide the embedding
)
When to use low-level: Custom workflows, pre-computed embeddings, fine-grained control.
Component Architecture
The SDK is modular. You can swap components:
from gibram import GibRAMIndexer
from gibram.chunkers import TokenChunker
from gibram.extractors import OpenAIExtractor
from gibram.embedders import OpenAIEmbedder
# Default (OpenAI for everything)
indexer = GibRAMIndexer(session_id="my-project")
# Custom components
indexer = GibRAMIndexer(
    session_id="my-project",
    chunker=TokenChunker(chunk_size=1024, chunk_overlap=100),
    extractor=OpenAIExtractor(model="gpt-4o-mini"),
    embedder=OpenAIEmbedder(model="text-embedding-3-large", dimensions=3072)
)
Interfaces:
- BaseChunker - Splits text into chunks
- BaseExtractor - Extracts entities and relationships
- BaseEmbedder - Generates embeddings
See Custom Components for implementation guide.
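To make the idea of a swappable chunker concrete, here is a minimal standalone sketch of a sentence-based chunker. It does not subclass BaseChunker, because the interface is not shown on this page: the class name SentenceChunker, the method name `chunk`, and its signature are all assumptions for illustration. The Custom Components guide is the reference for the real interface.

```python
import re


class SentenceChunker:
    """Hypothetical chunker: groups whole sentences into chunks of at most max_words words.

    The `chunk(text) -> list[str]` method is an assumed shape, not the SDK's
    documented BaseChunker interface.
    """

    def __init__(self, max_words=100):
        if max_words <= 0:
            raise ValueError("max_words must be positive")
        self.max_words = max_words

    def chunk(self, text):
        # Split after sentence-ending punctuation followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        chunks, current, count = [], [], 0
        for sentence in sentences:
            n = len(sentence.split())
            # Start a new chunk when adding this sentence would exceed the limit.
            if current and count + n > self.max_words:
                chunks.append(" ".join(current))
                current, count = [], 0
            current.append(sentence)
            count += n
        if current:
            chunks.append(" ".join(current))
        return chunks


text = (
    "Albert Einstein developed the theory of relativity. "
    "He received the Nobel Prize in Physics in 1921."
)
small_chunks = SentenceChunker(max_words=10).chunk(text)
large_chunks = SentenceChunker(max_words=20).chunk(text)
print(len(small_chunks), len(large_chunks))  # → 2 1
```

Keeping sentences whole trades exact chunk sizes for cleaner extraction inputs; a sentence longer than max_words still becomes its own chunk.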
Installation
Requirements:
- Python 3.8+
- GibRAM server running (see Getting Started)
Dependencies (auto-installed):
- protobuf - Protocol communication
- openai - Default LLM and embeddings
- tqdm - Progress bars
Quick Example
from gibram import GibRAMIndexer
# Initialize
indexer = GibRAMIndexer(
    session_id="my-project",
    llm_api_key="sk-...",  # Or set OPENAI_API_KEY
)
# Index
stats = indexer.index_documents([
    "Albert Einstein developed the theory of relativity.",
    "He received the Nobel Prize in Physics in 1921.",
])
print(f"Entities extracted: {stats.entities_extracted}")
print(f"Relationships: {stats.relationships_extracted}")
# Query
result = indexer.query("Einstein's achievements", top_k=5)
for entity in result.entities:
    print(f"{entity.title}: {entity.score:.3f}")
Key Concepts
Session Isolation
Each GibRAMIndexer instance operates in an isolated session:
# Project A data
indexer_a = GibRAMIndexer(session_id="project-a")
indexer_a.index_documents(docs_a)
# Project B data (completely separate)
indexer_b = GibRAMIndexer(session_id="project-b")
indexer_b.index_documents(docs_b)
# Queries only see data from their session
result_a = indexer_a.query("query") # Only searches project-a data
Best Practice: Use one session per project or experiment.
Indexing Pipeline
When you call index_documents():
- Chunking: Splits documents into ~512 token chunks (configurable)
- Extraction: Calls LLM to extract entities and relationships per chunk
- Deduplication: Merges entities with same title across chunks
- Embedding: Generates embeddings for chunks and entities
- Storage: Stores in GibRAM server via protocol
- Linking: Links entities to their source chunks
- Clustering: Detects communities (if enabled)
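The deduplication and linking steps above can be pictured with a small standalone sketch. This is not the SDK's implementation: the dict fields (`title`, `type`, `description`, `chunk_id`) and the case-insensitive title match are assumptions chosen for illustration.

```python
def merge_entities(entities):
    """Merge entity dicts that share a title, keeping every source chunk id.

    Assumption: titles are compared case-insensitively after stripping
    whitespace. The SDK's actual matching rule is not documented here.
    """
    merged = {}
    for ent in entities:
        key = ent["title"].strip().upper()
        if key not in merged:
            merged[key] = {
                "title": key,
                "type": ent["type"],
                "descriptions": [],
                "chunk_ids": [],
            }
        record = merged[key]
        # Keep each distinct description and source chunk once.
        if ent["description"] not in record["descriptions"]:
            record["descriptions"].append(ent["description"])
        if ent["chunk_id"] not in record["chunk_ids"]:
            record["chunk_ids"].append(ent["chunk_id"])
    return list(merged.values())


extracted = [
    {"title": "Einstein", "type": "person", "description": "Physicist", "chunk_id": 0},
    {"title": "EINSTEIN", "type": "person", "description": "Nobel laureate", "chunk_id": 1},
    {"title": "Nobel Prize", "type": "award", "description": "Physics prize", "chunk_id": 1},
]
merged = merge_entities(extracted)
print([(m["title"], m["chunk_ids"]) for m in merged])
# → [('EINSTEIN', [0, 1]), ('NOBEL PRIZE', [1])]
```

Tracking `chunk_ids` on the merged record is what lets an entity be linked back to every chunk that mentioned it.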
Cost: Each chunk requires one LLM extraction call, plus embedding calls for the chunk and its entities. See Quickstart for estimation.
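As a rough planning aid, the number of LLM extraction calls can be estimated from document lengths before indexing. The sketch below assumes a plain sliding-window chunker (each chunk starts chunk_size minus chunk_overlap tokens after the previous one) and that you already have per-document token counts; the function names are hypothetical and the SDK's real chunk count may differ.

```python
import math


def estimate_chunks(doc_tokens, chunk_size=512, chunk_overlap=50):
    """Estimate the chunk count for one document under sliding-window chunking."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    if doc_tokens <= 0:
        return 0
    if doc_tokens <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap
    # One full first chunk, then one more chunk per step until the end is covered.
    return 1 + math.ceil((doc_tokens - chunk_size) / step)


def estimate_llm_calls(doc_token_counts, chunk_size=512, chunk_overlap=50):
    """One LLM extraction call per chunk, summed over all documents."""
    return sum(estimate_chunks(n, chunk_size, chunk_overlap) for n in doc_token_counts)


token_counts = [1200, 300, 5000]  # illustrative document sizes in tokens
print(estimate_llm_calls(token_counts))  # → 15
print(estimate_llm_calls(token_counts, chunk_size=1024, chunk_overlap=100))  # → 9
```

Doubling the chunk size here drops the estimate from 15 to 9 calls, which is the cost side of the chunk-size trade-off.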
Query Execution
When you call query():
- Embedding: Generates query embedding
- Vector Search: Finds similar chunks, entities, communities
- Graph Traversal: Expands via relationships (k-hops)
- Ranking: Combines similarity scores
- Return: Sorted results with scores
Results include:
- entities - Relevant entities (with descriptions)
- text_units - Source chunks (with content)
- communities - Cluster summaries (if available)
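The graph traversal step (expanding via relationships for k hops) can be sketched as a breadth-first search from the entities found by vector search. This is an illustration only: treating relationships as undirected edges, the function name, and the sample edges are assumptions, and the server's actual traversal and ranking are not described on this page.

```python
from collections import deque


def expand_k_hops(seeds, edges, k=1):
    """Return {entity: hop distance} for every entity within k hops of the seeds."""
    # Build an undirected adjacency map from (source, target) pairs.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # Do not expand past the hop limit.
        for nxt in sorted(neighbors.get(node, ())):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)
    return distance


edges = [
    ("EINSTEIN", "RELATIVITY"),
    ("EINSTEIN", "NOBEL PRIZE"),
    ("NOBEL PRIZE", "PHYSICS"),
    ("PHYSICS", "NEWTON"),
]
reached = expand_k_hops(["EINSTEIN"], edges, k=2)
print(reached)
# → {'EINSTEIN': 0, 'NOBEL PRIZE': 1, 'RELATIVITY': 1, 'PHYSICS': 2}
```

A ranking step could then discount entities by hop distance before combining them with similarity scores; how GibRAM weights the two is not specified here.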
Configuration
Essential Parameters
indexer = GibRAMIndexer(
    session_id="required-unique-id",  # REQUIRED
    host="localhost",                 # Server host
    port=6161,                        # Server port
)
LLM Configuration
indexer = GibRAMIndexer(
    llm_provider="openai",  # Only OpenAI currently
    llm_api_key="sk-...",   # Or env OPENAI_API_KEY
    llm_model="gpt-4o",     # Default: gpt-4o
)
Supported models:
- gpt-4o - Best quality (default)
- gpt-4o-mini - Faster, cheaper
- gpt-4-turbo - Alternative
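The LLM configuration accepts the API key either as an argument or through the OPENAI_API_KEY environment variable. A small sketch of that fallback, useful for failing fast before an indexing run, might look like the following. The helper name and the precedence (explicit argument first, then environment) are assumptions, not documented SDK behavior.

```python
import os


def resolve_api_key(llm_api_key=None, env=None):
    """Return the explicit key if given, else OPENAI_API_KEY from the environment."""
    env = os.environ if env is None else env
    key = llm_api_key or env.get("OPENAI_API_KEY")
    if not key:
        raise ValueError("No API key: pass llm_api_key or set OPENAI_API_KEY")
    return key


# Passing a dict for `env` keeps the example independent of the real environment.
explicit = resolve_api_key("sk-explicit", env={"OPENAI_API_KEY": "sk-env"})
from_env = resolve_api_key(None, env={"OPENAI_API_KEY": "sk-env"})
print(explicit == "sk-explicit", from_env == "sk-env")  # → True True
```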
Embedding Configuration
indexer = GibRAMIndexer(
    embedding_provider="openai",
    embedding_model="text-embedding-3-small",  # Default
    embedding_dimensions=1536,                 # MUST match server
)
⚠️ CRITICAL: embedding_dimensions must match the server's vector_dim setting.
Supported models:
- text-embedding-3-small - 1536 dims (default)
- text-embedding-3-large - 3072 dims (higher quality)
- text-embedding-ada-002 - 1536 dims (legacy)
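Because a dimension mismatch is easy to introduce when switching models, a pre-flight check can catch it before any documents are indexed. The sketch below is a standalone helper, not part of the SDK: the function name is hypothetical, and it assumes you know the server's vector_dim from its configuration. The model sizes are the ones listed above.

```python
# Output sizes of the supported models, as listed in this section.
MODEL_MAX_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}


def check_embedding_dimensions(model, embedding_dimensions, server_vector_dim):
    """Raise ValueError if the configured dimensions cannot work with the server."""
    if embedding_dimensions != server_vector_dim:
        raise ValueError(
            f"embedding_dimensions={embedding_dimensions} does not match "
            f"server vector_dim={server_vector_dim}"
        )
    max_dims = MODEL_MAX_DIMENSIONS.get(model)
    if max_dims is not None and embedding_dimensions > max_dims:
        raise ValueError(
            f"{model} produces at most {max_dims} dimensions, "
            f"got {embedding_dimensions}"
        )


check_embedding_dimensions("text-embedding-3-small", 1536, 1536)  # passes silently

try:
    check_embedding_dimensions("text-embedding-3-large", 3072, 1536)
except ValueError as exc:
    message = str(exc)
    print(message)
```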
Chunking Configuration
indexer = GibRAMIndexer(
    chunk_size=512,    # Tokens per chunk
    chunk_overlap=50,  # Overlap between chunks
)
Trade-offs:
- Larger chunks: Fewer LLM calls (cheaper), but less precise retrieval
- Smaller chunks: More LLM calls (costlier), but more precise retrieval
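To show how chunk_size and chunk_overlap interact, here is a minimal sliding-window sketch over a list of tokens. It is an illustration under assumptions, not the SDK's TokenChunker: the function name is hypothetical and placeholder strings stand in for real tokenizer output.

```python
def chunk_tokens(tokens, chunk_size=512, chunk_overlap=50):
    """Split tokens into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # The last window already reaches the end of the document.
    return chunks


tokens = [f"t{i}" for i in range(1200)]  # placeholder tokens
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # → [512, 512, 276]
```

The last 50 tokens of each chunk reappear at the start of the next one, so an entity mention that straddles a boundary is still seen whole by at least one extraction call.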
Community Detection
indexer = GibRAMIndexer(
    auto_detect_communities=True,  # Default: True
    community_resolution=1.0,      # Higher = more granular clusters
)
Error Handling
from gibram.exceptions import (
    GibRAMError,
    ConfigurationError,
    ConnectionError,
    ExtractionError,
    EmbeddingError,
)
try:
    indexer = GibRAMIndexer(session_id="test")
    stats = indexer.index_documents(documents)
except ConfigurationError as e:
    print(f"Configuration issue: {e}")
except ConnectionError as e:
    print(f"Can't reach server: {e}")
except ExtractionError as e:
    print(f"LLM extraction failed: {e}")
except EmbeddingError as e:
    print(f"Embedding generation failed: {e}")
Performance
Typical throughput:
- Indexing: ~5-10 documents/minute (depends on LLM API latency)
- Querying: ~100-200ms per query
Bottlenecks:
1. LLM API calls (dominant)
2. Embedding API calls
3. Network latency to GibRAM server
Optimization tips:
- Increase batch_size for faster embedding calls
- Use gpt-4o-mini instead of gpt-4o for faster extraction
- Disable auto_detect_communities if not needed
Next Steps
- Quickstart - Step-by-step examples
- Indexing Workflow - Deep dive into indexing
- Query Workflow - Deep dive into querying
- Custom Components - Build your own chunker/extractor/embedder