LangChain integration for CockroachDB with native vector support
Quick Start • Features • Documentation • Examples • Contributing
Build LLM applications with CockroachDB's distributed SQL database and native vector search capabilities. This integration provides:
- 🎯 Native Vector Support - CockroachDB's `VECTOR` type
- 🚀 C-SPANN Indexes - Distributed vector indexes optimized for scale
- 🔄 Automatic Retries - Handles serialization errors transparently
- ⚡ Async & Sync APIs - Choose based on your use case
- 🏗️ Distributed by Design - Built for CockroachDB's architecture
```bash
pip install langchain-cockroachdb
```

```python
import asyncio

from langchain_cockroachdb import AsyncCockroachDBVectorStore, CockroachDBEngine
from langchain_openai import OpenAIEmbeddings


async def main():
    # Initialize
    engine = CockroachDBEngine.from_connection_string(
        "cockroachdb://user:pass@host:26257/db"
    )
    await engine.ainit_vectorstore_table(
        table_name="documents",
        vector_dimension=1536,
    )
    vectorstore = AsyncCockroachDBVectorStore(
        engine=engine,
        embeddings=OpenAIEmbeddings(),
        collection_name="documents",
    )

    # Add documents
    await vectorstore.aadd_texts([
        "CockroachDB is a distributed SQL database",
        "LangChain makes building LLM apps easy",
    ])

    # Search
    results = await vectorstore.asimilarity_search(
        "Tell me about databases",
        k=2,
    )
    for doc in results:
        print(doc.page_content)

    await engine.aclose()


asyncio.run(main())
```

- Native `VECTOR` type support with C-SPANN indexes
- Advanced metadata filtering (`$and`, `$or`, `$gt`, `$in`, etc.)
- Hybrid search (full-text + vector similarity)
- Multi-tenancy with namespace-based isolation and C-SPANN prefix columns
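
To make the filter operators listed above concrete, here is a minimal sketch of how a MongoDB-style filter dictionary could be evaluated against a document's metadata. The `matches` function is hypothetical and runs in plain Python for illustration only. It is not the library's implementation, which presumably translates filters into SQL, and the exact filter syntax the vector store accepts is an assumption here.

```python
from typing import Any, Mapping


def matches(metadata: Mapping[str, Any], flt: Mapping[str, Any]) -> bool:
    """Return True if `metadata` satisfies a MongoDB-style filter dict.

    Hypothetical helper: covers only $and, $or, $gt, $in and plain equality.
    """
    for key, cond in flt.items():
        if key == "$and":
            # Every sub-filter must match
            if not all(matches(metadata, sub) for sub in cond):
                return False
        elif key == "$or":
            # At least one sub-filter must match
            if not any(matches(metadata, sub) for sub in cond):
                return False
        elif isinstance(cond, Mapping):
            # Field-level operators, e.g. {"year": {"$gt": 2020}}
            value = metadata.get(key)
            for op, operand in cond.items():
                if op == "$gt":
                    ok = value is not None and value > operand
                elif op == "$in":
                    ok = value in operand
                else:
                    raise ValueError(f"unsupported operator: {op}")
                if not ok:
                    return False
        elif metadata.get(key) != cond:
            # Bare value means equality, e.g. {"pinned": True}
            return False
    return True


# Example filter: newer than 2020 AND (topic is sql/vector OR pinned)
example_filter = {
    "$and": [
        {"year": {"$gt": 2020}},
        {"$or": [{"topic": {"$in": ["sql", "vector"]}}, {"pinned": True}]},
    ]
}

print(matches({"year": 2023, "topic": "sql"}, example_filter))
print(matches({"year": 2019, "topic": "sql"}, example_filter))
```

Top-level keys are combined with an implicit AND, which is the usual convention for this filter style.
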
- Persistent conversation storage in CockroachDB
- Session management by thread ID
- Drop-in replacement for other LangChain chat history implementations
- Short-term memory for multi-turn LangGraph agents
- Human-in-the-loop with interrupt/resume support
- Both `CockroachDBSaver` (sync) and `AsyncCockroachDBSaver`
- Compatible with LangGraph's `compile(checkpointer=...)` interface
- Automatic retry logic with exponential backoff
- Connection pooling with health checks
- Configurable for different workloads
- Works with both SERIALIZABLE (default, recommended) and READ COMMITTED isolation
- Async-first design for high concurrency
- Sync wrapper for simple scripts
- Type-safe with full type hints
- Comprehensive test suite (177 tests)
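
The "automatic retry logic with exponential backoff" item above refers to a common pattern for CockroachDB clients: under SERIALIZABLE isolation, a contended transaction can fail with SQLSTATE `40001` and is expected to be retried. The sketch below shows the general idea in plain Python. `SerializationError` and `run_with_retry` are hypothetical names used for illustration, and the default values are assumptions rather than this package's actual configuration.

```python
import random
import time


class SerializationError(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""

    sqlstate = "40001"


def run_with_retry(fn, max_retries=5, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call fn(); on SerializationError, wait and retry with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except SerializationError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Delay doubles on each attempt, is capped at max_delay,
            # and is scaled by random jitter to avoid synchronized retries.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))


# Usage: a transaction body that fails twice before succeeding
attempts = {"count": 0}


def flaky_transaction():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SerializationError("restart transaction")
    return "committed"


print(run_with_retry(flaky_transaction, base_delay=0.01))
```

The function passed in should contain the whole transaction, so that each retry starts from a clean state.
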
LangChain Official Integration Docs:
Getting Started:
Guides:
- Vector Store
- Vector Indexes
- Hybrid Search
- Chat History
- Multi-Tenancy
- LangGraph Checkpointer
- Async vs Sync
- `quickstart.py` - Get started in 5 minutes
- `sync_usage.py` - Synchronous API
- `vector_indexes.py` - Index optimization
- `hybrid_search.py` - FTS + vector search
- `metadata_filtering.py` - Advanced queries
- `chat_history.py` - Persistent conversations
- `checkpointer.py` - LangGraph checkpointer
- `multi_tenancy.py` - Namespace-based multi-tenancy
- `retry_configuration.py` - Configuration patterns
```bash
# Clone repository
git clone https://github.com/cockroachdb/langchain-cockroachdb.git
cd langchain-cockroachdb

# Install dependencies
pip install -e ".[dev]"

# Start CockroachDB
docker-compose up -d

# Run tests
make test
```

```bash
# Install docs dependencies
pip install -e ".[docs]"

# Serve documentation locally
mkdocs serve

# Open http://127.0.0.1:8000
```

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
- Distributed SQL - Scale horizontally across regions
- Native Vector Support - First-class `VECTOR` type and C-SPANN indexes
- Strong Consistency - SERIALIZABLE isolation by default, READ COMMITTED also supported
- Cloud Native - Deploy anywhere (IBM, AWS, GCP, Azure, on-prem)
- PostgreSQL Compatible - Familiar SQL with distributed superpowers
Apache License 2.0 - see LICENSE for details.
Built for the CockroachDB and LangChain communities.
- CockroachDB - Distributed SQL database
- LangChain - LLM application framework