Thank you for your interest in contributing to langchain-cockroachdb! This document provides guidelines and instructions for contributing.
Please be respectful and constructive in all interactions. We aim to foster an open and welcoming environment.
- Python 3.10 or higher
- uv for package management
- Docker for running CockroachDB locally
- Git
- Clone the repository
    git clone https://github.com/cockroachdb/langchain-cockroachdb.git
    cd langchain-cockroachdb
- Create virtual environment and install dependencies
    uv venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    uv pip install -e ".[dev]"
- Start CockroachDB locally
    docker-compose up -d
- Verify setup
    pytest tests/unit -v
# Unit tests (fast, no database required)
pytest tests/unit -v
# Integration tests (requires CockroachDB)
docker-compose up -d
pytest tests/integration -v
# All tests
pytest tests -v
# With coverage
pytest tests --cov=langchain_cockroachdb --cov-report=html

Before submitting a PR, ensure your code passes all checks:
# Linting
ruff check langchain_cockroachdb tests
# Auto-fix linting issues
ruff check langchain_cockroachdb tests --fix
# Type checking
mypy langchain_cockroachdb
# Format code (if using black)
black langchain_cockroachdb tests

# Run single test file
pytest tests/unit/test_indexes.py -v
# Run single test
pytest tests/unit/test_indexes.py::TestCSPANNIndex::test_create_index_sql_basic -v
# Run tests matching pattern
pytest tests -k "vector" -v

When reporting issues, please include:
- Clear description of the problem
- Steps to reproduce
- Expected vs actual behavior
- CockroachDB version
- Python version
- Relevant code snippets or error messages
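To make the version details in the list above easy to collect, a small script can print them in a form that pastes directly into an issue. This is a minimal sketch, not part of the project: the function name `collect_environment_report` is hypothetical, and it assumes the installed distribution is named `langchain-cockroachdb`, matching the repository name. The CockroachDB version is not covered here; running `SELECT version();` in a SQL shell reports it.

```python
import platform
from importlib import metadata


def collect_environment_report(package: str = "langchain-cockroachdb") -> dict[str, str]:
    """Gather version details worth pasting into a bug report.

    The default distribution name is an assumption based on the repository name.
    """
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the active environment.
        package_version = "not installed"
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        package: package_version,
    }


if __name__ == "__main__":
    for name, value in collect_environment_report().items():
        print(f"{name}: {value}")
```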
- Fork and create a branch
    git checkout -b feature/your-feature-name
- Make your changes
  - Write clear, concise code
  - Add tests for new functionality
  - Update documentation as needed
  - Follow existing code style
- Test thoroughly
    pytest tests -v
    ruff check langchain_cockroachdb tests
    mypy langchain_cockroachdb
- Commit with clear messages
    git add .
    git commit -m "feat: add support for X

    - Implement feature X
    - Add tests for X
    - Update documentation

    Closes #123"
- Push and create PR
    git push origin feature/your-feature-name

Follow conventional commits:
- feat: New feature
- fix: Bug fix
- docs: Documentation changes
- test: Test additions or changes
- refactor: Code refactoring
- perf: Performance improvements
- chore: Maintenance tasks
Example:
feat: add support for metadata filtering in vector search
- Implement _build_filter_clause for complex filters
- Support $and, $or, $gt, $lt operators
- Add comprehensive tests
- Update documentation
Closes #42
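
The subject-line format used in the example above can be checked mechanically before committing. The following is a sketch only: `is_conventional_subject` is a hypothetical helper, not something the repository provides, and accepting an optional `(scope)` and `!` marker is an assumption taken from the general Conventional Commits convention rather than from this project's rules.

```python
import re

# Commit types listed in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "test", "refactor", "perf", "chore")

# "<type>[(scope)][!]: <description>". The scope and "!" parts are assumptions
# borrowed from the Conventional Commits convention.
SUBJECT_PATTERN = re.compile(
    r"^(?:" + "|".join(COMMIT_TYPES) + r")(?:\([\w./-]+\))?!?: \S.*$"
)


def is_conventional_subject(subject: str) -> bool:
    """Return True if the first line of a commit message matches the convention."""
    return SUBJECT_PATTERN.match(subject) is not None
```

For instance, `is_conventional_subject("feat: add support for metadata filtering in vector search")` is accepted, while a subject with no type prefix or with no space after the colon is rejected.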
- Follow PEP 8
- Use type hints for all function signatures
- Maximum line length: 100 characters
- Use descriptive variable names
- Add docstrings for all public functions/classes
async def asimilarity_search(
    self,
    query: str,
    k: int = 4,
    filter: Optional[dict] = None,
    **kwargs: Any,
) -> list[Document]:
    """Search for similar documents.

    Args:
        query: Query text
        k: Number of results to return
        filter: Optional metadata filter
        **kwargs: Additional arguments

    Returns:
        List of matching documents

    Raises:
        ValueError: If query is empty

    Example:
        ```python
        results = await store.asimilarity_search("database", k=5)
        ```
    """

- Write tests for all new features
- Maintain >80% code coverage
- Use descriptive test names: test_<function>_<scenario>_<expected_result>
- Follow AAA pattern: Arrange, Act, Assert
- Use fixtures for common setup
Example:
async def test_asimilarity_search_with_filter_returns_filtered_results(
    vectorstore: AsyncCockroachDBVectorStore,
    sample_texts: list[str],
    sample_metadatas: list[dict],
) -> None:
    """Test that similarity search respects metadata filters."""
    # Arrange
    await vectorstore.aadd_texts(sample_texts, metadatas=sample_metadatas)
    filter_dict = {"category": {"$eq": "database"}}

    # Act
    results = await vectorstore.asimilarity_search("query", k=5, filter=filter_dict)

    # Assert
    assert len(results) > 0
    for doc in results:
        assert doc.metadata.get("category") == "database"

When implementing features that use transactions:
- Use smaller transactions when possible
- Implement retry logic for serializable isolation
- Consider using the run_transaction() helper
- Test with induced contention
- Default to smaller batch sizes (100-500) for vector inserts
- Use C-SPANN indexes with appropriate partition sizes
- Test with various distance strategies (cosine, L2, inner product)
- Consider prefix columns for multi-tenant scenarios
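
The advice above on small transactions, retry logic, and batch sizes can be combined into one pattern: split the rows into batches and retry each batch on its own. The sketch below is illustrative and database-free. `is_retryable`, `with_retry`, `batched`, and `insert_in_batches` are hypothetical names, and `insert_batch` stands for any callable that writes one batch in its own transaction. It assumes the driver exposes the SQLSTATE as `sqlstate` (psycopg 3) or `pgcode` (psycopg2), and it does not model the project's `run_transaction()` helper, whose signature is not shown in this guide.

```python
import random
import time
from collections.abc import Callable, Iterator, Sequence
from functools import partial
from typing import TypeVar

T = TypeVar("T")
R = TypeVar("R")

# SQLSTATE that CockroachDB returns for retryable serialization errors.
SERIALIZATION_FAILURE = "40001"


def is_retryable(exc: Exception) -> bool:
    """Return True if the exception carries SQLSTATE 40001.

    Assumption: the code is exposed as `sqlstate` or `pgcode` on the exception.
    """
    code = getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)
    return code == SERIALIZATION_FAILURE


def with_retry(operation: Callable[[], R], max_attempts: int = 5, base_delay: float = 0.05) -> R:
    """Run `operation`, retrying retryable errors with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return operation()
        except Exception as exc:
            # Give up on non-retryable errors or once attempts are exhausted.
            if not is_retryable(exc) or attempt >= max_attempts:
                raise
        time.sleep(base_delay * (2 ** (attempt - 1)) * random.uniform(0.5, 1.5))
        attempt += 1


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def insert_in_batches(
    rows: Sequence[T],
    insert_batch: Callable[[Sequence[T]], object],
    batch_size: int = 200,
) -> int:
    """Insert rows batch by batch; each batch is retried independently."""
    inserted = 0
    for batch in batched(rows, batch_size):
        with_retry(partial(insert_batch, batch))
        inserted += len(batch)
    return inserted
```

Retrying per batch keeps each transaction small, so a retry repeats only one batch rather than the whole insert. Because a retried batch runs again from the start, `insert_batch` should roll back cleanly on failure.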
When adding performance-critical features:
- Benchmark with realistic data volumes
- Test with different index configurations
- Document performance characteristics
- Compare with and without indexes
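
For the comparisons listed above, a small timing helper is often enough to get repeatable numbers. This is a generic sketch, not a project utility: `time_calls` is a hypothetical name, and the callable passed in would wrap whatever query or insert is being measured.

```python
import statistics
import time
from collections.abc import Callable


def time_calls(
    operation: Callable[[], object],
    repeats: int = 20,
    warmup: int = 3,
) -> dict[str, float]:
    """Call `operation` repeatedly and return latency statistics in milliseconds."""
    # Warm-up calls are not measured, so caches and connections settle first.
    for _ in range(warmup):
        operation()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)

    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

Running the same callable once before and once after creating an index, against the same data volume, gives a like-for-like comparison to include in the PR description.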
- All public APIs
- Configuration options
- Usage examples
- Performance considerations
- Migration guides (if applicable)
- Docstrings in code
- README.md for quick start
- Examples in the examples/ directory
- Architecture decisions in code comments
- Open an issue for bugs or feature requests
- Join discussions in GitHub Discussions
- Tag maintainers for urgent issues
- Check existing issues and PRs first
By contributing, you agree that your contributions will be licensed under the Apache License 2.0.