Skip to content

Conversation

@strawgate
Copy link
Owner

@strawgate strawgate commented Jan 11, 2026

This PR adds a table_config parameter to DynamoDBStore that allows users to pass additional configuration options for DynamoDB table creation.

Changes

  • Added table_config: dict[str, Any] | None = None parameter to all __init__ overloads
  • User-provided config is merged with default table creation parameters
  • Added comprehensive test for SSESpecification
  • Fully backwards compatible

Supported Configurations

  • SSESpecification (encryption at rest)
  • Tags (resource tagging)
  • StreamSpecification (DynamoDB Streams)
  • Any other AWS DynamoDB table creation parameters

Fixes #251


Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Per-store auto-create option (default true) added across stores (DynamoDB, disk, DuckDB, Elasticsearch, FileTree, MongoDB, RocksDB, multi-disk) to control automatic creation of tables/directories/indexes.
    • DynamoDB: optional per-store table_config support so custom table settings (SSE, tags, stream specs, etc.) are respected and merged with required defaults.
  • Tests

    • Added tests verifying DynamoDB table_config behavior and auto-create true/false behaviors.

✏️ Tip: You can customize this high-level summary in your review settings.

…able creation

This adds a `table_config` parameter to `DynamoDBStore.__init__()` that allows
users to pass additional configuration options to DynamoDB's create_table() call.

This enables users to configure:
- SSESpecification (encryption at rest)
- Tags (resource tagging)
- StreamSpecification (DynamoDB Streams)
- Any other DynamoDB table creation parameters

The user-provided config is merged with the default parameters, allowing
both customization and sensible defaults.

Fixes #251

Co-authored-by: Bill Easton <[email protected]>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 11, 2026

📝 Walkthrough

Walkthrough

Adds per-store DynamoDB configuration via a new table_config: dict[str, Any] parameter and an auto_create: bool flag on DynamoDBStore. During _setup, if the table is missing the code either raises a ValueError when auto_create is False or builds create_table parameters by merging required defaults (TableName, KeySchema, AttributeDefinitions, BillingMode) with table_config and calls create_table, then waits for the table to become active and configures TTL. Tests added verify SSESpecification via table_config and the auto_create behaviors.

Possibly related PRs

  • PR 170: Modifies DynamoDBStore._setup and table creation flow to configure TTL at creation time — overlaps the DynamoDB create_table logic changes.
  • PR 89: Updates constructor/init logic for disk-based stores (e.g., DiskStore/MultiDiskStore) — related to the added auto_create constructors and directory creation handling.
  • PR 210: Changes ElasticsearchStore constructors and instance fields — related to adding an auto_create flag and wiring instance state in store implementations.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the primary change: adding a table_config parameter to DynamoDBStore for flexible table configuration.
Linked Issues check ✅ Passed The PR addresses issue #251 by enabling SSESpecification and other table-creation options via table_config, and extends auto_create across all stores as suggested in author comments.
Out of Scope Changes check ✅ Passed Changes are scoped to adding auto_create and table_config parameters across eight store implementations with consistent behavior patterns; all changes align with #251 and author's proposal.
Docstring Coverage ✅ Passed Docstring coverage is 90.91% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/issue-251-20260101-1948

Comment @coderabbitai help to get the list of available commands and usage tips.

@claude
Copy link

claude bot commented Jan 11, 2026

Test Failure Analysis

Summary: The codegen_check CI job failed due to markdown linting errors in table formatting across multiple documentation files. These are pre-existing issues unrelated to the DynamoDB changes in this PR.

Root Cause: The markdownlint tool (MD060 rule) is detecting improper table column spacing in several markdown files:

  • AGENTS.md (line 252)
  • docs/adapters.md (line 10)
  • docs/stores.md (lines 33, 319, 320, 398)
  • docs/wrappers.md (line 10)

The tables are using format |---|---|---| but markdownlint expects "compact" style tables to have spaces: | --- | --- | --- |.

These errors existed before this PR - they were introduced in commit 6736113 ("Bump library to 0.3.0") which added the initial documentation. This PR only modifies DynamoDB store code and tests, not any markdown files.

Suggested Solution: Fix the table formatting in the affected markdown files by adding spaces around pipes in table separator rows. You have two options:

Option 1: Fix the table formatting (recommended for long-term health)

Modify the following files to add spaces around the pipes in table separators:

AGENTS.md (line 252):

-|---------|---------|
+| ------- | ------- |

docs/adapters.md (line 10):

-|---------|-------------|
+| ------- | ----------- |

docs/stores.md (lines 33, 398):

-|-------|:---------:|:-----:|:----:|:------------|
+| ----- | :-------: | :---: | :--: | :---------- |

docs/stores.md (lines 319-320) - needs alignment fix:
The table at these lines needs proper column alignment to match the header widths.

docs/wrappers.md (line 10):

-|---------|-------------|
+| ------- | ----------- |

Option 2: Disable MD060 rule (quick fix)

Add this to .markdownlint.jsonc:

{
  "default": true,
  "MD013": { ... },
  "MD024": { ... },
  "MD060": false
}

Recommended approach:

Since the DynamoDB feature implementation in this PR is clean and functional, I recommend:

  1. Merge a separate PR to fix all table formatting issues across the documentation
  2. This PR can then be rebased/merged once the formatting is fixed on main

Alternatively, you could disable the MD060 rule temporarily to unblock this PR, then fix the formatting project-wide in a follow-up PR.

Detailed Analysis

Failed Job Logs

AGENTS.md:252:11 error MD060/table-column-style Table column style [Table pipe is missing space to the left for style "compact"]
AGENTS.md:252:21 error MD060/table-column-style Table column style [Table pipe is missing space to the left for style "compact"]
AGENTS.md:252:1 error MD060/table-column-style Table column style [Table pipe is missing space to the right for style "compact"]
AGENTS.md:252:11 error MD060/table-column-style Table column style [Table pipe is missing space to the right for style "compact"]
docs/adapters.md:10:11 error MD060/table-column-style Table column style [Table pipe is missing space to the left for style "compact"]
docs/adapters.md:10:25 error MD060/table-column-style Table column style [Table pipe is missing space to the left for style "compact"]
docs/stores.md:33:9 error MD060/table-column-style Table column style [Table pipe is missing space to the left for style "compact"]
... [38 more similar errors]

Code Changes in This PR

The PR only modifies:

  • key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py - adds table_config parameter
  • key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py - adds test for the new parameter

Both changes are well-implemented and follow the project patterns.

Related Files

Files causing the CI failure:

  • AGENTS.md - Project documentation for AI agents
  • docs/adapters.md - Documentation for adapters
  • docs/stores.md - Documentation for stores (including DynamoDB)
  • docs/wrappers.md - Documentation for wrappers

Files modified in this PR:

  • key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py - DynamoDB store implementation with new table_config parameter
  • key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py - Test coverage for the new parameter

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py:
- Around line 158-175: The current merge uses
create_table_params.update(self._table_config) which lets callers override
critical keys like TableName, KeySchema, AttributeDefinitions, and BillingMode;
change the merge so the store's required defaults take precedence (apply/merge
user-provided self._table_config first, then overlay create_table_params) so
those critical keys cannot be replaced before calling
self._connected_client.create_table; reference create_table_params,
self._table_config and the create_table call when making the change.

In @key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py:
- Around line 165-177: The DynamoDB table cleanup logic is duplicated; extract
it into a module-level async helper named cleanup_dynamodb_table(table_name:
str) that encapsulates the aioboto3.Session creation and the async client block
that suppresses exceptions, calls delete_table and waits with
waiter.get_waiter("table_not_exists"), then replace the duplicated blocks in the
store fixture and the test (the existing inline cleanup code around
DYNAMODB_ENDPOINT) to call cleanup_dynamodb_table(table_name) instead.
📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6736113 and 0d2c880.

📒 Files selected for processing (2)
  • key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py
  • key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py
🧰 Additional context used
🧬 Code graph analysis (1)
key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py (1)
key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py (1)
  • DynamoDBStore (29-269)
🔇 Additional comments (2)
key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py (1)

194-206: Pragmatic test approach for DynamoDB Local limitations.

The test correctly focuses on verifying that table_config doesn't break store operations rather than validating SSE configuration, which DynamoDB Local doesn't fully support. This is appropriate for the integration test environment.

key-value/key-value-aio/src/key_value/aio/stores/dynamodb/store.py (1)

44-116: Document which table_config keys must not be overridden.

The signatures and overloads are well-structured. However, the documentation should warn that critical parameters (TableName, KeySchema, AttributeDefinitions) must not be included in table_config as they are required for store operation.

Comment on lines +165 to +177
# Clean up table if it exists
import aioboto3

session = aioboto3.Session(
aws_access_key_id="test",
aws_secret_access_key="test", # noqa: S106
region_name="us-east-1",
)
async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client: # type: ignore
with contextlib.suppress(Exception):
await client.delete_table(TableName=table_name) # type: ignore
waiter = client.get_waiter("table_not_exists") # type: ignore
await waiter.wait(TableName=table_name) # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider extracting table cleanup into a helper function.

This cleanup logic duplicates the pattern in the store fixture (lines 96-109). Extract to a shared helper to reduce duplication.

♻️ Proposed refactor

Add a helper function at module level:

async def cleanup_dynamodb_table(table_name: str) -> None:
    """Delete a DynamoDB table if it exists and wait for deletion."""
    import aioboto3
    
    session = aioboto3.Session(
        aws_access_key_id="test",
        aws_secret_access_key="test",  # noqa: S106
        region_name="us-east-1",
    )
    async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client:  # type: ignore
        with contextlib.suppress(Exception):
            await client.delete_table(TableName=table_name)  # type: ignore
            waiter = client.get_waiter("table_not_exists")  # type: ignore
            await waiter.wait(TableName=table_name)  # type: ignore

Then use it in both places:

-        # Clean up table if it exists
-        import aioboto3
-
-        session = aioboto3.Session(
-            aws_access_key_id="test",
-            aws_secret_access_key="test",  # noqa: S106
-            region_name="us-east-1",
-        )
-        async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client:  # type: ignore
-            with contextlib.suppress(Exception):
-                await client.delete_table(TableName=table_name)  # type: ignore
-                waiter = client.get_waiter("table_not_exists")  # type: ignore
-                await waiter.wait(TableName=table_name)  # type: ignore
+        # Clean up table if it exists
+        await cleanup_dynamodb_table(table_name)
🤖 Prompt for AI Agents
In @key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py around lines
165 - 177, The DynamoDB table cleanup logic is duplicated; extract it into a
module-level async helper named cleanup_dynamodb_table(table_name: str) that
encapsulates the aioboto3.Session creation and the async client block that
suppresses exceptions, calls delete_table and waits with
waiter.get_waiter("table_not_exists"), then replace the duplicated blocks in the
store fixture and the test (the existing inline cleanup code around
DYNAMODB_ENDPOINT) to call cleanup_dynamodb_table(table_name) instead.

@strawgate
Copy link
Owner Author

@claude maybe we should just have a way to disable automatic table creation and make that the user's responsibility

@claude
Copy link

claude bot commented Jan 23, 2026

Claude finished @strawgate's task in 1m 16s —— View job


Analyzing Design Options

  • Read current DynamoDB store implementation
  • Analyze trade-offs between current approach vs. optional auto-creation
  • Review similar patterns in other stores
  • Provide comprehensive recommendation

Analysis: Optional Auto-Creation vs. Configurable Auto-Creation

Your suggestion to make table creation optional is worth considering. Let me break down the trade-offs:

Current Approach: table_config Parameter

Pros:

  • User-friendly: "Just works" out of the box - no setup required
  • Consistent with other stores (Elasticsearch auto-creates indices, MongoDB auto-creates collections)
  • Solves the immediate issue ([Feature] allow SSESpecification for dynamodb store #251) - allows SSESpecification and other config options
  • Backwards compatible

Cons:

  • CodeRabbit correctly identified a security concern: current merge order allows users to override critical parameters (TableName, KeySchema, AttributeDefinitions, BillingMode)
  • Users might accidentally break the store by overriding required structure
  • Less explicit about what's happening (auto-creation is "magic")

Alternative: Optional Auto-Creation Flag

Pros:

  • More explicit control: users who want to manage tables themselves can
  • Eliminates the parameter override security concern entirely
  • Better for production environments with strict infrastructure-as-code practices
  • Clearer separation of concerns

Cons:

  • More setup required for simple use cases
  • Breaking change if we remove auto-creation by default
  • Users need to understand DynamoDB table structure to create manually

Recommendation: Hybrid Approach

I recommend both approaches, with a small tweak to fix the security issue:

  1. Keep table_config but fix the merge order (as CodeRabbit suggested) so critical params can't be overridden:

    # Start with user config
    create_table_params = {**self._table_config}
    
    # Override with required parameters that cannot be changed
    create_table_params.update({
        "TableName": self._table_name,
        "KeySchema": [...],  # Required structure
        "AttributeDefinitions": [...],  # Required structure
        "BillingMode": "PAY_PER_REQUEST",  # Required billing
    })
  2. Add optional auto_create_table: bool = True parameter:

    def __init__(
        self,
        *,
        table_name: str,
        auto_create_table: bool = True,  # New parameter
        table_config: dict[str, Any] | None = None,
        ...
    ):

    Then in _setup():

    try:
        await self._connected_client.describe_table(TableName=self._table_name)
    except self._connected_client.exceptions.ResourceNotFoundException:
        if not self._auto_create_table:
            raise ValueError(
                f"Table '{self._table_name}' does not exist. "
                "Either create the table manually or set auto_create_table=True."
            )
        # ... existing create logic ...

This gives users three options:

  1. Simple: DynamoDBStore(table_name="my-table") - auto-creates with defaults
  2. Custom config: DynamoDBStore(table_name="my-table", table_config={"SSESpecification": {...}}) - auto-creates with custom options
  3. Manual creation: DynamoDBStore(table_name="my-table", auto_create_table=False) - expects pre-existing table

What to Do Now?

Option A: Fix merge order only (quick fix, addresses CodeRabbit's concern)

  • Pro: Minimal change, maintains current PR scope
  • Con: Doesn't address your concern about optional creation

Option B: Add auto_create_table flag + fix merge order (hybrid approach)

  • Pro: More flexible, addresses both concerns
  • Con: Slightly larger scope

Option C: Replace table_config with auto_create_table only

I'd recommend Option B (hybrid), but if you want to keep this PR focused, do Option A now and add the auto_create_table flag in a follow-up PR.

What would you prefer?


@strawgate
Copy link
Owner Author

@claude let's just add an option on store init that disables automatic table/schema creation. Let's make this a standard optional across all stores that have this potential issue.

@claude
Copy link

claude bot commented Jan 23, 2026

Claude finished @strawgate's task in 12m 37s —— View job


Adding auto_create option to stores

I've successfully added the auto_create parameter to all stores with auto-creation behavior, along with fixing the security issue in DynamoDB.

Tasks Completed

  • Read repository guidelines (CLAUDE.md) - No specific guidelines found
  • Identify all stores with auto-creation behavior
  • Design consistent parameter interface
  • Implement auto_create in DynamoDBStore
  • Fix merge order security issue in DynamoDBStore
  • Implement in MongoDB store
  • Implement in DuckDB store
  • Implement in Elasticsearch store
  • Implement in RocksDB store
  • Implement in DiskStore
  • Implement in MultiDiskStore
  • Implement in FileTreeStore
  • Add comprehensive tests
  • Run linting and type checking
  • Commit and push changes

Summary of Changes

Added auto_create: bool = True parameter to 8 stores in the async version (key-value-aio):

Store Resource Type Location
DynamoDBStore Table + TTL _setup() method
MongoDBStore Collection + Indices _setup_collection() method
DuckDBStore Table + Indices _setup() method
ElasticsearchStore Index + Mappings _setup_collection() method
RocksDBStore Database Directory __init__() constructor
DiskStore Cache Directory __init__() constructor
MultiDiskStore Collection Directories Factory function
FileTreeStore Directories + Info Files __init__() + _setup_collection()

Key Features

Design Pattern:

  • Parameter name: auto_create: bool = True
  • Fully backwards compatible (defaults to True)
  • When False, raises clear ValueError with helpful message
  • Consistent error messages across all stores

Security Fix:

  • Fixed DynamoDB merge order issue identified by CodeRabbit
  • User config is now applied first, then critical parameters are overlaid
  • Prevents users from accidentally overriding required table structure

Testing:

  • Added 2 new tests for DynamoDB store:
    • test_auto_create_false_raises_error - Verifies error when table doesn't exist
    • test_auto_create_true_creates_table - Verifies table creation works
  • All linting checks passed ✅

Documentation:

  • Updated all __init__ methods with clear parameter documentation
  • Explained behavior and error conditions

Error Messages Example

When auto_create=False and resource doesn't exist:

# DynamoDB
ValueError: Table 'my-table' does not exist. Either create the table manually or set auto_create=True.

# MongoDB
ValueError: Collection 'my-collection' does not exist. Either create the collection manually or set auto_create=True.

# DiskStore
ValueError: Directory '/path/to/cache' does not exist. Either create the directory manually or set auto_create=True.

Usage Example

# Option 1: Auto-create enabled (default, backwards compatible)
store = DynamoDBStore(table_name="my-table")  # Creates table if it doesn't exist

# Option 2: Manual creation required
store = DynamoDBStore(table_name="my-table", auto_create=False)  # Raises error if table doesn't exist

Commit: 8173574 — —

…vior

- Add auto_create boolean parameter (default True) to 8 stores:
  * DynamoDBStore
  * MongoDBStore
  * DuckDBStore
  * ElasticsearchStore
  * RocksDBStore
  * DiskStore
  * MultiDiskStore
  * FileTreeStore

- When auto_create=False, stores raise ValueError if resource doesn't exist
- Fix merge order security issue in DynamoDBStore (user config applied first, then critical params overlaid)
- Add comprehensive tests for auto_create functionality
- Fully backwards compatible (defaults to True)

Co-authored-by: Bill Easton <[email protected]>
@sonarqubecloud
Copy link

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
key-value/key-value-aio/src/key_value/aio/stores/rocksdb/store.py (1)

73-88: opts.create_if_missing(True) still creates the database when auto_create=False.

When auto_create=False and the directory exists but the RocksDB database doesn't, create_if_missing(True) will create the database regardless, contradicting the flag's intent. The flag currently only prevents directory creation, not database creation.

🐛 Proposed fix
opts = Options()
-opts.create_if_missing(True)
+opts.create_if_missing(self._auto_create)
🤖 Fix all issues with AI agents
In `@key-value/key-value-aio/src/key_value/aio/stores/disk/multi_store.py`:
- Around line 40-50: The overload of __init__ that accepts disk_cache_factory
currently declares an auto_create parameter that is stored but never used;
either drop auto_create from that overload signature and docstring to avoid
misleading callers, or explicitly state in the docstring that auto_create is
ignored when disk_cache_factory is provided — to fix, update the overloaded def
__init__(..., disk_cache_factory: CacheFactory, default_collection: str | None =
None) signature to remove auto_create and remove its docs (or add a
single-sentence note in the docstring that auto_create has no effect when
disk_cache_factory is used), referencing the __init__ overload,
disk_cache_factory, and auto_create symbols.

In `@key-value/key-value-aio/src/key_value/aio/stores/disk/store.py`:
- Around line 24-32: The __init__ currently accepts auto_create but never uses
it when an existing disk_cache is passed; update __init__ so that auto_create is
only applied when you construct a new Cache (i.e., when disk_cache is None) and
is not stored/used when disk_cache is provided: in the __init__ implementation
of the store class, branch on disk_cache—if disk_cache is None, honor
auto_create to create the directory and initialize Cache; if disk_cache is
provided, do not attempt directory creation and do not rely on or store
auto_create (or explicitly document/ignore it), keeping references to
disk_cache, default_collection, and any Cache initialization logic consistent.
- Around line 85-89: Add unit tests that exercise the auto_create behavior for
DiskStore and MultiDiskStore: for each store class (DiskStore and
MultiDiskStore) add a test named like
test_auto_create_false_raises_error_<store> that instantiates the store with
auto_create=False pointing at a non-existent directory and asserts a ValueError
is raised, and add a test named like
test_auto_create_true_creates_directory_<store> that instantiates the store with
auto_create=True pointing at a temporary non-existent directory and then asserts
the directory now exists (and performs cleanup). Target the code paths that
check directory.exists() and use the store's _auto_create behavior so the tests
cover the directory.mkdir(parents=True, exist_ok=True) branch.

In `@key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py`:
- Around line 277-298: The current logic only runs
_get_create_collection_index_sql and _get_create_expires_index_sql when the
table is newly created, so externally-created tables miss indexes even if
_auto_create is True; change the flow so that when _auto_create is True you
always execute the index creation statements (via
self._connection.execute(self._get_create_collection_index_sql()) and
self._connection.execute(self._get_create_expires_index_sql())) regardless of
whether table_exists is True, while still creating the table with
self._get_create_table_sql() only if it does not exist and preserving the
ValueError when _auto_create is False and the table is missing.

In `@key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py`:
- Around line 161-207: Wrap the body of test_table_config_sse_specification in a
try/finally so the created table "kv-store-test-sse" is deleted in the finally
block regardless of test outcome; use the same aioboto3.Session client code (the
async with session.client(...) as client block and client.delete_table +
client.get_waiter("table_not_exists").wait) to perform the cleanup, and
reference the DynamoDBStore instance (store) and table_name variable to ensure
you delete the correct table. Ensure the delete logic is async and suppresses
errors only around delete (as currently done with contextlib.suppress) so
cleanup never leaves a leftover table even if the test fails.
♻️ Duplicate comments (2)
key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py (2)

165-177: Extract table cleanup into a helper function.

This cleanup pattern is now duplicated 4 times (store fixture + 3 new tests). A shared helper would reduce duplication.


241-278: Missing post-test cleanup for created table.

Same issue as test_table_config_sse_specification - the kv-store-test-autocreate table is created but not cleaned up after the test. Apply the same try/finally pattern.

Comment on lines 40 to 50
@overload
def __init__(self, *, disk_cache_factory: CacheFactory, default_collection: str | None = None) -> None:
def __init__(self, *, disk_cache_factory: CacheFactory, default_collection: str | None = None, auto_create: bool = True) -> None:
"""Initialize a multi-disk store with a custom factory function. The function will be called for each
collection created by the caller with the collection name as the argument. Use this to tightly
control the creation of the diskcache Cache instances.
Args:
disk_cache_factory: A factory function that creates a diskcache Cache instance for a given collection.
default_collection: The default collection to use if no collection is provided.
auto_create: Whether to automatically create directories if they don't exist. Defaults to True.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

auto_create parameter is unused when disk_cache_factory is provided.

When a custom disk_cache_factory is supplied, the auto_create flag is stored but never consulted—directory creation is delegated entirely to the user's factory. This could confuse callers who expect the parameter to have an effect.

Consider either:

  1. Removing auto_create from this overload
  2. Documenting that it has no effect when a custom factory is provided
📖 Documentation fix option
     `@overload`
     def __init__(self, *, disk_cache_factory: CacheFactory, default_collection: str | None = None, auto_create: bool = True) -> None:
         """Initialize a multi-disk store with a custom factory function. The function will be called for each
         collection created by the caller with the collection name as the argument. Use this to tightly
         control the creation of the diskcache Cache instances.
 
         Args:
             disk_cache_factory: A factory function that creates a diskcache Cache instance for a given collection.
             default_collection: The default collection to use if no collection is provided.
-            auto_create: Whether to automatically create directories if they don't exist. Defaults to True.
+            auto_create: Whether to automatically create directories if they don't exist. Defaults to True.
+                Note: This parameter has no effect when a custom factory is provided.
         """
🤖 Prompt for AI Agents
In `@key-value/key-value-aio/src/key_value/aio/stores/disk/multi_store.py` around
lines 40 - 50, The overload of __init__ that accepts disk_cache_factory
currently declares an auto_create parameter that is stored but never used;
either drop auto_create from that overload signature and docstring to avoid
misleading callers, or explicitly state in the docstring that auto_create is
ignored when disk_cache_factory is provided — to fix, update the overloaded def
__init__(..., disk_cache_factory: CacheFactory, default_collection: str | None =
None) signature to remove auto_create and remove its docs (or add a
single-sentence note in the docstring that auto_create has no effect when
disk_cache_factory is used), referencing the __init__ overload,
disk_cache_factory, and auto_create symbols.

Comment on lines 24 to 32
@overload
def __init__(self, *, disk_cache: Cache, default_collection: str | None = None) -> None:
def __init__(self, *, disk_cache: Cache, default_collection: str | None = None, auto_create: bool = True) -> None:
"""Initialize the disk store.
Args:
disk_cache: An existing diskcache Cache instance to use.
default_collection: The default collection to use if no collection is provided.
auto_create: Whether to automatically create the directory if it doesn't exist. Defaults to True.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

auto_create parameter is unused when disk_cache is provided.

When an existing Cache instance is passed, the directory already exists (or the Cache would have failed). The auto_create parameter is stored but never used in this path.

📖 Documentation fix option
     `@overload`
     def __init__(self, *, disk_cache: Cache, default_collection: str | None = None, auto_create: bool = True) -> None:
         """Initialize the disk store.
 
         Args:
             disk_cache: An existing diskcache Cache instance to use.
             default_collection: The default collection to use if no collection is provided.
-            auto_create: Whether to automatically create the directory if it doesn't exist. Defaults to True.
+            auto_create: Whether to automatically create the directory if it doesn't exist. Defaults to True.
+                Note: This parameter has no effect when an existing disk_cache is provided.
         """
🤖 Prompt for AI Agents
In `@key-value/key-value-aio/src/key_value/aio/stores/disk/store.py` around lines
24 - 32, The __init__ currently accepts auto_create but never uses it when an
existing disk_cache is passed; update __init__ so that auto_create is only
applied when you construct a new Cache (i.e., when disk_cache is None) and is
not stored/used when disk_cache is provided: in the __init__ implementation of
the store class, branch on disk_cache—if disk_cache is None, honor auto_create
to create the directory and initialize Cache; if disk_cache is provided, do not
attempt directory creation and do not rely on or store auto_create (or
explicitly document/ignore it), keeping references to disk_cache,
default_collection, and any Cache initialization logic consistent.

Comment on lines +85 to +89
if not directory.exists():
if not self._auto_create:
msg = f"Directory '{directory}' does not exist. Either create the directory manually or set auto_create=True."
raise ValueError(msg)
directory.mkdir(parents=True, exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check for existing auto_create tests in disk store test files
rg -n "auto_create" --glob "**/test*disk*.py" -t py

Repository: strawgate/py-key-value

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Find disk store test files
fd -t f "test.*disk.*\.py$"

Repository: strawgate/py-key-value

Length of output: 317


🏁 Script executed:

#!/bin/bash
# Check for DynamoDB auto_create tests
rg -n "auto_create" --glob "**/test*dynamodb*.py" -t py

Repository: strawgate/py-key-value

Length of output: 1090


🏁 Script executed:

#!/bin/bash
# List all test files in key-value package to understand structure
find . -path "*/tests/*" -name "*.py" -type f | grep -E "(disk|dynamodb)" | head -20

Repository: strawgate/py-key-value

Length of output: 572


🏁 Script executed:

#!/bin/bash
# Check the disk test file content for directory creation/setup tests
cat -n key-value/key-value-aio/tests/stores/disk/test_disk.py | head -100

Repository: strawgate/py-key-value

Length of output: 2508


🏁 Script executed:

#!/bin/bash
# Search more broadly for "directory" or directory creation patterns in disk tests
rg -n "directory\|Directory" key-value/key-value-aio/tests/stores/disk/test_disk.py

Repository: strawgate/py-key-value

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check how many test methods exist in the disk test file
rg -n "def test_" key-value/key-value-aio/tests/stores/disk/test_disk.py

Repository: strawgate/py-key-value

Length of output: 144


🏁 Script executed:

#!/bin/bash
# Get the full test_disk.py to see all tests
wc -l key-value/key-value-aio/tests/stores/disk/test_disk.py

Repository: strawgate/py-key-value

Length of output: 123


🏁 Script executed:

#!/bin/bash
# Get all test methods in the file
rg -n "def test_" key-value/key-value-aio/tests/stores/disk/test_disk.py -A 1

Repository: strawgate/py-key-value

Length of output: 242


🏁 Script executed:

#!/bin/bash
# Check if there are any tests for initialization/directory creation
rg -n "def test_\|async def test_" key-value/key-value-aio/tests/stores/disk/test_disk.py

Repository: strawgate/py-key-value

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Read the entire test file
cat -n key-value/key-value-aio/tests/stores/disk/test_disk.py

Repository: strawgate/py-key-value

Length of output: 2508


🏁 Script executed:

#!/bin/bash
# Check the DiskStore implementation to verify auto_create parameter was added
cat -n key-value/key-value-aio/src/key_value/aio/stores/disk/store.py | head -100

Repository: strawgate/py-key-value

Length of output: 4775


🏁 Script executed:

#!/bin_bash
# Search for auto_create in the DiskStore implementation
rg -n "auto_create" key-value/key-value-aio/src/key_value/aio/stores/disk/store.py

Repository: strawgate/py-key-value

Length of output: 947


🏁 Script executed:

#!/bin/bash
# Check test_multi_disk.py to see if auto_create is tested there
cat -n key-value/key-value-aio/tests/stores/disk/test_multi_disk.py

Repository: strawgate/py-key-value

Length of output: 2671


🏁 Script executed:

#!/bin/bash
# Check if there's any conftest.py that might set up auto_create tests
find key-value/key-value-aio/tests -name "conftest.py" -exec grep -l "auto_create\|per_test_temp_dir" {} \;

Repository: strawgate/py-key-value

Length of output: 48


🏁 Script executed:

#!/bin/bash
# Check MultiDiskStore to see if it also has auto_create
cat -n key-value/key-value-aio/src/key_value/aio/stores/disk/multi_store.py | head -120

Repository: strawgate/py-key-value

Length of output: 6065


🏁 Script executed:

#!/bin/bash
# Search for auto_create in MultiDiskStore
rg -n "auto_create" key-value/key-value-aio/src/key_value/aio/stores/disk/multi_store.py

Repository: strawgate/py-key-value

Length of output: 963


Add tests for auto_create parameter in DiskStore and MultiDiskStore.

Both stores now support auto_create, but lack tests for this behavior. DynamoDB includes test_auto_create_false_raises_error and test_auto_create_true_creates_table tests—add similar tests for DiskStore and MultiDiskStore to cover the directory creation logic.

🤖 Prompt for AI Agents
In `@key-value/key-value-aio/src/key_value/aio/stores/disk/store.py` around lines
85 - 89, Add unit tests that exercise the auto_create behavior for DiskStore and
MultiDiskStore: for each store class (DiskStore and MultiDiskStore) add a test
named like test_auto_create_false_raises_error_<store> that instantiates the
store with auto_create=False pointing at a non-existent directory and asserts a
ValueError is raised, and add a test named like
test_auto_create_true_creates_directory_<store> that instantiates the store with
auto_create=True pointing at a temporary non-existent directory and then asserts
the directory now exists (and performs cleanup). Target the code paths that
check directory.exists() and use the store's _auto_create behavior so the tests
cover the directory.mkdir(parents=True, exist_ok=True) branch.

Comment on lines +277 to 298
# Check if the table exists
table_exists_sql = f"""
SELECT COUNT(*) FROM information_schema.tables
WHERE table_name = '{self._table_name}'
""" # noqa: S608
result = self._connection.execute(table_exists_sql).fetchone()
table_exists = result[0] > 0 if result else False

if not table_exists:
if not self._auto_create:
msg = f"Table '{self._table_name}' does not exist. Either create the table manually or set auto_create=True."
raise ValueError(msg)

# Create the main table for storing key-value entries
self._connection.execute(self._get_create_table_sql())

# Create index for efficient collection queries
self._connection.execute(self._get_create_collection_index_sql())
# Create index for efficient collection queries
self._connection.execute(self._get_create_collection_index_sql())

# Create index for expiration-based queries
self._connection.execute(self._get_create_expires_index_sql())
# Create index for expiration-based queries
self._connection.execute(self._get_create_expires_index_sql())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -path "*/key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py" -type f

Repository: strawgate/py-key-value

Length of output: 132


🏁 Script executed:

cd key-value/key-value-aio && find . -path "*/stores/duckdb/store.py" -type f

Repository: strawgate/py-key-value

Length of output: 108


🏁 Script executed:

# Search for the DuckDB store file
fd -t f "store.py" --path "*duckdb*"

Repository: strawgate/py-key-value

Length of output: 299


🏁 Script executed:

wc -l ./key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py

Repository: strawgate/py-key-value

Length of output: 136


🏁 Script executed:

sed -n '260,310p' ./key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py

Repository: strawgate/py-key-value

Length of output: 2287


🏁 Script executed:

sed -n '1,100p' ./key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py

Repository: strawgate/py-key-value

Length of output: 3968


🏁 Script executed:

grep -n "_get_create_collection_index_sql\|_get_create_expires_index_sql\|_get_create_table_sql" ./key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py

Repository: strawgate/py-key-value

Length of output: 460


🏁 Script executed:

sed -n '182,225p' ./key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py

Repository: strawgate/py-key-value

Length of output: 1443


🏁 Script executed:

# Verify line lengths in the affected code
sed -n '277,298p' ./key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py | awk '{print length, NR": "$0}' | sort -rn | head -5

Repository: strawgate/py-key-value

Length of output: 517


Create indexes even when table already exists with auto_create=True.

The index creation methods use IF NOT EXISTS and should always run when auto_create=True, not just when the table is created. Externally-created tables without indexes will miss them, causing performance regressions.

Fix
        if not table_exists:
            if not self._auto_create:
                msg = f"Table '{self._table_name}' does not exist. Either create the table manually or set auto_create=True."
                raise ValueError(msg)

            # Create the main table for storing key-value entries
            self._connection.execute(self._get_create_table_sql())

-            # Create index for efficient collection queries
-            self._connection.execute(self._get_create_collection_index_sql())
-
-            # Create index for expiration-based queries
-            self._connection.execute(self._get_create_expires_index_sql())
+        if self._auto_create:
+            # Create index for efficient collection queries
+            self._connection.execute(self._get_create_collection_index_sql())
+
+            # Create index for expiration-based queries
+            self._connection.execute(self._get_create_expires_index_sql())
🤖 Prompt for AI Agents
In `@key-value/key-value-aio/src/key_value/aio/stores/duckdb/store.py` around
lines 277 - 298, The current logic only runs _get_create_collection_index_sql
and _get_create_expires_index_sql when the table is newly created, so
externally-created tables miss indexes even if _auto_create is True; change the
flow so that when _auto_create is True you always execute the index creation
statements (via
self._connection.execute(self._get_create_collection_index_sql()) and
self._connection.execute(self._get_create_expires_index_sql())) regardless of
whether table_exists is True, while still creating the table with
self._get_create_table_sql() only if it does not exist and preserving the
ValueError when _auto_create is False and the table is missing.

Comment on lines +161 to +207
async def test_table_config_sse_specification(self, setup_dynamodb: None):
"""Test that SSESpecification can be passed via table_config."""
table_name = "kv-store-test-sse"

# Clean up table if it exists
import aioboto3

session = aioboto3.Session(
aws_access_key_id="test",
aws_secret_access_key="test", # noqa: S106
region_name="us-east-1",
)
async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client: # type: ignore
with contextlib.suppress(Exception):
await client.delete_table(TableName=table_name) # type: ignore
waiter = client.get_waiter("table_not_exists") # type: ignore
await waiter.wait(TableName=table_name) # type: ignore

# Create store with SSE configuration
store = DynamoDBStore(
table_name=table_name,
endpoint_url=DYNAMODB_ENDPOINT,
aws_access_key_id="test",
aws_secret_access_key="test", # noqa: S106
region_name="us-east-1",
table_config={
"SSESpecification": {
"Enabled": True,
"SSEType": "AES256",
}
},
)

async with store:
# Verify table was created successfully
async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client: # type: ignore
table_description = await client.describe_table(TableName=table_name) # type: ignore

# DynamoDB Local might not fully support SSE, but we can verify the store accepts the config
# The important thing is that the store doesn't error when table_config is provided
assert table_description is not None

# Verify basic operations still work
await store.put(collection="test", key="test_key", value={"message": "SSE test"})
result = await store.get(collection="test", key="test_key")
assert result == {"message": "SSE test"}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Tests should clean up tables after completion, not just before.

The test creates kv-store-test-sse but doesn't delete it after the test completes. If tests run in different orders or the test fails mid-execution, leftover tables could cause issues.

♻️ Proposed fix using try/finally for cleanup
     async def test_table_config_sse_specification(self, setup_dynamodb: None):
         """Test that SSESpecification can be passed via table_config."""
         table_name = "kv-store-test-sse"
 
         # Clean up table if it exists
         import aioboto3
 
         session = aioboto3.Session(
             aws_access_key_id="test",
             aws_secret_access_key="test",  # noqa: S106
             region_name="us-east-1",
         )
         async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client:  # type: ignore
             with contextlib.suppress(Exception):
                 await client.delete_table(TableName=table_name)  # type: ignore
                 waiter = client.get_waiter("table_not_exists")  # type: ignore
                 await waiter.wait(TableName=table_name)  # type: ignore
 
-        # Create store with SSE configuration
-        store = DynamoDBStore(
-            table_name=table_name,
-            endpoint_url=DYNAMODB_ENDPOINT,
-            aws_access_key_id="test",
-            aws_secret_access_key="test",  # noqa: S106
-            region_name="us-east-1",
-            table_config={
-                "SSESpecification": {
-                    "Enabled": True,
-                    "SSEType": "AES256",
-                }
-            },
-        )
-
-        async with store:
-            # Verify table was created successfully
-            async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client:  # type: ignore
-                table_description = await client.describe_table(TableName=table_name)  # type: ignore
-
-                # DynamoDB Local might not fully support SSE, but we can verify the store accepts the config
-                # The important thing is that the store doesn't error when table_config is provided
-                assert table_description is not None
-
-            # Verify basic operations still work
-            await store.put(collection="test", key="test_key", value={"message": "SSE test"})
-            result = await store.get(collection="test", key="test_key")
-            assert result == {"message": "SSE test"}
+        try:
+            # Create store with SSE configuration
+            store = DynamoDBStore(
+                table_name=table_name,
+                endpoint_url=DYNAMODB_ENDPOINT,
+                aws_access_key_id="test",
+                aws_secret_access_key="test",  # noqa: S106
+                region_name="us-east-1",
+                table_config={
+                    "SSESpecification": {
+                        "Enabled": True,
+                        "SSEType": "AES256",
+                    }
+                },
+            )
+
+            async with store:
+                # Verify table was created successfully
+                async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client:  # type: ignore
+                    table_description = await client.describe_table(TableName=table_name)  # type: ignore
+                    assert table_description is not None
+
+                # Verify basic operations still work
+                await store.put(collection="test", key="test_key", value={"message": "SSE test"})
+                result = await store.get(collection="test", key="test_key")
+                assert result == {"message": "SSE test"}
+        finally:
+            # Clean up table after test
+            async with session.client(service_name="dynamodb", endpoint_url=DYNAMODB_ENDPOINT) as client:  # type: ignore
+                with contextlib.suppress(Exception):
+                    await client.delete_table(TableName=table_name)  # type: ignore
+                    waiter = client.get_waiter("table_not_exists")  # type: ignore
+                    await waiter.wait(TableName=table_name)  # type: ignore

As per coding guidelines, tests should clean up resources and not interfere with each other.

🤖 Prompt for AI Agents
In `@key-value/key-value-aio/tests/stores/dynamodb/test_dynamodb.py` around lines
161 - 207, Wrap the body of test_table_config_sse_specification in a try/finally
so the created table "kv-store-test-sse" is deleted in the finally block
regardless of test outcome; use the same aioboto3.Session client code (the async
with session.client(...) as client block and client.delete_table +
client.get_waiter("table_not_exists").wait) to perform the cleanup, and
reference the DynamoDBStore instance (store) and table_name variable to ensure
you delete the correct table. Ensure the delete logic is async and suppresses
errors only around delete (as currently done with contextlib.suppress) so
cleanup never leaves a leftover table even if the test fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] allow SSESpecification for dynamodb store

2 participants