Skip to content

Add get_fields() cache#593

Open
anze3db wants to merge 3 commits intomodel-bakers:mainfrom
anze3db:add_get_fields_cache
Open

Add get_fields() cache#593
anze3db wants to merge 3 commits intomodel-bakers:mainfrom
anze3db:add_get_fields_cache

Conversation

@anze3db
Copy link
Copy Markdown

@anze3db anze3db commented Mar 30, 2026

Describe the change
This change caches get_fields() call that can be expensive on larger models. This change almost halved our test execution time:

Before:

16082 passed, 1099 subtests passed in 534.87s (0:08:54)

After:

16082 passed, 1099 subtests passed in 280.93s (0:04:40)

PR Checklist

  • Change is covered with tests
  • CHANGELOG.md is updated if needed

Summary by CodeRabbit

  • Changed

    • Field lookup results are now cached per model, improving performance and ensuring consistent repeated lookups while still excluding reverse relations.
  • Tests

    • Added test coverage verifying per-model caching, correct exclusion of reverse relations, and stable results across repeated calls.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 57716a1e-904b-4b13-b948-285480789707

📥 Commits

Reviewing files that changed from the base of the PR and between d443b0d and b368d1d.

📒 Files selected for processing (1)
  • CHANGELOG.md
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

📝 Walkthrough

Walkthrough

The changes implement field-level caching in Baker.get_fields() to avoid recomputing model fields on repeated calls. The implementation stores results in a class-level cache keyed by model class and filters related objects using id()-based comparison rather than set difference. Tests verify caching behavior and per-model cache isolation.

Changes

Cohort / File(s) Summary
Documentation
CHANGELOG.md
Updated changelog to document that Baker.get_fields() results are now cached per model.
Caching Implementation
model_bakery/baker.py
Added Baker._fields_cache: dict[type[Model], set[Any]] and modified get_fields() to compute and cache field sets per model, excluding related objects via id()-based filtering.
Cache Verification Tests
tests/test_baker.py
Added TestGetFieldsCaching tests verifying reverse-relation exclusion, returned object is cached for repeated calls, and cache isolation across different models.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hopped through fields with eager paws,
I cached their homes without a pause.
Now Baker bakes with speed and grace,
Each model keeps its tidy place.
Thump! A faster, hoppier pace. 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: caching is added to the get_fields() method to improve performance.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/test_baker.py (1)

1550-1575: Isolate _fields_cache state per test.

These tests mutate baker.Baker._fields_cache directly and only partially clean it. Use a fixture to snapshot/restore the full cache to avoid order-dependent behavior.

🧪 Suggested test isolation pattern
 class TestGetFieldsCaching:
     """Tests for Baker.get_fields() caching and id()-based filtering."""
+
+    `@pytest.fixture`(autouse=True)
+    def restore_fields_cache(self):
+        original = baker.Baker._fields_cache.copy()
+        try:
+            yield
+        finally:
+            baker.Baker._fields_cache = original
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_baker.py` around lines 1550 - 1575, The tests mutate
baker.Baker._fields_cache and only partially clean it, causing order-dependent
failures; fix by isolating _fields_cache for these tests: snapshot the original
baker.Baker._fields_cache at test start (or use a fixture) and restore it after
each test (or set baker.Baker._fields_cache = {} before calling Baker), updating
the tests like test_get_fields_result_is_cached,
test_get_fields_different_models_cached_separately, and
test_get_fields_includes_all_non_reverse_fields to perform the snapshot/restore
(or use a pytest fixture that yields and restores baker.Baker._fields_cache) so
each test runs with a known, clean cache state.
model_bakery/baker.py (1)

351-352: Use an immutable cache value to protect shared field cache integrity.

get_fields() currently returns a shared mutable set. A single accidental mutation can affect all future calls for that model. Prefer caching a frozenset.

🛡️ Proposed change
-    _fields_cache: dict[type[Model], set[Any]] = {}
+    _fields_cache: dict[type[Model], frozenset[Any]] = {}

-    def get_fields(self) -> set[Any]:
+    def get_fields(self) -> frozenset[Any]:
         model = self.model
         if model not in self._fields_cache:
             related_ids = {id(f) for f in model._meta.related_objects}
-            self._fields_cache[model] = {
+            self._fields_cache[model] = frozenset(
                 f for f in model._meta.get_fields() if id(f) not in related_ids
-            }
+            )
         return self._fields_cache[model]

Also applies to: 438-445

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_bakery/baker.py` around lines 351 - 352, The shared _fields_cache
currently stores and returns mutable set objects; change the cache to store
immutable frozenset values and update get_fields (and any other places that
currently insert or return sets, e.g., the logic referenced around get_fields at
lines ~438-445) so it converts computed field sets to frozenset before caching
and always returns a frozenset to callers; make sure any code that previously
mutated the returned set either copies to a new set before mutation or is
updated to work with the immutable frozenset, referencing _fields_cache and
get_fields to locate the changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CHANGELOG.md`:
- Line 13: The changelog contains a typo where the function call is written as
get_fields()() in the sentence about caching results from Baker.get_fields();
update the text to use a single call, Baker.get_fields(), so the line reads
"Cache Baker.get_fields() results per model..." (ensure the symbol
Baker.get_fields() is used exactly once).

---

Nitpick comments:
In `@model_bakery/baker.py`:
- Around line 351-352: The shared _fields_cache currently stores and returns
mutable set objects; change the cache to store immutable frozenset values and
update get_fields (and any other places that currently insert or return sets,
e.g., the logic referenced around get_fields at lines ~438-445) so it converts
computed field sets to frozenset before caching and always returns a frozenset
to callers; make sure any code that previously mutated the returned set either
copies to a new set before mutation or is updated to work with the immutable
frozenset, referencing _fields_cache and get_fields to locate the changes.

In `@tests/test_baker.py`:
- Around line 1550-1575: The tests mutate baker.Baker._fields_cache and only
partially clean it, causing order-dependent failures; fix by isolating
_fields_cache for these tests: snapshot the original baker.Baker._fields_cache
at test start (or use a fixture) and restore it after each test (or set
baker.Baker._fields_cache = {} before calling Baker), updating the tests like
test_get_fields_result_is_cached,
test_get_fields_different_models_cached_separately, and
test_get_fields_includes_all_non_reverse_fields to perform the snapshot/restore
(or use a pytest fixture that yields and restores baker.Baker._fields_cache) so
each test runs with a known, clean cache state.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e85fd69f-794e-4bea-a68a-fad357e16576

📥 Commits

Reviewing files that changed from the base of the PR and between b094155 and d443b0d.

📒 Files selected for processing (3)
  • CHANGELOG.md
  • model_bakery/baker.py
  • tests/test_baker.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant