Add regression test for Sequential sublayer in custom train_step#22273
Add regression test for Sequential sublayer in custom train_step#22273DeborahOlaboye wants to merge 3 commits intokeras-team:masterfrom
train_step#22273Conversation
Summary of ChangesHello @DeborahOlaboye, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a regression test to address a previously identified bug where calling a Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Code Review
The pull request introduces a regression test for a specific issue where a Sequential sublayer within a custom train_step would incorrectly receive KerasTensors, leading to a ValueError. The changes include a new model class SequentialSublayerInTrainStepModel and a corresponding test method test_sequential_sublayer_in_custom_train_step to ensure this bug does not reappear. The implementation looks correct and directly addresses the reported issue by verifying that the model.fit() call completes without errors. The JAX backend is appropriately skipped due to its different train_step signature, which is a good consideration for backend compatibility.
| def train_step(self, data): | ||
| x, y = data[0], data[1] | ||
| # Before the fix for #18459, `x` was a KerasTensor here, | ||
| # causing an error when passed to the Sequential sublayer. | ||
| x = self.sublayer(x, training=True) | ||
| return super().train_step((x, y)) |
There was a problem hiding this comment.
The comment Before the fix for #18459, x was a KerasTensor here, causing an error when passed to the Sequential sublayer. is helpful for understanding the context of the fix. However, since the fix is already applied, this comment could be rephrased to describe the current behavior or removed if the code itself is self-explanatory. The goal of comments should be to explain why something is done, not just what was fixed, especially for regression tests. This aligns with the principle of minimizing cognitive load (Repository Style Guide, line 48) by keeping documentation concise and relevant to the current state.
| def train_step(self, data): | |
| x, y = data[0], data[1] | |
| # Before the fix for #18459, `x` was a KerasTensor here, | |
| # causing an error when passed to the Sequential sublayer. | |
| x = self.sublayer(x, training=True) | |
| return super().train_step((x, y)) | |
| def train_step(self, data): | |
| x, y = data[0], data[1] | |
| x = self.sublayer(x, training=True) | |
| return super().train_step((x, y)) |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #22273 +/- ##
===========================================
+ Coverage 71.44% 83.04% +11.59%
===========================================
Files 594 596 +2
Lines 65029 66800 +1771
Branches 10174 10403 +229
===========================================
+ Hits 46461 55475 +9014
+ Misses 16105 8683 -7422
- Partials 2463 2642 +179
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Adds a regression test to verify that calling a Sequential sublayer from a custom train_step during model.fit() does not incorrectly pass KerasTensors to the sublayer, which previously caused: "A KerasTensor cannot be used as input to a TensorFlow function." The test uses a model that holds a Sequential sublayer (Rescaling) that is only called inside train_step (not in call()), mirroring the pattern reported in the issue.
381373b to
316d355
Compare
hertschuh
left a comment
There was a problem hiding this comment.
I'm confused about this, it doesn't include a bug fix.
To be clear, I don't think the bug is just the result of using Sequential in a sublayer with a custom train_step. The root cause of the bug still needs to be determined.
@hertschuh, thanks for the review. You're right, the previous commit only added a regression test without a bug fix. I've now identified the root cause and added a fix. |
Root cause: when a Sequential sublayer is only used inside a custom `train_step` (not in `call()`), it is not built during `_symbolic_build`, which only traces `call()`. For the TF backend without a distribute strategy, `_maybe_symbolic_build()` previously returned early without calling `_symbolic_build` at all. This caused the Sequential sublayer to be built lazily during the first `tf.function` trace of `train_step`. `Sequential.build()` creates KerasTensors and calls `compute_output_spec`, which for TF creates a nested `FuncGraph` inside the already-active `tf.function` context. When TF then tries to use a KerasTensor as a real tensor, it hits `KerasTensor.__tf_tensor__` which raises: "A KerasTensor cannot be used as input to a TensorFlow function." Fix (two parts): 1. `_symbolic_build` (trainer.py): after tracing `call()`, iterate over all sublayers and pre-build any that are still unbuilt by running `compute_output_spec` on the symbolic input `x`. 2. TF `_maybe_symbolic_build` (tensorflow/trainer.py): instead of always deferring when no distribute strategy is set, only defer when all layers are already built. If unbuilt sublayers exist, call `_symbolic_build` so they are built before `tf.function` traces `train_step`. Also updates the regression test docstrings to accurately describe the root cause (keras-teamgh-18459).
62c9532 to
59cf91a
Compare
Description
Calling a Sequential model from a custom
train_stepduringmodel.fit()previously caused:ValueError: A KerasTensor cannot be used as input to a TensorFlow function.
This happened because
_symbolic_buildwas tracingtrain_stepwithKerasTensorinputs.When
train_steppassed those tensors to a Sequential sublayer (e.g., a data augmentationpipeline), TF's internal conversion (
__tf_tensor__) raised the error.Changes
Adds a regression test to
trainer_test.pyto prevent this bug from silently returningin future refactors:
SequentialSublayerInTrainStepModel— a model that holds aSequentialsublayer(
Rescaling) called only insidetrain_step, not incall(), mirroring theexact pattern from the issue.
test_sequential_sublayer_in_custom_train_step— runsmodel.fit()and assertsno KerasTensor-related error is raised.
Testing
KERAS_BACKEND=torch pytest keras/src/trainers/trainer_test.py \ -k test_sequential_sublayer_in_custom_train_step -xvs Closes #18459