refactor(data): unify Perlin noise helpers and expose synthetic blend factor.#3447
geeky33 wants to merge 3 commits into open-edge-platform:main
…ehavior Signed-off-by: geeky33 <aaryap1204@gmail.com>
Hello, @ashwinvaidya17 @rajeshgangireddy Thank you
Pull request overview
This PR refactors synthetic anomaly generation by centralizing Perlin noise utilities (including GLASS-compatible defaults), reusing shared threshold-rescaling logic across models, and exposing a configurable synthetic_blend_factor on image datamodules so synthetic anomaly intensity can be controlled via config/API.
Changes:
- Consolidates Perlin helpers (default/GLASS scale ranges, optional power-of-two padding, threshold rescaling) and reuses them in DSR and SuperSimpleNet anomaly generators.
- Threads `blend_factor` through `make_synthetic_dataset` / `SyntheticAnomalyDataset` and forwards it from image datamodules via `synthetic_blend_factor`.
- Adds/updates unit tests and updates example/docs YAML snippets to show `synthetic_blend_factor`.
Reviewed changes
Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/anomalib/data/utils/generators/perlin.py` | Adds shared Perlin utilities, GLASS defaults, and `PerlinAnomalyGenerator` compatibility knobs. |
| `src/anomalib/data/utils/generators/__init__.py` | Re-exports new Perlin helpers/constants from generators package. |
| `src/anomalib/data/utils/__init__.py` | Re-exports new Perlin helpers/constants from data utils top-level API. |
| `src/anomalib/data/utils/synthetic.py` | Adds `blend_factor` plumbing into synthetic dataset creation and dataset class. |
| `src/anomalib/data/datamodules/base/image.py` | Adds `synthetic_blend_factor` to datamodule and forwards to synthetic split creation. |
| `src/anomalib/data/datamodules/image/{folder,bmad,btech,datumaro,kaputt,kolektor,mpdd,mvtecad,mvtecad2,mvtec_loco,realiad,tabular,vad,visa}.py` | Threads `synthetic_blend_factor` through concrete image datamodules into base class. |
| `src/anomalib/models/image/dsr/anomaly_generator.py` | Replaces duplicated Perlin threshold-rescale logic with shared helper. |
| `src/anomalib/models/image/supersimplenet/anomaly_generator.py` | Replaces duplicated Perlin threshold-rescale logic with shared helper. |
| `tests/unit/data/utils/test_perlin.py` | Adds unit tests for shared Perlin utilities and GLASS helper. |
| `tests/unit/data/utils/test_synthetic.py` | Adds/updates tests to ensure blend factor is exposed and configurable. |
| `tests/unit/data/datamodule/image/test_folder.py` | Tests that datamodule forwards `synthetic_blend_factor` into synthetic split creation. |
| `examples/configs/data/folder.yaml` | Documents new `synthetic_blend_factor` config field. |
| `docs/source/snippets/.../normal_and_synthetic.yaml` | Updates docs snippets to include `synthetic_blend_factor`. |
    @@ -117,6 +122,7 @@ def __init__(
      self.val_split_mode = ValSplitMode(val_split_mode) if val_split_mode else ValSplitMode.NONE
      self.val_split_ratio = val_split_ratio or 0.5
      self.seed = seed
    + self.synthetic_blend_factor = synthetic_blend_factor
synthetic_blend_factor is documented/configured as a YAML list (e.g. [0.01, 0.2]), which will typically arrive here as a list/OmegaConf ListConfig, not a tuple. Downstream, PerlinAnomalyGenerator only treats tuple as a range, so passing a list can silently ignore the configured range. Consider normalizing synthetic_blend_factor in __init__ (e.g., coerce 2-item list-like values to tuple[float, float]) or broadening downstream checks to accept Sequence.
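The normalization suggested above could look roughly like the following. This is a minimal sketch, not code from the PR: `normalize_blend_factor` is a hypothetical helper name, and it assumes the configured value is either a plain number or any iterable of exactly two numbers (a YAML list, a tuple, or a list-like config container).

```python
from __future__ import annotations

from collections.abc import Iterable


def normalize_blend_factor(value: float | Iterable[float]) -> float | tuple[float, float]:
    """Hypothetical helper: coerce a 2-item list-like value to a (low, high) tuple."""
    # A plain number is treated as a fixed blend factor and passed through.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return float(value)

    # Anything else is expected to be iterable; materialize it as floats.
    items = tuple(float(v) for v in value)
    if len(items) != 2:
        msg = f"blend factor range must have exactly 2 items, got {len(items)}"
        raise ValueError(msg)

    low, high = items
    if low > high:
        msg = f"blend factor range must satisfy low <= high, got ({low}, {high})"
        raise ValueError(msg)
    return (low, high)
```

Calling such a helper once in the datamodule's `__init__` would mean downstream code only ever sees a `float` or a real `tuple`, so a tuple-only range check would keep working.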
      # Ensure dimensions are powers of 2 for proper noise generation
      def nextpow2(value: int) -> int:
          return int(2 ** torch.ceil(torch.log2(torch.tensor(value))).int().item())

    - pad_h = nextpow2(height)
    - pad_w = nextpow2(width)
    + pad_h_base = nextpow2(height) if pad_to_power_of_2 else height
    + pad_w_base = nextpow2(width) if pad_to_power_of_2 else width
    + pad_h = ((pad_h_base + scalex - 1) // scalex) * scalex
    + pad_w = ((pad_w_base + scaley - 1) // scaley) * scaley
The comment # Ensure dimensions are powers of 2 for proper noise generation is no longer accurate now that pad_to_power_of_2 can be False and the padding is also rounded to a multiple of the chosen scale. Updating the comment (and/or docstring) would help avoid confusion about what invariants are actually required for the algorithm.
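To make the actual invariant concrete, the padding arithmetic in the diff can be reproduced in isolation. The sketch below is illustrative only: the function names are hypothetical, and it uses integer bit operations instead of the `torch` calls in the PR. It shows that the padded size is always a multiple of the scale, and a power of two only when `pad_to_power_of_2` is enabled and the scale divides it.

```python
def next_pow2(value: int) -> int:
    """Smallest power of two that is >= value (value must be positive)."""
    return 1 << (value - 1).bit_length()


def round_up_to_multiple(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple of `multiple`."""
    return ((value + multiple - 1) // multiple) * multiple


def padded_size(size: int, scale: int, pad_to_power_of_2: bool = True) -> int:
    """Mirror of the diff's logic: optional power-of-two base, then scale multiple."""
    base = next_pow2(size) if pad_to_power_of_2 else size
    return round_up_to_multiple(base, scale)


if __name__ == "__main__":
    # With power-of-two padding, 100 is padded to 128.
    print(padded_size(100, scale=8, pad_to_power_of_2=True))
    # Without it, 100 is only rounded up to the next multiple of 8, i.e. 104.
    print(padded_size(100, scale=8, pad_to_power_of_2=False))
```

A comment along the lines of "pad to a multiple of the noise scale, optionally starting from the next power of two" would describe this behavior more accurately than the existing one.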
    denominator = perlin_noise.max() - perlin_noise.min()
    if denominator == 0:
        return perlin_noise

    perlin_noise = (perlin_noise - perlin_noise.min()) / denominator
apply_perlin_threshold_rescale recomputes perlin_noise.min() multiple times. Since this runs on tensors (often on GPU), it would be cleaner (and avoids extra reductions) to compute min_val/max_val once, reuse them for denominator and normalization, and then rescale.
    - denominator = perlin_noise.max() - perlin_noise.min()
    - if denominator == 0:
    -     return perlin_noise
    - perlin_noise = (perlin_noise - perlin_noise.min()) / denominator
    + min_val = perlin_noise.min()
    + max_val = perlin_noise.max()
    + denominator = max_val - min_val
    + if denominator == 0:
    +     return perlin_noise
    + perlin_noise = (perlin_noise - min_val) / denominator
Hi @geeky33
Hi @rajeshgangireddy
Pull request overview
Copilot reviewed 27 out of 27 changed files in this pull request and generated 2 comments.
    anomaly_source_path: Path | str | None = None,
    probability: float = 0.5,
    blend_factor: float | tuple[float, float] = (0.2, 1.0),
    rotation_range: tuple[float, float] = (-90, 90),
    *,
    perlin_pad_to_power_of_2: bool = True,
    perlin_scale_exponent_range: tuple[int, int] = DEFAULT_PERLIN_SCALE_EXPONENT_RANGE,
    perlin_rescale_below_threshold: bool = True,
PerlinAnomalyGenerator only samples a range when blend_factor is a tuple; if a config/API passes a 2-item list (common from YAML/omegaconf), the later beta selection falls through to the fallback constant (0.5), effectively ignoring the configured range. Consider normalizing blend_factor to a tuple in __init__ (e.g., accept Sequence[float] length-2) or updating the type guards so both list and tuple ranges are handled.
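The fall-through described here, and the alternative of broadening the type guard, can be illustrated with a standalone sketch. None of this is the generator's actual code: both functions are hypothetical, written only to mimic the behavior the comment describes (tuple-only range check, `0.5` fallback) next to a version that accepts any two-item sequence.

```python
import random
from collections.abc import Sequence

FALLBACK_BETA = 0.5  # the fallback constant mentioned in the review comment


def pick_beta_tuple_only(blend_factor):
    """Hypothetical: mimics a guard that treats only `tuple` as a range."""
    if isinstance(blend_factor, float):
        return blend_factor
    if isinstance(blend_factor, tuple):
        return random.uniform(*blend_factor)
    # A list such as [0.01, 0.2] lands here and the range is silently ignored.
    return FALLBACK_BETA


def pick_beta_sequence(blend_factor):
    """Hypothetical: accepts a number or any 2-item sequence, else raises."""
    if isinstance(blend_factor, (int, float)) and not isinstance(blend_factor, bool):
        return float(blend_factor)
    is_range = (
        isinstance(blend_factor, Sequence)
        and not isinstance(blend_factor, (str, bytes))
        and len(blend_factor) == 2
    )
    if is_range:
        low, high = blend_factor
        return random.uniform(float(low), float(high))
    msg = f"unsupported blend_factor: {blend_factor!r}"
    raise TypeError(msg)
```

Raising on unsupported input instead of falling back makes a misconfigured range visible immediately. Whether OmegaConf containers pass the `Sequence` check is not verified here, so coercing to a tuple at the datamodule boundary may still be the safer option.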
    def __init__(
        self,
        augmentations: Transform | None,
        source_samples: DataFrame,
        dataset_name: str,
        blend_factor: float | tuple[float, float] = (0.01, 0.2),
    ) -> None:
SyntheticAnomalyDataset.__init__ now accepts blend_factor, but the class docstring/Args section above still documents only augmentations, source_samples, and dataset_name. Please update the docstring to describe blend_factor (fixed float vs sampled range) so public API docs stay accurate.
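One possible wording for the missing `Args` entry is sketched below on a stand-in class. The class name is hypothetical and the descriptions of the three existing parameters are placeholders, not the project's actual docstring text; only the `blend_factor` entry (fixed float vs sampled range, default `(0.01, 0.2)`) follows from the signature and the comment above.

```python
class SyntheticAnomalyDatasetDocSketch:
    """Stand-in class used only to illustrate a docstring layout.

    Args:
        augmentations: Transforms applied to the samples, or ``None``.
        source_samples: DataFrame of source samples for the synthetic split.
        dataset_name: Name of the dataset.
        blend_factor: Blend strength of the synthetic anomaly. A single
            ``float`` is used as a fixed value; a ``(min, max)`` tuple
            defines a range from which the value is sampled.
            Defaults to ``(0.01, 0.2)``.
    """

    def __init__(self, augmentations, source_samples, dataset_name, blend_factor=(0.01, 0.2)):
        # Store the arguments only; the real class does considerably more.
        self.augmentations = augmentations
        self.source_samples = source_samples
        self.dataset_name = dataset_name
        self.blend_factor = blend_factor
```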
Behavior
Unifies Perlin noise generation (including GLASS-friendly options), shares threshold rescaling across DSR and SuperSimpleNet, fixes internal padding when `pad_to_power_of_2=False` so odd sizes don’t break, and exposes `synthetic_blend_factor` on image datamodules so synthetic anomaly intensity can be set from config/API (forwarded into `SyntheticAnomalyDataset` / `make_synthetic_dataset`).
Description
This PR consolidates Perlin-related logic in `generate_perlin_noise` (configurable scale exponent range, optional power-of-two padding, `generate_perlin_noise_glass`, `apply_perlin_threshold_rescale`), reuses that helper in DSR and SuperSimpleNet instead of duplicated code, and threads `blend_factor` through synthetic dataset creation. Image datamodules gain `synthetic_blend_factor` (default `(0.01, 0.2)`), with example/docs YAML updates and unit tests.
Fixes #(issue) #3413
Snapshots supporting the PR:

Changes
- `synthetic_blend_factor` on datamodules / synthetic pipeline
- `test_perlin.py`; updates to `test_synthetic.py`, `test_folder.py`
- `examples/configs/data/folder.yaml`
Checklist