refactor(data): unify Perlin noise helpers and expose synthetic blend factor #3447

Open

geeky33 wants to merge 3 commits into open-edge-platform:main from geeky33:anomaly
Conversation

@geeky33

@geeky33 geeky33 commented Mar 20, 2026

Behavior

Unifies Perlin noise generation (including GLASS-friendly options) and shares threshold rescaling across DSR and SuperSimpleNet. Fixes internal padding when pad_to_power_of_2=False so odd sizes no longer break. Exposes synthetic_blend_factor on image datamodules so synthetic anomaly intensity can be set from config or the API (forwarded into SyntheticAnomalyDataset / make_synthetic_dataset).

Description

This PR consolidates Perlin-related logic in generate_perlin_noise (configurable scale-exponent range, optional power-of-two padding, generate_perlin_noise_glass, apply_perlin_threshold_rescale), reuses that helper in DSR and SuperSimpleNet instead of duplicating code, and threads blend_factor through synthetic dataset creation. Image datamodules gain synthetic_blend_factor (default (0.01, 0.2)), along with example/docs YAML updates and unit tests.
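For illustration, a config snippet in the spirit of the examples/configs/data/folder.yaml update might look like the following (the dataset name and paths are hypothetical; only synthetic_blend_factor is introduced by this PR):

```yaml
data:
  class_path: anomalib.data.Folder
  init_args:
    name: my_dataset
    root: ./datasets/my_dataset
    normal_dir: good
    # New in this PR: a fixed float, or a [low, high] range sampled per image
    synthetic_blend_factor: [0.01, 0.2]
```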

Fixes #3413

Snapshots supporting the PR: [screenshot attached]

Changes

  • New feature (non-breaking): synthetic_blend_factor on datamodules / synthetic pipeline
  • Refactor: shared Perlin utilities and model anomaly generators
  • Bug fix: padding / shape correctness when not using power-of-two padding
  • Tests: test_perlin.py; updates to test_synthetic.py, test_folder.py
  • Documentation: config snippets + examples/configs/data/folder.yaml

Checklist

  • I have made the necessary updates to the documentation (where applicable).
  • I have written tests that support these changes (where applicable).
  • My PR title follows conventional commit format.

Copilot AI review requested due to automatic review settings March 20, 2026 23:06
…ehavior

Signed-off-by: geeky33 <aaryap1204@gmail.com>
@geeky33
Author

geeky33 commented Mar 20, 2026

Hello @ashwinvaidya17 @rajeshgangireddy,
Could you please review the PR?

Thank you,
Aarya.

Contributor

Copilot AI left a comment


Pull request overview

This PR refactors synthetic anomaly generation by centralizing Perlin noise utilities (including GLASS-compatible defaults), reusing shared threshold-rescaling logic across models, and exposing a configurable synthetic_blend_factor on image datamodules so synthetic anomaly intensity can be controlled via config/API.

Changes:

  • Consolidates Perlin helpers (default/GLASS scale ranges, optional power-of-two padding, threshold rescaling) and reuses them in DSR and SuperSimpleNet anomaly generators.
  • Threads blend_factor through make_synthetic_dataset / SyntheticAnomalyDataset and forwards it from image datamodules via synthetic_blend_factor.
  • Adds/updates unit tests and updates example/docs YAML snippets to show synthetic_blend_factor.

Reviewed changes

Copilot reviewed 27 out of 27 changed files in this pull request and generated 3 comments.

File Description
src/anomalib/data/utils/generators/perlin.py Adds shared Perlin utilities, GLASS defaults, and PerlinAnomalyGenerator compatibility knobs.
src/anomalib/data/utils/generators/__init__.py Re-exports new Perlin helpers/constants from generators package.
src/anomalib/data/utils/__init__.py Re-exports new Perlin helpers/constants from data utils top-level API.
src/anomalib/data/utils/synthetic.py Adds blend_factor plumbing into synthetic dataset creation and dataset class.
src/anomalib/data/datamodules/base/image.py Adds synthetic_blend_factor to datamodule and forwards to synthetic split creation.
src/anomalib/data/datamodules/image/{folder,bmad,btech,datumaro,kaputt,kolektor,mpdd,mvtecad,mvtecad2,mvtec_loco,realiad,tabular,vad,visa}.py Threads synthetic_blend_factor through concrete image datamodules into base class.
src/anomalib/models/image/dsr/anomaly_generator.py Replaces duplicated Perlin threshold-rescale logic with shared helper.
src/anomalib/models/image/supersimplenet/anomaly_generator.py Replaces duplicated Perlin threshold-rescale logic with shared helper.
tests/unit/data/utils/test_perlin.py Adds unit tests for shared Perlin utilities and GLASS helper.
tests/unit/data/utils/test_synthetic.py Adds/updates tests to ensure blend factor is exposed and configurable.
tests/unit/data/datamodule/image/test_folder.py Tests that datamodule forwards synthetic_blend_factor into synthetic split creation.
examples/configs/data/folder.yaml Documents new synthetic_blend_factor config field.
docs/source/snippets/.../normal_and_synthetic.yaml Updates docs snippets to include synthetic_blend_factor.


Comment on lines 114 to +125
@@ -117,6 +122,7 @@ def __init__(
self.val_split_mode = ValSplitMode(val_split_mode) if val_split_mode else ValSplitMode.NONE
self.val_split_ratio = val_split_ratio or 0.5
self.seed = seed
self.synthetic_blend_factor = synthetic_blend_factor

Copilot AI Mar 20, 2026


synthetic_blend_factor is documented/configured as a YAML list (e.g. [0.01, 0.2]), which will typically arrive here as a list/OmegaConf ListConfig, not a tuple. Downstream, PerlinAnomalyGenerator only treats tuple as a range, so passing a list can silently ignore the configured range. Consider normalizing synthetic_blend_factor in __init__ (e.g., coerce 2-item list-like values to tuple[float, float]) or broadening downstream checks to accept Sequence.

Comment on lines 127 to +134
  # Ensure dimensions are powers of 2 for proper noise generation
  def nextpow2(value: int) -> int:
      return int(2 ** torch.ceil(torch.log2(torch.tensor(value))).int().item())

- pad_h = nextpow2(height)
- pad_w = nextpow2(width)
+ pad_h_base = nextpow2(height) if pad_to_power_of_2 else height
+ pad_w_base = nextpow2(width) if pad_to_power_of_2 else width
+ pad_h = ((pad_h_base + scalex - 1) // scalex) * scalex
+ pad_w = ((pad_w_base + scaley - 1) // scaley) * scaley

Copilot AI Mar 20, 2026


The comment # Ensure dimensions are powers of 2 for proper noise generation is no longer accurate now that pad_to_power_of_2 can be False and the padding is also rounded to a multiple of the chosen scale. Updating the comment (and/or docstring) would help avoid confusion about what invariants are actually required for the algorithm.

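Taken in isolation, the padding rule under discussion can be sketched without torch (assuming, as in the diff, that `scalex`/`scaley` are the Perlin grid scales; `padded_dims` is an illustrative name, not an actual function in the PR):

```python
import math


def nextpow2(value: int) -> int:
    # Smallest power of two >= value
    return 1 << math.ceil(math.log2(value))


def padded_dims(height, width, scalex, scaley, pad_to_power_of_2=True):
    # Optionally round the base size up to a power of two, then always round
    # up to a multiple of the noise scale so the lattice tiles evenly.
    pad_h_base = nextpow2(height) if pad_to_power_of_2 else height
    pad_w_base = nextpow2(width) if pad_to_power_of_2 else width
    pad_h = ((pad_h_base + scalex - 1) // scalex) * scalex
    pad_w = ((pad_w_base + scaley - 1) // scaley) * scaley
    return pad_h, pad_w
```

With pad_to_power_of_2=False, an odd size such as 101 is padded only to the next multiple of the scale (102 for a scale of 6) instead of jumping to 128, which is the behavior the stale comment no longer describes.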
Comment on lines +62 to +66
denominator = perlin_noise.max() - perlin_noise.min()
if denominator == 0:
return perlin_noise

perlin_noise = (perlin_noise - perlin_noise.min()) / denominator

Copilot AI Mar 20, 2026


apply_perlin_threshold_rescale recomputes perlin_noise.min() multiple times. Since this runs on tensors (often on GPU), it would be cleaner (and avoids extra reductions) to compute min_val/max_val once, reuse them for denominator and normalization, and then rescale.

Suggested change
- denominator = perlin_noise.max() - perlin_noise.min()
- if denominator == 0:
-     return perlin_noise
- perlin_noise = (perlin_noise - perlin_noise.min()) / denominator
+ min_val = perlin_noise.min()
+ max_val = perlin_noise.max()
+ denominator = max_val - min_val
+ if denominator == 0:
+     return perlin_noise
+ perlin_noise = (perlin_noise - min_val) / denominator

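The reviewer's point, restated as a plain-Python sketch over a flat list of values (the real helper operates on torch tensors, but the single min/max computation and the zero-range guard carry over directly):

```python
def threshold_rescale(values):
    # Min-max normalize, computing min/max exactly once; constant inputs are
    # returned unchanged to avoid division by zero.
    min_val = min(values)
    max_val = max(values)
    denominator = max_val - min_val
    if denominator == 0:
        return list(values)
    return [(v - min_val) / denominator for v in values]
```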
@rajeshgangireddy
Contributor

Hi @geeky33,
Thanks for the changes.
Please note that, due to the current volume of open PRs, it may take some time before we can start reviewing this one.

@geeky33
Author

geeky33 commented Mar 24, 2026

Hi @rajeshgangireddy,
No problem, thank you.

Copilot AI review requested due to automatic review settings March 24, 2026 12:22
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 2 comments.



Comment on lines 259 to +266
anomaly_source_path: Path | str | None = None,
probability: float = 0.5,
blend_factor: float | tuple[float, float] = (0.2, 1.0),
rotation_range: tuple[float, float] = (-90, 90),
*,
perlin_pad_to_power_of_2: bool = True,
perlin_scale_exponent_range: tuple[int, int] = DEFAULT_PERLIN_SCALE_EXPONENT_RANGE,
perlin_rescale_below_threshold: bool = True,

Copilot AI Mar 24, 2026


PerlinAnomalyGenerator only samples a range when blend_factor is a tuple; if a config/API passes a 2-item list (common from YAML/omegaconf), the later beta selection falls through to the fallback constant (0.5), effectively ignoring the configured range. Consider normalizing blend_factor to a tuple in __init__ (e.g., accept Sequence[float] length-2) or updating the type guards so both list and tuple ranges are handled.

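A broadened type guard along the lines this comment proposes might look like the following sketch (the function name is hypothetical, and the actual generator draws its blend value with torch rather than `random`):

```python
import random
from collections.abc import Sequence


def sample_blend_factor(blend_factor):
    # Fixed scalar: use as-is. Any 2-item non-string sequence (tuple, list,
    # or an OmegaConf ListConfig): treat as a (low, high) sampling range.
    # Anything else falls back to the constant 0.5.
    if isinstance(blend_factor, (int, float)):
        return float(blend_factor)
    if (
        isinstance(blend_factor, Sequence)
        and not isinstance(blend_factor, str)
        and len(blend_factor) == 2
    ):
        low, high = (float(v) for v in blend_factor)
        return random.uniform(low, high)
    return 0.5
```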
Comment on lines +179 to +185
def __init__(
    self,
    augmentations: Transform | None,
    source_samples: DataFrame,
    dataset_name: str,
    blend_factor: float | tuple[float, float] = (0.01, 0.2),
) -> None:

Copilot AI Mar 24, 2026


SyntheticAnomalyDataset.__init__ now accepts blend_factor, but the class docstring/Args section above still documents only augmentations, source_samples, and dataset_name. Please update the docstring to describe blend_factor (fixed float vs sampled range) so public API docs stay accurate.
