Potential methodological concern: per-ROI discretization and inconsistency in square image implementation #950

@Danica-Chen

Description

Background
While reviewing the preprocessing and discretization pipeline in PyRadiomics (particularly imageoperations.py), I identified two potential issues that may have methodological implications for radiomic feature computation and interpretation.

  1. Inconsistency between documented intention and actual implementation of the square filter
    The documentation of getSquareImage states: “negative intensities are made negative in resultant filtered image”.
    This description suggests that the transformation is intended to preserve the sign of the original intensities, while applying a nonlinear (quadratic) scaling to their magnitude. In other words, the intended behavior appears to be a sign-preserving transformation, conceptually similar to:
    f(x) = sign(x) · (c·|x|)^2
    However, the current implementation is:
    coeff = 1 / np.sqrt(np.max(np.abs(im)))
    im = (coeff * im) ** 2
    This applies a plain square, which yields f(x) = (c·x)^2 ≥ 0 for all input values. I confirmed this on real CT images: after the square operation, every output value is non-negative, regardless of the sign of the input.
    Question:
    1.1 Is this a documentation issue, or should the implementation preserve the sign of the original intensities?
    1.2 In its current form, the transformation behaves like a squared magnitude (i.e., similar to |x|^2), where values symmetric around zero are mapped to the same output. This loses sign information and makes the transformation non-invertible, because distinct inputs (e.g., +x and −x) collapse into identical outputs. Consequently, this may alter the intended purpose of the square filter, which appears to be enhancing signal magnitude while preserving the underlying intensity structure.
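
    To make the difference concrete, here is a minimal NumPy sketch. The first function mirrors the two lines quoted above; the second, `square_sign_preserving`, is a hypothetical variant I wrote to illustrate what the docstring seems to describe. It is not existing PyRadiomics code, and the sample intensities are made up.

```python
import numpy as np

def square_as_quoted(im):
    # Mirrors the two implementation lines quoted in this issue.
    coeff = 1 / np.sqrt(np.max(np.abs(im)))
    return (coeff * im) ** 2

def square_sign_preserving(im):
    # Hypothetical variant: same scaling, but the sign of the input is kept.
    coeff = 1 / np.sqrt(np.max(np.abs(im)))
    return np.sign(im) * (coeff * np.abs(im)) ** 2

# Made-up HU-like intensities containing both signs.
im = np.array([-900.0, -30.0, 0.0, 30.0, 400.0])

plain = square_as_quoted(im)
signed = square_sign_preserving(im)

print("quoted implementation:", plain)
print("sign-preserving sketch:", signed)
```

    With the quoted implementation, -30 and +30 map to the same output value; with the sign-preserving sketch they differ only in sign, and the magnitudes of the two results agree.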

  2. Per-ROI discretization and gray-level scaling inconsistency
    From the implementation of getBinEdges() and binImage():
    binEdges = getBinEdges(parameterMatrix[parameterMatrixCoordinates], **kwargs)
    This implies that the minimum and maximum intensity values are determined per ROI, and gray levels are effectively rescaled so that the lowest gray level within each ROI is mapped to 1. As a result, the gray-level representation becomes relative within each ROI rather than reflecting absolute physical intensities (e.g., HU).
    For example, consider three ROIs with identical spatial texture patterns but different intensity ranges: ROI A: [-900, -200]; ROI B: [-400, 300]; ROI C: [-400, 300] with a single extreme voxel at -900. Under per-ROI discretization, ROIs A and B may produce identical discretized gray-level structures, and therefore identical glcm, glrlm, glszm, gldm, and ngtdm values, despite having very different absolute intensity ranges. ROI C, because of a small change in range (one extreme voxel), may yield substantially different texture features from ROI B.
    In other words, this may reduce inter-ROI and inter-patient comparability. Potential concerns include: loss of absolute intensity information; potential bias in studies relying on intensity-based features; reduced comparability across patients, scanners, or institutions; and possible impact on longitudinal or multi-center studies.
    Question:
    2.1 I would like to clarify whether per-ROI discretization is an intentional design choice in PyRadiomics, or whether there are recommended strategies for performing discretization at a global (cohort-level) scale to preserve absolute gray-level information.
    2.2 In addition, I am interested in how gray-level–dependent features should be interpreted under this scheme. For example, features such as High Gray Level Run Emphasis (e.g., GLRLM_HighGrayLevelRunEmphasis) are designed to quantify the distribution of higher gray-level intensities. Under per-ROI discretization, where gray levels are normalized relative to each ROI, it is unclear whether these features are: intended primarily for within-ROI characterization, or expected to be comparable across ROIs and patients. More specifically, if gray levels are rescaled per ROI, then “high gray level” effectively becomes a relative concept within each ROI, rather than reflecting absolute intensity (e.g., HU). This may affect the biological or physical interpretability of such features when used in inter-patient analyses.
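
    The ROI A/B/C example above can be reproduced with a small NumPy sketch. Both helpers (`discretize_per_roi`, `discretize_fixed`) are hypothetical and written only for this issue; they are a simplified stand-in for the behavior I am describing, not the actual `getBinEdges()`/`binImage()` code. The voxel values, the bin count of 7, and the shared edges from -1000 to 400 in steps of 100 are all assumptions chosen for illustration.

```python
import numpy as np

def discretize_per_roi(values, bin_count):
    # Hypothetical helper: equal-width bins spanning this ROI's own min..max,
    # so the lowest bin always receives gray level 1.
    edges = np.linspace(values.min(), values.max(), bin_count + 1)
    return np.digitize(values, edges[1:-1]) + 1

def discretize_fixed(values, edges):
    # Hypothetical helper: absolute bin edges shared by every ROI.
    return np.digitize(values, edges)

# Same relative pattern, shifted to two different intensity ranges.
pattern = np.array([0.0, 100.0, 350.0, 700.0])
roi_a = pattern - 900.0            # spans [-900, -200]
roi_b = pattern - 400.0            # spans [-400, 300]
roi_c = np.append(roi_b, -900.0)   # ROI B plus one extreme voxel

# Per-ROI bins: compare the gray levels of the four shared voxels.
per_roi_a = discretize_per_roi(roi_a, 7)
per_roi_b = discretize_per_roi(roi_b, 7)
per_roi_c = discretize_per_roi(roi_c, 7)[:4]

# Shared absolute edges (assumed range and width).
shared_edges = np.arange(-1000.0, 401.0, 100.0)
fixed_a = discretize_fixed(roi_a, shared_edges)
fixed_b = discretize_fixed(roi_b, shared_edges)
fixed_c = discretize_fixed(roi_c, shared_edges)[:4]

print("per-ROI:", per_roi_a, per_roi_b, per_roi_c)
print("fixed:  ", fixed_a, fixed_b, fixed_c)
```

    In this sketch, per-ROI binning assigns ROIs A and B the same gray levels, while the single extreme voxel in ROI C shifts the levels of the voxels it shares with ROI B. With shared absolute edges, A and B are distinguished and the shared voxels of B and C keep the same levels. Whether something like the second scheme is the recommended way to get cohort-level comparability in PyRadiomics is exactly what I am asking in 2.1.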
