
Add C-RADIOv4 to model zoo#6888

Open
CharlesCNorton wants to merge 1 commit into develop from feature/c-radio-zoo

Conversation


@CharlesCNorton CharlesCNorton commented Feb 5, 2026

Summary

  • Add NVIDIA C-RADIOv4 vision foundation models to the model zoo
  • New fiftyone/utils/radio.py wrapper with dual output modes
  • Two model variants: C-RADIOv4-H (631M) and C-RADIOv4-SO400M (412M)

Models Added

Model | Size | Description
c-radio-v4-h-torch | 2.52 GB | Maximum accuracy, 83.09% zero-shot ImageNet
c-radio-v4-so400m-torch | 1.65 GB | Balanced performance, faster inference

Usage

import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart", max_samples=25)

# Embeddings (2560-dim)
model = foz.load_zoo_model("c-radio-v4-h-torch")
dataset.compute_embeddings(model, embeddings_field="radio_embeddings")

# Spatial attention heatmaps
model = foz.load_zoo_model("c-radio-v4-h-torch", output_type="spatial")
dataset.apply_model(model, label_field="radio_attention")

Testing

94 tests passing across 5 categories:

Config (16 tests)

  • Default values, custom values, type inheritance
  • All parameters: hf_repo, output_type, use_mixed_precision, apply_smoothing, smoothing_sigma
  • SO400M variant, combined options

Output Processors (25 tests)

  • Batch sizes 1, 2, 8, 16
  • Dtypes: float32, float16, bfloat16, GPU tensors, numpy passthrough
  • NCHW and NLC spatial formats
  • NaN/Inf handling, constant input, smoothing effects
  • Variable output sizes

Inference (26 tests)

  • Image content: RGB, black, white, red, gradient, random pixels
  • Resolutions: 32px, 128px, 256px, 512px, 1024px, 1080p, 4K
  • Aspect ratios: square, portrait, landscape, extreme (100x1000, 1000x100)
  • Batch sizes 1-8, mixed sizes within batch
  • Determinism: same image produces same embedding
  • Spatial heatmaps match original image dimensions

FiftyOne Integration (13 tests)

  • compute_embeddings() stores 2560-dim vectors
  • apply_model() stores Heatmap labels
  • Works with views (take, skip, filtered)
  • Re-computation produces identical results
  • sort_by_similarity() with brain index
  • compute_uniqueness() scores
  • Multiple label fields on same dataset
  • Shuffled datasets

Edge Cases (14 tests)

  • Color modes: grayscale, RGBA, CMYK, palette, 1-bit → RGB
  • Odd dimensions (511x513), prime dimensions (509x503), power-of-2
  • File I/O: PNG, JPEG
  • Model reuse across multiple batches
  • Tiny (64px) and large (2048px) spatial heatmaps

Summary by CodeRabbit

  • New Features

    • Added C-RADIOv4 model support for end-to-end inference producing summary embeddings and spatial heatmaps. Includes mixed-precision support, an automatic feature-extraction mode for summaries, configurable spatial smoothing, resizing of heatmaps to the original frame dimensions, and robust per-sample and batched processing with device and memory handling.
  • Tests

    • Comprehensive deterministic test suite covering configs, embeddings, heatmaps, varied image sizes/formats, dataset integration, reuse, and edge‑case robustness.

@CharlesCNorton CharlesCNorton requested a review from a team as a code owner February 5, 2026 12:33

coderabbitai bot commented Feb 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a new fiftyone.utils.radio module implementing a C-RADIOv4 wrapper that supports loading models from HuggingFace, preprocessing with a CLIP image processor, mixed-precision inference, and post-processing to produce either 1D summary embeddings or spatial heatmaps. New utilities: RadioGetItem, RadioOutputProcessor, SpatialHeatmapOutputProcessor, CRadioV4ModelConfig, and CRadioV4Model. Also adds comprehensive unit tests exercising configs, processors, inference paths, edge cases, and FiftyOne integration.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CRadioV4Model
    participant RadioGetItem
    participant ImageProcessor
    participant HFModel
    participant OutputProcessor
    participant FiftyOne

    User->>CRadioV4Model: predict(samples)
    CRadioV4Model->>RadioGetItem: build_get_item / load batch
    RadioGetItem->>RadioGetItem: read filepath & convert to RGB
    RadioGetItem-->>CRadioV4Model: PIL images
    CRadioV4Model->>ImageProcessor: preprocess → tensors
    ImageProcessor-->>CRadioV4Model: model inputs
    CRadioV4Model->>HFModel: forward pass (autocast if enabled)
    HFModel-->>CRadioV4Model: embeddings or spatial features
    CRadioV4Model->>OutputProcessor: process outputs (summary or spatial)
    OutputProcessor-->>CRadioV4Model: embeddings or Heatmaps
    CRadioV4Model-->>User: results
    User->>FiftyOne: store results in dataset/view
sequenceDiagram
    participant Config as CRadioV4ModelConfig
    participant Model as CRadioV4Model
    participant P_SUM as RadioOutputProcessor
    participant P_SPAT as SpatialHeatmapOutputProcessor

    Note over Config: output_type selects processing path

    alt Summary Mode
        Config->>Config: set as_feature_extractor=True
        Model->>P_SUM: select RadioOutputProcessor
        P_SUM->>P_SUM: tensor → numpy embeddings
        P_SUM-->>Model: List[np.ndarray]
    else Spatial Mode
        Config->>Config: as_feature_extractor=False
        Model->>P_SPAT: select SpatialHeatmapOutputProcessor
        P_SPAT->>P_SPAT: PCA → optional smoothing → resize → normalize → Heatmaps
        P_SPAT-->>Model: List[Heatmap]
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 1 passed | ❌ 2 failed

❌ Failed checks (2 warnings)

Check name | Status | Explanation | Resolution
Description check | ⚠️ Warning | The description covers the proposed changes, testing details, and usage examples, but omits the release notes section from the template. | Complete the Release Notes section by stating whether the change is user-facing and providing a brief description for the release notes.
Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.84%, below the required threshold of 80.00%. | Write docstrings for the functions that are missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name | Status | Explanation
Title check | ✅ Passed | The title accurately summarizes the main change: adding C-RADIOv4 models to the FiftyOne model zoo.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🤖 Fix all issues with AI agents
In `@fiftyone/utils/radio.py`:
- Around line 374-407: The current loop in fiftyone.utils.radio (where
pixel_values are created per img, resized to nearest_res via
_model.get_nearest_supported_resolution, and the model invoked per image)
prevents GPU batching; instead gather pixel_values for images that share the
same target resolution (or accumulate until a max batch size), stack them into a
batched tensor on self._device, then run the model once under the existing
autocast/no_grad logic (keep use_amp check and torch.autocast("cuda",
dtype=torch.bfloat16) and torch.no_grad()), and finally split/extend summaries
and spatial_features into the original per-image order; update the loop around
the calls to self._image_processor, the resizing block that queries
_model.get_nearest_supported_resolution, and the model invocation so they
operate on batched pixel_values rather than per-image tensors.
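The bucketing strategy this prompt describes can be sketched independently of the model code. In the sketch below, `group_key` is a stand-in for `_model.get_nearest_supported_resolution` and `run_batch` for the batched autocast/no_grad model call; both are hypothetical placeholders, not the actual fiftyone API:

```python
from collections import defaultdict

def batched_in_order(items, group_key, run_batch):
    """Group items by key, process each group as a single batch, and
    return the results in the original item order."""
    groups = defaultdict(list)  # key -> list of (original_index, item)
    for idx, item in enumerate(items):
        groups[group_key(item)].append((idx, item))

    results = [None] * len(items)
    for key, members in groups.items():
        indices = [i for i, _ in members]
        batch = [x for _, x in members]
        outputs = run_batch(batch)  # one model call per resolution bucket
        for i, out in zip(indices, outputs):
            results[i] = out
    return results
```

Restoring the per-image order at the end is what lets the caller keep treating the output as parallel to the input list, even though the GPU work happened in resolution buckets.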
- Around line 122-130: The code that reshapes `spatial` into H×W when computing
H and W (the block handling `if spatial.ndim == 2:` in fiftyone/utils/radio.py)
can produce a 1×N layout when N is prime; add a clear signal when that happens:
detect the case where after the divisor-search loop H==1 and N>1 and emit a
warning (e.g. warnings.warn or use the module logger) explaining that a prime
token count led to a 1×N layout which may affect PCA quality, and mention the
expected behavior and possible remedies (padding or truncation) in the message
or docstring for the function that performs this reshape so callers are aware.
Ensure the check references the same `spatial`, `N`, `H`, `W` variables used in
the existing block.
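A minimal sketch of the divisor search plus the suggested warning; the function name and message wording are illustrative, not the actual module code:

```python
import warnings

import numpy as np

def reshape_tokens_to_grid(spatial):
    """Reshape (N, C) token features to (H, W, C) using the most
    square H x W factorization of N. Warns when N is prime, since the
    resulting 1 x N layout can degrade PCA quality."""
    n_tokens, channels = spatial.shape
    height = int(np.sqrt(n_tokens))
    while height > 1 and n_tokens % height != 0:
        height -= 1
    width = n_tokens // height
    if height == 1 and n_tokens > 1:
        warnings.warn(
            "Token count %d is prime; falling back to a 1 x %d layout, "
            "which may reduce spatial heatmap quality. Consider padding "
            "or truncating to a composite token count." % (n_tokens, n_tokens)
        )
    return spatial.reshape(height, width, channels)
```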
- Around line 349-356: The broad except Exception in the mixed-precision
detection block should be narrowed: catch specific CUDA/runtime errors (e.g.,
RuntimeError and torch.cuda.CudaError or the appropriate torch.cuda.* exception
types) around the calls to torch.cuda.get_device_capability(self._device) and
torch.cuda.is_available(), log the error with logger.warning including the
exception, and re-raise unexpected exceptions so they’re not silently swallowed;
update the try/except surrounding these calls (the mixed-precision check using
torch.cuda.get_device_capability and logger.warning) to only handle those
specific error classes and let other exceptions propagate.
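The narrow-catch shape being asked for can be illustrated with stdlib exception types; in the real code the tuple would name the torch CUDA error classes around `torch.cuda.get_device_capability`, and anything else would propagate:

```python
import logging

logger = logging.getLogger(__name__)

def detect_capability(probe):
    """Run a hardware-capability probe, tolerating only the error types
    that a missing or misbehaving device is expected to raise.
    Unexpected exceptions propagate instead of being swallowed."""
    try:
        return probe()
    except (RuntimeError, OSError) as e:  # e.g. CUDA init/driver failures
        logger.warning("Capability probe failed; disabling feature: %s", e)
        return None
```

A genuine programming error (say, a `ValueError` from a typo) now surfaces immediately instead of silently disabling mixed precision.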
- Around line 47-50: The code currently returns
Image.open(filepath).convert("RGB") which leaves a lazy file handle open; change
it to open the file in a context manager using
Image.open(sample_dict["filepath"]) as img, call img.convert("RGB") and then
call .load() (or copy the image) inside the with-block so the file is fully read
and the handle closed before returning the PIL Image; update the return site
that currently uses Image.open(...).convert("RGB") to this pattern (referencing
Image.open, .convert, .load and sample_dict["filepath"] in
fiftyone/utils/radio.py).
- Around line 329-337: The _load_model method currently calls
AutoModel.from_pretrained(..., trust_remote_code=True) without documenting the
security implications or pinning a revision; update the _load_model function to
include a docstring that explains that trust_remote_code=True is required for
RADIO models but allows arbitrary remote code execution (security trade-off),
and update the call to AutoModel.from_pretrained in _load_model to accept an
optional pinned revision parameter (e.g., revision="<commit_sha>") or read
revision from config (config.hf_revision) so callers can pin to a specific
commit to avoid unexpected code changes; reference the symbols _load_model,
AutoModel.from_pretrained, trust_remote_code, and
config.hf_repo/config.hf_revision in your change.
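A sketch of the revision-resolution half of this suggestion; `hf_revision` is the hypothetical config key named above, and the commented `from_pretrained` call shows where the pinned value would be forwarded (`revision` is a real `transformers` keyword argument):

```python
def resolve_hf_revision(config, default=None):
    """Return the pinned HuggingFace revision from a config mapping,
    falling back to a caller-supplied default (e.g. "main").

    The resolved value would then be forwarded as, for example:

        AutoModel.from_pretrained(
            repo, trust_remote_code=True, revision=revision
        )

    so that trust_remote_code only ever executes code from a known commit.
    """
    return config.get("hf_revision") or default
```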

In `@tests/unittests/utils/test_radio.py`:
- Around line 224-232: The test function test_radio_proc_gpu currently returns
early when no GPU is present, silently skipping the case; update it to use
pytest.mark.skipif on the test function (or call pytest.skip) using the
condition not torch.cuda.is_available() so the test is reported as skipped,
while keeping the logic that constructs RadioOutputProcessor, creates a CUDA
tensor, calls proc(tensor, (100, 100)) and asserts the numpy result; reference
the test function name test_radio_proc_gpu and the RadioOutputProcessor class
and torch.cuda.is_available() when making the change.
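The difference between silently returning and reporting a skip can be shown with the stdlib `unittest` equivalent of `pytest.mark.skipif`; here the GPU probe is hard-coded to False so the skip path runs, and the runner still records the test as skipped rather than passed:

```python
import unittest

def gpu_available():
    # Stand-in for torch.cuda.is_available(); always False in this sketch
    return False

class TestRadioProcGPU(unittest.TestCase):
    @unittest.skipUnless(gpu_available(), "CUDA GPU not available")
    def test_radio_proc_gpu(self):
        self.fail("would only run on a GPU machine")

suite = unittest.TestLoader().loadTestsFromTestCase(TestRadioProcGPU)
run = unittest.TextTestRunner(verbosity=0).run(suite)
```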
- Around line 16-34: The file uses a custom test decorator named "test" (with
inner "decorator" and "wrapper") and global counters PASSED/FAILED; replace
these with pytest-style tests: remove the "test" decorator and its wrapper logic
(including global PASSED/FAILED and traceback printing), convert each decorated
test into a plain function whose name begins with "test_" and use plain assert
statements for checks, add pytest fixtures or pytest.mark.parametrize where
needed instead of the custom pattern, and ensure you import pytest at the top so
the tests can be run with "pytest tests/unittests/utils/test_radio.py".
- Around line 1-11: The test module calls sys.exit(1) when a test fails but
never imports the sys module; add an import for sys to the module's imports
(place it alongside existing imports like logging, numpy, PIL, torch) so that
the reference to sys.exit in the failure handling code resolves correctly and no
NameError is raised.
- Around line 1051-1065: The test uses insecure tempfile.mktemp() to create
temp_path for the JPEG; replace that with a NamedTemporaryFile(delete=False)
context to securely create the file, write/save the image to temp.name, use
temp.name when opening with Image.open and ensure you close the
NamedTemporaryFile (or let context close it) and still unlink the file in the
finally block; update the block around creating/saving/opening the image and the
call to model._predict_all([loaded]) accordingly so it uses temp.name instead of
temp_path.
- Around line 1029-1044: Replace the insecure tempfile.mktemp usage with a
secure temporary file: create a tempfile.NamedTemporaryFile(delete=False) as
tmp, set temp_path = tmp.name (and close tmp before saving/reading the PIL
Image), save the image to temp_path, open it for processing with
model._predict_all([loaded]) as before, and ensure the finally block still
unlinks temp_path; update the variables in the test_radio function accordingly
to remove mktemp() and use NamedTemporaryFile safely.
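Both mktemp fixes follow the same shape. A self-contained sketch of the secure create/write/process/unlink lifecycle, where the `process` callback stands in for the `Image.open` + `model._predict_all([loaded])` steps:

```python
import os
import tempfile

def with_secure_temp(data, process):
    """Write data to a securely created temp file, hand its path to
    `process`, and always unlink the file afterwards."""
    tmp = tempfile.NamedTemporaryFile(suffix=".jpg", delete=False)
    temp_path = tmp.name
    tmp.close()  # close the handle before re-opening (Windows-safe)
    try:
        with open(temp_path, "wb") as f:
            f.write(data)
        return process(temp_path)
    finally:
        os.unlink(temp_path)
```

Unlike `tempfile.mktemp()`, `NamedTemporaryFile` actually creates the file atomically, so there is no window in which another process can claim the same name.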


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@tests/unittests/utils/test_radio.py`:
- Around line 684-698: The test test_fo_embeddings_field creates a dataset and
calls dataset.compute_embeddings but deletes the dataset only after the
assertion, so failures leak datasets; update test_fo_embeddings_field to wrap
the dataset usage in a try/finally (create dataset, then try: run
compute_embeddings and assert the field, finally: call
fo.delete_dataset(dataset.name)) to guarantee cleanup even on assertion errors,
and apply the same try/finally pattern to the other dataset-creating tests in
this module (referencing dataset, compute_embeddings, and fo.delete_dataset).
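The try/finally shape being requested, sketched with a stub in place of `fo.delete_dataset` so the cleanup-on-failure behavior is visible:

```python
deleted = []

def delete_dataset(name):
    # Stand-in for fo.delete_dataset(name)
    deleted.append(name)

def run_with_dataset(name, body):
    """Create a (stub) dataset, run the test body, and guarantee the
    dataset is deleted even if the body raises."""
    try:
        return body()
    finally:
        delete_dataset(name)
```

In the real tests the dataset creation and `compute_embeddings` call go inside `body`, and the `finally` clause makes sure a failed assertion no longer leaks a named dataset into the local FiftyOne database.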
- Around line 49-55: The tests use boolean equality comparisons that trigger
Ruff E712; update the assertions in test_config_output_summary and other tests
referencing CRadioV4ModelConfig attributes to use truthiness or negation instead
of `== True`/`== False` — e.g., replace `assert config.as_feature_extractor ==
True` with `assert config.as_feature_extractor`, `assert
config.use_mixed_precision == True` with `assert config.use_mixed_precision`,
and `assert config.use_mixed_precision == False` / `assert
config.apply_smoothing == False` with `assert not config.use_mixed_precision`
and `assert not config.apply_smoothing` respectively so all checks for
as_feature_extractor, use_mixed_precision, and apply_smoothing use idiomatic
truthiness.
- Around line 17-35: Wrap heavy-inference tests with an opt-in guard and reuse a
cached model: update the test decorator (the test() decorator used in tests) to
accept a requires_assets=True flag and skip the test early unless an env var
like RUN_ASSETS_TESTS (or pytest.mark) is set; add a module-level fixture (e.g.,
cached_cradio_model) that instantiates CRadioV4Model once and returns it for
reuse across tests; change tests that call model._predict_all(),
dataset.compute_embeddings(..., model, ...), or dataset.apply_model(..., model,
...) to use requires_assets=True on the decorator and to use the
cached_cradio_model fixture instead of constructing new CRadioV4Model instances
so downloads and inference are gated and the model is reused.
- Around line 660-669: The test test_infer_same_image fails under
mixed-precision GPU runs because np.allclose uses too-strict default tolerances;
update the assertion in the test that compares embeddings produced by
CRadioV4Model._predict_all to use relaxed tolerances (e.g., set rtol and atol
such as rtol=1e-3, atol=1e-3) so small numeric drift from
CRadioV4ModelConfig/CRadioV4Model inference on Ampere+ GPUs is accepted while
still ensuring embeddings are effectively identical.

Comment on lines +17 to +35
def test(name):
    """Decorator for test functions."""
    def decorator(func):
        def wrapper():
            global PASSED, FAILED
            print(f"\n{'='*70}")
            print(f"TEST: {name}")
            print('='*70)
            try:
                func()
                print(f"[PASSED] {name}")
                PASSED += 1
            except Exception as e:
                print(f"[FAILED] {name}")
                print(f" Error: {e}")
                traceback.print_exc()
                FAILED += 1
        return wrapper
    return decorator


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the test file
find . -type f -name "test_radio.py" | head -5

Repository: voxel51/fiftyone

Length of output: 97


🏁 Script executed:

# Check the content of the test file to see actual test structure
fd test_radio.py -x cat -n {} | head -150

Repository: voxel51/fiftyone

Length of output: 6455


🏁 Script executed:

# Search for model instantiation patterns in test files
rg "CRadioV4Model|HuggingFace|inference|load.*model" tests/unittests/utils/test_radio.py -A 3

Repository: voxel51/fiftyone

Length of output: 18745


🏁 Script executed:

# Check if there are any existing skip/gate mechanisms
rg "pytest.mark.skip|os.getenv|FIFTYONE_RUN|requires" tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Check for model reuse or caching patterns
rg "cache|@.*cache|global.*[Mm]odel" tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 42


Gate heavy model downloads/inference in unit tests.

These tests repeatedly instantiate CRadioV4Model and run inference on HuggingFace models (nvidia/C-RADIOv4-H, etc.), which will download multi-GB weights on every test run. This makes default CI runs slow, network-dependent, and fragile. Gate inference-heavy tests behind an environment flag and implement module-level model caching to avoid re-downloads.

🔧 Example guard using the existing decorator
+RUN_RADIO_TESTS = os.getenv("FIFTYONE_RUN_RADIO_TESTS") == "1"
-def test(name):
+def test(name, requires_assets=False):
     """Decorator for test functions."""
     def decorator(func):
         def wrapper():
             global PASSED, FAILED
+            if requires_assets and not RUN_RADIO_TESTS:
+                print(f"[SKIPPED] {name} (set FIFTYONE_RUN_RADIO_TESTS=1)")
+                return
             print(f"\n{'='*70}")
             print(f"TEST: {name}")

Apply requires_assets=True to all tests calling model._predict_all(), dataset.compute_embeddings(model, ...), or dataset.apply_model(model, ...). Add a module-level fixture to cache the loaded model instance and reuse across tests.

🧰 Tools
🪛 Ruff (0.14.14)

[warning] 19-19: Missing return type annotation for private function decorator

(ANN202)


[warning] 20-20: Missing return type annotation for private function wrapper

Add return type annotation: None

(ANN202)


[warning] 29-29: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In `@tests/unittests/utils/test_radio.py` around lines 17 - 35, Wrap
heavy-inference tests with an opt-in guard and reuse a cached model: update the
test decorator (the test() decorator used in tests) to accept a
requires_assets=True flag and skip the test early unless an env var like
RUN_ASSETS_TESTS (or pytest.mark) is set; add a module-level fixture (e.g.,
cached_cradio_model) that instantiates CRadioV4Model once and returns it for
reuse across tests; change tests that call model._predict_all(),
dataset.compute_embeddings(..., model, ...), or dataset.apply_model(..., model,
...) to use requires_assets=True on the decorator and to use the
cached_cradio_model fixture instead of constructing new CRadioV4Model instances
so downloads and inference are gated and the model is reused.

Comment on lines +660 to +669
@test("Inference: same image produces same embedding")
def test_infer_same_image():
    from fiftyone.utils.radio import CRadioV4ModelConfig, CRadioV4Model

    config = CRadioV4ModelConfig({"output_type": "summary"})
    model = CRadioV4Model(config)
    img = Image.new("RGB", (512, 512), color=(128, 128, 128))
    with model:
        result1 = model._predict_all([img])
        result2 = model._predict_all([img])
    assert np.allclose(result1[0], result2[0])


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the test file
fd "test_radio.py" --type f

Repository: voxel51/fiftyone

Length of output: 95


🏁 Script executed:

# Search for CRadioV4Model to understand the implementation
rg "class CRadioV4Model" -A 10

Repository: voxel51/fiftyone

Length of output: 1487


🏁 Script executed:

# Check how _predict_all is implemented
rg "_predict_all" -B 2 -A 15

Repository: voxel51/fiftyone

Length of output: 50372


🏁 Script executed:

# Look for existing determinism or tolerance handling in similar tests
rg "np\.allclose" tests/ -B 2 -A 2

Repository: voxel51/fiftyone

Length of output: 5761


🏁 Script executed:

# Check the exact line numbers of the test
cat -n tests/unittests/utils/test_radio.py | sed -n '660,670p'

Repository: voxel51/fiftyone

Length of output: 601


🏁 Script executed:

# Look at the CRadioV4Model _predict_all implementation
rg "class CRadioV4Model" -A 50 fiftyone/utils/radio.py | head -80

Repository: voxel51/fiftyone

Length of output: 2699


🏁 Script executed:

# Check if there's any torch.manual_seed or determinism settings
rg "manual_seed|deterministic|set_deterministic" fiftyone/utils/radio.py

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Check other embedding/inference tests to see how they handle tolerance
rg "def test_infer" tests/unittests/utils/test_radio.py -A 10 | head -60

Repository: voxel51/fiftyone

Length of output: 2361


🏁 Script executed:

# Check TorchImageModel to understand _predict_all
rg "class TorchImageModel" -A 100 fiftyone/utils/torch.py | grep -A 50 "_predict_all"

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Look at how the model runs on GPU and if there's determinism control
rg "device|cuda|GPU|torch.set" fiftyone/utils/radio.py

Repository: voxel51/fiftyone

Length of output: 491


🏁 Script executed:

# Check the model_tests.py to see how embeddings are tested
cat tests/intensive/model_tests.py | grep -A 20 "_assert_embeddings_equal"

Repository: voxel51/fiftyone

Length of output: 2011


🏁 Script executed:

# Look for any prior tolerance-related comments or patterns in radio.py
cat fiftyone/utils/radio.py | grep -i "tolerance\|deterministic\|precision\|allclose" -B 2 -A 2

Repository: voxel51/fiftyone

Length of output: 1410


🏁 Script executed:

# Check if there are any comments or context around the test about determinism
cat -n tests/unittests/utils/test_radio.py | sed -n '650,675p'

Repository: voxel51/fiftyone

Length of output: 1368


🏁 Script executed:

# Look at the full radio.py file to understand _predict_all implementation
wc -l fiftyone/utils/radio.py

Repository: voxel51/fiftyone

Length of output: 87


🏁 Script executed:

# Get the _predict_all method in CRadioV4Model
rg "def _predict_all" fiftyone/utils/radio.py -A 30

Repository: voxel51/fiftyone

Length of output: 1098


🏁 Script executed:

# Check if torch is seeded anywhere in the test file
rg "seed|manual_seed" tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 147


🏁 Script executed:

# Get the full _predict_all method
rg "def _predict_all" fiftyone/utils/radio.py -A 60

Repository: voxel51/fiftyone

Length of output: 2065


🏁 Script executed:

# Check if the model is set to eval mode and no_grad context
rg "eval()\|torch.no_grad\|requires_grad" fiftyone/utils/radio.py

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Check if there are other tests in the codebase that compare embeddings from same model runs
rg "result1.*result2\|_predict_all.*_predict_all" tests/unittests/utils/ -B 5 -A 5

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Search for any GitHub issues or comments about non-determinism in C-RADIOv4
rg "deterministic\|non-deterministic\|flaky" fiftyone/utils/radio.py

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Check how the model is initialized - look for determinism flags
cat fiftyone/utils/radio.py | grep -A 20 "def __init__" | head -40

Repository: voxel51/fiftyone

Length of output: 1558


🏁 Script executed:

# Look at the model loading to see if there are any determinism settings
rg "from_pretrained\|load_pretrained" fiftyone/utils/radio.py -B 2 -A 5

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Check if bfloat16 is used and if that affects determinism
rg "bfloat16\|autocast" fiftyone/utils/radio.py -B 3 -A 3

Repository: voxel51/fiftyone

Length of output: 42


🏁 Script executed:

# Check what the default np.allclose tolerances are
python3 << 'EOF'
import numpy as np
import inspect
sig = inspect.signature(np.allclose)
print("np.allclose defaults:")
for param_name, param in sig.parameters.items():
    if param.default != inspect.Parameter.empty:
        print(f"  {param_name}: {param.default}")
EOF

Repository: voxel51/fiftyone

Length of output: 177


Add tolerance to accommodate GPU numeric drift from mixed precision.

The model uses bfloat16 mixed precision by default on Ampere+ GPUs (use_mixed_precision=True). This can introduce tiny numeric variations across runs, even with identical inputs. The default np.allclose tolerances (rtol=1e-05, atol=1e-08) are too strict for GPU inference with reduced precision.

🔧 Suggested tolerance
-    assert np.allclose(result1[0], result2[0])
+    assert np.allclose(result1[0], result2[0], rtol=1e-4, atol=1e-6)
🤖 Prompt for AI Agents
In `@tests/unittests/utils/test_radio.py` around lines 660 - 669, The test
test_infer_same_image fails under mixed-precision GPU runs because np.allclose
uses too-strict default tolerances; update the assertion in the test that
compares embeddings produced by CRadioV4Model._predict_all to use relaxed
tolerances (e.g., set rtol and atol such as rtol=1e-3, atol=1e-3) so small
numeric drift from CRadioV4ModelConfig/CRadioV4Model inference on Ampere+ GPUs
is accepted while still ensuring embeddings are effectively identical.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/unittests/utils/test_radio.py`:
- Around line 17-34: The _test decorator's wrapper currently swallows all
exceptions (incrementing FAILED) which hides failures from pytest; modify
wrapper in function _test to re-raise the caught exception when the process is
not running as "__main__" (i.e., after incrementing FAILED and printing trace
info, if __name__ != "__main__" then raise), preserving the existing
print/traceback/PASSED/FAILED behavior for standalone runs. Ensure you reference
and update the wrapper function within _test and keep PASSED and FAILED
increments unchanged.

Comment on lines +17 to +34
def _test(name):
    """Decorator for test functions."""
    def decorator(func):
        def wrapper():
            global PASSED, FAILED
            print(f"\n{'='*70}")
            print(f"TEST: {name}")
            print('='*70)
            try:
                func()
                print(f"[PASSED] {name}")
                PASSED += 1
            except Exception as e:
                print(f"[FAILED] {name}")
                print(f" Error: {e}")
                traceback.print_exc()
                FAILED += 1
        return wrapper


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n tests/unittests/utils/test_radio.py | head -50

Repository: voxel51/fiftyone

Length of output: 1813


🏁 Script executed:

tail -50 tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 1365


🏁 Script executed:

grep -n "if __name__" tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 91


🏁 Script executed:

sed -n '1100,1110p' tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 372


🏁 Script executed:

sed -n '17,36p' tests/unittests/utils/test_radio.py

Repository: voxel51/fiftyone

Length of output: 629


Re-raise exceptions from _test decorator when running under pytest

The decorator catches exceptions but never re-raises them. When pytest imports and runs these tests, all failures are silently swallowed and pytest reports them as passes. Apply the proposed fix to re-raise exceptions only when not running as __main__:

Fix
             except Exception as e:
                 print(f"[FAILED] {name}")
                 print(f"  Error: {e}")
                 traceback.print_exc()
                 FAILED += 1
+                if __name__ != "__main__":
+                    raise
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 17-17: Missing return type annotation for private function _test

(ANN202)


[warning] 19-19: Missing return type annotation for private function decorator

(ANN202)


[warning] 20-20: Missing return type annotation for private function wrapper

Add return type annotation: None

(ANN202)


[warning] 29-29: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In `@tests/unittests/utils/test_radio.py` around lines 17 - 34, The _test
decorator's wrapper currently swallows all exceptions (incrementing FAILED)
which hides failures from pytest; modify wrapper in function _test to re-raise
the caught exception when the process is not running as "__main__" (i.e., after
incrementing FAILED and printing trace info, if __name__ != "__main__" then
raise), preserving the existing print/traceback/PASSED/FAILED behavior for
standalone runs. Ensure you reference and update the wrapper function within
_test and keep PASSED and FAILED increments unchanged.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@tests/unittests/utils/test_radio.py`:
- Around line 926-934: The test test_edge_power2_dims currently opens and closes
the CRadioV4Model context inside the for loop causing repeated model
load/unload; move the context manager so the single with model: wraps the entire
loop (create CRadioV4ModelConfig and CRadioV4Model as before, then enter with
model: once, iterate sizes calling model._predict_all([img]) and assert the
result shape) to reuse the loaded model instead of re-entering on each
iteration.
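The cost of re-entering the context on every iteration can be made visible with a toy context manager that counts loads; the fix is simply hoisting `with model:` above the loop:

```python
class CountingModel:
    """Toy context manager that counts how many times it is entered,
    standing in for the (expensive) CRadioV4Model load/unload cycle."""

    def __init__(self):
        self.enters = 0

    def __enter__(self):
        self.enters += 1
        return self

    def __exit__(self, *exc):
        return False

    def predict(self, size):
        return (2560,)  # stand-in for an embedding shape

model = CountingModel()
shapes = []
with model:  # entered once, reused across all sizes
    for size in (64, 128, 256, 512):
        shapes.append(model.predict(size))
```

With the context inside the loop the model would be entered four times; hoisted, it is entered exactly once.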
- Around lines 3-4: the module-level call to `logging.basicConfig(level=logging.WARNING)` in tests/unittests/utils/test_radio.py reconfigures pytest's logging capture. Either remove the call, guard it behind `if __name__ == "__main__":` so it does not execute during test collection, or configure logging per-test using pytest's caplog fixture; the goal is to avoid overriding pytest's root logger.
- Around lines 380-520: consolidate the repetitive inference tests with parametrized pytest tests. Add a parametrize over (width, height) resolution cases that asserts `CRadioV4Model(config)._predict_all([Image.new("RGB", (w, h))])[0].shape == (2560,)` (constructing the model from CRadioV4ModelConfig and using the context manager), and a second parametrize over batch sizes that builds lists of `Image.new(...)` of varying lengths and asserts `len(model._predict_all(imgs))` equals the batch size. Apply the same pattern to the color and aspect-ratio variants, replacing the original test functions (e.g. test_infer_128..test_infer_4k, test_infer_batch1..test_infer_batch8) with parameterized ones.
- Around lines 19-24: move the repeated imports out of each test and to module level: import CRadioV4ModelConfig, CRadioV4Model, DEFAULT_CRADIO_MODEL (and any other repeated names) from fiftyone.utils.radio at the top of the test file, then update tests like test_config_default_repo to use the module-level names. Keep test function bodies unchanged aside from removing the inline import statements.
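The single-context-manager fix from the first bullet can be illustrated with a stub; `StubModel` is a hypothetical stand-in for `CRadioV4Model` that just counts context entries:

```python
class StubModel:
    """Stand-in for CRadioV4Model: counts how many times the
    (expensive) load happens when used as a context manager."""
    def __init__(self):
        self.loads = 0
    def __enter__(self):
        self.loads += 1  # weight loading would happen here
        return self
    def __exit__(self, *exc):
        return False
    def _predict_all(self, imgs):
        return [(2560,) for _ in imgs]  # fake embedding shapes

def run_per_iteration(model, sizes):
    # Anti-pattern from the review: re-enters the context every size
    results = []
    for size in sizes:
        with model:
            results.append(model._predict_all([size])[0])
    return results

def run_once(model, sizes):
    # Suggested fix: a single `with model:` wraps the whole loop
    results = []
    with model:
        for size in sizes:
            results.append(model._predict_all([size])[0])
    return results
```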
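The consolidation suggested for the resolution and batch tests can be sketched as follows; `_predict_all` here is a hypothetical stub returning fake embedding shapes, where the real tests would enter a `CRadioV4Model` context built from `CRadioV4ModelConfig`:

```python
import pytest

def _predict_all(imgs):
    # Stub: one fake 2560-dim embedding shape per input image
    return [(2560,) for _ in imgs]

# One parametrized test replaces test_infer_128 .. test_infer_4k
@pytest.mark.parametrize(
    "width,height",
    [(128, 128), (512, 512), (1920, 1080), (3840, 2160)],
)
def test_infer_resolution(width, height):
    result = _predict_all([(width, height)])
    assert result[0] == (2560,)

# One parametrized test replaces test_infer_batch1 .. test_infer_batch8
@pytest.mark.parametrize("batch_size", [1, 2, 4, 8])
def test_infer_batch(batch_size):
    imgs = [(64, 64)] * batch_size
    assert len(_predict_all(imgs)) == batch_size
```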

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🤖 Fix all issues with AI agents
In `@fiftyone/utils/radio.py`:
- Around lines 386-401: the computation of `use_amp` (from self.config.use_mixed_precision, self._mixed_precision_supported, and self._using_gpu) is loop-invariant and should be hoisted above the loop that calls `self._model(pixel_values)`. Assign `use_amp` once before the for loop, then inside the loop simply branch on that precomputed boolean to choose between `torch.autocast("cuda", dtype=torch.bfloat16)` plus `torch.no_grad()` and `torch.no_grad()` alone, leaving `summaries.append(summary)` and `spatial_features.append(spatial)` unchanged.
- Around lines 403-407: `_predict_all` currently calls `torch.cat` on summaries or spatial_features, which crashes when `imgs` is empty. Add an early guard at the top of `_predict_all` that detects an empty `imgs` list (or zero-length summaries/spatial_features) and returns an appropriate empty result, consistent with the config.output_type handling and the _output_processor's expectations: either an empty batch_output shaped correctly for downstream code, or the result of calling `_output_processor` with an empty tensor and original_sizes. That way `torch.cat` is never invoked on an empty list and behavior remains consistent for both the summary and non-summary branches.
- Around lines 365-407: the loop currently appends both summaries and spatial_features on every iteration, retaining unused tensors and wasting memory. Check `self.config.output_type` and append only the tensor that is needed: after calling `self._model(pixel_values)` as before, keep `summary` when output_type == "summary" and `spatial` otherwise, then explicitly `del` the unused variable (and, on GPU, call `.detach().cpu()` on the kept tensor if the downstream `_output_processor` expects CPU tensors) to avoid dangling GPU references. Update the post-loop `torch.cat` usage so each list is only created and concatenated when needed, keeping the `_output_processor(original_sizes)` usage for the "summary" branch and the existing spatial branch unchanged.
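The hoist suggested in the first bullet can be sketched without torch; `nullcontext` is a placeholder for the real autocast/no_grad contexts, and `run_inference` is a hypothetical stand-in for the loop in `_predict_all`:

```python
from contextlib import nullcontext

def run_inference(batches, use_mixed_precision, amp_supported,
                  using_gpu, forward):
    # Loop-invariant flag computed once, before the loop (the hoist)
    use_amp = use_mixed_precision and amp_supported and using_gpu
    outputs = []
    for pixel_values in batches:
        if use_amp:
            # placeholder for torch.autocast("cuda",
            # dtype=torch.bfloat16) combined with torch.no_grad()
            ctx = nullcontext()
        else:
            # placeholder for plain torch.no_grad()
            ctx = nullcontext()
        with ctx:
            outputs.append(forward(pixel_values))
    return outputs
```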
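The empty-input guard from the second bullet can be sketched with plain lists standing in for tensors; `predict_all` is a hypothetical reduction of `_predict_all`, and the flattening comprehensions stand in for `torch.cat` (which raises on an empty list):

```python
def predict_all(imgs, output_type="summary"):
    # Early guard: torch.cat([]) raises RuntimeError, so return an
    # empty batch before any per-image work or concatenation happens
    if not imgs:
        return []
    summaries = []
    spatial_features = []
    for img in imgs:
        summaries.append([len(img)])           # fake summary vector
        spatial_features.append([[len(img)]])  # fake spatial map
    if output_type == "summary":
        return [v for s in summaries for v in s]     # ~ torch.cat
    return [m for s in spatial_features for m in s]  # ~ torch.cat
```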

In `@tests/unittests/utils/test_radio.py`:
- Around lines 962-980: in test_edge_jpeg, replace the insecure `tempfile.mktemp` usage with `tempfile.NamedTemporaryFile(delete=False)` to create temp_path, write the JPEG, then run the model prediction and assert the result inside the try block before the finally cleanup. Specifically, create temp_path via NamedTemporaryFile(delete=False), call `model._predict_all([loaded])` within the `with model:` context, perform `assert result[0].shape == (2560,)` inside the try, and keep the existing finally block's `os.unlink(temp_path)` to ensure cleanup.
- Around lines 3-16: remove the duplicate `import pytest` statement so pytest is imported only once (keep the initial import and the `pytest.importorskip("transformers")` usage); delete the second occurrence, which appears after the PIL Image import, to avoid the redundant import and linter warning.
- Around lines 940-960: in test_edge_from_file, replace the insecure `tempfile.mktemp` usage with `NamedTemporaryFile(delete=False)` to create the temp PNG, closing it before reopening with `Image.open`. Move the assertion `result[0].shape == (2560,)` inside the try block, immediately after the `model._predict_all` call, so `result` is always assigned before it is asserted; keep the cleanup of the temp file and any opened images in the finally block. Reference CRadioV4ModelConfig and CRadioV4Model as needed to locate the test.
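The tempfile pattern recommended in both edge-case bullets can be sketched with raw bytes standing in for the PNG/JPEG written with PIL; `roundtrip_bytes` is a hypothetical reduction of those tests:

```python
import os
import tempfile

def roundtrip_bytes(data):
    """NamedTemporaryFile(delete=False) instead of the insecure
    tempfile.mktemp: write, close, reopen, assert inside try,
    clean up in finally."""
    tmp = tempfile.NamedTemporaryFile(suffix=".png", delete=False)
    try:
        tmp.write(data)
        tmp.close()  # close before reopening, as the review suggests
        with open(tmp.name, "rb") as f:
            loaded = f.read()
        # assert on the result *inside* try, before cleanup runs
        assert loaded == data
        return loaded
    finally:
        os.unlink(tmp.name)
```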

@CharlesCNorton CharlesCNorton requested a review from a team February 10, 2026 14:59
@CharlesCNorton CharlesCNorton self-assigned this Feb 21, 2026
@CharlesCNorton CharlesCNorton force-pushed the feature/c-radio-zoo branch 2 times, most recently from bede410 to e69cf2f on March 20, 2026 23:28