Add video tracking and Media tab visualization #3405

Open
claforte wants to merge 1 commit into aimhubio:main from claforte:claforte-integrate_video

claforte commented May 4, 2026

Summary

This adds first-class video tracking to Aim and exposes it in the run detail UI:

  • adds aim.Video and Videos sequence support for mp4, m4v, gif, mov, and webm
  • registers video sequences with query/run APIs, including /api/runs/search/videos/ and video blob batch loading
  • adds a Run Detail Media tab that groups Images, Videos, and Audio under one tab while preserving legacy media routes
  • renders tracked videos with default autoplay, loop, muted, playsInline, lazy blob loading, object URL cleanup, and batched visible-video blob requests
  • adds SDK, API, and run-info tests, plus an opt-in system test for video upload efficiency
  • documents the new SDK Video object and Videos sequence in the SDK reference

Motivation

Aim already supports rich media objects such as images and audio, but training workflows that generate rollout videos, reconstruction clips, simulated environments, or model behavior videos currently need to store those artifacts outside the normal sequence/query/UI flow.

This PR keeps videos aligned with Aim's existing media model while paying attention to large-object behavior:

  • path-backed videos defer reading bytes until encoding in the tracking worker
  • async run.track() stays cheap during training loops
  • the UI requests blobs lazily and batches visible requests to avoid loading every video on the page up front
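To make the "cheap track(), expensive encode in the worker" idea concrete, here is a minimal, self-contained sketch. It is not Aim's tracking code: `PathRef`, `AsyncTracker`, and `demo` are hypothetical names invented for illustration, and the queue-plus-thread design is only an assumption about how such a worker might look.

```python
import queue
import tempfile
import threading
from pathlib import Path


class PathRef:
    """Hypothetical stand-in for a path-backed video: holds a path, not bytes."""

    def __init__(self, path):
        self.path = Path(path)

    def encode(self):
        # The only place the file is actually read.
        return self.path.read_bytes()


class AsyncTracker:
    """Toy tracker: track() enqueues, a background worker encodes."""

    def __init__(self):
        self._queue = queue.Queue()
        self.stored = []  # (step, encoded bytes), filled by the worker
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def track(self, ref, step):
        # Cheap on the caller's thread: no file I/O happens here.
        self._queue.put((step, ref))

    def _drain(self):
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                break
            step, ref = item
            self.stored.append((step, ref.encode()))

    def close(self):
        self._queue.put(None)
        self._worker.join()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        clip = Path(tmp) / "rollout.mp4"
        clip.write_bytes(b"\x00" * 1024)  # placeholder bytes, not a real video
        tracker = AsyncTracker()
        for step in range(3):
            tracker.track(PathRef(clip), step)
        tracker.close()  # wait for the worker to finish encoding
        return [(step, len(data)) for step, data in tracker.stored]
```

In this sketch the training loop only pays for a queue insertion per call; the file read happens on the worker thread when `encode()` runs.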

Implementation Notes

SDK and storage

  • aim.sdk.objects.video.Video follows Aim's CustomObject pattern.
  • Video(path=...) stores metadata and the source path, then reads bytes only during _aim_encode().
  • Video(data=...) supports byte-backed construction for tests and small in-memory videos.
  • Video.__deepcopy__() avoids materializing path-backed bytes when async tracking copies values before enqueueing.
  • Videos extends MediaSequenceBase and is registered through SEQUENCE_TYPE_MAP.
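The deferred-read and `__deepcopy__` behavior described above can be sketched roughly as follows. This is an illustrative stand-in, not the PR's `aim.Video`: the class name `LazyVideo`, the `encode()` method (standing in for `_aim_encode()`), and the `reads` counter are all assumptions made for the example.

```python
import copy
import tempfile
from pathlib import Path


class LazyVideo:
    """Hypothetical sketch, not the PR's aim.Video implementation."""

    def __init__(self, path=None, data=None, caption=""):
        if (path is None) == (data is None):
            raise ValueError("pass exactly one of path= or data=")
        self.path = None if path is None else Path(path)
        self._data = data
        self.caption = caption
        self.reads = 0  # counts file reads, for illustration only

    def encode(self):
        # Bytes are materialized here, not in __init__.
        if self._data is not None:
            return self._data
        self.reads += 1
        return self.path.read_bytes()

    def __deepcopy__(self, memo):
        # Copy the path and metadata only; a path-backed copy never touches disk.
        clone = LazyVideo.__new__(LazyVideo)
        clone.path = self.path
        clone._data = self._data  # bytes are immutable, so sharing is safe
        clone.caption = self.caption
        clone.reads = 0
        memo[id(self)] = clone
        return clone


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        clip = Path(tmp) / "clip.webm"
        clip.write_bytes(b"fake-video-bytes")
        original = LazyVideo(path=clip, caption="rollout")
        clone = copy.deepcopy(original)
        reads_after_copy = (original.reads, clone.reads)
        payload = clone.encode()
        return reads_after_copy, clone.reads, payload
```

The point of the custom `__deepcopy__` in this sketch is that copying a value before enqueueing costs the same regardless of how large the file on disk is.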

API

  • Adds video pydantic metadata (caption, format, fps, size, blob_uri, index).
  • Reuses CustomObjectApiConfig for video search and batch endpoints.
  • Allows videos in restricted query parameters and adds Repo.query_videos().

UI

  • Replaces separate Images/Audios run-detail navigation entries with a single Media tab.
  • Media has sub-tabs for Images, Videos, and Audio.
  • Legacy routes like /runs/:hash/images, /runs/:hash/videos, and /runs/:hash/audios still route to the relevant Media sub-tab.
  • Videos autoplay and loop by default while visible.
  • Video blobs are:
    • requested only when cards approach the viewport via IntersectionObserver
    • fetched in batches for visible cards
    • rendered from object URLs instead of base64 strings
    • revoked on cleanup

Tests

Ran locally:

uv run pytest tests/sdk/test_video_construction.py tests/api/test_run_videos_api.py tests/api/test_run_images_api.py::TestRunInfoApi -q

Result: 15 passed

timeout 180s env AIM_RUN_VIDEO_SYSTEM_TESTS=1 uv run pytest tests/system/test_video_upload_efficiency.py -q -s

Result: 1 passed

./node_modules/.bin/eslint src/pages/RunDetail/VideosVisualizer/VideosVisualizer.tsx src/pages/RunDetail/RunDetailMediaTab/RunDetailMediaTab.tsx src/pages/RunDetail/RunDetail.tsx

Result: passed

NODE_OPTIONS=--openssl-legacy-provider npm run build

Result: passed with existing repository warnings outside the touched files.

Reviewer Notes

  • The system test is opt-in behind AIM_RUN_VIDEO_SYSTEM_TESTS=1 because it can download or synthesize many video samples.
  • If AIM_VIDEO_SYSTEM_DATASET_URL is not set, the system test uses ffmpeg to generate a moving 1080p source and five downsampled variants, then logs 100 path-backed videos under async tracking.
  • I could not find an existing issue dedicated to video tracking support. Happy to link or open one if maintainers prefer to discuss the API/UX before merging.
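For reviewers who want to picture the synthetic fallback, the sketch below builds ffmpeg command lines for a moving 1080p source and five downscaled variants. It is an assumption-laden illustration, not the system test's code: the function names, output file names, clip duration, frame rate, and the five target heights are all hypothetical. Only the ffmpeg options themselves (`-f lavfi`, the `testsrc` source, `-vf scale`, `-pix_fmt`) are standard.

```python
def source_command(out_path, seconds=2, fps=30):
    """Build an ffmpeg command that renders a moving 1080p test pattern."""
    pattern = f"testsrc=size=1920x1080:rate={fps}:duration={seconds}"
    return ["ffmpeg", "-y", "-f", "lavfi", "-i", pattern,
            "-pix_fmt", "yuv420p", out_path]


def variant_commands(src_path, heights=(720, 540, 360, 240, 144)):
    """Build one downscale command per target height (heights are assumed)."""
    commands = []
    for height in heights:
        out_path = f"variant_{height}p.mp4"
        # scale=-2:H keeps the aspect ratio and rounds the width to an even number.
        commands.append(["ffmpeg", "-y", "-i", src_path,
                         "-vf", f"scale=-2:{height}", out_path])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is on the PATH.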

CLAassistant commented May 4, 2026

CLA assistant check
All committers have signed the CLA.

claforte commented May 4, 2026

The above was authored by Codex. I personally tested it locally, and it seems to work well. Hope this helps,

Christian

claforte commented May 4, 2026

Screenshot: image

claforte marked this pull request as ready for review May 4, 2026 03:03