Skip to content

Data compression for screen data#81

Open
schewskone wants to merge 46 commits intosensorium-competition:mainfrom
schewskone:compressed_data
Open

Data compression for screen data#81
schewskone wants to merge 46 commits intosensorium-competition:mainfrom
schewskone:compressed_data

Conversation

@schewskone
Copy link
Copy Markdown
Collaborator

Implemented support for compressed screen formats

  • Added EncodedImageTrial and EncodedVideoTrial which feature get_data_() methods that support encoded formats.
  • Extended screen tests and screen data generation to feature encoded data in mp4 and jpeg format.
  • Thinned requirements.txt and added ffmpeg installation to CI workflow.

Right now the VideoDecoder decodes the entire video and returns that to the interpolation function. This can be optimized with the help of sliced decoding which would require changes to the interpolation method. @pollytur and I decided we should discuss first and implement it later on if deemed necessary.

@pollytur pollytur requested a review from Copilot July 3, 2025 10:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds support for compressed screen data by extending trials to handle encoded images and videos, updating tests and data generation to work with JPEG/MP4, and adjusting interpolation interfaces and CI.

  • Introduce EncodedImageTrial and EncodedVideoTrial with compressed data loaders
  • Enhance test utilities and screen interpolation tests for both encoded and raw formats
  • Update interpolation API to return only data, adjust downstream consumers, and add FFmpeg to CI

Reviewed Changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/test_sequence_interpolator.py Update tests for new single-array return value of interpolate
tests/test_screen_interpolator.py Add parameterized tests for encoded vs. raw screen data
tests/create_screen_data.py Extend data generation to save JPEG/MP4 and emit metadata
experanto/interpolators.py Remove valid returns, introduce image_names flag, add encoded trials
experanto/experiment.py Adjust interpolate to drop valid output
experanto/datasets.py Update dataset pipelines to match new return signature
configs/default.yaml Add image_names configuration
.github/workflows/test.yml Install FFmpeg for encoding dependencies
Comments suppressed due to low confidence (4)

experanto/interpolators.py:152

  • The return type annotation still indicates a tuple, but the method now returns only a single array. Update the type hint and docstring to reflect the new signature.
    def interpolate(self, times: np.ndarray) -> tuple[np.ndarray, np.ndarray]:

experanto/interpolators.py:425

  • [nitpick] Using format shadows the built-in Python function. Consider renaming this variable to file_format to avoid confusion.
            format = metadata.get("file_format")

tests/create_screen_data.py:14

  • The new encoded parameter isn't documented. Please update the function docstring to explain what encoded does and what formats it controls.
def create_screen_data(

tests/create_screen_data.py:118

  • The code calls shutil.rmtree but shutil is not imported. Add import shutil at the top of the module.
        shutil.rmtree(SCREEN_ROOT)

Copy link
Copy Markdown
Contributor

@pollytur pollytur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please look at the comments - at least test need to plug back valid checks for sure

@pollytur pollytur self-requested a review March 8, 2026 11:27
Copy link
Copy Markdown
Contributor

@pollytur pollytur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone I merged main into it again and made sure that the CI/CD passes (not sure why its not displayed in the PR but its here)

Please address the rest of the comments

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 10, 2026

Codecov Report

❌ Patch coverage is 82.90598% with 20 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
experanto/interpolators.py 83.62% 19 Missing ⚠️
experanto/datasets.py 0.00% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@pollytur
Copy link
Copy Markdown
Contributor

@schewskone could you please take a look why the docs CI/CD is failing?

Comment on lines +912 to +927
def __init__(
self, data_file_name, meta_data, shared_decoder=None, cache_data: bool = False
) -> None:
super().__init__(
data_file_name,
meta_data,
tuple(meta_data.get("image_size")),
meta_data.get("first_frame_idx"),
meta_data.get("num_frames"),
cache_data=cache_data,
)
self.video_decoder = shared_decoder
if self.video_decoder is None:
raise ValueError(
"EncodedVideoTrial requires a shared_decoder to be provided."
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone what happened here? still WIP?

Comment on lines +495 to +497
def _initialize_decoder(self, data_file_name):
decoder = VideoDecoder(
str(data_file_name),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, copilot is also wrong from time to time. I guess fine with me now anyways

Comment on lines +19 to +21
"torch==2.9.0",
"torchcodec==0.9",
"torchvision==0.24.0",
Copy link
Copy Markdown
Contributor

@pollytur pollytur Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schewskone sorry for going back on this but current FM code uses torch torch 2.8 dev
https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-25-05.html (nvcr.io/nvidia/pytorch:25.05-py3)

Could we maybe check if torch>=2.8 and "torchvision==0.23" also works to pair it with data release?
(I also think we should start do down limits on the dependencies instead of strict versions - or do you disagree?)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take a look and ping you later.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pollytur So I can make it work for CPU with torchcodec==0.7, torch==2.8 and torchvision==0.23, it needs a few code changes and is most likely slower as the beta backend is not supported there yet. I did not get a chance to test the GPU support yet.
Python still has to stay >=3.10 because of the recent Spikesinterpolator PR and the used numba library.

Should I test GPU as well and implement the changes in this PR?

I personally would prefer to upgrade torch and the other deps. I thought that projects like OmniMouse needed specific frozen experanto branches anyways?
I'm also not sure I understand what you mean with current FM code.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

python 3.10+ is not a problem, FM repo is python 3.12

when is beta backend gets added? (which version of codec / torch / torchvision)
Basically, if there a chance to match these - 2 would be really nice, if it degrades perfromance - then probably not

I am a bit pushy here since as I assume quite some ml people might want to extended omnimouse, so it would be a bit better that they could start of latest experanto rather some frozen branch at least in principle)

For reproducibility yes - there is a frozen omnimouse branch

PS FM is omnimouse, yes

Copy link
Copy Markdown
Collaborator Author

@schewskone schewskone Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0.7 is the latest torchcodec release supported with torch 2.8. Anything newer requires 2.9 torchcodec repo has a table of compatible versions.

I don't know how great the performance reduction is but tbh. I don't know if that reduction is significant considering that decoding is already very slow in general.

I will check GPU support for 0.7 later this week and get back to you.

Comment on lines -61 to -63
assert np.allclose(
interp, expected_frames, atol=1e-5
), "Nearest neighbor interpolation mismatch"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this got deleted?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we have to separate between encoded and unencoded. This is now in line 93 onwards inside the else statement.

Copy link
Copy Markdown
Contributor

@pollytur pollytur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check the comments :)

@pollytur pollytur mentioned this pull request Mar 18, 2026
@pollytur
Copy link
Copy Markdown
Contributor

please merge current main in your branch :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants