Data compression for screen data#81
Data compression for screen data#81schewskone wants to merge 46 commits intosensorium-competition:mainfrom
Conversation
…ges and mp4 videos
… for compatibility with allen-exporter and future datasets to avoid duplicate files
Adding small fix for ToTensor issue to decoder PR because seperate PR is unnessesary
There was a problem hiding this comment.
Pull Request Overview
This PR adds support for compressed screen data by extending trials to handle encoded images and videos, updating tests and data generation to work with JPEG/MP4, and adjusting interpolation interfaces and CI.
- Introduce EncodedImageTrial and EncodedVideoTrial with compressed data loaders
- Enhance test utilities and screen interpolation tests for both encoded and raw formats
- Update interpolation API to return only data, adjust downstream consumers, and add FFmpeg to CI
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_sequence_interpolator.py | Update tests for new single-array return value of interpolate |
| tests/test_screen_interpolator.py | Add parameterized tests for encoded vs. raw screen data |
| tests/create_screen_data.py | Extend data generation to save JPEG/MP4 and emit metadata |
| experanto/interpolators.py | Remove valid returns, introduce image_names flag, add encoded trials |
| experanto/experiment.py | Adjust interpolate to drop valid output |
| experanto/datasets.py | Update dataset pipelines to match new return signature |
| configs/default.yaml | Add image_names configuration |
| .github/workflows/test.yml | Install FFmpeg for encoding dependencies |
Comments suppressed due to low confidence (4)
experanto/interpolators.py:152
- The return type annotation still indicates a tuple, but the method now returns only a single array. Update the type hint and docstring to reflect the new signature.
def interpolate(self, times: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
experanto/interpolators.py:425
- [nitpick] Using
formatshadows the built-in Python function. Consider renaming this variable tofile_formatto avoid confusion.
format = metadata.get("file_format")
tests/create_screen_data.py:14
- The new
encodedparameter isn't documented. Please update the function docstring to explain whatencodeddoes and what formats it controls.
def create_screen_data(
tests/create_screen_data.py:118
- The code calls
shutil.rmtreebutshutilis not imported. Addimport shutilat the top of the module.
shutil.rmtree(SCREEN_ROOT)
pollytur
left a comment
There was a problem hiding this comment.
please look at the comments - at least test need to plug back valid checks for sure
pollytur
left a comment
There was a problem hiding this comment.
@schewskone I merged main into it again and made sure that the CI/CD passes (not sure why its not displayed in the PR but its here)
Please address the rest of the comments
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
@schewskone could you please take a look why the docs CI/CD is failing? |
experanto/interpolators.py
Outdated
| def __init__( | ||
| self, data_file_name, meta_data, shared_decoder=None, cache_data: bool = False | ||
| ) -> None: | ||
| super().__init__( | ||
| data_file_name, | ||
| meta_data, | ||
| tuple(meta_data.get("image_size")), | ||
| meta_data.get("first_frame_idx"), | ||
| meta_data.get("num_frames"), | ||
| cache_data=cache_data, | ||
| ) | ||
| self.video_decoder = shared_decoder | ||
| if self.video_decoder is None: | ||
| raise ValueError( | ||
| "EncodedVideoTrial requires a shared_decoder to be provided." | ||
| ) |
experanto/interpolators.py
Outdated
| def _initialize_decoder(self, data_file_name): | ||
| decoder = VideoDecoder( | ||
| str(data_file_name), |
There was a problem hiding this comment.
hm, copilot is also wrong from time to time. I guess fine with me now anyways
…is for strings and paths, attempt to fix docs ci
| "torch==2.9.0", | ||
| "torchcodec==0.9", | ||
| "torchvision==0.24.0", |
There was a problem hiding this comment.
@schewskone sorry for going back on this but current FM code uses torch torch 2.8 dev
https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-25-05.html (nvcr.io/nvidia/pytorch:25.05-py3)
Could we maybe check if torch>=2.8 and "torchvision==0.23" also works to pair it with data release?
(I also think we should start do down limits on the dependencies instead of strict versions - or do you disagree?)
There was a problem hiding this comment.
I'll take a look and ping you later.
There was a problem hiding this comment.
@pollytur So I can make it work for CPU with torchcodec==0.7, torch==2.8 and torchvision==0.23, it needs a few code changes and is most likely slower as the beta backend is not supported there yet. I did not get a chance to test the GPU support yet.
Python still has to stay >=3.10 because of the recent Spikesinterpolator PR and the used numba library.
Should I test GPU as well and implement the changes in this PR?
I personally would prefer to upgrade torch and the other deps. I thought that projects like OmniMouse needed specific frozen experanto branches anyways?
I'm also not sure I understand what you mean with current FM code.
There was a problem hiding this comment.
python 3.10+ is not a problem, FM repo is python 3.12
when is beta backend gets added? (which version of codec / torch / torchvision)
Basically, if there a chance to match these - 2 would be really nice, if it degrades perfromance - then probably not
I am a bit pushy here since as I assume quite some ml people might want to extended omnimouse, so it would be a bit better that they could start of latest experanto rather some frozen branch at least in principle)
For reproducibility yes - there is a frozen omnimouse branch
PS FM is omnimouse, yes
There was a problem hiding this comment.
0.7 is the latest torchcodec release supported with torch 2.8. Anything newer requires 2.9 torchcodec repo has a table of compatible versions.
I don't know how great the performance reduction is but tbh. I don't know if that reduction is significant considering that decoding is already very slow in general.
I will check GPU support for 0.7 later this week and get back to you.
| assert np.allclose( | ||
| interp, expected_frames, atol=1e-5 | ||
| ), "Nearest neighbor interpolation mismatch" |
There was a problem hiding this comment.
Because we have to separate between encoded and unencoded. This is now in line 93 onwards inside the else statement.
pollytur
left a comment
There was a problem hiding this comment.
please check the comments :)
|
please merge current main in your branch :) |
Implemented support for compressed screen formats
get_data_()methods that support encoded formats.Right now the VideoDecoder decodes the entire video and returns that to the interpolation function. This can be optimized with the help of sliced decoding which would require changes to the interpolation method. @pollytur and I decided we should discuss first and implement it later on if deemed necessary.