[ROCm] please install 'torchcodec' #7914

@AndreasKaratzas

Description

Describe the bug

The datasets library is widely used across the Python ecosystem and is therefore a dependency of many projects, including vLLM on ROCm. During audio dataset tests, the following exception is triggered:

    def decode_example(
        self, value: dict, token_per_repo_id: Optional[dict[str, Union[str, bool, None]]] = None
    ) -> "AudioDecoder":
        """Decode example audio file into audio data.

        Args:
            value (`dict`):
                A dictionary with keys:

                - `path`: String with relative audio file path.
                - `bytes`: Bytes of the audio file.
            token_per_repo_id (`dict`, *optional*):
                To access and decode
                audio files from private repositories on the Hub, you can pass
                a dictionary repo_id (`str`) -> token (`bool` or `str`)

        Returns:
            `torchcodec.decoders.AudioDecoder`
        """
        if config.TORCHCODEC_AVAILABLE:
            from ._torchcodec import AudioDecoder
        else:
>           raise ImportError("To support decoding audio data, please install 'torchcodec'.")
E           ImportError: To support decoding audio data, please install 'torchcodec'.

At the same time, torchcodec cannot be installed on ROCm: its GPU acceleration relies on NVIDIA's NVDEC hardware decoder, which is NVIDIA-specific. As a result, any code path that reaches this block fails on ROCm. Could you fall back to an alternative package here instead of raising an ImportError?
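To illustrate the kind of fallback requested, here is a minimal sketch (hypothetical, not datasets API): prefer torchcodec when it imports, otherwise hand back a CPU-only stand-in. The stand-in below handles only uncompressed WAV via the stdlib `wave` module; a real fallback would use a general-purpose decoder such as soundfile, which datasets relied on before torchcodec.

```python
import io
import wave


def get_audio_decoder():
    """Sketch of the requested soft gate: try torchcodec, else a CPU fallback."""
    try:
        # NVDEC-backed decoder; wheels are CUDA-only, so this import fails on ROCm.
        from torchcodec.decoders import AudioDecoder
        return AudioDecoder
    except ImportError:
        return decode_wav_fallback


def decode_wav_fallback(audio_bytes: bytes) -> dict:
    """CPU-only stand-in: decode uncompressed WAV bytes with the stdlib."""
    with wave.open(io.BytesIO(audio_bytes)) as f:
        return {
            "sampling_rate": f.getframerate(),
            "num_channels": f.getnchannels(),
            "frames": f.readframes(f.getnframes()),
        }
```

This keeps the decoded-sample shape loosely aligned with what callers of `decode_example` expect, without requiring any NVIDIA-specific package.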

Steps to reproduce the bug

On a machine with MI300/MI325/MI355:

pytest -s -v tests/entrypoints/openai/correctness/test_transcription_api_correctness.py::test_wer_correctness[12.74498-D4nt3/esb-datasets-earnings22-validation-tiny-filtered-openai/whisper-large-v3]
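Until datasets offers a fallback, the repro can guard itself. A stdlib-only availability probe, mirroring the `config.TORCHCODEC_AVAILABLE` gate shown in the traceback (in the vLLM test, `pytest.skip` would be the natural action; a plain exception keeps this sketch self-contained):

```python
import importlib.util

# True only when the torchcodec wheel is importable; on ROCm machines it
# cannot be installed, so this evaluates to False.
TORCHCODEC_AVAILABLE = importlib.util.find_spec("torchcodec") is not None


def require_torchcodec():
    """Guard a test could call before load_hf_dataset, so the failure is
    an explicit skip/error at the top of the test rather than a decode-time
    ImportError deep inside datasets."""
    if not TORCHCODEC_AVAILABLE:
        raise RuntimeError("torchcodec unavailable; audio decoding would fail")
```

Alternatively, `dataset.cast_column("audio", Audio(decode=False))` keeps the raw `{"path", "bytes"}` payload and never enters the torchcodec decode path at all (assuming the audio column is named `audio`), at the cost of decoding the bytes yourself.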

Expected behavior

_________________________________________________ test_wer_correctness[12.74498-D4nt3/esb-datasets-earnings22-validation-tiny-filtered-openai/whisper-large-v3] _________________________________________________

model_name = 'openai/whisper-large-v3', dataset_repo = 'D4nt3/esb-datasets-earnings22-validation-tiny-filtered', expected_wer = 12.74498, n_examples = -1, max_concurrent_request = None

    @pytest.mark.parametrize("model_name", ["openai/whisper-large-v3"])
    # Original dataset is 20GB+ in size, hence we use a pre-filtered slice.
    @pytest.mark.parametrize(
        "dataset_repo", ["D4nt3/esb-datasets-earnings22-validation-tiny-filtered"]
    )
    # NOTE: Expected WER measured with equivalent hf.transformers args:
    # whisper-large-v3 + esb-datasets-earnings22-validation-tiny-filtered.
    @pytest.mark.parametrize("expected_wer", [12.744980])
    def test_wer_correctness(
        model_name, dataset_repo, expected_wer, n_examples=-1, max_concurrent_request=None
    ):
        # TODO refactor to use `ASRDataset`
        with RemoteOpenAIServer(model_name, ["--enforce-eager"]) as remote_server:
>           dataset = load_hf_dataset(dataset_repo)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

tests/entrypoints/openai/correctness/test_transcription_api_correctness.py:160:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/entrypoints/openai/correctness/test_transcription_api_correctness.py:111: in load_hf_dataset
    if "duration_ms" not in dataset[0]:
                            ^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/arrow_dataset.py:2876: in __getitem__
    return self._getitem(key)
           ^^^^^^^^^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/arrow_dataset.py:2858: in _getitem
    formatted_output = format_table(
/usr/local/lib/python3.12/dist-packages/datasets/formatting/formatting.py:658: in format_table
    return formatter(pa_table, query_type=query_type)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/formatting/formatting.py:411: in __call__
    return self.format_row(pa_table)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/formatting/formatting.py:460: in format_row
    row = self.python_features_decoder.decode_row(row)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/formatting/formatting.py:224: in decode_row
    return self.features.decode_example(row, token_per_repo_id=self.token_per_repo_id) if self.features else row
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/features/features.py:2111: in decode_example
    column_name: decode_nested_example(feature, value, token_per_repo_id=token_per_repo_id)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/usr/local/lib/python3.12/dist-packages/datasets/features/features.py:1419: in decode_nested_example
    return schema.decode_example(obj, token_per_repo_id=token_per_repo_id) if obj is not None else None
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Environment info

  • datasets version: 4.4.2
  • Platform: Linux-5.15.0-161-generic-x86_64-with-glibc2.35
  • Python version: 3.12.12
  • huggingface_hub version: 0.36.0
  • PyArrow version: 22.0.0
  • Pandas version: 2.3.3
  • fsspec version: 2025.10.0
