Optimizations for significantly faster downloads and cache hits by DePasqualeOrg · Pull Request #302 · huggingface/swift-transformers

DePasqualeOrg · 2025-12-26T12:53:02Z

This PR offers significant improvements in download and cache performance, and also brings the Swift Hub implementation closer to feature parity with the Python huggingface_hub library.

Changes

1. Skip HEAD requests for cached files

When downloading files that are already cached, we now skip the individual HEAD requests per file. The snapshot function fetches the current commit hash once via getRepoInfo, then passes it to each file download. If the local metadata shows the same commit hash, the file is returned immediately, with no HEAD request needed to verify that it is unchanged.

Python equivalent: file_download.py:1082-1095
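
A minimal sketch of the cache decision described above, not the code in this PR. The names CachedFileMetadata, CacheDecision, and cacheDecision are hypothetical; the only behavior taken from the description is that a matching commit hash lets the cached file be returned without a HEAD request.

```swift
/// Hypothetical shape of the metadata stored next to a cached file.
struct CachedFileMetadata {
    let commitHash: String
    let etag: String
}

enum CacheDecision: Equatable {
    case useCached   // return the local file, no network request
    case revalidate  // fall back to a HEAD request and possibly a download
}

/// Decides whether a cached file can be returned without contacting the server.
/// `remoteCommitHash` is assumed to have been fetched once for the whole snapshot.
func cacheDecision(
    local: CachedFileMetadata?,
    fileExists: Bool,
    remoteCommitHash: String?
) -> CacheDecision {
    guard fileExists,
          let local = local,
          let remoteCommitHash = remoteCommitHash
    else {
        // Missing file, missing metadata, or unknown remote state: do not trust the cache.
        return .revalidate
    }
    return local.commitHash == remoteCommitHash ? .useCached : .revalidate
}
```

With this shape, the per-file cost of a fully cached snapshot is one string comparison, and the single repo-info request is shared across all files.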

2. Parallel file downloads

Files are now downloaded concurrently using a task group with a configurable number of concurrent downloads, matching the Python library's default of 8.

Python equivalent: _snapshot_download.py:449-455
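
One common way to bound concurrency with a task group is sketched below. This is an illustration under assumptions, not the implementation in this PR: downloadAll, its parameters, and the download closure are hypothetical names, and only the default of 8 comes from the description.

```swift
/// Runs `download` for every item with at most `maxConcurrent` tasks in flight.
/// All names here are illustrative.
func downloadAll<T: Sendable>(
    _ items: [String],
    maxConcurrent: Int = 8,
    download: @escaping @Sendable (String) async throws -> T
) async throws -> [String: T] {
    precondition(maxConcurrent > 0, "maxConcurrent must be positive")
    return try await withThrowingTaskGroup(of: (String, T).self) { group in
        var results: [String: T] = [:]
        var pending = items.makeIterator()

        // Start the first batch of tasks, up to the concurrency limit.
        for _ in 0..<min(maxConcurrent, items.count) {
            if let item = pending.next() {
                group.addTask {
                    let value = try await download(item)
                    return (item, value)
                }
            }
        }

        // Each time a task finishes, record its result and start the next item.
        while let (name, value) = try await group.next() {
            results[name] = value
            if let item = pending.next() {
                group.addTask {
                    let value = try await download(item)
                    return (item, value)
                }
            }
        }
        return results
    }
}
```

If any download throws, the error propagates out of the group and the remaining child tasks are cancelled, which is the default behavior of a throwing task group.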

3. Verify file integrity after download, skip re-hash on cache hit

LFS files (identified by SHA256 etags) are now verified after download. Previously, hash verification ran on every load in offline mode, adding 200 ms or more for large files. Now we verify once at download time and trust the cache afterward.

Python equivalent: file_download.py:1394-1408
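
The verify-once idea can be sketched as follows. This is a hedged illustration, not the code in this PR: it assumes an LFS etag is a bare 64-character lowercase hex SHA256 digest, and the names isSHA256Etag, IntegrityResult, and verifyAfterDownload are hypothetical. The hash function is passed in so the logic does not depend on a particular crypto library; a CryptoKit-based hasher is included only where CryptoKit is available.

```swift
import Foundation
#if canImport(CryptoKit)
import CryptoKit
#endif

/// Assumption: an LFS etag is a 64-character lowercase hex SHA256 digest.
func isSHA256Etag(_ etag: String) -> Bool {
    let hexDigits = Set("0123456789abcdef")
    return etag.count == 64 && etag.allSatisfy { hexDigits.contains($0) }
}

enum IntegrityResult: Equatable {
    case verified       // digest matches the etag
    case skippedNotLFS  // etag is not a SHA256 digest, nothing to compare
    case mismatch       // file is corrupt or incomplete
}

/// Called once, right after a download completes. Cache hits skip this entirely.
func verifyAfterDownload(
    data: Data,
    etag: String,
    sha256Hex: (Data) -> String
) -> IntegrityResult {
    guard isSHA256Etag(etag) else { return .skippedNotLFS }
    return sha256Hex(data) == etag ? .verified : .mismatch
}

#if canImport(CryptoKit)
/// Hex-encoded SHA256 of `data`, using CryptoKit where it is available.
func cryptoKitSHA256Hex(_ data: Data) -> String {
    SHA256.hash(data: data).map { String(format: "%02x", $0) }.joined()
}
#endif
```

For large weight files a streaming hash over file chunks would avoid loading the whole file into memory; the in-memory version above only keeps the sketch short.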

4. Size-weighted progress reporting

Progress is now weighted by file size instead of file count. This provides smoother, more accurate progress bars for downloads containing a mix of small config files and large model weights.

The getRepoInfo function fetches file sizes via the blobs=true API parameter (see hf_api.py:2617), and ProgressCoordinator uses these sizes as weights for each file's contribution to overall progress.
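
To illustrate the weighting, here is a small sketch that is independent of the ProgressCoordinator in this PR. SizeWeightedProgress and its members are hypothetical names; the only idea taken from the description is that each file contributes to overall progress in proportion to its size.

```swift
/// Tracks overall progress where each file is weighted by its size in bytes.
/// Illustrative only; not the ProgressCoordinator from this PR.
struct SizeWeightedProgress {
    private let sizes: [String: Int64]
    private var fractions: [String: Double] = [:]

    init(fileSizes: [String: Int64]) {
        self.sizes = fileSizes
    }

    /// Records the completed fraction (0...1) for one file. Unknown files are ignored.
    mutating func update(file: String, fraction: Double) {
        guard sizes[file] != nil else { return }
        fractions[file] = min(max(fraction, 0), 1)
    }

    /// Completed bytes divided by total bytes across all files.
    var fractionCompleted: Double {
        let totalBytes = sizes.values.reduce(0, +)
        guard totalBytes > 0 else { return 0 }
        let doneBytes = sizes.reduce(0.0) { sum, entry in
            sum + Double(entry.value) * (fractions[entry.key] ?? 0)
        }
        return doneBytes / Double(totalBytes)
    }
}
```

With a 1,000-byte config file and a 999,000-byte weights file, finishing the config file moves overall progress to 0.1%, where a count-based scheme would report 50%.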

Benchmark Results

Tested with mlx-community/Qwen3-0.6B-Base-DQ5 (11 MB tokenizer.json).

| Benchmark | Before | After | Improvement |
| --- | --- | --- | --- |
| Cached file retrieval | 782 ms | 267 ms | 2.9x faster |
| Offline mode cache hit | 4.87 ms | 0.14 ms | 35x faster |
| Parallel downloads | 1704 ms | 742 ms | 2.3x faster |

Testing

Added unit tests for getRepoInfo, snapshot caching, and offline mode
Added HubBenchmarks.swift with reproducible performance tests. You can check out commit 49f7e1b to run the benchmarks before the changes in this PR, and then run them again with the latest commit in this PR to see the difference. These benchmarks can be deleted before merging or kept for testing future improvements.

DePasqualeOrg · 2025-12-26T18:24:03Z

I've added the same optimizations to swift-huggingface in huggingface/swift-huggingface#21, which required porting some missing functionality from swift-transformers and huggingface_hub to swift-huggingface.

DePasqualeOrg · 2025-12-27T21:52:45Z

After #304 is merged, I'll move the benchmark test to the separate Benchmark target that was added in that PR so that it doesn't run in CI.

pcuenca

Looks directionally ok. A couple of initial comments before examining the code in detail:

It's important to verify that a HEAD or GET request is performed on cached repos in exactly the same way the Python library does them (except in offline mode)
Parallel downloading is very much dependent on the system and network. Have you considered the impact on iOS and mobile?
Size-weighted progress reporting is a great idea.
For the cached metadata, do we still verify if it changed server-side?

Also, note that the end goal is to proceed with swift-huggingface (#297)

DePasqualeOrg · 2026-02-07T19:07:47Z

> It's important to verify that a HEAD or GET request is performed on cached repos in exactly the same way the Python library does them (except in offline mode)

I took care to align with the Python library.

> Parallel downloading is very much dependent on the system and network. Have you considered the impact on iOS and mobile?

We could keep the default concurrency limit of 8 on macOS and set a lower default on iOS if you think that makes sense. The limit is configurable, so callers (like an iOS app) can pass a lower value.
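
A platform-dependent default could look like the sketch below. The constant name and the value 4 for mobile platforms are hypothetical and chosen only for illustration; only the default of 8 comes from this PR.

```swift
// Hypothetical platform-specific default for the concurrency limit.
#if os(iOS) || os(tvOS) || os(watchOS)
let defaultMaxConcurrentDownloads = 4  // assumed lower value for mobile, not from this PR
#else
let defaultMaxConcurrentDownloads = 8  // matches the Python library's default
#endif
```

Callers would still be able to override the value explicitly, so the platform check only affects the default.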

> For the cached metadata, do we still verify if it changed server-side?

Yes, through the commit hash.

> Also, note that the end goal is to proceed with swift-huggingface (#297)

I have implemented similar optimizations in huggingface/swift-huggingface#21.

DePasqualeOrg added 2 commits December 26, 2025 11:47

Add benchmarks for optimizations

49f7e1b

Verify file integrity after download, skip re-hash on cache hit

1789228

DePasqualeOrg marked this pull request as draft December 26, 2025 13:59

DePasqualeOrg force-pushed the optimizations branch 2 times, most recently from 63ab365 to d877136 December 26, 2025 14:33

DePasqualeOrg marked this pull request as ready for review December 26, 2025 17:46

DePasqualeOrg mentioned this pull request Dec 26, 2025

Improve download and cache performance; add resumable, parallel downloads and offline mode huggingface/swift-huggingface#21

Open

DePasqualeOrg force-pushed the optimizations branch 4 times, most recently from bc00570 to dd87e89 December 26, 2025 22:17

DePasqualeOrg changed the title ~~Optimize Hub download and cache performance~~ Optimize download and cache performance Dec 27, 2025

DePasqualeOrg added 2 commits December 27, 2025 13:42

Parallelize downloads, skip HEAD requests for cached files, weight download progress by file size

d81fe0c

Use file locking for safe concurrent downloads

2cabd23

DePasqualeOrg force-pushed the optimizations branch from dd87e89 to 2cabd23 December 27, 2025 12:42

DePasqualeOrg mentioned this pull request Dec 27, 2025

Optimizations for significantly faster tokenizer loading #303

Open

DePasqualeOrg changed the title ~~Optimize download and cache performance~~ Optimizations for significantly faster downloads and cache hits Dec 27, 2025

DePasqualeOrg mentioned this pull request Dec 28, 2025

Optimize model loading performance ml-explore/mlx-swift-lm#34

Merged

DePasqualeOrg added 4 commits January 5, 2026 13:05

Store lock files with metadata, mirroring Python behavior

fc007cd

Silence benign teardown warnings in tests

2d1778e

Improve handling of file lock contention

985e31b

To do: Move benchmark tests to separate target

d414dd6

DePasqualeOrg marked this pull request as draft January 5, 2026 12:45

DePasqualeOrg marked this pull request as ready for review January 5, 2026 13:54

Use default .lock extension for lock files

e0cecb5

pcuenca reviewed Feb 6, 2026

DePasqualeOrg wants to merge 9 commits into huggingface:main from DePasqualeOrg:optimizations
