Add patch to URL-escape key paths in HTTP remote URLs #251

Merged: yarikoptic merged 1 commit into master from bf-URL-url-escape-in-keyUrls on Feb 13, 2026
@yarikoptic
Member

keyUrls in Remote/Git.hs constructs URLs for fetching content from HTTP git remotes by simple string concatenation of the repo URL and the annex object path. When the key contains characters that keyFile encodes using git-annex's internal escaping (& sequences such as &c for colons, % for slashes), the resulting URL contains bare % characters. These are invalid in a URI path: per RFC 3986, % must be followed by two hex digits. The parser therefore rejects the URL as "invalid url".

This affects URL-backend keys like URL--yt:https://www.youtube.com/watch?v=... where keyFile produces paths containing &c (for :) and %% (for //), resulting in unparseable URLs. SHA256E and other hash-based keys are unaffected since their serialized forms contain only URI-safe characters.
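The failure mode can be reproduced outside git-annex with a minimal sketch. It assumes the network-uri package is available. `escapeKeySketch` and `naiveKeyUrl` are hypothetical names, not git-annex functions: the first is a simplified stand-in for keyFile that covers only the two substitutions mentioned above, and the second omits the real annex/objects directory layout. The host names and keys are placeholders.

```haskell
import Network.URI (isURI)

-- Hypothetical, simplified stand-in for git-annex's keyFile.
-- It covers only the two substitutions named in the description:
-- ':' becomes "&c" and '/' becomes "%".
escapeKeySketch :: String -> String
escapeKeySketch = concatMap esc
  where
    esc ':' = "&c"
    esc '/' = "%"
    esc c   = [c]

-- Plain string concatenation of repo URL and escaped key (hypothetical
-- helper; the real object path has more directory components).
naiveKeyUrl :: String -> String -> String
naiveKeyUrl repoUrl key = repoUrl ++ "/" ++ escapeKeySketch key

main :: IO ()
main = do
  let urlKey  = naiveKeyUrl "http://example.com/repo" "URL--yt:https://example.org/watch"
      hashKey = naiveKeyUrl "http://example.com/repo" "SHA256E-s10--abc.txt"
  putStrLn urlKey
  -- The bare '%' characters are not valid percent-escapes, so this is False.
  print (isURI urlKey)
  -- A hash-style key contains only URI-safe characters, so this is True.
  print (isURI hashKey)
```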

The fix applies escapeURIString (from Network.URI, already imported) to percent-encode the path components while preserving / as a path separator. This is the same approach used by Remote/S3.hs and Remote/WebDAV/DavLocation.hs.
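A minimal sketch of this kind of escaping, assuming the network-uri package. This is not the patch itself: `escapePathSketch` is a hypothetical name, and the choice of `isUnescapedInURIComponent` as the base predicate is an assumption about how "/" would be preserved. The repo URL and object path are placeholders.

```haskell
import Network.URI (escapeURIString, isURI, isUnescapedInURIComponent)

-- Hypothetical helper: percent-encode everything that is not safe in a
-- URI component, but leave '/' alone so it still separates path segments.
escapePathSketch :: String -> String
escapePathSketch = escapeURIString keep
  where
    keep '/' = True
    keep c   = isUnescapedInURIComponent c

main :: IO ()
main = do
  -- Placeholder object path containing git-annex style "&c" and "%" escapes.
  let path = "annex/objects/URL--yt&chttps&c%%example.org%watch"
      url  = "http://example.com/repo/" ++ escapePathSketch path
  putStrLn url
  -- After escaping, '&' is sent as %26 and '%' as %25, so the URL parses.
  print (isURI url)
```

The server decodes %25 back to % and %26 back to &, so the request still resolves to the on-disk file name that keyFile produced.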

See https://git-annex.branchable.com/bugs/fails_to_get_from_apache2_server_URL_backend_file/

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@yarikoptic
Member Author

For macOS: no money. The Ubuntu datalad tests fail with FAILED ../datalad/core/distributed/tests/test_clone.py::test_ria_postclone_noannex - assert not True, which is not a new failure.

@yarikoptic merged commit 0aec2e7 into master on Feb 13, 2026
25 of 35 checks passed
@yarikoptic deleted the bf-URL-url-escape-in-keyUrls branch on February 13, 2026 04:37