Skip to content

Adds verification of the file name#18848

Draft
andsel wants to merge 7 commits intoelastic:mainfrom
andsel:fix/verify_name
Draft

Adds verification of the file name#18848
andsel wants to merge 7 commits intoelastic:mainfrom
andsel:fix/verify_name

Conversation

@andsel
Copy link
Contributor

@andsel andsel commented Mar 11, 2026

Release notes

[rn:skip]

What does this PR do?

Adds a file name verification

@andsel andsel self-assigned this Mar 11, 2026
@github-actions
Copy link
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)
  • run exhaustive tests : Run the exhaustive tests Buildkite pipeline.

@mergify
Copy link
Contributor

mergify bot commented Mar 11, 2026

This pull request does not have a backport label. Could you fix it @andsel? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit.
  • If no backport is necessary, please add the backport-skip label

@andsel andsel marked this pull request as ready for review March 11, 2026 13:54
@andsel
Copy link
Contributor Author

andsel commented Mar 11, 2026

run exhaustive tests

Copy link
Contributor

@andrewvc andrewvc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're improving the code here we should explicitly handle the symlink case for clarity.

Is there a reason tests weren't added? For a security fix I'd expect that.

raise CompressError.new("Directory #{target} exist") if ::File.exist?(target)
::Zip::File.open(source) do |zip_file|
zip_file.each do |file|
raise CompressError.new("Extracting file outside target directory: #{file.name}") unless LogStash::Util.name_safe?(file.name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the copy here could be improved.

Suggested change
raise CompressError.new("Extracting file outside target directory: #{file.name}") unless LogStash::Util.name_safe?(file.name)
raise CompressError.new("Refusing to extract file to unsafe path: #{file.name}. Files may not traverse with `..`") unless LogStash::Util.name_safe?(file.name)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather have the validation code tell us why the provided path is unsafe, instead of having the validator return boolean and hard-coding a reason here.

For example, the name_safe? method will also return false when the provided path is absolute.

If we do provide blanket guidance, I would prefer to include all possible failure reasons:

Suggested change
raise CompressError.new("Extracting file outside target directory: #{file.name}") unless LogStash::Util.name_safe?(file.name)
raise CompressError.new("Refusing to extract file outside target directory: `#{file.name}`. Paths must be relative and may not traverse with `..`") unless LogStash::Util.name_safe?(file.name)

tar_file.each do |entry|
raise CompressError.new("Extracting file outside target directory: #{entry.full_name}") unless LogStash::Util.name_safe?(entry.full_name)
target_path = ::File.join(target, entry.full_name)

Copy link
Contributor

@andrewvc andrewvc Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're already in here, it'd be nice to handle the symlink issue for safety. When I started reviewing this I saw the note about them and was forced to think through whether this was exploitable or not.

While we are not presently vulnerable--since we only write regular files or create directories--it would be nice to at least handle the case here explicitly.

If someone (or some agent) mistakenly added symlink support this would be defensive. Also, we can remove the note later in the file about symlinks not being covered. This is one less bit of security surface area future contributors would have to worry about.

Suggested change
# POSIX tar typeflag values: '1' = hard link, '2' = symbolic link
# https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06
# The pathname of a link being created to another file, of any type, previously archived. This record shall override the linkname field in the following ustar header block(s). The following ustar header block shall determine the type of link created. If typeflag of the following header block is 1, it shall be a hard link. If typeflag is 2, it shall be a symbolic link and the linkpath value shall be the contents of the symbolic link. The pax utility shall translate the name of the link (contents of the symbolic link) from the encoding in the header to the character set appropriate for the local file system. When used in write or copy mode, pax shall include a linkpath extended header record for each link whose pathname cannot be represented entirely with the members of the portable character set other than NUL.
case entry.header.typeflag
when '2' then raise CompressError.new("Refusing to extract symlink entry: #{entry.full_name}")
when '1' then raise CompressError.new("Refusing to extract hardlink entry: #{entry.full_name}")
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could use the symlink? method present in Gem::Package::TarReader::Entry instead of using magic numbers (https://github.com/ruby/rubygems/blob/bcc44695d6bb0915912e10cbb09283c1e242f1ff/lib/rubygems/package/tar_reader/entry.rb#L127).

However, it doesn't cover the hardlink case.

It seems that also zip has symlink? method we can leverage.

Copy link
Contributor Author

@andsel andsel Mar 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From better investigation I found that rubyzip (at the version we use and above) disabled to decompress symlinks for security reasons.

Version 3.2.2 apparently re-introduced symlink: rubyzip/rubyzip#531

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm, I'm not nuts about magic numbers either, weird there's no hardlink option. What would you like to do here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see you added the symlink check below. Looks like the lib only covers symlinks anyway, so I think this is good!

end

# Is the name a relative path, free of `..` patterns that could lead to
# path traversal attacks? This does NOT handle symlinks; if the path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my comment above, we can remove the mention of symlinks if we add the check for them.

@andsel
Copy link
Contributor Author

andsel commented Mar 13, 2026

run exhaustive tests

@andsel
Copy link
Contributor Author

andsel commented Mar 16, 2026

run exhaustive tests

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds security validation for file names during archive extraction (both zip and tar.gz) to prevent path traversal attacks. It also adds symlink support for tar.gz extraction with safety checks to ensure symlink targets remain within the extraction directory. Additionally, the pack installer is updated to support .tar.gz format alongside .zip.

Changes:

  • Adds verify_name_safe! method to reject nil/empty, absolute, or ..-traversing paths during extraction
  • Adds symlink_target_safe? method and symlink extraction support for tar.gz archives
  • Extends PackInstaller::Local to accept and extract .tar.gz packs alongside .zip

Reviewed changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
lib/bootstrap/util/compress.rb Adds path validation (verify_name_safe!), symlink safety check (symlink_target_safe?), and symlink extraction in Tar.extract
lib/pluginmanager/pack_installer/local.rb Extends pack installer to support .tar.gz format with appropriate extraction routing and error handling
spec/unit/util/compress_spec.rb Tests for path validation, traversal rejection, and symlink safety in both zip and tar extraction
spec/unit/plugin_manager/pack_installer/local_spec.rb Test for tar.gz pack extraction with symlinks
spec/support/pack/pack_with_symlink.tar.gz Test fixture tar.gz containing a symlink
spec/support/pack/README.md Documents how to recreate the test fixture

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +143 to +178

# Returns true if a symlink target (linkname) would resolve to a path under extraction_root
# when the symlink is created at symlink_path. Works on both Unix and Windows.
# @param linkname [String] symlink target (relative or absolute)
# @param symlink_path [String] full path where the symlink will be created
# @param extraction_root [String] root directory all paths must stay under
# @return [Boolean] true if resolved path is under extraction_root
def self.symlink_target_safe?(linkname, symlink_path, extraction_root)
return false if linkname.nil? || linkname.to_s.strip.empty?
symlink_dir = ::File.dirname(symlink_path)
resolved = Pathname.new(::File.expand_path(linkname, symlink_dir)).cleanpath
root = Pathname.new(::File.expand_path(extraction_root)).cleanpath
!resolved.relative_path_from(root).to_s.start_with?("..")
rescue ArgumentError
# relative_path_from raises if resolved is not under root
false
end

# Verifies that a path string is safe for extraction (relative, no parents traversal).
# Raises CompressError with a specific message if the path is nil/empty, absolute, or
# contains `..`. Does NOT handle symlinks, symlinks should be handled on per archive type basis.
# Works on both Unix and Windows.
# @param name [String] path string to validate
# @raise [CompressError] if path is nil, empty, absolute, or traverses with `..`
def self.verify_name_safe!(name)
if name.nil? || name.to_s.strip.empty?
raise CompressError.new("Refusing to extract file. Path cannot be nil or empty.")
end
cleanpath = Pathname.new(name).cleanpath
if cleanpath.absolute?
raise CompressError.new("Refusing to extract file to unsafe path: #{name}. Absolute paths are not allowed.")
end
if cleanpath.each_filename.to_a.include?("..")
raise CompressError.new("Refusing to extract file to unsafe path: #{name}. Files may not traverse with `..`")
end
end
Comment on lines +72 to +76
if source.downcase.end_with?(".tar.gz")
LogStash::Util::Tar.extract(source, temporary_directory)
else
LogStash::Util::Zip.extract(source, temporary_directory, LOGSTASH_PATTERN_RE)
end
@@ -0,0 +1,18 @@
# Pack fixtures
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great README to explain exactly what this is!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I think the case I'm more interested in us testing is a malicious symlink. AFAICT we really don't need to support them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to be moved in x-pack/spec/geoip_database_management/fixtures/ because there is the place for tar.gz. We use tar.gz files for GeoIP archives while offline packs are zip files, which doesn't support symlinks.

FileUtils.mkdir_p(target_path)
elsif entry.symlink?
linkname = entry.header.linkname
unless LogStash::Util.symlink_target_safe?(linkname, target_path, target)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need symlink support? I was think we'd just not support it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, we shouldn't verify the symlink and manage it, just throw an error if we find one.

@andsel andsel marked this pull request as draft March 17, 2026 13:51
…ymlink just raising an error when a tgz contains a symlink
@elasticmachine
Copy link

💚 Build Succeeded

History

cc @andsel

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants