Skip to content

Replace content_type spoofing_protection option by deep_validation (BREAKING) #404

@Mth0158

Description

@Mth0158

Current behaviour:

The spoofing_protection for the content_type validator is supposed to perform a deep validation on the uploaded file to ensure its content is the one declared through its metadata. This feature is supposed to counteract a file, which headers can say it's a png, pdf or whatever, but its content is something entirely else.
The current option name and documentation creates an expectation of deep validation that the implementation doesn’t actually deliver. Right now, we open the uploaded file with the UNIX file command, and retrieve the deducted content_type from such command. BUT, the file command is only performing a short validation, kind of close of what Marcel gem is doing inside ActiveStorage (file uses mostly libmagic to determine a content_type).
This check is not enough to declare the file as "spoofed free".

Expected behaviour:

The perfect, but entirely theoretical, approach would be to analyse the whole file content and say : "ok it's a not spoofed x y z", but, to my knowledge, this tool does not exist. A more pragmatical approach would be to open the uploaded file with the right tool (e.g. audio file with ffmpeg) and let the tool say if there is an issue with the file, like the processable_file validator is doing. Therefore we delegate the heavy work, we just gently catch the error.

The only pitfall here would be that the spoofing_protection option would do a bit more than announced: it would also tag malformed, but not spoofed, files as spoofed, which is not the goal. IMO we are sitting between two chairs here, I would prefer to also rename the option to deep_validation.

We could remove the processable_file validator altogether, and rather implement this validation through the deep_validation option for the content_type validator. This has the benefit of:

  • Solve the issue with spoofing_protection option not doing it's job
  • Making the gem's API simpler (remove 1 validator)

Proposal Summary

  • Deprecate spoofing_protection option for content_type validator
  • Introduce deep_validation option for content_type validator
  • deep_validation will:
    • Delegate to underlying processors appropriate for type (image/video/audio/pdf) (ie do exactly what processable_file validator is doing)
    • Fail if parse/processing fails for allowed types
    • Be opt-in due to expense
  • Remove/merge processable_file into deep_validation (drops the need for a separate validator)

To not forget

  • Update the readme extensively to reflect these changes
  • Find a way to update Rails custom metadata blobs that the gem sets to reflect these changes

Metadata

Metadata

Assignees

Labels

enhancementNew feature or requestsecuritySecurity related issues

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions