Refactor Matching Logic to Use compressed-tensors Utilities #2284
Etelis wants to merge 9 commits into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Summary of Changes: Hello @Etelis, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request undertakes a significant refactoring effort to standardize the module and parameter matching logic.
|
Code Review
This pull request is a significant refactoring that replaces custom module and parameter matching logic with standardized utilities from the compressed-tensors library. This consolidation simplifies the codebase by removing around 15 redundant functions and updating several modifiers (SparseGPT, magnitude pruning, constant pruning) to use the new, more explicit APIs. The changes are well-executed and improve code clarity and maintainability. Additionally, a bug in the magnitude pruning modifier, where the targets parameter was missing, has been fixed. The new logic is supported by new unit tests.
My feedback includes a couple of suggestions to further improve maintainability by reducing code duplication and adhering to Python's import best practices.
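To make the review's summary concrete: the matching semantics being consolidated let a target either name a module class directly (e.g. "Linear") or, with an `re:` prefix, act as a regex over the module's dotted name. The `is_match` helper below is a hypothetical stand-in written for illustration, not the compressed-tensors API:

```python
import re

# Illustrative stand-in for compressed-tensors-style target matching.
# is_match is hypothetical; it only demonstrates the two target forms.

def is_match(name: str, class_name: str, target: str) -> bool:
    """Match a module by regex on its dotted name, or by its class name."""
    if target.startswith("re:"):
        return re.match(target[3:], name) is not None
    return class_name == target

# A module named "model.layers.0.mlp" whose class is Linear:
assert is_match("model.layers.0.mlp", "Linear", "Linear")
assert is_match("model.layers.0.mlp", "Linear", "re:.*mlp")
assert not is_match("model.layers.0.mlp", "Linear", "Embedding")
```

Centralizing this one rule in compressed-tensors is what allows the ~15 bespoke helpers to be deleted here.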
brian-dellabetta
left a comment
Thanks for looking into this. leaving a few comments for now
dsikka
left a comment
The commit history looks strange. Can you rebase?
Consolidates 17 redundant matching functions into standardized compressed-tensors APIs to eliminate code duplication and improve maintainability.

Changes:
- Add expand_special_targets() helper for backward-compatible constant expansion
- Add build_parameterized_layers() to replace get_layers_params()
- Remove 15 old matching functions from module.py (~260 lines deleted)
- Update SparseGPT, magnitude, and constant pruning modifiers
- Replace get_linear_layers() with match_named_modules()
- Update tests to use new APIs and PyTorch's native get_submodule()

Special constants (__ALL_PRUNABLE__, __ALL_QUANTIZABLE__) still work with deprecation warnings for backward compatibility.

Fixes vllm-project#1686

Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
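The backward-compatible constant expansion mentioned in the commit message might look roughly like the sketch below. The constant names come from the PR; the function body and the layer-type lists are assumptions for illustration only:

```python
import warnings

# Hypothetical sketch of expand_special_targets(); the exact layer-type
# lists used by the real helper are an assumption.
PRUNABLE_LAYER_TYPES = ["Linear", "Conv1d", "Conv2d", "Conv3d"]
QUANTIZABLE_LAYER_TYPES = PRUNABLE_LAYER_TYPES + ["Embedding"]

def expand_special_targets(targets):
    """Expand legacy special constants into concrete class-name targets."""
    expanded = []
    for target in targets:
        if target == "__ALL_PRUNABLE__":
            warnings.warn(
                "__ALL_PRUNABLE__ is deprecated; list layer types explicitly",
                DeprecationWarning,
            )
            expanded.extend(PRUNABLE_LAYER_TYPES)
        elif target == "__ALL_QUANTIZABLE__":
            warnings.warn(
                "__ALL_QUANTIZABLE__ is deprecated; list layer types explicitly",
                DeprecationWarning,
            )
            expanded.extend(QUANTIZABLE_LAYER_TYPES)
        else:
            expanded.append(target)  # ordinary targets pass through unchanged
    return expanded
```

Ordinary targets pass through untouched, so existing recipes keep working while the deprecation warning nudges users toward explicit layer lists.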
Force-pushed 9e64749 to 59a7c61
- Move PRUNABLE_LAYER_TYPES to module-level constant in sgpt_base.py
- Move match_named_parameters import to top of module.py
- Remove unused QAT imports from module.py
- Remove test_get_submodule test (tests PyTorch built-in)

Signed-off-by: Itay Etlis <itay.etelis@gmail.com>
Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
Addressed all review comments.
Regarding the QAT imports question: the QAT layers are still supported because they have the same class names as their non-QAT counterparts, so the explicit imports were only needed for the old matching logic.
Ah ok, great catch. I missed that the names lined up.
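The "names lined up" point in the exchange above can be shown with a tiny stand-in: two classes defined in different scopes but sharing the name Linear are indistinguishable to a matcher that compares `type(module).__name__`, so no QAT-specific imports are needed. All classes here are stand-ins, not the real torch or QAT types:

```python
# Stand-in sketch: class-name matching treats a QAT variant the same as
# the base layer whenever the class names coincide.

class Linear:                      # stand-in for an ordinary layer class
    pass

def make_qat_linear():
    class Linear:                  # same class name, different namespace
        pass
    return Linear

QATLinear = make_qat_linear()

def matches_target(module, target: str) -> bool:
    # Only the class *name* is compared, never the class identity.
    return type(module).__name__ == target

assert matches_target(Linear(), "Linear")
assert matches_target(QATLinear(), "Linear")
assert Linear is not QATLinear     # distinct classes, same matched name
```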
brian-dellabetta
left a comment
approving pending @kylesayrs review
The quality checks have failed. Please run the quality checks locally and push the fixes.
Signed-off-by: Itay Etlis <itay.etelis@gmail.com> Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
HDCharles
left a comment
see comment and test failures
- Fix match_named_parameters not matching class names (e.g., "Linear") by switching to match_named_modules, which supports class-name matching
- Inline match_modules function, as it was only used in one place
- Remove match_modules function from helpers.py

Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
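The bug fixed in this commit is easy to reproduce in miniature: a parameter-level matcher only ever sees dotted parameter names (never the owning class), so a class-name target like "Linear" can never match. The functions and classes below are local stand-ins for illustration, not the real APIs:

```python
# Minimal stand-in illustrating why parameter-name matching cannot see
# class names, while module-level matching can.

class Linear:
    def __init__(self):
        self.weight = [0.0]

model = {"layers.0": Linear(), "layers.1": Linear()}

def match_named_modules(modules, target):
    # Module-level matching can compare the class name directly.
    return [name for name, mod in modules.items()
            if type(mod).__name__ == target]

def match_named_parameters(modules, target):
    # Parameter names like "layers.0.weight" never equal "Linear".
    return [f"{name}.weight" for name, mod in modules.items()
            if f"{name}.weight" == target]

assert match_named_modules(model, "Linear") == ["layers.0", "layers.1"]
assert match_named_parameters(model, "Linear") == []   # silent miss
```

The silent empty match is the failure mode the commit describes, and why the switch to module-level matching was the right fix.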
Force-pushed ab6373f to 47391cf
This pull request has merge conflicts that must be resolved before it can be merged.
Resolve merge conflicts:
- magnitude/base.py: keep build_parameterized_layers usage
- helpers.py: merge imports (keep match_named_modules, add new offload API)

Signed-off-by: Itay Etelis <itayetelis@gmail.com>
Resolved the merge conflicts that arose. Can you re-approve and get it merged?
- Fix build_parameterized_layers to set param_name as the full path (e.g., "0.weight" instead of just "0")
- Remove extra brackets around sequential_targets in sgpt_base.py that caused TypeError: unhashable type 'list'

Signed-off-by: Itay Etelis <itay.etelis@ibm.com>
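The param_name fix above matters because a bare module name collides when one module owns several parameters. The sketch below is a hypothetical stand-in for `build_parameterized_layers` (the real function's signature is not shown in this thread), keyed the way the fix describes:

```python
# Hypothetical sketch: key entries by the full dotted parameter path
# ("0.weight") rather than the bare module name ("0").

def build_parameterized_layers(named_param_paths):
    layers = {}
    for full_name in named_param_paths:        # e.g. "0.weight"
        module_name, _, param_name = full_name.rpartition(".")
        # Keying by module_name alone would collide when a module has
        # both a weight and a bias; the full path stays unique.
        layers[full_name] = (module_name, param_name)
    return layers

out = build_parameterized_layers(["0.weight", "0.bias"])
assert set(out) == {"0.weight", "0.bias"}      # no collision on "0"
assert out["0.weight"] == ("0", "weight")
```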
Force-pushed 05eb539 to c48f32a
Tested PR #2284 on a remote machine: all base tests pass.
Hey @brian-team, I ran the full transformers compression test suite locally on a GPU machine (RTX 3090) and all tests pass (62 passed, 5 skipped). The CI failure was a transient 504 Gateway Timeout from HuggingFace Hub. This is an infrastructure issue on HF's side, not a code problem. Could we re-run the failed CI job?
This pull request has merge conflicts that must be resolved before it can be merged.
brian-dellabetta
left a comment
Looks like some tests have not been updated with the removed helper methods
tests/llmcompressor/transformers/sparsegpt/test_sparsegpt_owl.py:10: in <module>
from llmcompressor.utils.pytorch.module import get_layers
E ImportError: cannot import name 'get_layers' from 'llmcompressor.utils.pytorch.module' (/home/runner/_work/_tool/Python/3.12.12/x64/lib/python3.12/site-packages/llmcompressor/utils/pytorch/module.py)
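Tests that imported the now-removed `get_layers` need to switch to matching over named modules. The helper below is a local stand-in sketching the migration pattern, not the library's actual replacement API:

```python
import re

# Stand-in for the migration away from the removed get_layers helper:
# regex-match over (name, module) pairs as produced by named_modules().
def match_layers_by_regex(named_modules, pattern):
    """Return {name: module} for names matching the given regex."""
    regex = re.compile(pattern)
    return {name: mod for name, mod in named_modules if regex.match(name)}

# Example over a fake named_modules() listing:
modules = [("model.layers.0", object()), ("model.norm", object())]
matched = match_layers_by_regex(modules, r"model\.layers\.\d+")
assert list(matched) == ["model.layers.0"]
```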
Consolidates 17 redundant matching functions into standardized compressed-tensors APIs.
Fixes #1686
- Removed from module.py: get_layers, get_params, get_prunable_layers, get_quantizable_layers, match_targets, etc.
- Added expand_special_targets() (backward compatibility) and build_parameterized_layers()
- Fixed missing self.targets parameter in magnitude pruning