Restructure and simplify the AWQModifier to be similar to SmoothQuant #2327

@dsikka

Description

Summary

Similar to SmoothQuant, AWQ determines and applies smoothing scales. This process is driven by a set of model-specific mappings that indicate which activation layers to smooth. Once smoothing is complete, the modifier runs its own code to determine optimal quantization scales and zero-points, which duplicates logic already present in the QuantizationModifier.
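
To make the smoothing step concrete, here is a minimal sketch of why applying smoothing scales does not change a layer's output: activations are divided by per-channel scales and the matching weights are multiplied by the same scales. The helper names and the scale values are hypothetical and chosen for illustration; this is not the AWQModifier's code.

```python
def linear(x, w):
    """Dot product of one activation vector with one weight row."""
    return sum(xi * wi for xi, wi in zip(x, w))


def smooth(x, w, scales):
    """Divide activations and multiply weights by per-channel scales."""
    x_s = [xi / s for xi, s in zip(x, scales)]
    w_s = [wi * s for wi, s in zip(w, scales)]
    return x_s, w_s


x = [8.0, 0.5, -4.0]      # activations with outlier channels
w = [0.25, 2.0, -0.5]     # weights for one output row
scales = [4.0, 1.0, 2.0]  # hypothetical smoothing scales

x_s, w_s = smooth(x, w, scales)

# The layer output is unchanged; only the value ranges have shifted.
print(linear(x, w), linear(x_s, w_s))
```

In SmoothQuant-style methods the activation division is typically folded into the weights of the preceding layer, which is why the mappings need to pair each smoothed layer with the layers that consume its output.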

The scope of this task is to simplify this modifier by:

  1. Removing all the code responsible for generating the quantization scales and zero-points. AWQ, like SmoothQuant, should have the ability to be used in a stackable manner with other quantization modifiers such as the GPTQModifier and QuantizationModifier.

Example 1:

recipe = [
    AWQModifier(...),
    QuantizationModifier(...)
]

Example 2:

recipe = [
    AWQModifier(...),
    GPTQModifier(...)
]

Example 3:

recipe = [
    AWQModifier(...),
    QuIPModifier(
        rotations=["v", "u"], transform_block_size=128, transform_type="hadamard"
    ),
    QuantizationModifier(targets="Linear", scheme="W4A16", ignore=["lm_head"]),
]
  2. Simplifying the modifier. This can be done by targeting key functionality that is quite complex, such as _compute_best_scale.
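
For orientation, the sketch below shows a stripped-down scale search of the kind that _compute_best_scale is assumed to perform: try a grid of candidate smoothing scales, fake-quantize the scaled weights, and keep the candidate with the smallest output error. Every name here (`pseudo_quantize`, `best_scale`, `n_grid`) is hypothetical, the quantizer is a plain symmetric round-to-nearest, and the scale formula is simplified, so treat it as an illustration rather than the modifier's actual logic.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pseudo_quantize(row, num_bits):
    """Symmetric round-to-nearest fake quantization of one weight row."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in row)
    if max_abs == 0.0:
        return list(row)
    step = max_abs / qmax
    return [max(-qmax - 1, min(qmax, round(v / step))) * step for v in row]


def best_scale(samples, weight, num_bits=4, n_grid=20):
    """Grid-search per-channel scales that minimize output error."""
    n_channels = len(weight[0])
    # Mean absolute activation per input channel.
    x_mean = [
        sum(abs(x[c]) for x in samples) / len(samples) for c in range(n_channels)
    ]
    reference = [[dot(x, row) for row in weight] for x in samples]
    best_scales, best_err = None, float("inf")
    for i in range(n_grid):
        ratio = i / n_grid  # ratio 0 means no smoothing at all
        scales = [max(m, 1e-4) ** ratio for m in x_mean]
        smoothed = []
        for row in weight:
            # Scale weights up, fake-quantize, then scale back down.
            q = pseudo_quantize([w * s for w, s in zip(row, scales)], num_bits)
            smoothed.append([v / s for v, s in zip(q, scales)])
        err = sum(
            (dot(x, row) - ref) ** 2
            for x, refs in zip(samples, reference)
            for row, ref in zip(smoothed, refs)
        )
        if err < best_err:
            best_scales, best_err = scales, err
    return best_scales, best_err


samples = [[8.0, 0.5, -0.25], [6.0, -0.25, 0.5]]  # toy calibration activations
weight = [[0.3, -1.1, 0.7], [-0.2, 0.9, 1.3]]     # toy 2x3 weight matrix
scales, err = best_scale(samples, weight)
_, baseline_err = best_scale(samples, weight, n_grid=1)  # ratio 0 only
```

Note that the search needs `num_bits` only to estimate the quantization error. It does not have to produce the final quantization scales and zero-points, which is the part that can be left to a subsequent modifier.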

Implementation Steps

  1. Remove quantization scale and zero-point generation code and ensure AWQ can be used in a stackable manner with other quantization modifiers. Validate this through additional test scripts.
  2. Make other simplifications within the modifier to reduce its complexity.
  3. Update all examples and tests that use the AWQModifier to use the AWQModifier stacked with the QuantizationModifier. Validate that performance and accuracy remain the same.
  4. As the AWQModifier will still require quantization arguments to generate the means, potentially add a recipe validation step to check that these arguments match the quantization arguments passed to the subsequent quantization modifier, such as the GPTQModifier or QuantizationModifier.
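
The validation idea in step 4 could look roughly like the sketch below. It uses plain dataclasses as stand-ins for recipe entries, so `QuantArgs`, `Modifier`, `QUANTIZERS`, and `validate_recipe` are all hypothetical names and not part of the library. The check assumes that the relevant comparison is between an AWQ entry and the first quantization modifier that follows it, skipping entries such as a transform modifier in between.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QuantArgs:
    """Hypothetical stand-in for weight quantization arguments."""
    num_bits: int
    symmetric: bool
    group_size: int


@dataclass
class Modifier:
    """Hypothetical stand-in for one recipe entry."""
    name: str
    quant_args: Optional[QuantArgs] = None


# Modifiers assumed to generate the final scales and zero-points.
QUANTIZERS = {"QuantizationModifier", "GPTQModifier"}


def validate_recipe(recipe):
    """Raise ValueError if AWQ disagrees with the next quantization modifier."""
    for index, modifier in enumerate(recipe):
        if modifier.name != "AWQModifier":
            continue
        following = [m for m in recipe[index + 1:] if m.name in QUANTIZERS]
        if not following:
            raise ValueError("AWQModifier must be followed by a quantization modifier")
        if following[0].quant_args != modifier.quant_args:
            raise ValueError(
                f"AWQModifier arguments {modifier.quant_args} do not match "
                f"{following[0].name} arguments {following[0].quant_args}"
            )


w4 = QuantArgs(num_bits=4, symmetric=True, group_size=128)
w8 = QuantArgs(num_bits=8, symmetric=True, group_size=128)

good_recipe = [
    Modifier("AWQModifier", w4),
    Modifier("QuIPModifier"),
    Modifier("QuantizationModifier", w4),
]
bad_recipe = [Modifier("AWQModifier", w4), Modifier("GPTQModifier", w8)]

validate_recipe(good_recipe)  # passes silently
```

Calling `validate_recipe(bad_recipe)` raises a `ValueError`, since the AWQ entry was configured for 4-bit weights while the following modifier quantizes to 8 bits.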

Metadata

Labels

awq, enhancement, good first issue, good follow-up issue, keep-open
