granite4_vision: support standard granite MLP backbone for 4.1-4b #1104

Merged
Blaizzy merged 5 commits into Blaizzy:main from EliSchwartz:granite-vision-4.1-4b-support
May 5, 2026

Conversation

@EliSchwartz
Contributor

granite-4.0-3b-vision uses a GraniteMoeHybrid backbone with a fused SharedMLP (input_linear/output_linear), while granite-vision-4.1-4b uses a standard Granite backbone with separate gate_proj/up_proj/down_proj weights. This PR adds an MLP class for the standard variant and selects between the two in TransformerBlock based on text_config.model_type.
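
To illustrate the difference between the two layouts, here is a minimal pure-Python sketch; it is not the code from this PR, which targets MLX. The class names `MLP` and `SharedMLP` and the weight names come from the description above. Everything else is an assumption made for illustration: the SiLU-gated form, the gate-first split of the fused projection, the `model_type` string `"granitemoehybrid"`, and the helper `select_mlp_class`.

```python
import math


def silu(v):
    # SiLU activation; assumed here, the PR text does not name the activation.
    return v / (1.0 + math.exp(-v))


def linear(weight, x):
    # weight: list of rows (out_features x in_features); x: list of in_features values.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


class MLP:
    """Standard variant: separate gate_proj / up_proj / down_proj weights."""

    def __init__(self, gate_proj, up_proj, down_proj):
        self.gate_proj = gate_proj
        self.up_proj = up_proj
        self.down_proj = down_proj

    def __call__(self, x):
        gate = linear(self.gate_proj, x)
        up = linear(self.up_proj, x)
        return linear(self.down_proj, [silu(g) * u for g, u in zip(gate, up)])


class SharedMLP:
    """Fused variant: input_linear holds gate and up rows stacked together."""

    def __init__(self, input_linear, output_linear):
        self.input_linear = input_linear
        self.output_linear = output_linear

    def __call__(self, x):
        fused = linear(self.input_linear, x)
        half = len(fused) // 2
        # Assumption: the first half is the gate, the second half is the up projection.
        gate, up = fused[:half], fused[half:]
        return linear(self.output_linear, [silu(g) * u for g, u in zip(gate, up)])


def select_mlp_class(model_type):
    # Hypothetical helper mirroring the selection described for TransformerBlock.
    # The "granitemoehybrid" string is an assumed value of text_config.model_type.
    return SharedMLP if model_type == "granitemoehybrid" else MLP


if __name__ == "__main__":
    x = [0.5, -1.0]
    gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    up = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
    down = [[1.0, 0.0, 1.0], [0.0, 1.0, -1.0]]

    standard = MLP(gate, up, down)(x)
    fused = SharedMLP(gate + up, down)(x)  # stack gate rows, then up rows
    print(standard == fused)
```

Under these assumptions the two classes compute the same function and differ only in how the checkpoint stores the weights, which is why the choice can be made from the config alone.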

EliSchwartz and others added 5 commits May 3, 2026 17:09
The granite-4.0-3b-vision uses a GraniteMoeHybrid backbone with a fused
SharedMLP (input_linear/output_linear), while granite-vision-4.1-4b uses
a standard Granite backbone with separate gate_proj/up_proj/down_proj weights.
Add an MLP class for the standard variant and select between the two in
TransformerBlock based on text_config.model_type.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <[email protected]>
Owner

@Blaizzy Blaizzy left a comment

LGTM, thanks @EliSchwartz! 🚀

@Blaizzy Blaizzy merged commit b8b8939 into Blaizzy:main May 5, 2026
1 check passed

2 participants