Pull requests: Blaizzy/mlx-vlm
- granite4_vision: support standard granite MLP backbone for 4.1-4b (#1104, opened May 3, 2026 by EliSchwartz)
- Fix: model discovery for single-file safetensors models & improve /v1/models reporting (#1101, opened May 2, 2026 by frandariem)
- minicpmo / fastvlm: fix pixel cast on quantized language models (#1098, opened May 1, 2026 by contrapuntal; 3 tasks done)
- feat(server): support tools / function-calling on /v1/responses (#1097, opened May 1, 2026 by jasonvassallo)
- fix: Batch generation breaks top-p sampling (#1094, opened Apr 30, 2026 by spicyneuron, Contributor)
- fix: Prevent batched cache metadata lazy graph buildup (#1093, opened Apr 30, 2026 by spicyneuron, Contributor)
- server: added loaded model's context size and tool call parser to /health endpoint (#1092, opened Apr 30, 2026 by goniz, Contributor)
- Add Sapiens2 + RTMDet (top-down pose pipeline) (#1081, opened Apr 26, 2026 by Blaizzy, Owner; 8 tasks done)
- fix: remap per-layer quantization keys for mixed-bit models (#1078, opened Apr 26, 2026 by ivanfioravanti, Contributor)
- Fix Unicode byte-fallback decoding in server streaming responses (#1076, opened Apr 26, 2026 by spicyneuron, Contributor)
- fix(qwen3_5_moe): route nested model.language_model.visual.* to vision_tower (#1057) (#1075, opened Apr 26, 2026 by machiabeli; 3 tasks done)
- Fix stale test_utils.py regressions + extract get_class_predicate (#1071, opened Apr 25, 2026 by mdstaff, Contributor; 2 tasks done)
- fix(gemma4): handle JSON-formatted keys in tool call parser (#1065, opened Apr 24, 2026 by afanty2021)
- Add vision feature caching to all models (#1028, opened Apr 16, 2026 by Blaizzy, Owner; 6 tasks done)
- Expose presence_penalty, frequency_penalty, and per-penalty context_size on the server API (#1023, opened Apr 14, 2026 by esaruoho)
- refactor: improve model loading and resource handling in utils.py (#1019, opened Apr 13, 2026 by SyedaAnshrahGillani)
- server: indicate finish reason properly when model made a tool call. (#1014, opened Apr 12, 2026 by viktike, Contributor)
- perf: close 5.5% decode gap vs mlx_lm.server on streaming chat endpoint (#1012, opened Apr 11, 2026 by chilang)