Pull requests: Blaizzy/mlx-vlm
- granite4_vision: support standard granite MLP backbone for 4.1-4b (#1104, opened May 3, 2026 by EliSchwartz)
- Fix: model discovery for single-file safetensors models & improve /v1/models reporting (#1101, opened May 2, 2026 by frandariem)
- minicpmo / fastvlm: fix pixel cast on quantized language models (#1098, opened May 1, 2026 by contrapuntal; 3 tasks done)
- feat(server): support tools / function-calling on /v1/responses (#1097, opened May 1, 2026 by jasonvassallo)
- fix: Batch generation breaks top-p sampling (#1094, opened Apr 30, 2026 by spicyneuron, Contributor)
- fix: Prevent batched cache metadata lazy graph buildup (#1093, opened Apr 30, 2026 by spicyneuron, Contributor)
- server: added loaded model's context size and tool call parser to /health endpoint (#1092, opened Apr 30, 2026 by goniz, Contributor)
- Add Sapiens2 + RTMDet (top-down pose pipeline) (#1081, opened Apr 26, 2026 by Blaizzy, Owner; 8 tasks done)
- fix: remap per-layer quantization keys for mixed-bit models (#1078, opened Apr 26, 2026 by ivanfioravanti, Contributor)
- Fix Unicode byte-fallback decoding in server streaming responses (#1076, opened Apr 26, 2026 by spicyneuron, Contributor)
- fix(qwen3_5_moe): route nested model.language_model.visual.* to vision_tower (#1057) (#1075, opened Apr 26, 2026 by machiabeli; 3 tasks done)
- Fix stale test_utils.py regressions + extract get_class_predicate (#1071, opened Apr 25, 2026 by mdstaff, Contributor; 2 tasks done)
- fix(gemma4): handle JSON-formatted keys in tool call parser (#1065, opened Apr 24, 2026 by afanty2021)
- Add vision feature caching to all models (#1028, opened Apr 16, 2026 by Blaizzy, Owner; 6 tasks done)
- Expose presence_penalty, frequency_penalty, and per-penalty context_size on the server API (#1023, opened Apr 14, 2026 by esaruoho)
- refactor: improve model loading and resource handling in utils.py (#1019, opened Apr 13, 2026 by SyedaAnshrahGillani)
- server: indicate finish reason properly when model made a tool call. (#1014, opened Apr 12, 2026 by viktike, Contributor)
- perf: close 5.5% decode gap vs mlx_lm.server on streaming chat endpoint (#1012, opened Apr 11, 2026 by chilang)