Pull requests: jd-opensource/xllm
Pull requests list
feat: support video inference for Qwen3-VL on NPU device.
#1151 opened Mar 31, 2026 by xanecdotex
bugfix: rollback shared prefix blocks on allocate failure.
#1146 opened Mar 31, 2026 by RobbieLeung
feat: implement column parallel for lm head to improve performance.
#1145 opened Mar 31, 2026 by wxh571001500
feat: support embedding interface for all generate VLM models.
#1136 opened Mar 30, 2026 by xanecdotex
perf: optimize qwen3.5 hybrid linear cache flow[4/N].
#1130 opened Mar 30, 2026 by yingxudeng
feat: support configurable max/min pixels for vlm image processors.
#1123 opened Mar 27, 2026 by wly-115
refactor: simplify xllm_ops pre-build checks and marker-based rebuild logic.
#1111 opened Mar 25, 2026 by LMX-xin
feat: support pured lm head by candidate token ids.
#1071 opened Mar 17, 2026 by RobbieLeung
feat: add onerec in supported model docs and align rec utility style.
#1055 opened Mar 13, 2026 by DragonFive
feat: add onerec model implement[4/N].
#1051 opened Mar 13, 2026 by DragonFive (3 of 6 tasks)
feat: add onerec model implement[3/N].
#1050 opened Mar 13, 2026 by DragonFive (5 of 9 tasks)
bugfix: avoid decode instance leak if prefill instance prefill fail.
#1046 opened Mar 12, 2026 by magicheng0816
bugfix: fix decode instance infinite retry allocate.
#1045 opened Mar 12, 2026 by magicheng0816