Pull requests: pytorch/FBGEMM
Pull requests list
Fix sorted_ids None issue in SSD TBE optimizer state fetching
cla signed
fb-exported
meta-exported
#5525
opened Mar 24, 2026 by
EddyLXJ
Update default CUDA version to 13.0.2
cla signed
fb-exported
meta-exported
#5524
opened Mar 24, 2026 by
gchalump
Fix CPU TBE inline bounds check for unified embedding (#5523)
cla signed
fb-exported
meta-exported
#5523
opened Mar 24, 2026 by
shuyaobi-afk
Precompute writeback dedup indices in forward to eliminate GPU-CPU sync in backward (#5522)
cla signed
fb-exported
meta-exported
#5522
opened Mar 24, 2026 by
Zhihan-Lu
IKBO LCE kernel in fbgemm (#5521)
cla signed
fb-exported
meta-exported
#5521
opened Mar 24, 2026 by
jianj01
Fix race conditions: make shared mutable state atomic
cla signed
#5520
opened Mar 24, 2026 by
cyyever
Auto-size RocksDB block cache and expose L2 cache hit rate
cla signed
fb-exported
meta-exported
#5513
opened Mar 22, 2026 by
goldcoderZ
Double-buffer eviction buffers to reduce prefetch stalls
cla signed
fb-exported
meta-exported
#5512
opened Mar 22, 2026 by
goldcoderZ
Tune RocksDB bloom filter and background thread pool sizing (#5511)
cla signed
fb-exported
meta-exported
#5511
opened Mar 22, 2026 by
goldcoderZ
Replace spin-wait polling with condition variable in EmbeddingKVDB fill queue (#5510)
cla signed
fb-exported
meta-exported
#5510
opened Mar 22, 2026 by
goldcoderZ
Add tests for group_index_select_dim0 mixed-dtype validation
cla signed
fb-exported
meta-exported
#5507
opened Mar 21, 2026 by
q10
Fix Half2 UVM performance regression with vectorized store
cla signed
fb-exported
meta-exported
module: rocm
#5499
opened Mar 19, 2026 by
q10
Add cache locking and dedicated memcpy stream for SSD TBE inference (#5480)
cla signed
fb-exported
meta-exported
#5480
opened Mar 16, 2026 by
goldcoderZ
Use atomicAdd for lxu_cache_locking_counter increments/decrements
cla signed
fb-exported
meta-exported
#5479
opened Mar 16, 2026 by
goldcoderZ
Implement pre-sorting, caching and contiguous warp processing in group_index_select backward
cla signed
module: rocm
#5476
opened Mar 12, 2026 by
avbokovoy
Deduplicate check to reduce binary size
cla signed
fb-exported
meta-exported
#5474
opened Mar 12, 2026 by
spcyppt
Folly header clean up (fbgemm)
cla signed
fb-exported
meta-exported
#5471
opened Mar 10, 2026 by
mzlee
Extend permute_2D_sparse_data with optional pre-allocated output buffers
cla signed
fb-exported
meta-exported
#5461
opened Mar 9, 2026 by
TroyGarden
enable feature score auto collection in EBC (#5459)
cla signed
fb-exported
meta-exported
#5459
opened Mar 6, 2026 by
xywang9334
Clean up kernel code by deleting unused options and code logic
cla signed
fb-exported
meta-exported
#5456
opened Mar 6, 2026 by
howei
apply the similar logic in https://github.com/pytorch/FBGEMM/pull/4208 on forward V2 kernel for ROCm
cla signed
module: rocm
#5447
opened Mar 2, 2026 by
liligwu
Use folly::atomic_ref only in non-OSS
cla signed
fb-exported
meta-exported
#5445
opened Mar 2, 2026 by
q10
Enable MTIA preproc in aps_frontend_benchmark
cla signed
fb-exported
meta-exported
#5441
opened Feb 27, 2026 by
AGZain
Fix conda corruption from GCC deactivation scripts
cla signed
fb-exported
meta-exported
#5436
opened Feb 26, 2026 by
gchalump