`docs/releases/known-issues.md`
This page lists known issues and limitations in the current release.
## 26.02
- AWS EKS only: Due to a memory leak in AWS-OFI-NCCL v1.17.0, long-running jobs suffer a performance regression over time. This can be mitigated by upgrading to [v1.17.3](https://github.com/aws/aws-ofi-nccl/releases/tag/v1.17.3).
- Context parallelism with sequence packing is not yet supported for Qwen 3 VL in the r0.3.0 release. To use this functionality with Qwen 3 VL, use the main branch.
- DeepEP is not supported in the current NeMo Framework 26.02 container (nvcr.io/nvidia/nemo:26.02), which results in reduced DSv3 performance compared to the NeMo Framework 25.09 container (nvcr.io/nvidia/nemo:25.09) on H100 machines. For optimal H100 performance, we recommend using the NeMo Framework 25.09 container.
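As a concrete sketch, pulling and launching the recommended 25.09 container might look like the following; the `--gpus all` flag and the workspace bind mount are illustrative assumptions about your setup, not part of the release notes:

```shell
# Pull the NeMo Framework 25.09 container recommended above for H100 runs.
docker pull nvcr.io/nvidia/nemo:25.09

# Start an interactive shell with all GPUs visible; mounting the current
# directory at /workspace is an illustrative assumption.
docker run --rm -it --gpus all \
  -v "$PWD:/workspace" \
  nvcr.io/nvidia/nemo:25.09 bash
```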
## 25.11
- DeepSeek V3 on H100 has an issue when using DeepEP and fails with `RuntimeError: DeepEP error: timeout (dispatch CPU)`.
- MODEL_TFLOP/s/GPU is printed as 0 to stdout for all Hybrid models, such as Nemotron-H 56B.
## 25.09
- **Pretraining DeepSeek in subchannel FP8 precision is not working.** Pretraining DeepSeek with current scaling FP8 is a workaround, but MTP loss does not converge.
`docs/releases/software-versions.md`
# Software Component Versions
## NeMo Framework 26.02
| Software Component | Version |
|-------------------|---------|
| PyTorch | 2.10.0a0 |
| Megatron Core | main:0.16.0 |
| Transformer Engine | 2.12 |
| Megatron-Bridge | 0.3.0 |
| Megatron-FSDP | 0.3.0 |
| Export-Deploy | 0.4.0 |
| Evaluator | 0.1.74 |
| NeMo | 2.7.0 |
| NeMo Run | 0.8.0 |
| Nvidia-ModelOpt | 0.41.0 |
| NVRX | 0.5.0 |
| CUDA | 13.0.2 |
| cuDNN | 9.18.0.50 |
| TRT-LLM | 1.1.0 |
| vLLM | 0.14.1 |

```{note}
The NVIDIA NeMo™ Framework Training container is built on top of the NVIDIA Optimized Frameworks PyTorch 25.06 container: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html
```
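To check which of these components are actually present in a given environment, a minimal sketch using Python's `importlib.metadata` is shown below; the distribution names passed in (e.g. `torch`, `nemo-run`) are assumptions and may differ from the component names in the table:

```python
# Minimal sketch: report installed versions of Python packages from the
# stack above. Distribution names are assumptions; components such as CUDA
# and cuDNN are not Python packages and cannot be queried this way.
from importlib import metadata


def report_versions(distributions):
    """Return a {name: version or None} mapping for each distribution name."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in report_versions(["torch", "nemo-run"]).items():
        print(f"{name}: {version or 'not installed'}")
```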
## NeMo Framework 25.11
| Software Component | Version |
```{note}
The NVIDIA NeMo™ Framework Training container is built on top of the NVIDIA Optimized Frameworks PyTorch 25.06 container: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html
```
## NeMo Framework 25.09
| Software Component | Version |
```{note}
The NVIDIA NeMo™ Framework Training container is built on top of the NVIDIA Optimized Frameworks PyTorch 25.06 container: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/index.html
```