docs(readme): update convergence table, latest news, and outdated links #2638

sbhavani wants to merge 7 commits into NVIDIA:main
Conversation
Greptile Summary

This documentation-only PR updates the README with accurate, well-scoped changes.

Confidence Score: 5/5 — safe to merge. The only open issues (the missing quickstart.ipynb file and table whitespace) are already tracked in existing review threads and are P2 in nature; they do not block the informational value of these updates, though they should be resolved before the notebook link goes live. No new files require special attention beyond what is already tracked in prior review threads.

Important Files Changed
Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[README.rst PR Changes] --> B[Latest News\nAdd Nemotron 3 entry - 12/2025]
A --> C[Overview & Highlights\nAdd MXFP8 / NVFP4 mentions]
A --> D[Examples Section\nAdd quickstart.ipynb link]
A --> E[Installation / Docker\nUpdate NGC containers to 26.01]
A --> F[Convergence Table\nAdd LLM-8B and MoE-16B MXFP8 rows]
A --> G[Integrations\nFix DeepSpeed org URL and Lightning docs link]
D -.->|quickstart.ipynb does not exist| H[404 - tracked in prior threads]
Reviews (9). Last reviewed commit: "fix(readme): update convergence section,..."
README.rst
Outdated
    loss = out.sum()
    loss.backward()

For a tutorial with more details, see the `Quickstart Notebook <https://github.com/NVIDIA/TransformerEngine/blob/main/docs/examples/quickstart.ipynb>`_.
The referenced quickstart.ipynb file does not exist in docs/examples/. The actual notebooks in that directory are fp8_primer.ipynb, advanced_optimizations.ipynb, and te_jax_integration.ipynb. Consider using one of these existing notebooks or creating the quickstart notebook before merging.
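Stale links like this can be caught mechanically before they land. Below is a minimal sketch (the regex and the hard-coded file list are illustrative, not part of this PR) that extracts `docs/examples/` notebook links from README text and flags any that do not correspond to a notebook known to exist:

```python
import re

# Notebooks actually present in docs/examples/, per the review comment above.
EXISTING = {"fp8_primer.ipynb", "advanced_optimizations.ipynb", "te_jax_integration.ipynb"}

# Capture the notebook filename from any docs/examples/ link.
LINK_RE = re.compile(r"docs/examples/([\w.-]+\.ipynb)")

def broken_notebook_links(readme_text, existing=EXISTING):
    """Return linked notebook filenames that are not in the existing set."""
    linked = set(LINK_RE.findall(readme_text))
    return sorted(linked - existing)

snippet = (
    "See the `Quickstart Notebook "
    "<https://github.com/NVIDIA/TransformerEngine/blob/main/"
    "docs/examples/quickstart.ipynb>`_."
)
print(broken_notebook_links(snippet))  # ['quickstart.ipynb']
```

A check like this could run in CI over the whole README so renamed or removed notebooks fail fast instead of shipping as 404s.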
Force-pushed from 3f01d10 to 98726c5
README.rst
Outdated
| LLM-8B     | Megatron Core    | https://arxiv.org/abs/2506.08027                                                                        |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MPT-13B    | Mosaic Composer  | https://www.databricks.com/blog/turbocharged-training-optimizing-databricks-mosaic-ai-stack-fp8         |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| GPT-22B    | NeMo Framework   | Available on request                                                                                    |
| MoE-16B    | Megatron Core    | https://arxiv.org/abs/2506.08027                                                                        |
Extra whitespace in the "Megatron Core" framework names is inconsistent with the other rows.
Suggested change:

| LLM-8B     | Megatron Core    | https://arxiv.org/abs/2506.08027                                                                        |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MPT-13B    | Mosaic Composer  | https://www.databricks.com/blog/turbocharged-training-optimizing-databricks-mosaic-ai-stack-fp8         |
+------------+------------------+---------------------------------------------------------------------------------------------------------+
| MoE-16B    | Megatron Core    | https://arxiv.org/abs/2506.08027                                                                        |
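Padding drift like this is easy to detect programmatically. A minimal sketch (the parsing is simplified and assumes a well-formed, single-line-cell grid table; the row strings below are illustrative):

```python
def inconsistent_cells(rows):
    """Flag grid-table cells whose raw padding differs between rows
    that carry the same visible text."""
    seen = {}  # stripped cell text -> set of raw cell strings observed
    for row in rows:
        # Split an RST grid-table row on '|' and drop the empty edges.
        for cell in row.strip().strip("|").split("|"):
            seen.setdefault(cell.strip(), set()).add(cell)
    # More than one raw form for the same text means inconsistent padding.
    return {text: raw for text, raw in seen.items() if len(raw) > 1}

rows = [
    "| LLM-8B  | Megatron Core   | https://arxiv.org/abs/2506.08027 |",
    "| MoE-16B | Megatron Core | https://arxiv.org/abs/2506.08027 |",
]
for text in inconsistent_cells(rows):
    print("inconsistent padding:", text)  # inconsistent padding: Megatron Core
```

Note this only catches cells that repeat across rows; a full lint would also validate cell widths against the `+---+` separator lines.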
Force-pushed from 90af7f4 to d5922a3
Force-pushed from cfb045e to 52deeda
Additional Comments (1)

This link uses …
Additional Comments (1)

The PR description states the convergence table is being "updated with MXFP8 results from arxiv paper". If the new entries represent MXFP8 convergence (not standard FP8), readers may be misled about what precision those Megatron Core results actually demonstrate. Consider either labeling the new rows as MXFP8 or broadening the section title and description to cover both precisions.
…t info

- Add MXFP8 and NVFP4 format support to highlights and description
- Update FP8 convergence table with MXFP8 results from arxiv paper
- Remove outdated JAX-Toolbox links and "available on request" entries
- Update Docker container versions to 26.01
- Fix DeepSpeed and Lightning integration links
- Add Nemotron 3 paper to Latest News
- Add quickstart notebook link after PyTorch example

Signed-off-by: Santosh Bhavani <[email protected]>
- Replace quickstart.ipynb link with fp8_primer.ipynb (file exists)
- Fix extra whitespace in Megatron Core table rows

Signed-off-by: Santosh Bhavani <[email protected]>
Signed-off-by: Santosh Bhavani <[email protected]>
Signed-off-by: Santosh Bhavani <[email protected]>
This reverts commit 643b3d9.

Signed-off-by: Santosh Bhavani <[email protected]>
Force-pushed from f0bd464 to c391c17
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Signed-off-by: Santosh Bhavani <[email protected]>
Additional Comments (1)

The section title and its introductory description only mention FP8, but the two new rows (LLM-8B and MoE-16B) report MXFP8 results. Consider either labeling those rows explicitly or broadening the section wording to cover both precisions. For example, a quick fix to the section header and description:
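One possible shape for that fix, as an RST diff. The exact section title and sentence are not shown in this thread, so the wording below is hypothetical:

```
-Convergence
------------
-
-Transformer Engine has been validated for FP8 convergence on the following models:
+Convergence
+-----------
+
+Transformer Engine has been validated for FP8 and MXFP8 convergence on the following models:
```

Either this or an extra "Precision" column in the table would keep the MXFP8 rows from being read as standard FP8 results.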
Signed-off-by: Santosh Bhavani <[email protected]>
Description
Updates the README to add missing format support documentation, update the news section, and fix broken/outdated links.
Type of change
Changes
Checklist: