docs: Add comprehensive custom data guide and fix missing _component_ #2889
Open
JonSnow1807 wants to merge 1 commit into meta-pytorch:main from
This was referenced Jul 24, 2025
Collaborator
Hey! Thanks for the PR. To proceed on this we need:
- Add new custom_data_quickstart.rst guide addressing meta-pytorch#2221
- Fix missing _component_ field in instruct_datasets.rst examples (meta-pytorch#2215)
- Add quick start examples for JSON, CSV, and HuggingFace datasets
- Include troubleshooting section for common issues
- Add guide to documentation index for easy discovery
5c6be28 to 62caf50
Author
Hi @krammnic, thank you for the review! I've addressed both issues you mentioned:
✅ Lint Issues Fixed
✅ Documentation Build Fixed
🧹 Branch Cleanup
I've tested everything locally and the documentation builds without errors. The CI workflows are awaiting approval. Please let me know if you need any other changes. Thanks again for your time!
Summary
This PR addresses two related documentation issues to significantly improve the new user experience:
- Missing _component_ field in the instruct dataset examples
Why This Matters
As noted in #2215 by @johnowhitaker, finding how to use custom data requires searching through multiple documentation pages. This is frustrating for new users who just want to get started with their own data. This PR consolidates all custom data information into a single, easy-to-find guide.
What's Included
New Custom Data Quick Start Guide (custom_data_quickstart.rst)
Bug Fixes in instruct_datasets.rst
- Added _component_: torchtune.datasets.instruct_dataset to YAML examples
Testing
Impact
This documentation directly addresses one of the first questions users ask when starting with torchtune: how to train on their own data. It should reduce support burden and improve user onboarding.
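For readers unfamiliar with the _component_ fix mentioned under Bug Fixes, here is a minimal sketch of a dataset config for a local JSON file. It is an illustration rather than an excerpt from the PR: the file path and the source column names ("question", "answer") are hypothetical, and it assumes the keyword arguments of torchtune.datasets.instruct_dataset (source, column_map, train_on_input, packed) with extra keys such as data_files and split passed through to Hugging Face load_dataset.

```yaml
# Hypothetical example config; the path and column names are placeholders.
dataset:
  # Without _component_, the config system does not know which builder to call.
  _component_: torchtune.datasets.instruct_dataset
  source: json                      # use the Hugging Face "json" loader
  data_files: data/my_data.json     # placeholder path to a local file
  split: train
  column_map:
    input: question                 # maps the expected "input" column to your column name
    output: answer                  # maps the expected "output" column to your column name
  train_on_input: False
  packed: False
```

The only line the fix concerns is _component_; the rest is shown so the fragment reads as a complete dataset entry.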
Fixes #2215
Fixes #2221
cc @RdoubleA for review