Releases: AI-Hypercomputer/maxtext

maxtext-v0.2.2

08 May 18:03

Changes

  • Upgraded JAX to version 0.9.2, improving support for both pre-training and post-training.
  • Introduced simplified APIs for accessing MaxText models.
  • Included maxtext_with_gepa.ipynb, a new notebook demonstrating AIME prompt optimization using the GEPA framework within MaxText.
  • Added support for Kimi-K2 models and the MuonClip optimizer. Users can explore this with the kimi-k2-1t config (see user guide for details).
  • Kimi-K2-Thinking, Kimi-K2.5 (text), and Kimi-K2.6 (text) are now supported. See Run_Kimi.md for details.
  • DeepSeek-V3.2 is now supported, including DeepSeek Sparse Attention for handling long contexts. Use the deepseek3.2-671b config to try it out (refer to the user guide for more information).
  • Support has been added for Gemma 4 multi-modal models (26B MoE and 31B dense). These can be used with the gemma4-26b and gemma4-31b configs. See Run_Gemma4.md for further details.
  • Support has been added for Gemma 4 inference using the MaxText on vLLM plugin.
  • Enhanced RL capabilities with support for the open-r1/OpenR1-Math-220k and nvidia/OpenMathReasoning datasets.
  • Added more RL evaluation modes, such as majority voting and pass@1 estimation.
  • Weights are now synced to vLLM before pre-RL evaluation.
  • Made the use of math-verify in RL more robust.
  • MaxText's Supervised Fine-Tuning (SFT) now supports non-instruct models.
  • Added support for tensor parallelism using the Fused MoE kernel for MaxText on vLLM inference.
  • Added MaxText-to-vLLM converters for the Qwen3 and Gemma4 families of models.
  • validate_converter.py now runs in multislice environments to test larger models, with utilities to compare MaxText and vLLM weights.
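
For readers unfamiliar with the two RL evaluation modes named in the list above, here is a minimal sketch of what majority voting and pass@1 estimation typically compute over several sampled answers to one problem. It is an illustration under stated assumptions, not MaxText's implementation: the function names `majority_vote` and `pass_at_1` and the sample data are hypothetical, and answers are compared by plain string equality rather than with math-verify.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among the samples (hypothetical helper).

    Ties are broken in favor of the answer that was seen first.
    """
    return Counter(answers).most_common(1)[0][0]


def pass_at_1(correct_flags):
    """Estimate pass@1 as the fraction of sampled answers that are correct."""
    return sum(correct_flags) / len(correct_flags)


# Four sampled answers to a single problem (made-up data for illustration).
samples = ["42", "41", "42", "42"]
reference = "42"

voted = majority_vote(samples)
majority_correct = voted == reference
p1 = pass_at_1([s == reference for s in samples])

print(f"majority answer: {voted}, correct: {majority_correct}")
print(f"pass@1 estimate: {p1:.2f}")
```

Majority voting scores a problem by its single most common answer, while pass@1 averages correctness over every individual sample, so the two can disagree on the same set of generations.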

Deprecations

  • Legacy MaxText.* shims have been removed. Please refer to src/MaxText/README.md for details on the new command locations and how to migrate.
  • Sequence parallelism has been deprecated; please use context parallelism instead.
  • The flag expert_shard_attention_option is deprecated; use custom_mesh_and_rule=ep-as-cp for the same functionality.

maxtext-v0.2.1

23 Mar 22:05

  • Use the new maxtext[runner] installation option to build Docker images without cloning the repository. This can be used for scheduling jobs through XPK. See the MaxText installation instructions for more info.
  • Config can now be inferred for most MaxText commands. If you choose not to provide a config, MaxText will now select an appropriate one.
  • Configs in MaxText PyPI will now be picked up without storing them locally.
  • New features from DeepSeek-AI are now supported: Conditional Memory via Scalable Lookup (Engram) and Manifold-Constrained Hyper-Connections (mHC). Try them out with our deepseek-custom starter config.
  • MaxText now supports customizing your own mesh and logical rules. Two examples guiding how to use your own mesh and rules for sharding are provided in the custom_mesh_and_rule directory.

maxtext-v0.2.0

06 Mar 07:15

Changes

Deprecations

  • Many MaxText modules have changed locations. Core commands like train, decode, sft, etc. will still work as expected temporarily. Please update your commands to the latest file locations.
  • The install_maxtext_github_deps installation script has been replaced with install_maxtext_tpu_github_deps.
  • tools/setup/setup_post_training_requirements.sh, used to install post-training dependencies, is deprecated in favor of pip installation.

maxtext-tutorial-v1.5.0

30 Dec 21:33

Merge pull request #2898 from AI-Hypercomputer:tests_docker_image

PiperOrigin-RevId: 850456883

maxtext-tutorial-v1.4.0

12 Dec 19:49

maxtext-tutorial-v1.3.0

20 Nov 07:19

Merge pull request #2706 from AI-Hypercomputer:mohit/tokamax_quant_gmm

PiperOrigin-RevId: 834605168

maxtext-tutorial-v1.2.0: Merge pull request #2676 from AI-Hypercomputer:pypi_release

14 Nov 21:00

Recipe Branch for TPU performance results

25 Oct 03:54

Merge pull request #2539 from AI-Hypercomputer:qinwen/latest-tokamax

PiperOrigin-RevId: 823749360

maxtext-tutorial-v1.0.0

24 Oct 01:25

Merge pull request #2538 from AI-Hypercomputer:mohit/fix_docker

PiperOrigin-RevId: 822796389

tpu-recipes-v0.1.5

18 Oct 07:27

Use this release for tpu-recipes that require version tpu-recipes-v0.1.5.