[feat] Add Training Web UI by yuyu5333 · Pull Request #524 · jingyaogong/minimind

yuyu5333 · 2025-11-06T14:52:32Z

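The Minimind project is fantastic. For the first time I was able to train a conversational chat model from scratch, and the results are impressive enough that I keep coming back to it.

With frequent use, I wanted to make the training step easier. If a training job could be completed with a few mouse clicks, the barrier to training would drop considerably.

✨✨✨So I built a Web UI for Minimind's training stage. Any user can deploy it locally and run training from the browser.

I also hope Minimind supporters will join as developers of and contributors to the Training Web UI. More updates will follow~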

Update 11-24

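Support GRPO and SPO
Training progress updates in real time while a job runs, so the log panel no longer has to stay open for long periods (frequent front-end/back-end requests make the web cache grow steadily)
Improved training interface for better usability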

Update

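Accuracy verification. Conclusion: training through the web UI does not affect Minimind's performance or results
SwanLab (wandb) one-click jump: for training processes that use SwanLab log monitoring, jump to the corresponding SwanLab log with one click
One-click multi-GPU parallel training with a freely selectable number of GPUs
Support for pretrain, sft, lora, dpo, and ppo training
Safe process execution that keeps training from being interrupted when a remote connection such as vscode or trae drops; see usage method 1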

Usage (download the datasets into the minimind/dataset folder beforehand):

git clone https://github.com/yuyu5333/minimind.git
cd minimind
git checkout feat/as_a_tools
pip install -r requirements.txt
# Method 1: start with automatic process management
bash trainer_web/start_web_ui.sh
# Method 2: run train_web_ui.py directly
python trainer_web/train_web_ui.py

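If you are on a cloud server, remote-connection tools such as VS Code or TRAE forward the port automatically, so a service started in the cloud can be operated from a local browser:

The Training Web UI looks like this:

Todo list: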

jingyaogong · 2025-11-06T16:12:19Z

A nice attempt 😊
I will run this PR when I have time (based on the fork's repo) and add comments in the PR if there are any problems

yuyu5333 · 2025-11-07T05:18:18Z

Update:

Support RL - DPO
Training logs update automatically
Improved the interface, etc.

yuyu5333 · 2025-11-10T10:07:22Z

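Update:

Support multi-GPU parallel training: automatically detects whether a GPU is available and how many there are, and restricts the input accordingly for single-GPU/multi-GPU training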
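The GPU check described above can be sketched as follows. This is only an illustration of the idea, not the PR's actual code: the function names `detect_gpu_count` and `validate_gpu_request` are hypothetical, and the sketch assumes the count comes from PyTorch's `torch.cuda.device_count()`.

```python
from typing import Optional


def detect_gpu_count() -> int:
    """Return the number of visible CUDA devices, or 0 if PyTorch/CUDA is unavailable."""
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except Exception:
        return 0
    return torch.cuda.device_count() if torch.cuda.is_available() else 0


def validate_gpu_request(requested: int, available: int) -> Optional[str]:
    """Return an error message for an invalid request, or None if it is acceptable.

    Hypothetical rule set: with no GPU only a request of 0 (CPU) passes;
    otherwise the request must lie between 1 and the detected count.
    """
    if available == 0:
        return None if requested == 0 else "no GPU detected; GPU training is unavailable"
    if requested < 1:
        return "select at least 1 GPU"
    if requested > available:
        return f"requested {requested} GPUs but only {available} detected"
    return None


if __name__ == "__main__":
    count = detect_gpu_count()
    print("detected GPUs:", count)
    print("validation of a 4-GPU request:", validate_gpu_request(4, count))
```

A form handler could call `validate_gpu_request` before launching anything and show the returned message next to the GPU-count field.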

yuyu5333 · 2025-11-11T07:55:17Z

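Update:

Support ppo training
Restructured the web code and placed it all under minimind/trainer_web
Provide a safer way to start the web service that keeps training from being interrupted when a remote connection such as vscode or trae drops. Usage: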

cd ~/minimind
bash trainer_web/start_web_ui.sh

The following output means the service started correctly; open “http://localhost:5000” to access it:

bash trainer_web/start_web_ui.sh
Starting MiniMind Web UI service...
Log file: ../logfile/web_ui_20251111_074656.log
Service started! PID: 2919497
URL: http://localhost:5000
Stop command: kill 2919497 or bash trainer_web/start_web_ui.sh

To stop Web Training, kill the process directly or use stop_web_ui.sh:

minimind# bash trainer_web/stop_web_ui.sh
Stopping Web UI service (PID: 2919497)
Service stopped

🍰When the service starts, the Web Training process ID is saved temporarily in trainer_web/train_web_ui.pid. Stopping the service with kill or stop_web_ui.sh cleans up the pid file automatically.
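The pid-file lifecycle described above might look roughly like this in Python. It is a minimal sketch under assumptions: the PR's scripts are shell scripts whose contents are not shown here, the helper names are hypothetical, and POSIX signals are assumed.

```python
import os
import signal
from pathlib import Path
from typing import Optional


def write_pid_file(pid_file: Path, pid: int) -> None:
    """Record the service PID so that a later stop command can find it."""
    pid_file.parent.mkdir(parents=True, exist_ok=True)
    pid_file.write_text(str(pid))


def read_pid_file(pid_file: Path) -> Optional[int]:
    """Return the recorded PID, or None if the file is missing or malformed."""
    try:
        return int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        return None


def stop_service(pid_file: Path) -> bool:
    """Send SIGTERM to the recorded PID and remove the pid file.

    Returns True if a signal was delivered, False if there was nothing to stop.
    """
    pid = read_pid_file(pid_file)
    sent = False
    if pid is not None:
        try:
            os.kill(pid, signal.SIGTERM)
            sent = True
        except ProcessLookupError:
            pass  # the process is already gone, so the pid file was stale
    pid_file.unlink(missing_ok=True)  # clean up in every case
    return sent
```

Removing the file even when the process no longer exists keeps a stale pid file from blocking the next start.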

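🍿After the service stops, the processes that were running or had already stopped during this web session are saved locally in trainer_web/training_processes.json, so their history can be viewed the next time Web Training starts.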
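Persisting the process history could be done along these lines. The record fields (`pid`, `train_type`, `status`) and the function names are assumptions made for illustration; the actual schema of training_processes.json is not shown in this thread.

```python
import json
from pathlib import Path
from typing import Dict, List


def save_history(path: Path, processes: List[Dict]) -> None:
    """Write the process list to disk, replacing the old file in one step."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(processes, ensure_ascii=False, indent=2), encoding="utf-8")
    tmp.replace(path)  # rename over the target so readers never see a partial file


def load_history(path: Path) -> List[Dict]:
    """Load the saved process list; an absent or corrupt file yields an empty history."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []


if __name__ == "__main__":
    history_file = Path("training_processes_demo.json")  # demo path, not the real file
    save_history(history_file, [
        {"pid": 1234, "train_type": "sft", "status": "stopped"},  # hypothetical record
    ])
    print(load_history(history_file))
    history_file.unlink()
```

Writing to a temporary file first means an interrupted shutdown leaves the previous history intact.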

yuyu5333 · 2025-11-11T08:22:29Z

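Conclusion first: training through the web UI does not affect Minimind's performance or results

Since I made no changes to Minimind's training code, in theory training quality and performance should not degrade. But in my experience training models, programs are often unpredictable, and unexpected results show up even when I believe nothing has changed.

So I trained Minimind through the web UI (single-GPU/multi-GPU) and through the CLI (single-GPU/multi-GPU), and compared the loss curves in SwanLab to verify that training is correct.

PS: Some stages are very time-consuming, such as pretrain and rl. As is well known, a VS Code remote connection to a cloud server may drop after a long period of inactivity and stop the training process, which is what pushed me to implement process protection. LoRA training is very fast, so I tested and compared both single-GPU and multi-GPU runs; all other stages were trained with 4 GPUs in parallel (pretrain also used a single GPU, because the multi-GPU feature had not been written yet at the time, haha). In theory the training loss should be exactly identical everywhere, as it is for pretrain and sft. The small differences seen elsewhere may come from randomness in hardware computation, or from the more complex computation in the lora and rl stages. So far the loss trends are consistent and show no significant difference, which supports the consistency of training.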
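The comparison above was done by eye on SwanLab curves. If the per-step losses of both runs were exported as lists, the same check could be made numerically. The sketch below is an assumption-laden illustration: `compare_losses` is a hypothetical helper, and the 5% tolerance is an arbitrary choice, not a threshold used in this PR.

```python
from typing import Sequence, Tuple


def compare_losses(web: Sequence[float], cli: Sequence[float],
                   rel_tol: float = 0.05) -> Tuple[float, bool]:
    """Return (largest per-step relative difference, whether it is within rel_tol)."""
    if len(web) != len(cli) or len(web) == 0:
        raise ValueError("both runs must log the same, non-zero number of steps")
    max_rel = max(
        abs(w - c) / max(abs(w), abs(c), 1e-12)  # guard against division by zero
        for w, c in zip(web, cli)
    )
    return max_rel, max_rel <= rel_tol


if __name__ == "__main__":
    # Made-up numbers purely to show the call; they are not results from this PR.
    web_run = [2.00, 1.50, 1.20]
    cli_run = [2.00, 1.51, 1.19]
    print(compare_losses(web_run, cli_run))
```

Identical curves, as reported for pretrain and sft, give a maximum difference of exactly 0.0.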

Pretrain

Minimind default configuration, changing only log_interval = 1:

# web
/usr/bin/python3 ../trainer/train_pretrain.py --save_weight pretrain_web --epochs 1 --batch_size 32 --learning_rate 5e-4 --log_interval 1 --data_path ../dataset/pretrain_hq.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out --save_interval 100 --from_weight none --device cuda:0 --use_wandb --wandb_project minimind_training

# cli
nohup python3 train_pretrain.py --save_weight pretrain_cli --epochs 1 --batch_size 32 --learning_rate 5e-4 --log_interval 1 --data_path ../dataset/pretrain_hq.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out --save_interval 100 --from_weight none --device cuda:1 --use_wandb --wandb_project minimind_training_cli > ../logfile_cli/1.pretrain.log 2>&1 &

Pretrain Loss:

SFT

Minimind default configuration, batch size 64

Multi-GPU parallel (4 GPUs; all later multi-GPU runs also use 4 GPUs)

# web: cuda 0 1 2 3
torchrun --master_port 12345 --nproc_per_node 4 ../trainer/train_full_sft.py --save_weight full_sft_web --epochs 2 --batch_size 64 --learning_rate 5e-7 --log_interval 1 --data_path ../dataset/sft_mini_512.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out --save_interval 100 --from_weight pretrain_web --from_resume 0 --use_wandb --wandb_project minimind_training
# cli: 4 5 6 7
CUDA_VISIBLE_DEVICES="4,5,6,7" nohup torchrun --nproc_per_node 4 train_full_sft.py --save_weight full_sft_cli --epochs 2 --batch_size 64 --learning_rate 5e-7 --log_interval 1 --data_path ../dataset/sft_mini_512.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out --save_interval 100 --from_weight pretrain_cli  --use_wandb --wandb_project minimind_training_cli > ../logfile_cli/2.sft_4gpu.log 2>&1 &

SFT Loss:

Lora

Default configuration, 4 GPUs in parallel/single GPU, log interval 1

Multi-GPU parallel

# web: cuda 0 1 2 3
torchrun --nproc_per_node 4 ../trainer/train_lora.py --lora_name lora_identity_web --epochs 50 --batch_size 32 --learning_rate 1e-4 --log_interval 1 --data_path ../dataset/lora_identity.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out/lora --save_interval 1 --from_weight full_sft_web --from_resume 0 --use_wandb --wandb_project minimind_training
# cli: 4 5 6 7
CUDA_VISIBLE_DEVICES="4,5,6,7" nohup torchrun --nproc_per_node 4 ../trainer/train_lora.py --lora_name lora_identity_cli --epochs 50 --batch_size 32 --learning_rate 1e-4 --log_interval 1 --data_path ../dataset/lora_identity.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out/lora --save_interval 1 --from_weight full_sft_cli --from_resume 0 --use_wandb --wandb_project minimind_training_cli > ../logfile_cli/3.lora_4gpu.log 2>&1 &

Lora Loss multi-gpu:

Single GPU

# web
/usr/bin/python3 ../trainer/train_lora.py --lora_name lora_identity_web_single --epochs 50 --batch_size 32 --learning_rate 1e-4 --log_interval 1 --data_path ../dataset/lora_identity.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out/lora --save_interval 1 --from_weight full_sft_web --device cuda:0 --from_resume 0 --use_wandb --wandb_project minimind_training
# cli
nohup python3  ../trainer/train_lora.py --lora_name lora_identity_cli_single --epochs 50 --batch_size 32 --learning_rate 1e-4 --log_interval 1 --data_path ../dataset/lora_identity.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 512 --use_moe 0 --save_dir ../out/lora --save_interval 1 --from_weight full_sft_cli --from_resume 0 --use_wandb --wandb_project minimind_training_cli --device cuda:1  > ../logfile_cli/3.lora_single.log 2>&1 &
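The web commands in this comment follow two shapes: `torchrun --nproc_per_node N script ...` for multi-GPU runs and `python script ... --device cuda:X` for single-GPU runs. A rough sketch of assembling such a command from form values is shown below; `build_command` and its parameters are hypothetical and are not taken from the PR's code.

```python
import sys
from typing import Dict, List


def build_command(script: str, params: Dict[str, object],
                  gpu_num: int, device: str = "cuda:0") -> List[str]:
    """Assemble an argv list for a training script.

    Multi-GPU runs go through torchrun; single-GPU runs call the interpreter
    directly and pass --device, mirroring the commands shown in this comment.
    """
    if gpu_num > 1:
        cmd = ["torchrun", "--nproc_per_node", str(gpu_num), script]
    else:
        cmd = [sys.executable, script, "--device", device]
    for name, value in params.items():
        if value is True:
            cmd.append(f"--{name}")  # bare flags such as --use_wandb
        elif value is False or value is None:
            continue  # unset flags are simply omitted
        else:
            cmd += [f"--{name}", str(value)]  # valued options such as --use_moe 0
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command(
        "../trainer/train_lora.py",
        {"lora_name": "lora_identity_web", "epochs": 50, "use_wandb": True},
        gpu_num=4,
    )))
```

Passing the result as a list to `subprocess.Popen` avoids shell quoting problems with user-supplied values.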

Lora Loss single-gpu:

RL - DPO

Default configuration, 4 GPUs in parallel, log interval 1

# web: cuda 0 1 2 3
torchrun --nproc_per_node 4 ../trainer/train_dpo.py --beta 0.1 --epochs 1 --batch_size 4 --learning_rate 4e-8 --log_interval 1 --data_path ../dataset/dpo.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 1024 --use_moe 0 --save_dir ../out --save_interval 100 --from_weight full_sft_web --from_resume 0 --use_wandb --wandb_project minimind_training
# cli: cuda 4 5 6 7
CUDA_VISIBLE_DEVICES="4,5,6,7" nohup torchrun --master_port 12345  --nproc_per_node 4 ../trainer/train_dpo.py --beta 0.1 --epochs 1 --batch_size 4 --learning_rate 4e-8 --log_interval 1 --data_path ../dataset/dpo.jsonl --hidden_size 512 --num_hidden_layers 8 --max_seq_len 1024 --use_moe 0 --save_dir ../out --save_interval 100 --from_weight full_sft_cli --from_resume 0 --use_wandb --wandb_project minimind_training_cli --save_weight dpo_cli  > ../logfile_cli/4.dpo_gpu_4.log 2>&1 &

DPO Loss:

yuyu5333 · 2025-11-11T08:35:00Z

@jingyaogong There are probably many things I have not fully thought through. I would appreciate your expert guidance and suggestions~

yuyu5333 · 2025-11-11T13:23:51Z

Update

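Web Training SwanLab one-click jump: for training processes that use SwanLab log monitoring, jump to the corresponding SwanLab log page with one click:

SwanLab (wandb) must be configured beforehand; see the official SwanLab tutorial for instructions.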

Init Web UI

9bfb58a

yuyu5333 mentioned this pull request Nov 6, 2025

Training Web UI is Coming !!! Deploy a graphical interface with one command and start Minimind training with a few mouse clicks #525

Open

16 tasks

yuyu5333 force-pushed the feat/as_a_tools branch from 4a6d0a7 to 9bfb58a, November 6, 2025 15:12

yuyu5333 added 2 commits November 7, 2025 03:11

fix training logs show

e9d2ccb

add: rl-dpo

5bef3f0

yuyu5333 added 4 commits November 7, 2025 09:34

update logfile && process

48a81d9

update web auto port

9b27e8b

add default model para

768f070

support ddp

095a656

yuyu5333 added 3 commits November 10, 2025 11:08

Merge remote-tracking branch 'upstream/master' into feat/as_a_tools

237744d

support ppo and Training web code refactoring

d66a794

update safe web

04477b7

yuyu5333 and others added 11 commits November 11, 2025 13:26

support Swanlab check

3a03ced

update http && process logs

7c947e5

rewrite web

c0b39e2

update bash start

ebf1a85

rewrite web

544c889

support rl-grpo, rl-spo

49bd32f

update web

52b7b88

init sdk

66dcb40

update sdk

25cf74e

update check web server health

a794898

update import web

a955686

yuyu5333 and others added 14 commits November 20, 2025 20:47

update web ui

102a7c0

remove sdk

fc1f07b

update web dataset file

4a741b3

update web dataset file

a0013ea

update web dataset file

249d1c0

update web dataset file

8845d91

update web dataset file

dfe1e5c

update log flash

80f75a2

update web ui

5e5b1be

remove useless file

b336bc6

add process step

826a1fb

update process step

d9dddcc

update para from_resume

aee1913

update start bash

0ebf835

yuyu5333 force-pushed the feat/as_a_tools branch from 125c522 to 0ebf835, November 25, 2025 03:56

Delete README_web.md

b3069d4

yuyu5333 wants to merge 36 commits into jingyaogong:master from yuyu5333:feat/as_a_tools
