Problem
A single training run scatters five kinds of output across unrelated top-level directories:
| Artifact | Current location |
| --- | --- |
| Model checkpoints | saved_models/<run-name>/ |
| Lightning CSV logs | lightning_logs/version_N/ (auto-incremented, not named) |
| W&B offline cache | wandb/offline-run-<timestamp>-<id>/ |
| Eval artifacts (PDFs, CSVs, .pt files) | wandb/ or mlruns/ via self.logger.save_dir |
| MLFlow artifacts | mlruns/ (hard-coded in CustomMLFlowLogger.save_dir) |
Run Name Format
Auto-generated in neural_lam/train_model.py as:
{prefix}{model}-{processor_layers}x{hidden_dim}-{MM_DD_HH}-{random_4digits}
e.g. train-graph_lam-2x64-02_21_12-4571. Override with --logger_run_name.
The run name is applied to saved_models/ and the external logger, but not to lightning_logs/.
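For reference, the format above can be reproduced with a few lines of standard-library Python. This is only a sketch of the naming scheme as described here, not the code in neural_lam/train_model.py; the helper name make_run_name and its now/rng parameters are hypothetical and exist only to make the example deterministic when needed.

```python
import random
import time


def make_run_name(prefix, model, processor_layers, hidden_dim, now=None, rng=None):
    """Build a run name following the format described above.

    Hypothetical helper for illustration; `now` and `rng` are assumptions
    added so the timestamp and random suffix can be fixed in tests.
    """
    now = time.localtime() if now is None else now
    rng = random if rng is None else rng
    stamp = time.strftime("%m_%d_%H", now)  # MM_DD_HH
    suffix = rng.randint(0, 9999)  # zero-padded to four digits below
    return f"{prefix}{model}-{processor_layers}x{hidden_dim}-{stamp}-{suffix:04d}"


if __name__ == "__main__":
    # Prints something shaped like train-graph_lam-2x64-02_21_12-4571
    print(make_run_name("train-", "graph_lam", 2, 64))
```

Because the timestamp only has hour resolution, the random suffix is what keeps two runs started in the same hour from colliding.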
Proposed Change
All output should go under runs/<run-name>/:
train_model.py: set ModelCheckpoint(dirpath=f"runs/{run_name}/checkpoints") and default_root_dir=f"runs/{run_name}" on pl.Trainer.
utils.py: pass save_dir=f"runs/{run_name}" to WandbLogger and CustomMLFlowLogger.
custom_loggers.py: make save_dir return the run-scoped path instead of "mlruns".
ar_model.py: no changes needed — artifacts already use self.logger.save_dir.
README.md / .gitignore: replace saved_models/, lightning_logs/, wandb/ references with runs/.
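The proposed layout could be centralised in one small helper so that every component derives its path from the same run name. The sketch below uses only pathlib; the function name run_output_dirs and the base parameter are assumptions made for illustration and are not part of the existing code.

```python
from pathlib import Path


def run_output_dirs(run_name, base="runs"):
    """Return the run-scoped directories proposed above.

    Hypothetical helper: `base` defaults to "runs" as in the proposal.
    """
    root = Path(base) / run_name
    return {
        "root": root,  # would be passed as default_root_dir / save_dir
        "checkpoints": root / "checkpoints",  # would be passed as dirpath
    }


if __name__ == "__main__":
    dirs = run_output_dirs("train-graph_lam-2x64-02_21_12-4571")
    for key, path in dirs.items():
        print(key, path.as_posix())
```

With such a helper, train_model.py would pass dirs["checkpoints"] to ModelCheckpoint(dirpath=...) and dirs["root"] to pl.Trainer(default_root_dir=...), while utils.py would pass dirs["root"] as save_dir to the loggers.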