PySR supports running across multiple nodes on Slurm via `cluster_manager="slurm"`.

This backend is **allocation-based**: you request resources with Slurm (`sbatch`/`salloc`), then PySR launches Julia workers inside that allocation (using `SlurmClusterManager.jl`).

Here is a minimal `sbatch` example using 3 workers on each of 2 nodes (6 workers total).

Save this as `pysr_job.sh`:
```bash
#!/bin/bash
#SBATCH --job-name=pysr
#SBATCH --partition=normal
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=3
#SBATCH --time=01:00:00

set -euo pipefail
python pysr_script.py
```
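As a sanity check before submitting, the allocation's total task count can be derived from the script's `#SBATCH` directives. The helper below is a hypothetical illustration (not part of PySR), assuming the counts are written as `--nodes=<n>` and `--ntasks-per-node=<n>`:

```python
import re


def expected_procs(sbatch_text: str) -> int:
    """Hypothetical helper: compute the total task count declared by an
    sbatch script, i.e. the value that PySR's `procs` must match."""

    def directive(name: str) -> int:
        # Missing directives fall back to 1, matching Slurm's defaults.
        match = re.search(rf"#SBATCH --{name}=(\d+)", sbatch_text)
        return int(match.group(1)) if match else 1

    return directive("nodes") * directive("ntasks-per-node")
```

For the script above, `expected_procs` returns `2 * 3 = 6`, matching the `procs=6` setting used below.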

Save this as `pysr_script.py` in the same directory:

```python
import numpy as np
from pysr import PySRRegressor

X = np.random.RandomState(0).randn(1000, 2)
y = X[:, 0] + 2 * X[:, 1]

model = PySRRegressor(
    niterations=200,
    populations=2,
    parallelism="multiprocessing",
    cluster_manager="slurm",
    procs=6,  # must match the Slurm allocation's total task count
)
model.fit(X, y)
print(model)
```
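Since the synthetic target above is exactly linear, the expression PySR is expected to recover (`x0 + 2*x1`) can be sanity-checked with plain NumPy, independently of the cluster run:

```python
import numpy as np

# Regenerate the same data as in pysr_script.py.
X = np.random.RandomState(0).randn(1000, 2)
y = X[:, 0] + 2 * X[:, 1]

# A least-squares fit recovers the ground-truth coefficients [1, 2],
# i.e. the expression the symbolic search should converge to.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
```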

Submit it with:

```bash
sbatch pysr_job.sh
```

## Notes

- `procs` is the number of Julia worker processes. It must match the Slurm allocation's total task count (e.g., `--ntasks`, or `--nodes` times `--ntasks-per-node`).
- Run the Python script once (as the master) inside the allocation; do not wrap it in `srun`.
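Rather than hard-coding `procs`, the task count can be read from the environment Slurm sets inside the allocation. A minimal sketch, assuming `SLURM_NTASKS` is set (Slurm exports it inside `sbatch`/`salloc` jobs):

```python
import os


def slurm_procs(env=None):
    """Return the allocation's total task count from SLURM_NTASKS,
    falling back to 1 for local (non-Slurm) runs."""
    env = os.environ if env is None else env
    return int(env.get("SLURM_NTASKS", "1"))


# Inside the job, this could replace the hard-coded value, e.g.:
# model = PySRRegressor(..., procs=slurm_procs())
```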

The corresponding tip in `docs/src/tuning.md` now reads:

> I run from IPython (Jupyter Notebooks don't work as well[^1]) on the head node of a slurm cluster. Passing `cluster_manager="slurm"` will make PySR set up a run over the entire allocation. I set `procs` equal to the total number of tasks across my entire allocation (see the [Slurm page](slurm.md) for a complete multi-node example).