Skip to content

feat: add MPS module with DirectSampling simulation#409

Draft
n0228a wants to merge 7 commits into
GeoStat-Framework:mainfrom
n0228a:feature/mps
Draft

feat: add MPS module with DirectSampling simulation#409
n0228a wants to merge 7 commits into
GeoStat-Framework:mainfrom
n0228a:feature/mps

Conversation

@n0228a
Copy link
Copy Markdown

@n0228a n0228a commented May 12, 2026

Add Multiple-Point Statistics via Direct Sampling (gstools.mps)

This PR introduces a new gstools.mps submodule implementing Multiple-Point Statistics (MPS) simulation through the Direct Sampling algorithm (Mariethoz et al. 2010; Juda et al. 2022). MPS is a fundamentally different paradigm from two-point geostatistics: instead of parametric covariance models, spatial structure is learned directly from a training image (TI).

What was added

gstools/mps/training_image.pyTrainingImage

The MPS analogue of CovModel. Wraps an n-dimensional NumPy array and encapsulates the distance function used to compare data events. Supports:

  • Categorical variables (e.g. facies): weighted mismatch distance (Mariethoz et al. 2010, Eq. 3)
  • Continuous variables: L1, L2, Lp (arbitrary Minkowski exponent), and variation distance with automatic mean-shift correction (Mariethoz et al. 2010, Eq. 9)
  • Spatial-decay weighting of neighbours (distance power δ, Mariethoz et al. 2010, Eq. 5)
  • Upweighting of hard conditioning nodes (Mariethoz et al. 2010, §3 ¶26)

gstools/mps/distance.py — pure distance functions

Stateless helper functions (categorical_dist, l1_dist, l2_dist, lp_dist, variation_dist, compute_node_weights) separated from class state for clarity and reusability.

gstools/mps/direct_sampling.pyDirectSampling

Subclasses gstools.field.base.Field and follows the same call interface as SRF. Key features:

  • Works on n-dimensional structured grids
  • Randomized simulation path (Mariethoz et al. 2010, §3 ¶13)
  • Neighbour search sorted by Euclidean distance; max_radius caps selection by Euclidean distance
  • scan_fraction caps the fraction of the per-node search window scanned (Mariethoz et al. 2010, §3 ¶24)
  • threshold=0.0 activates DSBC mode (best-candidate, full scan)
  • boundary="strict" (default) or "partial" — partial mode drops lags that can never fit in the TI and searches with the reduced template (Mariethoz et al. 2010, §6.2)
  • set_condition() for hard conditioning data, with smart nearest-node snapping and collision resolution (Mariethoz et al. 2010, §3)
  • DAG-based parallelism via num_threads (or gstools.config.NUM_THREADS): independent nodes in the simulation path are processed concurrently using ThreadPoolExecutor

examples/13_mps/channel_demo.py

An end-to-end demo using the classic Strebelle (2002) channelized fluvial training image, demonstrating conditional simulation with 100 hard-data points.

Usage

import numpy as np
from gstools import mps

ti = mps.TrainingImage(ti_array, categorical=True)
ds = mps.DirectSampling(ti, n_neighbors=30, scan_fraction=1.0, threshold=0.01, num_threads=4)
ds.set_condition(cond_pos, cond_val)

x = y = np.arange(100, dtype=float)
field = ds([x, y], seed=42)

References

  • Mariethoz et al. (2010): Direct sampling method to perform multiple‑point geostatistical simulations.
  • Juda et al. (2022): A parsimonious parametrization of the Direct Sampling algorithm for multiple-point statistical simulations.

n0228a added 7 commits May 12, 2026 15:36
Adds a new gstools.mps submodule implementing Multiple-Point Statistics
via a DirectSampling algorithm. Includes TrainingImage and distance
utilities, plus an initial channel demo example.
## Add `boundary` and `max_radius` to `DirectSampling`

Two new parameters for `DirectSampling` and `ds_simulate`:

**`max_radius`** (float, optional)
Caps SG neighbour selection by Euclidean distance. Provides finer spatial control than the integer `max_offset`, which only bounds the precomputed offset table.

**`boundary`** (`"strict"` | `"partial"`)
Controls what happens when the data-event template extends beyond the training image edges.
- `"strict"` (default) — existing behaviour: if no valid window exists, fall back to a random TI value.
- `"partial"` — drops lags that can never be placed in the TI (|h| ≥ TI size in any dimension), then searches with the reduced template (Mariethoz 2010 §6.2). Avoids unnecessary random fallbacks when a large or stretched template only partially overlaps the TI.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant