feat: add MPS module with DirectSampling simulation #409
Draft
n0228a wants to merge 7 commits into
Adds a new gstools.mps submodule implementing Multiple-Point Statistics via a DirectSampling algorithm. Includes TrainingImage and distance utilities, plus an initial channel demo example.
## Add `boundary` and `max_radius` to `DirectSampling`

Two new parameters for `DirectSampling` and `ds_simulate`:

**`max_radius`** (float, optional)

Caps simulation-grid (SG) neighbour selection by Euclidean distance. Provides finer spatial control than the integer `max_offset`, which only bounds the precomputed offset table.

**`boundary`** (`"strict"` | `"partial"`)

Controls what happens when the data-event template extends beyond the training image (TI) edges.

- `"strict"` (default): existing behaviour. If no valid window exists, fall back to a random TI value.
- `"partial"`: drops lags that can never be placed in the TI (|h| ≥ TI size in any dimension), then searches with the reduced template (Mariethoz 2010 §6.2). Avoids unnecessary random fallbacks when a large or stretched template only partially overlaps the TI.
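The two filters described above can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the helper names `select_neighbours` and `drop_unplaceable_lags` are hypothetical, and treating `max_radius` as an inclusive bound is an assumption.

```python
import math


def select_neighbours(offsets, max_radius=None):
    """Keep only offsets whose Euclidean length is within max_radius.

    Hypothetical helper; the inclusive comparison (<=) is an assumption.
    """
    if max_radius is None:
        return list(offsets)
    return [h for h in offsets if math.hypot(*h) <= max_radius]


def drop_unplaceable_lags(lags, ti_shape):
    """Drop lags with |h| >= TI size in any dimension ("partial" boundary).

    Such a lag can never be placed in the TI together with the central node,
    so it is removed from the template before the search.
    """
    return [
        h for h in lags
        if all(abs(c) < n for c, n in zip(h, ti_shape))
    ]


lags = [(0, 1), (3, 0), (-2, 2), (0, -5), (4, 4)]

# Euclidean cap: (0, -5) and (4, 4) are longer than 3.0
print(select_neighbours(lags, max_radius=3.0))
# [(0, 1), (3, 0), (-2, 2)]

# TI of shape (3, 6): (3, 0) and (4, 4) exceed the first dimension
print(drop_unplaceable_lags(lags, ti_shape=(3, 6)))
# [(0, 1), (-2, 2), (0, -5)]
```

In this sketch the two filters are independent: the radius cap acts on which SG neighbours enter the data event, while the lag drop depends only on the TI shape.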
scan_fraction*search_window
## Add Multiple-Point Statistics via Direct Sampling (`gstools.mps`)

This PR introduces a new `gstools.mps` submodule implementing Multiple-Point Statistics (MPS) simulation through the Direct Sampling algorithm (Mariethoz et al. 2010; Juda et al. 2022). MPS is a fundamentally different paradigm from two-point geostatistics: instead of parametric covariance models, spatial structure is learned directly from a training image (TI).

### What was added

#### `gstools/mps/training_image.py`: `TrainingImage`

The MPS analogue of `CovModel`. Wraps an n-dimensional NumPy array and encapsulates the distance function used to compare data events. Supports:

#### `gstools/mps/distance.py`: pure distance functions

Stateless helper functions (`categorical_dist`, `l1_dist`, `l2_dist`, `lp_dist`, `variation_dist`, `compute_node_weights`) separated from class state for clarity and reusability.

#### `gstools/mps/direct_sampling.py`: `DirectSampling`

Subclasses `gstools.field.base.Field` and follows the same call interface as `SRF`. Key features:

- `max_radius` caps selection by Euclidean distance
- `scan_fraction` caps the fraction of the per-node search window scanned (Mariethoz et al. 2010, §3 ¶24)
- `threshold=0.0` activates DSBC mode (best-candidate, full scan)
- `boundary="strict"` (default) or `"partial"`: partial mode drops lags that can never fit in the TI and searches with the reduced template (Mariethoz et al. 2010, §6.2)
- `set_condition()` for hard conditioning data, with smart nearest-node snapping and collision resolution (Mariethoz et al. 2010, §3)
- `num_threads` (or `gstools.config.NUM_THREADS`): independent nodes in the simulation path are processed concurrently using `ThreadPoolExecutor`

#### `examples/13_mps/channel_demo.py`

An end-to-end demo using the classic Strebelle (2002) channelized fluvial training image, demonstrating conditional simulation with 100 hard-data points.
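To make the interplay of `threshold`, `scan_fraction` and the strict fallback listed under "Key features" more concrete, here is a rough single-node scan on a 1D training image. It is a simplified illustration under stated assumptions, not the implementation in this PR: the function `ds_scan`, its signature and the mismatch-fraction distance are all hypothetical stand-ins.

```python
import random


def categorical_dist(a, b):
    """Fraction of mismatching entries between two data events (assumed form)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def ds_scan(ti, lags, event, threshold, scan_fraction=1.0, rng=None):
    """Scan a 1D TI for a pattern matching the data event at one node.

    Returns (value, distance). Hypothetical helper for illustration only.
    """
    rng = rng or random.Random(0)
    # Valid central positions: every lag must stay inside the TI.
    lo = max(0, -min(lags))
    hi = len(ti) - max(0, max(lags))
    positions = list(range(lo, hi))
    if not positions:
        # "strict" behaviour: no valid window, fall back to a random TI value.
        return rng.choice(ti), float("inf")
    rng.shuffle(positions)
    # scan_fraction caps how much of the search window is visited.
    n_scan = max(1, int(scan_fraction * len(positions)))
    best_val, best_dist = None, float("inf")
    for pos in positions[:n_scan]:
        candidate = [ti[pos + h] for h in lags]
        d = categorical_dist(candidate, event)
        if d < best_dist:
            best_val, best_dist = ti[pos], d
        # threshold > 0: accept the first candidate that is close enough.
        # threshold == 0.0: never stop early, keep the best candidate (DSBC-like).
        if threshold > 0.0 and d <= threshold:
            break
    return best_val, best_dist


ti = [0, 1, 2, 0, 1, 2, 0, 1, 2]
lags = [-1, 1]    # left and right neighbour
event = [0, 2]    # known values at those neighbours

print(ds_scan(ti, lags, event, threshold=0.0))  # (1, 0.0)
```

With `threshold=0.0` the loop visits the whole scanned fraction and returns the best candidate, while a positive threshold returns as soon as one candidate is close enough. Lowering `scan_fraction` trades pattern quality for speed, since fewer positions are compared.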
### Usage

### References