alabi package

alabi.benchmarks module

benchmarks.py

alabi.benchmarks.random_gaussian_covariance(n_dims)[source]

Generate a random positive definite covariance matrix.

alabi.benchmarks.multimodal_gaussian_nd(x, means, covs, amps)[source]

alabi.cache_utils module

cache_utils.py

alabi.cache_utils.load_pickle(savedir, fname='surrogate_model.pkl')[source]
alabi.cache_utils.load_model_cache(savedir)[source]

MPI-safe model loading that prevents file corruption.

Parameters:

savedir – Directory containing the model cache

Returns:

Loaded surrogate model

alabi.cache_utils.write_report_gp(self, file)[source]
alabi.cache_utils.write_report_emcee(self, file)[source]
alabi.cache_utils.write_report_dynesty(self, file)[source]

alabi.core module

core.py

class alabi.core.SurrogateModel(lnlike_fn=None, bounds=None, param_names=None, cache=True, savedir='results/', model_name='surrogate_model', verbose=True, show_warnings=False, ncore=1, pool_method='forkserver', ignore_warnings=True, random_state=None)[source]

Bases: object

Gaussian Process surrogate model for Bayesian inference and optimization.

A SurrogateModel uses a Gaussian Process to create a fast approximation of expensive likelihood functions, enabling efficient Bayesian inference, parameter estimation, and active learning. The model supports various active learning algorithms and scalers for handling different types of likelihood functions.

Parameters:
  • lnlike_fn – (callable, required) Log-likelihood function that takes parameter array theta and returns scalar log-likelihood value. For Bayesian inference, this is your model’s log-likelihood. Signature: lnlike_fn(theta) -> float

  • bounds – (array-like, required) Prior bounds for each parameter. List/array of (min, max) tuples for each dimension. Example: bounds = [(0, 1), (2, 3), (-1, 1)]

  • param_names – (array-like, optional) Names/labels for each parameter. If None, defaults to θ₀, θ₁, etc. Length must match number of dimensions in bounds.

  • cache – (bool, optional, default=True) Whether to cache the trained model to disk for reuse

  • savedir – (str, optional, default=”results/”) Directory for saving results, plots, and cached models

  • model_name – (str, optional, default=”surrogate_model”) Name prefix for cached model files

  • verbose – (bool, optional, default=True) Print progress information during training and inference

  • ncore – (int, optional, default=cpu_count()) Number of CPU cores to use for parallel computation

  • ignore_warnings – (bool, optional, default=True) Suppress sklearn and other package warnings

  • random_state – (int, optional, default=None) Random seed for reproducible results

gp: george.GP

Trained Gaussian Process model

bounds: ndarray

Original parameter bounds (unscaled)

_bounds: ndarray

Scaled parameter bounds used for GP training

_theta: ndarray

Training parameter samples (scaled)

_y: ndarray

Training likelihood values (scaled)

ntrain: int

Number of initial training samples

ndim: int

Number of parameters/dimensions

emcee_samples: ndarray

MCMC samples from emcee (if run_emcee called)

dynesty_samples: ndarray

Nested sampling results from dynesty (if run_dynesty called)

Examples

Basic usage for Bayesian inference:

def log_likelihood(theta):
    # Your model likelihood function
    return -0.5 * np.sum((theta - 2)**2)

bounds = [(0, 4), (0, 4)]  # 2D parameter space
sm = SurrogateModel(log_likelihood, bounds)
sm.init_samples(ntrain=100)  # Initial training data
sm.init_gp()  # Initialize Gaussian Process
sm.active_train(niter=50)  # Active learning
sm.run_dynesty()  # Bayesian inference

For optimization problems:

sm.active_train(algorithm="jones")  # Use Jones algorithm for optimization

See also

init_samples() : Initialize training data init_gp() : Initialize Gaussian Process active_train() : Perform active learning run_dynesty() : Run nested sampling with dynesty run_emcee() : Run MCMC sampling with emcee run_ultranest() : Run nested sampling with UltraNest run_pymultinest() : Run nested sampling with PyMultiNest

__init__(lnlike_fn=None, bounds=None, param_names=None, cache=True, savedir='results/', model_name='surrogate_model', verbose=True, show_warnings=False, ncore=1, pool_method='forkserver', ignore_warnings=True, random_state=None)[source]
save()[source]

Pickle SurrogateModel object and write summary to a text file.

theta()[source]

Return unscaled training theta values

y()[source]

Return unscaled training y values

refit_scalers(theta, y, theta_scaler=None, y_scaler=None)[source]

Refit the theta and y scalers using current training data. Useful if training data has changed significantly.

init_train(nsample=None, sampler='lhs', fname='initial_training_sample.npz')[source]
Parameters:
  • nsample – (int, optional) Number of samples. Defaults to nsample = 50 * self.ndim

  • sampler – (str, optional) Sampling method. Defaults to 'lhs'. See utility.prior_sampler for more details.

load_train(cache_file)[source]

Reload training samples from cache file and apply scalers.

Parameters:

cache_file – (str, required) Name of cache file relative to savedir. Must be a .npz file containing ‘theta’ and ‘y’ arrays.

Returns:

(tuple) Scaled training samples (_theta, _y) after loading from cache.

init_samples(ntrain=100, ntest=0, sampler='uniform', train_file=None, test_file=None)[source]

Initialize training and test samples for the surrogate model.

Creates initial dataset by either loading cached samples or computing new ones by evaluating the likelihood function at randomly sampled parameter values.

Parameters:
  • ntrain – (int, optional, default=100) Number of training samples to generate. Used only if not loading cached samples.

  • ntest – (int, optional, default=0) Number of test samples to generate. Currently unused.

  • sampler

    (str, optional, default=”uniform”) Sampling method for generating parameter values. Options:

    • ”uniform”: Uniform sampling within bounds (default)

    • ”sobol”: Low-discrepancy Sobol sequence sampling

    • ”lhs”: Latin hypercube sampling

  • train_file – (str, optional, default=”initial_training_sample.npz”) Filename for cached training samples relative to savedir. Format: .npz file containing ‘theta’ and ‘y’ arrays.

  • test_file – (str, optional, default=”initial_test_sample.npz”) Filename for cached test samples relative to savedir. Currently unused.

set_hyperparam_prior_bounds()[source]

Configure prior bounds for GP hyperparameters based on current training data.

By default ranges for parameters:
  • mean: [mean(y) - std(y), mean(y) + std(y)]

  • amplitude: [0.1, 10]

  • white noise: [white_noise - 3, white_noise + 3]

expand_hyperparameter_vector(optimized_params)[source]
set_hyperparameter_vector(tmp_gp, optimized_params)[source]
get_hyperparameter_dict(gp)[source]

Get current GP hyperparameters as a dictionary.

Returns:

(dict) Dictionary of current GP hyperparameters with names as keys and values as values.

get_hyperparameter_vector(gp)[source]
init_gp(kernel='ExpSquaredKernel', fit_amp=True, fit_mean=True, fit_white_noise=True, white_noise=-12, gp_scale_rng=[-2, 2], gp_amp_rng=[-1, 1], uniform_scales=False, overwrite=False, theta_scaler=no_scaler, y_scaler=no_scaler, gp_opt_method='l-bfgs-b', gp_nopt=3, optimizer_kwargs={'adaptive': True, 'fatol': 0.001, 'maxiter': 100, 'xatol': 0.0001}, hyperopt_method='cv', regularize=True, amp_0=1.0, mu_0=1.0, sigma_0=2.0, cv_folds=5, cv_scoring='mse', cv_n_candidates=100, cv_stage2_candidates=50, cv_stage2_width=0.5, cv_stage3_candidates=25, cv_stage3_width=0.25, cv_weighted_factor=1.0, multi_proc=True)[source]

Initialize the Gaussian Process surrogate model with specified kernel and hyperparameters.

This function sets up a Gaussian Process (GP) using the george library with the specified kernel type and configuration. The GP is initialized with random scale lengths and then fitted to the current training data.

Parameters:
  • kernel

    (str or george kernel object, optional) Kernel type for the Gaussian Process. Can be either a string specifying one of the built-in kernels or a george kernel object. Default is “ExpSquaredKernel”.

    Built-in options:
    • 'ExpSquaredKernel': Squared exponential (RBF) kernel, smooth functions

    • 'Matern32Kernel': Matérn kernel with ν=3/2, moderately smooth functions

    • 'Matern52Kernel': Matérn kernel with ν=5/2, smooth functions

    • 'RationalQuadraticKernel': Rational quadratic kernel, scale mixture of RBF kernels

    See https://george.readthedocs.io/en/latest/user/kernels/ for more details.

  • fit_amp – (bool, optional) Whether to optimize the amplitude (overall scale) hyperparameter of the kernel. If True, the GP will learn the optimal amplitude from data. Default is True.

  • fit_mean – (bool, optional) Whether to optimize the mean function hyperparameter. If True, the GP will learn a constant mean offset. If False, assumes zero mean. Default is True.

  • fit_white_noise – (bool, optional) Whether to optimize the white noise (nugget) hyperparameter. If True, the GP will learn the optimal noise level. If False, uses the fixed value from white_noise. Default is True.

  • white_noise – (float, optional) Log-scale white noise parameter. If fit_white_noise=False, this fixed value is used. If fit_white_noise=True, this serves as the initial guess. Typical values are between -15 (very low noise) and -5 (high noise). Default is -12.

  • gp_scale_rng – (list of two floats, optional) Log-scale bounds for the characteristic length scale parameters of the kernel. Format: [log_min_scale, log_max_scale]. These bounds apply to all input dimensions. Default is [-2, 2], corresponding to scales between ~0.14 and ~7.4 in original units.

  • uniform_scales – (bool, optional) If True, the same scale length will be used for all input dimensions. If False, each dimension will have its own independent scale length. Default is False.

  • overwrite – (bool, optional) If True, allows reinitializing the GP even if one already exists. If False and a GP already exists, raises an AssertionError. Default is False.

  • theta_scaler – (sklearn transformer, optional, default=no_scaler) Scaler for input parameters. Applied to theta values before GP training. Common options: MinMaxScaler() (scale to [0,1]) or StandardScaler()

  • y_scaler – (sklearn transformer, optional, default=no_scaler) Scaler for output values (log-likelihoods). Options include: - no_scaler: No scaling (default) - minmax_scaler: Scale to [0,1] - nlog_scaler: Apply -log10(-y) transformation for negative log-likelihoods - log_scaler: Apply log10(y) for positive values

  • gp_opt_method – (str, optional) Optimization method for GP hyperparameter optimization. Passed to scipy.optimize.minimize. Common options: ‘l-bfgs-b’, ‘newton-cg’, ‘bfgs’, ‘cg’. Default is ‘l-bfgs-b’.

  • gp_nopt – (int, optional) Number of optimization restarts for GP hyperparameter optimization. Multiple restarts help avoid local minima. Default is 3.

  • optimizer_kwargs – (dict, optional) Additional keyword arguments passed to the scipy optimizer. Common options include ‘maxiter’ (maximum iterations) and convergence tolerances. Default is {“maxiter”: 50}.

  • hyperopt_method

    (str, optional, default=’ml’) Method for optimizing GP hyperparameters:

    • ’ml’: Maximum marginal likelihood (fast, may overfit)

    • ’cv’: k-fold cross-validation (slower, prevents overfitting)

  • cv_folds – (int, optional, default=5) Number of folds for cross-validation (only used if hyperopt_method=’cv’).

  • cv_scoring – (str, optional, default=’mse’) Scoring metric for cross-validation. Options: ‘mse’, ‘mae’, ‘r2’.

  • cv_n_candidates – (int, optional, default=20) Number of hyperparameter candidates to evaluate for CV.

  • cv_stage2_candidates – (int, optional, default=20) Number of candidates for stage 2 grid search. Only used when cv_two_stage=True.

  • cv_stage2_width – (float, optional, default=0.3) Width factor for stage 2 search around best parameters from stage 1. Smaller values = tighter search. Only used when cv_two_stage=True.

  • cv_stage3_candidates – (int, optional, default=None) Number of candidates for stage 3 ultra-fine search. If None, uses max(cv_stage2_candidates // 2, 3). Only used when cv_three_stage=True.

  • cv_stage3_width – (float, optional, default=0.2) Width factor for stage 3 search around best parameters from stage 2. Should be smaller than cv_stage2_width for finer refinement. Only used when cv_three_stage=True.

Raises:
  • AssertionError – If a GP already exists and overwrite=False.

  • ValueError – If an invalid kernel name is provided.

  • Exception – If GP initialization fails after multiple attempts with different scale lengths.

Note

This function must be called after init_samples() since it requires training data to initialize the GP. The function will automatically retry initialization with different random scale lengths if the initial attempt fails.

>>> # Basic initialization with default settings
>>> sm.init_gp()

>>> # Custom kernel with specific hyperparameter settings
>>> sm.init_gp(kernel="Matern52Kernel", 
...            fit_white_noise=False, 
...            white_noise=-10,
...            gp_scale_rng=[-1, 1])

>>> # High-precision optimization
>>> sm.init_gp(gp_opt_method="l-bfgs-b",
...            gp_nopt=5,
...            optimizer_kwargs={"maxiter": 100, "ftol": 1e-9})
eval_gp_at_iteration(iter, return_var=False)[source]
surrogate_log_likelihood(theta_xs, iter=-1, return_var=False)[source]

Evaluate predictive mean of the GP at point(s) theta_xs

This method is vectorized to handle both single parameter vectors and arrays of parameter vectors efficiently.

Parameters:
  • theta_xs (array-like) – Point(s) to evaluate GP mean at. Can be: - 1D array of shape (ndim,) for single point - 2D array of shape (npoints, ndim) for multiple points

  • iter (int, optional) – Iteration number of GP to use. If -1, uses most recent GP.

  • return_var (bool, optional) – Whether to also return variance predictions.

Returns:

  • ypred (array) – GP mean(s) evaluated at theta_xs. Shape matches input.

  • varpred (array, optional) – GP variance(s) if return_var=True.

Return type:

array or tuple of arrays

surrogate_likelihood(theta_xs)[source]

Evaluate predictive probability (not log-probability) of the GP at point(s) theta_xs

This method is vectorized to handle both single parameter vectors and arrays of parameter vectors efficiently.

Parameters:

theta_xs (array-like) – Point(s) to evaluate GP probability at. Can be: - 1D array of shape (ndim,) for single point - 2D array of shape (npoints, ndim) for multiple points

Returns:

GP probability/probabilities evaluated at theta_xs. Shape matches input.

Return type:

float or array

create_cached_surrogate_likelihood(iter=-1, return_var=False)[source]

Create a cached surrogate likelihood function that computes the GP once and can be evaluated multiple times without recomputing the GP.

This is useful when you need to evaluate the surrogate likelihood at many different points with the same GP configuration, as it avoids the expensive GP computation (gp.compute()) on each call.

Parameters:

iter (int, optional) – Iteration number of GP to use. If -1, uses most recent GP.

Returns:

A cached likelihood function that can be called with theta_xs

Return type:

callable

find_max_surrogate(nopt=10, method='l-bfgs-b')[source]

Find the parameter values that maximize the GP surrogate model prediction.

Uses multi-start numerical optimization on the negative surrogate log likelihood to locate the global maximum of the surrogate surface.

Parameters:
  • nopt (int, optional) – Number of optimization restarts. More restarts reduce the chance of converging to a local maximum. Default is 10.

  • method (str, optional) – Scipy optimization method. Default is “l-bfgs-b”.

Returns:

  • theta_max (ndarray of shape (ndim,)) – Parameter values at the surrogate maximum (unscaled).

  • y_max (float) – Surrogate log likelihood value at the maximum.

Return type:

tuple

find_next_point(nopt=3, optimizer_kwargs={})[source]

Find next set of (theta, y) training points by maximizing the active learning utility function.

Parameters:
  • nopt – (int, optional) Number of times to restart the objective function optimization. Defaults to 1. Increase to avoid converging to local minima.

  • optimizer_kwargs – (dict, optional) Additional keyword arguments passed to scipy optimizer. Default is {}.

active_train(niter=100, algorithm='bape', gp_opt_freq=20, save_progress=False, obj_opt_method='l-bfgs-b', nopt=5, optimizer_kwargs={}, use_grad_opt=True, show_progress=True, allow_opt_multiproc=True, max_attempts=10)[source]

Perform active learning to iteratively improve the surrogate model.

Uses acquisition functions to intelligently select new training points that will most improve the Gaussian Process model. Different algorithms balance exploration (uncertainty reduction) vs exploitation (finding optima).

Parameters:
  • niter – (int, optional, default=100) Number of active learning iterations. Each iteration adds one new training point.

  • algorithm – (str, optional, default=”bape”) Active learning algorithm. Options: - “bape”: Bayesian Active Parameter Estimation (exploration-focused) - “jones”: Jones algorithm (exploitation-focused, good for optimization) - “agp”: Augmented Gaussian Process (balanced) - “alternate”: Alternates between exploration and exploitation

  • gp_opt_freq – (int, optional, default=20) Frequency of GP hyperparameter re-optimization. GP hyperparameters are re-optimized every gp_opt_freq iterations. Lower values = more optimization.

  • save_progress – (bool, optional, default=False) Whether to save training progress data for later analysis.

  • obj_opt_method – (str, optional, default=”nelder-mead”) Optimization method for acquisition function. Options: - “l-bfgs-b”: L-BFGS-B (good with gradients) - “nelder-mead”: Nelder-Mead simplex (gradient-free)

  • nopt – (int, optional, default=1) Number of optimization restarts for acquisition function. Higher values help avoid local minima but increase computation time.

  • use_grad_opt – (bool, optional, default=True) Whether to use gradient information if available. Set False for gradient-free optimization.

  • optimizer_kwargs – (dict, optional, default={}) Additional keyword arguments passed to the optimizer.

  • show_progress – (bool, optional, default=True) Whether to display progress bar during training.

Note

Active learning algorithms have different purposes:

  • BAPE: Best for uncertainty quantification and space-filling

  • Jones: Best for finding likelihood maxima/minima (optimization)

  • Alternate: Good balance for both exploration and exploitation

  • AGP: Another balanced approach

The method automatically handles GP re-training and hyperparameter optimization based on the specified frequency. Training data is accumulated in _theta and _y attributes.


Basic active learning with BAPE:

>>> sm.active_train(niter=50, algorithm="bape")

Optimization-focused active learning:

>>> sm.active_train(niter=30, algorithm="jones", gp_opt_freq=10)

Balanced approach with frequent GP optimization:

>>> sm.active_train(niter=40, algorithm="alternate", gp_opt_freq=5)
active_train_parallel(niter=100, nchains=4, algorithm='bape', gp_opt_freq=20, obj_opt_method='nelder-mead', nopt=1, use_grad_opt=True, optimizer_kwargs={}, show_progress=True)[source]

Run multiple active learning chains in parallel.

Parameters:
  • niter – (int, optional) Number of iterations per chain. Default 100.

  • nchains – (int, optional) Number of parallel chains to run. Default 4.

  • algorithm – (str, optional) Active learning algorithm. Default “bape”.

  • gp_opt_freq – (int, optional) Frequency of GP hyperparameter optimization. Default 20.

  • obj_opt_method – (str, optional) Optimization method for acquisition function. Default “nelder-mead”.

  • nopt – (int, optional) Number of restarts for acquisition optimization. Default 1.

  • use_grad_opt – (bool, optional) Whether to use gradient-based optimization. Default True.

  • optimizer_kwargs – (dict, optional) Additional optimizer kwargs. Default {}.

  • show_progress – (bool, optional) Whether to display progress bar during parallel chain execution. Default is True.

Notes

This function uses multiprocessing.Pool instead of threading, which can provide better performance for CPU-intensive tasks and avoids GIL limitations. However:

  • All model data must be pickleable (which it should be for SurrogateModel)

  • Each process runs in separate memory space (higher memory usage)

  • Process startup overhead is higher than threading

  • Better isolation between chains (one chain failure won’t affect others)

  • Can achieve true parallelism on multi-core systems

The function automatically respects the ncore limit and won’t create more processes than specified in self.ncore.

lnprob(theta)[source]

Log probability function used for emcee, which sums the prior with the surrogate model likelihood

\[\ln P(\theta | x) \propto \ln P(x | \theta) + \ln P(\theta)\]

where ln P(x | theta) is the surrogate likelihood function and ln P(theta) is the prior function.

Parameters:

theta – (array, required) Array of model input parameters to evaluate model probability at.

find_map(theta0=None, prior_fn=None, method='nelder-mead', nRestarts=15, options=None)[source]
run_emcee(like_fn=None, prior_fn=None, nwalkers=None, nsteps=50000, sampler_kwargs={}, run_kwargs={}, opt_init=False, multi_proc=True, prior_fn_comment=None, burn=None, thin=None, samples_file=None, min_ess=10000)[source]

Sample the posterior using the emcee affine-invariant MCMC algorithm.

This method uses the emcee package to perform Markov Chain Monte Carlo (MCMC) sampling on either the trained GP surrogate model or the true likelihood function. The affine-invariant ensemble sampler is robust and works well for a wide variety of posterior shapes without requiring manual tuning of step sizes.

Parameters:
  • like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.

  • prior_fn – (callable or None, optional) Log-prior function with signature prior_fn(theta). Should return log-probability density. If None, uses uniform prior with bounds from self.bounds. Default is None.

  • nwalkers – (int or None, optional) Number of MCMC walkers in the ensemble. Should be at least 2*ndim. If None, defaults to 10*ndim. More walkers improve convergence but increase computational cost. Default is None.

  • nsteps – (int, optional) Number of MCMC steps per walker. Total number of likelihood evaluations will be nwalkers * nsteps. Default is 50000.

  • sampler_kwargs – (dict, optional) Additional keyword arguments passed to emcee.EnsembleSampler constructor. Common options include: - ‘a’: Stretch move scale parameter (default: 2.0) - ‘moves’: Custom proposal moves Default is {}.

  • run_kwargs – (dict, optional) Additional keyword arguments passed to the run_mcmc() method. Common options include: - ‘progress’: Show progress bar (default: True) - ‘store’: Store chain in memory (default: True) Default is {}.

  • opt_init – (bool, optional) Whether to initialize walkers near the maximum a posteriori (MAP) estimate. If True, uses find_map() to locate starting point. If False, initializes walkers randomly from the prior. Default is False.

  • multi_proc – (bool, optional) Whether to use multiprocessing with self.ncore processes. Generally recommended for expensive likelihood evaluations. Default is True.

  • prior_fn_comment – (str or None, optional) Comment describing the prior function for logging purposes. If None and prior_fn is provided, attempts to extract function name. Default is None.

  • burn – (int or None, optional) Number of burn-in samples to discard from each walker. If None, automatically estimates burn-in using autocorrelation analysis. Default is None.

  • thin – (int or None, optional) Thinning factor - keep every thin-th sample to reduce autocorrelation. If None, automatically estimates based on autocorrelation time. Default is None.

  • samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.

  • min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set

sampleremcee.EnsembleSampler

The emcee sampler object containing full chain and metadata

emcee_samplesndarray of shape (nsamples_final, ndim)

Final MCMC samples after burn-in and thinning

emcee_samples_fullndarray of shape (nsteps, nwalkers, ndim)

Full MCMC chain before processing

emcee_samples_truendarray of shape (nsamples_final, ndim)

Final samples when using true likelihood (like_fn=”true”)

emcee_samples_gpndarray of shape (nsamples_final, ndim)

Final samples when using surrogate likelihood

emcee_runbool

Flag indicating emcee has been successfully run

emcee_runtimefloat

Wall-clock time taken for emcee sampling in seconds

nwalkersint

Number of walkers used

nstepsint

Number of steps per walker

burnint

Burn-in length used for final samples

thinint

Thinning factor used for final samples

iburnint

Automatically estimated burn-in length

ithinint

Automatically estimated thinning factor

acc_fracfloat

Mean acceptance fraction across all walkers

autcorr_timefloat

Mean autocorrelation time in steps

like_fn_namestr

Name of likelihood function used (“true”, “surrogate”, or “likelihood”)

prior_fn_commentstr

Description of prior function used


Sample surrogate model with default settings:

>>> sm.run_emcee()

Sample true likelihood with specific settings:

>>> sm.run_emcee(like_fn="true", nwalkers=100, nsteps=10000)

Use custom prior and optimize initialization:

>>> def log_prior(theta):
...     # Custom Gaussian prior
...     return -0.5 * np.sum((theta/2)**2)
>>> sm.run_emcee(prior_fn=log_prior, opt_init=True)

Run with manual burn-in and thinning:

>>> sm.run_emcee(nsteps=100000, burn=10000, thin=10)

References

emcee documentation: https://emcee.readthedocs.io/ Foreman-Mackey et al. (2013): “emcee: The MCMC Hammer”, PASP, 125, 306-312

run_dynesty(like_fn=None, prior_transform=None, mode='dynamic', sampler_kwargs={}, run_kwargs={}, multi_proc=False, save_iter=None, prior_transform_comment=None, samples_file=None, min_ess=10000)[source]

Sample the posterior using the dynesty nested sampling algorithm.

This method uses the dynesty package to perform nested sampling on either the trained GP surrogate model or the true likelihood function. Dynesty is particularly effective for estimating the Bayesian evidence and exploring multi-modal posteriors.

Parameters:
  • like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.

  • prior_transform – (callable or None, optional) Prior transformation function that maps from unit hypercube [0,1]^ndim to the parameter space. Should have signature prior_transform(u) where u is array of shape (ndim,) with values in [0,1]. If None, uses uniform prior with bounds from self.bounds. Default is None.

  • mode – ({“dynamic”, “static”}, optional) Dynesty sampling mode. “dynamic” uses DynamicNestedSampler which adaptively allocates live points, while “static” uses fixed number of live points. Dynamic mode is generally more efficient. Default is “dynamic”.

  • sampler_kwargs – (dict, optional) Additional keyword arguments passed to the dynesty sampler constructor. Common options include: - ‘nlive’: Number of live points (default: 50*ndim) - ‘bound’: Bounding method (‘multi’, ‘single’, ‘none’) - ‘sample’: Sampling method (‘auto’, ‘unif’, ‘rwalk’, ‘slice’, ‘rslice’, ‘hslice’) Default is {}.

  • run_kwargs – (dict, optional) Additional keyword arguments passed to the run_nested() method. Common options include: - ‘dlogz’: Target evidence uncertainty (default: 0.5) - ‘maxiter’: Maximum number of iterations (default: 50000) - ‘wt_kwargs’: Weight function arguments (default: {‘pfrac’: 1.0}) - ‘stop_kwargs’: Stopping criterion arguments (default: {‘pfrac’: 1.0}) Default is {}.

  • multi_proc – (bool, optional) Whether to use multiprocessing. If True, uses self.ncore processes. Note that multiprocessing can sometimes be slower due to overhead. Default is False.

  • save_iter – (int or None, optional) If provided, saves the sampler state every save_iter iterations to allow for checkpointing and resuming long runs. Saves to ‘{savedir}/dynesty_sampler_{like_fn_name}.pkl’. Default is None.

  • prior_transform_comment – (str or None, optional) Comment describing the prior transform for logging purposes. If None and prior_transform is provided, attempts to extract function name. Default is None.

  • samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.

  • min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set

resdynesty.results.Results

Complete dynesty results object containing samples, weights, evidence, etc.

dynesty_samplesndarray of shape (nsamples, ndim)

Resampled posterior samples with equal weights

dynesty_samples_truendarray of shape (nsamples, ndim)

Posterior samples when using true likelihood (like_fn=”true”)

dynesty_samples_surrogatendarray of shape (nsamples, ndim)

Posterior samples when using surrogate likelihood

dynesty_runbool

Flag indicating dynesty has been successfully run

dynesty_runtimefloat

Wall-clock time taken for dynesty sampling in seconds

like_fn_namestr

Name of likelihood function used (“true”, “surrogate”, or “custom”)

prior_transform_commentstr

Description of prior transform used

Notes

Dynesty is particularly well-suited for: - Computing Bayesian evidence for model comparison - Exploring multi-modal posteriors - Providing robust posterior sampling without tuning

The default settings prioritize posterior sampling over evidence estimation by setting pfrac=1.0, which focuses computational effort on high-likelihood regions rather than exploring the full prior volume.

Examples

Sample surrogate model with default settings:

>>> sm.run_dynesty()

Sample true likelihood with more live points:

>>> sm.run_dynesty(like_fn="true", sampler_kwargs={'nlive': 1000})

Use custom prior with bounds [-5, 5] for each parameter:

>>> def my_prior(u):
...     return 10*u - 5  # maps [0,1] to [-5,5]
>>> sm.run_dynesty(prior_transform=my_prior)

Run with checkpointing every 1000 iterations:

>>> sm.run_dynesty(save_iter=1000, run_kwargs={'maxiter': 50000})

References

Dynesty documentation: https://dynesty.readthedocs.io/ Speagle (2020): “dynesty: a dynamic nested sampling package for estimating Bayesian posteriors and evidences”, MNRAS, 493, 3132-3158

run_pymultinest(like_fn=None, prior_transform=None, sampler_kwargs={}, multi_proc=True, prior_transform_comment=None, samples_file=None, prefix=None, resume=False, n_clustering_params=None, outputfiles_basename=None, min_ess=10000)[source]

Sample the posterior using the PyMultiNest nested sampling algorithm.

This method uses the PyMultiNest package (Python wrapper for MultiNest) to perform nested sampling on either the trained GP surrogate model or the true likelihood function. MultiNest is particularly effective for multi-modal posteriors and computing Bayesian evidence with high accuracy.

Parameters:
  • like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.

  • prior_transform – (callable or None, optional) Prior transformation function that maps from unit hypercube [0,1]^ndim to the parameter space. Should have signature prior_transform(cube) where cube is array of shape (ndim,) with values in [0,1]. The function should modify cube in-place. If None, uses uniform prior with bounds from self.bounds. Default is None.

  • sampler_kwargs – (dict, optional) Additional keyword arguments passed to pymultinest.run(). Common options include: - ‘n_live_points’: Number of live points (default: 1000) - ‘evidence_tolerance’: Target evidence uncertainty (default: 0.5) - ‘sampling_efficiency’: Sampling efficiency parameter (default: 0.8) - ‘n_iter_before_update’: Iterations before evidence/posterior update (default: 100) - ‘null_log_evidence’: Null evidence for model comparison (default: -1e90) - ‘max_modes’: Maximum number of modes to find (default: 100) - ‘mode_tolerance’: Mode separation tolerance (default: -1e90) - ‘seed’: Random seed for reproducibility (default: -1, auto) - ‘verbose’: Verbosity level (default: True) - ‘importance_nested_sampling’: Use importance nested sampling (default: True) - ‘multimodal’: Enable multimodal mode detection (default: True) - ‘const_efficiency_mode’: Use constant efficiency mode (default: False) Default is {}.

  • multi_proc

    (bool, optional) Whether to use multiprocessing. If True, uses self.ncore processes. MultiNest handles parallelization internally when MPI is available.

    Note

    This parameter is ignored for PyMultiNest as it uses MPI for parallelization, not Python’s multiprocessing. When PyMultiNest runs with MPI, other alabi functions automatically disable their multiprocessing pools to avoid conflicts.

    Default is True.

  • prior_transform_comment – (str or None, optional) Comment describing the prior function for logging purposes. If None and prior_transform is provided, attempts to extract function name. Default is None.

  • samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.

  • prefix – (str or None, optional) Prefix for MultiNest output files. If None, uses default based on likelihood function name and current directory. Default is None.

  • resume – (bool, optional) Whether to resume from previous run if output files exist. Default is False.

  • n_clustering_params – (int or None, optional) Number of parameters to use for mode clustering. If None, uses all parameters (ndim). Set to lower value if some parameters are nuisance. Default is None.

  • outputfiles_basename – (str or None, optional) Base name for MultiNest output files. If None, constructs from savedir and likelihood function name. Default is None.

  • min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set

pymultinest_samplesndarray of shape (nsamples, ndim)

Posterior samples from MultiNest

pymultinest_samples_truendarray of shape (nsamples, ndim)

Posterior samples when using true likelihood (like_fn=”true”)

pymultinest_samples_surrogatendarray of shape (nsamples, ndim)

Posterior samples when using surrogate likelihood

pymultinest_weightsndarray of shape (nsamples,)

Sample weights from nested sampling

pymultinest_logzfloat

Log Bayesian evidence estimate

pymultinest_logz_errfloat

Uncertainty in log evidence estimate

pymultinest_runbool

Flag indicating PyMultiNest has been successfully run

pymultinest_runtimefloat

Wall-clock time taken for MultiNest sampling in seconds

pymultinest_analyzerpymultinest.Analyzer

MultiNest analyzer object for accessing detailed results

like_fn_namestr

Name of likelihood function used (“true”, “surrogate”, or “custom”)

prior_transform_commentstr

Description of prior function used

Note

PyMultiNest is particularly well-suited for:

  • Multi-modal posterior exploration with automatic mode detection

  • High-accuracy Bayesian evidence computation for model comparison

  • Robust sampling without manual tuning of MCMC parameters

  • Handling complex, irregular posterior shapes

MultiNest generates several output files including detailed posterior samples, evidence estimates, and mode information. These files are saved to the model’s savedir for later analysis.

MPI and Multiprocessing Compatibility:

PyMultiNest uses MPI for parallelization across multiple nodes/cores. When MPI is active, alabi automatically disables Python multiprocessing in other functions (run_emcee, run_dynesty) to prevent conflicts. This ensures that:

  • PyMultiNest can run efficiently with MPI

  • Other alabi functions fall back to serial execution when MPI is detected

  • No deadlocks or resource conflicts occur between MPI and multiprocessing

To run PyMultiNest with MPI:

>>> # Single node, multiple cores
>>> sm.run_pymultinest()  # Uses OpenMP if available
>>> # Multiple nodes with MPI (run from command line)
>>> # mpirun -n 4 python your_script.py
example:

Sample surrogate model with default settings:

>>> sm.run_pymultinest()

Sample true likelihood with more live points:

>>> sm.run_pymultinest(like_fn="true", 
...                   sampler_kwargs={'n_live_points': 2000})

Use custom prior with bounds [-10, 10] for each parameter:

>>> def my_prior(cube):
...     for i in range(len(cube)):
...         cube[i] = 20*cube[i] - 10  # maps [0,1] to [-10,10]
>>> sm.run_pymultinest(prior_transform=my_prior)

Enable multimodal mode detection with high accuracy:

>>> sm.run_pymultinest(sampler_kwargs={
...     'multimodal': True,
...     'evidence_tolerance': 0.1,
...     'max_modes': 20
... })

Run with custom output file prefix and resume capability:

>>> sm.run_pymultinest(prefix="my_run_", resume=True)

References

PyMultiNest documentation: https://johannesbuchner.github.io/PyMultiNest/ Feroz et al. (2009): “MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics”, MNRAS, 398, 1601-1614

run_ultranest(like_fn=None, prior_transform=None, sampler_kwargs={}, run_kwargs={}, multi_proc=False, prior_transform_comment=None, samples_file=None, log_dir=None, resume='overwrite', min_ess=10000, slice_steps=0)[source]

Sample the posterior using the UltraNest nested sampling algorithm.

This method uses the UltraNest package to perform nested sampling on either the trained GP surrogate model or the true likelihood function. UltraNest is a highly robust nested sampling algorithm that automatically adapts to the problem complexity and provides reliable evidence computation.

Parameters:
  • like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.

  • prior_transform – (callable or None, optional) Prior transformation function that maps from unit hypercube [0,1]^ndim to the parameter space. Should have signature prior_transform(cube) where cube is array of shape (ndim,) with values in [0,1]. Must return transformed parameters as array. If None, creates uniform prior from self.bounds. Default is None.

  • sampler_kwargs – (dict, optional) Additional keyword arguments passed to UltraNest ReactiveNestedSampler(). Common options include: - ‘derived_param_names’: List of derived parameter names (default: []) - ‘wrapped_params’: List of bool indicating circular parameters (default: None) - ‘resume’: Resume behavior ‘resume’/’overwrite’/’subfolder’ (default: ‘subfolder’) - ‘run_num’: Run number for subdirectory creation (default: None) - ‘num_test_samples’: Number of test samples for validation (default: 2) - ‘draw_multiple’: Enable dynamic point drawing (default: True) - ‘num_bootstraps’: Number of bootstrap samples (default: 30) - ‘vectorized’: Whether functions accept arrays (default: False) - ‘ndraw_min’: Minimum points to draw per iteration (default: 128) - ‘ndraw_max’: Maximum points to draw per iteration (default: 65536) - ‘storage_backend’: Storage format ‘hdf5’/’csv’/’tsv’ (default: ‘hdf5’) - ‘warmstart_max_tau’: Warmstart maximum tau (default: -1) Default is {}.

  • run_kwargs – (dict, optional) Additional keyword arguments passed to sampler.run(). Common options include: - ‘update_interval_volume_fraction’: Volume fraction for region updates (default: 0.8) - ‘update_interval_ncall’: Number of calls between updates (optional, omit for auto) - ‘log_interval’: Iterations between status updates (optional, omit for auto) - ‘show_status’: Show integration progress (default: True) - ‘viz_callback’: Visualization callback function (default: False, disabled) - ‘dlogz’: Target log-evidence uncertainty (default: 0.5) - ‘dKL’: Target posterior uncertainty in nats (default: 0.5) - ‘frac_remain’: Fraction of evidence remaining to terminate (default: 0.01) - ‘Lepsilon’: Likelihood contour tolerance (default: 0.001) - ‘min_ess’: Target effective sample size (default: 400) - ‘max_iters’: Maximum number of iterations (optional, omit for unlimited) - ‘max_ncalls’: Maximum number of likelihood calls (optional, omit for unlimited) - ‘max_num_improvement_loops’: Maximum improvement loops (default: -1) - ‘min_num_live_points’: Minimum number of live points (default: 400) - ‘cluster_num_live_points’: Live points per cluster (default: 40) - ‘insertion_test_zscore_threshold’: Z-score threshold for insertion test (default: 4) - ‘insertion_test_window’: Window size for insertion test (default: 10) - ‘region_class’: Region sampling class (optional, can be passed via kwargs) - ‘widen_before_initial_plateau_num_warn’: Warning threshold for plateau (default: 10000) - ‘widen_before_initial_plateau_num_max’: Maximum plateau points (optional, omit for auto) Default is {}.

  • multi_proc – (bool, optional) Deprecated and ignored. This parameter is kept for backwards compatibility but no longer has any effect. UltraNest now runs in MPI-compatible mode without multiprocessing pools to avoid conflicts with MPI environments. Default is False.

  • prior_transform_comment – (str or None, optional) Comment describing the prior transform for logging purposes. If None and prior_transform is provided, attempts to extract function name. Default is None.

  • samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.

  • log_dir – (str or None, optional) Directory to store UltraNest output files and logs. If None, uses a subdirectory in self.savedir. Default is None.

  • resume – (str, optional) Resume behavior for interrupted runs. Options: - ‘resume’: Resume if possible, otherwise start fresh - ‘resume-similar’: Resume with similar but not identical setup - ‘overwrite’: Always start fresh, overwriting existing files (default) - ‘subfolder’: Create new timestamped subfolder Default is ‘overwrite’.

  • min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set

ultranest_resultsultranest.integrator.Result

Complete UltraNest results object with samples, evidence, etc.

ultranest_samplesndarray of shape (nsamples, ndim)

Equally weighted posterior samples from UltraNest

ultranest_samples_truendarray of shape (nsamples, ndim)

Posterior samples when using true likelihood (like_fn=”true”)

ultranest_samples_surrogatendarray of shape (nsamples, ndim)

Posterior samples when using surrogate likelihood

ultranest_weightsndarray of shape (nsamples,)

Sample weights (typically all equal after resampling)

ultranest_logzfloat

Log Bayesian evidence estimate

ultranest_logz_errfloat

Uncertainty in log evidence estimate

ultranest_runbool

Flag indicating UltraNest has been successfully run

ultranest_runtimefloat

Wall-clock time taken for UltraNest sampling in seconds

ultranest_samplerultranest.ReactiveNestedSampler

UltraNest sampler object for accessing detailed information

like_fn_namestr

Name of likelihood function used (“true”, “surrogate”, or “custom”)

prior_transform_commentstr

Description of prior transform used

Note

UltraNest is particularly well-suited for:

  • Robust nested sampling without manual tuning

  • Automatic adaptation to problem complexity

  • High-dimensional and multi-modal problems

  • Reliable evidence computation for model comparison

  • Problems with complex, irregular likelihood shapes

  • MPI environments (runs in serial mode to avoid multiprocessing conflicts)

UltraNest automatically determines the number of live points and adapts its sampling strategy based on the problem characteristics. It provides excellent performance across a wide range of problems without requiring parameter tuning.

MPI Compatibility: This function runs UltraNest in serial mode, making it fully compatible with MPI environments. While UltraNest itself can use MPI for parallelization, this implementation avoids multiprocessing pools that can conflict with MPI.

example:

Sample surrogate model with default settings:

>>> sm.run_ultranest()

Sample true likelihood with custom termination criteria:

>>> sm.run_ultranest(like_fn="true", 
...                  run_kwargs={'dlogz': 0.1, 'min_ess': 1000})

Use custom prior transform with bounds [-5, 5] for each parameter:

>>> def my_prior_transform(cube):
...     return 10 * cube - 5  # maps [0,1] to [-5,5]
>>> sm.run_ultranest(prior_transform=my_prior_transform)

Run with increased live points for better accuracy:

>>> sm.run_ultranest(like_fn="true",
...                  run_kwargs={'min_num_live_points': 800})

Customize output directory and resume behavior:

>>> sm.run_ultranest(log_dir="ultranest_output", 
...                  resume="overwrite")

References

UltraNest documentation: https://johannesbuchner.github.io/UltraNest/ Buchner (2021): “UltraNest - a robust, general purpose Bayesian inference library for cosmology and particle physics”, Journal of Open Source Software

plot(plots=None, show=False, cb_rng=[None, None], log_scale=False)[source]

Generate diagnostic plots for training progress, GP performance, and MCMC results.

This method creates various diagnostic plots to assess the quality of the surrogate model training, GP hyperparameter optimization, and MCMC sampling results. Plots are automatically saved to the model’s save directory.

Parameters:
  • plots

    (list of str, optional) List of plot types to generate. Each plot requires specific data to be available (e.g., ‘emcee_corner’ requires run_emcee() to have been called first). If None, no plots are generated. Available options:

    Training diagnostics: - ‘test_mse’: Mean squared error vs training iteration - ‘test_scaled_mse’: Scaled MSE vs training iteration - ‘test_log_mse’: Log-scale MSE vs training iteration - ‘gp_hyperparameters’: GP hyperparameter evolution during training - ‘gp_train_time’: GP training time vs iteration - ‘gp_train_corner’: Corner plot of final training samples - ‘gp_train_scatter’: Scatter plot of training samples vs predictions

    GP visualization: - ‘gp_fit_2D’: 2D contour plot of GP surrogate surface (2D only) - ‘gp_predictions_1D’: 1D slices of GP mean and variance through the surrogate maximum

    MCMC diagnostics: - ‘emcee_corner’: Corner plot of emcee posterior samples - ‘emcee_walkers’: Walker trajectories for emcee chains - ‘dynesty_corner’: Corner plot of dynesty posterior samples - ‘dynesty_corner_kde’: KDE version of dynesty corner plot - ‘dynesty_traceplot’: Trace plot of dynesty sampling - ‘dynesty_runplot’: Dynesty convergence diagnostics

    Comparison plots: - ‘mcmc_comparison’: Compare emcee and dynesty posteriors

    Convenience options: - ‘gp_all’: Generate all available GP training plots

    Default is None.

  • show – (bool, optional) Whether to display plots interactively in addition to saving them. If False, plots are only saved to disk. Default is False.

  • cb_rng – (list of [float, float], optional) Colorbar range for 2D contour plots as [vmin, vmax]. If [None, None], uses automatic range determination. Only applies to plots with colorbars like ‘gp_fit_2D’. Default is [None, None].

  • log_scale – (bool, optional) Whether to use logarithmic color scale for 2D contour plots. If True, applies matplotlib.colors.LogNorm to the colorbar. Only applies to plots with colorbars. Default is False.

Returns:

None or matplotlib.figure.Figure Some individual plots may return figure objects for further customization.

Raises:
  • NameError – If required data for a requested plot is not available (e.g., requesting ‘emcee_corner’ before running run_emcee()).

  • AttributeError – If the model has not been properly initialized or trained.

Notes

Plots are automatically saved to the model’s save directory (self.savedir) with descriptive filenames. The save directory is created if it doesn’t exist.

Training diagnostic plots help assess: - Convergence of active learning process - Quality of GP hyperparameter optimization - Efficiency of training sample selection

MCMC diagnostic plots help assess: - Posterior sampling convergence - Chain mixing and autocorrelation - Comparison between different sampling methods

Examples

Generate all GP training plots:

>>> sm.plot(plots=['gp_all'])

Create MCMC comparison plots:

>>> sm.run_emcee()
>>> sm.run_dynesty()
>>> sm.plot(plots=['emcee_corner', 'dynesty_corner', 'mcmc_comparison'])

Generate 2D GP visualization with custom colorbar:

>>> sm.plot(plots=['gp_fit_2D'], cb_rng=[-10, 0], log_scale=True)

Show plots interactively:

>>> sm.plot(plots=['test_mse', 'gp_hyperparameters'], show=True)
get_chain_diversity_metrics()[source]

Calculate diversity metrics for the combined training samples. Useful for assessing the effectiveness of parallel chains.

class alabi.core.CachedSurrogateLikelihood(gp_iter, _y_cond, theta_scaler, y_scaler, ndim, return_var=False)[source]

Bases: object

A picklable cached surrogate likelihood function.

This class creates a callable object that caches the GP computation and can be pickled for use with multiprocessing.

__init__(gp_iter, _y_cond, theta_scaler, y_scaler, ndim, return_var=False)[source]

Initialize the cached surrogate likelihood.

Parameters:
  • gp_iter – Pre-computed GP object

  • y_cond – Training target values

  • theta_scaler – Parameter scaler object

  • y_scaler – Target scaler object

  • ndim – Number of dimensions

alabi.gp_utils module

gp_utils.py

Gaussian process utility functions for initializing GPs and optimizing their hyperparameters.

alabi.gp_utils.configure_gp(theta, y, kernel, fit_amp=True, fit_mean=True, fit_white_noise=False, white_noise=-12, hyperparameters=None)[source]

Configure and initialize a Gaussian Process with robust error handling.

Creates a george.GP object with the specified kernel and configuration options. Includes automatic fixes for common numerical issues such as singular matrices, duplicate points, and poor conditioning.

Parameters:
  • theta – (array-like, shape (n_samples, n_features)) Training input locations (parameters). Must contain finite values only.

  • y – (array-like, shape (n_samples,)) Training target values (function evaluations). Must contain finite values only.

  • kernel – (george kernel object) George kernel object defining the covariance function. Common options include ExpSquaredKernel, Matern32Kernel, Matern52Kernel.

  • fit_amp – (bool, optional, default=True) Whether to fit the kernel amplitude. If True, scales the kernel by the variance of y to improve conditioning.

  • fit_mean – (bool, optional, default=True) Whether to fit a constant mean function. If True, initializes mean to median(y) and allows optimization.

  • fit_white_noise – (bool, optional, default=False) Whether to fit the white noise (nugget) parameter. If True, the noise level will be optimized during hyperparameter tuning.

  • white_noise – (float, optional, default=-12) Log-scale white noise parameter. Acts as regularization to prevent singular matrices. More negative values = less noise.

  • hyperparameters – (array-like, optional, default=None) Pre-specified hyperparameters to set. If provided, these values are used instead of the kernel’s default initialization.

Returns:

george.GP or None Configured and computed GP object ready for predictions, or None if configuration failed despite all attempted fixes.

Raises:
  • ValueError – If theta or y contain non-finite values (NaN or inf).

  • LinAlgError – If GP computation fails due to singular covariance matrix, automatically attempts several fixes before giving up.

alabi.gp_utils.optimize_gp(gp, _theta, _y, gp_hyper_prior, p0, bounds=None, method='l-bfgs-b', optimizer_kwargs=None, regularize=True, amp_0=1.0, mu_0=1.0, sigma_0=2.0, lengthscale_indices=None)[source]

Optimize Gaussian Process hyperparameters by maximizing marginal likelihood.

Performs hyperparameter optimization for a Gaussian Process using scipy’s minimize function. Supports multiple optimization restarts and automatically selects the result with highest marginal likelihood.

Parameters:
  • gp – (george.GP) Configured Gaussian Process object. Should be computed with training data.

  • _theta – (array-like, shape (n_samples, n_features)) Training input locations (parameters). Will be squeezed if 1D.

  • _y – (array-like, shape (n_samples,)) Training target values (function evaluations). Will be squeezed if 1D.

  • gp_hyper_prior – (callable) Prior function for hyperparameters. Should return log-probability density for given hyperparameter vector. Used to constrain optimization.

  • p0 – (array-like, shape (n_restarts, n_hyperparams) or (n_hyperparams,)) Initial guesses for hyperparameter optimization. If 2D, performs multiple restarts with different initializations.

  • bounds – (list of tuples, optional, default=None) Bounds for hyperparameter optimization as [(min, max), …]. Only used for methods that support bounds (e.g., ‘l-bfgs-b’).

  • method

    (str, optional, default=”l-bfgs-b”) Scipy optimization method. Supported methods:

    • ’l-bfgs-b’: L-BFGS-B with bounds support (default)

    • ’newton-cg’: Newton-CG with gradients

    • ’bfgs’: BFGS (no bounds support)

    • ’powell’: Powell method (derivative-free)

  • optimizer_kwargs – (dict, optional, default=None) Additional keyword arguments passed to scipy.optimize.minimize. If None, uses method-specific defaults optimized for GP optimization.

  • max_iter – (int, optional, default=50) Maximum number of iterations for optimization. Used as default in optimizer_kwargs if not specified.

  • regularize – (bool, optional, default=False) Whether to apply Hvarfner dimensionality-scaled regularization (Equation 4 from “Vanilla Bayesian Optimization Performs Great in High Dimensions”). When True, lengthscale priors are scaled as LogNormal(μ_0 + log(√d), σ_0).

  • mu_0 – (float, optional, default=0.0) Base location parameter for Hvarfner regularization. Only used if regularize=True.

  • sigma_0 – (float, optional, default=1.0) Scale parameter for Hvarfner regularization. Only used if regularize=True.

  • lengthscale_indices – (list of int, optional, default=None) Indices in the parameter vector corresponding to lengthscale parameters. Only used if regularize=True. If None, attempts to infer from kernel.

Returns:

george.GP or None GP object with optimized hyperparameters, or None if optimization failed. The returned GP is recomputed with optimal hyperparameters.

alabi.gp_utils.optimize_gp_kfold_cv(gp, _theta, _y, hyperparameter_candidates, y_scaler, k_folds=5, scoring='mse', pool=None, stage2_candidates=None, stage2_width=0.5, stage3_candidates=None, stage3_width=0.2, weighted_mse_method='exponential', weighted_mse_factor=1.0, verbose=True)[source]

Optimize Gaussian Process hyperparameters using k-fold cross-validation.

This function evaluates different hyperparameter configurations using k-fold cross-validation to select the configuration that generalizes best to unseen data. This approach helps prevent overfitting compared to standard marginal likelihood maximization.

Parameters:
  • gp – (george.GP) Configured Gaussian Process object template. Will be copied for each CV fold.

  • _theta – (array-like, shape (n_samples, n_features)) Training input locations (parameters). Must have at least k_folds samples.

  • _y – (array-like, shape (n_samples,)) Training target values (function evaluations). Must match _theta length.

  • gp_hyper_prior – (callable) Prior function for hyperparameters. Should return log-probability density for given hyperparameter vector. Used to constrain search space.

  • hyperparameter_candidates – (array-like, shape (n_candidates, n_hyperparams)) Array of hyperparameter vectors to evaluate via cross-validation. Each row represents one hyperparameter configuration to test.

  • k_folds – (int, optional, default=5) Number of folds for cross-validation. Must be >= 2 and <= n_samples. Common choices: 5 or 10 for good bias-variance tradeoff.

  • scoring

    (str, optional, default=’mse’) Scoring metric for cross-validation. Supported options:

    • ’mse’: Mean Squared Error (lower is better)

    • ’mae’: Mean Absolute Error (lower is better)

    • ’r2’: R-squared coefficient (higher is better)

    • ’weighted_mse’: Weighted MSE giving higher weight to high-probability regions

  • pool – (multiprocessing.Pool, optional, default=None) Multiprocessing pool for parallel evaluation of hyperparameter candidates. If None, evaluation runs sequentially. If provided, candidates are evaluated in parallel using the pool’s worker processes.

  • stage2_candidates – (int, optional, default=None) Number of candidates for stage 2 grid search. If None, uses len(hyperparameter_candidates) // 2.

  • stage2_width – (float, optional, default=0.5) Width factor for stage 2 search around best parameters. Smaller values = tighter search, larger values = wider search.

  • stage3_candidates – (int, optional, default=None) Number of candidates for stage 3 ultra-fine search. If None, uses max(stage2_candidates // 2, 3).

  • stage3_width – (float, optional, default=0.2) Width factor for stage 3 search around stage 2 best parameters. Should be smaller than stage2_width for finer refinement.

  • weighted_mse_method – (str, optional, default=’exponential’) Weighting method for weighted_mse scoring. Options: - ‘exponential’: w = exp(y_true / temperature) - ‘linear’: w = y_true - min(y_true) + epsilon - ‘softmax’: w = softmax(y_true / temperature) - ‘rank’: w based on rank order of y_true

  • weighted_mse_factor – (float, optional, default=1.0) Temperature parameter for exponential/softmax weighting. Lower values emphasize high-probability regions more strongly.

Returns:

  • best_gp (george.GP) – GP with optimal hyperparameters set

  • best_hyperparams (array) – Best hyperparameter vector

  • cv_results (dict) – Detailed CV results with scores and statistics

Raises:
  • ValueError – If insufficient data, invalid parameters, or no valid hyperparameters

  • RuntimeError – If all hyperparameter evaluations fail

alabi.gp_utils.regularization_term(hparams, lengthscale_indices, amp_0=1.0, mu_0=1.0, sigma_0=2.0)[source]

Compute the regularization term (negative log prior) from Hvafner 2024 Equation 4.

This implements the dimensionality-scaled LogNormal prior: p(ℓ_i | d) = LogNormal(μ_0 + log(√d), σ_0)

The regularization term is -log p(ℓ | d) = -Σ log p(ℓ_i | d)

Parameters:

mu_0float, default=0.0

The base location parameter for the 1D LogNormal prior

sigma_0float, default=1.0

The scale parameter for the LogNormal prior

Returns:

float

The regularization term (negative log prior)

alabi.gp_utils.regularization_gradient(hparams, lengthscale_indices, amp_0=1.0, mu_0=1.0, sigma_0=2.0)[source]

Compute the gradient of the regularization term with respect to lengthscales.

Parameters:

lengthscalesarray-like, shape (d,)

The lengthscale parameters for each dimension

dint

The dimensionality of the problem

mu_0float, default=0.0

The base location parameter for the 1D LogNormal prior

sigma_0float, default=1.0

The scale parameter for the LogNormal prior

Returns:

ndarray, shape (d,)

The gradient of the regularization term with respect to each lengthscale

alabi.mcmc_utils module

mcmc_utils.py

alabi.mcmc_utils.estimate_burnin(sampler, est_burnin=True, thin_chains=True, verbose=False)[source]

Estimate the integrated autocorrelation length on the MCMC chain associated with an emcee sampler object. With the integrated autocorrelation length, we can then estimate the burn-in length for the MCMC chain. This procedure follows the example outlined here: https://emcee.readthedocs.io/en/stable/tutorials/autocorr/

Parameters:
  • sampler – (emcee.EnsembleSampler, optional) emcee MCMC sampler object/backend handler, given a complete chain

  • est_burnin – (bool, optional) Estimate burn-in time using integrated autocorrelation time heuristic. Defaults to True. In general, we recommend users inspect the chains and calculate the burnin after the fact to ensure convergence, but this function works pretty well.

  • thin_chains – (bool, optional) Whether or not to thin chains. Useful if running long chains. Defaults to True. If true, estimates a thin cadence via int(0.5*np.min(tau)) where tau is the intergrated autocorrelation time.

  • verbose – (bool, optional) Output all the diagnostics? Defaults to False.

Returns iburn:

(int) burn-in index estimate. If est_burnin == False, returns 0.

Returns ithin:

(int) thin cadence estimate. If thin_chains == False, returns 1.

alabi.utility module

utility.py

Utility functions in terms of usefulness, e.g. minimizing GP utility functions or computing KL divergences, and the GP utility functions, e.g. the bape utility.

alabi.utility.agp_utility(theta, predict_gp, bounds)[source]

Compute the AGP (Adaptive Gaussian Process) utility function based on posterior entropy.

AGP is an information-theoretic acquisition function that measures the entropy of the Gaussian Process posterior distribution. It balances the GP mean prediction with the uncertainty (variance), preferring regions with high predicted values and high uncertainty.

Parameters:
  • theta – (array-like of shape (ndim,)) Parameter values at which to evaluate the utility function.

  • y – (array-like of shape (nsamples,)) Observed function values at training points, used to condition the GP.

  • gp – (george.GP) Trained Gaussian Process model. The GP will be computed if not already done.

  • bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension. Used to check if theta is within the prior support.

Returns:

float Negative AGP utility value. The negative is used so that minimizing this function is equivalent to maximizing the actual utility.

Notes

The AGP utility function is defined as:

\[u(\theta) = \mu(\theta) + \frac{1}{2} \ln(2\pi e \sigma^2(\theta))\]

This represents the entropy of the posterior predictive distribution at θ. The utility encourages sampling where: - The mean prediction μ(θ) is high (exploitation) - The predictive variance σ²(θ) is high (exploration)

AGP provides a different balance compared to other acquisition functions: - More exploitative than BAPE (focuses on high mean regions) - Less optimization-focused than Expected Improvement - Naturally balances exploration and exploitation through entropy

Raises:

ValueError – If the utility computation results in invalid values (e.g., negative variance).

Examples

Evaluate AGP utility at a test point:

theta_test = np.array([0.5, 1.2])
utility = agp_utility(theta_test, y_train, gp, bounds)

References

Wang & Li (2017): “Adaptive Gaussian Process Approximation for Bayesian Inference with Expensive Likelihood Functions”, Neural Computation, 30, 3072-3094.

alabi.utility.bape_utility(theta, predict_gp, bounds)[source]

Compute the BAPE (Bayesian Active Posterior Estimation) utility function.

BAPE is an active learning acquisition function designed for posterior exploration rather than optimization. It identifies regions where the GP variance is high relative to the mean, promoting exploration of the parameter space. The utility is computed in log-form for numerical stability.

Parameters:
  • theta (array-like of shape (ndim,)) – Parameter values at which to evaluate the utility function.

  • y (array-like of shape (nsamples,)) – Observed function values at training points, used to condition the GP.

  • gp (george.GP) – Trained Gaussian Process model. Must have been computed (gp.computed=True).

  • bounds (array-like of shape (ndim, 2)) – Parameter bounds as [(min, max), …] for each dimension. Used to check if theta is within the prior support.

Returns:

Negative log-BAPE utility value. The negative is used so that minimizing this function is equivalent to maximizing the actual utility.

Return type:

float

Note

The BAPE utility function is defined as:

\[u(\theta) = e^{2\mu(\theta) + \sigma^2(\theta)} \left(e^{\sigma^2(\theta)} - 1 \right)\]

This function returns the negative logarithm of the utility for numerical stability:

\[-\log u(\theta) = -\left(2\mu(\theta) + \sigma^2(\theta) + \log(e^{\sigma^2(\theta)} - 1)\right)\]

BAPE is particularly effective for:

  • Exploring multi-modal posteriors

  • Reducing uncertainty in posterior estimates

  • Active learning when the goal is posterior characterization

Unlike optimization-focused acquisitions (e.g., Expected Improvement), BAPE prioritizes exploration over exploitation, making it less suitable for finding global optima but excellent for posterior mapping.

Raises:
  • ValueError – If the utility computation results in invalid values (e.g., negative variance).

  • RuntimeError – If the GP has not been computed before calling this function.

Examples

Evaluate BAPE utility at a test point:

>>> theta_test = np.array([0.5, 1.2])
>>> utility = bape_utility(theta_test, y_train, gp, bounds)

References

Kandasamy et al. (2015): “Query efficient posterior estimation in scientific experiments via Bayesian active learning”, Artificial Intelligence, 243, 45-56.

alabi.utility.jones_utility(theta, predict_gp, bounds, y_best, zeta=0.01)[source]

Compute the Expected Improvement (EI) acquisition function.

This function implements the Expected Improvement criterion from Jones et al. (1998), which balances exploitation (sampling where the mean is high) with exploration (sampling where the uncertainty is high). Unlike BAPE, this acquisition function is designed specifically for global optimization.

Parameters:
  • theta (array-like of shape (ndim,)) – Parameter values at which to evaluate the utility function.

  • y (array-like of shape (nsamples,)) – Observed function values at training points. The maximum value is used as the current best (f_best).

  • gp (george.GP) – Trained Gaussian Process model. Must have been computed (gp.computed=True).

  • bounds (array-like of shape (ndim, 2)) – Parameter bounds as [(min, max), …] for each dimension. Used to check if theta is within the prior support.

  • zeta (float, optional) –

    Exploration parameter controlling the trade-off between exploitation and exploration. Larger values promote more exploration:

    • zeta = 0: Pure exploitation (greedy)

    • zeta > 0: Balanced exploration/exploitation

    • zeta >> 0: Pure exploration

    Default is 0.01.

Returns:

Negative Expected Improvement value. The negative is used so that minimizing this function is equivalent to maximizing the Expected Improvement.

Return type:

float

Note

The Expected Improvement is defined as:

\[EI(\theta) = \mathbb{E}[\max(f(\theta) - f_{\text{best}} - \zeta, 0)]\]

where f_best is the current best observed value. For a Gaussian predictive distribution with mean μ and variance σ², this becomes:

\[EI(\theta) = (\mu - f_{\text{best}} - \zeta) \Phi(z) + \sigma \phi(z)\]

where z = (μ - f_best - ζ)/σ, Φ is the standard normal CDF, and φ is the standard normal PDF.

Expected Improvement is particularly effective for:

  • Global optimization problems

  • Finding the maximum of expensive functions

  • Balancing local and global search

Compared to BAPE, Expected Improvement focuses on exploitation (finding optima) rather than exploration (mapping the entire posterior).

Raises:
  • ValueError – If the utility computation results in invalid values.

  • RuntimeError – If the GP has not been computed before calling this function.

References

Jones et al. (1998): “Efficient global optimization of expensive black-box functions”, Journal of Global Optimization, 13, 455-492.

alabi.utility.assign_utility(algorithm)[source]
alabi.utility.minimize_objective(obj_fn, bounds=None, nopt=1, method='l-bfgs-b', ps=None, options=None, grad_obj_fn=None, pool=None, show_warnings=False)[source]

Find the global minimum of an acquisition function using multiple restarts.

This function optimizes acquisition functions (BAPE, Jones/EI, etc.) to select the next point for active learning. Multiple optimization restarts with different initial points help avoid local minima, which is crucial for acquisition function optimization where the landscape can be highly multi-modal.

Parameters:
  • obj_fn (callable) – Objective function to minimize. Should have signature obj_fn(theta) and return a scalar value. Typically an acquisition function like BAPE or EI.

  • grad_obj_fn (callable or None, optional) – Gradient of the objective function. Should have signature grad_obj_fn(theta) and return array of shape (ndim,). If None, uses finite differences.

  • bounds (array-like of shape (ndim, 2)) – Parameter bounds as [(min, max), …] for each dimension. Used both for optimization constraints and generating initial points.

  • nopt (int, optional) – Number of optimization restarts with different initial points. More restarts increase the chance of finding the global minimum but increase computational cost. Default is 1.

  • method (str, optional) –

    Scipy optimization method to use. Common choices:

    • ”l-bfgs-b”: Quasi-Newton with bounds (default, works well with gradients)

    • ”nelder-mead”: Simplex method (gradient-free, robust)

    • ”tnc”: Truncated Newton with bounds

    • ”slsqp”: Sequential Least Squares Programming

    Default is “l-bfgs-b”.

  • ps (callable or None, optional) – Prior sampling function with signature ps(nsample=1). Used to generate initial points for optimization restarts. If None, uses uniform sampling within bounds. Default is None.

  • options (dict or None, optional) –

    Additional options passed to scipy.optimize.minimize. Common options:

    • ”max_iter”: Maximum iterations (default: 50)

    • ”ftol”: Function tolerance for convergence

    • ”gtol”: Gradient tolerance for convergence

    Default is {“max_iter”: 50}.

  • grad_obj_fn – Gradient of the objective function. If provided, can significantly speed up optimization for methods like l-bfgs-b. Default is None (use finite differences).

Returns:

  • theta_best (ndarray of shape (ndim,)) – Parameter values that minimize the objective function.

  • obj_best (float) – Minimum objective function value achieved.

Return type:

tuple

Raises:

RuntimeError – If no valid solutions are found after all optimization attempts.

alabi.utility.prior_sampler(bounds=None, nsample=1, sampler='uniform', random_state=None)[source]

Sample from parameter space using various quasi-random sampling methods.

This function generates samples within specified bounds using different sampling strategies from the scikit-optimize library. It provides a unified interface to various space-filling designs commonly used in Bayesian optimization and surrogate modeling.

Parameters:
  • bounds – (array-like of shape (ndim, 2)) Array of (min, max) bounds for each parameter dimension. Each row specifies the lower and upper bounds for one parameter.

  • nsample – (int, optional) Number of samples to generate. Default is 1.

  • sampler

    ({‘uniform’, ‘sobol’, ‘lhs’, ‘halton’, ‘hammersly’, ‘grid’}, optional) Sampling method to use:

    • ’uniform’: random uniform sampling

    • ’sobol’: Sobol sequence (quasi-random, good space-filling)

    • ’lhs’: Latin Hypercube Sampling (stratified sampling)

    • ’halton’: Halton sequence (quasi-random, low discrepancy)

    • ’hammersly’: Hammersley sequence (quasi-random)

    • ’grid’: Regular grid sampling

    Default is ‘uniform’.

  • random_state – (int, RandomState instance or None, optional) Random state for reproducible sampling. If None, uses a different random seed each time to avoid clustering. Default is None.

Returns:

ndarray of shape (nsample, ndim) Array of parameter samples within the specified bounds.

Raises:

ValueError – If an invalid sampler method is specified.

Notes

Quasi-random samplers (sobol, halton, hammersley) provide better space-filling properties than pseudo-random uniform sampling, which is beneficial for: - Initial design of experiments - Training surrogate models - Global optimization

Latin Hypercube Sampling ensures each parameter dimension is stratified, providing good marginal coverage even with small sample sizes.

For optimization starting points, consider using ‘sobol’ or ‘lhs’ instead of ‘uniform’ to avoid clustering issues when generating single samples repeatedly.

Examples

Generate uniform random samples:

bounds = [(-1, 1), (0, 10)]
samples = prior_sampler(bounds, nsample=5, sampler='uniform')
print(samples.shape)  # (5, 2)

Use Sobol sequence for better space-filling:

samples = prior_sampler(bounds, nsample=100, sampler='sobol')

Latin Hypercube Sampling for stratified design:

samples = prior_sampler(bounds, nsample=20, sampler='lhs')

References

For more information on sampling methods, see: https://scikit-optimize.github.io/stable/auto_examples/sampler/initial-sampling-method.html

alabi.utility.prior_sampler_normal(prior_data, bounds, nsample=1)[source]
alabi.utility.lnprior_uniform(x, bounds)[source]

Evaluate log-probability density of uniform prior distribution.

This function computes the log-probability density for a uniform (flat) prior distribution within specified bounds. Points outside the bounds receive log-probability of negative infinity.

Parameters:
  • x – (array-like of shape (ndim,) or float) Parameter values at which to evaluate the log-prior. For scalar input, assumes 1D parameter space.

  • bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension.

Returns:

float Log-probability density. Returns 0.0 if all parameters are within bounds, -np.inf if any parameter is outside bounds.

Notes

For a uniform distribution on [a, b], the probability density is 1/(b-a) and the log-probability density is -log(b-a). However, this function returns 0.0 for in-bounds points since constant offsets don’t affect relative probabilities in MCMC sampling.

Examples

Check if point is within bounds:

bounds = [(-1, 1), (0, 2)]
x = [0.5, 1.0]
lnp = lnprior_uniform(x, bounds)  # Returns 0.0

Point outside bounds:

x = [2.0, 1.0]  # First parameter out of bounds
lnp = lnprior_uniform(x, bounds)  # Returns -inf
alabi.utility.prior_transform_uniform(theta, bounds)[source]

Transform uniform random variables to parameter space with specified bounds.

This function implements the inverse CDF transformation for uniform distributions, mapping from the unit hypercube [0,1]^ndim to the parameter space with given bounds. It is commonly used in nested sampling algorithms like dynesty and UltraNest.

Parameters:
  • theta – (array-like) Random variables uniformly distributed on [0,1]. Can be: - 1D array of shape (ndim,) for a single parameter vector - 2D array of shape (nsamples, ndim) for multiple parameter vectors

  • bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension.

Returns:

ndarray Transformed parameter values within the specified bounds. - If input is 1D: returns 1D array of shape (ndim,) - If input is 2D: returns 2D array of shape (nsamples, ndim)

Notes

The transformation for each dimension i is:

\[\theta'_i = (b_{i,\text{max}} - b_{i,\text{min}}) \theta_i + b_{i,\text{min}}\]

where b_{i,min} and b_{i,max} are the bounds for dimension i.

This is the inverse of the uniform CDF, mapping uniform random variables on [0,1] to uniform random variables on [a,b].

Examples

Transform single parameter vector:

bounds = [(-2, 2), (0, 10)]
theta_unit = [0.25, 0.8]  # Single vector
theta_params = prior_transform_uniform(theta_unit, bounds)
print(theta_params)  # [-1.0, 8.0]

Transform multiple parameter vectors:

theta_unit = [[0.25, 0.8], [0.5, 0.2]]  # Multiple vectors
theta_params = prior_transform_uniform(theta_unit, bounds)
print(theta_params)  # [[-1.0, 8.0], [0.0, 2.0]]

Use with nested sampling:

def my_prior_transform(u):
    return prior_transform_uniform(u, bounds)
# Pass to dynesty/UltraNest sampler
alabi.utility.lnprior_normal(x, bounds, data)[source]
alabi.utility.prior_transform_normal(x, bounds, data)[source]

Transform uniform random variables to parameter space with mixed prior distributions.

This function implements prior transformations supporting both uniform and Gaussian priors for different parameters. It maps from the unit hypercube [0,1]^ndim to the parameter space, handling each dimension according to its specified prior type.

Parameters:
  • x – (array-like) Random variables uniformly distributed on [0,1]. Can be: - 1D array of shape (ndim,) for a single parameter vector - 2D array of shape (nsamples, ndim) for multiple parameter vectors

  • bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension. Used for uniform priors.

  • data – (list of tuples) Prior specification for each dimension as [(mean, std), …]. - If data[i] = (None, None): use uniform prior with bounds[i] - If data[i] = (mean, std): use Gaussian prior with specified mean and standard deviation

Returns:

ndarray Transformed parameter values according to their prior distributions. - If input is 1D: returns 1D array of shape (ndim,) - If input is 2D: returns 2D array of shape (nsamples, ndim)

Notes

For uniform priors (when data[i][0] is None):

\[\theta'_i = (b_{i,\text{max}} - b_{i,\text{min}}) x_i + b_{i,\text{min}}\]

For Gaussian priors (when data[i] = (μ, σ)):

\[\theta'_i = \Phi^{-1}(x_i; \mu_i, \sigma_i)\]

where Φ^{-1} is the inverse normal CDF (percent-point function).

Examples

Mixed uniform and Gaussian priors:

bounds = [(-2, 2), (0, 10)]
data = [(None, None), (5.0, 1.0)]  # uniform, then Gaussian(5, 1)
x_unit = [0.5, 0.84]  # Single vector
x_params = prior_transform_normal(x_unit, bounds, data)
print(x_params)  # [0.0, ~6.0] (second value from normal inverse CDF)

Vectorized transformation:

x_unit = [[0.5, 0.84], [0.25, 0.16]]  # Multiple vectors
x_params = prior_transform_normal(x_unit, bounds, data)
# Returns 2D array with transformed parameters
class alabi.utility.BetaWarpingFunction(alpha=2.0, beta=2.0)[source]

Bases: object

Swersky et al. 2017 Beta CDF warping function for sklearn FunctionTransformer.

Note: This transformer must be refit when new data extends beyond the original data range to update the internal MinMaxScaler bounds.

__init__(alpha=2.0, beta=2.0)[source]
fit(X, y=None)[source]

Fit the MinMaxScaler to the data, establishing the data range bounds.

This updates the internal min/max values used for scaling. Should be called whenever new data extends beyond the previous range.

transform(X)[source]
fit_transform(X, y=None)[source]

Fit and transform in one step.

inverse_transform(X)[source]
alabi.utility.beta_warping_transformer(alpha=2.0, beta=2.0)[source]

Create a scikit-learn FunctionTransformer for Beta CDF warping.

Parameters

alphafloat or array-like

Shape parameter α for Beta distribution

betafloat or array-like

Shape parameter β for Beta distribution

Returns

transformerFunctionTransformer

Scikit-learn transformer object

class alabi.utility.NewFunctionTransformer(name, func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, feature_names_out=None, kw_args=None, inv_kw_args=None)[source]

Bases: FunctionTransformer

__init__(name, func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, feature_names_out=None, kw_args=None, inv_kw_args=None)[source]

alabi.visualization module

visualization.py

alabi.visualization.plot_error_vs_iteration(iteration, train_error, test_error=None, log=False, metric='Error', title='GP fit', show=False, savedir='.', savename=None)[source]
alabi.visualization.plot_hyperparam_vs_iteration(sm, title='GP fit', show=False)[source]
alabi.visualization.plot_train_time_vs_iteration(sm, title='GP fit', show=False)[source]
alabi.visualization.plot_corner_lnp(sm, show=False)[source]
alabi.visualization.plot_corner_scatter(sm, show=False)[source]
alabi.visualization.plot_gp_fit_1D(sm, title='GP fit', show=False)[source]
alabi.visualization.plot_gp_fit_2D(sm, ngrid=60, title='GP fit', cmap='Blues_r', show=False, vmin=None, vmax=None, log_scale=False)[source]
alabi.visualization.plot_contour_2D(fn, bounds, savedir, savename, title, ngrid=60, cmap='Blues_r', show=False, xlabel=None, ylabel=None, vmin=None, vmax=None, log_scale=False)[source]
alabi.visualization.plot_true_fit_2D(sm, ngrid=60, show=False, log_scale=False, vmin=None, vmax=None)[source]
alabi.visualization.plot_utility_2D(sm, ngrid=60, show=False, log_scale=False, vmin=None, vmax=None)[source]
alabi.visualization.plot_dynesty_traceplot(sm, show=False)[source]
alabi.visualization.plot_dynesty_runplot(sm, show=False)[source]
alabi.visualization.plot_mcmc_comparison(samples1, samples2, bounds=None, param_names=None, name1='sampler 1 posterior', name2='sampler 2 posterior', savedir='.', savename='mcmc_comparison.png', show=False, lw=1.5, colors=['orange', 'royalblue'], **kwargs)[source]
alabi.visualization.plot_sampler_comparison(sm, show=False)[source]
alabi.visualization.plot_2D_panel4(savedir, savename=None)[source]
alabi.visualization.plot_gp_predictions_1D(sm, theta, ngrid=100, nsigma=2, plot_samples=None, plot_layout=None, title=None, show=False, savedir='.', savename=None, ylim=None, legend_loc='best')[source]

Plot 1D GP surrogate predictions for each input parameter, varying one parameter at a time while holding the rest fixed at theta.

For each dimension a panel is drawn showing:

  • The GP predictive mean as a function of that parameter.

  • A shaded band of ±``nsigma`` predictive standard deviations.

  • Vertical dashed line marking the reference value in theta.

  • Scatter of training points projected onto that parameter axis.

Parameters:
  • sm (SurrogateModel) – Trained surrogate model.

  • theta (array-like of shape (ndim,)) – Reference point (unscaled). Each panel sweeps one dimension across sm.bounds while the other dimensions are held fixed at the corresponding value in theta.

  • ngrid (int, optional) – Number of grid points used per sweep. Default is 100.

  • nsigma (int, optional) – Width of the shaded uncertainty band in standard deviations. Default is 2.

  • plot_samples (int or None, optional) – Number of posterior function samples to draw from the GP and overlay on each panel. Each sample is a random draw from the GP posterior conditioned on the training data, plotted as a thin semi-transparent line. If None (default), no samples are drawn.

  • title (str or None, optional) – Overall figure title. If None, no suptitle is added.

  • show (bool, optional) – Whether to call plt.show(). Default is False.

  • savedir (str, optional) – Directory to save the figure. Default is ".".

  • savename (str or None, optional) – Filename for the saved figure. If None, defaults to "gp_predictions_1D.png".

  • ylim (tuple of (float, float) or None, optional) – Optional y-axis limits for all panels, given as (ymin, ymax). If None, the limits are determined automatically.

Returns:

The matplotlib figure.

Return type:

matplotlib.figure.Figure