alabi package¶

alabi.benchmarks module¶

`benchmarks.py`¶

alabi.benchmarks.random_gaussian_covariance(n_dims)[source]¶: Generate a random positive definite covariance matrix.

alabi.benchmarks.multimodal_gaussian_nd(x, means, covs, amps)[source]¶

alabi.cache_utils module¶

`cache_utils.py`¶

alabi.cache_utils.load_pickle(savedir, fname='surrogate_model.pkl')[source]¶

alabi.cache_utils.load_model_cache(savedir)[source]¶

MPI-safe model loading that prevents file corruption.

Parameters:: savedir – Directory containing the model cache
Returns:: Loaded surrogate model

alabi.cache_utils.write_report_gp(self, file)[source]¶

alabi.cache_utils.write_report_emcee(self, file)[source]¶

alabi.cache_utils.write_report_dynesty(self, file)[source]¶

alabi.core module¶

`core.py`¶

class alabi.core.SurrogateModel(lnlike_fn=None, bounds=None, param_names=None, cache=True, savedir='results/', model_name='surrogate_model', verbose=True, show_warnings=False, ncore=1, pool_method='forkserver', ignore_warnings=True, random_state=None)[source]¶

Bases: object

Gaussian Process surrogate model for Bayesian inference and optimization.

A SurrogateModel uses a Gaussian Process to create a fast approximation of expensive likelihood functions, enabling efficient Bayesian inference, parameter estimation, and active learning. The model supports various active learning algorithms and scalers for handling different types of likelihood functions.

Parameters:

lnlike_fn – (callable, required) Log-likelihood function that takes parameter array theta and returns scalar log-likelihood value. For Bayesian inference, this is your model’s log-likelihood. Signature: lnlike_fn(theta) -> float
bounds – (array-like, required) Prior bounds for each parameter. List/array of (min, max) tuples for each dimension. Example: bounds = [(0, 1), (2, 3), (-1, 1)]
param_names – (array-like, optional) Names/labels for each parameter. If None, defaults to θ₀, θ₁, etc. Length must match number of dimensions in bounds.
cache – (bool, optional, default=True) Whether to cache the trained model to disk for reuse
savedir – (str, optional, default=”results/”) Directory for saving results, plots, and cached models
model_name – (str, optional, default=”surrogate_model”) Name prefix for cached model files
verbose – (bool, optional, default=True) Print progress information during training and inference
ncore – (int, optional, default=cpu_count()) Number of CPU cores to use for parallel computation
ignore_warnings – (bool, optional, default=True) Suppress sklearn and other package warnings
random_state – (int, optional, default=None) Random seed for reproducible results

gp: george.GP¶: Trained Gaussian Process model

bounds: ndarray¶: Original parameter bounds (unscaled)

_bounds: ndarray¶: Scaled parameter bounds used for GP training

_theta: ndarray¶: Training parameter samples (scaled)

_y: ndarray¶: Training likelihood values (scaled)

ntrain: int¶: Number of initial training samples

ndim: int¶: Number of parameters/dimensions

emcee_samples: ndarray¶: MCMC samples from emcee (if run_emcee called)

dynesty_samples: ndarray¶: Nested sampling results from dynesty (if run_dynesty called)

Examples

Basic usage for Bayesian inference:

def log_likelihood(theta):
    # Your model likelihood function
    return -0.5 * np.sum((theta - 2)**2)

bounds = [(0, 4), (0, 4)]  # 2D parameter space
sm = SurrogateModel(log_likelihood, bounds)
sm.init_samples(ntrain=100)  # Initial training data
sm.init_gp()  # Initialize Gaussian Process
sm.active_train(niter=50)  # Active learning
sm.run_dynesty()  # Bayesian inference

For optimization problems:

sm.active_train(algorithm="jones")  # Use Jones algorithm for optimization

Notes¶

This function uses multiprocessing.Pool instead of threading, which can provide better performance for CPU-intensive tasks and avoids GIL limitations. However:

All model data must be pickleable (which it should be for SurrogateModel)
Each process runs in separate memory space (higher memory usage)
Process startup overhead is higher than threading
Better isolation between chains (one chain failure won’t affect others)
Can achieve true parallelism on multi-core systems

The function automatically respects the ncore limit and won’t create more processes than specified in self.ncore.

lnprob(theta)[source]¶

Log probability function used for emcee, which sums the prior with the surrogate model likelihood

\[\ln P(\theta | x) \propto \ln P(x | \theta) + \ln P(\theta)\]

where ln P(x | theta) is the surrogate likelihood function and ln P(theta) is the prior function.

Parameters:: theta – (array, required) Array of model input parameters to evaluate model probability at.

find_map(theta0=None, prior_fn=None, method='nelder-mead', nRestarts=15, options=None)[source]¶

run_emcee(like_fn=None, prior_fn=None, nwalkers=None, nsteps=50000, sampler_kwargs={}, run_kwargs={}, opt_init=False, multi_proc=True, prior_fn_comment=None, burn=None, thin=None, samples_file=None, min_ess=10000)[source]¶

Sample the posterior using the emcee affine-invariant MCMC algorithm.

This method uses the emcee package to perform Markov Chain Monte Carlo (MCMC) sampling on either the trained GP surrogate model or the true likelihood function. The affine-invariant ensemble sampler is robust and works well for a wide variety of posterior shapes without requiring manual tuning of step sizes.

Parameters:

like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.
prior_fn – (callable or None, optional) Log-prior function with signature prior_fn(theta). Should return log-probability density. If None, uses uniform prior with bounds from self.bounds. Default is None.
nwalkers – (int or None, optional) Number of MCMC walkers in the ensemble. Should be at least 2*ndim. If None, defaults to 10*ndim. More walkers improve convergence but increase computational cost. Default is None.
nsteps – (int, optional) Number of MCMC steps per walker. Total number of likelihood evaluations will be nwalkers * nsteps. Default is 50000.
sampler_kwargs – (dict, optional) Additional keyword arguments passed to emcee.EnsembleSampler constructor. Common options include: - ‘a’: Stretch move scale parameter (default: 2.0) - ‘moves’: Custom proposal moves Default is {}.
run_kwargs – (dict, optional) Additional keyword arguments passed to the run_mcmc() method. Common options include: - ‘progress’: Show progress bar (default: True) - ‘store’: Store chain in memory (default: True) Default is {}.
opt_init – (bool, optional) Whether to initialize walkers near the maximum a posteriori (MAP) estimate. If True, uses find_map() to locate starting point. If False, initializes walkers randomly from the prior. Default is False.
multi_proc – (bool, optional) Whether to use multiprocessing with self.ncore processes. Generally recommended for expensive likelihood evaluations. Default is True.
prior_fn_comment – (str or None, optional) Comment describing the prior function for logging purposes. If None and prior_fn is provided, attempts to extract function name. Default is None.
burn – (int or None, optional) Number of burn-in samples to discard from each walker. If None, automatically estimates burn-in using autocorrelation analysis. Default is None.
thin – (int or None, optional) Thinning factor - keep every thin-th sample to reduce autocorrelation. If None, automatically estimates based on autocorrelation time. Default is None.
samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.
min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set¶

sampleremcee.EnsembleSampler: The emcee sampler object containing full chain and metadata
emcee_samplesndarray of shape (nsamples_final, ndim): Final MCMC samples after burn-in and thinning
emcee_samples_fullndarray of shape (nsteps, nwalkers, ndim): Full MCMC chain before processing
emcee_samples_truendarray of shape (nsamples_final, ndim): Final samples when using true likelihood (like_fn=”true”)
emcee_samples_gpndarray of shape (nsamples_final, ndim): Final samples when using surrogate likelihood
emcee_runbool: Flag indicating emcee has been successfully run
emcee_runtimefloat: Wall-clock time taken for emcee sampling in seconds
nwalkersint: Number of walkers used
nstepsint: Number of steps per walker
burnint: Burn-in length used for final samples
thinint: Thinning factor used for final samples
iburnint: Automatically estimated burn-in length
ithinint: Automatically estimated thinning factor
acc_fracfloat: Mean acceptance fraction across all walkers
autcorr_timefloat: Mean autocorrelation time in steps
like_fn_namestr: Name of likelihood function used (“true”, “surrogate”, or “likelihood”)
prior_fn_commentstr: Description of prior function used

Sample surrogate model with default settings:

>>> sm.run_emcee()

Sample true likelihood with specific settings:

>>> sm.run_emcee(like_fn="true", nwalkers=100, nsteps=10000)

Use custom prior and optimize initialization:

>>> def log_prior(theta):
...     # Custom Gaussian prior
...     return -0.5 * np.sum((theta/2)**2)
>>> sm.run_emcee(prior_fn=log_prior, opt_init=True)

Run with manual burn-in and thinning:

>>> sm.run_emcee(nsteps=100000, burn=10000, thin=10)

References¶

emcee documentation: https://emcee.readthedocs.io/ Foreman-Mackey et al. (2013): “emcee: The MCMC Hammer”, PASP, 125, 306-312

run_dynesty(like_fn=None, prior_transform=None, mode='dynamic', sampler_kwargs={}, run_kwargs={}, multi_proc=False, save_iter=None, prior_transform_comment=None, samples_file=None, min_ess=10000)[source]¶

Sample the posterior using the dynesty nested sampling algorithm.

This method uses the dynesty package to perform nested sampling on either the trained GP surrogate model or the true likelihood function. Dynesty is particularly effective for estimating the Bayesian evidence and exploring multi-modal posteriors.

Parameters:

like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.
prior_transform – (callable or None, optional) Prior transformation function that maps from unit hypercube [0,1]^ndim to the parameter space. Should have signature prior_transform(u) where u is array of shape (ndim,) with values in [0,1]. If None, uses uniform prior with bounds from self.bounds. Default is None.
mode – ({“dynamic”, “static”}, optional) Dynesty sampling mode. “dynamic” uses DynamicNestedSampler which adaptively allocates live points, while “static” uses fixed number of live points. Dynamic mode is generally more efficient. Default is “dynamic”.
sampler_kwargs – (dict, optional) Additional keyword arguments passed to the dynesty sampler constructor. Common options include: - ‘nlive’: Number of live points (default: 50*ndim) - ‘bound’: Bounding method (‘multi’, ‘single’, ‘none’) - ‘sample’: Sampling method (‘auto’, ‘unif’, ‘rwalk’, ‘slice’, ‘rslice’, ‘hslice’) Default is {}.
run_kwargs – (dict, optional) Additional keyword arguments passed to the run_nested() method. Common options include: - ‘dlogz’: Target evidence uncertainty (default: 0.5) - ‘maxiter’: Maximum number of iterations (default: 50000) - ‘wt_kwargs’: Weight function arguments (default: {‘pfrac’: 1.0}) - ‘stop_kwargs’: Stopping criterion arguments (default: {‘pfrac’: 1.0}) Default is {}.
multi_proc – (bool, optional) Whether to use multiprocessing. If True, uses self.ncore processes. Note that multiprocessing can sometimes be slower due to overhead. Default is False.
save_iter – (int or None, optional) If provided, saves the sampler state every save_iter iterations to allow for checkpointing and resuming long runs. Saves to ‘{savedir}/dynesty_sampler_{like_fn_name}.pkl’. Default is None.
prior_transform_comment – (str or None, optional) Comment describing the prior transform for logging purposes. If None and prior_transform is provided, attempts to extract function name. Default is None.
samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.
min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set¶

resdynesty.results.Results: Complete dynesty results object containing samples, weights, evidence, etc.
dynesty_samplesndarray of shape (nsamples, ndim): Resampled posterior samples with equal weights
dynesty_samples_truendarray of shape (nsamples, ndim): Posterior samples when using true likelihood (like_fn=”true”)
dynesty_samples_surrogatendarray of shape (nsamples, ndim): Posterior samples when using surrogate likelihood
dynesty_runbool: Flag indicating dynesty has been successfully run
dynesty_runtimefloat: Wall-clock time taken for dynesty sampling in seconds
like_fn_namestr: Name of likelihood function used (“true”, “surrogate”, or “custom”)
prior_transform_commentstr: Description of prior transform used

Notes¶

Dynesty is particularly well-suited for: - Computing Bayesian evidence for model comparison - Exploring multi-modal posteriors - Providing robust posterior sampling without tuning

The default settings prioritize posterior sampling over evidence estimation by setting pfrac=1.0, which focuses computational effort on high-likelihood regions rather than exploring the full prior volume.

Examples¶

Sample surrogate model with default settings:

>>> sm.run_dynesty()

Sample true likelihood with more live points:

>>> sm.run_dynesty(like_fn="true", sampler_kwargs={'nlive': 1000})

Use custom prior with bounds [-5, 5] for each parameter:

>>> def my_prior(u):
...     return 10*u - 5  # maps [0,1] to [-5,5]
>>> sm.run_dynesty(prior_transform=my_prior)

Run with checkpointing every 1000 iterations:

>>> sm.run_dynesty(save_iter=1000, run_kwargs={'maxiter': 50000})

References¶

Dynesty documentation: https://dynesty.readthedocs.io/ Speagle (2020): “dynesty: a dynamic nested sampling package for estimating Bayesian posteriors and evidences”, MNRAS, 493, 3132-3158

run_pymultinest(like_fn=None, prior_transform=None, sampler_kwargs={}, multi_proc=True, prior_transform_comment=None, samples_file=None, prefix=None, resume=False, n_clustering_params=None, outputfiles_basename=None, min_ess=10000)[source]¶

Sample the posterior using the PyMultiNest nested sampling algorithm.

This method uses the PyMultiNest package (Python wrapper for MultiNest) to perform nested sampling on either the trained GP surrogate model or the true likelihood function. MultiNest is particularly effective for multi-modal posteriors and computing Bayesian evidence with high accuracy.

Parameters:

like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.
prior_transform – (callable or None, optional) Prior transformation function that maps from unit hypercube [0,1]^ndim to the parameter space. Should have signature prior_transform(cube) where cube is array of shape (ndim,) with values in [0,1]. The function should modify cube in-place. If None, uses uniform prior with bounds from self.bounds. Default is None.
sampler_kwargs – (dict, optional) Additional keyword arguments passed to pymultinest.run(). Common options include: - ‘n_live_points’: Number of live points (default: 1000) - ‘evidence_tolerance’: Target evidence uncertainty (default: 0.5) - ‘sampling_efficiency’: Sampling efficiency parameter (default: 0.8) - ‘n_iter_before_update’: Iterations before evidence/posterior update (default: 100) - ‘null_log_evidence’: Null evidence for model comparison (default: -1e90) - ‘max_modes’: Maximum number of modes to find (default: 100) - ‘mode_tolerance’: Mode separation tolerance (default: -1e90) - ‘seed’: Random seed for reproducibility (default: -1, auto) - ‘verbose’: Verbosity level (default: True) - ‘importance_nested_sampling’: Use importance nested sampling (default: True) - ‘multimodal’: Enable multimodal mode detection (default: True) - ‘const_efficiency_mode’: Use constant efficiency mode (default: False) Default is {}.
multi_proc –
(bool, optional) Whether to use multiprocessing. If True, uses self.ncore processes. MultiNest handles parallelization internally when MPI is available.

Note

This parameter is ignored for PyMultiNest as it uses MPI for parallelization, not Python’s multiprocessing. When PyMultiNest runs with MPI, other alabi functions automatically disable their multiprocessing pools to avoid conflicts.

Default is True.
prior_transform_comment – (str or None, optional) Comment describing the prior function for logging purposes. If None and prior_transform is provided, attempts to extract function name. Default is None.
samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.
prefix – (str or None, optional) Prefix for MultiNest output files. If None, uses default based on likelihood function name and current directory. Default is None.
resume – (bool, optional) Whether to resume from previous run if output files exist. Default is False.
n_clustering_params – (int or None, optional) Number of parameters to use for mode clustering. If None, uses all parameters (ndim). Set to lower value if some parameters are nuisance. Default is None.
outputfiles_basename – (str or None, optional) Base name for MultiNest output files. If None, constructs from savedir and likelihood function name. Default is None.
min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set¶

pymultinest_samplesndarray of shape (nsamples, ndim): Posterior samples from MultiNest
pymultinest_samples_truendarray of shape (nsamples, ndim): Posterior samples when using true likelihood (like_fn=”true”)
pymultinest_samples_surrogatendarray of shape (nsamples, ndim): Posterior samples when using surrogate likelihood
pymultinest_weightsndarray of shape (nsamples,): Sample weights from nested sampling
pymultinest_logzfloat: Log Bayesian evidence estimate
pymultinest_logz_errfloat: Uncertainty in log evidence estimate
pymultinest_runbool: Flag indicating PyMultiNest has been successfully run
pymultinest_runtimefloat: Wall-clock time taken for MultiNest sampling in seconds
pymultinest_analyzerpymultinest.Analyzer: MultiNest analyzer object for accessing detailed results
like_fn_namestr: Name of likelihood function used (“true”, “surrogate”, or “custom”)
prior_transform_commentstr: Description of prior function used

Note

PyMultiNest is particularly well-suited for:

Multi-modal posterior exploration with automatic mode detection
High-accuracy Bayesian evidence computation for model comparison
Robust sampling without manual tuning of MCMC parameters
Handling complex, irregular posterior shapes

MultiNest generates several output files including detailed posterior samples, evidence estimates, and mode information. These files are saved to the model’s savedir for later analysis.

MPI and Multiprocessing Compatibility:

PyMultiNest uses MPI for parallelization across multiple nodes/cores. When MPI is active, alabi automatically disables Python multiprocessing in other functions (run_emcee, run_dynesty) to prevent conflicts. This ensures that:

PyMultiNest can run efficiently with MPI
Other alabi functions fall back to serial execution when MPI is detected
No deadlocks or resource conflicts occur between MPI and multiprocessing

To run PyMultiNest with MPI:

>>> # Single node, multiple cores
>>> sm.run_pymultinest()  # Uses OpenMP if available

>>> # Multiple nodes with MPI (run from command line)
>>> # mpirun -n 4 python your_script.py

example:

Sample surrogate model with default settings:

>>> sm.run_pymultinest()

Sample true likelihood with more live points:

>>> sm.run_pymultinest(like_fn="true", 
...                   sampler_kwargs={'n_live_points': 2000})

Use custom prior with bounds [-10, 10] for each parameter:

>>> def my_prior(cube):
...     for i in range(len(cube)):
...         cube[i] = 20*cube[i] - 10  # maps [0,1] to [-10,10]
>>> sm.run_pymultinest(prior_transform=my_prior)

Enable multimodal mode detection with high accuracy:

>>> sm.run_pymultinest(sampler_kwargs={
...     'multimodal': True,
...     'evidence_tolerance': 0.1,
...     'max_modes': 20
... })

Run with custom output file prefix and resume capability:

>>> sm.run_pymultinest(prefix="my_run_", resume=True)

References¶

PyMultiNest documentation: https://johannesbuchner.github.io/PyMultiNest/ Feroz et al. (2009): “MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics”, MNRAS, 398, 1601-1614

run_ultranest(like_fn=None, prior_transform=None, sampler_kwargs={}, run_kwargs={}, multi_proc=False, prior_transform_comment=None, samples_file=None, log_dir=None, resume='overwrite', min_ess=10000, slice_steps=0)[source]¶

Sample the posterior using the UltraNest nested sampling algorithm.

This method uses the UltraNest package to perform nested sampling on either the trained GP surrogate model or the true likelihood function. UltraNest is a highly robust nested sampling algorithm that automatically adapts to the problem complexity and provides reliable evidence computation.

Parameters:

like_fn – (callable, str, or None, optional) Likelihood function to sample. Options: - None (default): Uses the trained GP surrogate model (self.surrogate_log_likelihood) - “surrogate”, “gp”: Uses the GP surrogate model explicitly - “true”: Uses the true likelihood function (self.true_log_likelihood) - callable: Custom likelihood function with signature like_fn(theta) Default is None.
prior_transform – (callable or None, optional) Prior transformation function that maps from unit hypercube [0,1]^ndim to the parameter space. Should have signature prior_transform(cube) where cube is array of shape (ndim,) with values in [0,1]. Must return transformed parameters as array. If None, creates uniform prior from self.bounds. Default is None.
sampler_kwargs – (dict, optional) Additional keyword arguments passed to UltraNest ReactiveNestedSampler(). Common options include: - ‘derived_param_names’: List of derived parameter names (default: []) - ‘wrapped_params’: List of bool indicating circular parameters (default: None) - ‘resume’: Resume behavior ‘resume’/’overwrite’/’subfolder’ (default: ‘subfolder’) - ‘run_num’: Run number for subdirectory creation (default: None) - ‘num_test_samples’: Number of test samples for validation (default: 2) - ‘draw_multiple’: Enable dynamic point drawing (default: True) - ‘num_bootstraps’: Number of bootstrap samples (default: 30) - ‘vectorized’: Whether functions accept arrays (default: False) - ‘ndraw_min’: Minimum points to draw per iteration (default: 128) - ‘ndraw_max’: Maximum points to draw per iteration (default: 65536) - ‘storage_backend’: Storage format ‘hdf5’/’csv’/’tsv’ (default: ‘hdf5’) - ‘warmstart_max_tau’: Warmstart maximum tau (default: -1) Default is {}.
run_kwargs – (dict, optional) Additional keyword arguments passed to sampler.run(). Common options include: - ‘update_interval_volume_fraction’: Volume fraction for region updates (default: 0.8) - ‘update_interval_ncall’: Number of calls between updates (optional, omit for auto) - ‘log_interval’: Iterations between status updates (optional, omit for auto) - ‘show_status’: Show integration progress (default: True) - ‘viz_callback’: Visualization callback function (default: False, disabled) - ‘dlogz’: Target log-evidence uncertainty (default: 0.5) - ‘dKL’: Target posterior uncertainty in nats (default: 0.5) - ‘frac_remain’: Fraction of evidence remaining to terminate (default: 0.01) - ‘Lepsilon’: Likelihood contour tolerance (default: 0.001) - ‘min_ess’: Target effective sample size (default: 400) - ‘max_iters’: Maximum number of iterations (optional, omit for unlimited) - ‘max_ncalls’: Maximum number of likelihood calls (optional, omit for unlimited) - ‘max_num_improvement_loops’: Maximum improvement loops (default: -1) - ‘min_num_live_points’: Minimum number of live points (default: 400) - ‘cluster_num_live_points’: Live points per cluster (default: 40) - ‘insertion_test_zscore_threshold’: Z-score threshold for insertion test (default: 4) - ‘insertion_test_window’: Window size for insertion test (default: 10) - ‘region_class’: Region sampling class (optional, can be passed via kwargs) - ‘widen_before_initial_plateau_num_warn’: Warning threshold for plateau (default: 10000) - ‘widen_before_initial_plateau_num_max’: Maximum plateau points (optional, omit for auto) Default is {}.
multi_proc – (bool, optional) Deprecated and ignored. This parameter is kept for backwards compatibility but no longer has any effect. UltraNest now runs in MPI-compatible mode without multiprocessing pools to avoid conflicts with MPI environments. Default is False.
prior_transform_comment – (str or None, optional) Comment describing the prior transform for logging purposes. If None and prior_transform is provided, attempts to extract function name. Default is None.
samples_file – (str or None, optional) If provided, saves the final samples to this file in NumPy .npz format. Default is None.
log_dir – (str or None, optional) Directory to store UltraNest output files and logs. If None, uses a subdirectory in self.savedir. Default is None.
resume – (str, optional) Resume behavior for interrupted runs. Options: - ‘resume’: Resume if possible, otherwise start fresh - ‘resume-similar’: Resume with similar but not identical setup - ‘overwrite’: Always start fresh, overwriting existing files (default) - ‘subfolder’: Create new timestamped subfolder Default is ‘overwrite’.
min_ess – (int, optional) Minimum effective sample size. If the number of final samples is less than min_ess, will run additional sampling rounds and combine samples until the total number of samples exceeds min_ess. Default is 0 (no minimum required).

Attributes Set¶

ultranest_resultsultranest.integrator.Result: Complete UltraNest results object with samples, evidence, etc.
ultranest_samplesndarray of shape (nsamples, ndim): Equally weighted posterior samples from UltraNest
ultranest_samples_truendarray of shape (nsamples, ndim): Posterior samples when using true likelihood (like_fn=”true”)
ultranest_samples_surrogatendarray of shape (nsamples, ndim): Posterior samples when using surrogate likelihood
ultranest_weightsndarray of shape (nsamples,): Sample weights (typically all equal after resampling)
ultranest_logzfloat: Log Bayesian evidence estimate
ultranest_logz_errfloat: Uncertainty in log evidence estimate
ultranest_runbool: Flag indicating UltraNest has been successfully run
ultranest_runtimefloat: Wall-clock time taken for UltraNest sampling in seconds
ultranest_samplerultranest.ReactiveNestedSampler: UltraNest sampler object for accessing detailed information
like_fn_namestr: Name of likelihood function used (“true”, “surrogate”, or “custom”)
prior_transform_commentstr: Description of prior transform used

Note

UltraNest is particularly well-suited for:

Robust nested sampling without manual tuning
Automatic adaptation to problem complexity
High-dimensional and multi-modal problems
Reliable evidence computation for model comparison
Problems with complex, irregular likelihood shapes
MPI environments (runs in serial mode to avoid multiprocessing conflicts)

UltraNest automatically determines the number of live points and adapts its sampling strategy based on the problem characteristics. It provides excellent performance across a wide range of problems without requiring parameter tuning.

MPI Compatibility: This function runs UltraNest in serial mode, making it fully compatible with MPI environments. While UltraNest itself can use MPI for parallelization, this implementation avoids multiprocessing pools that can conflict with MPI.

example:

Sample surrogate model with default settings:

>>> sm.run_ultranest()

Sample true likelihood with custom termination criteria:

>>> sm.run_ultranest(like_fn="true", 
...                  run_kwargs={'dlogz': 0.1, 'min_ess': 1000})

Use custom prior transform with bounds [-5, 5] for each parameter:

>>> def my_prior_transform(cube):
...     return 10 * cube - 5  # maps [0,1] to [-5,5]
>>> sm.run_ultranest(prior_transform=my_prior_transform)

Run with increased live points for better accuracy:

>>> sm.run_ultranest(like_fn="true",
...                  run_kwargs={'min_num_live_points': 800})

Customize output directory and resume behavior:

>>> sm.run_ultranest(log_dir="ultranest_output", 
...                  resume="overwrite")

References¶

UltraNest documentation: https://johannesbuchner.github.io/UltraNest/ Buchner (2021): “UltraNest - a robust, general purpose Bayesian inference library for cosmology and particle physics”, Journal of Open Source Software

plot(plots=None, show=False, cb_rng=[None, None], log_scale=False)[source]¶

Generate diagnostic plots for training progress, GP performance, and MCMC results.

This method creates various diagnostic plots to assess the quality of the surrogate model training, GP hyperparameter optimization, and MCMC sampling results. Plots are automatically saved to the model’s save directory.

Parameters:

plots –
(list of str, optional) List of plot types to generate. Each plot requires specific data to be available (e.g., ‘emcee_corner’ requires run_emcee() to have been called first). If None, no plots are generated. Available options:

Training diagnostics: - ‘test_mse’: Mean squared error vs training iteration - ‘test_scaled_mse’: Scaled MSE vs training iteration - ‘test_log_mse’: Log-scale MSE vs training iteration - ‘gp_hyperparameters’: GP hyperparameter evolution during training - ‘gp_train_time’: GP training time vs iteration - ‘gp_train_corner’: Corner plot of final training samples - ‘gp_train_scatter’: Scatter plot of training samples vs predictions

GP visualization: - ‘gp_fit_2D’: 2D contour plot of GP surrogate surface (2D only) - ‘gp_predictions_1D’: 1D slices of GP mean and variance through the surrogate maximum

MCMC diagnostics: - ‘emcee_corner’: Corner plot of emcee posterior samples - ‘emcee_walkers’: Walker trajectories for emcee chains - ‘dynesty_corner’: Corner plot of dynesty posterior samples - ‘dynesty_corner_kde’: KDE version of dynesty corner plot - ‘dynesty_traceplot’: Trace plot of dynesty sampling - ‘dynesty_runplot’: Dynesty convergence diagnostics

Comparison plots: - ‘mcmc_comparison’: Compare emcee and dynesty posteriors

Convenience options: - ‘gp_all’: Generate all available GP training plots

Default is None.
show – (bool, optional) Whether to display plots interactively in addition to saving them. If False, plots are only saved to disk. Default is False.
cb_rng – (list of [float, float], optional) Colorbar range for 2D contour plots as [vmin, vmax]. If [None, None], uses automatic range determination. Only applies to plots with colorbars like ‘gp_fit_2D’. Default is [None, None].
log_scale – (bool, optional) Whether to use logarithmic color scale for 2D contour plots. If True, applies matplotlib.colors.LogNorm to the colorbar. Only applies to plots with colorbars. Default is False.

Returns:

None or matplotlib.figure.Figure Some individual plots may return figure objects for further customization.

Raises:

NameError – If required data for a requested plot is not available (e.g., requesting ‘emcee_corner’ before running run_emcee()).
AttributeError – If the model has not been properly initialized or trained.

Notes¶

Plots are automatically saved to the model’s save directory (self.savedir) with descriptive filenames. The save directory is created if it doesn’t exist.

Training diagnostic plots help assess: - Convergence of active learning process - Quality of GP hyperparameter optimization - Efficiency of training sample selection

MCMC diagnostic plots help assess: - Posterior sampling convergence - Chain mixing and autocorrelation - Comparison between different sampling methods

Examples¶

Generate all GP training plots:

>>> sm.plot(plots=['gp_all'])

Create MCMC comparison plots:

>>> sm.run_emcee()
>>> sm.run_dynesty()
>>> sm.plot(plots=['emcee_corner', 'dynesty_corner', 'mcmc_comparison'])

Generate 2D GP visualization with custom colorbar:

>>> sm.plot(plots=['gp_fit_2D'], cb_rng=[-10, 0], log_scale=True)

Show plots interactively:

>>> sm.plot(plots=['test_mse', 'gp_hyperparameters'], show=True)

get_chain_diversity_metrics()[source]¶: Calculate diversity metrics for the combined training samples. Useful for assessing the effectiveness of parallel chains.

class alabi.core.CachedSurrogateLikelihood(gp_iter, _y_cond, theta_scaler, y_scaler, ndim, return_var=False)[source]¶

Bases: object

A picklable cached surrogate likelihood function.

This class creates a callable object that caches the GP computation and can be pickled for use with multiprocessing.

__init__(gp_iter, _y_cond, theta_scaler, y_scaler, ndim, return_var=False)[source]¶

Initialize the cached surrogate likelihood.

Parameters:

gp_iter – Pre-computed GP object
y_cond – Training target values
theta_scaler – Parameter scaler object
y_scaler – Target scaler object
ndim – Number of dimensions

alabi.gp_utils module¶

`gp_utils.py`¶

Gaussian process utility functions for initializing GPs and optimizing their hyperparameters.

alabi.gp_utils.configure_gp(theta, y, kernel, fit_amp=True, fit_mean=True, fit_white_noise=False, white_noise=-12, hyperparameters=None)[source]¶

Configure and initialize a Gaussian Process with robust error handling.

Creates a george.GP object with the specified kernel and configuration options. Includes automatic fixes for common numerical issues such as singular matrices, duplicate points, and poor conditioning.

Parameters:

theta – (array-like, shape (n_samples, n_features)) Training input locations (parameters). Must contain finite values only.
y – (array-like, shape (n_samples,)) Training target values (function evaluations). Must contain finite values only.
kernel – (george kernel object) George kernel object defining the covariance function. Common options include ExpSquaredKernel, Matern32Kernel, Matern52Kernel.
fit_amp – (bool, optional, default=True) Whether to fit the kernel amplitude. If True, scales the kernel by the variance of y to improve conditioning.
fit_mean – (bool, optional, default=True) Whether to fit a constant mean function. If True, initializes mean to median(y) and allows optimization.
fit_white_noise – (bool, optional, default=False) Whether to fit the white noise (nugget) parameter. If True, the noise level will be optimized during hyperparameter tuning.
white_noise – (float, optional, default=-12) Log-scale white noise parameter. Acts as regularization to prevent singular matrices. More negative values = less noise.
hyperparameters – (array-like, optional, default=None) Pre-specified hyperparameters to set. If provided, these values are used instead of the kernel’s default initialization.

Returns:

george.GP or None Configured and computed GP object ready for predictions, or None if configuration failed despite all attempted fixes.

Raises:

ValueError – If theta or y contain non-finite values (NaN or inf).
LinAlgError – If GP computation fails due to singular covariance matrix, automatically attempts several fixes before giving up.

alabi.gp_utils.optimize_gp(gp, _theta, _y, gp_hyper_prior, p0, bounds=None, method='l-bfgs-b', optimizer_kwargs=None, regularize=True, amp_0=1.0, mu_0=1.0, sigma_0=2.0, lengthscale_indices=None)[source]¶

Optimize Gaussian Process hyperparameters by maximizing marginal likelihood.

Performs hyperparameter optimization for a Gaussian Process using scipy’s minimize function. Supports multiple optimization restarts and automatically selects the result with highest marginal likelihood.

Parameters:

gp – (george.GP) Configured Gaussian Process object. Should be computed with training data.
_theta – (array-like, shape (n_samples, n_features)) Training input locations (parameters). Will be squeezed if 1D.
_y – (array-like, shape (n_samples,)) Training target values (function evaluations). Will be squeezed if 1D.
gp_hyper_prior – (callable) Prior function for hyperparameters. Should return log-probability density for given hyperparameter vector. Used to constrain optimization.
p0 – (array-like, shape (n_restarts, n_hyperparams) or (n_hyperparams,)) Initial guesses for hyperparameter optimization. If 2D, performs multiple restarts with different initializations.
bounds – (list of tuples, optional, default=None) Bounds for hyperparameter optimization as [(min, max), …]. Only used for methods that support bounds (e.g., ‘l-bfgs-b’).
method –
(str, optional, default=”l-bfgs-b”) Scipy optimization method. Supported methods:
- ’l-bfgs-b’: L-BFGS-B with bounds support (default)
- ’newton-cg’: Newton-CG with gradients
- ’bfgs’: BFGS (no bounds support)
- ’powell’: Powell method (derivative-free)
optimizer_kwargs – (dict, optional, default=None) Additional keyword arguments passed to scipy.optimize.minimize. If None, uses method-specific defaults optimized for GP optimization.
max_iter – (int, optional, default=50) Maximum number of iterations for optimization. Used as default in optimizer_kwargs if not specified.
regularize – (bool, optional, default=False) Whether to apply Hvarfner dimensionality-scaled regularization (Equation 4 from “Vanilla Bayesian Optimization Performs Great in High Dimensions”). When True, lengthscale priors are scaled as LogNormal(μ_0 + log(√d), σ_0).
mu_0 – (float, optional, default=0.0) Base location parameter for Hvarfner regularization. Only used if regularize=True.
sigma_0 – (float, optional, default=1.0) Scale parameter for Hvarfner regularization. Only used if regularize=True.
lengthscale_indices – (list of int, optional, default=None) Indices in the parameter vector corresponding to lengthscale parameters. Only used if regularize=True. If None, attempts to infer from kernel.

Returns:

george.GP or None GP object with optimized hyperparameters, or None if optimization failed. The returned GP is recomputed with optimal hyperparameters.

alabi.gp_utils.optimize_gp_kfold_cv(gp, _theta, _y, hyperparameter_candidates, y_scaler, k_folds=5, scoring='mse', pool=None, stage2_candidates=None, stage2_width=0.5, stage3_candidates=None, stage3_width=0.2, weighted_mse_method='exponential', weighted_mse_factor=1.0, verbose=True)[source]¶

Optimize Gaussian Process hyperparameters using k-fold cross-validation.

This function evaluates different hyperparameter configurations using k-fold cross-validation to select the configuration that generalizes best to unseen data. This approach helps prevent overfitting compared to standard marginal likelihood maximization.

Parameters:

gp – (george.GP) Configured Gaussian Process object template. Will be copied for each CV fold.
_theta – (array-like, shape (n_samples, n_features)) Training input locations (parameters). Must have at least k_folds samples.
_y – (array-like, shape (n_samples,)) Training target values (function evaluations). Must match _theta length.
gp_hyper_prior – (callable) Prior function for hyperparameters. Should return log-probability density for given hyperparameter vector. Used to constrain search space.
hyperparameter_candidates – (array-like, shape (n_candidates, n_hyperparams)) Array of hyperparameter vectors to evaluate via cross-validation. Each row represents one hyperparameter configuration to test.
k_folds – (int, optional, default=5) Number of folds for cross-validation. Must be >= 2 and <= n_samples. Common choices: 5 or 10 for good bias-variance tradeoff.
scoring –
(str, optional, default=’mse’) Scoring metric for cross-validation. Supported options:
- ’mse’: Mean Squared Error (lower is better)
- ’mae’: Mean Absolute Error (lower is better)
- ’r2’: R-squared coefficient (higher is better)
- ’weighted_mse’: Weighted MSE giving higher weight to high-probability regions
pool – (multiprocessing.Pool, optional, default=None) Multiprocessing pool for parallel evaluation of hyperparameter candidates. If None, evaluation runs sequentially. If provided, candidates are evaluated in parallel using the pool’s worker processes.
stage2_candidates – (int, optional, default=None) Number of candidates for stage 2 grid search. If None, uses len(hyperparameter_candidates) // 2.
stage2_width – (float, optional, default=0.5) Width factor for stage 2 search around best parameters. Smaller values = tighter search, larger values = wider search.
stage3_candidates – (int, optional, default=None) Number of candidates for stage 3 ultra-fine search. If None, uses max(stage2_candidates // 2, 3).
stage3_width – (float, optional, default=0.2) Width factor for stage 3 search around stage 2 best parameters. Should be smaller than stage2_width for finer refinement.
weighted_mse_method – (str, optional, default=’exponential’) Weighting method for weighted_mse scoring. Options: - ‘exponential’: w = exp(y_true / temperature) - ‘linear’: w = y_true - min(y_true) + epsilon - ‘softmax’: w = softmax(y_true / temperature) - ‘rank’: w based on rank order of y_true
weighted_mse_factor – (float, optional, default=1.0) Temperature parameter for exponential/softmax weighting. Lower values emphasize high-probability regions more strongly.

Returns:

best_gp (george.GP) – GP with optimal hyperparameters set
best_hyperparams (array) – Best hyperparameter vector
cv_results (dict) – Detailed CV results with scores and statistics

Raises:

ValueError – If insufficient data, invalid parameters, or no valid hyperparameters
RuntimeError – If all hyperparameter evaluations fail

alabi.gp_utils.regularization_term(hparams, lengthscale_indices, amp_0=1.0, mu_0=1.0, sigma_0=2.0)[source]¶

Compute the regularization term (negative log prior) from Hvafner 2024 Equation 4.

This implements the dimensionality-scaled LogNormal prior: p(ℓ_i | d) = LogNormal(μ_0 + log(√d), σ_0)

The regularization term is -log p(ℓ | d) = -Σ log p(ℓ_i | d)

Parameters:¶

mu_0float, default=0.0: The base location parameter for the 1D LogNormal prior
sigma_0float, default=1.0: The scale parameter for the LogNormal prior

Returns:¶

float: The regularization term (negative log prior)

alabi.gp_utils.regularization_gradient(hparams, lengthscale_indices, amp_0=1.0, mu_0=1.0, sigma_0=2.0)[source]¶

Compute the gradient of the regularization term with respect to lengthscales.

Parameters:¶

lengthscalesarray-like, shape (d,): The lengthscale parameters for each dimension
dint: The dimensionality of the problem
mu_0float, default=0.0: The base location parameter for the 1D LogNormal prior
sigma_0float, default=1.0: The scale parameter for the LogNormal prior

Returns:¶

ndarray, shape (d,): The gradient of the regularization term with respect to each lengthscale

alabi.mcmc_utils module¶

`mcmc_utils.py`¶

alabi.mcmc_utils.estimate_burnin(sampler, est_burnin=True, thin_chains=True, verbose=False)[source]¶

Estimate the integrated autocorrelation length on the MCMC chain associated with an emcee sampler object. With the integrated autocorrelation length, we can then estimate the burn-in length for the MCMC chain. This procedure follows the example outlined here: https://emcee.readthedocs.io/en/stable/tutorials/autocorr/

Parameters:

sampler – (emcee.EnsembleSampler, optional) emcee MCMC sampler object/backend handler, given a complete chain
est_burnin – (bool, optional) Estimate burn-in time using integrated autocorrelation time heuristic. Defaults to True. In general, we recommend users inspect the chains and calculate the burnin after the fact to ensure convergence, but this function works pretty well.
thin_chains – (bool, optional) Whether or not to thin chains. Useful if running long chains. Defaults to True. If true, estimates a thin cadence via int(0.5*np.min(tau)) where tau is the intergrated autocorrelation time.
verbose – (bool, optional) Output all the diagnostics? Defaults to False.

Returns iburn:

(int) burn-in index estimate. If est_burnin == False, returns 0.

Returns ithin:

(int) thin cadence estimate. If thin_chains == False, returns 1.

alabi.utility module¶

`utility.py`¶

Utility functions in terms of usefulness, e.g. minimizing GP utility functions or computing KL divergences, and the GP utility functions, e.g. the bape utility.

alabi.utility.agp_utility(theta, predict_gp, bounds)[source]¶

Compute the AGP (Adaptive Gaussian Process) utility function based on posterior entropy.

AGP is an information-theoretic acquisition function that measures the entropy of the Gaussian Process posterior distribution. It balances the GP mean prediction with the uncertainty (variance), preferring regions with high predicted values and high uncertainty.

Parameters:

theta – (array-like of shape (ndim,)) Parameter values at which to evaluate the utility function.
y – (array-like of shape (nsamples,)) Observed function values at training points, used to condition the GP.
gp – (george.GP) Trained Gaussian Process model. The GP will be computed if not already done.
bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension. Used to check if theta is within the prior support.

Returns:

float Negative AGP utility value. The negative is used so that minimizing this function is equivalent to maximizing the actual utility.

Notes

The AGP utility function is defined as:

\[u(\theta) = \mu(\theta) + \frac{1}{2} \ln(2\pi e \sigma^2(\theta))\]

This represents the entropy of the posterior predictive distribution at θ. The utility encourages sampling where: - The mean prediction μ(θ) is high (exploitation) - The predictive variance σ²(θ) is high (exploration)

AGP provides a different balance compared to other acquisition functions: - More exploitative than BAPE (focuses on high mean regions) - Less optimization-focused than Expected Improvement - Naturally balances exploration and exploitation through entropy

Raises:: ValueError – If the utility computation results in invalid values (e.g., negative variance).

Examples

Evaluate AGP utility at a test point:

theta_test = np.array([0.5, 1.2])
utility = agp_utility(theta_test, y_train, gp, bounds)

References

Wang & Li (2017): “Adaptive Gaussian Process Approximation for Bayesian Inference with Expensive Likelihood Functions”, Neural Computation, 30, 3072-3094.

alabi.utility.bape_utility(theta, predict_gp, bounds)[source]¶

Compute the BAPE (Bayesian Active Posterior Estimation) utility function.

BAPE is an active learning acquisition function designed for posterior exploration rather than optimization. It identifies regions where the GP variance is high relative to the mean, promoting exploration of the parameter space. The utility is computed in log-form for numerical stability.

Parameters:

theta (array-like of shape (ndim,)) – Parameter values at which to evaluate the utility function.
y (array-like of shape (nsamples,)) – Observed function values at training points, used to condition the GP.
gp (george.GP) – Trained Gaussian Process model. Must have been computed (gp.computed=True).
bounds (array-like of shape (ndim, 2)) – Parameter bounds as [(min, max), …] for each dimension. Used to check if theta is within the prior support.

Returns:

Negative log-BAPE utility value. The negative is used so that minimizing this function is equivalent to maximizing the actual utility.

Return type:

float

Note

The BAPE utility function is defined as:

\[u(\theta) = e^{2\mu(\theta) + \sigma^2(\theta)} \left(e^{\sigma^2(\theta)} - 1 \right)\]

This function returns the negative logarithm of the utility for numerical stability:

\[-\log u(\theta) = -\left(2\mu(\theta) + \sigma^2(\theta) + \log(e^{\sigma^2(\theta)} - 1)\right)\]

BAPE is particularly effective for:

Exploring multi-modal posteriors
Reducing uncertainty in posterior estimates
Active learning when the goal is posterior characterization

Unlike optimization-focused acquisitions (e.g., Expected Improvement), BAPE prioritizes exploration over exploitation, making it less suitable for finding global optima but excellent for posterior mapping.

Raises:

ValueError – If the utility computation results in invalid values (e.g., negative variance).
RuntimeError – If the GP has not been computed before calling this function.

Examples

Evaluate BAPE utility at a test point:

>>> theta_test = np.array([0.5, 1.2])
>>> utility = bape_utility(theta_test, y_train, gp, bounds)

References

Kandasamy et al. (2015): “Query efficient posterior estimation in scientific experiments via Bayesian active learning”, Artificial Intelligence, 243, 45-56.

alabi.utility.jones_utility(theta, predict_gp, bounds, y_best, zeta=0.01)[source]¶

Compute the Expected Improvement (EI) acquisition function.

This function implements the Expected Improvement criterion from Jones et al. (1998), which balances exploitation (sampling where the mean is high) with exploration (sampling where the uncertainty is high). Unlike BAPE, this acquisition function is designed specifically for global optimization.

Parameters:

theta (array-like of shape (ndim,)) – Parameter values at which to evaluate the utility function.
y (array-like of shape (nsamples,)) – Observed function values at training points. The maximum value is used as the current best (f_best).
gp (george.GP) – Trained Gaussian Process model. Must have been computed (gp.computed=True).
bounds (array-like of shape (ndim, 2)) – Parameter bounds as [(min, max), …] for each dimension. Used to check if theta is within the prior support.
zeta (float, optional) –
Exploration parameter controlling the trade-off between exploitation and exploration. Larger values promote more exploration:
- zeta = 0: Pure exploitation (greedy)
- zeta > 0: Balanced exploration/exploitation
- zeta >> 0: Pure exploration
Default is 0.01.

Returns:

Negative Expected Improvement value. The negative is used so that minimizing this function is equivalent to maximizing the Expected Improvement.

Return type:

float

Note

The Expected Improvement is defined as:

\[EI(\theta) = \mathbb{E}[\max(f(\theta) - f_{\text{best}} - \zeta, 0)]\]

where f_best is the current best observed value. For a Gaussian predictive distribution with mean μ and variance σ², this becomes:

\[EI(\theta) = (\mu - f_{\text{best}} - \zeta) \Phi(z) + \sigma \phi(z)\]

where z = (μ - f_best - ζ)/σ, Φ is the standard normal CDF, and φ is the standard normal PDF.

Expected Improvement is particularly effective for:

Global optimization problems
Finding the maximum of expensive functions
Balancing local and global search

Compared to BAPE, Expected Improvement focuses on exploitation (finding optima) rather than exploration (mapping the entire posterior).

Raises:

ValueError – If the utility computation results in invalid values.
RuntimeError – If the GP has not been computed before calling this function.

References

Jones et al. (1998): “Efficient global optimization of expensive black-box functions”, Journal of Global Optimization, 13, 455-492.

alabi.utility.assign_utility(algorithm)[source]¶

alabi.utility.minimize_objective(obj_fn, bounds=None, nopt=1, method='l-bfgs-b', ps=None, options=None, grad_obj_fn=None, pool=None, show_warnings=False)[source]¶

Find the global minimum of an acquisition function using multiple restarts.

This function optimizes acquisition functions (BAPE, Jones/EI, etc.) to select the next point for active learning. Multiple optimization restarts with different initial points help avoid local minima, which is crucial for acquisition function optimization where the landscape can be highly multi-modal.

Parameters:

obj_fn (callable) – Objective function to minimize. Should have signature obj_fn(theta) and return a scalar value. Typically an acquisition function like BAPE or EI.
grad_obj_fn (callable or None, optional) – Gradient of the objective function. Should have signature grad_obj_fn(theta) and return array of shape (ndim,). If None, uses finite differences.
bounds (array-like of shape (ndim, 2)) – Parameter bounds as [(min, max), …] for each dimension. Used both for optimization constraints and generating initial points.
nopt (int, optional) – Number of optimization restarts with different initial points. More restarts increase the chance of finding the global minimum but increase computational cost. Default is 1.
method (str, optional) –
Scipy optimization method to use. Common choices:
- ”l-bfgs-b”: Quasi-Newton with bounds (default, works well with gradients)
- ”nelder-mead”: Simplex method (gradient-free, robust)
- ”tnc”: Truncated Newton with bounds
- ”slsqp”: Sequential Least Squares Programming
Default is “l-bfgs-b”.
ps (callable or None, optional) – Prior sampling function with signature ps(nsample=1). Used to generate initial points for optimization restarts. If None, uses uniform sampling within bounds. Default is None.
options (dict or None, optional) –
Additional options passed to scipy.optimize.minimize. Common options:
- ”max_iter”: Maximum iterations (default: 50)
- ”ftol”: Function tolerance for convergence
- ”gtol”: Gradient tolerance for convergence
Default is {“max_iter”: 50}.
grad_obj_fn – Gradient of the objective function. If provided, can significantly speed up optimization for methods like l-bfgs-b. Default is None (use finite differences).

Returns:

theta_best (ndarray of shape (ndim,)) – Parameter values that minimize the objective function.
obj_best (float) – Minimum objective function value achieved.

Return type:

tuple

Raises:

RuntimeError – If no valid solutions are found after all optimization attempts.

alabi.utility.prior_sampler(bounds=None, nsample=1, sampler='uniform', random_state=None)[source]¶

Sample from parameter space using various quasi-random sampling methods.

This function generates samples within specified bounds using different sampling strategies from the scikit-optimize library. It provides a unified interface to various space-filling designs commonly used in Bayesian optimization and surrogate modeling.

Parameters:

bounds – (array-like of shape (ndim, 2)) Array of (min, max) bounds for each parameter dimension. Each row specifies the lower and upper bounds for one parameter.
nsample – (int, optional) Number of samples to generate. Default is 1.
sampler –
({‘uniform’, ‘sobol’, ‘lhs’, ‘halton’, ‘hammersly’, ‘grid’}, optional) Sampling method to use:
- ’uniform’: random uniform sampling
- ’sobol’: Sobol sequence (quasi-random, good space-filling)
- ’lhs’: Latin Hypercube Sampling (stratified sampling)
- ’halton’: Halton sequence (quasi-random, low discrepancy)
- ’hammersly’: Hammersley sequence (quasi-random)
- ’grid’: Regular grid sampling
Default is ‘uniform’.
random_state – (int, RandomState instance or None, optional) Random state for reproducible sampling. If None, uses a different random seed each time to avoid clustering. Default is None.

Returns:

ndarray of shape (nsample, ndim) Array of parameter samples within the specified bounds.

Raises:

ValueError – If an invalid sampler method is specified.

Notes

Quasi-random samplers (sobol, halton, hammersley) provide better space-filling properties than pseudo-random uniform sampling, which is beneficial for: - Initial design of experiments - Training surrogate models - Global optimization

Latin Hypercube Sampling ensures each parameter dimension is stratified, providing good marginal coverage even with small sample sizes.

For optimization starting points, consider using ‘sobol’ or ‘lhs’ instead of ‘uniform’ to avoid clustering issues when generating single samples repeatedly.

Examples

Generate uniform random samples:

bounds = [(-1, 1), (0, 10)]
samples = prior_sampler(bounds, nsample=5, sampler='uniform')
print(samples.shape)  # (5, 2)

Use Sobol sequence for better space-filling:

samples = prior_sampler(bounds, nsample=100, sampler='sobol')

Latin Hypercube Sampling for stratified design:

samples = prior_sampler(bounds, nsample=20, sampler='lhs')

References

For more information on sampling methods, see: https://scikit-optimize.github.io/stable/auto_examples/sampler/initial-sampling-method.html

alabi.utility.prior_sampler_normal(prior_data, bounds, nsample=1)[source]¶

alabi.utility.lnprior_uniform(x, bounds)[source]¶

Evaluate log-probability density of uniform prior distribution.

This function computes the log-probability density for a uniform (flat) prior distribution within specified bounds. Points outside the bounds receive log-probability of negative infinity.

Parameters:

x – (array-like of shape (ndim,) or float) Parameter values at which to evaluate the log-prior. For scalar input, assumes 1D parameter space.
bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension.

Returns:

float Log-probability density. Returns 0.0 if all parameters are within bounds, -np.inf if any parameter is outside bounds.

Notes

For a uniform distribution on [a, b], the probability density is 1/(b-a) and the log-probability density is -log(b-a). However, this function returns 0.0 for in-bounds points since constant offsets don’t affect relative probabilities in MCMC sampling.

Examples

Check if point is within bounds:

bounds = [(-1, 1), (0, 2)]
x = [0.5, 1.0]
lnp = lnprior_uniform(x, bounds)  # Returns 0.0

Point outside bounds:

x = [2.0, 1.0]  # First parameter out of bounds
lnp = lnprior_uniform(x, bounds)  # Returns -inf

alabi.utility.prior_transform_uniform(theta, bounds)[source]¶

Transform uniform random variables to parameter space with specified bounds.

This function implements the inverse CDF transformation for uniform distributions, mapping from the unit hypercube [0,1]^ndim to the parameter space with given bounds. It is commonly used in nested sampling algorithms like dynesty and UltraNest.

Parameters:

theta – (array-like) Random variables uniformly distributed on [0,1]. Can be: - 1D array of shape (ndim,) for a single parameter vector - 2D array of shape (nsamples, ndim) for multiple parameter vectors
bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension.

Returns:

ndarray Transformed parameter values within the specified bounds. - If input is 1D: returns 1D array of shape (ndim,) - If input is 2D: returns 2D array of shape (nsamples, ndim)

Notes

The transformation for each dimension i is:

\[\theta'_i = (b_{i,\text{max}} - b_{i,\text{min}}) \theta_i + b_{i,\text{min}}\]

where b_{i,min} and b_{i,max} are the bounds for dimension i.

This is the inverse of the uniform CDF, mapping uniform random variables on [0,1] to uniform random variables on [a,b].

Examples

Transform single parameter vector:

bounds = [(-2, 2), (0, 10)]
theta_unit = [0.25, 0.8]  # Single vector
theta_params = prior_transform_uniform(theta_unit, bounds)
print(theta_params)  # [-1.0, 8.0]

Transform multiple parameter vectors:

theta_unit = [[0.25, 0.8], [0.5, 0.2]]  # Multiple vectors
theta_params = prior_transform_uniform(theta_unit, bounds)
print(theta_params)  # [[-1.0, 8.0], [0.0, 2.0]]

Use with nested sampling:

def my_prior_transform(u):
    return prior_transform_uniform(u, bounds)
# Pass to dynesty/UltraNest sampler

alabi.utility.lnprior_normal(x, bounds, data)[source]¶

alabi.utility.prior_transform_normal(x, bounds, data)[source]¶

Transform uniform random variables to parameter space with mixed prior distributions.

This function implements prior transformations supporting both uniform and Gaussian priors for different parameters. It maps from the unit hypercube [0,1]^ndim to the parameter space, handling each dimension according to its specified prior type.

Parameters:

x – (array-like) Random variables uniformly distributed on [0,1]. Can be: - 1D array of shape (ndim,) for a single parameter vector - 2D array of shape (nsamples, ndim) for multiple parameter vectors
bounds – (array-like of shape (ndim, 2)) Parameter bounds as [(min, max), …] for each dimension. Used for uniform priors.
data – (list of tuples) Prior specification for each dimension as [(mean, std), …]. - If data[i] = (None, None): use uniform prior with bounds[i] - If data[i] = (mean, std): use Gaussian prior with specified mean and standard deviation

Returns:

ndarray Transformed parameter values according to their prior distributions. - If input is 1D: returns 1D array of shape (ndim,) - If input is 2D: returns 2D array of shape (nsamples, ndim)

Notes

For uniform priors (when data[i][0] is None):

\[\theta'_i = (b_{i,\text{max}} - b_{i,\text{min}}) x_i + b_{i,\text{min}}\]

For Gaussian priors (when data[i] = (μ, σ)):

\[\theta'_i = \Phi^{-1}(x_i; \mu_i, \sigma_i)\]

where Φ^{-1} is the inverse normal CDF (percent-point function).

Examples

Mixed uniform and Gaussian priors:

bounds = [(-2, 2), (0, 10)]
data = [(None, None), (5.0, 1.0)]  # uniform, then Gaussian(5, 1)
x_unit = [0.5, 0.84]  # Single vector
x_params = prior_transform_normal(x_unit, bounds, data)
print(x_params)  # [0.0, ~6.0] (second value from normal inverse CDF)

Vectorized transformation:

x_unit = [[0.5, 0.84], [0.25, 0.16]]  # Multiple vectors
x_params = prior_transform_normal(x_unit, bounds, data)
# Returns 2D array with transformed parameters

class alabi.utility.BetaWarpingFunction(alpha=2.0, beta=2.0)[source]¶

Bases: object

Swersky et al. 2017 Beta CDF warping function for sklearn FunctionTransformer.

Note: This transformer must be refit when new data extends beyond the original data range to update the internal MinMaxScaler bounds.

__init__(alpha=2.0, beta=2.0)[source]¶

fit(X, y=None)[source]¶

Fit the MinMaxScaler to the data, establishing the data range bounds.

This updates the internal min/max values used for scaling. Should be called whenever new data extends beyond the previous range.

transform(X)[source]¶

fit_transform(X, y=None)[source]¶: Fit and transform in one step.

inverse_transform(X)[source]¶

alabi.utility.beta_warping_transformer(alpha=2.0, beta=2.0)[source]¶

Create a scikit-learn FunctionTransformer for Beta CDF warping.

Parameters¶

alphafloat or array-like: Shape parameter α for Beta distribution
betafloat or array-like: Shape parameter β for Beta distribution

Returns¶

transformerFunctionTransformer: Scikit-learn transformer object

class alabi.utility.NewFunctionTransformer(name, func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, feature_names_out=None, kw_args=None, inv_kw_args=None)[source]¶

Bases: FunctionTransformer

__init__(name, func=None, inverse_func=None, *, validate=False, accept_sparse=False, check_inverse=True, feature_names_out=None, kw_args=None, inv_kw_args=None)[source]¶

alabi.visualization module¶

`visualization.py`¶

alabi.visualization.plot_error_vs_iteration(iteration, train_error, test_error=None, log=False, metric='Error', title='GP fit', show=False, savedir='.', savename=None)[source]¶

alabi.visualization.plot_hyperparam_vs_iteration(sm, title='GP fit', show=False)[source]¶

alabi.visualization.plot_train_time_vs_iteration(sm, title='GP fit', show=False)[source]¶

alabi.visualization.plot_corner_lnp(sm, show=False)[source]¶

alabi.visualization.plot_corner_scatter(sm, show=False)[source]¶

alabi.visualization.plot_gp_fit_1D(sm, title='GP fit', show=False)[source]¶

alabi.visualization.plot_gp_fit_2D(sm, ngrid=60, title='GP fit', cmap='Blues_r', show=False, vmin=None, vmax=None, log_scale=False)[source]¶

alabi.visualization.plot_contour_2D(fn, bounds, savedir, savename, title, ngrid=60, cmap='Blues_r', show=False, xlabel=None, ylabel=None, vmin=None, vmax=None, log_scale=False)[source]¶

alabi.visualization.plot_true_fit_2D(sm, ngrid=60, show=False, log_scale=False, vmin=None, vmax=None)[source]¶

alabi.visualization.plot_utility_2D(sm, ngrid=60, show=False, log_scale=False, vmin=None, vmax=None)[source]¶

alabi.visualization.plot_dynesty_traceplot(sm, show=False)[source]¶

alabi.visualization.plot_dynesty_runplot(sm, show=False)[source]¶

alabi.visualization.plot_mcmc_comparison(samples1, samples2, bounds=None, param_names=None, name1='sampler 1 posterior', name2='sampler 2 posterior', savedir='.', savename='mcmc_comparison.png', show=False, lw=1.5, colors=['orange', 'royalblue'], **kwargs)[source]¶

alabi.visualization.plot_sampler_comparison(sm, show=False)[source]¶

alabi.visualization.plot_2D_panel4(savedir, savename=None)[source]¶

alabi.visualization.plot_gp_predictions_1D(sm, theta, ngrid=100, nsigma=2, plot_samples=None, plot_layout=None, title=None, show=False, savedir='.', savename=None, ylim=None, legend_loc='best')[source]¶

Plot 1D GP surrogate predictions for each input parameter, varying one parameter at a time while holding the rest fixed at theta.

For each dimension a panel is drawn showing:

The GP predictive mean as a function of that parameter.
A shaded band of ±``nsigma`` predictive standard deviations.
Vertical dashed line marking the reference value in theta.
Scatter of training points projected onto that parameter axis.

Parameters:

sm (SurrogateModel) – Trained surrogate model.
theta (array-like of shape (ndim,)) – Reference point (unscaled). Each panel sweeps one dimension across sm.bounds while the other dimensions are held fixed at the corresponding value in theta.
ngrid (int, optional) – Number of grid points used per sweep. Default is 100.
nsigma (int, optional) – Width of the shaded uncertainty band in standard deviations. Default is 2.
plot_samples (int or None, optional) – Number of posterior function samples to draw from the GP and overlay on each panel. Each sample is a random draw from the GP posterior conditioned on the training data, plotted as a thin semi-transparent line. If None (default), no samples are drawn.
title (str or None, optional) – Overall figure title. If None, no suptitle is added.
show (bool, optional) – Whether to call plt.show(). Default is False.
savedir (str, optional) – Directory to save the figure. Default is ".".
savename (str or None, optional) – Filename for the saved figure. If None, defaults to "gp_predictions_1D.png".
ylim (tuple of (float, float) or None, optional) – Optional y-axis limits for all panels, given as (ymin, ymax). If None, the limits are determined automatically.

Returns:

The matplotlib figure.

Return type:

matplotlib.figure.Figure

alabi package¶

alabi.benchmarks module¶

benchmarks.py¶

alabi.cache_utils module¶

cache_utils.py¶

alabi.core module¶

core.py¶

Notes¶

Attributes Set¶

References¶

Attributes Set¶

Notes¶

Examples¶

References¶

Attributes Set¶

References¶

Attributes Set¶

References¶

Notes¶

Examples¶

alabi.gp_utils module¶

gp_utils.py¶

Parameters:¶

Returns:¶

Parameters:¶

Returns:¶

alabi.mcmc_utils module¶

mcmc_utils.py¶

alabi.utility module¶

utility.py¶

Parameters¶

Returns¶

alabi.visualization module¶

visualization.py¶

`benchmarks.py`¶

`cache_utils.py`¶

`core.py`¶

`gp_utils.py`¶

`mcmc_utils.py`¶

`utility.py`¶

`visualization.py`¶