This notebook is available on GitHub. Interactive online version:

Installation + Quickstart¶

To install alabi, clone it from the git repo:

git clone https://github.com/jbirky/alabi
cd alabi
python setup.py install

Step 1¶

Import python modules:

[1]:

import numpy as np
import matplotlib.pyplot as plt

from alabi.core import SurrogateModel

from matplotlib import rcParams
# rcParams['font.family'] = 'serif'
# rcParams['text.usetex'] = True

random_state = 7
np.random.seed(random_state)

Step 2¶

Define the test function and the bounds for the input space. For example:

[2]:

def test1d_fn(x):
    return np.sin(5 * x) * (1 - np.tanh(x**2))

bounds = [(-1, 1)]

Step 3¶

Initialize the surrogate model, specifying the function to train on, the bounds of the input space, and directory where the results will be saved:

[3]:

sm = SurrogateModel(lnlike_fn=test1d_fn, bounds=bounds, savedir=f"results/test1d", random_state=random_state)

Step 4¶

Initialize the gaussian process surrogate model by specifying a kernel. In this example we’ll use a squared exponential kernel:

$ k(x, x’) = \sigma_f^2 \exp\left`(-:nbsphinx-math:frac{(x - x’)^2}{2ell^2}`:nbsphinx-math:right) $

where $ k(x, x’) $ is the kernel function, $ \sigma_f^2 $ is the amplitude hyperparameter, $ :nbsphinx-math:`ell `$ is the length scale hyperparameter, and $ x $ and $ x’ $ are input points.

[4]:

sm.init_samples(ntrain=10)
sm.init_gp(kernel="ExpSquaredKernel", fit_amp=True, fit_mean=True, hyperopt_method="ml", overwrite=True)

Initialized GP with squared exponential kernel.
Successfully initialized GP on attempt 1

Optimizing GP: 100%|██████████| 3/3 [00:00<00:00, 54.26it/s]

Initial -logL:   0.5847 |        Regularization: 1.6234 |        Total: 2.2081
Final -logL:     -8.6579 |       Regularization: 1.3110 |        Total: -7.3469
13 iterations | Success: True | Message: CONVERGENCE: RELATIVE REDUCTION OF F <= FACTR*EPSMCH

Step 5¶

Improve the surrogate model fit by iteratively selecting new training points using active learning:

[5]:

sm.active_train(niter=30, algorithm="bape", gp_opt_freq=10)

Running 30 active learning iterations using bape...

Optimizing GP: 100%|██████████| 3/3 [00:00<00:00, 53.97it/s]
 33%|███▎      | 10/30 [00:00<00:00, 72.83it/s]

Initial -logL:   -57.3958 |      Regularization: 1.3110 |        Total: -56.0848
Final -logL:     -59.0394 |      Regularization: 1.3352 |        Total: -57.7042
18 iterations | Success: True | Message: CONVERGENCE: RELATIVE REDUCTION OF F <= FACTR*EPSMCH

Train MSE: 9.298666599099183e-10

Optimizing GP: 100%|██████████| 3/3 [00:00<00:00, 60.86it/s]
 67%|██████▋   | 20/30 [00:00<00:00, 75.69it/s]

Initial -logL:   -121.3295 |     Regularization: 1.3352 |        Total: -119.9943
Final -logL:     -121.3499 |     Regularization: 1.3367 |        Total: -120.0132
17 iterations | Success: True | Message: CONVERGENCE: RELATIVE REDUCTION OF F <= FACTR*EPSMCH

Train MSE: 2.79227309296412e-09

Optimizing GP: 100%|██████████| 3/3 [00:00<00:00,  4.29it/s]
100%|██████████| 30/30 [00:02<00:00, 13.53it/s]

Initial -logL:   -185.4210 |     Regularization: 1.3367 |        Total: -184.0843
Final -logL:     -185.4270 |     Regularization: 1.3376 |        Total: -184.0893
6 iterations | Success: True | Message: CONVERGENCE: RELATIVE REDUCTION OF F <= FACTR*EPSMCH

Train MSE: 2.5884057559963692e-09
Caching model to results/test1d/surrogate_model...

Step 6¶

Run Markov Chain Monte Carlo (MCMC) sampler using either the emcee package:

[6]:

sm.run_emcee(nwalkers=4, nsteps=int(5e4))

Initializing emcee with self.surrogate_log_likelihood surrogate model as likelihood.
No prior_fn specified. Defaulting to uniform prior with bounds [[-1  1]]
Running emcee with 4 walkers for 50000 steps on 1 cores...

Run 1: Need 10000 more samples...
==================================================

100%|██████████| 50000/50000 [00:51<00:00, 973.36it/s]

burn-in estimate: 118
thin estimate: 29

Run 1 complete: 6880 samples
Total accumulated samples: 6880

Run 2: Need 3120 more samples...
==================================================

100%|██████████| 50000/50000 [00:50<00:00, 981.16it/s]

burn-in estimate: 105
thin estimate: 26

Run 2 complete: 7676 samples
Total accumulated samples: 14556

Combined 2 runs into 14556 total samples
Total samples: 14556
Mean acceptance fraction: 0.694
Mean autocorrelation time: 52.681 steps
Caching model to results/test1d/surrogate_model...
Saving final emcee samples to results/test1d/emcee_samples_final_surrogate_iter_30.npz ...

[7]:

sm.run_dynesty()

Initializing dynesty with self.surrogate_log_likelihood surrogate model as likelihood.
Initialized dynesty DynamicNestedSampler.
Running dynesty with 50 live points on 1 cores...

Run 1: Need 10000 more samples...
==================================================

10103it [00:09, 1057.98it/s, batch: 78 | bound: 0 | nc: 1 | ncall: 22538 | eff(%): 44.827 | loglstar:   -inf <  0.909 <  0.295 | logz:  0.127 +/-  0.022 | stop:  0.999]

Run 1 complete: 10103 samples, logZ = 0.127
Total accumulated samples: 10103
Caching model to results/test1d/surrogate_model...
Saved dynesty samples to results/test1d/dynesty_samples_final_surrogate_iter_30.npz

[8]:

xgrid = np.linspace(bounds[0][0], bounds[0][1], 100)
ygrid = np.exp(test1d_fn(xgrid))

plt.hist(sm.emcee_samples.T[0], bins=50, histtype='step', density=True, label="emcee samples")
plt.hist(sm.dynesty_samples.T[0], bins=50, histtype='step', density=True, label="dynesty samples")
plt.plot(xgrid, ygrid/max(ygrid), label="true function", color="k", linestyle="--")
plt.xlabel("$x$", fontsize=25)
plt.xlim(*bounds)
plt.legend(loc="upper left", fontsize=18, frameon=False)
plt.minorticks_on()
plt.show()

[ ]:

Download Python source code: 1_quickstart.py

Download Jupyter notebook: 1_quickstart.ipynb