Bayesian Fit with UltraNest

This tutorial first walks through an example runcard for a Bayesian fit with the Les Houches parametrisation model (see this tutorial for details on the Les Houches model and how to implement it).

It then shows the command used to execute the runcard.

The fit uses UltraNest as the nested sampler [Buchner16, Buchner19, Buchner21].

Runcard

meta: 'An example fit using Colibri, reduced DIS dataset.'

#######################
# Data and theory specs
#######################

dataset_inputs:
# DIS
- {dataset: SLAC_NC_NOTFIXED_P_EM-F2, variant: legacy_dw}
- {dataset: SLAC_NC_NOTFIXED_D_EM-F2, variant: legacy_dw}
- {dataset: BCDMS_NC_NOTFIXED_P_EM-F2, variant: legacy_dw}
- {dataset: BCDMS_NC_NOTFIXED_D_EM-F2, variant: legacy_dw}
# - {dataset: CHORUS_CC_NOTFIXED_PB_NU-SIGMARED, variant: legacy_dw}
# - {dataset: CHORUS_CC_NOTFIXED_PB_NB-SIGMARED, variant: legacy_dw}
# - {dataset: NUTEV_CC_NOTFIXED_FE_NU-SIGMARED, cfac: [MAS], variant: legacy_dw}
# - {dataset: NUTEV_CC_NOTFIXED_FE_NB-SIGMARED, cfac: [MAS], variant: legacy_dw}
# - {dataset: HERA_NC_318GEV_EM-SIGMARED, variant: legacy}
# - {dataset: HERA_NC_225GEV_EP-SIGMARED, variant: legacy}
# - {dataset: HERA_NC_251GEV_EP-SIGMARED, variant: legacy}
# - {dataset: HERA_NC_300GEV_EP-SIGMARED, variant: legacy}
# - {dataset: HERA_NC_318GEV_EP-SIGMARED, variant: legacy}
# - {dataset: HERA_CC_318GEV_EM-SIGMARED, variant: legacy}
# - {dataset: HERA_CC_318GEV_EP-SIGMARED, variant: legacy}
# - {dataset: HERA_NC_318GEV_EAVG_CHARM-SIGMARED, variant: legacy}
# - {dataset: HERA_NC_318GEV_EAVG_BOTTOM-SIGMARED, variant: legacy}
# - {dataset: NMC_NC_NOTFIXED_EM-F2, variant: legacy_dw}
# - {dataset: NMC_NC_NOTFIXED_P_EM-SIGMARED, variant: legacy}

theoryid: 40000000                     # The theory from which the predictions are drawn.
use_cuts: internal                     # The kinematic cuts to be applied to the data.

#####################
# Loss function specs
#####################

positivity:                            # Positivity datasets, used in the positivity penalty.
    posdatasets:
    - {dataset: NNPDF_POS_2P24GEV_F2U, variant: None, maxlambda: 1e6}

positivity_penalty_settings:
    positivity_penalty: false
    alpha: 1e-7
    lambda_positivity: 0

# Integrability Settings
integrability_settings:
    integrability: False

use_fit_t0: True                        # Whether the t0 covariance is used in the chi2 loss.
t0pdfset: NNPDF40_nnlo_as_01180         # The t0 PDF used to build the t0 covariance matrix.


###################
# Methodology specs
###################
prior_settings:
    prior_distribution: uniform_parameter_prior
    prior_distribution_specs:
        bounds:
            alpha_gluon: [-0.1, 1]
            beta_gluon: [9, 13]
            alpha_up: [0.4, 0.9]
            beta_up: [3, 4.5]
            epsilon_up: [-3, 3]
            gamma_up: [1, 6]
            alpha_down: [1, 2]
            beta_down: [8, 12]
            epsilon_down: [-4.5, -3]
            gamma_down: [3.8, 5.8]
            norm_sigma: [0.1, 0.5]
            alpha_sigma: [-0.2, 0.1]
            beta_sigma: [1.2, 3]


# Nested Sampling settings
ultranest_settings:
    sampler_plot: true
    n_posterior_samples: 100        # Number of posterior samples generated.
    ReactiveNS_settings:
        vectorized: False
        ndraw_max: 500              # Maximum number of points to simultaneously propose.
        # Any of the options of ultranest ReactiveNestedSampler can be used
    Run_settings:
        min_num_live_points: 200    # Minimum number of live points throughout the run.
        min_ess: 50                 # Target number of effective posterior samples.
        frac_remain: 0.3            # Integrate until this fraction of the integral is left in the remainder.
        # Any of the options of ultranest ReactiveNestedSampler run method can be defined
    SliceSampler_settings:
        nsteps: 106                 # Number of accepted steps until the sample is considered independent.


actions_:
- run_ultranest_fit                 # Choose from ultranest_fit, monte_carlo_fit, analytic_fit

Note that prior bounds must be specified for each parameter. Alternatively, global bounds (i.e. the same bounds for all parameters) can be used, by replacing

bounds:
    alpha_gluon: [-0.1, 1]
    beta_gluon: [9, 13]
...

with, for example:

min_val: -4.5
max_val: 13

in cases where a single range is appropriate for all parameters of the model (e.g. there is only one parameter, or all parameters have similar numerical values).
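Nested samplers such as UltraNest draw points in the unit hypercube and map them to parameter space through a prior transform. The following is a minimal sketch of how uniform bounds, whether per-parameter or global, could define that map. The helper names expand_bounds and uniform_prior_transform are hypothetical and are not part of Colibri's API.

```python
def expand_bounds(param_names, bounds=None, min_val=None, max_val=None):
    """Return a (low, high) pair for every parameter.

    Hypothetical helper: uses per-parameter `bounds` when given,
    otherwise applies the global [min_val, max_val] range to all parameters.
    """
    if bounds is not None:
        missing = [name for name in param_names if name not in bounds]
        if missing:
            raise ValueError(f"No prior bounds given for: {missing}")
        return [tuple(bounds[name]) for name in param_names]
    if min_val is None or max_val is None:
        raise ValueError("Provide either `bounds` or both `min_val` and `max_val`.")
    return [(min_val, max_val) for _ in param_names]


def uniform_prior_transform(cube, limits):
    """Map a point of the unit hypercube to parameter space (uniform prior)."""
    return [low + u * (high - low) for u, (low, high) in zip(cube, limits)]


names = ["alpha_gluon", "beta_gluon"]

# Per-parameter bounds, as in the runcard above.
per_param = expand_bounds(
    names, bounds={"alpha_gluon": [-0.1, 1], "beta_gluon": [9, 13]}
)
# Global bounds: every parameter shares the same range.
global_limits = expand_bounds(names, min_val=-4.5, max_val=13)

# The centre of the unit cube maps to the midpoint of each range.
centre = uniform_prior_transform([0.5, 0.5], per_param)
centre_global = uniform_prior_transform([0.5, 0.5], global_limits)
```

With the global range, every parameter's prior spans 17.5 units, which is far wider than most of the per-parameter ranges in the runcard. This illustrates why global bounds are best suited to models whose parameters take similar values.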

For details on general settings (such as positivity) see this section.

ultranest_settings

  • ultranest_seed: Seed for the numpy random number generator used by UltraNest.

  • sampler_plot: If true, diagnostic plots (corner, run and trace plots) are generated in fit_output_directory/ultranest_logs/plots. These help assess the convergence and efficiency of the fit.

  • n_posterior_samples: Number of posterior samples (‘replicas’) drawn (resampled) from the posterior distribution. The default is 1000. See this tutorial for details on resampling.

  • vectorized: Determines whether the likelihood function supports vectorised evaluation (i.e., evaluating multiple points at once).

  • ndraw_max: Maximum number of points to simultaneously propose. Can be commented out.

  • min_num_live_points: Minimum number of live points throughout the run.

  • min_ess: Target number of effective posterior samples.

  • frac_remain: Integrate until this fraction of the integral is left in the remainder.

  • SliceSampler_settings: Settings for the slice sampler, which samples uniformly within “slices” of constant probability. Slice sampling is optional, so these settings can be commented out.

  • nsteps: Number of accepted steps until the sample is considered independent. This option is set under SliceSampler_settings.

  • posterior_resampling_seed: Seed for resampling the posterior samples.
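
To make min_ess, n_posterior_samples and posterior_resampling_seed more concrete, here is a minimal sketch of the effective sample size of a weighted sample and of seeded resampling to equal-weight replicas. It uses only the standard library and the common Kish estimate of the effective sample size. The toy weights, the function names and the choice of estimator are illustrative assumptions, not Colibri's or UltraNest's implementation.

```python
import random


def effective_sample_size(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def resample(samples, weights, n_posterior_samples, seed):
    """Draw equal-weight samples with probability proportional to the weights."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.choices(samples, weights=weights, k=n_posterior_samples)


# Toy weighted posterior: four samples of a single parameter.
samples = [0.41, 0.45, 0.47, 0.52]
weights = [0.1, 0.4, 0.4, 0.1]

# Unequal weights give an effective sample size below the number of samples.
ess = effective_sample_size(weights)

# Equal-weight 'replicas'; the same seed always returns the same draw.
replicas = resample(samples, weights, n_posterior_samples=100, seed=1)
```

In this picture, a larger min_ess asks the sampler for a better-populated weighted sample before stopping, while the resampling seed only controls which equal-weight replicas are drawn from it.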

prior_distribution

Besides a uniform prior distribution, as in the runcard above, the prior can be set from the Gaussian posterior obtained in a previous fit. This has been shown to yield equivalent results while being more computationally efficient (see Ref. CMMU25). This can be achieved by setting:

prior_settings:
    prior_distribution: prior_from_gauss_posterior
    prior_distribution_specs:
        prior_fit: your_previous_fit

The folder of the previous fit, your_previous_fit, needs to be placed in sys.prefix/share/colibri/results/.
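
As a quick check before launching the fit, the expected location can be resolved from Python. This sketch only assembles the path stated above; your_previous_fit is the placeholder name from the runcard and should be replaced with the actual fit name.

```python
import sys
from pathlib import Path

# Directory where previous fits need to be placed, as stated above.
results_dir = Path(sys.prefix) / "share" / "colibri" / "results"

# Placeholder name: replace with the value of `prior_fit` in the runcard.
prior_fit_dir = results_dir / "your_previous_fit"

print(f"Expected location of the previous fit: {prior_fit_dir}")
print(f"Folder found: {prior_fit_dir.is_dir()}")
```

Here sys.prefix refers to the root of the active Python environment, so the check should be run from the same environment in which Colibri is installed.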

UltraNest settings

ReactiveNS_settings accepts any of the options of UltraNest’s ReactiveNestedSampler, and Run_settings accepts any of the options of its run method. You can read more about them here.

Running the fit

In general, Colibri runcards can be executed by running the following command:

model_executable runcard.yaml

This must be done after installing the dependencies specific to the model. For example, for the Les Houches parametrisation model presented in this tutorial, the first step would be to run

pip install -e .

from the examples/les_houches_example directory.

Then, you can use the above runcard with the following command:

les_houches_exe runcard.yaml

Running fits will generate fit folders, the details of which can be found in this section.

Terminal output

As the fit runs, a status line and a live-point display are shown in the terminal at each iteration. For details on what they mean and how to interpret them, see the UltraNest documentation, specifically this page.