Resampling script

The final step in a Bayesian or analytic fit in Colibri (before evolution) is to write out posterior samples as .exportgrid files in the replicas/ folder. These samples will then be evolved into a PDF set by the evolution script.
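Before running the evolution script, it can be useful to confirm how many samples were actually written out. The following is a minimal sketch, not part of Colibri: the helper name count_exportgrids and the fit directory name my_fit are hypothetical, and the recursive search is an assumption that covers the case where the .exportgrid files sit in subfolders of replicas/.

```python
from pathlib import Path


def count_exportgrids(fit_dir):
    """Count .exportgrid files anywhere below <fit_dir>/replicas.

    Hypothetical helper for illustration; not provided by Colibri.
    """
    replicas_dir = Path(fit_dir) / "replicas"
    if not replicas_dir.is_dir():
        # No replicas/ folder means no samples have been written yet.
        return 0
    # rglob also finds files nested in per-replica subfolders (assumed layout).
    return sum(1 for _ in replicas_dir.rglob("*.exportgrid"))


# "my_fit" is a placeholder for the path to your fit directory.
print(count_exportgrids("my_fit"))
```

If the count differs from the n_posterior_samples value you requested, the sample-writing step may not have completed.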

As explained in the tutorial on running a Bayesian fit, you can decide how many samples are written out in the runcard. Here we give more details on the resampling settings, which apply to both Bayesian and analytic fits.

For a fit using the analytic inference method, you can set both the number of samples written out as replicas and the size of the full posterior sample in the analytic_settings block of the runcard. For example:

# Analytic settings
analytic_settings:
  n_posterior_samples: 100
  full_sample_size: 50000

Likewise, if you use the UltraNest nested sampler, you can set n_posterior_samples under ultranest_settings:

# ultranest settings
ultranest_settings:
  n_posterior_samples: 100
  ...

Key Parameters

n_posterior_samples: The number of individual posterior draws that will each be written out as a separate .exportgrid file in the replicas/ folder. If it is not specified, a default of 1000 samples will be written out.
full_sample_size (analytic only): The total size of the merged posterior sample, which is saved to full_posterior_sample.csv at the top level of your fit directory.

Note

For a fit done with the UltraNest nested sampler, full_sample_size defaults to an internally determined number that may depend on the specific run.
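The relationship between full_sample_size and n_posterior_samples can be pictured as drawing a small, seeded subset of rows from a larger stored sample. The sketch below only illustrates that idea and is not Colibri's implementation: the function name draw_replica_indices is hypothetical, and drawing without replacement is an assumption.

```python
import random


def draw_replica_indices(full_sample_size, n_posterior_samples, seed):
    """Pick which rows of a stored posterior sample become replicas.

    Illustrative only; the actual selection logic in Colibri may differ.
    """
    if n_posterior_samples > full_sample_size:
        raise ValueError("cannot draw more replicas than stored samples")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    # Assumption: distinct rows are drawn, i.e. sampling without replacement.
    return rng.sample(range(full_sample_size), n_posterior_samples)


indices = draw_replica_indices(
    full_sample_size=50000, n_posterior_samples=100, seed=1234
)
print(len(indices), len(set(indices)))
```

Because the generator is seeded, calling the function again with the same arguments returns the same indices, which is the behaviour one would expect from a fixed resampling seed.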

If you want to draw additional replicas from the posterior distribution of an already-completed PDF fit (or a smaller set, for example for finite-size effect studies), you do not need to re-run the full fit. Instead, use the resample_fit helper script.

Usage

To see all available options, invoke:

$ resample_fit --help

This will print out a help message that looks like this:

usage: resample_fit [-h] [--fitype FITYPE] [--nreplicas NREPLICAS] [--resampling_seed RESAMPLING_SEED]
                    [--resampled_fit_name RESAMPLED_FIT_NAME] [--parametrisation_scale PARAMETRISATION_SCALE]
                    fit_name

Script to resample from Bayesian posterior

positional arguments:
  fit_name              The colibri fit from which to sample.

options:
  -h, --help            show this help message and exit
  --fitype FITYPE, -t FITYPE
                        The type of fit to be resampled. Currently only `ultranest` and `analytic` are supported.
  --nreplicas NREPLICAS, -nrep NREPLICAS
                        The number of samples.
  --resampling_seed RESAMPLING_SEED, -seed RESAMPLING_SEED
                        The random seed to be used to sample from the posterior.
  --resampled_fit_name RESAMPLED_FIT_NAME, -newfit RESAMPLED_FIT_NAME
                        The name of the resampled fit.
  --parametrisation_scale PARAMETRISATION_SCALE, -Q PARAMETRISATION_SCALE
                        The scale at which the PDFs are fitted.

As an example, if we want to resample from the posterior distribution of an analytical fit called my_fit we can do it as follows:

$ resample_fit my_fit -t analytic -nrep 100 -seed 1234 -newfit my_resampled_fit
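For a finite-size effects study you may want several resampled fits that differ only in the number of replicas. The sketch below builds the corresponding command lines using the long option names from the help message above. The helper build_resample_command and the output names my_fit_nrep50, my_fit_nrep100 and my_fit_nrep200 are hypothetical and chosen only for illustration.

```python
import shlex


def build_resample_command(fit_name, fit_type, nreplicas, seed, new_fit_name):
    """Return the resample_fit argument list for one resampling run.

    Hypothetical helper; it only assembles options listed in the help text.
    """
    return [
        "resample_fit", fit_name,
        "--fitype", fit_type,
        "--nreplicas", str(nreplicas),
        "--resampling_seed", str(seed),
        "--resampled_fit_name", new_fit_name,
    ]


# One command per replica count; the output fit names are placeholders.
commands = [
    build_resample_command("my_fit", "analytic", n, 1234, f"my_fit_nrep{n}")
    for n in (50, 100, 200)
]
for cmd in commands:
    # Print the shell-quoted command instead of executing it.
    print(shlex.join(cmd))
```

The commands are only printed here. To execute them from Python, each list could be passed to subprocess.run(cmd, check=True) in an environment where resample_fit is installed.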

Note

To resample from the posterior distribution of a fit, you need to be in the same environment as the one used to perform the fit. For example, to resample a fit done using the Les Houches model, you need to be in the environment where the les_houches_exe executable is available.