All Functions

Analytic Fit

colibri.analytic_fit.py

For a linear model, this module allows for an analytic Bayesian fit of the model.

class colibri.analytic_fit.AnalyticFit(param_names: list, resampled_posterior: array, full_posterior_samples: array, bayesian_metrics: dict, analytic_specs: dict)[source]

Bases: BayesianFit

Dataclass containing the results and specs of an analytic fit.

analytic_specs

Dictionary containing the settings of the analytic fit.

Type:

dict

resampled_posterior

Array containing the resampled posterior samples.

Type:

jnp.array

analytic_specs: dict
colibri.analytic_fit.analytic_evidence_uniform_prior(sol_covmat, sol_mean, max_logl, a_vec, b_vec)[source]

Compute the log of the evidence for a Gaussian likelihood and a uniform prior. The implementation is based on the paper https://arxiv.org/pdf/2301.13783 and consists of a small improvement to the Laplace approximation.

Parameters:
  • sol_covmat (array) – Covariance matrix of the posterior (X^T Sigma^-1 X)^-1.

  • sol_mean

  • a_vec (np.ndarray) – Lower bounds of the Uniform prior.

  • b_vec (np.ndarray) – Upper bounds of the Uniform prior.

Returns:

The log evidence.

Return type:

float
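As a rough illustration of the quantity involved, the sketch below computes the plain Laplace estimate of the log evidence for a Gaussian likelihood and a uniform box prior. It does not include the improvement from the paper cited above, and it assumes the posterior mass lies well inside the prior box. The function name is hypothetical and is not part of colibri; only the argument names follow the documented signature.

```python
import numpy as np


def laplace_log_evidence_uniform_prior(sol_covmat, max_logl, a_vec, b_vec):
    """Plain Laplace estimate of log Z (hypothetical helper, not colibri's code).

    Assumes a Gaussian likelihood with maximum log-likelihood `max_logl` and
    posterior covariance `sol_covmat`, and a uniform prior on [a_vec, b_vec]
    whose bounds are far from the posterior bulk.
    """
    sol_covmat = np.atleast_2d(np.asarray(sol_covmat, dtype=float))
    a_vec = np.asarray(a_vec, dtype=float)
    b_vec = np.asarray(b_vec, dtype=float)

    # Log of the Gaussian integral: 0.5 * log det(2 pi Sigma)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * sol_covmat)

    # Log of the prior volume: sum_i log(b_i - a_i)
    log_prior_volume = np.sum(np.log(b_vec - a_vec))

    return max_logl + 0.5 * logdet - log_prior_volume


# Two-parameter toy example with a wide prior box
log_z = laplace_log_evidence_uniform_prior(
    np.diag([0.25, 0.01]), max_logl=-3.0, a_vec=[-10.0, -1.0], b_vec=[10.0, 1.0]
)
```

When the prior bounds cut into the posterior, this plain estimate overcounts the evidence, which is the kind of situation a correction to the Laplace approximation has to address.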

colibri.analytic_fit.analytic_fit(central_inv_covmat_index, _pred_data, pdf_model, analytic_settings, prior_settings, FIT_XGRID, fast_kernel_arrays)[source]

Analytic fits, for any linear PDF model.

The assumption is that the model is linear with an intercept: T(w) = T(0) + X w. The linear problem is solved by minimising the chi2:

chi2 = (D - (T(0) + X w))^T Sigma^-1 (D - (T(0) + X w)) = (Y - X w)^T Sigma^-1 (Y - X w), with Y = D - T(0).

Parameters:
  • central_inv_covmat_index (commondata_utils.CentralInvCovmatIndex) – dataclass containing central values and inverse covmat.

  • _pred_data (@jax.jit CompiledFunction) – Prediction function for the fit.

  • pdf_model (pdf_model.PDFModel) – PDF model to fit.

  • analytic_settings (dict) – Settings for the analytic fit.

  • prior_settings (PriorSettings) – Settings for the prior.

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • fast_kernel_arrays (tuple) – Tuple containing the fast kernel arrays.
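The chi2 minimisation described for analytic_fit has the standard generalised least squares solution w = (X^T Sigma^-1 X)^-1 X^T Sigma^-1 Y, with posterior covariance (X^T Sigma^-1 X)^-1. The following is a minimal NumPy sketch of that algebra on a toy problem. The helper name and the toy arrays are assumptions made for illustration and do not reproduce colibri's implementation.

```python
import numpy as np


def solve_linear_fit(X, Y, inv_covmat):
    """Minimise (Y - X w)^T Sigma^-1 (Y - X w) over w (hypothetical helper)."""
    # X^T Sigma^-1 X; its inverse is the covariance of the solution
    fisher = X.T @ inv_covmat @ X
    sol_covmat = np.linalg.inv(fisher)
    # Mean of the Gaussian posterior (flat prior, well inside its bounds)
    sol_mean = sol_covmat @ X.T @ inv_covmat @ Y
    return sol_mean, sol_covmat


# Toy problem: 3 data points, 2 parameters, diagonal inverse covariance
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w_true = np.array([0.5, -2.0])
T0 = np.array([0.1, 0.2, 0.3])  # intercept T(0)
D = T0 + X @ w_true             # noiseless data
inv_covmat = np.diag([1.0, 4.0, 2.0])

# Subtract the intercept first, so that Y = D - T(0)
sol_mean, sol_covmat = solve_linear_fit(X, D - T0, inv_covmat)
```

With noiseless data the recovered sol_mean coincides with w_true, which is a convenient sanity check for the algebra.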

colibri.analytic_fit.run_analytic_fit(analytic_fit, output_path, pdf_model)[source]

Export the results of an analytic fit.

Parameters:
  • analytic_fit (AnalyticFit) – The results of the analytic fit.

  • output_path (pathlib.PosixPath) – Path to the output folder.

  • pdf_model (pdf_model.PDFModel) – The PDF model used in the fit.

Bayes Prior

colibri.bayes_prior.bayesian_prior(prior_settings, pdf_model)[source]

Produces a prior transform function.

Parameters:

prior_settings (dict) – The settings for the prior transform.

Returns:

prior_transform – The prior transform function.

Return type:

@jax.jit CompiledFunction
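In nested sampling, a prior transform maps points of the unit hypercube to parameter space. For a uniform prior with bounds min_val and max_val (the keys quoted in the PriorSettings example), this is a simple affine map. The sketch below uses plain NumPy rather than a jitted JAX function, and the factory name is an assumption made for illustration.

```python
import numpy as np


def make_uniform_prior_transform(min_val, max_val):
    """Return a function mapping u in [0, 1]^n to [min_val, max_val]^n.

    Hypothetical helper illustrating the idea; not colibri's implementation.
    """

    def prior_transform(u):
        u = np.asarray(u, dtype=float)
        # Affine rescaling of the unit interval onto [min_val, max_val]
        return min_val + (max_val - min_val) * u

    return prior_transform


prior_transform = make_uniform_prior_transform(min_val=-1.0, max_val=1.0)
theta = prior_transform([0.0, 0.5, 1.0])  # -> [-1.0, 0.0, 1.0]
```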

Core

colibri.core.py

Core module of colibri, containing the main (data) classes for the framework.

class colibri.core.IntegrabilitySettings(integrability: bool, integrability_specs: dict)[source]

Bases: object

Dataclass containing the settings for the Integrability constraints to be imposed during a fit.

integrability

Whether to impose integrability constraints.

Type:

bool

integrability_specs

The settings for the integrability constraints.

Type:

dict

Example

integrability_settings:
  integrability: True
  integrability_specs: {'evolution_flavours': [V, V3, V8, T3, T8], 'lambda_integrability': 1000}

integrability: bool
integrability_specs: dict
class colibri.core.PriorSettings(prior_distribution: str, prior_distribution_specs: dict)[source]

Bases: object

Dataclass containing the settings for the prior transform.

prior_distribution

The type of prior transform.

Type:

str

prior_distribution_specs

The settings for the prior distribution. Examples: if prior_distribution is “uniform_parameter_prior”, prior_distribution_specs could be {“max_val”: 1.0, “min_val”: -1.0}

Type:

dict

prior_distribution: str
prior_distribution_specs: dict

Export Results

colibri.export_results.py

This module contains the functions to export the results of the fit.

class colibri.export_results.BayesianFit(param_names: list, resampled_posterior: array, full_posterior_samples: array, bayesian_metrics: dict)[source]

Bases: object

Dataclass containing the results and specs of a Bayesian fit.

param_names

List of the names of the parameters.

Type:

list

resampled_posterior

Array containing the resampled posterior samples.

Type:

jnp.array

full_posterior_samples

Array containing the full posterior samples.

Type:

jnp.array

bayes_complexity

The Bayesian complexity of the model.

Type:

float

avg_chi2

The average chi2 of the model.

Type:

float

min_chi2

The minimum chi2 of the model.

Type:

float

logz

The log evidence of the model.

Type:

float

bayesian_metrics: dict
full_posterior_samples: array
param_names: list
resampled_posterior: array
colibri.export_results.export_bayes_results(bayes_fit, output_path, results_name)[source]

Export the results of a Bayesian fit to a csv file.

Parameters:
  • bayes_fit (BayesianFit) – The results of the Bayesian fit.

  • output_path (pathlib.PosixPath) – Path to the output folder.

  • results_name (str) – Name of the results file.

colibri.export_results.get_pdfgrid_from_exportgrids(fit_path: Path)[source]

Reads the exportgrids contained in the replicas folder of the fit_path and returns the pdf grid of shape (Nrep, Nfl, Nx) in the evolution basis.

Parameters:

fit_path (pathlib.Path) – Path to the fit folder.

Returns:

pdf_grid – Array containing the pdf grid in the evolution basis.

Return type:

np.array

colibri.export_results.read_exportgrid(exportgrid_path: Path)[source]

Reads an exportgrid file from the output path, and returns a dictionary containing the pdf grid in the evolution basis.

Parameters:

exportgrid_path (pathlib.Path) – Path to the exportgrid file.

Returns:

export_grid – Dictionary containing the pdf grid in the evolution basis.

Return type:

dict

colibri.export_results.write_exportgrid(grid_for_writing, grid_name, replica_index, Q=1.65, xgrid=[1e-09, 1.29708482343957e-09, 1.68242903474257e-09, 2.18225315420583e-09, 2.83056741739819e-09, 3.67148597892941e-09, 4.76222862935315e-09, 6.1770142737618e-09, 8.01211109898438e-09, 1.03923870607245e-08, 1.34798064073805e-08, 1.74844503691778e-08, 2.26788118881103e-08, 2.94163370300835e-08, 3.81554746595878e-08, 4.94908707232129e-08, 6.41938295708371e-08, 8.32647951986859e-08, 1.08001422993829e-07, 1.4008687308113e-07, 1.81704331793772e-07, 2.35685551545377e-07, 3.05703512595323e-07, 3.96522309841747e-07, 5.1432125723657e-07, 6.67115245136676e-07, 8.65299922973143e-07, 1.12235875241487e-06, 1.45577995547683e-06, 1.88824560514613e-06, 2.44917352454946e-06, 3.17671650028717e-06, 4.12035415232797e-06, 5.3442526575209e-06, 6.93161897806315e-06, 8.99034258238145e-06, 1.16603030112258e-05, 1.51228312288769e-05, 1.96129529349212e-05, 2.54352207134502e-05, 3.29841683435992e-05, 4.27707053972016e-05, 5.54561248105849e-05, 7.18958313632514e-05, 9.31954227979614e-05, 0.00012078236773133, 0.000156497209466554, 0.000202708936328495, 0.000262459799331951, 0.000339645244168985, 0.000439234443000422, 0.000567535660104533, 0.000732507615725537, 0.000944112105452451, 0.00121469317686978, 0.00155935306118224, 0.00199627451141338, 0.00254691493736552, 0.00323597510213126, 0.00409103436509565, 0.00514175977083962, 0.00641865096062317, 0.00795137940306351, 0.009766899996241, 0.0118876139251364, 0.0143298947643919, 0.0171032279460271, 0.0202100733925079, 0.0236463971369542, 0.0274026915728357, 0.0314652506132444, 0.0358174829282429, 0.0404411060163317, 0.0453171343973807, 0.0504266347950069, 0.0557512610084339, 0.0612736019390519, 0.0669773829498255, 0.0728475589986517, 0.0788703322292727, 0.0850331197801452, 0.0913244910278679, 0.0977340879783772, 0.104252538208639, 0.110871366547237, 0.117582909372878, 0.124380233801599, 0.131257062945031, 0.138207707707289, 0.145227005135651, 
0.152310263065985, 0.159453210652156, 0.166651954293987, 0.173902938455578, 0.181202910873333, 0.188548891679097, 0.195938145999193, 0.203368159629765, 0.210836617429103, 0.218341384106561, 0.225880487124065, 0.233452101459503, 0.241054536011681, 0.248686221452762, 0.256345699358723, 0.264031612468684, 0.271742695942783, 0.279477769504149, 0.287235730364833, 0.295015546847664, 0.302816252626866, 0.310636941519503, 0.318476762768082, 0.326334916761672, 0.334210651149156, 0.342103257303627, 0.350012067101685, 0.357936449985571, 0.365875810279643, 0.373829584735962, 0.381797240286494, 0.389778271981947, 0.397772201099286, 0.40577857340234, 0.413796957540671, 0.421826943574548, 0.429868141614175, 0.437920180563205, 0.44598270695699, 0.454055383887562, 0.462137890007651, 0.470229918607142, 0.478331176755675, 0.486441384506059, 0.494560274153348, 0.502687589545177, 0.510823085439086, 0.518966526903235, 0.527117688756998, 0.535276355048428, 0.543442318565661, 0.551615380379768, 0.559795349416641, 0.5679820420558, 0.576175281754088, 0.584374898692498, 0.59258072944444, 0.60079261666395, 0.609010408792398, 0.61723395978245, 0.625463128838069, 0.633697780169485, 0.641937782762089, 0.650183010158361, 0.658433340251944, 0.666688655093089, 0.674948840704708, 0.683213786908386, 0.691483387159697, 0.699757538392251, 0.708036140869916, 0.716319098046733, 0.724606316434025, 0.732897705474271, 0.741193177421404, 0.749492647227008, 0.757796032432224, 0.766103253064927, 0.774414231541921, 0.782728892575836, 0.791047163086478, 0.799368972116378, 0.807694250750291, 0.816022932038457, 0.824354950923382, 0.832690244169987, 0.841028750298844, 0.8493704095226, 0.857715163684985, 0.866062956202683, 0.874413732009721, 0.882767437504206, 0.891124020497459, 0.899483430165226, 0.907845617001021, 0.916210532771399, 0.924578130473112, 0.932948364292029, 0.941321189563734, 0.949696562735755, 0.958074441331298, 0.966454783914439, 0.974837550056705, 0.983222700304978, 0.991610196150662, 1.0], 
export_labels=['TBAR', 'BBAR', 'CBAR', 'SBAR', 'UBAR', 'DBAR', 'GLUON', 'D', 'U', 'S', 'C', 'B', 'T', 'PHT'])[source]

Writes an exportgrid file to the output path. The exportgrids are written in the format required by EKO, but are not yet evolved.

Note: grid should be in the evolution basis.

Parameters:
  • grid_for_writing (jnp.array) – An array of shape (Nfl,Nx) containing the PDF values. Note that the grid should be in the evolution basis.

  • grid_name (str) – The name of the grid to write.

  • replica_index (int) – The replica number which will be written.

colibri.export_results.write_replicas(bayes_fit, output_path, pdf_model, Q=1.65, xgrid=[1e-09, 1.29708482343957e-09, 1.68242903474257e-09, 2.18225315420583e-09, 2.83056741739819e-09, 3.67148597892941e-09, 4.76222862935315e-09, 6.1770142737618e-09, 8.01211109898438e-09, 1.03923870607245e-08, 1.34798064073805e-08, 1.74844503691778e-08, 2.26788118881103e-08, 2.94163370300835e-08, 3.81554746595878e-08, 4.94908707232129e-08, 6.41938295708371e-08, 8.32647951986859e-08, 1.08001422993829e-07, 1.4008687308113e-07, 1.81704331793772e-07, 2.35685551545377e-07, 3.05703512595323e-07, 3.96522309841747e-07, 5.1432125723657e-07, 6.67115245136676e-07, 8.65299922973143e-07, 1.12235875241487e-06, 1.45577995547683e-06, 1.88824560514613e-06, 2.44917352454946e-06, 3.17671650028717e-06, 4.12035415232797e-06, 5.3442526575209e-06, 6.93161897806315e-06, 8.99034258238145e-06, 1.16603030112258e-05, 1.51228312288769e-05, 1.96129529349212e-05, 2.54352207134502e-05, 3.29841683435992e-05, 4.27707053972016e-05, 5.54561248105849e-05, 7.18958313632514e-05, 9.31954227979614e-05, 0.00012078236773133, 0.000156497209466554, 0.000202708936328495, 0.000262459799331951, 0.000339645244168985, 0.000439234443000422, 0.000567535660104533, 0.000732507615725537, 0.000944112105452451, 0.00121469317686978, 0.00155935306118224, 0.00199627451141338, 0.00254691493736552, 0.00323597510213126, 0.00409103436509565, 0.00514175977083962, 0.00641865096062317, 0.00795137940306351, 0.009766899996241, 0.0118876139251364, 0.0143298947643919, 0.0171032279460271, 0.0202100733925079, 0.0236463971369542, 0.0274026915728357, 0.0314652506132444, 0.0358174829282429, 0.0404411060163317, 0.0453171343973807, 0.0504266347950069, 0.0557512610084339, 0.0612736019390519, 0.0669773829498255, 0.0728475589986517, 0.0788703322292727, 0.0850331197801452, 0.0913244910278679, 0.0977340879783772, 0.104252538208639, 0.110871366547237, 0.117582909372878, 0.124380233801599, 0.131257062945031, 0.138207707707289, 0.145227005135651, 0.152310263065985, 
0.159453210652156, 0.166651954293987, 0.173902938455578, 0.181202910873333, 0.188548891679097, 0.195938145999193, 0.203368159629765, 0.210836617429103, 0.218341384106561, 0.225880487124065, 0.233452101459503, 0.241054536011681, 0.248686221452762, 0.256345699358723, 0.264031612468684, 0.271742695942783, 0.279477769504149, 0.287235730364833, 0.295015546847664, 0.302816252626866, 0.310636941519503, 0.318476762768082, 0.326334916761672, 0.334210651149156, 0.342103257303627, 0.350012067101685, 0.357936449985571, 0.365875810279643, 0.373829584735962, 0.381797240286494, 0.389778271981947, 0.397772201099286, 0.40577857340234, 0.413796957540671, 0.421826943574548, 0.429868141614175, 0.437920180563205, 0.44598270695699, 0.454055383887562, 0.462137890007651, 0.470229918607142, 0.478331176755675, 0.486441384506059, 0.494560274153348, 0.502687589545177, 0.510823085439086, 0.518966526903235, 0.527117688756998, 0.535276355048428, 0.543442318565661, 0.551615380379768, 0.559795349416641, 0.5679820420558, 0.576175281754088, 0.584374898692498, 0.59258072944444, 0.60079261666395, 0.609010408792398, 0.61723395978245, 0.625463128838069, 0.633697780169485, 0.641937782762089, 0.650183010158361, 0.658433340251944, 0.666688655093089, 0.674948840704708, 0.683213786908386, 0.691483387159697, 0.699757538392251, 0.708036140869916, 0.716319098046733, 0.724606316434025, 0.732897705474271, 0.741193177421404, 0.749492647227008, 0.757796032432224, 0.766103253064927, 0.774414231541921, 0.782728892575836, 0.791047163086478, 0.799368972116378, 0.807694250750291, 0.816022932038457, 0.824354950923382, 0.832690244169987, 0.841028750298844, 0.8493704095226, 0.857715163684985, 0.866062956202683, 0.874413732009721, 0.882767437504206, 0.891124020497459, 0.899483430165226, 0.907845617001021, 0.916210532771399, 0.924578130473112, 0.932948364292029, 0.941321189563734, 0.949696562735755, 0.958074441331298, 0.966454783914439, 0.974837550056705, 0.983222700304978, 0.991610196150662, 1.0], export_labels=['TBAR', 
'BBAR', 'CBAR', 'SBAR', 'UBAR', 'DBAR', 'GLUON', 'D', 'U', 'S', 'C', 'B', 'T', 'PHT'])[source]

Write the replicas of the Bayesian fit to export grids.

Parameters:
  • bayes_fit (BayesianFit) – The results of the Bayesian fit.

  • output_path (pathlib.PosixPath) – Path to the output folder.

  • pdf_model (pdf_model.PDFModel) – The PDF model used in the fit.

Monte Carlo Fit

colibri.monte_carlo_fit.py

This module contains the main Monte Carlo fitting routine of colibri.

class colibri.monte_carlo_fit.MonteCarloFit(monte_carlo_specs: dict, training_loss: array, validation_loss: array, optimized_parameters: array)[source]

Bases: object

Dataclass containing the results and specs of a Monte Carlo fit.

monte_carlo_specs

Dictionary containing the settings of the Monte Carlo fit.

Type:

dict

training_loss

Array containing the training loss.

Type:

jnp.array

validation_loss

Array containing the validation loss.

Type:

jnp.array

optimized_parameters

Array containing the optimized parameters.

Type:

jnp.array

monte_carlo_specs: dict
optimized_parameters: array
training_loss: array
validation_loss: array
colibri.monte_carlo_fit.monte_carlo_fit(_chi2_training_data_with_positivity, _chi2_validation_data_with_positivity, _pred_data, fast_kernel_arrays, positivity_fast_kernel_arrays, len_trval_data, pdf_model, mc_initial_parameters, optimizer_provider, early_stopper, max_epochs, FIT_XGRID, batch_size=None, batch_seed=1, alpha=1e-07, lambda_positivity=1000)[source]

This function performs a Monte Carlo fit.

Parameters:
  • _chi2_training_data_with_positivity (PjitFunction) – Function that computes the chi2 of the training data.

  • _chi2_validation_data_with_positivity (PjitFunction) – Function that computes the chi2 of the validation data.

  • _pred_data (theory_predictions.make_pred_data) – The function to compute the theory predictions.

  • len_trval_data (tuple) – Tuple containing the length of the training and validation data.

  • pdf_model (pdf_model.PDFModel) – A PDFModel specifying the way in which the PDF is constructed from the parameters.

  • mc_initial_parameters (jnp.array) – Initial parameters for the Monte Carlo fit.

  • optimizer_provider (optax._src.base.GradientTransformationExtraArgs) – Optax optimizer.

  • early_stopper (flax.training.early_stopping.EarlyStopping) – Early stopping criteria.

  • max_epochs (int) – Maximum number of epochs.

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • batch_size (int, default is None which sets it to the full size of data) – Size of batches during training.

  • batch_seed (int, optional) – Seed used to construct the batches. Defaults to 1.

  • alpha (float, optional) – Alpha parameter of the ELU positivity penalty term. Defaults to 1e-7.

  • lambda_positivity (int, optional) – Lagrange multiplier of the positivity penalty. Defaults to 1000.

Returns:

The result of the fit, with the following attributes: monte_carlo_specs (dict), training_loss (jnp.array), validation_loss (jnp.array).

Return type:

MonteCarloFit
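The alpha and lambda_positivity parameters suggest a penalty of the form lambda * sum(ELU_alpha(-prediction)), which is close to zero for positive predictions and grows linearly for negative ones. The sketch below illustrates that form in NumPy. The exact expression used by colibri is an assumption here, and both function names are hypothetical.

```python
import numpy as np


def elu(t, alpha):
    """ELU activation: t for t > 0, alpha * (exp(t) - 1) otherwise."""
    t = np.asarray(t, dtype=float)
    # np.minimum keeps expm1 from overflowing on the branch that is discarded
    return np.where(t > 0.0, t, alpha * np.expm1(np.minimum(t, 0.0)))


def positivity_penalty(predictions, alpha=1e-7, lambda_positivity=1000):
    """Assumed penalty: lambda * sum(ELU_alpha(-prediction))."""
    predictions = np.asarray(predictions, dtype=float)
    return lambda_positivity * np.sum(elu(-predictions, alpha))


penalty_ok = positivity_penalty([1.0, 2.0])    # all positive: nearly zero
penalty_bad = positivity_penalty([1.0, -0.5])  # one negative: about 500
```

A small alpha keeps the contribution of positive predictions negligible, while lambda_positivity sets how strongly negative predictions are penalised relative to the chi2.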

colibri.monte_carlo_fit.run_monte_carlo_fit(monte_carlo_fit, pdf_model, output_path, replica_index)[source]

Runs the Monte Carlo fit and writes the output to the output directory.

Parameters:
  • monte_carlo_fit (MonteCarloFit) – The results of the Monte Carlo fit.

  • pdf_model (pdf_model.PDFModel) – The PDF model used in the fit.

  • output_path (pathlib.PosixPath) – Path to the output folder.

  • replica_index (int)

Provider Aliases

colibri.provider_aliases.py

Module collecting aliases for functions when used as providers.

Ultranest Fit

colibri.ultranest_fit.py

This module contains the main Bayesian fitting routine of colibri.

class colibri.ultranest_fit.UltranestFit(param_names: list, resampled_posterior: array, full_posterior_samples: array, bayesian_metrics: dict, ultranest_specs: dict, ultranest_result: dict)[source]

Bases: BayesianFit

Dataclass containing the results and specs of an Ultranest fit.

ultranest_specs

Dictionary containing the settings of the Ultranest fit.

Type:

dict

ultranest_result

Result from ultranest; can be used, e.g., for corner plots.

Type:

dict

ultranest_result: dict
ultranest_specs: dict
colibri.ultranest_fit.run_ultranest_fit(ultranest_fit, output_path, pdf_model)[source]

Export the results of an Ultranest fit.

Parameters:
  • ultranest_fit (UltranestFit) – The results of the Ultranest fit.

  • output_path (pathlib.PosixPath) – Path to the output folder.

  • pdf_model (pdf_model.PDFModel) – The PDF model used in the fit.

colibri.ultranest_fit.ultranest_fit(pdf_model, bayesian_prior, ns_settings, log_likelihood)[source]

The complete Nested Sampling fitting routine, for any PDF model.

Parameters:
  • pdf_model (pdf_model.PDFModel) – The PDF model to fit.

  • bayesian_prior (@jax.jit CompiledFunction) – The prior function for the model.

  • ns_settings (dict) – Settings for the Nested Sampling fit.

  • log_likelihood (Callable) – The log likelihood function for the model.

Returns:

Dataclass containing the results and specs of an Ultranest fit.

Return type:

UltranestFit

Checks

Module that contains checks for the colibri package. See validphys/checks.py and reportengine/checks.py for more information / examples.

colibri.checks.check_pdf_model_is_linear(pdf_model, FIT_XGRID, data)[source]

Decorator that can be added to functions to check that the PDF model is linear.

colibri.checks.check_pdf_models_equal(prior_settings, pdf_model, theoryid)[source]

Decorator that can be added to functions to check that the PDF model used as prior (e.g. when using prior_settings[“type”] == “prior_from_gauss_posterior”) matches the PDF model used in the current fit (pdf_model).

Covariance Matrices

colibri.covmats.py

Module containing covariance matrices functions.

Notes: Several functions are taken from validphys.covmats

colibri.covmats.colibri_dataset_inputs_t0_predictions(_pred_t0data, t0_pdf_grid, fast_kernel_arrays)[source]

Similar to validphys.covmats.dataset_inputs_t0_predictions.

Parameters:
  • _pred_t0data (jax.jit compiled function) – function taking a pdf grid and returning theory prediction for one data group

  • t0_pdf_grid (jnp.array)

Returns:

t0predictions – list of theory predictions for each dataset

Return type:

list

colibri.covmats.dataset_inputs_covmat_from_systematics(data, experimental_commondata_tuple)[source]

Similar to validphys.covmats.dataset_inputs_covmat_from_systematics, but returns a jax.numpy array.

Note: see production rule in config.py for commondata_tuple options.

colibri.covmats.dataset_inputs_t0_covmat_from_systematics(data, experimental_commondata_tuple, colibri_dataset_inputs_t0_predictions)[source]

Similar to validphys.covmats.dataset_inputs_t0_covmat_from_systematics, but returns a jax.numpy array.

Note: see production rule in config.py for commondata_tuple options.

colibri.covmats.sqrt_covmat_jax(covariance_matrix)[source]

Same as validphys.covmats.sqrt_covmat but for jax.numpy arrays

Parameters:

covariance_matrix (jnp.ndarray) – A positive definite covariance matrix, which is N_dat x N_dat (where N_dat is the number of data points after cuts) containing uncertainty and correlation information.

Returns:

sqrt_mat – The square root of the input covariance matrix, which is N_dat x N_dat (where N_dat is the number of data points after cuts), and which is the lower triangular decomposition. The following should be True: jnp.allclose(sqrt_mat @ sqrt_mat.T, covariance_matrix).

Return type:

jnp.ndarray
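The defining property of the returned matrix can be checked on a small example. The sketch below uses a plain Cholesky factorisation in NumPy as a stand-in. Whether sqrt_covmat_jax applies any rescaling before factorising is not stated here, so this only illustrates the property, not the implementation.

```python
import numpy as np

# Small symmetric, positive definite covariance matrix (toy values)
covariance_matrix = np.array(
    [
        [4.0, 1.2, 0.4],
        [1.2, 2.0, 0.3],
        [0.4, 0.3, 1.0],
    ]
)

# Lower triangular factor L such that L @ L.T reproduces the input
sqrt_mat = np.linalg.cholesky(covariance_matrix)

is_lower_triangular = np.allclose(sqrt_mat, np.tril(sqrt_mat))
reproduces_input = np.allclose(sqrt_mat @ sqrt_mat.T, covariance_matrix)
```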

Loss Functions

colibri.loss_functions.py

This module provides the functions necessary for the computation of the chi2.

colibri.loss_functions.chi2(central_values, predictions, inv_covmat)[source]

Compute the chi2 loss.

Parameters:
  • central_values (jnp.ndarray) – The central values of the data.

  • predictions (jnp.ndarray) – The predictions of the model.

  • inv_covmat (jnp.ndarray) – The inverse of the covariance matrix.

Returns:

loss – The chi2 loss.

Return type:

jnp.ndarray
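The chi2 is the quadratic form of the residuals with the inverse covariance matrix, (D - T)^T Sigma^-1 (D - T). The following NumPy sketch evaluates it on toy numbers. The function name is hypothetical, and the real function operates on jax.numpy arrays.

```python
import numpy as np


def chi2_sketch(central_values, predictions, inv_covmat):
    """Quadratic form diff^T Sigma^-1 diff (illustrative, NumPy version)."""
    diff = predictions - central_values
    return diff @ inv_covmat @ diff


central_values = np.array([1.0, 2.0, 3.0])
predictions = np.array([1.5, 2.0, 2.0])
inv_covmat = np.diag([4.0, 1.0, 0.25])

# Residuals are [0.5, 0.0, -1.0], so chi2 = 0.25 * 4 + 0 + 1 * 0.25 = 1.25
loss = chi2_sketch(central_values, predictions, inv_covmat)
```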

Common Data Utils

colibri.commondata_utils.py

Module containing commondata and central covmat index functions.

class colibri.commondata_utils.CentralCovmatIndex(central_values: array, covmat: array, central_values_idx: array)[source]

Bases: object

central_values: array
central_values_idx: array
covmat: array
to_dict()[source]
class colibri.commondata_utils.CentralInvCovmatIndex(central_values: array, inv_covmat: array, central_values_idx: array)[source]

Bases: object

central_values: array
central_values_idx: array
inv_covmat: array
to_dict()[source]
colibri.commondata_utils.central_covmat_index(commondata_tuple, fit_covariance_matrix)[source]

Given a commondata_tuple and a covariance matrix, each generated by the respective explicit node in config.py, store the relevant data in a CentralCovmatIndex dataclass.

Parameters:
  • commondata_tuple (tuple) – tuple of commondata instances, generated as an explicit node (see config.produce_commondata_tuple) according to the specified options.

  • fit_covariance_matrix (jnp.array) – covariance matrix, generated as an explicit node (see config.produce_fit_covariance_matrix); it is either the experimental or the t0 covariance matrix, depending on the value of use_fit_t0.

Returns:

dataclass containing central values, covariance matrix and index of central values

Return type:

CentralCovmatIndex dataclass

colibri.commondata_utils.central_inv_covmat_index(central_covmat_index)[source]

Given a CentralCovmatIndex dataclass, compute the inverse of the covariance matrix and store the relevant data into CentralInvCovmatIndex dataclass.

colibri.commondata_utils.experimental_commondata_tuple(data)[source]

Returns a tuple (validphys nodes should be immutable) of commondata instances with experimental central values.

Parameters:

data (validphys.core.DataGroupSpec)

Returns:

tuple of nnpdf_data.coredata.CommonData instances

Return type:

tuple

colibri.commondata_utils.level_0_commondata_tuple(data, experimental_commondata_tuple, closure_test_central_pdf_grid, FIT_XGRID, fast_kernel_arrays, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Returns a tuple (validphys nodes should be immutable) of commondata instances with experimental central values replaced with theory predictions computed from a PDF closure_test_pdf and fktables corresponding to datasets within data.

Parameters:
  • data (validphys.core.DataGroupSpec)

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • experimental_commondata_tuple (tuple) – tuple of commondata with experimental central values

  • closure_test_central_pdf_grid (jnp.array) – grid is of shape N_fl x N_x

  • fast_kernel_arrays (tuple) – tuple of jnp.array of shape (Ndat, Nfl, Nfk_xgrid) containing the fast kernel arrays for each dataset in data.

  • flavour_indices (list, default is None) – Subset of flavour (evolution basis) indices to be used.

  • fill_fk_xgrid_with_zeros (bool, default is False) – If True, then the missing xgrid points in the FK table will be filled with zeros. This is useful when the FK table is needed as tensor of shape (Ndat, Nfl, Nfk_xgrid) with Nfk_xgrid and Nfl fixed for all datasets.

Returns:

tuple of nnpdf_data.coredata.CommonData instances

Return type:

tuple

colibri.commondata_utils.level_1_commondata_tuple(level_0_commondata_tuple, data_generation_covariance_matrix, level_1_seed=123456)[source]

Returns a tuple (validphys nodes should be immutable) of level 1 commondata instances. Noise is added to the level_0_commondata_tuple central values according to a multivariate Gaussian with covariance data_generation_covariance_matrix

Parameters:
  • level_0_commondata_tuple (tuple of nnpdf_data.coredata.CommonData instances) – A tuple of level_0 closure test data.

  • data_generation_covariance_matrix (jnp.array) – The covariance matrix used for data generation.

  • level_1_seed (int) – The random seed from which the level_1 data is drawn.

Returns:

tuple of nnpdf_data.coredata.CommonData instances

Return type:

tuple
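The level 1 construction amounts to adding one draw of correlated Gaussian noise to the level 0 central values. The sketch below shows this with NumPy's random generator on plain arrays. The helper name is hypothetical, and colibri's actual random number handling and CommonData bookkeeping are not reproduced.

```python
import numpy as np


def add_level_1_noise(level_0_central_values, covmat, level_1_seed=123456):
    """Shift central values by one draw from N(0, covmat) (hypothetical helper)."""
    rng = np.random.default_rng(level_1_seed)
    noise = rng.multivariate_normal(
        mean=np.zeros(len(level_0_central_values)), cov=covmat
    )
    return level_0_central_values + noise


level_0 = np.array([1.0, 2.0, 3.0])
covmat = np.array(
    [
        [0.04, 0.01, 0.00],
        [0.01, 0.09, 0.02],
        [0.00, 0.02, 0.16],
    ]
)

# The same seed always gives the same level 1 data
level_1 = add_level_1_noise(level_0, covmat)
level_1_again = add_level_1_noise(level_0, covmat)
level_1_other = add_level_1_noise(level_0, covmat, level_1_seed=1)
```

Fixing the seed makes the level 1 data reproducible across runs, which matters when several fits are meant to share the same fluctuated data.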

colibri.commondata_utils.pseudodata_central_covmat_index(commondata_tuple, data_generation_covariance_matrix)[source]

Same as central_covmat_index, but with the pseudodata generation covariance matrix for a Monte Carlo fit.

Data Batch

colibri.data_batch.py

Module containing data batches provider.

class colibri.data_batch.DataBatches(data_batch_stream_index: Callable, num_batches: int, batch_size: int)[source]

Bases: object

batch_size: int
data_batch_stream_index: Callable
num_batches: int
colibri.data_batch.data_batches(n_training_points, batch_size, batch_seed=1)[source]
Parameters:
  • n_training_points (int)

  • batch_size (int, default is None which sets it to n_training_points)

  • batch_seed (int, default is 1)

Return type:

DataBatches dataclass
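One common way to build batches is to shuffle the training indices with a seeded generator and cut them into chunks of batch_size. The sketch below follows that approach. How colibri streams batches, and what it does with a remainder that does not fill a batch, are assumptions here, and the function name is hypothetical.

```python
import numpy as np


def make_batches(n_training_points, batch_size=None, batch_seed=1):
    """Split shuffled indices into equally sized batches (hypothetical helper)."""
    if batch_size is None:
        # A single batch containing all training points
        batch_size = n_training_points

    rng = np.random.default_rng(batch_seed)
    shuffled = rng.permutation(n_training_points)

    # Indices that do not fill a complete batch are dropped in this sketch
    num_batches = n_training_points // batch_size
    return [
        shuffled[i * batch_size : (i + 1) * batch_size] for i in range(num_batches)
    ]


batches = make_batches(10, batch_size=3)  # 3 batches of 3 indices each
full_batch = make_batches(10)             # 1 batch of 10 indices
```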

MC Initialisation

colibri.mc_initialisation.mc_initial_parameters(pdf_model, mc_initialiser_settings, replica_index)[source]

This function initialises the parameters in a Monte Carlo fit.

Parameters:
  • pdf_model (pdf_model.PDFModel) – The PDF model to initialise the parameters for.

  • mc_initialiser_settings (dict) – The settings for the initialiser.

  • replica_index (int) – The index of the replica.

Returns:

initial_values – The initial values for the parameters.

Return type:

jnp.array

Optax Optimizer

colibri.optax_optimizer.py

Module contains functions for optax gradient descent optimisation.

colibri.optax_optimizer.early_stopper(min_delta=1e-05, patience=20, max_epochs=1000, mc_validation_fraction=0.2)[source]

Define the early stopping criteria. If mc_validation_fraction is zero then patience is the same as max_epochs.
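The roles of min_delta and patience can be seen in a generic early stopping loop: training stops once the validation loss has failed to improve by more than min_delta for patience consecutive epochs. The pure Python sketch below is only an illustration of that rule. It is not the logic of the early stopping class used by colibri, whose counting conventions may differ.

```python
def epochs_until_stop(validation_losses, min_delta=1e-5, patience=20):
    """Return the epoch (1-based) at which a generic early stopper would halt."""
    best = float("inf")
    epochs_without_improvement = 0

    for epoch, loss in enumerate(validation_losses, start=1):
        if best - loss > min_delta:
            # Significant improvement: record it and reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch

    # Never triggered: training runs through all available epochs
    return len(validation_losses)


losses = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4, 0.3]
stop_epoch = epochs_until_stop(losses, patience=3)        # stops at epoch 6
no_early_stop = epochs_until_stop(losses, patience=20)    # runs all 7 epochs
```

Setting patience equal to the maximum number of epochs, as described above for a zero validation fraction, effectively disables early stopping.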

colibri.optax_optimizer.optimizer_provider(optimizer='adam', optimizer_settings={}) → GradientTransformationExtraArgs[source]

Define the optimizer.

Parameters:
  • optimizer (str, default = "adam") – Name of the optimizer to use.

  • optimizer_settings (dict, default = {}) – Dictionary containing the optimizer settings.

Returns:

Optax optimizer.

Return type:

optax._src.base.GradientTransformationExtraArgs

Config

colibri.config.py

Config module of colibri

Note: several functions are taken from validphys.config

class colibri.config.Environment(replica_index=None, trval_index=0, float32=False, *args, **kwargs)[source]

Bases: Environment

init_output()[source]
classmethod ns_dump_description()[source]
exception colibri.config.EnvironmentError_[source]

Bases: Exception

class colibri.config.colibriConfig(input_params, environment=None)[source]

Bases: Config

Config class inherits from validphys Config class

parse_analytic_settings(settings)[source]

For an analytic fit, parses the analytic_settings namespace from the runcard, and ensures the choice of settings is valid.

parse_closure_test_pdf(name)[source]

PDF set used to generate fakedata

parse_integrability_settings(settings)[source]

Parses the integrability settings defined in the runcard into an IntegrabilitySettings dataclass.

parse_ns_settings(settings, output_path)[source]

For a Nested Sampling fit, parses the ns_settings namespace from the runcard, and ensures the choice of settings is valid.

parse_positivity_penalty_settings(settings)[source]

Parses the positivity_penalty_settings namespace from the runcard, and ensures the choice of settings is valid.

parse_prior_settings(settings)[source]

Parses the prior_settings namespace from the runcard, into the core.PriorSettings dataclass.

produce_FIT_XGRID(data=None, posdatasets=None)[source]

Produces the xgrid for the fit from the union of all xgrids

Parameters:
  • data (validphys.core.DataGroupSpec) – The data object containing all datasets

  • posdatasets (validphys.core.PositivitySetSpec)

Returns:

FIT_XGRID – array from the set defined as the union of all xgrids

Return type:

np.array
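The sorted union of several xgrids can be obtained by concatenating them and removing duplicates. The NumPy sketch below shows the idea on two toy grids. The helper name and the grids are assumptions, and extracting the xgrids from the actual FK tables is not shown.

```python
import numpy as np


def sorted_union_of_xgrids(xgrids):
    """Sorted array of the unique x points across all grids (hypothetical helper)."""
    # np.unique both removes duplicates and sorts in ascending order
    return np.unique(np.concatenate(xgrids))


xgrid_a = np.array([1e-3, 1e-2, 0.1, 1.0])
xgrid_b = np.array([1e-4, 1e-2, 0.5, 1.0])

fit_xgrid = sorted_union_of_xgrids([xgrid_a, xgrid_b])
# -> [1e-4, 1e-3, 1e-2, 0.1, 0.5, 1.0]
```

Note that np.unique compares values exactly, so points that differ only by floating point rounding would be kept as separate entries.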

produce_commondata_tuple(closure_test_level=False)[source]

Produces a commondata tuple node in the reportengine dag according to some options

produce_data_generation_covariance_matrix(use_gen_t0: bool = False)[source]

Produces the covariance matrix used in:

  • level 1 closure test data construction (fluctuating around the level 0 data)

  • Monte Carlo pseudodata (fluctuating either around the level 0 data or level 1 data)

produce_fit_covariance_matrix(use_fit_t0: bool = True)[source]

Produces the covariance matrix used in the fit. This covariance matrix is used in:

  • commondata_utils.central_covmat_index

  • loss functions in mc_loss_functions.py

produce_flavour_indices(flavour_mapping=None)[source]

Produce flavour indices according to flavour_mapping.

Parameters:

flavour_mapping (list, default is None) – list of flavours names in the evolution basis (see e.g. validphys.convolution.FK_FLAVOURS). Specified by the user in the runcard.

produce_pdf_model()[source]

Returns None as the pdf_model is not used in the colibri module.

produce_vectorized(ns_settings)[source]

Returns True if the fit is vectorized, False otherwise. This is required for the predictions functions, which do not take ns_settings as an argument.

MC Loss Functions

colibri.mc_loss_functions.py

This module provides the functions necessary for the computation of the chi2 for a MC fit.

Date: 17.01.2024

colibri.mc_loss_functions.make_chi2_training_data(mc_pseudodata, fit_covariance_matrix)[source]

Returns a jax.jit compiled function that computes the chi2 of a pdf grid on a training data batch.

Notes

  • Does not include positivity constraint.

  • This function is designed for Monte Carlo like PDF fits.

Parameters:
  • mc_pseudodata (mc_utils.MCPseudodata) – dataclass containing Monte Carlo pseudodata.

  • fit_covariance_matrix (jnp.array) – covariance matrix of the fit (see config.produce_fit_covariance_matrix).

Returns:

function to compute chi2 of a pdf grid on a data batch.

Return type:

@jax.jit Callable
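As an illustration of what such a loss computes, here is a minimal sketch. It is not colibri's implementation: it uses NumPy instead of jax.jit, takes theory predictions directly rather than a PDF grid, and assumes the training points are selected with an index array. The names make_chi2_on_indices, central_values and indices are hypothetical.

```python
import numpy as np


def make_chi2_on_indices(central_values, covmat, indices):
    """Return a function computing the chi2 restricted to `indices` (hypothetical helper)."""
    idx = np.asarray(indices)
    y = central_values[idx]
    # Restrict the covariance matrix to the selected points, then invert it once.
    inv_cov = np.linalg.inv(covmat[np.ix_(idx, idx)])

    def chi2(predictions):
        diff = y - predictions[idx]
        return float(diff @ inv_cov @ diff)

    return chi2


central = np.array([1.0, 2.0, 3.0, 4.0])
cov = np.diag([0.25, 1.0, 4.0, 1.0])
chi2_train = make_chi2_on_indices(central, cov, [0, 2])
value = chi2_train(np.array([1.5, 2.0, 5.0, 4.0]))
```

Inverting the restricted covariance matrix outside the inner function means the inversion happens once, when the closure is built, rather than on every chi2 evaluation.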

colibri.mc_loss_functions.make_chi2_training_data_with_positivity(mc_pseudodata, mc_posdata_split, fit_covariance_matrix, _penalty_posdata)[source]

Returns a jax.jit compiled function that computes the chi2 of a pdf grid on a training data batch including positivity penalty.

Notes

  • This function is designed for Monte Carlo like PDF fits.

Parameters:
  • mc_pseudodata (mc_utils.MCPseudodata) – dataclass containing Monte Carlo pseudodata.

  • mc_posdata_split (training_validation.PosdataTrainValidationSplit) – dataclass containing the indices of the positivity data for the train and validation split.

  • fit_covariance_matrix (jnp.array) – covariance matrix of the fit (see config.produce_fit_covariance_matrix).

  • _penalty_posdata (theory_predictions._penalty_posdata) – colibri provider used to compute positivity penalty.

Returns:

function to compute chi2 of a pdf grid on a data batch.

Return type:

@jax.jit Callable

colibri.mc_loss_functions.make_chi2_validation_data(mc_pseudodata, fit_covariance_matrix)[source]

Returns a jax.jit compiled function that computes the chi2 of a pdf grid on validation data.

Notes

  • Does not include positivity constraint.

  • This function is designed for Monte Carlo like PDF fits.

Parameters:
  • mc_pseudodata (mc_utils.MCPseudodata) – dataclass containing Monte Carlo pseudodata.

  • fit_covariance_matrix (jnp.array) – covariance matrix of the fit (see config.produce_fit_covariance_matrix).

Returns:

function to compute chi2 of a pdf grid on validation data.

Return type:

@jax.jit Callable

colibri.mc_loss_functions.make_chi2_validation_data_with_positivity(mc_pseudodata, mc_posdata_split, fit_covariance_matrix, _penalty_posdata)[source]

Returns a jax.jit compiled function that computes the chi2 of a pdf grid on validation data including positivity penalty.

Notes

  • This function is designed for Monte Carlo like PDF fits.

Parameters:
  • mc_pseudodata (mc_utils.MCPseudodata) – dataclass containing Monte Carlo pseudodata.

  • mc_posdata_split (training_validation.PosdataTrainValidationSplit) – dataclass containing the indices of the positivity data for the train and validation split.

  • fit_covariance_matrix (jnp.array) – covariance matrix of the fit (see config.produce_fit_covariance_matrix).

  • _penalty_posdata (theory_predictions._penalty_posdata) – colibri provider used to compute positivity penalty.

Returns:

function to compute chi2 of a pdf grid on validation data.

Return type:

@jax.jit Callable

PDF Model

colibri.pdf_model.py

This module implements an abstract class PDFModel which is filled by the various models.

class colibri.pdf_model.PDFModel[source]

Bases: ABC

An abstract class describing the key features of a PDF model.

abstractmethod grid_values_func(xgrid: Array | ndarray | bool_ | number | bool | int | float | complex) Callable[[array], Array][source]

This function should produce a grid values function, which takes in the model parameters, and produces the PDF values on the grid xgrid. The grid values function should be a function of the parameters and return an array of shape (N_fl, Nx). The first dimension is the number of flavours expected by the FK tables belonging to the chosen theoryID. The second dimension is the number of points in the xgrid, i.e. Nx = len(xgrid).

Example

import jax.numpy as jnp

def grid_values_func(xgrid):
    def func(params):
        # Define expression for each flavour
        fl_1 = params[0] + params[1] * xgrid
        fl_2 = params[2] + params[3] * xgrid

        # Combine the flavours into a single array of shape (N_fl, Nx).
        # This is just an example, the actual implementation will depend on the model
        # and the number of flavours

        return jnp.array([fl_1, fl_2])
    return func
name = 'Abstract PDFModel'
abstract property param_names: list

This should return a list of names for the fitted parameters of the model. The order of the names is important as it will be assumed to be the order of the parameters fed to the model.

pred_and_pdf_func(xgrid: Array | ndarray | bool_ | number | bool | int | float | complex, forward_map: Callable[[Array, Array], Array]) Callable[[Array, Array], Tuple[Array, Array]][source]

Creates a function that returns a tuple of two arrays, given the model parameters and the fast kernel arrays as input.

The returned function produces:
  • The first array: 1D vector of theory predictions for the data.

  • The second array: PDF values evaluated on the x-grid, using self.grid_values_func, with shape (N_fl, Nx).

The forward_map is used to map the PDF values defined on the x-grid and the fast kernel arrays into the corresponding theory prediction vector.
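The composition described above can be sketched as follows. This is only an illustration under stated assumptions: ToyModel and toy_forward_map are hypothetical, NumPy stands in for JAX, and the argument order forward_map(pdf, fast_kernel_arrays) is assumed rather than taken from the colibri source.

```python
import numpy as np


class ToyModel:
    """Hypothetical two-flavour linear model, mirroring the example above."""

    def grid_values_func(self, xgrid):
        def func(params):
            fl_1 = params[0] + params[1] * xgrid
            fl_2 = params[2] + params[3] * xgrid
            return np.array([fl_1, fl_2])  # shape (N_fl, Nx)
        return func

    def pred_and_pdf_func(self, xgrid, forward_map):
        pdf_func = self.grid_values_func(xgrid)

        def pred_and_pdf(params, fast_kernel_arrays):
            pdf = pdf_func(params)
            predictions = forward_map(pdf, fast_kernel_arrays)
            return predictions, pdf

        return pred_and_pdf


def toy_forward_map(pdf, fk):
    # Assumed toy FK tensor of shape (Ndat, N_fl, Nx), contracted over flavour and x.
    return np.einsum("dfx,fx->d", fk, pdf)


xgrid = np.array([0.1, 0.5])
fk = np.ones((3, 2, 2))
func = ToyModel().pred_and_pdf_func(xgrid, toy_forward_map)
predictions, pdf = func(np.array([1.0, 0.0, 0.0, 2.0]), fk)
```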

Theory Predictions

colibri.theory_predictions.py

This module contains the functions necessary for the computation of theory predictions by means of fast-kernel (FK) tables.

colibri.theory_predictions.fast_kernel_arrays(data, FIT_XGRID, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Returns a tuple of tuples of jax.numpy arrays.

Parameters:
  • data (validphys.core.DataGroupSpec)

  • FIT_XGRID (np.ndarray)

  • flavour_indices (list, default is None) – if not None, the function returns FK arrays that allow the prediction to be computed for a subset of flavours. The list must contain the flavour indices. The indices correspond to the flavours in convolution.FK_FLAVOURS, e.g. [1, 2] -> ['Sigma', 'g']

  • fill_fk_xgrid_with_zeros (bool, default is False) – If True, then the missing xgrid points in the FK table will be filled with zeros. This is useful when the FK table is needed as tensor of shape (Ndat, Nfl, Nfk_xgrid) with Nfk_xgrid and Nfl fixed for all datasets.

Returns:

tuple of tuples of jax.numpy arrays

Return type:

tuple

colibri.theory_predictions.fktable_xgrid_indices(fktable, FIT_XGRID, fill_fk_xgrid_with_zeros=False)[source]

Given an FKTableData instance and the xgrid used in the fit, returns the indices of the xgrid of the FK table in the xgrid of the fit.

If fill_fk_xgrid_with_zeros is True, then all indices of the fit xgrid are returned. This is useful when the FK table is needed as tensor of shape (Ndat, Nfl, Nfk_xgrid) with Nfk_xgrid and Nfl fixed for all datasets.

Parameters:
  • fktable (validphys.coredata.FKTableData)

  • FIT_XGRID (jnp.ndarray) – array of xgrid points of the theory entering the fit

  • fill_fk_xgrid_with_zeros (bool, default is False)

Return type:

jnp.ndarray of indices
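The index lookup described above can be illustrated with a small sketch. It assumes the FK xgrid is passed as a plain array (the real function takes an FKTableData instance) and that points are matched within an absolute tolerance. The helper name xgrid_indices and the atol default are hypothetical.

```python
import numpy as np


def xgrid_indices(fk_xgrid, fit_xgrid, fill_with_zeros=False, atol=1e-8):
    """Positions of the FK xgrid points inside the fit xgrid (hypothetical helper)."""
    fit_xgrid = np.asarray(fit_xgrid)
    if fill_with_zeros:
        # Every fit xgrid point is kept; missing FK entries would be zero-filled.
        return np.arange(len(fit_xgrid))
    fk_xgrid = np.asarray(fk_xgrid)
    # mask[i, j] is True when FK point i matches fit point j within atol.
    mask = np.isclose(fk_xgrid[:, None], fit_xgrid[None, :], rtol=0.0, atol=atol)
    return np.where(mask)[1]


fit_xgrid = np.array([1e-3, 1e-2, 0.1, 0.5, 1.0])
fk_xgrid = np.array([1e-2, 0.5])
subset = xgrid_indices(fk_xgrid, fit_xgrid)
full = xgrid_indices(fk_xgrid, fit_xgrid, fill_with_zeros=True)
```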

colibri.theory_predictions.make_dis_prediction(fktable, FIT_XGRID, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Closure to compute the theory prediction for a DIS observable.

Parameters:
  • fktable (validphys.coredata.FKTableData) – The fktable should be a validphys.coredata.FKTableData instance and with cuts and masked flavours already applied.

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • flavour_indices (list, default is None)

  • fill_fk_xgrid_with_zeros (bool, default is False) – If True, then the missing xgrid points in the FK table will be filled with zeros. This is useful when the FK table is needed as tensor of shape (Ndat, Nfl, Nfk_xgrid) with Nfk_xgrid and Nfl fixed for all datasets.

Return type:

Callable

colibri.theory_predictions.make_had_prediction(fktable, FIT_XGRID, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Closure to compute the theory prediction for a Hadronic observable.

Parameters:
  • fktable (validphys.coredata.FKTableData)

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • flavour_indices (list, default is None)

  • fill_fk_xgrid_with_zeros (bool, default is False) – If True, then the missing xgrid points in the FK table will be filled with zeros. This is useful when the FK table is needed as tensor of shape (Ndat, Nfl, Nfk_xgrid) with Nfk_xgrid and Nfl fixed for all datasets.

Return type:

Callable

colibri.theory_predictions.make_pred_data(data, FIT_XGRID, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Compute theory prediction for entire DataGroupSpec

Parameters:
  • data (DataGroupSpec instance)

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • flavour_indices (list, default is None)

  • fill_fk_xgrid_with_zeros (bool, default is False)

Return type:

Callable

colibri.theory_predictions.make_pred_dataset(dataset, FIT_XGRID, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Compute theory prediction for a DataSetSpec

Parameters:
  • dataset (validphys.core.DataSetSpec)

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • flavour_indices (list, default is None)

  • fill_fk_xgrid_with_zeros (bool, default is False)

Return type:

Callable

colibri.theory_predictions.make_pred_t0data(data, FIT_XGRID, flavour_indices=None, fill_fk_xgrid_with_zeros=False)[source]

Compute theory prediction for entire DataGroupSpec. It is specifically meant for t0 predictions, i.e. it is similar to dataset_t0_predictions in validphys.covmats.

Parameters:
  • data (DataGroupSpec instance)

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • flavour_indices (list, default is None)

  • fill_fk_xgrid_with_zeros (bool, default is False)

Return type:

Callable

colibri.theory_predictions.pred_funcs_from_dataset(dataset, FIT_XGRID, flavour_indices, fill_fk_xgrid_with_zeros=False)[source]

Returns a list containing the forward maps associated with the fkspecs of a dataset.

Parameters:
  • dataset (validphys.core.DataGroupSpec)

  • FIT_XGRID (array)

  • flavour_indices (list, default is None)

  • fill_fk_xgrid_with_zeros (bool, default is False)

Return type:

list of Mappings

MC Utils

colibri.mc_utils.py

Module containing utils functions for the Monte Carlo fit.

class colibri.mc_utils.MCPseudodata(pseudodata: array, training_indices: array, validation_indices: array, trval_split: bool = False)[source]

Bases: object

pseudodata: array
to_dict()[source]
training_indices: array
trval_split: bool = False
validation_indices: array
colibri.mc_utils.len_trval_data(mc_pseudodata)[source]

Returns the number of training data points.

colibri.mc_utils.mc_pseudodata(pseudodata_central_covmat_index, replica_index, trval_seed, shuffle_indices=True, mc_validation_fraction=0.2)[source]

Produces Monte Carlo pseudodata for the replica with index replica_index. The pseudodata is returned together with sets of training and validation indices; the validation indices account for a fraction mc_validation_fraction of the data.

colibri.mc_utils.write_exportgrid_mc(parameters, pdf_model, replica_index, output_path, Q=1.65, xgrid=[1e-09, 1.29708482343957e-09, 1.68242903474257e-09, 2.18225315420583e-09, 2.83056741739819e-09, 3.67148597892941e-09, 4.76222862935315e-09, 6.1770142737618e-09, 8.01211109898438e-09, 1.03923870607245e-08, 1.34798064073805e-08, 1.74844503691778e-08, 2.26788118881103e-08, 2.94163370300835e-08, 3.81554746595878e-08, 4.94908707232129e-08, 6.41938295708371e-08, 8.32647951986859e-08, 1.08001422993829e-07, 1.4008687308113e-07, 1.81704331793772e-07, 2.35685551545377e-07, 3.05703512595323e-07, 3.96522309841747e-07, 5.1432125723657e-07, 6.67115245136676e-07, 8.65299922973143e-07, 1.12235875241487e-06, 1.45577995547683e-06, 1.88824560514613e-06, 2.44917352454946e-06, 3.17671650028717e-06, 4.12035415232797e-06, 5.3442526575209e-06, 6.93161897806315e-06, 8.99034258238145e-06, 1.16603030112258e-05, 1.51228312288769e-05, 1.96129529349212e-05, 2.54352207134502e-05, 3.29841683435992e-05, 4.27707053972016e-05, 5.54561248105849e-05, 7.18958313632514e-05, 9.31954227979614e-05, 0.00012078236773133, 0.000156497209466554, 0.000202708936328495, 0.000262459799331951, 0.000339645244168985, 0.000439234443000422, 0.000567535660104533, 0.000732507615725537, 0.000944112105452451, 0.00121469317686978, 0.00155935306118224, 0.00199627451141338, 0.00254691493736552, 0.00323597510213126, 0.00409103436509565, 0.00514175977083962, 0.00641865096062317, 0.00795137940306351, 0.009766899996241, 0.0118876139251364, 0.0143298947643919, 0.0171032279460271, 0.0202100733925079, 0.0236463971369542, 0.0274026915728357, 0.0314652506132444, 0.0358174829282429, 0.0404411060163317, 0.0453171343973807, 0.0504266347950069, 0.0557512610084339, 0.0612736019390519, 0.0669773829498255, 0.0728475589986517, 0.0788703322292727, 0.0850331197801452, 0.0913244910278679, 0.0977340879783772, 0.104252538208639, 0.110871366547237, 0.117582909372878, 0.124380233801599, 0.131257062945031, 0.138207707707289, 0.145227005135651, 
0.152310263065985, 0.159453210652156, 0.166651954293987, 0.173902938455578, 0.181202910873333, 0.188548891679097, 0.195938145999193, 0.203368159629765, 0.210836617429103, 0.218341384106561, 0.225880487124065, 0.233452101459503, 0.241054536011681, 0.248686221452762, 0.256345699358723, 0.264031612468684, 0.271742695942783, 0.279477769504149, 0.287235730364833, 0.295015546847664, 0.302816252626866, 0.310636941519503, 0.318476762768082, 0.326334916761672, 0.334210651149156, 0.342103257303627, 0.350012067101685, 0.357936449985571, 0.365875810279643, 0.373829584735962, 0.381797240286494, 0.389778271981947, 0.397772201099286, 0.40577857340234, 0.413796957540671, 0.421826943574548, 0.429868141614175, 0.437920180563205, 0.44598270695699, 0.454055383887562, 0.462137890007651, 0.470229918607142, 0.478331176755675, 0.486441384506059, 0.494560274153348, 0.502687589545177, 0.510823085439086, 0.518966526903235, 0.527117688756998, 0.535276355048428, 0.543442318565661, 0.551615380379768, 0.559795349416641, 0.5679820420558, 0.576175281754088, 0.584374898692498, 0.59258072944444, 0.60079261666395, 0.609010408792398, 0.61723395978245, 0.625463128838069, 0.633697780169485, 0.641937782762089, 0.650183010158361, 0.658433340251944, 0.666688655093089, 0.674948840704708, 0.683213786908386, 0.691483387159697, 0.699757538392251, 0.708036140869916, 0.716319098046733, 0.724606316434025, 0.732897705474271, 0.741193177421404, 0.749492647227008, 0.757796032432224, 0.766103253064927, 0.774414231541921, 0.782728892575836, 0.791047163086478, 0.799368972116378, 0.807694250750291, 0.816022932038457, 0.824354950923382, 0.832690244169987, 0.841028750298844, 0.8493704095226, 0.857715163684985, 0.866062956202683, 0.874413732009721, 0.882767437504206, 0.891124020497459, 0.899483430165226, 0.907845617001021, 0.916210532771399, 0.924578130473112, 0.932948364292029, 0.941321189563734, 0.949696562735755, 0.958074441331298, 0.966454783914439, 0.974837550056705, 0.983222700304978, 0.991610196150662, 1.0], 
export_labels=['TBAR', 'BBAR', 'CBAR', 'SBAR', 'UBAR', 'DBAR', 'GLUON', 'D', 'U', 'S', 'C', 'B', 'T', 'PHT'])[source]

Similar to colibri.export_results.write_replicas but for a Monte Carlo fit. The main difference is that the replicas are written to a fit_replicas folder which is then used by the postfit script to select valid replicas.

Training and Validation

colibri.training_validation.py

Module containing training validation dataclasses for MC fits.

Date: 11.11.2023

class colibri.training_validation.PosdataTrainValidationSplit(training: array, validation: array, n_training: int, n_validation: int)[source]

Bases: TrainValidationSplit

n_training: int
n_validation: int
class colibri.training_validation.TrainValidationSplit(training: array, validation: array)[source]

Bases: object

to_dict()[source]
training: array
validation: array
colibri.training_validation.mc_posdata_split(posdatasets, trval_seed, mc_validation_fraction=0.2, shuffle_indices=True)[source]

Function for positivity training validation split.

Note: the random split is done using the same seed as the one used for the data tr/val split.

Parameters:
  • posdatasets (list) – list of positivity datasets, see also validphys.config.parse_posdataset.

  • trval_seed (jax.random.PRNGKey) – utils.trval_seed, colibri provider.

  • mc_validation_fraction (float, default is 0.2)

  • shuffle_indices (bool, default is True)

Returns:

dataclass

Return type:

PosdataTrainValidationSplit

colibri.training_validation.training_validation_split(indices, mc_validation_fraction, random_seed, shuffle_indices=True)[source]

Performs training validation split on an array.

Parameters:
  • indices (jaxlib.Array)

  • mc_validation_fraction (float)

  • random_seed (jaxlib.Array) – PRNGKey, obtained as jax.random.PRNGKey(random_number)

  • shuffle_indices (bool)

Return type:

dataclass
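A training/validation split of this kind can be sketched as below. This is an illustration only: it uses a NumPy random generator in place of a jax.random.PRNGKey, returns a plain tuple instead of a dataclass, and the rounding of the validation size is an assumption. The name split_indices is hypothetical.

```python
import numpy as np


def split_indices(indices, validation_fraction, seed, shuffle=True):
    """Split `indices` into training and validation parts (hypothetical helper)."""
    indices = np.asarray(indices)
    if shuffle:
        rng = np.random.default_rng(seed)
        indices = rng.permutation(indices)
    # Assumed rounding rule for the number of validation points.
    n_validation = int(round(validation_fraction * len(indices)))
    n_training = len(indices) - n_validation
    return indices[:n_training], indices[n_training:]


training, validation = split_indices(np.arange(10), 0.2, seed=0)
```

Reusing the same seed for the data and positivity splits, as the note on mc_posdata_split describes, makes both splits reproducible from a single seed.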

colibri.training_validation.trval_seed(trval_index)[source]

Returns a PRNGKey given the trval_index seed.

Utils

colibri.utils.py

Module containing several utils for PDF fits.

colibri.utils.cast_to_numpy(func)[source]
colibri.utils.closest_indices(a, v, atol=1e-08)[source]

Finds the indices of values in a that are closest to the given value(s) v.

Unlike np.searchsorted, this function identifies indices where the values in v are approximately equal to those in a within the specified tolerance. The main difference is that np.searchsorted returns the index where each element of v should be inserted in a in order to preserve the order (see example below).

Parameters:
  • a (array-like)

  • v (array-like or float)

  • atol (float, default is 1e-8) – absolute tolerance used to find closest indices.

Return type:

array-like

Examples

>>> a = np.array([1, 2, 3])
>>> v = np.array([1.1, 3.0])
>>> closest_indices(a, v, atol=0.1)
array([0, 2])
>>> np.searchsorted(a, v)
array([1, 2])
colibri.utils.compute_determinants_of_principal_minors(C)[source]

Computes the determinants of the principal minors of a symmetric, positive semi-definite matrix C.

Parameters:

C (np.ndarray)

Returns:

A list of determinants of the principal minors from C_n down to C_0

Return type:

List[float]
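For orientation, one possible reading of this function is sketched below. It assumes that "principal minors" means the leading principal submatrices C[:k, :k], and that C_0 denotes the empty block with determinant 1 by convention. Neither assumption is confirmed by the documentation above, and the function name is hypothetical.

```python
import numpy as np


def leading_principal_minor_determinants(C):
    """Determinants of C[:k, :k] for k = n, n-1, ..., 0 (hypothetical helper)."""
    C = np.asarray(C, dtype=float)
    n = C.shape[0]
    dets = []
    for k in range(n, -1, -1):
        if k == 0:
            # Assumed convention: the determinant of the empty 0x0 block is 1.
            dets.append(1.0)
        else:
            dets.append(float(np.linalg.det(C[:k, :k])))
    return dets


C = np.array([[2.0, 1.0], [1.0, 2.0]])
dets = leading_principal_minor_determinants(C)
```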

colibri.utils.full_posterior_sample_fit_resampler(fit_path: Path, n_replicas: int, resampling_seed: int)[source]

Wrapper for resampling from a fit that stores its posterior samples in a full_posterior_sample.csv-like file in the root of the fit folder.

colibri.utils.get_fit_path(fit)[source]
colibri.utils.get_full_posterior(colibri_fit)[source]

Given a colibri fit, returns the pandas dataframe with the results of the fit at the parameterisation scale.

Parameters:

colibri_fit (str) – The name of the fit to read.

Return type:

pandas dataframe

colibri.utils.get_pdf_model(colibri_fit)[source]

Given a colibri fit, returns the PDF model.

Parameters:

colibri_fit (str) – The name of the fit to read.

Return type:

PDFModel

colibri.utils.likelihood_float_type(_pred_data, pdf_model, FIT_XGRID, bayesian_prior, output_path, central_inv_covmat_index, fast_kernel_arrays)[source]

Writes the dtype of the likelihood function to a file. Mainly used for testing purposes.

colibri.utils.mask_fktable_array(fktable, flavour_indices=None)[source]

Takes an FKTableData instance and returns an FK table array with masked flavours.

Parameters:
  • fktable (validphys.coredata.FKTableData)

  • flavour_indices (list, default is None) – The indices of the flavours to keep. If None, returns the original FKTableData.get_np_fktable() array with no masking.

Returns:

The FK table array with masked flavours.

Return type:

jnp.array

colibri.utils.mask_luminosity_mapping(fktable, flavour_indices=None)[source]

Takes an FKTableData instance and returns its luminosity mapping with masked flavours.

Parameters:
  • fktable (validphys.coredata.FKTableData)

  • flavour_indices (list, default is None) – The indices of the flavours to keep. If None, returns the original FKTableData.luminosity_mapping with no masking.

Returns:

The luminosity mapping with masked flavours.

Return type:

jnp.array

colibri.utils.pdf_model_from_colibri_model(model_settings)[source]

Produce a PDF model from a colibri model.

Parameters:

model_settings (dict) – The settings to produce the PDF model.

Return type:

PDFModel

colibri.utils.pdf_models_equal(pdf_model_1, pdf_model_2)[source]

Checks if two pdf models are equal.

Parameters:
  • pdf_model_1 (PDFModel)

  • pdf_model_2 (PDFModel)

Return type:

bool

colibri.utils.resample_from_ns_posterior(samples, n_posterior_samples=1000, posterior_resampling_seed=123456)[source]

Resamples a subset of data points from a given set of samples without replacement.

Parameters:
  • samples (jnp.ndarray) – The input dataset to be resampled.

  • n_posterior_samples (int, default is 1000) – The number of samples to draw from the input dataset.

  • posterior_resampling_seed (int, default is 123456) – The random seed to ensure reproducibility of the resampling process.

Returns:

resampled_samples – The resampled subset of the input dataset, containing n_posterior_samples samples selected without replacement.

Return type:

jax.Array
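Resampling without replacement can be sketched as follows. This is a minimal illustration that uses NumPy's random generator rather than JAX, so the selected rows will not match those produced by colibri for the same seed. The function name is hypothetical.

```python
import numpy as np


def resample_without_replacement(samples, n_posterior_samples=1000, seed=123456):
    """Draw distinct rows of `samples` (hypothetical helper)."""
    samples = np.asarray(samples)
    rng = np.random.default_rng(seed)
    # replace=False guarantees that no row is selected twice.
    chosen = rng.choice(len(samples), size=n_posterior_samples, replace=False)
    return samples[chosen]


samples = np.arange(20).reshape(10, 2)
resampled = resample_without_replacement(samples, n_posterior_samples=4)
```

Because the draw is without replacement, n_posterior_samples cannot exceed the number of available samples; in this sketch NumPy raises a ValueError in that case.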

colibri.utils.resample_posterior_from_file(fit_path: Path, file_path: str, n_replicas: int, resampling_seed: int, use_all_columns: bool = False, read_csv_args: dict = None)[source]

Generic function to resample from a posterior using a specified file path.

Parameters:
  • fit_path (pathlib.Path) – The path to the fit folder.

  • file_path (str) – The name of the file containing the posterior samples inside the fit folder.

  • n_replicas (int) – The number of posterior samples to resample from the file.

  • resampling_seed (int) – The random seed to use for resampling.

  • use_all_columns (bool, default is False) – If True, all columns of the file are used. If False, the first column is ignored.

  • read_csv_args (dict, default is None) – Additional arguments to pass to pd.read_csv when loading the file.

Returns:

resampled_posterior – The resampled posterior samples.

Return type:

np.ndarray

colibri.utils.t0_pdf_grid(t0pdfset, FIT_XGRID, Q0=1.65)[source]

Computes the t0 pdf grid in the evolution basis.

Parameters:
  • t0pdfset (validphys.core.PDF)

  • FIT_XGRID (np.ndarray) – xgrid of the theory, computed by a production rule by taking the sorted union of the xgrids of the datasets entering the fit.

  • Q0 (float, default is 1.65)

Returns:

t0grid – t0 grid of shape (N_rep, N_fl, N_x)

Return type:

jnp.array

colibri.utils.write_resampled_bayesian_fit(resampled_posterior: ndarray, fit_path: Path, resampled_fit_path: Path, resampled_fit_name: str | Path, parametrisation_scale: float, csv_results_name: str)[source]

Writes the resampled ns fit to resampled_fit_path.

Parameters:
  • resampled_posterior (np.ndarray) – The resampled posterior.

  • fit_path (pathlib.Path) – The path to the original fit.

  • resampled_fit_path (pathlib.Path) – The path to the resampled fit.

  • resampled_fit_name (Union[str, pathlib.Path]) – The name of the resampled fit.

  • parametrisation_scale (float)

  • csv_results_name (str) – The name of the csv file to store the resampled posterior.

Scripts