Implementing a Model in Colibri

In this section, we will present how to implement a model in Colibri, which can then be used to run fits, closure tests, and other applications (see the tutorials).

In general, a Colibri model is contained in a directory with the following structure:

model_to_implement/
├── pyproject.toml        # Defines a python package for the project and sets up executable
├── model_to_implement/
   ├── app.py             # Enables the use of reportengine and validphys
   ├── config.py          # Defines the configuration layer for the model
   ├── model.py           # Script where the model is defined
└── runcards/             # Directory containing any runcards

The best way to understand how to implement a model is to go through an example, so let’s have a look at how the Les Houches parametrisation (presented here) is built.

Implementing the Les Houches model in Colibri

In the colibri/examples/ directory, you will find a directory called les_houches_example, which follows the structure defined above. We will have a look at them one by one.

pyproject.toml

The pyproject.toml file defines the Python package configuration for this model using Poetry as the dependency management and packaging tool. The configuration file structure looks like this:

[build-system]
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning>=1.1.0"]
build-backend = "poetry_dynamic_versioning.backend"

[tool.poetry]
name = "les_houches_example"
version = "1.0.0"
authors = ["PBSP collaboration"]
description = "Les Houches Parametrisation Example"

[tool.poetry.dependencies]


[tool.poetry.extras]
test = [
    "pytest",
    "hypothesis",
]
doc = [
    "sphinx",
    "recommonmark",
    "sphinx_rtd_theme"
]

[tool.poetry.scripts]
les_houches_exe = "les_houches_example.app:main"

Note that here the executable les_houches_exe is introduced, which is an executable that is specific to this model, and will be used to initialise a fit. (See Running Fits).

app.py

The app.py module defines the core application class for the Les Houches model:

"""
les_houches_example.app.py

"""

from colibri.app import colibriApp
from les_houches_example.config import LesHouchesConfig


lh_pdf_providers = [
    "les_houches_example.model",
]


class LesHouchesApp(colibriApp):
    config_class = LesHouchesConfig


def main():
    a = LesHouchesApp(name="les_houches", providers=lh_pdf_providers)
    a.main()


if __name__ == "__main__":
    main()

The LesHouchesApp class enables the Les Houches model to function as a reportengine App. This integration provides a structured framework for data processing and report generation.

Key Features:

  • Provider System: The LesHouchesApp accepts a list of providers (lh_pdf_providers) containing modules that are recognized by the application framework.

  • Inheritance Hierarchy: The LesHouchesApp is a subclass of colibriApp, which means it automatically inherits all providers from both colibri and validphys, giving access to their full functionality without additional configuration.

config.py

The config.py script defines the configuration layer for the Les Houches model. It extends Colibri’s configuration system to provide a custom model builder and environment.

"""
les_houches_example.config.py

"""

import dill
import logging
from les_houches_example.model import LesHouchesPDF

from colibri.config import Environment, colibriConfig


log = logging.getLogger(__name__)


class LesHouchesEnvironment(Environment):
    pass


class LesHouchesConfig(colibriConfig):
    """
    LesHouchesConfig class Inherits from colibri.config.colibriConfig
    """

    def produce_pdf_model(self, output_path, dump_model=True):
        """
        Produce the Les Houches model.
        """
        model = LesHouchesPDF()
        # dump model to output_path using dill
        # this is mainly needed by scripts/bayesian_resampler.py
        if dump_model:
            with open(output_path / "pdf_model.pkl", "wb") as file:
                dill.dump(model, file)
        return model

The produce_pdf_model method creates an instance of the LesHouchesPDF model. Therefore, every model should have this production rule.

If dump_model is set to True, the method serialises the model using dill and writes it to pdf_model.pkl in the output_path, where output_path will be the output directory created when running a colibri fit (see Colibri fit folders). pdf_model.pkl will be loaded by scripts/bayesian_resampler.py for resampling.

If dump_model is set to False, the serialised model will not be written to the disk.

model.py

The model.py script defines the Les Houches parametrisation model. It looks like this:

"""
les_houches_example.model.py
"""

import jax.numpy as jnp
import jax.scipy.special as jsp
from colibri.pdf_model import PDFModel


class LesHouchesPDF(PDFModel):
    """
    A PDFModel implementation for the Les Houches parametrisation.
    """

    @property
    def param_names(self):
        """The fitted parameters of the model."""
        return [
            "alpha_gluon",
            "beta_gluon",
            "alpha_up",
            "beta_up",
            "epsilon_up",
            "gamma_up",
            "alpha_down",
            "beta_down",
            "epsilon_down",
            "gamma_down",
            "norm_sigma",
            "alpha_sigma",
            "beta_sigma",
        ]

    @property
    def n_parameters(self):
        """The number of parameters of the model."""
        return len(self.param_names)

    def _pdf_gluon(
        self, x, alpha_gluon, beta_gluon, norm_sigma, alpha_sigma, beta_sigma
    ):
        """Computes normalisation factor A_g in terms of free parameters and computes the gluon PDF."""
        A_g = (
            jsp.gamma(alpha_gluon + beta_gluon + 2)
            / (jsp.gamma(alpha_gluon + 1) * jsp.gamma(beta_gluon + 1))
        ) * (
            1
            - norm_sigma
            * (jsp.gamma(alpha_sigma + 1) * jsp.gamma(beta_sigma + 1))
            / jsp.gamma(alpha_sigma + beta_sigma + 2)
        )
        return A_g * x**alpha_gluon * (1 - x) ** beta_gluon

    def _pdf_sigma(self, x, norm_sigma, alpha_sigma, beta_sigma):
        """Compute the Sigma pdf."""
        return norm_sigma * x**alpha_sigma * (1 - x) ** beta_sigma

    def _A_uv(self, alpha_up, beta_up, epsilon_up, gamma_up):
        """Compute the normalization factor A_{u_v} in terms of free parameters."""
        return (2 / jsp.gamma(beta_up + 1)) / (
            (jsp.gamma(alpha_up) / jsp.gamma(alpha_up + beta_up + 1))
            + epsilon_up
            * (jsp.gamma(alpha_up + 0.5) / jsp.gamma(alpha_up + beta_up + 1.5))
            + gamma_up * (jsp.gamma(alpha_up + 1) / jsp.gamma(alpha_up + beta_up + 2))
        )

    def _A_dv(self, alpha_down, beta_down, epsilon_down, gamma_down):
        """Compute the normalization factor A_{d_v} in terms of free parameters."""
        return (1 / jsp.gamma(beta_down + 1)) / (
            (jsp.gamma(alpha_down) / jsp.gamma(alpha_down + beta_down + 1))
            + epsilon_down
            * (jsp.gamma(alpha_down + 0.5) / jsp.gamma(alpha_down + beta_down + 1.5))
            + gamma_down
            * (jsp.gamma(alpha_down + 1) / jsp.gamma(alpha_down + beta_down + 2))
        )

    def _pdf_valence(
        self,
        x,
        alpha_up,
        beta_up,
        epsilon_up,
        gamma_up,
        alpha_down,
        beta_down,
        epsilon_down,
        gamma_down,
    ):
        """ """
        A_uv = self._A_uv(alpha_up, beta_up, epsilon_up, gamma_up)
        A_dv = self._A_dv(alpha_down, beta_down, epsilon_down, gamma_down)
        up_valence = (
            x**alpha_up
            * (1 - x) ** beta_up
            * (1 + epsilon_up * jnp.sqrt(x) + gamma_up * x)
        )
        down_valence = (
            x**alpha_down
            * (1 - x) ** beta_down
            * (1 + epsilon_down * jnp.sqrt(x) + gamma_down * x)
        )
        return A_uv * up_valence + A_dv * down_valence

    def _pdf_valence3(
        self,
        x,
        alpha_up,
        beta_up,
        epsilon_up,
        gamma_up,
        alpha_down,
        beta_down,
        epsilon_down,
        gamma_down,
    ):
        """ """
        A_uv = self._A_uv(alpha_up, beta_up, epsilon_up, gamma_up)
        A_dv = self._A_dv(alpha_down, beta_down, epsilon_down, gamma_down)
        up_valence = (
            x**alpha_up
            * (1 - x) ** beta_up
            * (1 + epsilon_up * jnp.sqrt(x) + gamma_up * x)
        )
        down_valence = (
            x**alpha_down
            * (1 - x) ** beta_down
            * (1 + epsilon_down * jnp.sqrt(x) + gamma_down * x)
        )
        return A_uv * up_valence - A_dv * down_valence

    def grid_values_func(self, xgrid):
        """This function should produce a grid values function, which takes
        in the model parameters, and produces the PDF values on the grid xgrid.
        """

        xgrid = jnp.array(xgrid)

        def pdf_func(params):
            """ """
            alpha_gluon = params[0]
            beta_gluon = params[1]
            alpha_up = params[2]
            beta_up = params[3]
            epsilon_up = params[4]
            gamma_up = params[5]
            alpha_down = params[6]
            beta_down = params[7]
            epsilon_down = params[8]
            gamma_down = params[9]
            norm_sigma = params[10]
            alpha_sigma = params[11]
            beta_sigma = params[12]
            pdf_grid = []

            # Compute the PDFs for each flavour
            gluon_pdf = self._pdf_gluon(
                xgrid, alpha_gluon, beta_gluon, norm_sigma, alpha_sigma, beta_sigma
            )
            sigma_pdf = self._pdf_sigma(xgrid, norm_sigma, alpha_sigma, beta_sigma)
            valence_pdf = self._pdf_valence(
                xgrid,
                alpha_up,
                beta_up,
                epsilon_up,
                gamma_up,
                alpha_down,
                beta_down,
                epsilon_down,
                gamma_down,
            )
            valence3_pdf = self._pdf_valence3(
                xgrid,
                alpha_up,
                beta_up,
                epsilon_up,
                gamma_up,
                alpha_down,
                beta_down,
                epsilon_down,
                gamma_down,
            )
            t8_pdf = 0.4 * sigma_pdf  # T8 is a scaled version of Sigma

            # Build the PDF grid
            pdf_grid = jnp.array(
                [
                    jnp.zeros_like(xgrid),  # Photon
                    sigma_pdf,  # Σ
                    gluon_pdf,  # g
                    valence_pdf,  # V
                    valence3_pdf,  # V3
                    valence_pdf,  # V8 = V
                    valence_pdf,  # V15 = V
                    valence_pdf,  # V24 = V
                    valence_pdf,  # V35 = V
                    jnp.zeros_like(xgrid),  # T3 = 0
                    t8_pdf,  # T8
                    sigma_pdf,  # T15 = Σ
                    sigma_pdf,  # T24 = Σ
                    sigma_pdf,  # T35 = Σ
                ]
            )
            return pdf_grid

        return pdf_func

The LesHouchesPDF class completes the abstract methods of the PDFModel class, which you can read more about here. This allows for the definition of a specific model in a way that can be used in the colibri code. The LesHouchesPDF class does the following:

  • takes a list of flavours to be fitted (param_names),

  • defines the PDF for each flavour,

  • computes grid values.

Having defined this model, it is used in the production rule produce_pdf_model, defined in the config.py script, shown above. This allows the model to be seen by the rest of the code, so that it can be used to run a fit and perform closure tests.

You can find an example of how to execute the model here.