Implementing a Model in Colibri

In this section, we will present how to implement a model in Colibri, which can then be used to run fits, closure tests, and other applications (see the tutorials).

In general, a Colibri model is contained in a directory with the following structure:

model_to_implement/
├── pyproject.toml        # Defines a python package for the project and sets up executable
├── model_to_implement/
   ├── app.py             # Enables the use of reportengine and validphys
   ├── config.py          # Defines the configuration layer for the model
   ├── model.py           # Script where the model is defined
└── runcards/             # Directory containing any runcards

The best way to understand how to implement a model is to go through an example, so let’s have a look at how the Les Houches parametrisation (presented here) is built.

Implementing the Les Houches model in Colibri

In the colibri/examples/ directory, you will find a directory called les_houches_example, which follows the structure defined above. We will have a look at them one by one.

`pyproject.toml`

The pyproject.toml file defines the Python package configuration for this model using Poetry as the dependency management and packaging tool. The configuration file structure looks like this:

[build-system]
requires = ["poetry-core>=1.0.0", "poetry-dynamic-versioning>=1.1.0"]
build-backend = "poetry_dynamic_versioning.backend"

[tool.poetry]
name = "les_houches_example"
version = "1.0.0"
authors = ["PBSP collaboration"]
description = "Les Houches Parametrisation Example"

[tool.poetry.dependencies]


[tool.poetry.extras]
test = [
    "pytest",
    "hypothesis",
]
doc = [
    "sphinx",
    "recommonmark",
    "sphinx_rtd_theme"
]

[tool.poetry.scripts]
les_houches_exe = "les_houches_example.app:main"

Note that here the executable les_houches_exe is introduced, which is an executable that is specific to this model, and will be used to initialise a fit. (See Running Fits).

`app.py`

The app.py module defines the core application class for the Les Houches model:

"""
les_houches_example.app.py

"""

from colibri.app import colibriApp
from les_houches_example.config import LesHouchesConfig


lh_pdf_providers = [
    "les_houches_example.model",
]


class LesHouchesApp(colibriApp):
    config_class = LesHouchesConfig


def main():
    a = LesHouchesApp(name="les_houches", providers=lh_pdf_providers)
    a.main()


if __name__ == "__main__":
    main()

The LesHouchesApp class enables the Les Houches model to function as a reportengine App. This integration provides a structured framework for data processing and report generation.

Key Features:

Provider System: The LesHouchesApp accepts a list of providers (lh_pdf_providers) containing modules that are recognized by the application framework.
Inheritance Hierarchy: The LesHouchesApp is a subclass of colibriApp, which means it automatically inherits all providers from both colibri and validphys, giving access to their full functionality without additional configuration.

`config.py`

The config.py script defines the configuration layer for the Les Houches model. It extends Colibri’s configuration system to provide a custom model builder and environment.

"""
les_houches_example.config.py

"""

import dill
import logging
from les_houches_example.model import LesHouchesPDF

from colibri.config import Environment, colibriConfig


log = logging.getLogger(__name__)


class LesHouchesEnvironment(Environment):
    pass


class LesHouchesConfig(colibriConfig):
    """
    LesHouchesConfig class Inherits from colibri.config.colibriConfig
    """

    def produce_pdf_model(self, output_path, dump_model=True):
        """
        Produce the Les Houches model.
        """
        model = LesHouchesPDF()
        # dump model to output_path using dill
        # this is mainly needed by scripts/bayesian_resampler.py
        if dump_model:
            with open(output_path / "pdf_model.pkl", "wb") as file:
                dill.dump(model, file)
        return model

The produce_pdf_model method creates an instance of the LesHouchesPDF model. Therefore, every model should have this production rule.

If dump_model is set to True, the method serialises the model using dill and writes it to pdf_model.pkl in the output_path, where output_path will be the output directory created when running a colibri fit (see Colibri fit folders). pdf_model.pkl will be loaded by scripts/bayesian_resampler.py for resampling.

If dump_model is set to False, the serialised model will not be written to the disk.

`model.py`

The model.py script defines the Les Houches parametrisation model. It looks like this:

"""
les_houches_example.model.py
"""

import jax.numpy as jnp
import jax.scipy.special as jsp
from colibri.pdf_model import PDFModel


class LesHouchesPDF(PDFModel):
    """
    A PDFModel implementation for the Les Houches parametrisation.
    """

    @property
    def param_names(self):
        """The fitted parameters of the model."""
        return [
            "alpha_gluon",
            "beta_gluon",
            "alpha_up",
            "beta_up",
            "epsilon_up",
            "gamma_up",
            "alpha_down",
            "beta_down",
            "epsilon_down",
            "gamma_down",
            "norm_sigma",
            "alpha_sigma",
            "beta_sigma",
        ]

    @property
    def n_parameters(self):
        """The number of parameters of the model."""
        return len(self.param_names)

    def _pdf_gluon(
        self, x, alpha_gluon, beta_gluon, norm_sigma, alpha_sigma, beta_sigma
    ):
        """Computes normalisation factor A_g in terms of free parameters and computes the gluon PDF."""
        A_g = (
            jsp.gamma(alpha_gluon + beta_gluon + 2)
            / (jsp.gamma(alpha_gluon + 1) * jsp.gamma(beta_gluon + 1))
        ) * (
            1
            - norm_sigma
            * (jsp.gamma(alpha_sigma + 1) * jsp.gamma(beta_sigma + 1))
            / jsp.gamma(alpha_sigma + beta_sigma + 2)
        )
        return A_g * x**alpha_gluon * (1 - x) ** beta_gluon

    def _pdf_sigma(self, x, norm_sigma, alpha_sigma, beta_sigma):
        """Compute the Sigma pdf."""
        return norm_sigma * x**alpha_sigma * (1 - x) ** beta_sigma

    def _A_uv(self, alpha_up, beta_up, epsilon_up, gamma_up):
        """Compute the normalization factor A_{u_v} in terms of free parameters."""
        return (2 / jsp.gamma(beta_up + 1)) / (
            (jsp.gamma(alpha_up) / jsp.gamma(alpha_up + beta_up + 1))
            + epsilon_up
            * (jsp.gamma(alpha_up + 0.5) / jsp.gamma(alpha_up + beta_up + 1.5))
            + gamma_up * (jsp.gamma(alpha_up + 1) / jsp.gamma(alpha_up + beta_up + 2))
        )

    def _A_dv(self, alpha_down, beta_down, epsilon_down, gamma_down):
        """Compute the normalization factor A_{d_v} in terms of free parameters."""
        return (1 / jsp.gamma(beta_down + 1)) / (
            (jsp.gamma(alpha_down) / jsp.gamma(alpha_down + beta_down + 1))
            + epsilon_down
            * (jsp.gamma(alpha_down + 0.5) / jsp.gamma(alpha_down + beta_down + 1.5))
            + gamma_down
            * (jsp.gamma(alpha_down + 1) / jsp.gamma(alpha_down + beta_down + 2))
        )

    def _pdf_valence(
        self,
        x,
        alpha_up,
        beta_up,
        epsilon_up,
        gamma_up,
        alpha_down,
        beta_down,
        epsilon_down,
        gamma_down,
    ):
        """ """
        A_uv = self._A_uv(alpha_up, beta_up, epsilon_up, gamma_up)
        A_dv = self._A_dv(alpha_down, beta_down, epsilon_down, gamma_down)
        up_valence = (
            x**alpha_up
            * (1 - x) ** beta_up
            * (1 + epsilon_up * jnp.sqrt(x) + gamma_up * x)
        )
        down_valence = (
            x**alpha_down
            * (1 - x) ** beta_down
            * (1 + epsilon_down * jnp.sqrt(x) + gamma_down * x)
        )
        return A_uv * up_valence + A_dv * down_valence

    def _pdf_valence3(
        self,
        x,
        alpha_up,
        beta_up,
        epsilon_up,
        gamma_up,
        alpha_down,
        beta_down,
        epsilon_down,
        gamma_down,
    ):
        """ """
        A_uv = self._A_uv(alpha_up, beta_up, epsilon_up, gamma_up)
        A_dv = self._A_dv(alpha_down, beta_down, epsilon_down, gamma_down)
        up_valence = (
            x**alpha_up
            * (1 - x) ** beta_up
            * (1 + epsilon_up * jnp.sqrt(x) + gamma_up * x)
        )
        down_valence = (
            x**alpha_down
            * (1 - x) ** beta_down
            * (1 + epsilon_down * jnp.sqrt(x) + gamma_down * x)
        )
        return A_uv * up_valence - A_dv * down_valence

    def grid_values_func(self, xgrid):
        """This function should produce a grid values function, which takes
        in the model parameters, and produces the PDF values on the grid xgrid.
        """

        xgrid = jnp.array(xgrid)

        def pdf_func(params):
            """ """
            alpha_gluon = params[0]
            beta_gluon = params[1]
            alpha_up = params[2]
            beta_up = params[3]
            epsilon_up = params[4]
            gamma_up = params[5]
            alpha_down = params[6]
            beta_down = params[7]
            epsilon_down = params[8]
            gamma_down = params[9]
            norm_sigma = params[10]
            alpha_sigma = params[11]
            beta_sigma = params[12]
            pdf_grid = []

            # Compute the PDFs for each flavour
            gluon_pdf = self._pdf_gluon(
                xgrid, alpha_gluon, beta_gluon, norm_sigma, alpha_sigma, beta_sigma
            )
            sigma_pdf = self._pdf_sigma(xgrid, norm_sigma, alpha_sigma, beta_sigma)
            valence_pdf = self._pdf_valence(
                xgrid,
                alpha_up,
                beta_up,
                epsilon_up,
                gamma_up,
                alpha_down,
                beta_down,
                epsilon_down,
                gamma_down,
            )
            valence3_pdf = self._pdf_valence3(
                xgrid,
                alpha_up,
                beta_up,
                epsilon_up,
                gamma_up,
                alpha_down,
                beta_down,
                epsilon_down,
                gamma_down,
            )
            t8_pdf = 0.4 * sigma_pdf  # T8 is a scaled version of Sigma

            # Build the PDF grid
            pdf_grid = jnp.array(
                [
                    jnp.zeros_like(xgrid),  # Photon
                    sigma_pdf,  # Σ
                    gluon_pdf,  # g
                    valence_pdf,  # V
                    valence3_pdf,  # V3
                    valence_pdf,  # V8 = V
                    valence_pdf,  # V15 = V
                    valence_pdf,  # V24 = V
                    valence_pdf,  # V35 = V
                    jnp.zeros_like(xgrid),  # T3 = 0
                    t8_pdf,  # T8
                    sigma_pdf,  # T15 = Σ
                    sigma_pdf,  # T24 = Σ
                    sigma_pdf,  # T35 = Σ
                ]
            )
            return pdf_grid

        return pdf_func

The LesHouchesPDF class completes the abstract methods of the PDFModel class, which you can read more about here. This allows for the definition of a specific model in a way that can be used in the colibri code. The LesHouchesPDF class does the following:

takes a list of flavours to be fitted (param_names),
defines the PDF for each flavour,
computes grid values.

Having defined this model, it is used in the production rule produce_pdf_model, defined in the config.py script, shown above. This allows the model to be seen by the rest of the code, so that it can be used to run a fit and perform closure tests.

You can find an example of how to execute the model here.

Implementing a Model in Colibri

Implementing the Les Houches model in Colibri

pyproject.toml

app.py

config.py

model.py

`pyproject.toml`

`app.py`

`config.py`

`model.py`