# Getting Started ## Installation From [PyPI](https://pypi.org/project/gpyconform/) ```bash pip install gpyconform ``` From [conda-forge](https://anaconda.org/conda-forge/gpyconform) ```bash conda install conda-forge::gpyconform ``` ## Tutorials Let us illustrate how to use **GPyConform** for obtaining conformal Prediction Intervals (PIs) with both the **Transductive (Full)** and the **Inductive (Split)** Gaussian Process Regression (GPR) Conformal Predictors. Since GPyConform is built on top of GPyTorch, we’ll closely follow the [GPyTorch Simple GP Regression Tutorial](https://docs.gpytorch.ai/en/stable/examples/01_Exact_GPs/Simple_GP_Regression.html). This makes it easy to see how GPyConform extends GPyTorch’s functionality, while leaving everything else unchanged. Specifically, we'll be training an RBF kernel Gaussian Process on data generated by the function $$ \begin{aligned} y &= \sin(2\pi x) + \epsilon \\ \epsilon &\sim \mathcal{N}(0, 0.04) \end{aligned} $$ with inputs sampled uniformly at random from the interval [0,1]. ## Transductive (Full) GPR Conformal Prediction The Transductive (or Full) GPR Conformal Predictor (CP) [1] is implemented in `gpyconform.ExactGPCP`, an extension of GPyTorch's `gpytorch.models.ExactGP`, since the transductive version requires changes to the model's internal prediction process. `ExactGPCP` assumes the use of `gpytorch.likelihoods.GaussianLikelihood`; other likelihoods are not supported. In practice, `ExactGPCP` behaves just like `ExactGP`: you can define and train your model in exactly the same way as in the GPyTorch tutorials. The only difference is that, in addition to the standard GP predictions, `ExactGPCP` can also provide (symmetric or asymmetric) conformal PIs. ### Imports and data First, let's import the needed packages and set up the data. For reproducibility we set the random seed to 42. The training set is generated as 500 random points uniformly distributed in [0,1], on which we evaluate the function and add Gaussian noise to get the training labels. The test set is generated in the same way, with 5000 random points: ```python import math import torch import gpytorch import gpyconform torch.manual_seed(42) train_x = torch.rand(500) train_y = torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * math.sqrt(0.04) test_x = torch.rand(5000) test_y = torch.sin(test_x * (2 * math.pi)) + torch.randn(test_x.size()) * math.sqrt(0.04) ``` ### Setting up the model The model is constructed in exactly the same way as with GPyTorch, but using `gpyconform.ExactGPCP` instead of `gpytorch.models.ExactGP`. For details please refer to the [GPyTorch documentation](https://docs.gpytorch.ai/en/latest/). ```python class ExactGPCPModel(gpyconform.ExactGPCP): def __init__(self, train_x, train_y, likelihood, cpmode='symmetric'): super(ExactGPCPModel, self).__init__(train_x, train_y, likelihood, cpmode=cpmode) self.mean_module = gpytorch.means.ConstantMean() self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel()) def forward(self, x): mean_x = self.mean_module(x) covar_x = self.covar_module(x) return gpytorch.distributions.MultivariateNormal(mean_x, covar_x) likelihood = gpytorch.likelihoods.GaussianLikelihood() model = ExactGPCPModel(train_x, train_y, likelihood) ``` **Note:** Any mean function from `gpytorch.means` and any kernel function that employs an exact prediction strategy from `gpytorch.kernels` can be used with GPyConform. ### Model modes Like the GPyTorch `ExactGP` module, `ExactGPCP` also has a `.train()` and `.eval()` mode: - `.train()` mode is for optimizing model hyperameters. - `.eval()` mode is for computing the Conformal Prediction Intervals and the original GP predictions through the model posterior. ### Training the model The hyperparameter training of the Gaussian Process is also performed in exactly the same way as GPyTorch. ```python training_iter = 50 model.train() likelihood.train() optimizer = torch.optim.Adam(model.parameters(), lr=0.1) mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model) for i in range(training_iter): optimizer.zero_grad() output = model(train_x) loss = -mll(output, train_y) loss.backward() print('Iter %d/%d - Loss: %.3f lengthscale: %.3f noise: %.3f' % ( i + 1, training_iter, loss.item(), model.covar_module.base_kernel.lengthscale.item(), model.likelihood.noise.item() )) optimizer.step() ``` ```text Iter 1/50 - Loss: 0.882 lengthscale: 0.693 noise: 0.693 Iter 2/50 - Loss: 0.841 lengthscale: 0.644 noise: 0.644 Iter 3/50 - Loss: 0.795 lengthscale: 0.598 noise: 0.598 ... Iter 48/50 - Loss: -0.167 lengthscale: 0.318 noise: 0.030 Iter 49/50 - Loss: -0.170 lengthscale: 0.323 noise: 0.030 Iter 50/50 - Loss: -0.172 lengthscale: 0.328 noise: 0.031 ``` ### Making predictions: Obtaining prediction intervals Like in GPyTorch, to make predictions with the model we put it in `.eval()` mode. `ExactGPCP` objects have an additional parameter named `cpmode`, which determines the CP approach: - `.cpmode='symmetric'` (default) employs the absolute residual nonconformity measure approach as described in [1]. - `.cpmode='asymmetric'` employs the asymmetric version of the nonconformity measure defined in [1], following the approach described in Chapter 2.3 of [2]. - `.cpmode=None` reverts to GPyTorch's ExactGP behavior (for details refer to the GPyTorch documentation). **Note:** The ``cpmode`` property can change at any time without affecting the model. For `.cpmode=None` we should also put the likelihood in `.eval()` mode and call both modules on the test data. After putting the model in `.eval()` mode, and if needed changing `cpmode`, we can then call the model on the test data to obtain the corresponding PIs. In addition to the test data, the model (in `.eval()` mode) has the following optional parameters: - `gamma`: the gamma parameter of the nonconformity measure (float). The default value is 2. - `confs`: a list of the confidence levels for which to return PIs (torch.Tensor, numpy.array, or list). The default is [0.95]. A trained GPyConform model in `.eval()` mode (with `cpmode` set to either `symmetric` or `asymmetric`) returns a `PredictionIntervals` object containing the PIs for all confidence levels in `confs`. The `PredictionIntervals` object allows direct access to the PIs as well as their evaluation in terms of empirical calibration and informational efficiency. So let's put the model in `.eval()` mode and obtain the `PredictionIntervals` object with `gamma` set to 2 for the 99%, 95% and 90% confidence levels. ```python model.eval() with torch.no_grad(): # Disable gradient calculation pis = model(test_x, gamma=2, confs=[0.99, 0.95, 0.9]) ``` We can then extract a torch tensor with the PIs for any of the three confidence levels (here 95%): ```python print(pis(0.95)) ``` ```text tensor([[ 0.4023, 1.1818], [-1.2279, -0.4469], [-1.3555, -0.5769], ..., [-1.3729, -0.5945], [ 0.2350, 1.0155], [ 0.3880, 1.1654]]) ``` The output is a tensor with a row for each test instance, and where the two columns specify the lower and upper bound of each PI. We can also extract a dictionary with confidence levels as keys and the corresponding PI tensors as values: ```python print(pis()) ``` ```text { 0.9: tensor([[ 0.4642, 1.1230], [-1.1645, -0.5086], [-1.2947, -0.6365], ..., [-1.3116, -0.6540], [ 0.2972, 0.9568], [ 0.4483, 1.1060]]), 0.95: tensor([[ 0.4023, 1.1818], [-1.2279, -0.4469], [-1.3555, -0.5769], ..., [-1.3729, -0.5945], [ 0.2350, 1.0155], [ 0.3880, 1.1654]]), 0.99: tensor([[ 0.2844, 1.2996], [-1.3431, -0.3283], [-1.4742, -0.4584], ..., [-1.4912, -0.4757], [ 0.1181, 1.1335], [ 0.2733, 1.2876]]) } ``` Additionally, we can evaluate the PIs (here using all available metrics, which is the default), at any of the three confidence levels (here 99%): ```python print(pis.evaluate(0.99, y=test_y)) ``` ```text {'mean_width': 1.01625, 'median_width': 1.01515, 'error': 0.01020} ``` The output is a dictionary with the metrics as keys. To obtain the corresponding `'asymmetric'` PIs, we only set `.cpmode='asymmetric'`: ```python model.cpmode = 'asymmetric' with torch.no_grad(): # Disable gradient calculation pis = model(test_x, gamma=2, confs=[0.99, 0.95, 0.9]) ``` We can extract and/or evaluate the PIs in the same way as above. E.g. to obtain the mean and median widths of the 99% PIs: ```python pis.evaluate(0.99, ['mean_width', 'median_width']) ``` ```text {'mean_width': 1.08704, 'median_width': 1.08623} ``` Note that in this case, since the 'error' metric is not required, we do not need to provide the true target values. ## Inductive (Split) GPR Conformal Prediction The Inductive (or Split) GPR Conformal Predictor (CP) is implemented in `gpyconform.GPRICPWrapper`. Unlike the transductive version, the inductive version does not modify GPyTorch’s internals. Instead, it wraps around an underlying GPR model trained on the **proper training set** and uses a separate **calibration set** to construct the conformal Prediction Intervals (PIs). In practice, the wrapper can be used with any GPyTorch regression model. It is therefore not restricted to `ExactGP` or `GaussianLikelihood`. In this tutorial we will focus on the `GPRICPWrapper`, which provides a simple interface for building and calibrating a GPR inductive CP on top of a trained GPR model to produce (symmetric or asymmetric) conformal PIs. For advanced use cases, GPyConform also includes `gpyconform.InductiveConformalRegressor`, which works at a lower level by taking calibration and test predictions (means and variances) together with the corresponding ground truth targets directly, rather than through a wrapped model. This makes it possible to apply inductive CP not only to GPyTorch models, but also to any other regression framework that exposes predictive means and variances. ### Imports and data Again let's first import the needed packages and set up the data. For reproducibility we set the random seed to 42. We'll be using more training (1000) and test (10000) points, to highlight Inductive CP's computational efficiency with larger sets. ```python import math import torch import gpytorch import gpyconform from gpyconform.utils import tensor_train_cal_split torch.manual_seed(42) train_x = torch.rand(1000) train_y = torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * math.sqrt(0.04) test_x = torch.rand(10000) test_y = torch.sin(test_x * (2 * math.pi)) + torch.randn(test_x.size()) * math.sqrt(0.04) ``` ### Divide the training data For Inductive CP (ICP) we divide the training data into two disjoint sets: the **Proper Training** and **Calibration** sets. We can do this using the `gpyconform.utils.tensor_train_cal_split` function, which works on torch tensors. ```python prop_train_x, cal_x, prop_train_y, cal_y = tensor_train_cal_split( train_x, train_y, cal_size=0.3, shuffle=True, seed=None, ) ``` ### Setting up the model We construct the exact same `ExactGP` model as in the GPyTorch tutorial (and as in the transductive example) as base model. For details on constructing different types of GPR models please refer to the [GPyTorch documentation](https://docs.gpytorch.ai/en/latest/). ```python class ExactGPModel(gpytorch.models.ExactGP): def __init__(self, train_x, train_y, likelihood): super(ExactGPModel, self).__init__(train_x, train_y, likelihood) self.mean_module = gpytorch.means.ConstantMean() self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel()) def forward(self, x): mean_x = self.mean_module(x) covar_x = self.covar_module(x) return gpytorch.distributions.MultivariateNormal(mean_x, covar_x) # initialize likelihood and model likelihood = gpytorch.likelihoods.GaussianLikelihood() base_model = ExactGPModel(prop_train_x, prop_train_y, likelihood) ``` ### Training the base model We optimize the hyperparameters of the Gaussian Process on the **proper training data**, rather than one the whole training set. ```python training_iter = 50 base_model.train() likelihood.train() optimizer = torch.optim.Adam(base_model.parameters(), lr=0.1) mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, base_model) for i in range(training_iter): optimizer.zero_grad() output = base_model(prop_train_x) loss = -mll(output, prop_train_y) loss.backward() print('Iter %d/%d - Loss: %.3f lengthscale: %.3f noise: %.3f' % ( i + 1, training_iter, loss.item(), base_model.covar_module.base_kernel.lengthscale.item(), base_model.likelihood.noise.item() )) optimizer.step() ``` ```text Iter 1/50 - Loss: 0.880 lengthscale: 0.693 noise: 0.693 Iter 2/50 - Loss: 0.836 lengthscale: 0.644 noise: 0.644 Iter 3/50 - Loss: 0.786 lengthscale: 0.598 noise: 0.598 ... Iter 48/50 - Loss: -0.123 lengthscale: 0.290 noise: 0.034 Iter 49/50 - Loss: -0.126 lengthscale: 0.296 noise: 0.035 Iter 50/50 - Loss: -0.129 lengthscale: 0.301 noise: 0.036 ``` ### Wrap with ICP and calibrate We put the base model and likelihood in evaluation mode and construct a GPR ICP Wrapper (`gpyconform.GPRICPWrapper`) providing the base model and calibration data. We then calibrate our ICP providing the nonconformity measure parameter `gamma` (in this example set to 2.0). `GPRICPWrapper` objects have an additional parameter named `cpmode`, which determines the ICP approach: - `.cpmode='symmetric'` (default) employs the standard absolute residual nonconformity measure approch. - `.cpmode='asymmetric'` employs the asymmetric version of the nonconformity measure for computing upper and lower bounds individually. - `.cpmode=None` Reverts to the base model behavior; returning predictive means and variances. The `cpmode` can be provided as a parameter of the `GPRICPWrapper` constructor and can be changed at any time after construction. **Note** that unlike the transductive version, in this case changing `cpmode` requires recalibration. Let's construct a `GPRICPWrapper` with our base model and calibration data, leaving `cpmode` to the default `'symmetric'` value: ```python base_model.eval() likelihood.eval() gpricpmodel = gpyconform.GPRICPWrapper(base_model, cal_x, cal_y) gpricpmodel.calibrate(gamma=2.0) ``` ### Making predictions: Obtaining prediction intervals We can now call the `predict` function of `GPRICPWrapper` to obtain a `PredictionIntervals` object containing the conformal PIs for any set of confidence levels. Let's obtain a `PredictionIntervals` object for our test data at the 99%, 95% and 90% confidence levels. ```python with torch.no_grad(): # Disable gradient calculation pis = gpricpmodel.predict(test_x, confs=[0.99, 0.95, 0.9]) ``` We can now extract a torch tensor with the PIs for any of the three confidence levels or a dictionary with confidence levels as keys and the corresponding PI tensors as values in the same way as in the transductive example above. For brevity we only do the former here: ```python print(pis(0.95)) ``` ```text tensor([[-0.9292, -0.0945], [ 0.5864, 1.4195], [ 0.5543, 1.3874], ..., [ 0.5527, 1.3859], [-1.2778, -0.4445], [-1.4284, -0.5949]]) ``` Aditionally, we can evaluate the PIs in exactly the same way (here using all available metrics, which is the default), at any of the three confidence levels (here 99%): ```python print(pis.evaluate(0.99, y=test_y)) ``` ```text {'mean_width': 1.03457, 'median_width': 1.03386, 'error': 0.0098} ``` To switch to `'asymmetric'` prediction intervals, we need to set `.cpmode='asymmetric'` and recalibrate: ```python gpricpmodel.cpmode = 'asymmetric' gpricpmodel.calibrate(gamma=2.0) ``` We can now use the `predict` function to obtain PIs as before. However, since the base model predictions are not affected by the change of `cpmode` or `gamma`, we can reuse the cached test predictions by not providing test inputs as parameters to `predict`. ```python with torch.no_grad(): # Disable gradient calculation pis = gpricpmodel.predict(confs=[0.99, 0.95, 0.9]) ``` We can extract and/or evaluate the PIs in the same way as above. E.g. to obtain the mean and median widths of the 99% PIs: ```python print(pis.evaluate(0.99, ['mean_width', 'median_width'])) ``` ```text {'mean_width': 1.15715, 'median_width': 1.15636} ```