# Getting Started

## Installation

From [PyPI](https://pypi.org/project/gpyconform/)

```bash
pip install gpyconform
```

From [conda-forge](https://anaconda.org/conda-forge/gpyconform)

```bash
conda install conda-forge::gpyconform
```

## Tutorials

Let us illustrate how to use **GPyConform** to obtain conformal Prediction Intervals (PIs) with both the **Transductive (Full)** 
and the **Inductive (Split)** Gaussian Process Regression (GPR) Conformal Predictors. Since GPyConform is built on top of GPyTorch, 
we’ll closely follow the 
[GPyTorch Simple GP Regression Tutorial](https://docs.gpytorch.ai/en/stable/examples/01_Exact_GPs/Simple_GP_Regression.html).
This makes it easy to see how GPyConform extends GPyTorch’s functionality, while leaving everything else unchanged.

Specifically, we'll be training an RBF kernel Gaussian Process on data generated by the function

$$
\begin{aligned}
  y &= \sin(2\pi x) + \epsilon \\
  \epsilon &\sim \mathcal{N}(0, 0.04) 
\end{aligned}
$$

with inputs sampled uniformly at random from the interval [0,1].

## Transductive (Full) GPR Conformal Prediction

The Transductive (or Full) GPR Conformal Predictor (CP) [1] is implemented in `gpyconform.ExactGPCP`, an extension of GPyTorch's 
`gpytorch.models.ExactGP`, since the transductive version requires changes to the model's internal prediction process. `ExactGPCP` 
assumes the use of `gpytorch.likelihoods.GaussianLikelihood`; other likelihoods are not supported.

In practice, `ExactGPCP` behaves just like `ExactGP`: you can define and train your model in exactly the same way as in the GPyTorch tutorials. 
The only difference is that, in addition to the standard GP predictions, `ExactGPCP` can also provide (symmetric or asymmetric) conformal PIs.

### Imports and data

First, let's import the needed packages and set up the data. For reproducibility we set the random seed to 42. The training set is generated as 
500 random points uniformly distributed in [0,1], on which we evaluate the function and add Gaussian noise to get the training labels. 
The test set is generated in the same way, with 5000 random points:

```python
import math
import torch
import gpytorch
import gpyconform

torch.manual_seed(42)

train_x = torch.rand(500)
train_y = torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * math.sqrt(0.04)

test_x = torch.rand(5000)
test_y = torch.sin(test_x * (2 * math.pi)) + torch.randn(test_x.size()) * math.sqrt(0.04)
```

### Setting up the model

The model is constructed in exactly the same way as with GPyTorch, but using `gpyconform.ExactGPCP` instead of `gpytorch.models.ExactGP`. 
For details please refer to the [GPyTorch documentation](https://docs.gpytorch.ai/en/latest/).

```python
class ExactGPCPModel(gpyconform.ExactGPCP):
    def __init__(self, train_x, train_y, likelihood, cpmode='symmetric'):
        super(ExactGPCPModel, self).__init__(train_x, train_y, likelihood, cpmode=cpmode)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
    
    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPCPModel(train_x, train_y, likelihood)
```

**Note:** Any mean function from `gpytorch.means` and any kernel function that employs an exact prediction 
strategy from `gpytorch.kernels` can be used with GPyConform.
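
For instance, here is a variant of the tutorial model using a Matern kernel in place of the RBF kernel (a sketch for illustration only; it is not used in the rest of this tutorial):

```python
class MaternGPCPModel(gpyconform.ExactGPCP):
    def __init__(self, train_x, train_y, likelihood, cpmode='symmetric'):
        super(MaternGPCPModel, self).__init__(train_x, train_y, likelihood, cpmode=cpmode)
        self.mean_module = gpytorch.means.ConstantMean()
        # Swap the RBF kernel for a Matern 5/2 kernel; everything else stays the same
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.MaternKernel(nu=2.5))

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)
```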

### Model modes

Like the GPyTorch `ExactGP` module, `ExactGPCP` also has a `.train()` and `.eval()` mode:
- `.train()` mode is for optimizing model hyperparameters.
- `.eval()` mode is for computing the Conformal Prediction Intervals and the original GP predictions through the model posterior.

### Training the model

The hyperparameter training of the Gaussian Process is also performed in exactly the same way as in GPyTorch.

```python
training_iter = 50

model.train()
likelihood.train()

optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

for i in range(training_iter):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    print('Iter %d/%d - Loss: %.3f   lengthscale: %.3f   noise: %.3f' % (
        i + 1, training_iter, loss.item(),
        model.covar_module.base_kernel.lengthscale.item(),
        model.likelihood.noise.item()
    ))
    optimizer.step()
```

```text
Iter 1/50 - Loss: 0.882   lengthscale: 0.693   noise: 0.693
Iter 2/50 - Loss: 0.841   lengthscale: 0.644   noise: 0.644
Iter 3/50 - Loss: 0.795   lengthscale: 0.598   noise: 0.598
...
Iter 48/50 - Loss: -0.167   lengthscale: 0.318   noise: 0.030
Iter 49/50 - Loss: -0.170   lengthscale: 0.323   noise: 0.030
Iter 50/50 - Loss: -0.172   lengthscale: 0.328   noise: 0.031
```

### Making predictions: Obtaining prediction intervals

Like in GPyTorch, to make predictions with the model we put it in `.eval()` mode.

`ExactGPCP` objects have an additional parameter named `cpmode`, which determines the CP approach:
- `.cpmode='symmetric'` (default) employs the absolute residual nonconformity measure approach as described in [1].
- `.cpmode='asymmetric'` employs the asymmetric version of the nonconformity measure defined in [1], following the approach described in Chapter 2.3 of [2].
- `.cpmode=None` reverts to GPyTorch's `ExactGP` behavior (for details refer to the GPyTorch documentation).

**Note:** The `cpmode` property can be changed at any time without affecting the model. For `.cpmode=None` we should also put the likelihood in `.eval()` mode 
and call both modules on the test data, as in the sketch below.
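
Here is a minimal sketch of reverting to standard GP predictions with `.cpmode=None`, following the note above and the GPyTorch Simple GP Regression tutorial:

```python
model.cpmode = None
model.eval()
likelihood.eval()

with torch.no_grad():
    observed_pred = likelihood(model(test_x))           # standard GPyTorch predictive distribution
    mean = observed_pred.mean                           # predictive mean
    lower, upper = observed_pred.confidence_region()    # mean +/- 2 standard deviations

model.cpmode = 'symmetric'  # switch back to conformal prediction
```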

After putting the model in `.eval()` mode, and if needed changing `cpmode`, we can then call the model on the test data to obtain the 
corresponding PIs. In addition to the test data, the model (in `.eval()` mode) has the following optional parameters:
- `gamma`: the gamma parameter of the nonconformity measure (float). The default value is 2.
- `confs`: a list of the confidence levels for which to return PIs (torch.Tensor, numpy.array, or list). The default is [0.95].

A trained GPyConform model in `.eval()` mode (with `cpmode` set to either `symmetric` or `asymmetric`) returns a `PredictionIntervals` object containing the 
PIs for all confidence levels in `confs`. The `PredictionIntervals` object allows direct access to the PIs as well as their 
evaluation in terms of empirical calibration and informational efficiency.

So let's put the model in `.eval()` mode and obtain the `PredictionIntervals` object with `gamma` set to 2 for the 99%, 95% and 90% confidence levels.

```python
model.eval()

with torch.no_grad():  # Disable gradient calculation
    pis = model(test_x, gamma=2, confs=[0.99, 0.95, 0.9])
```

We can then extract a torch tensor with the PIs for any of the three confidence levels (here 95%):

```python
print(pis(0.95))
```

```text
tensor([[ 0.4023,  1.1818],
        [-1.2279, -0.4469],
        [-1.3555, -0.5769],
        ...,
        [-1.3729, -0.5945],
        [ 0.2350,  1.0155],
        [ 0.3880,  1.1654]])
```
The output is a tensor with one row per test instance; the two columns give the lower and upper bound of each PI.
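
Since each row is a `[lower, upper]` pair, the bounds and interval widths can be extracted directly, e.g.:

```python
intervals = pis(0.95)
lower, upper = intervals[:, 0], intervals[:, 1]   # lower and upper bounds
widths = upper - lower                            # per-instance interval widths
print(widths.mean().item())                       # average width of the 95% PIs
```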

We can also extract a dictionary with confidence levels as keys and the corresponding PI tensors as values:

```python
print(pis())
```

```text
{
  0.9:  tensor([[ 0.4642,  1.1230],
                [-1.1645, -0.5086],
                [-1.2947, -0.6365],
                ...,
                [-1.3116, -0.6540],
                [ 0.2972,  0.9568],
                [ 0.4483,  1.1060]]),
  0.95: tensor([[ 0.4023,  1.1818],
                [-1.2279, -0.4469],
                [-1.3555, -0.5769],
                ...,
                [-1.3729, -0.5945],
                [ 0.2350,  1.0155],
                [ 0.3880,  1.1654]]),
  0.99: tensor([[ 0.2844,  1.2996],
                [-1.3431, -0.3283],
                [-1.4742, -0.4584],
                ...,
                [-1.4912, -0.4757],
                [ 0.1181,  1.1335],
                [ 0.2733,  1.2876]])
}
```

Additionally, we can evaluate the PIs (here using all available metrics, which is the default) at any of the three confidence levels (here 99%):

```python
print(pis.evaluate(0.99, y=test_y))
```

```text
{'mean_width': 1.01625, 'median_width': 1.01515, 'error': 0.01020}
```
The output is a dictionary with the metrics as keys.
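
As a sanity check, the empirical error can also be computed directly from the PIs and the true targets (assuming, as its name suggests, that `'error'` is the fraction of test targets falling outside their PIs):

```python
intervals = pis(0.99)
outside = (test_y < intervals[:, 0]) | (test_y > intervals[:, 1])
print(outside.float().mean().item())   # should roughly match the 'error' value above
```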

To obtain the corresponding `'asymmetric'` PIs, we only need to set `.cpmode='asymmetric'`:

```python
model.cpmode = 'asymmetric'

with torch.no_grad():  # Disable gradient calculation
    pis = model(test_x, gamma=2, confs=[0.99, 0.95, 0.9])
```

We can extract and/or evaluate the PIs in the same way as above. E.g. to obtain the mean and median widths of the 99% PIs:

```python
pis.evaluate(0.99, ['mean_width', 'median_width'])
```

```text
{'mean_width': 1.08704, 'median_width': 1.08623}
```
Note that in this case, since the 'error' metric is not required, we do not need to provide the true target values.

## Inductive (Split) GPR Conformal Prediction

The Inductive (or Split) GPR Conformal Predictor (CP) is implemented in `gpyconform.GPRICPWrapper`. 
Unlike the transductive version, the inductive version does not modify GPyTorch’s internals. Instead, 
it wraps around an underlying GPR model trained on the **proper training set** and uses a separate 
**calibration set** to construct the conformal Prediction Intervals (PIs).  

In practice, the wrapper can be used with any GPyTorch regression model. It is therefore not restricted 
to `ExactGP` or `GaussianLikelihood`.  

In this tutorial we will focus on the `GPRICPWrapper`, which provides a simple interface for building and calibrating a GPR inductive 
CP on top of a trained GPR model to produce (symmetric or asymmetric) conformal PIs. For advanced use cases, GPyConform 
also includes `gpyconform.InductiveConformalRegressor`, which works at a lower level by 
taking calibration and test predictions (means and variances) together with the corresponding ground truth targets directly, 
rather than through a wrapped model. This makes it possible to apply inductive CP not only to GPyTorch models, 
but also to any other regression framework that exposes predictive means and variances.

### Imports and data

Again, let's first import the needed packages and set up the data. For reproducibility we set the random seed to 42. This time we'll use more 
training (1000) and test (10000) points to highlight the Inductive CP's computational efficiency on larger sets.

```python
import math
import torch
import gpytorch
import gpyconform
from gpyconform.utils import tensor_train_cal_split

torch.manual_seed(42)

train_x = torch.rand(1000)
train_y = torch.sin(train_x * (2 * math.pi)) + torch.randn(train_x.size()) * math.sqrt(0.04)

test_x = torch.rand(10000)
test_y = torch.sin(test_x * (2 * math.pi)) + torch.randn(test_x.size()) * math.sqrt(0.04)
```

### Divide the training data

For Inductive CP (ICP) we divide the training data into two disjoint sets: the **Proper Training** and **Calibration** sets. We can do this 
using the `gpyconform.utils.tensor_train_cal_split` function, which works on torch tensors.

```python
prop_train_x, cal_x, prop_train_y, cal_y = tensor_train_cal_split(
    train_x, train_y,
    cal_size=0.3,
    shuffle=True,
    seed=None,
)
```
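
With `cal_size=0.3`, roughly 30% of the 1000 training points should end up in the calibration set (the exact split behavior of the utility is assumed here):

```python
print(prop_train_x.shape, cal_x.shape)   # expected: torch.Size([700]) torch.Size([300])
```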

### Setting up the model

We construct the exact same `ExactGP` model as in the GPyTorch tutorial (and as in the transductive example) as the base model. 
For details on constructing different types of GPR models please refer to the [GPyTorch documentation](https://docs.gpytorch.ai/en/latest/).

```python
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super(ExactGPModel, self).__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())
    
    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

# initialize likelihood and model
likelihood = gpytorch.likelihoods.GaussianLikelihood()
base_model = ExactGPModel(prop_train_x, prop_train_y, likelihood)
```

### Training the base model

We optimize the hyperparameters of the Gaussian Process on the **proper training data**, rather than on the whole training set.

```python
training_iter = 50

base_model.train()
likelihood.train()

optimizer = torch.optim.Adam(base_model.parameters(), lr=0.1)

mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, base_model)

for i in range(training_iter):
    optimizer.zero_grad()
    output = base_model(prop_train_x)
    loss = -mll(output, prop_train_y)
    loss.backward()
    print('Iter %d/%d - Loss: %.3f   lengthscale: %.3f   noise: %.3f' % (
        i + 1, training_iter, loss.item(),
        base_model.covar_module.base_kernel.lengthscale.item(),
        base_model.likelihood.noise.item()
    ))
    optimizer.step()
```

```text
Iter 1/50 - Loss: 0.880   lengthscale: 0.693   noise: 0.693
Iter 2/50 - Loss: 0.836   lengthscale: 0.644   noise: 0.644
Iter 3/50 - Loss: 0.786   lengthscale: 0.598   noise: 0.598
...
Iter 48/50 - Loss: -0.123   lengthscale: 0.290   noise: 0.034
Iter 49/50 - Loss: -0.126   lengthscale: 0.296   noise: 0.035
Iter 50/50 - Loss: -0.129   lengthscale: 0.301   noise: 0.036
```

### Wrap with ICP and calibrate

We put the base model and likelihood in evaluation mode and construct a GPR ICP Wrapper (`gpyconform.GPRICPWrapper`), providing 
the base model and calibration data. We then calibrate our ICP by providing the nonconformity measure parameter `gamma` (in this example set to 2.0).

`GPRICPWrapper` objects have an additional parameter named `cpmode`, which determines the ICP approach: 
- `.cpmode='symmetric'` (default) employs the standard absolute residual nonconformity measure approach.
- `.cpmode='asymmetric'` employs the asymmetric version of the nonconformity measure for computing the upper and lower bounds individually.
- `.cpmode=None` reverts to the base model's behavior, returning predictive means and variances.

The `cpmode` can be provided as a parameter of the `GPRICPWrapper` constructor and can be changed at any time after construction. **Note** 
that unlike the transductive version, in this case changing `cpmode` requires recalibration.

Let's construct a `GPRICPWrapper` with our base model and calibration data, leaving `cpmode` to the default `'symmetric'` value:

```python
base_model.eval()
likelihood.eval()

gpricpmodel = gpyconform.GPRICPWrapper(base_model, cal_x, cal_y)

gpricpmodel.calibrate(gamma=2.0)
```

### Making predictions: Obtaining prediction intervals

We can now call the `predict` function of `GPRICPWrapper` to obtain a `PredictionIntervals` object containing the conformal PIs 
for any set of confidence levels. Let's obtain a `PredictionIntervals` object for our test data at the 
99%, 95% and 90% confidence levels.

```python
with torch.no_grad():  # Disable gradient calculation
    pis = gpricpmodel.predict(test_x, confs=[0.99, 0.95, 0.9])
```

We can now extract a torch tensor with the PIs for any of the three confidence levels or a dictionary with 
confidence levels as keys and the corresponding PI tensors as values in the same way as in the transductive 
example above. For brevity we only do the former here: 

```python
print(pis(0.95))
```

```text
tensor([[-0.9292, -0.0945],
        [ 0.5864,  1.4195],
        [ 0.5543,  1.3874],
        ...,
        [ 0.5527,  1.3859],
        [-1.2778, -0.4445],
        [-1.4284, -0.5949]])
```

Additionally, we can evaluate the PIs in exactly the same way (here using all available metrics, which is the default) 
at any of the three confidence levels (here 99%):

```python
print(pis.evaluate(0.99, y=test_y))
```

```text
{'mean_width': 1.03457, 'median_width': 1.03386, 'error': 0.0098}
```

To switch to `'asymmetric'` prediction intervals, we need to set `.cpmode='asymmetric'` and recalibrate:

```python
gpricpmodel.cpmode = 'asymmetric'
gpricpmodel.calibrate(gamma=2.0)
```

We can now use the `predict` function to obtain PIs as before. However, since the base model 
predictions are not affected by the change of `cpmode` or `gamma`, we can reuse the cached test predictions by not providing
test inputs as parameters to `predict`.

```python
with torch.no_grad():  # Disable gradient calculation
    pis = gpricpmodel.predict(confs=[0.99, 0.95, 0.9]) 
```

We can extract and/or evaluate the PIs in the same way as above. E.g. to obtain the mean and median widths of 
the 99% PIs:

```python
print(pis.evaluate(0.99, ['mean_width', 'median_width']))
```

```text
{'mean_width': 1.15715, 'median_width': 1.15636}
```