The GPyConform Package#

Transductive (Full) Conformal Prediction#

class gpyconform.ExactGPCP(train_inputs, train_targets, likelihood, cpmode='symmetric')[source]#

Bases: ExactGP

Extends GPyTorch’s ExactGP to produce Conformal Prediction Intervals, changing behavior only in evaluation (.eval()) mode. It implements both the symmetric approach described in [1] and its asymmetric version, following the approach described in Chapter 2.3 of [2]. For details on the functionality inherited from ExactGP, see the GPyTorch documentation.

Parameters:
  • train_inputs (torch.Tensor of shape (n_train, n_features)) – Training features.

  • train_targets (torch.Tensor of shape (n_train,)) – Training targets.

  • likelihood (gpytorch.likelihoods.GaussianLikelihood) – Gaussian likelihood (required for this transductive CP implementation). Other likelihoods are not supported.

  • cpmode ({'symmetric', 'asymmetric', None}, optional, default='symmetric') – Mode of the Conformal Prediction:
      - 'symmetric': Employs the absolute residual nonconformity measure approach described in [1].
      - 'asymmetric': Employs the asymmetric version of the nonconformity measure defined in [1], following the approach described in Chapter 2.3 of [2].
      - None: Reverts to standard ExactGP behavior.

Notes

  • The cpmode property can be changed at any time without retraining.

  • Internally, GPyConform applies a small monkey-patch to GPyTorch’s default exact prediction strategy to expose CP Prediction Intervals. The constructor ensures this is applied unless patching is explicitly disabled.

References

[1] Harris Papadopoulos. Guaranteed Coverage Prediction Intervals with Gaussian Process Regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024. DOI: 10.1109/TPAMI.2024.3418214. (arXiv version).

[2] Vladimir Vovk, Alexander Gammerman, Glenn Shafer. Algorithmic Learning in a Random World, 2nd Ed. Springer, 2023. DOI: 10.1007/978-3-031-06649-8.

Examples

Assuming train_x and train_y are torch tensors with the training features and targets respectively, a Gaussian Process Regression model with Conformal Prediction capabilities can be formed by:

import gpytorch
import gpyconform

# Construct the model
class MyGPCP(gpyconform.ExactGPCP):
    def __init__(self, train_x, train_y, likelihood, cpmode='symmetric'):
        super(MyGPCP, self).__init__(train_x, train_y, likelihood, cpmode=cpmode)
        self.mean_module = gpytorch.means.ZeroMean()  # Prior mean - any mean module can be used
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        mean = self.mean_module(x)
        covar = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean, covar)

# Initialize likelihood and model
likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = MyGPCP(train_x, train_y, likelihood, 'symmetric')

# If needed change the cpmode property at any time
model.cpmode = 'asymmetric'

Notes

  • Any mean module from gpytorch.means and any covariance module from gpytorch.kernels that is compatible with the ExactPredictionStrategy (e.g. RBF, Matern, SpectralMixture, and standard Scale/Add/Product compositions) can be used.

  • Approximation kernels (e.g. GridInterpolationKernel, InducingPointKernel) are not supported by this transductive approach.

__call__(*args, **kwargs)[source]#

In evaluation (.eval()) mode, calling this model with test inputs will return the symmetric or asymmetric Conformal Prediction Intervals depending on cpmode.

Parameters (in CP / .eval() mode):
  • test_inputs (torch.Tensor of shape (n_test, n_features)) – Test features.

  • gamma (float, default=2) – Nonconformity measure parameter controlling sensitivity to predictive variance differences.

  • confs (array-like of float in (0, 1), optional, default=[0.95]) – Confidence levels for which to return Prediction Intervals.

Returns:
  • PredictionIntervals (gpyconform.PredictionIntervals) – If CP is enabled (cpmode not None) and the model is in .eval() mode, returns the Prediction Intervals for each confidence level in confs.

  • gpytorch.distributions.MultivariateNormal – If CP is disabled (cpmode=None) or the model is not in .eval() mode, returns the usual latent posterior from ExactGP.

Notes

The gamma and confs parameters are used only when CP is enabled (cpmode not None) and the model is in .eval() mode. They are ignored in all other cases.

Examples

Assuming model is an instance of a GP Conformal Regressor with optimized hyperparameters and test_x is a torch tensor containing the test features, the Conformal Prediction Intervals at the 90%, 95%, and 99% confidence levels, with the nonconformity measure parameter gamma set to 2, can be obtained as an instance of PredictionIntervals by:

model.eval()

with torch.no_grad():  # Disable gradient calculation
    PIs = model(test_x, gamma=2, confs=[0.9, 0.95, 0.99])

Inductive (Split) Conformal Prediction#

class gpyconform.InductiveConformalRegressor(cal_targets, cal_preds, cal_vars=None, gamma=2.0, cpmode='symmetric', device: device | str | None = None)[source]#

Bases: object

Inductive conformal regressor calibrated on a given calibration set.

This class stores calibration residuals (normalized or not) computed from a calibration set and produces prediction intervals (PIs) for new points. It can be used not only with GPyTorch models, but with any regression framework that provides predictive means and variances.

Parameters:
  • cal_targets (torch.Tensor of shape (n_cal,)) – Calibration targets.

  • cal_preds (torch.Tensor of shape (n_cal,)) – Predictive means at the calibration inputs.

  • cal_vars (torch.Tensor of shape (n_cal,), optional) – Predictive variances at the calibration inputs. If None, the non-normalized residuals are used as the nonconformity measure (equivalent to a very large gamma; see gamma below).

  • gamma (float, optional, default=2.0) – Power parameter for the normalized ICP nonconformity. If gamma >= 1e8, normalization is short-circuited (equivalent to unnormalized nonconformity).

  • cpmode ({'symmetric', 'asymmetric'}, optional, default='symmetric') – Conformal mode. 'symmetric' uses absolute residuals; 'asymmetric' uses signed residuals. (None is not accepted here.)

  • device (torch.device or str, optional) – Device for internal tensors. Defaults to the device inferred from the inputs.
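
As a rough illustration of how gamma interacts with the predictive variances, the sketch below assumes a normalized score of the form |y - mean| / max(var, eps)^(1/gamma). Both this formula and the helper name nonconformity_scores are assumptions made for illustration in plain Python; they are not taken from the GPyConform source. Under this assumption, gamma=2 divides each residual by the predictive standard deviation, and as gamma grows the divisor tends to 1, which matches the unnormalized case described above.

```python
EPS = 1e-12  # clamp value mentioned in the Notes; how it is applied here is assumed


def nonconformity_scores(targets, means, variances=None, gamma=2.0):
    """Absolute residuals, optionally divided by variance ** (1 / gamma)."""
    residuals = [abs(y - m) for y, m in zip(targets, means)]
    if variances is None or gamma >= 1e8:
        return residuals  # unnormalized nonconformity
    return [r / max(v, EPS) ** (1.0 / gamma) for r, v in zip(residuals, variances)]


targets = [1.0, 2.0, 3.0]
means = [1.5, 2.0, 2.0]
variances = [0.25, 1.0, 4.0]

# gamma=2 divides by the standard deviations 0.5, 1.0 and 2.0
normalized = nonconformity_scores(targets, means, variances, gamma=2.0)
# a very large gamma leaves the absolute residuals untouched
unnormalized = nonconformity_scores(targets, means, variances, gamma=1e8)
```

In this toy data the third point has the largest raw residual but also the largest variance, so after normalization it is no longer the most nonconforming example.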

Notes

  • All ICP computations inside this class are performed in float64.

  • A small epsilon clamp (1e-12) is used when taking powers of variances.

  • The calibrated nonconformity scores (and therefore the resulting PIs) are tied to the chosen cpmode and gamma. For a different cpmode or gamma, construct a new InductiveConformalRegressor.

Examples

Assuming cal_x and cal_y are torch tensors with the calibration set inputs and targets, and cal_means and cal_vars are torch tensors with the predictive means and variances at cal_x, an InductiveConformalRegressor with the absolute residual nonconformity measure (cpmode='symmetric') and gamma set to 2 can be constructed by:

>>> calibratedICR = InductiveConformalRegressor(
...     cal_y, cal_means, cal_vars, gamma=2.0, cpmode='symmetric'
... )

__call__(test_preds, test_vars=None, confs=None)[source]#

Produce prediction intervals at requested confidence levels.

Parameters:
  • test_preds (torch.Tensor of shape (n_test,)) – Predictive means at the test inputs.

  • test_vars (torch.Tensor of shape (n_test,), optional) – Predictive variances at the test inputs. Required if cal_vars were provided at construction time; ignored otherwise.

  • confs (array-like of float in (0, 1), optional, default=[0.95]) – Confidence levels for which to return PIs.

Returns:

PredictionIntervals – Prediction intervals for each confidence level in confs. The returned object keeps the intervals on the same device as this InductiveConformalRegressor instance.

Return type:

gpyconform.PredictionIntervals
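
To make the difference between the two modes concrete, here is a minimal split-conformal sketch in plain Python. It is not the GPyConform implementation: the helper names are hypothetical, the residuals are left unnormalized for brevity, and the ceil((n + 1) * level) rank rule and the even split of the error budget between the two tails in the asymmetric case are the usual textbook choices, assumed here rather than confirmed from the source.

```python
import math


def rank_quantile(sorted_scores, level):
    """The ceil((n + 1) * level)-th smallest score, or inf when the rank exceeds n."""
    n = len(sorted_scores)
    # round() guards against float noise such as 18.000000000000004
    k = math.ceil(round((n + 1) * level, 9))
    return math.inf if k > n else sorted_scores[k - 1]


def symmetric_interval(cal_residuals, test_mean, conf):
    """Interval from absolute residuals: the same margin on both sides."""
    q = rank_quantile(sorted(abs(r) for r in cal_residuals), conf)
    return test_mean - q, test_mean + q


def asymmetric_interval(cal_residuals, test_mean, conf):
    """Interval from signed residuals: each tail gets half of the error budget."""
    tail_level = 1.0 - (1.0 - conf) / 2.0
    upper = rank_quantile(sorted(cal_residuals), tail_level)
    lower = rank_quantile(sorted(-r for r in cal_residuals), tail_level)
    return test_mean - lower, test_mean + upper


# Skewed calibration residuals (target minus predicted mean): -4, -3, ..., 14
cal_residuals = [float(i) for i in range(-4, 15)]

sym = symmetric_interval(cal_residuals, 0.0, 0.8)
asym = asymmetric_interval(cal_residuals, 0.0, 0.8)
```

With these skewed residuals the symmetric interval extends equally far in both directions, while the asymmetric one is short on the side where the calibration errors were small.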

Examples

Assuming calibratedICR is an InductiveConformalRegressor, and test_means and test_vars are torch tensors with the predictive means and variances at the test inputs, the Conformal Prediction Intervals at the 90%, 95%, and 99% confidence levels can be obtained as an instance of PredictionIntervals by:

>>> PIs = calibratedICR(test_means, test_vars, confs=[0.9, 0.95, 0.99])

class gpyconform.GPRICPWrapper(model: GP, cal_inputs: Tensor, cal_targets: Tensor, cpmode='symmetric', device: device | str | None = None, batch_size: int | None = None)[source]#

Bases: object

Inductive Conformal Prediction wrapper for a trained GPyTorch GP model.

This wrapper preserves the GP model’s dtype for predictions and (optionally) moves the model to a chosen device. All ICP computations are performed in float64 for numerical stability. It supports both symmetric and asymmetric ICP and can reuse cached predictions.

Parameters:
  • model (gpytorch.models.GP) – Trained GPyTorch model. The model’s dtype is not modified by this wrapper. Ensure the model (and likelihood) are in .eval() mode before calibrating and predicting.

  • cal_inputs (torch.Tensor of shape (n_cal, n_features) | torch.utils.data.DataLoader) – Calibration features (tensor or DataLoader yielding tensors). Used to compute calibration predictions.

  • cal_targets (torch.Tensor of shape (n_cal,)) – Calibration targets.

  • cpmode ({'symmetric', 'asymmetric', None}, optional, default='symmetric') – Mode of the Conformal Prediction:
      - 'symmetric': Employs the absolute residual nonconformity measure approach described in [1].
      - 'asymmetric': Employs the asymmetric version of the nonconformity measure defined in [1], following the approach described in Chapter 2.3 of [2].
      - None: Reverts to the provided model behavior; predict() returns (means, vars).

  • device (torch.device or str, optional) – Device to place the model on. If omitted, the current model device is used.

  • batch_size (int, optional) – Batch size used when evaluating the calibration predictions from a tensor. If omitted, a single batch with all instances is used.

Notes

  • If a DataLoader is provided for cal_inputs, ensure it yields batches in a fixed order (shuffle=False) so predictions align with cal_targets.

  • Inputs passed to the GP are cast to the model’s dtype/device.

  • All ICP calculations run in float64 for numerical stability.

  • Calibration depends on the current cpmode and the chosen gamma. If either changes, call calibrate() again.

  • When cpmode=None, this wrapper returns (means, vars) tensors (not a MultivariateNormal like gpyconform.ExactGPCP).

  • __call__ is an alias of predict().
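
The batching and ordering points above can be illustrated without GPyTorch. The sketch below uses a hypothetical predict_in_batches helper and a stand-in toy_predict function (both are assumptions for illustration, not part of GPyConform) to show that evaluating inputs in fixed-order batches and concatenating the results yields predictions aligned one-to-one with the calibration targets.

```python
def predict_in_batches(predict_fn, inputs, batch_size=None):
    """Apply predict_fn to consecutive slices of inputs and concatenate the results."""
    if batch_size is None:
        batch_size = max(len(inputs), 1)  # a single batch with all instances
    means, variances = [], []
    for start in range(0, len(inputs), batch_size):
        batch_means, batch_vars = predict_fn(inputs[start:start + batch_size])
        means.extend(batch_means)
        variances.extend(batch_vars)
    return means, variances


def toy_predict(batch):
    """Stand-in for a model: the mean is 2 * x and the variance is a constant 0.1."""
    return [2.0 * x for x in batch], [0.1 for _ in batch]


cal_inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
full = predict_in_batches(toy_predict, cal_inputs)
batched = predict_in_batches(toy_predict, cal_inputs, batch_size=2)
```

Because the slices are taken in order, the batched result matches the single-batch result. A shuffled loader would break this alignment, which is why shuffle=False is required when a DataLoader is passed.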

Examples

Assuming model is a GPyTorch model trained on the proper-training set, and cal_x, cal_y, and test_x are torch tensors with the calibration set inputs, the calibration set targets, and the test inputs, a GPRICPWrapper with the absolute residual nonconformity measure (cpmode='symmetric') can be constructed, calibrated, and used for prediction by:

>>> icp = GPRICPWrapper(model, cal_x, cal_y, cpmode='symmetric')
>>> icp.calibrate(gamma=2.0)
>>> PIs = icp.predict(test_x, confs=[0.9, 0.95])
>>> PIs(0.95)

References

[1] Harris Papadopoulos. Guaranteed Coverage Prediction Intervals with Gaussian Process Regression. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024. DOI: 10.1109/TPAMI.2024.3418214. (arXiv version).

[2] Vladimir Vovk, Alexander Gammerman, Glenn Shafer. Algorithmic Learning in a Random World, 2nd Ed. Springer, 2023. DOI: 10.1007/978-3-031-06649-8.

refresh_model(model: GP, cal_inputs: Tensor, cal_targets: Tensor, device: device | str | None = None, batch_size: int | None = None)[source]#

Replace the underlying GP model and recompute calibration predictions.

Parameters:
  • model (gpytorch.models.GP) – New trained model. The model’s dtype is not modified.

  • cal_inputs (torch.Tensor of shape (n_cal, n_features) or DataLoader) – Calibration features (tensor or DataLoader yielding tensors).

  • cal_targets (torch.Tensor of shape (n_cal,)) – Calibration targets.

  • device (torch.device or str, optional) – Device to place the model on. If omitted, keeps current device.

  • batch_size (int, optional) – Batch size used when evaluating the calibration predictions from a tensor. If omitted, a single batch with all instances is used.

Returns:

The wrapper itself (for chaining).

Return type:

GPRICPWrapper

Notes

This resets any previously calibrated ICP state. You must call calibrate() again before requesting intervals.

Examples

Assuming icp is a GPRICPWrapper whose underlying model has been modified or is to be replaced, new_model is the modified or new GPyTorch model (trained on the proper-training set), and cal_x and cal_y are torch tensors with the calibration set inputs and targets, the icp object is refreshed by:

>>> icp = icp.refresh_model(new_model, cal_x, cal_y)

property cpmode#

Get the current Conformal Prediction mode.

calibrate(gamma=2.0)[source]#

Calibrate the ICP regressor on the stored calibration set.

Calibration uses the wrapper’s current cpmode (not passed as an argument) and the provided gamma. If cpmode or gamma changes, call this method again before requesting prediction intervals.

Parameters:

gamma (float, optional, default=2.0) – Power parameter for normalized ICP nonconformity. If gamma >= 1e8, normalization is short-circuited (equivalent to unnormalized nonconformity).

Returns:

The wrapper itself (for chaining).

Return type:

GPRICPWrapper

Notes

  • If cpmode=None (CP disabled), this method has no effect.

  • If the underlying model is changed after construction, the wrapper should be updated by calling refresh_model() before calibration.

Examples

Assuming icp is a GPRICPWrapper, it can be calibrated with the nonconformity measure parameter (gamma) set to 2 by:

>>> icp.calibrate(gamma=2.0)

predict(test_inputs=None, confs=None, batch_size: int | None = None)[source]#

Predict on test inputs and produce conformal prediction intervals.

Parameters:
  • test_inputs (torch.Tensor of shape (n_test, n_features) or torch.utils.data.DataLoader, optional) – Test inputs. If None, reuse cached predictions from a previous call. If a tensor is provided and batch_size is set, predictions are computed in batches.

  • confs (array-like of float in (0, 1), optional, default=[0.95]) – Confidence levels. Ignored when cpmode=None.

  • batch_size (int, optional) – Batch size when evaluating a tensor of test inputs. If omitted, a single batch with all instances is used.

Returns:

  • PredictionIntervals (gpyconform.PredictionIntervals) – If CP is enabled (cpmode not None) and the wrapper has been calibrated, returns prediction intervals for each confidence level in confs. The returned object keeps the intervals on the same device as this GPRICPWrapper instance.

  • (means, vars) (tuple[torch.Tensor, torch.Tensor]) – If CP is disabled (cpmode=None), returns predictive means and variances.

Notes

  • __call__ is an alias of predict().

  • If the underlying model is changed after construction, the wrapper should be updated by calling refresh_model() and then calibrated by calling calibrate() before prediction.

Examples

Assuming icp is a calibrated GPRICPWrapper and test_x is a torch tensor with the test set inputs, the Conformal Prediction Intervals at the 90% and 95% confidence levels can be obtained as an instance of PredictionIntervals by:

>>> PIs = icp.predict(test_x, confs=[0.9, 0.95])

or by:

>>> PIs = icp(test_x, confs=[0.9, 0.95])

Conformal Prediction Intervals#

class gpyconform.PredictionIntervals(conf_levels: Tensor, all_pis: Tensor)[source]#

Bases: object

Contains the conformal Prediction Intervals (PIs) at one or more confidence levels and provides functionality for their retrieval and evaluation.

Notes

Confidence levels are internally stored using fixed-point integer keys to avoid float equality issues when retrieving intervals for a requested level.
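
The float equality issue mentioned above is easy to reproduce. The sketch below is a plain-Python illustration of the idea, not the class's actual storage scheme: the helper name conf_key and the scale of 10**6 (six decimal places) are assumptions.

```python
SCALE = 10 ** 6  # six decimal places; the exact scale is an assumption


def conf_key(conf_level):
    """Map a confidence level in (0, 1) to an integer key."""
    return round(conf_level * SCALE)


stored = {conf_key(c): f"intervals at {c}" for c in (0.9, 0.95, 0.99)}

# A level produced by arithmetic may differ from the literal in its last bit
requested = 0.7 + 0.2

float_match = requested == 0.9               # direct float comparison fails
key_match = conf_key(requested) in stored    # the integer key still matches
```

Rounding to an integer key absorbs the last-bit difference, so a lookup with a computed level finds the intervals stored under the literal 0.9.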

__call__(conf_level: float | None = None, *, y_min=-inf, y_max=inf, dp: int = 6)[source]#

Returns the Prediction Intervals for a specified confidence level or all intervals if confidence level is not specified.

Parameters:
  • conf_level (float in range (0,1), optional) – Confidence level for which to return the corresponding Prediction Intervals. If not specified, the Prediction Intervals for all confidence levels will be returned.

  • y_min (float, keyword-only, default=-inf) – If provided, PIs are cut to exclude values below y_min.

  • y_max (float, keyword-only, default=inf) – If provided, PIs are cut to exclude values above y_max.

  • dp (int in range [1,6], keyword-only, default=6) – Number of decimals to show in the string keys when returning all levels.

Returns:

A torch tensor with the Prediction Intervals for the specified conf_level, or a dictionary with confidence levels as keys (str) and the corresponding Prediction Interval tensors as values if conf_level is None.

Return type:

torch.Tensor or dict[str, torch.Tensor]
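
The effect of y_min and y_max can be pictured as clamping both interval bounds to a known target range, for example when the target cannot be negative. The following plain-Python sketch uses a hypothetical clip_intervals helper on (lower, upper) pairs; it is an assumption about the behavior described above, not the library code.

```python
import math


def clip_intervals(intervals, y_min=-math.inf, y_max=math.inf):
    """Clamp each (lower, upper) pair to the range [y_min, y_max]."""
    return [
        (min(max(lo, y_min), y_max), max(min(hi, y_max), y_min))
        for lo, hi in intervals
    ]


raw = [(-1.5, 2.0), (0.5, math.inf), (-math.inf, 12.0)]
clipped = clip_intervals(raw, y_min=0.0, y_max=10.0)
```

With the default bounds of -inf and inf the intervals are returned unchanged.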

Examples

The following examples assume that PIs is an instance of PredictionIntervals that includes the 95% confidence level.

To retrieve the Prediction Intervals at the 95% confidence level as a tensor:

>>> intervals = PIs(0.95)
>>> print(intervals)

To retrieve the Prediction Intervals for all confidence levels as a dictionary:

>>> all_intervals = PIs()
>>> print(all_intervals)

evaluate(conf_level, metrics=None, y=None, *, y_min=-inf, y_max=inf)[source]#

Evaluates the Prediction Intervals at a specified confidence level.

Parameters:
  • conf_level (float in range (0,1)) – Confidence level of the Prediction Intervals to be evaluated.

  • metrics (list of str or str, optional, default=['mean_width', 'median_width', 'error']) – Metrics to calculate. Possible options:
      - ‘mean_width’: Average width of the Prediction Intervals.
      - ‘median_width’: Median width of the Prediction Intervals.
      - ‘error’: Percentage of Prediction Intervals that do not contain the true target value.

  • y (torch.Tensor of shape (n_test,), optional, default=None) – True target values, required for calculating the ‘error’ metric. If not provided, ‘error’ is not calculated.

  • y_min (float, keyword-only, default=-inf) – If provided, PIs are evaluated after cutting values below y_min.

  • y_max (float, keyword-only, default=inf) – If provided, PIs are evaluated after cutting values above y_max.

Returns:

results – A dictionary with a key for each metric in metrics and the calculated result as its value. For example: {‘mean_width’: 3.852, ‘error’: 0.049}.

Return type:

dict
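
For intuition about the three metrics, the sketch below computes them by hand for a list of (lower, upper) pairs in plain Python. The function name evaluate_intervals is hypothetical, and two details are assumptions: the interval is treated as closed, and 'error' is returned as a fraction (as the example value 0.049 above suggests) rather than multiplied by 100.

```python
import statistics


def evaluate_intervals(intervals, y=None):
    """Compute mean width, median width and, if targets are given, the error rate."""
    widths = [hi - lo for lo, hi in intervals]
    results = {
        "mean_width": statistics.fmean(widths),
        "median_width": statistics.median(widths),
    }
    if y is not None:
        # Count the targets that fall outside their interval
        misses = sum(1 for (lo, hi), t in zip(intervals, y) if not lo <= t <= hi)
        results["error"] = misses / len(intervals)
    return results


intervals = [(0.0, 2.0), (1.0, 2.0), (-1.0, 3.0), (2.0, 5.0)]
test_y = [1.0, 2.5, 0.0, 4.0]
results = evaluate_intervals(intervals, y=test_y)
```

Here one of the four targets (2.5) lies outside its interval, so the error is 0.25. For well-calibrated intervals at confidence level c, the error on exchangeable test data is expected to stay at or below 1 - c on average.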

Examples

The following examples assume that PIs is an instance of PredictionIntervals that includes the 99% confidence level, and that test_y is a tensor with the true targets.

To evaluate the Prediction Intervals at the 99% confidence level using all available metrics (which is the default):

>>> results = PIs.evaluate(0.99, y=test_y)

To evaluate only the mean width of the Prediction Intervals at the 99% confidence level:

>>> results = PIs.evaluate(0.99, metrics='mean_width')