Core Estimators

NestedCVClassifier and NestedCVRegressor are the two main entry points for running nested cross-validation in nestkit.

Classifier

class nestkit.NestedCVClassifier(estimator, param_grid, *, search_strategy='grid', outer_cv=5, inner_cv=5, scoring=None, refit=True, return_train_score=False, return_estimator=True, error_score='raise', n_jobs_outer=None, n_jobs_inner=None, verbose=0, random_state=None, callbacks=None, pre_dispatch='2*n_jobs', calibration_method=None, threshold_strategy=None, threshold_criterion='youden', threshold_beta=1.0, cost_matrix=None, min_recall=None, calibration_cv=None, conformal_prediction=False, conformal_alpha=0.1)

Bases: _BaseNestedCV

Nested cross-validation for classification tasks.

Supports binary and multiclass classification. Extends _BaseNestedCV with optional post-hoc probability calibration and decision-threshold optimization. Both features are disabled by default and must be explicitly enabled.

When calibration is enabled, out-of-fold (OOF) predictions from the inner CV are used to fit a calibrator, which is then applied to the outer test-set probabilities. When threshold optimization is enabled, the optimal decision boundary is selected on the calibrated (or raw) OOF probabilities.
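The threshold step described above can be sketched in plain Python. This is a minimal illustration, not nestkit's implementation: the helper names youden_j and select_threshold and the sample numbers are made up for the example, and only the (y_true, y_proba, threshold) callable signature comes from the threshold_criterion description. The point is that the threshold is chosen on inner OOF probabilities only and then applied unchanged to the outer test fold.

```python
def youden_j(y_true, y_proba, threshold):
    """Youden's J = sensitivity + specificity - 1 at a given threshold."""
    tp = fn = tn = fp = 0
    for label, p in zip(y_true, y_proba):
        predicted = 1 if p >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity + specificity - 1.0


def select_threshold(y_true, y_proba, criterion):
    """Return the candidate threshold that maximizes `criterion`."""
    # Hypothetical helper: every distinct OOF probability is a candidate.
    candidates = sorted(set(y_proba))
    return max(candidates, key=lambda t: criterion(y_true, y_proba, t))


# Illustrative inner-CV OOF labels and probabilities (made-up numbers).
oof_y = [0, 0, 0, 1, 1, 0, 1, 1]
oof_proba = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]

threshold = select_threshold(oof_y, oof_proba, youden_j)

# The threshold never sees the outer test fold; it is only applied to it.
outer_test_proba = [0.30, 0.45, 0.70]
outer_test_pred = [int(p >= threshold) for p in outer_test_proba]
```

Because selection uses only OOF data from the outer training portion, the outer test score remains an honest estimate of the whole pipeline, threshold choice included.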

Parameters:

estimator (estimator object) – A scikit-learn compatible classifier that implements fit and predict_proba. Cloned for each outer fold.
param_grid (dict or list of dict) – Hyperparameter search space. See GridSearchCV.
search_strategy ({'grid', 'random', 'bayesian'}, default='grid') – Inner hyperparameter search strategy.
outer_cv (int, cross-validation generator, or iterable, default=5) – Outer cross-validation splitting strategy.
inner_cv (int, cross-validation generator, or iterable, default=5) – Inner cross-validation splitting strategy.
scoring (str, callable, list, tuple, or dict, default=None) – Scoring metric(s) for the inner search.
refit (bool or str, default=True) – Whether to refit on the full outer training set.
return_train_score (bool, default=False) – Whether to include training scores in inner CV results.
return_estimator (bool, default=True) – Whether to store fitted estimators per outer fold.
error_score ('raise' or numeric, default='raise') – Value assigned on inner CV fitting errors.
n_jobs_outer (int or None, default=None) – Number of parallel jobs for outer folds.
n_jobs_inner (int or None, default=None) – Number of parallel jobs for inner search.
verbose (int, default=0) – Verbosity level.
random_state (int, RandomState instance, or None, default=None) – Random state for reproducibility.
callbacks (list of callback objects or None, default=None) – FoldCallback instances for monitoring.
pre_dispatch (int or str, default='2*n_jobs') – Controls job dispatch for parallel execution.
calibration_method ({'sigmoid', 'isotonic', 'beta', 'venn_abers'} or None, default=None) – Post-hoc calibration method. If None, no calibration is applied. 'sigmoid' corresponds to Platt scaling, 'isotonic' to isotonic regression, 'beta' to beta calibration, and 'venn_abers' to Venn-ABERS prediction.
threshold_strategy ({'pooled', 'fold_specific'} or None, default=None) – Threshold optimization strategy. If None, no threshold optimization is performed. 'pooled' selects a single threshold from all OOF predictions; 'fold_specific' selects a per-fold threshold.
threshold_criterion (str or callable, default='youden') – Criterion for threshold selection. Built-in options: 'youden', 'f_beta', 'cost', 'balanced_accuracy', 'precision_at_recall'. A custom callable must accept (y_true, y_proba, threshold) and return a float to be maximized.
threshold_beta (float, default=1.0) – Beta parameter for the F-beta criterion. Only used when threshold_criterion='f_beta'.
cost_matrix (array-like of shape (2, 2) or None, default=None) – Cost matrix [[TN_cost, FP_cost], [FN_cost, TP_cost]] for cost-sensitive threshold optimization. Required when threshold_criterion='cost'.
min_recall (float or None, default=None) – Minimum recall constraint for the 'precision_at_recall' criterion. Required when threshold_criterion='precision_at_recall'.
calibration_cv (int, cross-validation generator, or None, default=None) – CV strategy for generating OOF calibration predictions. If None, uses the same inner_cv strategy. Note that when inner_cv is an integer, a new splitter instance is created for the calibration OOF loop, which may produce different fold assignments than the inner hyperparameter search.
conformal_prediction (bool, default=False) – If True, compute CV+ Mondrian conformal prediction sets using inner out-of-fold probabilities (calibrated if calibration is enabled). Each outer fold gets its own per-class q-hat threshold, applied to the held-out test fold.
conformal_alpha (float, default=0.1) – Significance level (miscoverage rate) for conformal prediction. Target coverage is 1 - alpha. Must be in (0, 1).
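To make the conformal_prediction and conformal_alpha parameters concrete, here is a rough sketch of class-conditional (Mondrian) conformal sets. It rests on assumptions this reference does not confirm: the nonconformity score is taken to be 1 - p(true class), the quantile uses the usual ceil((n + 1)(1 - alpha)) rank, and the CV+ aggregation is simplified to a single pool of OOF scores. All function names and numbers are hypothetical.

```python
import math


def class_conditional_qhat(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of nonconformity scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based order statistic
    if rank > n:
        # Too few calibration samples: the class is always included.
        return float("inf")
    return sorted(scores)[rank - 1]


def mondrian_qhats(y_true, proba, n_classes, alpha):
    """One q-hat per class, assuming score = 1 - p(true class)."""
    per_class = [[] for _ in range(n_classes)]
    for label, row in zip(y_true, proba):
        per_class[label].append(1.0 - row[label])
    return [class_conditional_qhat(s, alpha) for s in per_class]


def prediction_set(row, qhats):
    """Classes whose score 1 - p(class) does not exceed that class's q-hat."""
    return [c for c, p in enumerate(row) if 1.0 - p <= qhats[c]]


# Illustrative inner OOF labels and probabilities for one outer fold.
oof_y = [0, 0, 0, 0, 1, 1, 1, 1]
oof_proba = [
    [0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6],
    [0.3, 0.7], [0.2, 0.8], [0.5, 0.5], [0.1, 0.9],
]

qhats = mondrian_qhats(oof_y, oof_proba, n_classes=2, alpha=0.25)

# Apply the per-class thresholds to held-out test-fold probabilities.
test_proba = [[0.95, 0.05], [0.5, 0.5], [0.2, 0.8]]
sets = [prediction_set(row, qhats) for row in test_proba]
```

In this toy data the confident rows get a single-class set, while the ambiguous row [0.5, 0.5] gets both classes, which is how a conformal set signals uncertainty.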

Notes

Enabling calibration and/or threshold optimization roughly doubles computation time per outer fold, as the inner CV folds must be re-run to produce OOF probability estimates for the calibrator and threshold optimizer.

For multiclass tasks, calibration is applied independently per class using a one-vs-rest (OVR) decomposition. After calibration, the per-class probabilities are renormalized to sum to 1. Because each calibrator is fitted on a marginal binary problem, the resulting multiclass probabilities may not be jointly well-calibrated. This is a known limitation of OVR calibration approaches.
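The renormalization step for one sample can be illustrated as follows. This is only a sketch: the function name renormalize is hypothetical, and the uniform fallback for an all-zero row is an assumption, not documented nestkit behavior.

```python
def renormalize(per_class_proba):
    """Rescale independently calibrated one-vs-rest scores to sum to 1."""
    total = sum(per_class_proba)
    if total == 0.0:
        # Assumed fallback for a degenerate row: uniform distribution.
        return [1.0 / len(per_class_proba)] * len(per_class_proba)
    return [p / total for p in per_class_proba]


# Three OVR calibrators each emit a marginal probability for one sample.
# They were fitted separately, so the values need not sum to 1.
ovr_calibrated = [0.5, 0.25, 0.5]
normalized = renormalize(ovr_calibrated)  # approximately [0.4, 0.2, 0.4]
```

Dividing by the row sum preserves the ranking of classes but changes the individual values, which is one reason the joint calibration can degrade even when each binary calibrator is accurate on its own.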

Examples

Basic classification:

>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.ensemble import RandomForestClassifier
>>> from nestkit import NestedCVClassifier
>>> X, y = load_breast_cancer(return_X_y=True)
>>> ncv = NestedCVClassifier(
...     estimator=RandomForestClassifier(random_state=42),
...     param_grid={"n_estimators": [50, 100], "max_depth": [3, 5]},
...     outer_cv=5, inner_cv=3, random_state=42,
... )
>>> ncv.fit(X, y)
>>> print(ncv.results_.summary_default_)

With calibration and threshold optimization:

>>> ncv = NestedCVClassifier(
...     estimator=RandomForestClassifier(random_state=42),
...     param_grid={"n_estimators": [50, 100]},
...     outer_cv=5, inner_cv=3,
...     calibration_method="isotonic",
...     threshold_strategy="pooled",
...     threshold_criterion="youden",
...     random_state=42,
... )
>>> ncv.fit(X, y)

Regressor

class nestkit.NestedCVRegressor(estimator, param_grid, *, search_strategy='grid', outer_cv=5, inner_cv=5, scoring=None, refit=True, return_train_score=False, return_estimator=True, error_score='raise', n_jobs_outer=None, n_jobs_inner=None, verbose=0, random_state=None, callbacks=None, pre_dispatch='2*n_jobs', prediction_intervals=False, confidence_level=0.95, mondrian_bins=None, mondrian_min_bin_size=20)

Bases: _BaseNestedCV

Nested cross-validation for regression tasks.

Extends _BaseNestedCV with support for residual-based prediction intervals. When prediction_intervals=True, inner out-of-fold residuals are collected and their quantiles (with finite-sample correction) are used to construct prediction intervals on the outer test set.

Note

The residuals are collected from OOF models fitted with the best hyperparameters, but the final model is refitted on the full outer training set. The residual distribution may therefore not perfectly match the final model’s errors. These intervals are approximate and do not carry formal conformal coverage guarantees.
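A residual-quantile interval of the kind described above might look like the sketch below. It assumes symmetric intervals built from absolute OOF residuals and a ceil((n + 1) * confidence_level) rank as the finite-sample correction. The reference does not say whether nestkit uses absolute or signed residuals, so treat both choices, the function name, and the numbers as illustrative.

```python
import math


def residual_quantile(abs_residuals, confidence_level):
    """Corrected empirical quantile of absolute OOF residuals."""
    n = len(abs_residuals)
    # The (n + 1) factor widens the quantile slightly for small samples.
    rank = math.ceil((n + 1) * confidence_level)
    rank = min(rank, n)  # clamp to the largest residual when n is tiny
    return sorted(abs_residuals)[rank - 1]


# Illustrative absolute residuals |y - y_hat| from the inner OOF models.
oof_abs_residuals = [1.0, 3.0, 0.5, 2.0, 4.0, 1.5, 2.5]

half_width = residual_quantile(oof_abs_residuals, confidence_level=0.75)

# Interval for one outer test-set prediction from the refitted model.
prediction = 10.0
interval = (prediction - half_width, prediction + half_width)
```

With seven residuals and a 0.75 level, the rank is ceil(8 * 0.75) = 6, so the half-width is the sixth-smallest residual (3.0) rather than the uncorrected 75th percentile.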

Parameters:

estimator (estimator object) – A scikit-learn compatible regressor that implements fit and predict. Cloned for each outer fold.
param_grid (dict or list of dict) – Hyperparameter search space.
search_strategy ({'grid', 'random', 'bayesian'}, default='grid') – Inner hyperparameter search strategy.
outer_cv (int, cross-validation generator, or iterable, default=5) – Outer cross-validation splitting strategy.
inner_cv (int, cross-validation generator, or iterable, default=5) – Inner cross-validation splitting strategy.
scoring (str, callable, list, tuple, or dict, default=None) – Scoring metric(s) for the inner search.
refit (bool or str, default=True) – Whether to refit on the full outer training set.
return_train_score (bool, default=False) – Whether to include training scores in inner CV results.
return_estimator (bool, default=True) – Whether to store fitted estimators per outer fold.
error_score ('raise' or numeric, default='raise') – Value assigned on inner CV fitting errors.
n_jobs_outer (int or None, default=None) – Number of parallel jobs for outer folds.
n_jobs_inner (int or None, default=None) – Number of parallel jobs for inner search.
verbose (int, default=0) – Verbosity level.
random_state (int, RandomState instance, or None, default=None) – Random state for reproducibility.
callbacks (list of callback objects or None, default=None) – FoldCallback instances for monitoring.
pre_dispatch (int or str, default='2*n_jobs') – Controls job dispatch for parallel execution.
prediction_intervals (bool, default=False) – Whether to compute residual-based prediction intervals using inner out-of-fold residuals. When enabled, the results contain prediction_interval_lower and prediction_interval_upper per outer fold.
confidence_level (float, default=0.95) – Confidence level for prediction intervals (e.g., 0.95 for 95% intervals). Only used when prediction_intervals=True.
mondrian_bins (int or None, default=None) – Number of Mondrian bins for conditional prediction intervals. When set (and prediction_intervals=True), OOF predictions are grouped into equal-frequency bins and per-bin residual quantiles are used instead of global quantiles. This yields tighter intervals for easy-to-predict regions.
mondrian_min_bin_size (int, default=20) – Minimum number of calibration samples per Mondrian bin. Bins with fewer samples are merged with their nearest neighbor.
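The effect of mondrian_bins can be sketched as below: OOF predictions are split into equal-frequency bins, and each bin gets its own residual quantile. This is a simplified illustration under assumptions: absolute residuals, distinct prediction values, and no merging of undersized bins (the mondrian_min_bin_size behavior is not shown). All helper names and numbers are hypothetical.

```python
import math


def equal_frequency_edges(oof_pred, n_bins):
    """Upper edges that split the sorted predictions into equal-count bins."""
    ordered = sorted(oof_pred)
    n = len(ordered)
    # The upper edge of bin b is the largest prediction assigned to it.
    return [ordered[((b + 1) * n) // n_bins - 1] for b in range(n_bins)]


def assign_bin(pred, edges):
    """Index of the first bin whose upper edge is >= pred."""
    for b, edge in enumerate(edges):
        if pred <= edge:
            return b
    return len(edges) - 1  # predictions above the last edge use the top bin


def per_bin_quantiles(oof_pred, abs_residuals, edges, confidence_level):
    """One corrected residual quantile per bin (assumes no bin is empty)."""
    grouped = [[] for _ in edges]
    for pred, res in zip(oof_pred, abs_residuals):
        grouped[assign_bin(pred, edges)].append(res)
    quantiles = []
    for res in grouped:
        n = len(res)
        rank = min(math.ceil((n + 1) * confidence_level), n)
        quantiles.append(sorted(res)[rank - 1])
    return quantiles


# Illustrative data: small errors for low predictions, larger for high ones.
oof_pred = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
oof_abs_res = [0.1, 0.2, 0.1, 0.3, 1.0, 2.0, 1.5, 0.5]

edges = equal_frequency_edges(oof_pred, n_bins=2)
bin_q = per_bin_quantiles(oof_pred, oof_abs_res, edges, confidence_level=0.75)

# A test prediction of 2.5 falls in the low bin and gets the narrower width.
width_for_low = bin_q[assign_bin(2.5, edges)]
```

In this toy data the low bin's half-width is 0.3, while a single global quantile over all eight residuals at the same level would be 1.5, which shows why per-bin quantiles can tighten intervals in easy-to-predict regions.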

Examples

>>> from sklearn.datasets import load_diabetes
>>> from sklearn.linear_model import Ridge
>>> from nestkit import NestedCVRegressor
>>> X, y = load_diabetes(return_X_y=True)
>>> ncv = NestedCVRegressor(
...     estimator=Ridge(),
...     param_grid={"alpha": [0.01, 0.1, 1.0, 10.0]},
...     outer_cv=5, inner_cv=3,
...     prediction_intervals=True,
...     random_state=42,
... )
>>> ncv.fit(X, y)
>>> print(ncv.results_.summary_default_)