Inner CV

Utilities for inspecting the inner cross-validation loop.

InnerCVReport

class nestkit.inner.InnerCVReport(cv_results, outer_fold_idx)

Bases: object

Diagnostics for the inner CV hyperparameter search of a single outer fold.

Wraps the cv_results_ dictionary produced by scikit-learn search objects (GridSearchCV, RandomizedSearchCV, etc.) and exposes convenience methods for ranking configurations, estimating parameter importance, and examining score distributions.

Parameters:
  • cv_results (dict) – The cv_results_ dictionary from the inner search object.

  • outer_fold_idx (int) – Zero-based index of the outer fold this report belongs to.

cv_results_

Raw cv_results_ dictionary.

Type:

dict

outer_fold_idx

Outer fold index.

Type:

int

Examples

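>>> from nestkit.inner import InnerCVReport
>>> # `search` is assumed to be a fitted scikit-learn search object using multi-metric scoring that includes "roc_auc"
>>> report = InnerCVReport(search.cv_results_, outer_fold_idx=0)
>>> report.top_k(3, metric="roc_auc")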

See also

nestkit.inner.search.build_search

Construct the inner search object.

to_dataframe()

Convert the full cv_results_ dictionary to a DataFrame.

Returns:

One row per hyperparameter configuration with all columns from scikit-learn’s cv_results_ (parameters, mean/std scores, ranks, fit times, etc.).

Return type:

pandas.DataFrame
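
To make the "one row per configuration" layout concrete, here is a minimal, dependency-free sketch. It is not nestkit's implementation: the helper rows_from_cv_results and the toy dictionary are hypothetical and exist only for illustration. scikit-learn stores cv_results_ column-wise, as a dict of equal-length arrays, so converting it amounts to transposing columns into rows.

```python
# Toy stand-in for a scikit-learn cv_results_ dict (values are made up for illustration).
cv_results = {
    "param_C": [0.1, 1.0, 10.0],
    "mean_test_score": [0.71, 0.84, 0.80],
    "std_test_score": [0.03, 0.02, 0.04],
    "rank_test_score": [3, 1, 2],
}


def rows_from_cv_results(cv_results):
    """Hypothetical helper: turn column-oriented results into one dict per configuration."""
    columns = list(cv_results)
    # zip(*values) walks all columns in lockstep, yielding one tuple per configuration.
    return [dict(zip(columns, values)) for values in zip(*cv_results.values())]


rows = rows_from_cv_results(cv_results)
print(len(rows))  # number of hyperparameter configurations
print(rows[1])    # all columns for the second configuration
```

With pandas installed, pandas.DataFrame(cv_results) produces the same row-per-configuration layout in a single call.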

ranking(metric=None)

Return all configurations ranked by mean inner-CV score.

Sorts by the rank_test_<metric> column if it exists; otherwise falls back to descending mean_test_<metric>.

Parameters:

metric (str or None, optional) – Metric name (e.g., "roc_auc"). If None, the "score" suffix is used (i.e., rank_test_score and mean_test_score), which is what scikit-learn produces for single-metric searches.

Returns:

Sorted configurations with all cv_results_ columns.

Return type:

pandas.DataFrame
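
The sort order described for ranking() can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: rank_configurations is a hypothetical function, rows are represented as a list of dicts instead of a DataFrame, and the scores are invented.

```python
def rank_configurations(rows, metric=None):
    """Hypothetical sketch of the documented sort order; rows is a list of dicts."""
    suffix = "score" if metric is None else metric
    rank_col = f"rank_test_{suffix}"
    mean_col = f"mean_test_{suffix}"
    if rows and all(rank_col in row for row in rows):
        # Rank 1 is the best configuration, so sort ascending.
        return sorted(rows, key=lambda row: row[rank_col])
    # Fallback: a higher mean score is better, so sort descending.
    return sorted(rows, key=lambda row: row[mean_col], reverse=True)


# Invented multi-metric results for three configurations.
rows = [
    {"param_C": 0.1, "mean_test_roc_auc": 0.71, "rank_test_roc_auc": 3},
    {"param_C": 1.0, "mean_test_roc_auc": 0.84, "rank_test_roc_auc": 1},
    {"param_C": 10.0, "mean_test_roc_auc": 0.80, "rank_test_roc_auc": 2},
]

ranked = rank_configurations(rows, metric="roc_auc")

# Drop the rank column to exercise the mean-score fallback.
no_rank = [{k: v for k, v in row.items() if not k.startswith("rank_")} for row in rows]
fallback = rank_configurations(no_rank, metric="roc_auc")

top_2 = ranked[:2]  # the first k entries of the sorted result
```

Both paths give the same order here because scikit-learn derives the rank column from the mean test score. Slicing the first k entries of the sorted result corresponds to what top_k() is documented to return.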

top_k(k=5, metric=None)

Return the top k hyperparameter configurations.

Parameters:
  • k (int, default=5) – Number of top configurations to return.

  • metric (str or None, optional) – Metric name for ranking. See ranking().

Returns:

The k best-ranked rows from ranking().

Return type:

pandas.DataFrame

param_importance(metric=None)

Estimate the marginal importance of each hyperparameter.

Groups configurations by each param_* column and computes the variance of the group means (a simplified, fANOVA-inspired measure). Higher variance indicates that the parameter has a larger effect on the inner-CV score.

Parameters:

metric (str or None, optional) – Metric name. See ranking().

Returns:

Columns: parameter, variance_explained, n_unique, relative_importance. Sorted by variance_explained in descending order. Returns an empty DataFrame if the score column is not found.

Return type:

pandas.DataFrame

Notes

This is a first-order marginal analysis and does not account for interactions between hyperparameters. For a full fANOVA decomposition, consider dedicated tools such as fanova.
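
The variance-of-group-means idea can be illustrated with the standard library alone. The sketch below is not nestkit's implementation, and several details are assumptions, because the description above does not specify them: it uses population variance, defines relative_importance as each variance divided by the total, and reports parameter names without the param_ prefix. The function marginal_importance and the 2x2 grid of scores are hypothetical.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def marginal_importance(rows, score_col="mean_test_score"):
    """Hypothetical sketch: variance of per-value group means for each param_* column."""
    param_cols = sorted({key for row in rows for key in row if key.startswith("param_")})
    results = []
    for col in param_cols:
        # Group the scores by the value this parameter takes.
        groups = defaultdict(list)
        for row in rows:
            groups[row[col]].append(row[score_col])
        group_means = [fmean(scores) for scores in groups.values()]
        # Assumption: population variance; the library may use the sample variance.
        variance = pvariance(group_means) if len(group_means) > 1 else 0.0
        results.append({
            "parameter": col[len("param_"):],
            "variance_explained": variance,
            "n_unique": len(groups),
        })
    total = sum(r["variance_explained"] for r in results)
    for r in results:
        # Assumption: relative importance is the share of the summed variances.
        r["relative_importance"] = r["variance_explained"] / total if total > 0 else 0.0
    return sorted(results, key=lambda r: r["variance_explained"], reverse=True)


# Invented 2x2 grid: C moves the score far more than gamma does.
rows = [
    {"param_C": 0.1, "param_gamma": 0.01, "mean_test_score": 0.70},
    {"param_C": 0.1, "param_gamma": 0.1, "mean_test_score": 0.72},
    {"param_C": 10.0, "param_gamma": 0.01, "mean_test_score": 0.88},
    {"param_C": 10.0, "param_gamma": 0.1, "mean_test_score": 0.90},
]

importance = marginal_importance(rows)
for entry in importance:
    print(entry)
```

In this toy grid the group means for C are 0.71 and 0.89, while those for gamma are 0.79 and 0.81, so C dominates the ranking. Because each parameter is averaged over the others, an effect that only appears for particular combinations of C and gamma would not show up in these numbers.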

score_distribution(param, metric=None)

Show mean inner-CV score as a function of a single hyperparameter.

Useful for generating 1-D parameter-sweep plots.

Parameters:
  • param (str) – Hyperparameter name without the param_ prefix.

  • metric (str or None, optional) – Metric name. See ranking().

Returns:

Columns include param_<param>, the mean score column, and (if available) the standard deviation column. Sorted by the parameter value. Returns an empty DataFrame if the requested columns are not found.

Return type:

pandas.DataFrame
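
A dependency-free sketch of the column selection and sorting described for score_distribution() follows. It is an illustration only: score_by_param is a hypothetical function, rows are a list of dicts instead of a DataFrame, the scores are invented, and the sketch assumes no aggregation across other hyperparameters, which the description above does not specify.

```python
def score_by_param(rows, param, metric=None):
    """Hypothetical sketch: select the parameter and score columns, sorted by parameter value."""
    suffix = "score" if metric is None else metric
    param_col = f"param_{param}"
    mean_col = f"mean_test_{suffix}"
    std_col = f"std_test_{suffix}"
    if not rows or any(param_col not in row or mean_col not in row for row in rows):
        # Mirrors the documented behaviour of returning an empty result.
        return []
    keep = [param_col, mean_col]
    if all(std_col in row for row in rows):
        # The standard deviation column is included only when available.
        keep.append(std_col)
    selected = [{col: row[col] for col in keep} for row in rows]
    return sorted(selected, key=lambda row: row[param_col])


# Invented single-metric results, deliberately out of order.
rows = [
    {"param_C": 10.0, "mean_test_score": 0.80, "std_test_score": 0.04, "rank_test_score": 2},
    {"param_C": 0.1, "mean_test_score": 0.71, "std_test_score": 0.03, "rank_test_score": 3},
    {"param_C": 1.0, "mean_test_score": 0.84, "std_test_score": 0.02, "rank_test_score": 1},
]

sweep = score_by_param(rows, "C")
missing = score_by_param(rows, "gamma")  # no param_gamma column in the toy data

for point in sweep:
    print(point["param_C"], point["mean_test_score"], point["std_test_score"])
```

The sorted points can be passed directly to a plotting library, with the standard deviation column used for error bars on a 1-D sweep.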