Validator

Bases: ModelExtractor

Validator class for evaluating trained models on a separate validation dataset.

This class loads a validation dataset, applies necessary transformations, handles encoding, and evaluates a pre-trained model.

Inherits
  • ModelExtractor: Base class for extracting trained models and evaluation.

Parameters:
  • learners_dict (Dict, required): Dictionary containing trained models.
  • criterion (str, required): Performance criterion for evaluation (e.g., "f1", "brier_score").
  • aggregate (bool, required): Whether to aggregate results across multiple models.
  • path_train (Path, required): Path to the training dataset used for encoding reference.
  • path_val (Path, required): Path to the validation dataset.
  • verbose (bool, default False): Whether to print detailed logs.
  • random_state (Optional[int], default 0): Random seed for reproducibility.
  • test_size (float, default 0.2): Proportion of the dataset to use as a test set when performing target encoding.

Attributes:
  • dataloader (ProcessedDataLoader): Data loader for preprocessing input data.
  • resampler (Resampler): Resampler for handling encoding and dataset splitting.
  • path_train (Path): Path to the training dataset.
  • path_val (Path): Path to the validation dataset.
  • test_size (float): Proportion of training data used for validation when encoding.
  • data (DataFrame): Raw validation dataset.
  • data_processed (DataFrame): Processed validation dataset.
  • X (DataFrame): Features from the validation dataset.
  • y (Series): Target labels from the validation dataset.

Methods:
  • _prepare_validation_data: Loads, processes, and encodes validation data to match training features.
  • perform_validation: Runs model evaluation on the validation dataset, returning performance metrics.

Inherited Methods
  • load_learners: Loads trained models from a specified directory.
  • apply_target_encoding: Applies target encoding to categorical variables.
  • apply_sampling: Applies specified sampling strategy to balance the dataset.
  • validate_dataframe: Validates that input data meets requirements.
  • get_probs: Computes model prediction probabilities.
  • final_metrics: Computes final evaluation metrics based on task.
Example
from pathlib import Path

from periomod.wrapper import Validator

validator = Validator(
    learners_dict=learners,
    criterion="f1",
    aggregate=True,
    path_train=Path("../data/processed/processed_data.csv"),
    path_val=Path("../data/processed/processed_data_val.csv"),
    random_state=42,
    test_size=0.2
)

results = validator.perform_validation(verbose=True)
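Since perform_validation returns a one-row pandas DataFrame, the result can be inspected or persisted with the usual pandas API. A minimal sketch — the column names and values below are made up for illustration, not produced by the library:

```python
import pandas as pd

# Hypothetical one-row results frame mimicking the shape of
# perform_validation's output (column names assumed for illustration)
results = pd.DataFrame(
    [{"Learner": "rf", "Tuning": "final", "Criterion": "f1", "F1 Score": 0.8123}]
)
print(results.to_string(index=False))
results.to_csv("validation_metrics.csv", index=False)  # persist for later comparison
```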
Source code in periomod/wrapper/_val.py
class Validator(ModelExtractor):
    """Validator class for evaluating trained models on a separate validation dataset.

    This class loads a validation dataset, applies necessary transformations,
    handles encoding, and evaluates a pre-trained model.

    Inherits:
        - `ModelExtractor`: Base class for extracting trained models and evaluation.

    Args:
        learners_dict (Dict): Dictionary containing trained models.
        criterion (str): Performance criterion for evaluation
            (e.g., "f1", "brier_score").
        aggregate (bool): Whether to aggregate results across multiple models.
        path_train (Path): Path to the training dataset used for encoding reference.
        path_val (Path): Path to the validation dataset.
        verbose (bool, optional): Whether to print detailed logs. Defaults to False.
        random_state (Optional[int], optional): Random seed for reproducibility.
            Defaults to 0.
        test_size (float, optional): Proportion of the dataset to use as a test set when
            performing target encoding. Defaults to 0.2.

    Attributes:
        dataloader (ProcessedDataLoader): Data loader for preprocessing input data.
        resampler (Resampler): Resampler for handling encoding and dataset splitting.
        path_train (Path): Path to the training dataset.
        path_val (Path): Path to the validation dataset.
        test_size (float): Proportion of training data used for validation when
            encoding.
        data (pd.DataFrame): Raw validation dataset.
        data_processed (pd.DataFrame): Processed validation dataset.
        X (pd.DataFrame): Features from the validation dataset.
        y (pd.Series): Target labels from the validation dataset.

    Methods:
        _prepare_validation_data: Loads, processes, and encodes validation data to match
            training features.
        perform_validation: Runs model evaluation on the validation dataset, returning
            performance metrics.

    Inherited Methods:
        - `load_learners`: Loads trained models from a specified directory.
        - `apply_target_encoding`: Applies target encoding to categorical variables.
        - `apply_sampling`: Applies specified sampling strategy to balance the dataset.
        - `validate_dataframe`: Validates that input data meets requirements.
        - `get_probs`: Computes model prediction probabilities.
        - `final_metrics`: Computes final evaluation metrics based on task.

    Example:
        ```
        from periomod.wrapper import Validator

        validator = Validator(
            learners_dict=learners,
            criterion="f1",
            aggregate=True,
            path_train=Path("../data/processed/processed_data.csv"),
            path_val=Path("../data/processed/processed_data_val.csv"),
            random_state=42,
            test_size=0.2
        )

        results = validator.perform_validation(verbose=True)
        ```
    """

    def __init__(
        self,
        learners_dict: Dict,
        criterion: str,
        aggregate: bool,
        path_train: Path,
        path_val: Path,
        verbose: bool = False,
        random_state: int = 0,
        test_size: float = 0.2,
    ):
        """Initializes the Validator class.

        Args:
            learners_dict (Dict): Dictionary containing trained models.
            criterion (str): Performance criterion for evaluation
                (e.g., "f1", "brier_score").
            aggregate (bool): Whether to aggregate results across multiple models.
            path_train (Path): Path to the training dataset used for encoding reference.
            path_val (Path): Path to the validation dataset.
            verbose (bool, optional): Whether to print detailed logs. Defaults to False.
            random_state (Optional[int], optional): Random seed for reproducibility.
                Defaults to 0.
            test_size (float, optional): Proportion of the dataset to use as a test set
                when performing target encoding. Defaults to 0.2.
        """
        super().__init__(
            learners_dict=learners_dict,
            criterion=criterion,
            aggregate=aggregate,
            verbose=verbose,
            random_state=random_state,
        )
        self.dataloader = ProcessedDataLoader(task=self.task, encoding=self.encoding)
        self.resampler = Resampler(
            classification=self.classification, encoding=self.encoding
        )
        self.path_train = path_train
        self.path_val = path_val
        self.test_size = test_size
        self.data, self.data_processed, self.X, self.y = self._prepare_validation_data()

    def _prepare_validation_data(
        self,
    ) -> Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.Series]:
        """Loads and prepares validation data for model evaluation.

        This function loads the dataset, applies necessary transformations,
        and encodes categorical variables if required.

        Returns:
                Tuple[pd.DataFrame, pd.DataFrame, pd.DataFrame, pd.Series]:
                    - data (pd.DataFrame): Raw validation dataset.
                    - data_processed (pd.DataFrame): Preprocessed dataset.
                    - X (pd.DataFrame): Feature matrix.
                    - y (pd.Series): Target labels.

        Raises:
            ValueError: If model type is not supported for extraction.
        """
        # 1) Load + transform (train) with fit_encoder=True
        data_train = self.dataloader.load_data(path=self.path_train)
        data_train = self.dataloader.transform_data(data=data_train, fit_encoder=True)

        # 2) Load + transform (val) with fit_encoder=False
        data_val = self.dataloader.load_data(path=self.path_val)
        data_processed = self.dataloader.transform_data(
            data=data_val, fit_encoder=False
        )

        X_val = data_processed.drop(columns=[self.y])
        y = data_processed[self.y]

        # If using target encoding, apply target encoding logic here:
        if self.encoding == "target":
            train_df, test_df = self.resampler.split_train_test_df(
                df=data_train, seed=self.random_state, test_size=self.test_size
            )
            X_train, y_train, _, _ = self.resampler.split_x_y(
                train_df=train_df, test_df=test_df
            )

            X_train, X_val = self.resampler.apply_target_encoding(
                X=X_train, X_val=X_val, y=y_train
            )

            missing_cols = set(X_train.columns) - set(X_val.columns)
            for col in missing_cols:
                X_val[col] = 0  # Fill with zero if missing
            X_val = X_val[X_train.columns]

        if self.encoding == "one_hot":
            if hasattr(self.model, "get_booster"):
                expected_columns = set(self.model.get_booster().feature_names)
            elif hasattr(self.model, "feature_names_in_"):
                expected_columns = set(self.model.feature_names_in_)
            else:
                raise ValueError("Model type not supported for feature extraction")

            current_columns = set(X_val.columns)
            missing_columns = expected_columns - current_columns
            for col in missing_columns:
                X_val[col] = 0

            X_val = X_val[list(self.model.feature_names_in_)]

        if self.group_col in X_val:
            X_val = X_val.drop(columns=[self.group_col])

        return data_val, data_processed, X_val, y

    def perform_validation(self, verbose: bool = False) -> pd.DataFrame:
        """Runs model evaluation on the validation dataset.

        This function computes predictions and evaluates the model's performance
        using predefined metrics such as F1-score or other classification metrics.

        Args:
            verbose (bool, optional): Whether to print detailed evaluation metrics.
                Defaults to False.

        Returns:
            pd.DataFrame: A dataframe containing computed evaluation metrics.
        """
        best_threshold = getattr(self.model, "best_threshold", None)

        final_probs = get_probs(
            model=self.model, classification=self.classification, X=self.X
        )

        if (
            self.criterion == "f1"
            and final_probs is not None
            and np.any(final_probs)
            and best_threshold is not None
        ):
            final_predictions = (final_probs >= best_threshold).astype(int)
        else:
            final_predictions = self.model.predict(self.X)

        metrics = final_metrics(
            classification=self.classification,
            y=self.y,
            preds=final_predictions,
            probs=final_probs,
            threshold=best_threshold,
        )
        unpacked_metrics = {
            k: round(v, 4) if isinstance(v, float) else v for k, v in metrics.items()
        }
        results = {
            "Learner": self.learner,
            "Tuning": "final",
            "Criterion": self.criterion,
            **unpacked_metrics,
        }

        df_results = pd.DataFrame([results])
        if verbose:
            pd.set_option("display.max_columns", None, "display.width", 1000)
            print("\nFinal Model Metrics Summary:\n", df_results)
        return df_results

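The missing-column handling in `_prepare_validation_data` (fill columns absent from the validation set with zero, then reorder to the training feature set) is equivalent to a single pandas reindex. A toy sketch with made-up column names, not the library's API:

```python
import pandas as pd

# Toy frames illustrating the column-alignment step: one-hot columns seen
# during training but absent from the validation set are filled with 0,
# and the validation columns are reordered to match training.
X_train = pd.DataFrame({"age": [34, 51], "site_a": [1, 0], "site_b": [0, 1]})
X_val = pd.DataFrame({"age": [47], "site_a": [1]})  # "site_b" missing here

# Equivalent to the loop-plus-reorder in the source, expressed as one reindex
X_val_aligned = X_val.reindex(columns=X_train.columns, fill_value=0)
print(list(X_val_aligned.columns))  # matches the training column order
```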
__init__(learners_dict, criterion, aggregate, path_train, path_val, verbose=False, random_state=0, test_size=0.2)

Initializes the Validator class.

Parameters:
  • learners_dict (Dict, required): Dictionary containing trained models.
  • criterion (str, required): Performance criterion for evaluation (e.g., "f1", "brier_score").
  • aggregate (bool, required): Whether to aggregate results across multiple models.
  • path_train (Path, required): Path to the training dataset used for encoding reference.
  • path_val (Path, required): Path to the validation dataset.
  • verbose (bool, default False): Whether to print detailed logs.
  • random_state (Optional[int], default 0): Random seed for reproducibility.
  • test_size (float, default 0.2): Proportion of the dataset to use as a test set when performing target encoding.
Source code in periomod/wrapper/_val.py
def __init__(
    self,
    learners_dict: Dict,
    criterion: str,
    aggregate: bool,
    path_train: Path,
    path_val: Path,
    verbose: bool = False,
    random_state: int = 0,
    test_size: float = 0.2,
):
    """Initializes the Validator class.

    Args:
        learners_dict (Dict): Dictionary containing trained models.
        criterion (str): Performance criterion for evaluation
            (e.g., "f1", "brier_score").
        aggregate (bool): Whether to aggregate results across multiple models.
        path_train (Path): Path to the training dataset used for encoding reference.
        path_val (Path): Path to the validation dataset.
        verbose (bool, optional): Whether to print detailed logs. Defaults to False.
        random_state (Optional[int], optional): Random seed for reproducibility.
            Defaults to 0.
        test_size (float, optional): Proportion of the dataset to use as a test set
            when performing target encoding. Defaults to 0.2.
    """
    super().__init__(
        learners_dict=learners_dict,
        criterion=criterion,
        aggregate=aggregate,
        verbose=verbose,
        random_state=random_state,
    )
    self.dataloader = ProcessedDataLoader(task=self.task, encoding=self.encoding)
    self.resampler = Resampler(
        classification=self.classification, encoding=self.encoding
    )
    self.path_train = path_train
    self.path_val = path_val
    self.test_size = test_size
    self.data, self.data_processed, self.X, self.y = self._prepare_validation_data()

perform_validation(verbose=False)

Runs model evaluation on the validation dataset.

This function computes predictions and evaluates the model's performance using predefined metrics such as F1-score or other classification metrics.

Parameters:
  • verbose (bool, default False): Whether to print detailed evaluation metrics.

Returns:
  • pd.DataFrame: A dataframe containing computed evaluation metrics.
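When the criterion is "f1" and a tuned decision threshold is available, class labels come from comparing predicted probabilities against that threshold rather than from `model.predict`. The core operation, sketched with made-up probabilities:

```python
import numpy as np

# Sketch of the thresholding branch in perform_validation: predicted
# probabilities are binarized against the tuned decision threshold.
final_probs = np.array([0.15, 0.62, 0.48, 0.91])  # made-up probabilities
best_threshold = 0.5

final_predictions = (final_probs >= best_threshold).astype(int)
print(final_predictions)
```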

Source code in periomod/wrapper/_val.py
def perform_validation(self, verbose: bool = False) -> pd.DataFrame:
    """Runs model evaluation on the validation dataset.

    This function computes predictions and evaluates the model's performance
    using predefined metrics such as F1-score or other classification metrics.

    Args:
        verbose (bool, optional): Whether to print detailed evaluation metrics.
            Defaults to False.

    Returns:
        pd.DataFrame: A dataframe containing computed evaluation metrics.
    """
    best_threshold = getattr(self.model, "best_threshold", None)

    final_probs = get_probs(
        model=self.model, classification=self.classification, X=self.X
    )

    if (
        self.criterion == "f1"
        and final_probs is not None
        and np.any(final_probs)
        and best_threshold is not None
    ):
        final_predictions = (final_probs >= best_threshold).astype(int)
    else:
        final_predictions = self.model.predict(self.X)

    metrics = final_metrics(
        classification=self.classification,
        y=self.y,
        preds=final_predictions,
        probs=final_probs,
        threshold=best_threshold,
    )
    unpacked_metrics = {
        k: round(v, 4) if isinstance(v, float) else v for k, v in metrics.items()
    }
    results = {
        "Learner": self.learner,
        "Tuning": "final",
        "Criterion": self.criterion,
        **unpacked_metrics,
    }

    df_results = pd.DataFrame([results])
    if verbose:
        pd.set_option("display.max_columns", None, "display.width", 1000)
        print("\nFinal Model Metrics Summary:\n", df_results)
    return df_results