
BaseTrainer

Bases: BaseValidator, ABC

Abstract base class for training machine learning models.

This class provides foundational methods for training and evaluating machine learning models, including MLP models with early stopping, and optimizing decision thresholds. It supports binary and multiclass classification and allows for various evaluation metrics, threshold tuning, and cross-validation procedures.

Inherits
  • BaseValidator: Validates instance-level variables.
  • ABC: Specifies abstract methods for subclasses to implement.

Parameters:

  • classification (str): Specifies the type of classification ('binary' or 'multiclass'). Required.
  • criterion (str): Defines the performance criterion to optimize (e.g., 'f1' or 'brier_score'). Required.
  • tuning (Optional[str]): Specifies the tuning method ('holdout' or 'cv') or None. Required.
  • hpo (Optional[str]): Specifies the hyperparameter optimization method. Required.
  • mlp_training (Optional[bool]): Flag to indicate if a separate MLP training procedure with early stopping is to be used. Required.
  • threshold_tuning (Optional[bool]): Determines if threshold tuning is performed for binary classification when the criterion is 'f1'. Required.

Attributes:

  • classification (str): Type of classification ('binary' or 'multiclass').
  • criterion (str): Performance criterion to optimize ('f1', 'brier_score', or 'macro_f1').
  • tuning (Optional[str]): Tuning method ('holdout' or 'cv') or None.
  • hpo (Optional[str]): Hyperparameter optimization method if specified.
  • mlp_training (Optional[bool]): Indicates if MLP training with early stopping is applied.
  • threshold_tuning (Optional[bool]): Specifies if threshold tuning is performed for binary classification when the criterion is 'f1'.

Methods:

  • evaluate: Determines model performance based on the specified classification criterion.
  • optimize_threshold: Utilizes cross-validation to optimize the decision threshold by aggregating probability predictions.
  • evaluate_cv: Evaluates a model on a training-validation fold based on the specified criterion, supporting cross-validation.

Abstract Methods
  • train: Trains the model with standard or custom logic depending on the specified learner type.
  • train_mlp: Trains an MLP model with early stopping and additional evaluation logic if required.
  • train_final_model: Trains the final model on the entire dataset, applying resampling, parallel processing, and specified sampling methods.
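
To make this contract concrete, here is a minimal sketch of a subclass; `DummyTrainer` and its naive fit-and-score logic are illustrative assumptions, not part of the library.

from periomod.training._basetrainer import BaseTrainer


class DummyTrainer(BaseTrainer):
    """Hypothetical subclass illustrating the abstract contract."""

    def train(self, model, X_train, y_train, X_val, y_val):
        # Fit on the training fold and score the validation fold with
        # the configured criterion via `evaluate` (binary case assumed).
        model.fit(X_train, y_train)
        probs = model.predict_proba(X_val)[:, 1]
        score, best_threshold = self.evaluate(y=y_val, probs=probs)
        # `evaluate_cv` unpacks three values from `train`.
        return score, model, best_threshold

    def train_mlp(self, mlp_model, X_train, y_train, X_val, y_val, final=False):
        # A real implementation would add early stopping; this placeholder
        # simply delegates to the generic path.
        return self.train(mlp_model, X_train, y_train, X_val, y_val)

    def train_final_model(self, df, resampler, model, sampling, factor,
                          n_jobs, seed, test_size, verbose):
        raise NotImplementedError("Out of scope for this sketch.")

A trainer defined this way plugs directly into `evaluate_cv` and `optimize_threshold`, since both rely only on `train` and `predict_proba`.
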
Source code in periomod/training/_basetrainer.py
class BaseTrainer(BaseValidator, ABC):
    """Abstract base class for training machine learning models.

    This class provides foundational methods for training and evaluating
    machine learning models, including MLP models with early stopping,
    and optimizing decision thresholds. It supports binary and multiclass
    classification and allows for various evaluation metrics, threshold
    tuning, and cross-validation procedures.

    Inherits:
        - `BaseValidator`: Validates instance-level variables.
        - `ABC`: Specifies abstract methods for subclasses to implement.

    Args:
        classification (str): Specifies the type of classification ('binary'
            or 'multiclass').
        criterion (str): Defines the performance criterion to optimize (e.g.,
            'f1' or 'brier_score').
        tuning (Optional[str]): Specifies the tuning method ('holdout' or
            'cv') or None.
        hpo (Optional[str]): Specifies the hyperparameter optimization method.
        mlp_training (Optional[bool]): Flag to indicate if a separate MLP training
            procedure with early stopping is to be used.
        threshold_tuning (Optional[bool]): Determines if threshold tuning is performed
            for binary classification when the criterion is "f1".

    Attributes:
        classification (str): Type of classification ('binary' or 'multiclass').
        criterion (str): Performance criterion to optimize
            ('f1', 'brier_score' or 'macro_f1').
        tuning (Optional[str]): Tuning method ('holdout' or 'cv') or None.
        hpo (Optional[str]): Hyperparameter optimization method if specified.
        mlp_training (Optional[bool]): Indicates if MLP training with early stopping is
            applied.
        threshold_tuning (Optional[bool]): Specifies if threshold tuning is performed
            for binary classification when the criterion is 'f1'.

    Methods:
        evaluate: Determines model performance based on the specified
            classification criterion.
        optimize_threshold: Utilizes cross-validation to optimize the
            decision threshold by aggregating probability predictions.
        evaluate_cv: Evaluates a model on a training-validation fold
            based on the specified criterion, supporting cross-validation.

    Abstract Methods:
        - `train`: Trains the model with standard or custom logic depending
          on the specified learner type.
        - `train_mlp`: Trains an MLP model with early stopping and additional
          evaluation logic if required.
        - `train_final_model`: Trains the final model on the entire dataset,
          applying resampling, parallel processing, and specified sampling
          methods.
    """

    def __init__(
        self,
        classification: str,
        criterion: str,
        tuning: Optional[str],
        hpo: Optional[str],
        mlp_training: Optional[bool],
        threshold_tuning: Optional[bool],
    ) -> None:
        """Initializes the Trainer with classification type and criterion."""
        super().__init__(
            classification=classification, criterion=criterion, tuning=tuning, hpo=hpo
        )
        self.mlp_training = mlp_training
        self.threshold_tuning = threshold_tuning

    def evaluate(
        self,
        y: np.ndarray,
        probs: np.ndarray,
        threshold: Optional[bool] = None,
    ) -> Tuple[float, Optional[float]]:
        """Evaluates model performance based on the classification criterion.

        For binary or multiclass classification.

        Args:
            y (np.ndarray): True labels for the validation data.
            probs (np.ndarray): Probability predictions for each class.
                For binary classification, the probability for the positive class.
                For multiclass, a 2D array with probabilities.
            threshold (Optional[bool]): Flag for threshold tuning when tuning
                with F1.
                Defaults to None.

        Returns:
            Tuple: Score and optimal threshold (for binary classification).
                For multiclass, only the score is returned.
        """
        if self.classification == "binary":
            return self._evaluate_binary(y=y, probs=probs, threshold=threshold)
        else:
            return self._evaluate_multiclass(y=y, probs=probs)

    def _evaluate_binary(
        self,
        y: np.ndarray,
        probs: np.ndarray,
        threshold: Optional[bool] = None,
    ) -> Tuple[float, Optional[float]]:
        """Evaluates binary classification metrics based on probabilities.

        Args:
            y (np.ndarray): True labels for the validation data.
            probs (np.ndarray): Probability predictions for the positive class.
            threshold (Optional[bool]): Flag for threshold tuning when tuning
                with F1.
                Defaults to None.

        Returns:
            Tuple: Score and optimal threshold (if applicable).
        """
        if self.criterion == "f1":
            if threshold:
                scores, thresholds = [], np.linspace(0, 1, 101)
                # Distinct loop variable avoids shadowing the `threshold` arg.
                for th in thresholds:
                    preds = (probs >= th).astype(int)
                    scores.append(f1_score(y, preds, pos_label=0))
                best_idx = np.argmax(scores)
                return scores[best_idx], thresholds[best_idx]
            else:
                preds = (probs >= 0.5).astype(int)
                return f1_score(y_true=y, y_pred=preds, pos_label=0), 0.5
        else:
            return brier_score_loss(y_true=y, y_proba=probs), None

    def _evaluate_multiclass(
        self, y: np.ndarray, probs: np.ndarray
    ) -> Tuple[float, Optional[float]]:
        """Evaluates multiclass classification metrics based on probabilities.

        Args:
            y (np.ndarray): True labels for the validation data.
            probs (np.ndarray): Probability predictions for each class (2D array).

        Returns:
            Tuple: The calculated score and None.
        """
        preds = np.argmax(probs, axis=1)

        if self.criterion == "macro_f1":
            return f1_score(y_true=y, y_pred=preds, average="macro"), None
        else:
            return brier_loss_multi(y=y, probs=probs), None

    def evaluate_cv(
        self, model: Any, fold: Tuple, return_probs: bool = False
    ) -> Union[float, Tuple[float, np.ndarray, np.ndarray]]:
        """Evaluates a model on a specific training-validation fold.

        Based on a chosen performance criterion.

        Args:
            model (Any): The machine learning model used for
                evaluation.
            fold (tuple): A tuple containing two tuples:
                - The first tuple contains the training data (features and labels).
                - The second tuple contains the validation data (features and labels).
                Specifically, it is structured as ((X_train, y_train), (X_val, y_val)),
                where X_train and X_val are the feature matrices, and y_train and y_val
                are the target vectors.
            return_probs (bool): Return predicted probabilities with score if True.

        Returns:
            Union: The calculated score of the model on the validation data, and
                optionally the true labels and predicted probabilities.
        """
        (X_train, y_train), (X_val, y_val) = fold
        with warnings.catch_warnings():
            warnings.filterwarnings("ignore", category=UserWarning)
            warnings.filterwarnings("ignore", category=ConvergenceWarning)

            score, _, _ = self.train(
                model=model, X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val
            )

            if return_probs:
                if hasattr(model, "predict_proba"):
                    probs = model.predict_proba(X_val)[:, 1]
                    return score, y_val, probs
                else:
                    raise AttributeError(
                        f"The model {type(model)} does not support predict_proba."
                    )

        return score

    def _find_optimal_threshold(
        self, true_labels: np.ndarray, probs: np.ndarray
    ) -> Union[float, None]:
        """Find the optimal threshold based on the criterion.

        Converts probabilities into binary decisions.

        Args:
            true_labels (np.ndarray): The true labels for validation or test data.
            probs (np.ndarray): Predicted probabilities for the positive class.

        Returns:
            Union: The optimal threshold for 'f1', or None if the criterion is
                'brier_score'.
        """
        if self.criterion == "brier_score":
            return None

        elif self.criterion == "f1":
            thresholds = np.linspace(0, 1, 101)
            scores = [
                f1_score(y_true=true_labels, y_pred=probs >= th, pos_label=0)
                for th in thresholds
            ]
            best_threshold = thresholds[np.argmax(scores)]
            print(f"Best threshold: {best_threshold}, Best F1 score: {np.max(scores)}")
            return best_threshold
        raise ValueError(f"Invalid criterion: {self.criterion}")

    def optimize_threshold(
        self,
        model: Any,
        outer_splits: Optional[List[Tuple[pd.DataFrame, pd.DataFrame]]],
        n_jobs: int,
    ) -> Union[float, None]:
        """Optimize the decision threshold using cross-validation.

        Aggregates probability predictions across cross-validation folds.

        Args:
            model (Any): The trained machine learning model.
            outer_splits (List[Tuple]): List of ((X_train, y_train), (X_val, y_val)).
            n_jobs (int): Number of parallel jobs to use for cross-validation.

        Returns:
            Union: The optimal threshold for 'f1', or None if the criterion is
                'brier_score'.
        """
        if outer_splits is None:
            return None

        results = Parallel(n_jobs=n_jobs)(
            delayed(self.evaluate_cv)(model, fold, return_probs=True)
            for fold in outer_splits
        )

        all_true_labels = np.concatenate([y for _, y, _ in results])
        all_probs = np.concatenate([probs for _, _, probs in results])

        return self._find_optimal_threshold(
            true_labels=all_true_labels, probs=all_probs
        )

    @abstractmethod
    def train(
        self,
        model: Any,
        X_train: pd.DataFrame,
        y_train: pd.Series,
        X_val: pd.DataFrame,
        y_val: pd.Series,
    ):
        """Trains either an MLP model with custom logic or a standard model.

        Args:
            model (Any): The machine learning model to be trained.
            X_train (pd.DataFrame): Training features.
            y_train (pd.Series): Training labels.
            X_val (pd.DataFrame): Validation features.
            y_val (pd.Series): Validation labels.
        """

    @abstractmethod
    def train_mlp(
        self,
        mlp_model: MLPClassifier,
        X_train: pd.DataFrame,
        y_train: pd.Series,
        X_val: pd.DataFrame,
        y_val: pd.Series,
        final: bool = False,
    ):
        """Trains MLPClassifier with early stopping and evaluates performance.

        Applies evaluation for both binary and multiclass classification.

        Args:
            mlp_model (MLPClassifier): The MLPClassifier to be trained.
            X_train (pd.DataFrame): Training features.
            y_train (pd.Series): Training labels.
            X_val (pd.DataFrame): Validation features.
            y_val (pd.Series): Validation labels.
            final (bool): Flag for final model training.
        """

    @abstractmethod
    def train_final_model(
        self,
        df: pd.DataFrame,
        resampler: Resampler,
        model: Tuple,
        sampling: Optional[str],
        factor: Optional[float],
        n_jobs: int,
        seed: int,
        test_size: float,
        verbose: bool,
    ):
        """Trains the final model.

        Args:
            df (pd.DataFrame): The dataset used for model evaluation.
            resampler (Resampler): Resampling class instance.
            model (Tuple): The machine learning model used for evaluation.
            sampling (Optional[str]): The type of sampling to apply.
            factor (Optional[float]): The factor by which to upsample or
                downsample.
            n_jobs (int): The number of parallel jobs to run for evaluation.
            seed (int): Seed for splitting.
            test_size (float): Proportion of the data used for the test split.
            verbose (bool): Enables verbose output during model evaluation if
                set to True.
        """

__init__(classification, criterion, tuning, hpo, mlp_training, threshold_tuning)

Initializes the Trainer with classification type and criterion.

Source code in periomod/training/_basetrainer.py
def __init__(
    self,
    classification: str,
    criterion: str,
    tuning: Optional[str],
    hpo: Optional[str],
    mlp_training: Optional[bool],
    threshold_tuning: Optional[bool],
) -> None:
    """Initializes the Trainer with classification type and criterion."""
    super().__init__(
        classification=classification, criterion=criterion, tuning=tuning, hpo=hpo
    )
    self.mlp_training = mlp_training
    self.threshold_tuning = threshold_tuning
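
Because `BaseTrainer` is abstract, construction goes through a subclass. A sketch using the hypothetical `DummyTrainer` from the top of this page (argument values are examples; `BaseValidator` checks them at construction time):

trainer = DummyTrainer(
    classification="binary",
    criterion="f1",
    tuning="cv",
    hpo=None,
    mlp_training=False,
    threshold_tuning=True,
)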

evaluate(y, probs, threshold=None)

Evaluates model performance based on the classification criterion.

For binary or multiclass classification.

Parameters:

  • y (np.ndarray): True labels for the validation data. Required.
  • probs (np.ndarray): Probability predictions for each class. For binary classification, the probability for the positive class. For multiclass, a 2D array with probabilities. Required.
  • threshold (Optional[bool]): Flag for threshold tuning when tuning with F1. Defaults to None.

Returns:

  • Tuple[float, Optional[float]]: Score and optimal threshold (for binary classification). For multiclass, only the score is returned.

Source code in periomod/training/_basetrainer.py
def evaluate(
    self,
    y: np.ndarray,
    probs: np.ndarray,
    threshold: Optional[bool] = None,
) -> Tuple[float, Optional[float]]:
    """Evaluates model performance based on the classification criterion.

    For binary or multiclass classification.

    Args:
        y (np.ndarray): True labels for the validation data.
        probs (np.ndarray): Probability predictions for each class.
            For binary classification, the probability for the positive class.
            For multiclass, a 2D array with probabilities.
        threshold (Optional[bool]): Flag for threshold tuning when tuning
            with F1.
            Defaults to None.

    Returns:
        Tuple: Score and optimal threshold (for binary classification).
            For multiclass, only the score is returned.
    """
    if self.classification == "binary":
        return self._evaluate_binary(y=y, probs=probs, threshold=threshold)
    else:
        return self._evaluate_multiclass(y=y, probs=probs)
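
A brief usage sketch with synthetic data, assuming the binary `trainer` instance constructed above (criterion 'f1'):

import numpy as np

y = np.array([0, 0, 1, 1, 1])
probs = np.array([0.2, 0.4, 0.6, 0.7, 0.9])

# Fixed 0.5 cutoff: returns the F1 score (computed with pos_label=0) and 0.5.
score, thr = trainer.evaluate(y=y, probs=probs)

# Threshold sweep over np.linspace(0, 1, 101): returns the best F1 and the
# threshold that achieves it.
best_score, best_thr = trainer.evaluate(y=y, probs=probs, threshold=True)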

evaluate_cv(model, fold, return_probs=False)

Evaluates a model on a specific training-validation fold.

Based on a chosen performance criterion.

Parameters:

  • model (Any): The machine learning model used for evaluation. Required.
  • fold (tuple): A tuple structured as ((X_train, y_train), (X_val, y_val)), where the first inner tuple holds the training data, the second holds the validation data, X_train and X_val are the feature matrices, and y_train and y_val are the target vectors. Required.
  • return_probs (bool): Return predicted probabilities with the score if True. Defaults to False.

Returns:

  • Union[float, Tuple[float, np.ndarray, np.ndarray]]: The calculated score of the model on the validation data, and optionally the true labels and predicted probabilities.

Source code in periomod/training/_basetrainer.py
def evaluate_cv(
    self, model: Any, fold: Tuple, return_probs: bool = False
) -> Union[float, Tuple[float, np.ndarray, np.ndarray]]:
    """Evaluates a model on a specific training-validation fold.

    Based on a chosen performance criterion.

    Args:
        model (Any): The machine learning model used for
            evaluation.
        fold (tuple): A tuple containing two tuples:
            - The first tuple contains the training data (features and labels).
            - The second tuple contains the validation data (features and labels).
            Specifically, it is structured as ((X_train, y_train), (X_val, y_val)),
            where X_train and X_val are the feature matrices, and y_train and y_val
            are the target vectors.
        return_probs (bool): Return predicted probabilities with score if True.

    Returns:
        Union: The calculated score of the model on the validation data, and
            optionally the true labels and predicted probabilities.
    """
    (X_train, y_train), (X_val, y_val) = fold
    with warnings.catch_warnings():
        warnings.filterwarnings("ignore", category=UserWarning)
        warnings.filterwarnings("ignore", category=ConvergenceWarning)

        score, _, _ = self.train(
            model=model, X_train=X_train, y_train=y_train, X_val=X_val, y_val=y_val
        )

        if return_probs:
            if hasattr(model, "predict_proba"):
                probs = model.predict_proba(X_val)[:, 1]
                return score, y_val, probs
            else:
                raise AttributeError(
                    f"The model {type(model)} does not support predict_proba."
                )

    return score
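
A usage sketch with a synthetic fold; `trainer` is again the hypothetical instance from above, whose `train` fits the model before probabilities are extracted:

import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((100, 4)))
y = pd.Series(rng.integers(0, 2, size=100))

fold = ((X.iloc[:80], y.iloc[:80]), (X.iloc[80:], y.iloc[80:]))
score = trainer.evaluate_cv(model=LogisticRegression(), fold=fold)

# With return_probs=True, the validation labels and positive-class
# probabilities are returned alongside the score.
score, y_val, probs = trainer.evaluate_cv(
    model=LogisticRegression(), fold=fold, return_probs=True
)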

optimize_threshold(model, outer_splits, n_jobs)

Optimize the decision threshold using cross-validation.

Aggregates probability predictions across cross-validation folds.

Parameters:

  • model (Any): The trained machine learning model. Required.
  • outer_splits (List[Tuple]): List of ((X_train, y_train), (X_val, y_val)) folds. Required.
  • n_jobs (int): Number of parallel jobs to use for cross-validation. Required.

Returns:

  • Union[float, None]: The optimal threshold for 'f1', or None if the criterion is 'brier_score'.

Source code in periomod/training/_basetrainer.py
def optimize_threshold(
    self,
    model: Any,
    outer_splits: Optional[List[Tuple[pd.DataFrame, pd.DataFrame]]],
    n_jobs: int,
) -> Union[float, None]:
    """Optimize the decision threshold using cross-validation.

    Aggregates probability predictions across cross-validation folds.

    Args:
        model (Any): The trained machine learning model.
        outer_splits (List[Tuple]): List of ((X_train, y_train), (X_val, y_val)).
        n_jobs (int): Number of parallel jobs to use for cross-validation.

    Returns:
        Union: The optimal threshold for 'f1', or None if the criterion is
            'brier_score'.
    """
    if outer_splits is None:
        return None

    results = Parallel(n_jobs=n_jobs)(
        delayed(self.evaluate_cv)(model, fold, return_probs=True)
        for fold in outer_splits
    )

    all_true_labels = np.concatenate([y for _, y, _ in results])
    all_probs = np.concatenate([probs for _, _, probs in results])

    return self._find_optimal_threshold(
        true_labels=all_true_labels, probs=all_probs
    )
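
Tying the pieces together, a sketch of threshold optimization across two synthetic folds (names reused from the `evaluate_cv` example; with criterion 'f1', each fold is evaluated in parallel, the out-of-fold probabilities are pooled, and 101 candidate thresholds are swept):

outer_splits = [
    ((X.iloc[:80], y.iloc[:80]), (X.iloc[80:], y.iloc[80:])),
    ((X.iloc[20:], y.iloc[20:]), (X.iloc[:20], y.iloc[:20])),
]

best_threshold = trainer.optimize_threshold(
    model=LogisticRegression(), outer_splits=outer_splits, n_jobs=2
)
# Returns None when the criterion is 'brier_score' or outer_splits is None.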

train(model, X_train, y_train, X_val, y_val) abstractmethod

Trains either an MLP model with custom logic or a standard model.

Parameters:

  • model (Any): The machine learning model to be trained. Required.
  • X_train (pd.DataFrame): Training features. Required.
  • y_train (pd.Series): Training labels. Required.
  • X_val (pd.DataFrame): Validation features. Required.
  • y_val (pd.Series): Validation labels. Required.

Source code in periomod/training/_basetrainer.py
@abstractmethod
def train(
    self,
    model: Any,
    X_train: pd.DataFrame,
    y_train: pd.Series,
    X_val: pd.DataFrame,
    y_val: pd.Series,
):
    """Trains either an MLP model with custom logic or a standard model.

    Args:
        model (Any): The machine learning model to be trained.
        X_train (pd.DataFrame): Training features.
        y_train (pd.Series): Training labels.
        X_val (pd.DataFrame): Validation features.
        y_val (pd.Series): Validation labels.
    """

train_final_model(df, resampler, model, sampling, factor, n_jobs, seed, test_size, verbose) abstractmethod

Trains the final model.

Parameters:

  • df (pd.DataFrame): The dataset used for model evaluation. Required.
  • resampler (Resampler): Resampling class instance. Required.
  • model (Tuple): The machine learning model used for evaluation. Required.
  • sampling (Optional[str]): The type of sampling to apply. Required.
  • factor (Optional[float]): The factor by which to upsample or downsample. Required.
  • n_jobs (int): The number of parallel jobs to run for evaluation. Required.
  • seed (int): Seed for splitting. Required.
  • test_size (float): Proportion of the data used for the test split. Required.
  • verbose (bool): Enables verbose output during model evaluation if set to True. Required.

Source code in periomod/training/_basetrainer.py
@abstractmethod
def train_final_model(
    self,
    df: pd.DataFrame,
    resampler: Resampler,
    model: Tuple,
    sampling: Optional[str],
    factor: Optional[float],
    n_jobs: int,
    seed: int,
    test_size: float,
    verbose: bool,
):
    """Trains the final model.

    Args:
        df (pd.DataFrame): The dataset used for model evaluation.
        resampler (Resampler): Resampling class instance.
        model (Tuple): The machine learning model used for evaluation.
        sampling (Optional[str]): The type of sampling to apply.
        factor (Optional[float]): The factor by which to upsample or
            downsample.
        n_jobs (int): The number of parallel jobs to run for evaluation.
        seed (int): Seed for splitting.
        test_size (float): Proportion of the data used for the test split.
        verbose (bool): Enables verbose output during model evaluation if set
            to True.
    """

train_mlp(mlp_model, X_train, y_train, X_val, y_val, final=False) abstractmethod

Trains MLPClassifier with early stopping and evaluates performance.

Applies evaluation for both binary and multiclass classification.

Parameters:

  • mlp_model (MLPClassifier): The MLPClassifier to be trained. Required.
  • X_train (pd.DataFrame): Training features. Required.
  • y_train (pd.Series): Training labels. Required.
  • X_val (pd.DataFrame): Validation features. Required.
  • y_val (pd.Series): Validation labels. Required.
  • final (bool): Flag for final model training. Defaults to False.

Source code in periomod/training/_basetrainer.py
@abstractmethod
def train_mlp(
    self,
    mlp_model: MLPClassifier,
    X_train: pd.DataFrame,
    y_train: pd.Series,
    X_val: pd.DataFrame,
    y_val: pd.Series,
    final: bool = False,
):
    """Trains MLPClassifier with early stopping and evaluates performance.

    Applies evaluation for both binary and multiclass classification.

    Args:
        mlp_model (MLPClassifier): The MLPClassifier to be trained.
        X_train (pd.DataFrame): Training features.
        y_train (pd.Series): Training labels.
        X_val (pd.DataFrame): Validation features.
        y_val (pd.Series): Validation labels.
        final (bool): Flag for final model training.
    """