ModelExtractor

Bases: BaseConfig

Extracts the best-ranked model from a learner dictionary.

Inherits

BaseConfig: Loads configuration parameters.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| learners_dict | Dict | Dictionary containing models and their metadata. | required |
| criterion | str | Criterion for selecting models ('f1', 'macro_f1', or 'brier_score'). | required |
| aggregate | bool | Whether to aggregate metrics. | required |
| verbose | bool | Controls the verbosity of the evaluation process. | required |
| random_state | int | Random state for resampling. | required |

Attributes:

| Name | Type | Description |
|---|---|---|
| learners_dict | Dict | Holds learners and metadata. |
| criterion | str | Evaluation criterion used to select the optimal model. |
| aggregate | bool | Indicates whether metrics should be aggregated. |
| verbose | bool | Flag controlling logging verbosity. |
| random_state | int | Random state for resampling. |
| classification | str | Classification type ('binary' or 'multiclass'). |

Properties
  • criterion (str): Retrieves or sets current evaluation criterion for model selection. Supports 'f1', 'brier_score', and 'macro_f1'.
  • model (object): Retrieves best-ranked model dynamically based on the current criterion. Recalculates when criterion is updated.
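The interplay between the two properties can be illustrated with a minimal sketch. `TinyExtractor` and its `best_by_criterion` mapping are hypothetical stand-ins, not part of periomod; the real class looks models up by key in `learners_dict` instead of using a direct mapping.

```python
from typing import Any, Dict

SUPPORTED = ("f1", "brier_score", "macro_f1")


class TinyExtractor:
    """Hypothetical stand-in mimicking the criterion/model property pairing."""

    def __init__(self, best_by_criterion: Dict[str, Any], criterion: str) -> None:
        self._best_by_criterion = best_by_criterion
        self.criterion = criterion  # routed through the setter below

    @property
    def criterion(self) -> str:
        return self._criterion

    @criterion.setter
    def criterion(self, value: str) -> None:
        if value not in SUPPORTED:
            raise ValueError(f"Unsupported criterion: {value!r}")
        self._criterion = value
        # Refresh the cached model whenever the criterion changes.
        self._model = self._best_by_criterion[value]

    @property
    def model(self) -> Any:
        return self._model


extractor = TinyExtractor({"f1": "model_a", "brier_score": "model_b"}, "f1")
first = extractor.model          # model selected under 'f1'
extractor.criterion = "brier_score"
second = extractor.model         # model selected under 'brier_score'
```

Validating before assigning means a rejected value leaves both the stored criterion and the cached model unchanged.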
Source code in periomod/wrapper/_basewrapper.py
class ModelExtractor(BaseConfig):
    """Extracts best model from learner dictionary.

    Inherits:
        `BaseConfig`: Loads configuration parameters.

    Args:
        learners_dict (Dict): Dictionary containing models and their metadata.
        criterion (str): Criterion for selecting models (e.g., 'f1', 'brier_score').
        aggregate (bool): Whether to aggregate metrics.
        verbose (bool): Controls the verbosity of the evaluation process.
        random_state (int): Random state for resampling.

    Attributes:
        learners_dict (Dict): Holds learners and metadata.
        criterion (str): Evaluation criterion to select the optimal model.
        aggregate (bool): Indicates if metrics should be aggregated.
        verbose (bool): Flag controlling logging verbosity.
        random_state (int): Random state for resampling.
        classification (str): Classification type ('binary' or 'multiclass').

    Properties:
        - `criterion (str)`: Retrieves or sets current evaluation criterion for model
            selection. Supports 'f1', 'brier_score', and 'macro_f1'.
        - `model (object)`: Retrieves best-ranked model dynamically based on the current
            criterion. Recalculates when criterion is updated.
    """

    def __init__(
        self,
        learners_dict: Dict,
        criterion: str,
        aggregate: bool,
        verbose: bool,
        random_state: int,
    ):
        """Initializes ModelExtractor."""
        super().__init__()
        self.learners_dict = learners_dict
        self.criterion = criterion
        self.aggregate = aggregate
        self.verbose = verbose
        self.random_state = random_state
        self._update_best_model()
        self.classification = (
            "multiclass" if self.task == "pdgrouprevaluation" else "binary"
        )

    @property
    def criterion(self) -> str:
        """The current evaluation criterion used to select the best model.

        Returns:
            str: The current criterion for model selection (e.g., 'f1', 'brier_score').

        Raises:
            ValueError: If an unsupported criterion is assigned.
        """
        return self._criterion

    @criterion.setter
    def criterion(self, value: str) -> None:
        """Sets the evaluation criterion and updates related attributes accordingly.

        Args:
            value (str): The criterion for selecting the model ('f1', 'brier_score',
                or 'macro_f1').

        Raises:
            ValueError: If the provided criterion is unsupported.
        """
        if value not in ["f1", "brier_score", "macro_f1"]:
            raise ValueError(
                "Unsupported criterion. Choose 'f1', 'macro_f1', or 'brier_score'."
            )
        self._criterion = value
        self._update_best_model()

    @property
    def model(self) -> Any:
        """Retrieves the best model based on the current criterion.

        Returns:
            Any: The model object selected according to the current criterion.

        Raises:
            ValueError: If no model matching the criterion and rank is found.
        """
        return self._model

    def _update_best_model(self) -> None:
        """Retrieves and updates the best model based on the current criterion."""
        (
            best_model,
            self.encoding,
            self.learner,
            self.task,
            self.factor,
            self.sampling,
        ) = self._get_best()
        self._model = best_model

    def _get_best(self) -> Tuple[Any, str, str, str, Optional[float], Optional[str]]:
        """Retrieves best model entities.

        Returns:
            Tuple: A tuple containing the best model, encoding ('one_hot' or 'target'),
                learner, task, factor, and sampling type (if applicable).

        Raises:
            ValueError: If model with rank1 is not found, or any component cannot be
                determined.
        """
        best_model_key = next(
            (
                key
                for key in self.learners_dict
                if f"_{self.criterion}_" in key and "rank1" in key
            ),
            None,
        )

        if not best_model_key:
            raise ValueError(
                f"No model with rank1 found for criterion '{self.criterion}' in dict."
            )

        best_model = self.learners_dict[best_model_key]

        if "one_hot" in best_model_key:
            encoding = "one_hot"
        elif "target" in best_model_key:
            encoding = "target"
        else:
            raise ValueError("Unable to determine encoding from the model key.")

        if "upsampling" in best_model_key:
            sampling = "upsampling"
        elif "downsampling" in best_model_key:
            sampling = "downsampling"
        elif "smote" in best_model_key:
            sampling = "smote"
        else:
            sampling = None

        key_parts = best_model_key.split("_")
        task = key_parts[0]
        learner = key_parts[1]

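        # Default to None so that keys without a "factor<N>" part do not
        # leave `factor` unbound at the return statement.
        factor: Optional[float] = None
        for part in key_parts:
            if part.startswith("factor"):
                factor_value = part.replace("factor", "")
                if factor_value.isdigit():
                    factor = float(factor_value)
                else:
                    factor = None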

        return best_model, encoding, learner, task, factor, sampling
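The key parsing performed by `_get_best` can be tried in isolation with a standalone sketch. The function `parse_model_key`, the `KeyParts` tuple, and the example key are hypothetical and only mirror the substring checks shown above; the actual key naming scheme used by periomod may differ.

```python
from typing import NamedTuple, Optional


class KeyParts(NamedTuple):
    task: str
    learner: str
    encoding: str
    sampling: Optional[str]
    factor: Optional[float]


def parse_model_key(key: str) -> KeyParts:
    """Split a learner key into components using substring checks."""
    if "one_hot" in key:
        encoding = "one_hot"
    elif "target" in key:
        encoding = "target"
    else:
        raise ValueError("Unable to determine encoding from the model key.")

    # First sampling name found in the key, or None if there is none.
    sampling = next(
        (s for s in ("upsampling", "downsampling", "smote") if s in key), None
    )

    parts = key.split("_")
    factor = None  # stays None when the key carries no numeric factor
    for part in parts:
        if part.startswith("factor"):
            value = part.replace("factor", "")
            factor = float(value) if value.isdigit() else None

    return KeyParts(parts[0], parts[1], encoding, sampling, factor)


# Hypothetical keys; the real naming scheme may differ.
with_sampling = parse_model_key("pocketclosure_lr_f1_one_hot_upsampling_factor2_rank1")
without_sampling = parse_model_key("pocketclosure_rf_f1_target_rank1")
```

Note that `str.isdigit` only accepts integer-like strings, so a fractional factor such as `factor0.5` would be parsed as `None` under this logic.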

criterion: str property writable

The current evaluation criterion used to select the best model.

Returns:

| Type | Description |
|---|---|
| str | The current criterion for model selection (e.g., 'f1', 'brier_score'). |

Raises:

| Type | Description |
|---|---|
| ValueError | If an unsupported criterion is assigned. |

model: Any property

Retrieves the best model based on the current criterion.

Returns:

| Type | Description |
|---|---|
| Any | The model object selected according to the current criterion. |

Raises:

| Type | Description |
|---|---|
| ValueError | If no model matching the criterion and rank is found. |
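The rank-1 lookup behind this error can be sketched as a small standalone function. `find_rank1_key` and the dictionary keys below are hypothetical illustrations of the substring test used in the class source, not part of the periomod API.

```python
from typing import Any, Dict


def find_rank1_key(learners: Dict[str, Any], criterion: str) -> str:
    """Return the first key containing both `_<criterion>_` and `rank1`."""
    key = next(
        (k for k in learners if f"_{criterion}_" in k and "rank1" in k),
        None,
    )
    if key is None:
        raise ValueError(f"No model with rank1 found for criterion '{criterion}'.")
    return key


# Hypothetical keys and placeholder model values.
learners = {
    "pocketclosure_lr_f1_one_hot_rank2": "runner_up",
    "pocketclosure_rf_f1_target_rank1": "best_f1",
    "pocketclosure_xgb_brier_score_one_hot_rank1": "best_brier",
}
best_f1_key = find_rank1_key(learners, "f1")
best_brier_key = find_rank1_key(learners, "brier_score")
```

Because the match is a plain substring test, `_f1_` also occurs inside keys that contain `_macro_f1_`, so dictionary order may matter when both criteria are present in the same dictionary.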

__init__(learners_dict, criterion, aggregate, verbose, random_state)

Initializes ModelExtractor.

Source code in periomod/wrapper/_basewrapper.py
def __init__(
    self,
    learners_dict: Dict,
    criterion: str,
    aggregate: bool,
    verbose: bool,
    random_state: int,
):
    """Initializes ModelExtractor."""
    super().__init__()
    self.learners_dict = learners_dict
    self.criterion = criterion
    self.aggregate = aggregate
    self.verbose = verbose
    self.random_state = random_state
    self._update_best_model()
    self.classification = (
        "multiclass" if self.task == "pdgrouprevaluation" else "binary"
    )