ModelExtractor

Bases: BaseConfig

Extracts best model from learner dictionary.

Inherits

BaseConfig: Loads configuration parameters.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `learners_dict` | `Dict` | Dictionary containing models and their metadata. | required |
| `criterion` | `str` | Criterion for selecting models (e.g., 'f1', 'brier_score'). | required |
| `aggregate` | `bool` | Whether to aggregate metrics. | required |
| `verbose` | `bool` | Controls verbosity in the evaluation process. | required |
| `random_state` | `int` | Random state for resampling. | required |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `learners_dict` | `Dict` | Holds learners and metadata. |
| `criterion` | `str` | Evaluation criterion to select the optimal model. |
| `aggregate` | `bool` | Indicates if metrics should be aggregated. |
| `verbose` | `bool` | Flag for controlling logging verbosity. |
| `random_state` | `int` | Random state for resampling. |
| `classification` | `str` | Classification type ('binary' or 'multiclass'). |

Properties

`criterion (str)`: Retrieves or sets the current evaluation criterion for model selection. Supports 'f1', 'brier_score', and 'macro_f1'.
`model (object)`: Retrieves the best-ranked model based on the current criterion. The selection is recalculated whenever the criterion is updated.
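The recalculation described above follows the usual Python property-with-setter pattern. The sketch below illustrates it with a hypothetical stand-in class (`CriterionSelector`, not part of periomod) and a made-up mapping of one pre-ranked model per criterion. It only shows how assigning a new criterion can trigger validation and re-selection; it is not the `ModelExtractor` implementation.

```python
class CriterionSelector:
    """Hypothetical stand-in illustrating a validated, writable property."""

    SUPPORTED = ("f1", "brier_score", "macro_f1")

    def __init__(self, models_by_criterion: dict, criterion: str):
        # Assumption for this sketch: one pre-ranked best model per criterion.
        self._models = models_by_criterion
        self.criterion = criterion  # goes through the setter below

    @property
    def criterion(self) -> str:
        return self._criterion

    @criterion.setter
    def criterion(self, value: str) -> None:
        if value not in self.SUPPORTED:
            raise ValueError(f"Unsupported criterion: {value!r}")
        self._criterion = value
        self._model = self._models[value]  # re-select on every change

    @property
    def model(self):
        return self._model


selector = CriterionSelector(
    {"f1": "model_a", "brier_score": "model_b", "macro_f1": "model_c"}, "f1"
)
first = selector.model
selector.criterion = "brier_score"  # triggers re-selection
second = selector.model
```

Because validation happens before the stored value changes, a rejected assignment leaves the previous criterion and model in place.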

Source code in periomod/wrapper/_basewrapper.py

class ModelExtractor(BaseConfig):
    """Extracts best model from learner dictionary.

    Inherits:
        `BaseConfig`: Loads configuration parameters.

    Args:
        learners_dict (Dict): Dictionary containing models and their metadata.
        criterion (str): Criterion for selecting models (e.g., 'f1', 'brier_score').
        aggregate (bool): Whether to aggregate metrics.
        verbose (bool): Controls verbosity in the evaluation process.
        random_state (int): Random state for resampling.

    Attributes:
        learners_dict (Dict): Holds learners and metadata.
        criterion (str): Evaluation criterion to select the optimal model.
        aggregate (bool): Indicates if metrics should be aggregated.
        verbose (bool): Flag for controlling logging verbosity.
        random_state (int): Random state for resampling.
        classification (str): Classification type ('binary' or 'multiclass').

    Properties:
        - `criterion (str)`: Retrieves or sets current evaluation criterion for model
            selection. Supports 'f1', 'brier_score', and 'macro_f1'.
        - `model (object)`: Retrieves best-ranked model dynamically based on the current
            criterion. Recalculates when criterion is updated.
    """

    def __init__(
        self,
        learners_dict: Dict,
        criterion: str,
        aggregate: bool,
        verbose: bool,
        random_state: int,
    ):
        """Initializes ModelExtractor."""
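        super().__init__()
        self.learners_dict = learners_dict
        # Assigning criterion runs the property setter, which validates the value
        # and already calls _update_best_model().
        self.criterion = criterion
        self.aggregate = aggregate
        self.verbose = verbose
        self.random_state = random_state
        # Repeats the selection done by the setter; self.task is set here from
        # the best model's key.
        self._update_best_model()
        self.classification = (
            "multiclass" if self.task == "pdgrouprevaluation" else "binary"
        )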

    @property
    def criterion(self) -> str:
        """The current evaluation criterion used to select the best model.

        Returns:
            str: The current criterion for model selection (e.g., 'f1', 'brier_score').

        Raises:
            ValueError: If an unsupported criterion is assigned.
        """
        return self._criterion

    @criterion.setter
    def criterion(self, value: str) -> None:
        """Sets the evaluation criterion and updates related attributes accordingly.

        Args:
            value (str): The criterion for selecting the model ('f1', 'brier_score',
                or 'macro_f1').

        Raises:
            ValueError: If the provided criterion is unsupported.
        """
        if value not in ["f1", "brier_score", "macro_f1"]:
            raise ValueError(
                "Unsupported criterion. Choose 'f1', 'macro_f1', or 'brier_score'."
            )
        self._criterion = value
        self._update_best_model()

    @property
    def model(self) -> Any:
        """Retrieves the best model based on the current criterion.

        Returns:
            Any: The model object selected according to the current criterion.

        Raises:
            ValueError: If no model matching the criterion and rank is found.
        """
        return self._model

    def _update_best_model(self) -> None:
        """Retrieves and updates the best model based on the current criterion."""
        (
            best_model,
            self.encoding,
            self.learner,
            self.task,
            self.factor,
            self.sampling,
        ) = self._get_best()
        self._model = best_model

    def _get_best(self) -> Tuple[Any, str, str, str, Optional[float], Optional[str]]:
        """Retrieves best model entities.

        Returns:
            Tuple: A tuple containing the best model, encoding ('one_hot' or 'target'),
                learner, task, factor, and sampling type (if applicable).

        Raises:
            ValueError: If model with rank1 is not found, or any component cannot be
                determined.
        """
        best_model_key = next(
            (
                key
                for key in self.learners_dict
                if f"_{self.criterion}_" in key and "rank1" in key
            ),
            None,
        )

        if not best_model_key:
            raise ValueError(
                f"No model with rank1 found for criterion '{self.criterion}' in dict."
            )

        best_model = self.learners_dict[best_model_key]

        if "one_hot" in best_model_key:
            encoding = "one_hot"
        elif "target" in best_model_key:
            encoding = "target"
        else:
            raise ValueError("Unable to determine encoding from the model key.")

        if "upsampling" in best_model_key:
            sampling = "upsampling"
        elif "downsampling" in best_model_key:
            sampling = "downsampling"
        elif "smote" in best_model_key:
            sampling = "smote"
        else:
            sampling = None

        key_parts = best_model_key.split("_")
        task = key_parts[0]
        learner = key_parts[1]

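        # Default to None so that keys without a "factor..." part do not leave
        # `factor` unbound at the return statement.
        factor = None
        for part in key_parts:
            if part.startswith("factor"):
                factor_value = part.replace("factor", "")
                if factor_value.isdigit():
                    factor = float(factor_value)
                else:
                    factor = None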

        return best_model, encoding, learner, task, factor, sampling
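
The key parsing in `_get_best` can be tried in isolation. The following is a minimal sketch that mirrors the logic of the listing above as a standalone function; `parse_model_key` and the example key are hypothetical and invented for illustration, since the actual key format used in `learners_dict` is not shown here.

```python
from typing import Optional, Tuple


def parse_model_key(key: str) -> Tuple[str, str, str, Optional[float], Optional[str]]:
    """Split a key into (task, learner, encoding, factor, sampling)."""
    if "one_hot" in key:
        encoding = "one_hot"
    elif "target" in key:
        encoding = "target"
    else:
        raise ValueError("Unable to determine encoding from the model key.")

    # First sampling name found in the key, or None if there is none.
    sampling = next(
        (s for s in ("upsampling", "downsampling", "smote") if s in key), None
    )

    parts = key.split("_")
    task, learner = parts[0], parts[1]

    factor = None  # stays None when the key carries no numeric factor
    for part in parts:
        if part.startswith("factor"):
            value = part.replace("factor", "")
            factor = float(value) if value.isdigit() else None

    return task, learner, encoding, factor, sampling


# Hypothetical key, invented for this example.
example_key = "pocketclosure_rf_one_hot_upsampling_factor2_f1_rank1"
parsed = parse_model_key(example_key)
```

Note that `str.isdigit()` only accepts plain non-negative integers, so under this logic a part such as `factor2.5` or `factorNone` would yield `None` rather than a float.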

`criterion: str` `property` `writable`

The current evaluation criterion used to select the best model.

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `str` | `str` | The current criterion for model selection (e.g., 'f1', 'brier_score'). |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If an unsupported criterion is assigned. |

`model: Any` `property`

Retrieves the best model based on the current criterion.

Returns:

| Name | Type | Description |
| --- | --- | --- |
| `Any` | `Any` | The model object selected according to the current criterion. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If no model matching the criterion and rank is found. |
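The lookup behind this error can be sketched on its own. The example below reproduces the substring-based key search from the class source as a hypothetical helper (`find_rank1_key`, not part of periomod) and runs it on invented keys; the real key format may differ.

```python
from typing import Any, Dict


def find_rank1_key(learners: Dict[str, Any], criterion: str) -> str:
    """Return the first key containing '_<criterion>_' and 'rank1'."""
    key = next(
        (k for k in learners if f"_{criterion}_" in k and "rank1" in k),
        None,
    )
    if key is None:
        raise ValueError(f"No model with rank1 found for criterion '{criterion}'.")
    return key


# Hypothetical keys, invented for this example.
learners = {
    "pdgrouprevaluation_rf_one_hot_macro_f1_rank1": "model_a",
    "pdgrouprevaluation_rf_one_hot_brier_score_rank1": "model_b",
}

brier_key = find_rank1_key(learners, "brier_score")
# Substring matching: '_f1_' also occurs inside '_macro_f1_'.
f1_key = find_rank1_key(learners, "f1")
```

With keys shaped like these, the criterion 'f1' also matches a 'macro_f1' key, because the test is a plain substring check and the first matching key in dictionary order wins. Whether this matters in practice depends on how the learner dictionary is populated.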

`__init__(learners_dict, criterion, aggregate, verbose, random_state)`

Initializes ModelExtractor.

Source code in periomod/wrapper/_basewrapper.py

def __init__(
    self,
    learners_dict: Dict,
    criterion: str,
    aggregate: bool,
    verbose: bool,
    random_state: int,
):
    """Initializes ModelExtractor."""
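    super().__init__()
    self.learners_dict = learners_dict
    # Assigning criterion runs the property setter, which validates the value
    # and already calls _update_best_model().
    self.criterion = criterion
    self.aggregate = aggregate
    self.verbose = verbose
    self.random_state = random_state
    # Repeats the selection done by the setter; self.task is set here from
    # the best model's key.
    self._update_best_model()
    self.classification = (
        "multiclass" if self.task == "pdgrouprevaluation" else "binary"
    )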
