BaseConfig

Base class to initialize Hydra configuration.

This class loads the configuration parameters used across the package from a Hydra configuration file and exposes them as instance attributes.

Parameters:

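| Name | Type | Description | Default |
| --- | --- | --- | --- |
| config_path | str | Path to the Hydra config directory. A relative path is resolved against the file that calls hydra.initialize (periomod/base.py), not the working directory. | '../config' |
| config_name | str | Name of the configuration file (without extension). | 'config' |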
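
Hydra interprets a relative config_path relative to the Python file that calls hydra.initialize, not relative to the current working directory. The sketch below approximates that resolution with the standard library only. The helper resolve_config_dir and the path /opt/project/periomod/base.py are hypothetical and chosen for illustration; neither is part of periomod or Hydra.

```python
import posixpath


def resolve_config_dir(calling_file: str, config_path: str = "../config") -> str:
    """Approximate the directory that a relative config_path points to.

    Assumption: the path is taken relative to the directory of the file
    that calls hydra.initialize (here, base.py).
    """
    base_dir = posixpath.dirname(calling_file)
    # normpath collapses the ".." segment into a plain directory path.
    return posixpath.normpath(posixpath.join(base_dir, config_path))


# Hypothetical install location, for illustration only.
print(resolve_config_dir("/opt/project/periomod/base.py"))  # /opt/project/config
```

Under this assumption, the default '../config' refers to a config directory that sits next to the periomod package directory.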

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| group_col | str | Column name used for group-based splitting. |
| y | str | Target column name in the dataset. |
| target_state | int | Random state for target encoding. |
| learner_state | int | Random state for the learners. |
| xgb_obj_binary | str | Objective function for binary classification in XGBoost. |
| xgb_loss_binary | str | Loss function for binary classification in XGBoost. |
| xgb_obj_multi | str | Objective function for multiclass classification in XGBoost. |
| xgb_loss_multi | str | Loss function for multiclass classification in XGBoost. |
| lr_solver_binary | str | Solver type for binary classification in logistic regression. |
| lr_solver_multi | str | Solver type for multiclass classification in logistic regression. |
| lr_multi_loss | str | Loss function for multiclass logistic regression. |
| patient_columns | List[str] | List of column names related to patient data. |
| tooth_columns | List[str] | List of column names related to tooth data. |
| side_columns | List[str] | List of column names related to side data. |
| feature_mapping | dict[str, str] | Mapping of feature names for plotting. |
| cat_vars | List[str] | List of categorical variables in the dataset. |
| bin_vars | List[str] | List of binary variables in the dataset. |
| scale_vars | List[str] | List of numeric variables to scale in preprocessing. |
| behavior_columns | Dict[str, List[str]] | Dictionary categorizing behavior-related columns by type. |
| task_cols | List[str] | List of task-specific columns in the dataset. |
| no_train_cols | List[str] | Columns excluded from training. |
| infect_vars | List[str] | Columns indicating infection status. |
| cat_map | Dict[str, int] | Mapping of categorical features to their maximum values for encoding. |
| target_cols | List[str] | Columns related to the prediction target. |
| all_cat_vars | List[str] | cat_vars combined with the categorical entries of behavior_columns, for encoding. |
| required_columns | List[str] | patient_columns, tooth_columns and side_columns combined; the columns required in the dataset for analysis. |
| rs_state | int | Random state for random search parameter selection. |
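
The two derived attributes, all_cat_vars and required_columns, are plain list concatenations of other attributes. The sketch below reproduces that logic with ordinary Python lists and shows one possible use: checking a dataset's columns against required_columns. All column names, the "binary" key, and the helpers combine_columns and missing_columns are hypothetical and not taken from periomod.

```python
from typing import Dict, List


def combine_columns(
    patient_columns: List[str],
    tooth_columns: List[str],
    side_columns: List[str],
) -> List[str]:
    """Concatenate the three column groups, as required_columns does."""
    return patient_columns + tooth_columns + side_columns


def missing_columns(available: List[str], required: List[str]) -> List[str]:
    """Return the required columns absent from a dataset, in required order."""
    present = set(available)
    return [col for col in required if col not in present]


# Hypothetical column names, for illustration only.
patient_columns = ["age", "gender"]
tooth_columns = ["tooth", "toothtype"]
side_columns = ["side", "pdbaseline"]
cat_vars = ["toothtype"]
behavior_columns: Dict[str, List[str]] = {
    "categorical": ["smokingtype"],
    "binary": ["flossing"],  # hypothetical key; only "categorical" is read
}

required_columns = combine_columns(patient_columns, tooth_columns, side_columns)
all_cat_vars = cat_vars + behavior_columns["categorical"]

dataset_columns = ["age", "tooth", "toothtype", "side"]
print(missing_columns(dataset_columns, required_columns))  # ['gender', 'pdbaseline']
print(all_cat_vars)  # ['toothtype', 'smokingtype']
```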

Example
from periomod.base import BaseConfig

config = BaseConfig()
print(config.tooth_columns)
Note

This class assumes that the Hydra configuration files exist under config_path and that they define the resample, learner, data and tuning groups with every key read in __init__. If a file or key is missing, instantiation fails.
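
One way to check the structure described in this note before constructing the class is to compare a loaded configuration against the keys that __init__ reads. The sketch below does this for a plain nested mapping. The group and key names come from the source code on this page; the helper find_missing_keys and the sample values are hypothetical and not part of periomod.

```python
from collections.abc import Mapping
from typing import Dict, List

# Groups and keys that BaseConfig.__init__ reads from the composed config.
EXPECTED_KEYS: Dict[str, List[str]] = {
    "resample": ["group_col", "y", "target_state"],
    "learner": [
        "learner_state", "xgb_obj_binary", "xgb_loss_binary", "xgb_obj_multi",
        "xgb_loss_multi", "lr_solver_binary", "lr_solver_multi", "lr_multi_loss",
    ],
    "data": [
        "patient_columns", "tooth_columns", "side_columns", "feature_mapping",
        "cat_vars", "bin_vars", "scale_vars", "behavior_columns", "task_cols",
        "no_train_cols", "infect_cols", "cat_map", "target_cols",
    ],
    "tuning": ["rs_state"],
}


def find_missing_keys(cfg: Mapping) -> List[str]:
    """List missing groups and dotted keys (e.g. 'data.cat_vars') in cfg."""
    missing: List[str] = []
    for group, keys in EXPECTED_KEYS.items():
        section = cfg.get(group)
        if not isinstance(section, Mapping):
            # The whole group is absent or not a mapping.
            missing.append(group)
            continue
        missing.extend(f"{group}.{key}" for key in keys if key not in section)
    return missing


# Hypothetical, deliberately incomplete configuration.
cfg = {
    "resample": {"group_col": "id_patient", "y": "outcome", "target_state": 0},
    "tuning": {},
}
print(find_missing_keys(cfg))  # ['learner', 'data', 'tuning.rs_state']
```

Note that the attribute infect_vars is read from the key data.infect_cols, so the configuration key and the attribute name differ.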

Source code in periomod/base.py
import hydra


class BaseConfig:
    """Base class to initialize Hydra configuration.

    This class loads the configuration parameters used across the package from a
    Hydra configuration file and exposes them as instance attributes.

    Args:
        config_path (str): Path to the Hydra config directory.
        config_name (str): Name of the configuration file (without extension).

    Attributes:
        group_col (str): Column name used for group-based splitting.
        y (str): Target column name in the dataset.
        target_state (int): Random state of target encoding.
        learner_state (int): Random state of learners.
        xgb_obj_binary (str): Objective function for binary classification in XGBoost.
        xgb_loss_binary (str): Loss function for binary classification in XGBoost.
        xgb_obj_multi (str): Objective function for multiclass classification in
            XGBoost.
        xgb_loss_multi (str): Loss function for multiclass classification in XGBoost.
        lr_solver_binary (str): Solver type for binary classification in logistic
            regression.
        lr_solver_multi (str): Solver type for multiclass classification in logistic
            regression.
        lr_multi_loss (str): Loss function for multiclass logistic regression.
        patient_columns (List[str]): List of column names related to patient data.
        tooth_columns (List[str]): List of column names related to tooth data.
        side_columns (List[str]): List of column names related to side data.
        feature_mapping (dict[str, str]): Mapping of feature names for plotting.
        cat_vars (List[str]): List of categorical variables in the dataset.
        bin_vars (List[str]): List of binary variables in the dataset.
        scale_vars (List[str]): List of numeric variables to scale in preprocessing.
        behavior_columns (Dict[str, List[str]]): Dictionary categorizing
            behavior-related columns by type.
        task_cols (List[str]): List of task-specific columns in the dataset.
        no_train_cols (List[str]): Columns excluded from training.
        infect_vars (List[str]): Columns indicating infection status.
        cat_map (Dict[str, int]): Mapping of categorical features to their maximum
            values for encoding.
        target_cols (List[str]): Columns related to the prediction target.
        all_cat_vars (List[str]): `cat_vars` combined with the categorical entries of
            `behavior_columns`, for encoding.
        required_columns (List[str]): `patient_columns`, `tooth_columns` and
            `side_columns` combined; the columns required in the dataset for analysis.
        rs_state (int): Random state for random search parameter selection.

    Example:
        ```
        config = BaseConfig()
        print(config.tooth_columns)
        ```

    Note:
        This class assumes that the Hydra configuration files exist under
        `config_path` and that they define the `resample`, `learner`, `data` and
        `tuning` groups with every key read in `__init__`. If a file or key is
        missing, instantiation fails.
    """

    def __init__(
        self, config_path: str = "../config", config_name: str = "config"
    ) -> None:
        """Initializes the Hydra configuration for use in other classes."""
        with hydra.initialize(config_path=config_path, version_base="1.2"):
            cfg = hydra.compose(config_name=config_name)

        self.group_col = cfg.resample.group_col
        self.y = cfg.resample.y
        self.target_state = cfg.resample.target_state
        self.learner_state = cfg.learner.learner_state
        self.xgb_obj_binary = cfg.learner.xgb_obj_binary
        self.xgb_loss_binary = cfg.learner.xgb_loss_binary
        self.xgb_obj_multi = cfg.learner.xgb_obj_multi
        self.xgb_loss_multi = cfg.learner.xgb_loss_multi
        self.lr_solver_binary = cfg.learner.lr_solver_binary
        self.lr_solver_multi = cfg.learner.lr_solver_multi
        self.lr_multi_loss = cfg.learner.lr_multi_loss
        self.patient_columns = cfg.data.patient_columns
        self.tooth_columns = cfg.data.tooth_columns
        self.side_columns = cfg.data.side_columns
        self.feature_mapping = cfg.data.feature_mapping
        self.cat_vars = cfg.data.cat_vars
        self.bin_vars = cfg.data.bin_vars
        self.scale_vars = cfg.data.scale_vars
        self.behavior_columns = cfg.data.behavior_columns
        self.task_cols = cfg.data.task_cols
        self.no_train_cols = cfg.data.no_train_cols
        self.infect_vars = cfg.data.infect_cols
        self.cat_map = cfg.data.cat_map
        self.target_cols = cfg.data.target_cols
        self.all_cat_vars = self.cat_vars + cfg.data.behavior_columns["categorical"]
        self.required_columns = (
            self.patient_columns + self.tooth_columns + self.side_columns
        )
        self.rs_state = cfg.tuning.rs_state

__init__(config_path='../config', config_name='config')

Initializes the Hydra configuration for use in other classes.

Source code in periomod/base.py
def __init__(
    self, config_path: str = "../config", config_name: str = "config"
) -> None:
    """Initializes the Hydra configuration for use in other classes."""
    with hydra.initialize(config_path=config_path, version_base="1.2"):
        cfg = hydra.compose(config_name=config_name)

    self.group_col = cfg.resample.group_col
    self.y = cfg.resample.y
    self.target_state = cfg.resample.target_state
    self.learner_state = cfg.learner.learner_state
    self.xgb_obj_binary = cfg.learner.xgb_obj_binary
    self.xgb_loss_binary = cfg.learner.xgb_loss_binary
    self.xgb_obj_multi = cfg.learner.xgb_obj_multi
    self.xgb_loss_multi = cfg.learner.xgb_loss_multi
    self.lr_solver_binary = cfg.learner.lr_solver_binary
    self.lr_solver_multi = cfg.learner.lr_solver_multi
    self.lr_multi_loss = cfg.learner.lr_multi_loss
    self.patient_columns = cfg.data.patient_columns
    self.tooth_columns = cfg.data.tooth_columns
    self.side_columns = cfg.data.side_columns
    self.feature_mapping = cfg.data.feature_mapping
    self.cat_vars = cfg.data.cat_vars
    self.bin_vars = cfg.data.bin_vars
    self.scale_vars = cfg.data.scale_vars
    self.behavior_columns = cfg.data.behavior_columns
    self.task_cols = cfg.data.task_cols
    self.no_train_cols = cfg.data.no_train_cols
    self.infect_vars = cfg.data.infect_cols
    self.cat_map = cfg.data.cat_map
    self.target_cols = cfg.data.target_cols
    self.all_cat_vars = self.cat_vars + cfg.data.behavior_columns["categorical"]
    self.required_columns = (
        self.patient_columns + self.tooth_columns + self.side_columns
    )
    self.rs_state = cfg.tuning.rs_state