TarNetHyperparameterTuner
Description
The TarNetHyperparameterTuner class is used for hyperparameter tuning of the TarNet model. It leverages a hyperparameter optimization framework (e.g., Optuna) to explore different model configurations and determine the best set of hyperparameters based on validation loss.
Arguments
T: Treatment variables.
Y: Outcome variables.
R: Internal representations (e.g., precomputed hidden states) used as input features.
epoch (list of str, optional): Candidate numbers of training epochs, given as strings (default: ["100", "200"]).
batch_size (int, optional): Batch size for tuning (default: 64).
valid_perc (float, optional): Fraction of data for validation (default: 0.2).
learning_rate (list of float, optional): Bounds of the learning-rate search range (default: [1e-4, 1e-5]).
dropout (list of float, optional): Bounds of the dropout-rate search range (default: [0.1, 0.2]).
step_size (list of int, optional): List of step sizes (default: [5, 10]).
architecture_y (list of str, optional): Outcome model architecture options, each encoded as the string representation of a list of layer sizes (default: ["[1]"]).
architecture_z (list of str, optional): Deconfounder architecture options, encoded the same way (default: ["[1024]", "[2048]", "[4096]"]).
bn (list of bool, optional): Options for batch normalization (default: [True, False]).
patience_min (int, optional): Lower bound of the patience search range for early stopping (default: 5).
patience_max (int, optional): Upper bound of the patience search range for early stopping (default: 20).
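Together, these arguments define the search space that the tuner's objective explores on each trial (see the full example below). As a minimal sketch of the underlying mechanics, assuming only Optuna's standard suggest_* API, an objective over spaces like these could be written as follows; the function name and the placeholder return value are hypothetical, and the package's actual objective method may differ:

import optuna

def sketch_objective(trial: optuna.Trial) -> float:
    # Sample one configuration from search spaces mirroring the defaults above
    epoch = int(trial.suggest_categorical("epoch", ["100", "200"]))
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-4, log=True)
    dropout = trial.suggest_float("dropout", 0.1, 0.2)
    step_size = trial.suggest_categorical("step_size", [5, 10])
    architecture_z = trial.suggest_categorical("architecture_z", ["[1024]", "[2048]", "[4096]"])
    bn = trial.suggest_categorical("bn", [True, False])
    patience = trial.suggest_int("patience", 5, 20)
    # The real objective trains a TarNet with this configuration and
    # returns its validation loss; 0.0 is a stand-in placeholder.
    return 0.0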
Example Usage
from gpi_pack.TarNet import TarNetHyperparameterTuner
import optuna
# Load data and set hyperparameters
# (df is assumed to be a pandas DataFrame with treatment and outcome columns,
# and hidden_states the precomputed internal representations R)
obj = TarNetHyperparameterTuner(
    # Data
    T = df['TreatmentVar'].values,
    Y = df['OutcomeVar'].values,
    R = hidden_states,
    # Hyperparameters
    epoch = ["100", "200"], # try either 100 or 200 epochs
    learning_rate = [1e-4, 1e-5], # draw the learning rate between 1e-5 and 1e-4
    dropout = [0.1, 0.2], # draw the dropout rate between 0.1 and 0.2
    # Outcome model architecture:
    # [100, 1] means the deconfounder is passed to an intermediate layer
    # of size 100, and then to the output layer of size 1.
    architecture_y = ["[200, 1]", "[100, 1]"], # either [200, 1] or [100, 1] (layer sizes)
    # Deconfounder model architecture:
    # [1024] means the input (hidden states) is passed to an intermediate layer of size 1024.
    # The size of the last layer (the last number in the list) is the dimension of the deconfounder.
    architecture_z = ["[1024]", "[2048]"] # either [1024] or [2048]
)
# Hyperparameter tuning with Optuna
study = optuna.create_study(direction='minimize')
study.optimize(obj.objective, n_trials=100) # run 100 trials to search for the best hyperparameters
# Print the best hyperparameters
print("Best hyperparameters: ", study.best_params)