XGBoost Models

XGBoost is the default model framework in Workbench. It trains gradient-boosted trees on RDKit molecular descriptors and supports both regression and classification, with built-in uncertainty quantification.

Creating an XGBoost Model

from workbench.api import FeatureSet, ModelType

fs = FeatureSet("aqsol_features")

# Regression with uncertainty quantification
model = fs.to_model(
    name="sol-xgb-reg",
    model_type=ModelType.UQ_REGRESSOR,
    target_column="solubility",
    feature_list=fs.feature_columns,
    description="XGBoost regression for solubility",
    tags=["xgboost", "solubility"],
)

# Deploy and run inference
endpoint = model.to_endpoint()
endpoint.auto_inference()

Classification

model = fs.to_model(
    name="sol-xgb-class",
    model_type=ModelType.CLASSIFIER,
    target_column="solubility_class",
    feature_list=fs.feature_columns,
    description="XGBoost classifier for solubility",
    tags=["xgboost", "classification"],
)
model.set_class_labels(["low", "medium", "high"])
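Conceptually, set_class_labels fixes the order in which the classifier's integer class indices map to human-readable names. A minimal plain-Python sketch of that mapping (an illustration of the idea, not Workbench internals; raw_preds is a hypothetical stand-in for the model's integer outputs):

```python
# Hypothetical illustration: the classifier emits integer class indices,
# and the label list names them in order (index 0 -> "low", etc.).
class_labels = ["low", "medium", "high"]
raw_preds = [2, 0, 1, 1]  # stand-in for model output, not a real API call
named_preds = [class_labels[i] for i in raw_preds]
print(named_preds)  # ['high', 'low', 'medium', 'medium']
```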

Hyperparameters

Override the defaults by passing a hyperparameters dict:

model = fs.to_model(
    name="sol-xgb-tuned",
    model_type=ModelType.UQ_REGRESSOR,
    target_column="solubility",
    feature_list=fs.feature_columns,
    hyperparameters={
        "n_estimators": 200,
        "max_depth": 6,
        "learning_rate": 0.1,
        "subsample": 0.8,
    },
)
Parameter       Default  Description
n_estimators    300      Number of boosted trees
max_depth       7        Maximum tree depth
learning_rate   0.05     Boosting learning rate (eta)
subsample       0.8      Fraction of training samples per tree
n_folds         5        Number of cross-validation folds (ensemble size)
split_strategy  random   Data splitting: random, scaffold, or butina
butina_cutoff   0.4      Tanimoto distance threshold (butina splits only)
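The n_folds setting sizes the cross-validated ensemble behind the uncertainty estimates. Conceptually (a hedged sketch of the general k-fold-ensemble idea, not Workbench's actual implementation), each fold's model scores the same input, and the mean and spread of those predictions give the point estimate and its uncertainty:

```python
import statistics

def ensemble_predict(fold_models, x):
    """Combine per-fold predictions into a mean and an uncertainty (stdev)."""
    preds = [model(x) for model in fold_models]
    return statistics.mean(preds), statistics.stdev(preds)

# Toy stand-ins for models trained on different folds
fold_models = [lambda x, slope=s: slope * x for s in (1.9, 2.0, 2.1)]
mean, stdev = ensemble_predict(fold_models, 10.0)
print(mean, stdev)  # 20.0 1.0
```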

Any additional XGBoost parameters (e.g., colsample_bytree, gamma, min_child_weight) are passed directly to the XGBoost estimator.
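That pass-through behavior can be pictured as a simple key split (an illustrative assumption about the mechanism, not actual Workbench code; the WORKBENCH_KEYS set and split_params helper are hypothetical names):

```python
# Hypothetical sketch: Workbench-specific keys are consumed internally,
# everything else is forwarded verbatim to the XGBoost estimator.
WORKBENCH_KEYS = {"n_folds", "split_strategy", "butina_cutoff"}

def split_params(hyperparameters):
    workbench = {k: v for k, v in hyperparameters.items() if k in WORKBENCH_KEYS}
    xgboost = {k: v for k, v in hyperparameters.items() if k not in WORKBENCH_KEYS}
    return workbench, xgboost

wb, xgb = split_params({"n_estimators": 200, "gamma": 0.1, "n_folds": 3})
print(xgb)  # {'n_estimators': 200, 'gamma': 0.1}
```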

Examples

Full code listing: examples/models/xgb_model.py


Questions?

The SuperCowPowers team is happy to answer any questions you may have about AWS® and Workbench.