# PyTorch Models
Workbench PyTorch models train feedforward neural networks on RDKit molecular descriptors. They support both regression and classification with built-in uncertainty quantification via cross-validation ensembles.
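The idea behind the cross-validation ensemble can be illustrated with a small sketch (plain Python, not Workbench internals): each fold model predicts independently, the ensemble mean serves as the point prediction, and the spread across folds serves as the uncertainty estimate. The prediction values below are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical per-fold predictions for one molecule from a 5-fold ensemble
fold_predictions = [-2.91, -3.05, -2.88, -3.10, -2.96]

point_estimate = mean(fold_predictions)  # ensemble mean -> point prediction
uncertainty = stdev(fold_predictions)    # spread across folds -> uncertainty

print(f"prediction: {point_estimate:.2f} +/- {uncertainty:.2f}")
```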
## Creating a PyTorch Model
```python
from workbench.api import FeatureSet, ModelType, ModelFramework

fs = FeatureSet("aqsol_features")

# Regression with uncertainty quantification
model = fs.to_model(
    name="sol-pytorch-reg",
    model_type=ModelType.UQ_REGRESSOR,
    model_framework=ModelFramework.PYTORCH,
    target_column="solubility",
    feature_list=fs.feature_columns,
    description="PyTorch regression for solubility",
    tags=["pytorch", "solubility"],
)

# Deploy and run inference
endpoint = model.to_endpoint()
endpoint.auto_inference()
```
## Classification
```python
model = fs.to_model(
    name="sol-pytorch-class",
    model_type=ModelType.CLASSIFIER,
    model_framework=ModelFramework.PYTORCH,
    target_column="solubility_class",
    feature_list=fs.feature_columns,
    description="PyTorch classifier for solubility",
    tags=["pytorch", "classification"],
)
model.set_class_labels(["low", "medium", "high"])
```
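Conceptually, the labels registered with `set_class_labels()` map onto the classifier's per-class probability vector by index. A small illustrative sketch of that mapping (plain Python; the actual Workbench inference output format may differ):

```python
# Illustrative only: the labels registered with set_class_labels()
# correspond positionally to the model's probability outputs.
class_labels = ["low", "medium", "high"]
probs = [0.12, 0.70, 0.18]  # hypothetical softmax output for one row

# argmax over the probability vector picks the predicted class index
pred_index = max(range(len(probs)), key=probs.__getitem__)
pred_label = class_labels[pred_index]
print(pred_label)  # "medium"
```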
## Hyperparameters

Set hyperparameters via the `hyperparameters` dict:
```python
model = fs.to_model(
    name="sol-pytorch-tuned",
    model_type=ModelType.UQ_REGRESSOR,
    model_framework=ModelFramework.PYTORCH,
    target_column="solubility",
    feature_list=fs.feature_columns,
    hyperparameters={
        "layers": "256-128-64",
        "max_epochs": 200,
        "learning_rate": 0.001,
        "dropout": 0.1,
        "batch_size": 32,
    },
)
```
| Parameter | Default | Description |
|---|---|---|
| layers | 512-256-128 | Hidden layer sizes (dash-separated) |
| max_epochs | 200 | Maximum training epochs |
| learning_rate | 0.001 | Optimizer learning rate |
| dropout | 0.05 | Dropout rate |
| batch_size | 64 | Training batch size |
| early_stopping_patience | 30 | Epochs without improvement before stopping |
| loss | L1Loss | Loss function: L1Loss, MSELoss, HuberLoss, SmoothL1Loss |
| n_folds | 5 | Number of cross-validation folds (ensemble size) |
| split_strategy | random | Data splitting: random, scaffold, or butina |
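The `early_stopping_patience` parameter follows the standard early-stopping pattern: halt training once the validation loss has gone a set number of consecutive epochs without improving. A minimal sketch of that logic (illustrative, not the Workbench training loop):

```python
def stopping_epoch(val_losses, patience=30):
    """Return the epoch at which training would stop, given a sequence
    of per-epoch validation losses (illustrative sketch)."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # patience exhausted: stop here
    return len(val_losses) - 1  # ran to max_epochs

# Loss improves for three epochs, then plateaus; with patience=2
# training stops at epoch 4
print(stopping_epoch([1.0, 0.8, 0.7, 0.7, 0.7, 0.7], patience=2))  # 4
```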
## Layer Architecture
The `layers` parameter defines the feedforward network architecture as a dash-separated string. Each number is a hidden layer dimension:
- `"512-256-128"`: three hidden layers (default, good for most datasets)
- `"128-64-32"`: smaller network for smaller datasets
- `"1024-512-256-128"`: deeper network for large, complex datasets
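One way such a string could be expanded into concrete layer dimensions (a sketch of the parsing idea, not the Workbench internals; the real input size depends on the number of RDKit descriptors in the FeatureSet):

```python
def parse_layers(layers: str, n_features: int, n_outputs: int = 1):
    """Expand a dash-separated hidden-layer spec like "512-256-128"
    into (in_dim, out_dim) pairs for a feedforward network."""
    hidden = [int(h) for h in layers.split("-")]
    dims = [n_features] + hidden + [n_outputs]
    return list(zip(dims[:-1], dims[1:]))

# e.g. 200 molecular descriptors in, one regression target out
pairs = parse_layers("512-256-128", n_features=200)
print(pairs)  # [(200, 512), (512, 256), (256, 128), (128, 1)]
```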
## Examples
Full code listing: `examples/models/pytorch.py`
## Questions?

The SuperCowPowers team is happy to answer any questions you may have about AWS® and Workbench.
- Support: workbench@supercowpowers.com
- Discord: Join us on Discord
- Website: supercowpowers.com