Release 0.8.358

Need Help?

The SuperCowPowers team is happy to give any assistance needed when setting up AWS and Workbench. Please contact us at workbench@supercowpowers.com or chat with us on Discord.

Sample weights and the train/holdout split used to live as shared mutable state on the FeatureSet. When several models trained off the same FeatureSet concurrently, one job's weights-table rewrite could collide with another's read of the shared training view — an intermittent INVALID_VIEW failure (plus a quieter risk of training on the wrong weights). This release makes the FeatureSet a pure data abstraction and moves weights/splits to per-model artifacts, removing the race by construction.

Full design write-up: Remove the Shared Training View + Sample-Weight State.

API changes

  • Sample weights are now per-model. Pass them to the model transform instead of mutating the FeatureSet:
    sample_weights = {id_1: 0.0, id_2: 0.5, ...}   # sparse; missing ids default to 1.0
    fs.to_model(..., sample_weights=sample_weights)
    
    Rows with weight 0 are excluded from training at the view level, so every model (including custom ones) gets exclusion for free.
  • endpoint.auto_inference() → endpoint.test_inference(). It is now an endpoint smoke test (a sample of rows), not a holdout evaluation.
  • fs.temporal_split(date_col, end_date) returns a {id: weight} dict (was a list of holdout ids) and no longer persists anything — compose it with other weight sources and hand it to to_model(sample_weights=...).
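The sparse-weight semantics described above (missing ids default to 1.0, weight 0 means excluded) can be illustrated with a small pure-Python sketch. The helper names expand_weights and training_ids are hypothetical and are not part of the Workbench API; Workbench applies the exclusion at the view level, which this sketch only imitates on a list of ids.

```python
def expand_weights(all_ids, sample_weights=None):
    """Return a dense {id: weight} map; ids missing from the sparse dict get 1.0."""
    sample_weights = sample_weights or {}
    return {row_id: float(sample_weights.get(row_id, 1.0)) for row_id in all_ids}


def training_ids(all_ids, sample_weights=None):
    """Return the ids that remain once zero-weight rows are excluded."""
    dense = expand_weights(all_ids, sample_weights)
    return [row_id for row_id in all_ids if dense[row_id] != 0.0]


all_ids = [1, 2, 3, 4]
sparse = {1: 0.0, 2: 0.5}  # sparse: ids 3 and 4 are not listed

dense = expand_weights(all_ids, sparse)  # ids 3 and 4 fall back to 1.0
kept = training_ids(all_ids, sparse)     # id 1 is dropped, id 2 stays (down-weighted)
```

Under this reading, passing an empty dict or omitting the weights keeps every row, which matches the "empty weights are a no-op" behavior.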

Code example for API change

Before

# Weights were shared mutable state on the FeatureSet
fs.set_sample_weights({id_1: 0.0, id_2: 0.5})

# temporal_split mutated the FeatureSet (additively merged holdout weights)
fs.temporal_split("date_col", end_date="2025-10-17")

model = fs.to_model(name=model_name, ...)

end = model.to_endpoint(...)
end.auto_inference()

After

# Weights are a plain local dict
sample_weights = {id_1: 0.0, id_2: 0.5}

# temporal_split now RETURNS {id: 0.0}; merge it to keep the old additive behavior
sample_weights = {**sample_weights, **fs.temporal_split("date_col", end_date="2025-10-17")}

# Hand the composed weights to the model transform
model = fs.to_model(name=model_name, ..., sample_weights=sample_weights)

end = model.to_endpoint(...)
end.test_inference()

Highlights:

  • Merge, don't overwrite. temporal_split used to be additive — composing with {**sample_weights, **fs.temporal_split(...)} keeps existing exclusions in place.
  • Empty weights are a no-op. sample_weights={} (or omitting the arg) trains on all rows, so a bare fs.set_sample_weights({}) can simply be dropped.
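As a rough illustration of the merge pattern above, the sketch below composes several sparse weight dicts with plain dict semantics. compose_weights is a hypothetical helper, and the holdout dict is a hand-written stand-in for what fs.temporal_split(...) is described as returning; neither comes from Workbench itself.

```python
def compose_weights(*sources):
    """Merge sparse {id: weight} dicts left to right; later sources win on conflicts."""
    merged = {}
    for source in sources:
        merged.update(source)
    return merged


manual = {"id_1": 0.0, "id_2": 0.5}
holdout = {"id_2": 0.0, "id_3": 0.0}  # stand-in for a temporal_split result

# Same result as {**manual, **holdout}
merged = compose_weights(manual, holdout)
```

Because later sources win, order matters: a source that assigns a non-zero weight to an id would override an earlier exclusion of that id, so exclusion dicts are safest placed last.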

Removed

  • FeatureSet methods: set_sample_weights, get_sample_weights, add_filter, set_training_holdouts, get_training_holdouts, get_training_data.
  • The shared fs.view("training") and the TrainingView class.
  • Write-only workbench_inference_metrics metadata and its internal loader.

Other

  • Regression UQ is now robust to FeatureSets without a smiles column: it fits the structure-free V0 model and skips the fingerprint-based V1/V2 (which still fit when smiles is present).
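The fallback described above amounts to a column check before choosing which UQ variants to fit. The sketch below shows only that selection logic; select_uq_variants is a hypothetical name, the exact-match lookup of "smiles" is an assumption, and none of this is Workbench's actual implementation.

```python
def select_uq_variants(columns):
    """Pick UQ variants: V0 always, V1/V2 only when a smiles column exists."""
    variants = ["V0"]  # structure-free, needs no smiles column
    if "smiles" in columns:  # assumption: exact, case-sensitive column name
        variants += ["V1", "V2"]  # fingerprint-based, require smiles
    return variants


with_smiles = select_uq_variants(["id", "smiles", "logp"])
without_smiles = select_uq_variants(["id", "logp"])
```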

Migration

Pipelines that called fs.set_sample_weights(...) before training should drop that call and pass sample_weights= to fs.to_model(...) / FeaturesToModel.transform(...) instead.

Questions?

The SuperCowPowers team is happy to answer any questions you may have about AWS and Workbench. Please contact us at workbench@supercowpowers.com or chat with us on Discord.