Release 0.8.23
Need Help?
The SuperCowPowers team is happy to give any assistance needed when setting up AWS and Workbench. So please contact us at workbench@supercowpowers.com or on chat us up on Discord
The Workbench framework continues to flex to support different real world use cases when operating a set of production machine learning pipelines.
Note: These release notes cover the changes from 0.8.22 to 0.8.23
General
Mostly bug fixes and minor API changes.
API Changes
-
Removing
auto_one_hotarg fromPandasToFeaturesandDataSource.to_features()When creating a
PandasToFeaturesobject or usingDataSource.to_features()there was an optional argumentauto_one_hot. This would try to automatically convert object/string columns to be one-hot encoded. In general this was only useful for 'toy' datasets but for more complex data we need to specify exactly which columns we want converted. -
Adding optional
one_hot_columnsarg toPandasToFeatures.set_input()andDataSource.to_features()When calling either of these FeatureSet creation methods you can now add an option arg
one_hot_columnsas a list of columns that you would like to be one-hot encoded.
Minor Bug Fixes
Our pandas dependency was outdated and causing an issue with an include_groups arg when outlier groups were computed. We've changed the requirements:
Improvements
The time to ingest new rows into a FeatureSet can take a LONG time. Calling the FeatureGroup AWS API and waiting on the results is what takes all the time.
There will hopefully be a series of optimizations around this process, the first one is simply increasing the number of workers/processes for the ingestion manager class.
feature_group.ingest(.., max_processes=8)
(has been changed to)
feature_group.ingest(..., max_processes=16, num_workers=4)
Questions?

The SuperCowPowers team is happy to answer any questions you may have about AWS and Workbench. Please contact us at workbench@supercowpowers.com or on chat us up on Discord