Release 0.8.23

Need Help?

The SuperCowPowers team is happy to give any assistance needed when setting up AWS and SageWorks. So please contact us at sageworks@supercowpowers.com or on chat us up on Discord

The SageWorks framework continues to flex to support different real world use cases when operating a set of production machine learning pipelines.

Note: These release notes cover the changes from 0.8.22 to 0.8.23

General

Mostly bug fixes and minor API changes.

API Changes

Removing auto_one_hot arg from PandasToFeatures and DataSource.to_features()

When creating a PandasToFeatures object or using DataSource.to_features() there was an optional argument auto_one_hot. This would try to automatically convert object/string columns to be one-hot encoded. In general this was only useful for 'toy' datasets but for more complex data we need to specify exactly which columns we want converted.
Adding optional one_hot_columns arg to PandasToFeatures.set_input() and DataSource.to_features()

When calling either of these FeatureSet creation methods you can now add an option arg one_hot_columns as a list of columns that you would like to be one-hot encoded.

Minor Bug Fixes

Our pandas dependency was outdated and causing an issue with an include_groups arg when outlier groups were computed. We've changed the requirements:

pandas>=2.1.2
to
pandas>=2.2.1

We also have a ticket for the logic change so that we avoid the deprecation warning.

Improvements

The time to ingest new rows into a FeatureSet can take a LONG time. Calling the FeatureGroup AWS API and waiting on the results is what takes all the time.

There will hopefully be a series of optimizations around this process, the first one is simply increasing the number of workers/processes for the ingestion manager class.

feature_group.ingest(.., max_processes=8)
(has been changed to)
feature_group.ingest(..., max_processes=16, num_workers=4)

Questions?

The SuperCowPowers team is happy to answer any questions you may have about AWS and SageWorks. Please contact us at sageworks@supercowpowers.com or on chat us up on Discord