Release 0.8.50
Need Help?
The SuperCowPowers team is happy to give any assistance needed when setting up AWS and SageWorks. So please contact us at sageworks@supercowpowers.com or chat us up on Discord.
The SageWorks framework continues to flex to support different real-world use cases when operating a set of production machine learning pipelines.
Note: These release notes cover the changes from 0.8.46 to 0.8.50.
General
This release is an incremental release as part of the roadmap for v0.9.0. Please see the full details of the planned changes here: v0.9.0 Roadmap.
FeatureSet: id_column lock in
We're going to lock in id_columns when FeatureSets are created. AWS FeatureGroup requires an id column, so this is the best place to do it. See API Changes below.
FeatureSet: Robust handling of training column
In the past we haven't supported a training column in the input data. FeatureSets are read-only, so locking in the training rows is 'suboptimal': in general you might want to use the same FeatureSet for several models with different training/hold_out sets. Now, if a FeatureSet detects a training column, it gives the following message:
Training column detected: Since FeatureSets are read only, SageWorks
creates training views that can be dynamically changed. We'll use
this training column to create a training view.
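To illustrate the idea behind a training view, here is a minimal pandas sketch of how a boolean training column can drive a train/hold_out split without changing the underlying data. The column name `training`, the sample data, and the helper `split_on_training_column` are hypothetical and are not part of the SageWorks API.

```python
import pandas as pd


def split_on_training_column(df: pd.DataFrame, column: str = "training"):
    """Return (train, hold_out) frames based on a boolean column (hypothetical helper)."""
    mask = df[column].astype(bool)
    # The source frame is left untouched; each split is a new frame without the flag column
    train = df[mask].drop(columns=[column])
    hold_out = df[~mask].drop(columns=[column])
    return train, hold_out


df = pd.DataFrame(
    {
        "my_id": [1, 2, 3, 4],
        "feature": [0.1, 0.2, 0.3, 0.4],
        "training": [True, True, False, True],
    }
)
train_df, hold_out_df = split_on_training_column(df)
```

Because the split is computed from the flag rather than baked into the stored rows, a different model could use a different flag column against the same data.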
Endpoint: auto_inference()
We're changing the internal logic for the auto_inference() method to include the id_column in its output.
API Changes
FeatureSet
When creating a FeatureSet the id_column is now a required argument.
```
# Explicit id column
to_features = PandasToFeatures("my_feature_set")
to_features.set_input(df_features, id_column="my_id")  # <-- Required
to_features.set_output_tags(["blah", "whatever"])
to_features.transform()

# Auto-generated id column
to_features = PandasToFeatures("my_feature_set")
to_features.set_input(df_features, id_column="auto")  # <-- Auto Id (index)
```
For more details see: FeatureSet Class
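As a rough picture of what an index-based id could look like, the sketch below copies a DataFrame's positional index into a new column. This is only an assumption about the general idea behind `id_column="auto"`; the helper `add_auto_id` and the column name `auto_id` are hypothetical, and SageWorks may generate its id differently.

```python
import pandas as pd


def add_auto_id(df: pd.DataFrame, id_name: str = "auto_id") -> pd.DataFrame:
    """Copy a fresh 0..n-1 index into a new id column (hypothetical helper)."""
    if id_name in df.columns:
        raise ValueError(f"Column '{id_name}' already exists")
    # Reset to a clean positional index so the ids are unique and contiguous
    out = df.reset_index(drop=True)
    out.insert(0, id_name, out.index.to_numpy())
    return out


df_features = pd.DataFrame({"height": [1.2, 3.4, 5.6], "weight": [10, 20, 30]})
df_with_id = add_auto_id(df_features)
```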
The new Meta() API will be used inside of the Artifact classes (see Internal Changes...Artifacts... below)
Improvements
DFStore
Robust handling of slashes, so now it will 'just work' with various upserts and gets:
```
# These all give you the /ml/shap_values dataframe
df_store.get("/ml/shap_values")
df_store.get("ml/shap_values")
df_store.get("//ml/shap_values")
```
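One way to get this kind of behavior is to normalize every location before it is used as a key. The sketch below is an assumption about the general technique, not the DFStore implementation; `normalize_location` is a hypothetical helper.

```python
def normalize_location(location: str) -> str:
    """Collapse repeated slashes and ensure exactly one leading slash (hypothetical helper)."""
    # Splitting on "/" and dropping empty parts removes leading, trailing, and doubled slashes
    parts = [part for part in location.split("/") if part]
    return "/" + "/".join(parts)


for raw in ["/ml/shap_values", "ml/shap_values", "//ml/shap_values"]:
    print(raw, "->", normalize_location(raw))
```

Normalizing on both the write path and the read path means an upsert and a later get resolve to the same key regardless of how the caller typed the slashes.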
Internal Changes
There's a whole new directory structure that helps isolate Cloud Platform specific functionality.
- The DFStore now uses AWSDFStore as its concrete implementation class.
- Both CachedMeta and AWSAccountClamp have had a revamp of their singleton logic.
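These notes don't show the singleton logic itself. For readers unfamiliar with the pattern, here is a generic, thread-safe Python singleton sketch. The class name `ExampleClamp` and its `connections` attribute are hypothetical; this is not the CachedMeta or AWSAccountClamp code.

```python
import threading


class ExampleClamp:
    """Hypothetical singleton: every instantiation returns the same object."""

    _instance = None
    _lock = threading.Lock()

    def __new__(cls):
        # Double-checked locking so concurrent callers create only one instance
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    instance = super().__new__(cls)
                    instance._initialized = False
                    cls._instance = instance
        return cls._instance

    def __init__(self):
        # __init__ runs on every call, so guard the one-time setup
        if self._initialized:
            return
        self.connections = 0
        self._initialized = True


a = ExampleClamp()
a.connections += 1
b = ExampleClamp()  # Same object as `a`; state set through `a` is visible here
```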
Internal Caching
As part of our v0.9.0 Roadmap we're continuing to revamp caching. We're experimenting with the CachedMeta class inside the Artifact classes. Caching continues to be challenging for the framework: it's an absolute must for Web Interface/UI performance, but it needs to get out of the way for batch jobs and the concurrent building of ML pipelines.
Specific Code Changes
Who doesn't like looking at code! Also +3 points for getting down this far! Here's a cow joke as a reward:
That feeling like you’ve done this before? .... Deja-moo
Questions?
The SuperCowPowers team is happy to answer any questions you may have about AWS and SageWorks. Please contact us at sageworks@supercowpowers.com or chat us up on Discord.