Road Map v0.9.0
Need Help?
The SuperCowPowers team is happy to give any assistance needed when setting up AWS and SageWorks. So please contact us at sageworks@supercowpowers.com or chat us up on Discord.
The SageWorks framework continues to flex to support different real-world use cases when operating a set of production machine learning pipelines.
General
Streamlining
We've learned a lot from our beta testers!
One of the important lessons is not to 'over-manage' AWS. We want to provide useful, granular Classes and APIs, but getting out of the way is just as important as providing functionality. So streamlining will be a big part of our 0.9.0 roadmap.
Horizontal Scaling
The framework currently struggles with 10 parallel ML pipelines being run concurrently. When running simultaneous pipelines we're seeing AWS service/query contention, throttling, and the occasional 'partial state' that we get back from AWS.
Our plan for 0.9.0 is to have formalized horizontal stress testing that covers everything from 4 to 32 concurrent ML pipelines. Even though 32 may not seem like much, AWS has various quotas and limits that we'll be hitting, so 32 is a good goal for 0.9.0. Once we get to 32, we'll look toward an architecture that supports hundreds of concurrent pipelines.
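A sketch of what that stress harness might look like; run_pipeline is a hypothetical stand-in for a full DataSource -> FeatureSet -> Model -> Endpoint build:

from concurrent.futures import ProcessPoolExecutor

# Hypothetical pipeline build; replace with a real DataSource -> FeatureSet
# -> Model -> Endpoint construction
def run_pipeline(pipeline_id: int) -> bool:
    ...  # build the pipeline, return True on success
    return True

if __name__ == "__main__":
    # Ramp the concurrency from 4 up to the 0.9.0 goal of 32
    for n in (4, 8, 16, 32):
        with ProcessPoolExecutor(max_workers=n) as pool:
            results = list(pool.map(run_pipeline, range(n)))
        print(f"{n} concurrent pipelines: {sum(results)} succeeded")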
Full Artifact Load or None
For the SageWorks DataSource, FeatureSet, Model, and Endpoint classes, the new functionality will ensure that objects are only instantiated when all required data is fully available, returning None if the artifact ID is invalid or if the object is only partially constructed in AWS.
By preventing partially constructed objects, this approach reduces runtime errors when accessing incomplete attributes and simplifies error handling for clients, enhancing robustness and reliability. We are looking at Pydantic for formally capturing schemas and types (see Pydantic).
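Under that behavior, client code might look like this (a sketch; the import path and artifact name are assumptions):

import logging
from sageworks.api import FeatureSet   # assumed import path

log = logging.getLogger("sageworks")

# Constructors return None for invalid IDs or partially built artifacts
fs = FeatureSet("my_feature_set")
if fs is None:
    log.error("FeatureSet not fully available; skipping downstream steps")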
Onboarding
We'll have to balance 'full artifact or None' with the need to onboard() artifacts created outside of SageWorks. We'll probably have a class method for onboarding, something like the sketch below (the method name and signature are illustrative, not final):
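from sageworks.api import Model   # assumed import path

# Hypothetical class method for onboarding an existing AWS artifact
model = Model.onboard("model-created-outside-sageworks")
if model is None:
    print("Onboarding failed: artifact incomplete or not found")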
Caching
Caching needs a substantial overhaul. Right now SageWorks overuses caching: we baked it into our AWSServiceBroker, and that gets used by everything.
Caching only really makes sense when we can't wait for AWS to respond to requests. The Dashboard and Web Interfaces are the only use case where responsiveness is important. Other use cases, like nightly batch processing, scripts, or notebooks, will work totally fine waiting for AWS responses.
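One possible shape for this, where caching becomes opt-in for the responsive interfaces (the use_cache parameter is an assumption, not a settled API):

from sageworks.api import Meta   # assumed import path

dashboard_meta = Meta(use_cache=True)   # Dashboard/Web: responsiveness matters
batch_meta = Meta()                     # batch jobs, scripts, notebooks: just wait for AWS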
Class/API Reductions
The organic growth of SageWorks was driven by user feedback and testing, and that organic growth has led to an abundance of Classes and API calls. We'll be identifying classes and methods that are 'cruft' from some past development push and deprecating those.
Deprecation Warnings
We're starting to put in deprecation warnings as we streamline classes and APIs. If you're using a class or method that's going to be deprecated, you'll see a log message like this:
broker = AWSServiceBroker()
WARNING AWSServiceBroker is deprecated and will be removed in version 0.9.
If you're using a class that's NOT going to be deprecated but that currently uses/relies on one that is, you'll still get a warning that you can ignore (the developers will take care of it).
# This class is NOT deprecated but an internal class is
meta = Meta()
WARNING AWSServiceBroker is deprecated and will be removed in version 0.9.
In general these warning messages will be a bit annoying, but they will help us smoothly transition and streamline our Classes and APIs.
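If the warnings clutter your batch logs in the meantime, a standard logging filter can silence the known, safe-to-ignore message (this sketch assumes SageWorks logs under the "sageworks" logger name):

import logging

# Sketch: drop the known deprecation message from batch logs
class IgnoreBrokerDeprecation(logging.Filter):
    def filter(self, record):
        return "AWSServiceBroker is deprecated" not in record.getMessage()

logging.getLogger("sageworks").addFilter(IgnoreBrokerDeprecation())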
Deprecations
- AWSServiceBroker: Replaced by the Meta() class
- Other Stuff
API Changes
Meta()
The new Meta() class will provide an API that aligns with the AWS list and describe APIs. We'll have functionality for listing objects (models, feature sets, etc.) and then functionality around the details for a named artifact.
meta = Meta()
models_list = meta.models() # List API
end_list = meta.endpoints() # List API
fs_dict = meta.feature_set("my_fs") # Describe API
model_dict = meta.model("my_model") # Describe API
The new Meta() API will be used inside the Artifact classes (see Internal Changes: Artifacts below).
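Continuing the example above, a broad list followed by deep describes might look like this (the return shapes, names from the list call and a dict from the describe call, are assumptions):

# Sketch: broad query (list) followed by deep queries (describe)
for model_name in meta.models():       # assumes the list yields model names
    details = meta.model(model_name)   # per-artifact details as a dict
    print(model_name, details.get("status", "unknown"))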
Improvements/Fixes
FeatureSet
When running concurrent ML pipelines we occasionally get a partially constructed FeatureSet. FeatureSets will now 'wait and fail' if they detect partially constructed data (like offline storage not being ready).
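The 'wait and fail' pattern is roughly this (the readiness check is a hypothetical helper):

import time

def wait_for_offline_storage(feature_set, timeout=300, poll=15):
    """Poll until the FeatureSet's offline storage is ready, or give up."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if feature_set.offline_storage_ready():   # hypothetical readiness check
            return True
        time.sleep(poll)
    return False   # treated as a failed load: the caller gets back None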
Internal Changes
Meta()
We're going to make a bunch of changes to Meta(), specifically around more granular (i.e., LESS) caching. There will also be an AWSMeta() subclass that manages the AWS-specific API calls. We'll also put stubs in for AzureMeta() and GCPMeta(), because hey, we might have a client who really wants that flexibility.
The new Meta class will also include an API that's more aligned with the AWS list and describe interfaces, allowing both broad and deep queries of the Machine Learning Artifacts within AWS.
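A rough sketch of the planned hierarchy (class bodies are illustrative only):

class Meta:
    """Cloud-agnostic list/describe API for ML artifacts"""
    def models(self):
        raise NotImplementedError

class AWSMeta(Meta):
    """Manages the AWS-specific list/describe calls (SageMaker, Glue, etc.)"""
    def models(self):
        ...   # boto3-backed implementation goes here

class AzureMeta(Meta):
    """Stub: placeholder for a possible Azure backend"""

class GCPMeta(Meta):
    """Stub: placeholder for a possible GCP backend"""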
Artifacts
We're getting rid of caching for individual Artifacts. If you're constructing an artifact object, you probably want detailed information that's up to date, and waiting a bit is probably fine. Note: we'll still make these instantiations as fast as we can; removing the caching logic will at least simplify the implementations.
Questions?
The SuperCowPowers team is happy to answer any questions you may have about AWS and SageWorks. Please contact us at sageworks@supercowpowers.com or chat us up on Discord.