Monitor

Monitor Examples

Examples of using the Monitor class are listed in the Examples section at the bottom of this page.

Monitor: Manages AWS Endpoint Monitor creation and deployment. Endpoint Monitors are set up and provisioned for deployment into AWS. Monitors can be viewed in the AWS SageMaker interfaces or in the SageWorks Dashboard UI, which provides additional monitor details and performance metrics.

`Monitor`

Bases: MonitorCore

Monitor: SageWorks Monitor API Class

Common Usage

mon = Endpoint(name).get_monitor()  # Pull from endpoint OR
mon = Monitor(name)                 # Create using Endpoint Name
mon.summary()
mon.details()

# One time setup methods
mon.add_data_capture()
mon.create_baseline()
mon.create_monitoring_schedule()

# Pull information from the monitor
baseline_df = mon.get_baseline()
constraints_df = mon.get_constraints()
stats_df = mon.get_statistics()
input_df, output_df = mon.get_latest_data_capture()

Source code in src/sageworks/api/monitor.py

class Monitor(MonitorCore):
    """Monitor: SageWorks Monitor API Class

    Common Usage:
       ```
       mon = Endpoint(name).get_monitor()  # Pull from endpoint OR
       mon = Monitor(name)                 # Create using Endpoint Name
       mon.summary()
       mon.details()

       # One time setup methods
       mon.add_data_capture()
       mon.create_baseline()
       mon.create_monitoring_schedule()

       # Pull information from the monitor
       baseline_df = mon.get_baseline()
       constraints_df = mon.get_constraints()
       stats_df = mon.get_statistics()
       input_df, output_df = mon.get_latest_data_capture()
       ```
    """

    def summary(self) -> dict:
        """Monitor Summary

        Returns:
            dict: A dictionary of summary information about the Monitor
        """
        return super().summary()

    def details(self) -> dict:
        """Monitor Details

        Returns:
            dict: A dictionary of details about the Monitor
        """
        return super().details()

    def add_data_capture(self, capture_percentage=100):
        """
        Add data capture configuration for this Monitor/endpoint.

        Args:
            capture_percentage (int): Percentage of data to capture. Defaults to 100.
        """
        super().add_data_capture(capture_percentage)

    def create_baseline(self, recreate: bool = False):
        """Code to create a baseline for monitoring

        Args:
            recreate (bool): If True, recreate the baseline even if it already exists

        Notes:
            This will create/write three files to the baseline_dir:
            - baseline.csv
            - constraints.json
            - statistics.json
        """
        super().create_baseline(recreate)

    def create_monitoring_schedule(self, schedule: str = "hourly", recreate: bool = False):
        """
        Sets up the monitoring schedule for the model endpoint.

        Args:
            schedule (str): The schedule for the monitoring job (hourly or daily, defaults to hourly).
            recreate (bool): If True, recreate the monitoring schedule even if it already exists.
        """
        super().create_monitoring_schedule(schedule, recreate)

    def get_latest_data_capture(self) -> (pd.DataFrame, pd.DataFrame):
        """
        Get the latest data capture input and output from S3.

        Returns:
            DataFrame (input), DataFrame(output): Flattened and processed DataFrames for input and output data.
        """
        return super().get_latest_data_capture()

    def get_baseline(self) -> Union[pd.DataFrame, None]:
        """Code to get the baseline CSV from the S3 baseline directory

        Returns:
            pd.DataFrame: The baseline CSV as a DataFrame (None if it doesn't exist)
        """
        return super().get_baseline()

    def get_constraints(self) -> Union[pd.DataFrame, None]:
        """Code to get the constraints from the baseline

        Returns:
           pd.DataFrame: The constraints from the baseline (constraints.json) (None if it doesn't exist)
        """
        return super().get_constraints()

    def get_statistics(self) -> Union[pd.DataFrame, None]:
        """Code to get the statistics from the baseline

        Returns:
            pd.DataFrame: The statistics from the baseline (statistics.json) (None if it doesn't exist)
        """
        return super().get_statistics()

`add_data_capture(capture_percentage=100)`

Add data capture configuration for this Monitor/endpoint.

Parameters:

Name	Type	Description	Default
`capture_percentage`	`int`	Percentage of data to capture. Defaults to 100.	`100`

Source code in src/sageworks/api/monitor.py

def add_data_capture(self, capture_percentage=100):
    """
    Add data capture configuration for this Monitor/endpoint.

    Args:
        capture_percentage (int): Percentage of data to capture. Defaults to 100.
    """
    super().add_data_capture(capture_percentage)
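
Since `capture_percentage` is a percentage, it can help to validate the value before passing it along. The following is a minimal sketch, not part of SageWorks: the helper name `checked_capture_percentage` is hypothetical, and the accepted range of 0 to 100 inclusive is an assumption that the source does not state.

```python
def checked_capture_percentage(value: int) -> int:
    """Return value unchanged if it is a whole-number percentage in [0, 100].

    Hypothetical helper; the 0-100 range is an assumption, not documented behavior.
    """
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"capture_percentage must be an int, got {type(value).__name__}")
    if not 0 <= value <= 100:
        raise ValueError(f"capture_percentage must be between 0 and 100, got {value}")
    return value
```

With a real monitor this could be used as `mon.add_data_capture(checked_capture_percentage(25))`.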

`create_baseline(recreate=False)`

Code to create a baseline for monitoring

Parameters:

Name	Type	Description	Default
`recreate`	`bool`	If True, recreate the baseline even if it already exists	`False`

Notes

This will create/write three files to the baseline_dir:
- baseline.csv
- constraints.json
- statistics.json

Source code in src/sageworks/api/monitor.py

def create_baseline(self, recreate: bool = False):
    """Code to create a baseline for monitoring

    Args:
        recreate (bool): If True, recreate the baseline even if it already exists

    Notes:
        This will create/write three files to the baseline_dir:
        - baseline.csv
        - constraints.json
        - statistics.json
    """
    super().create_baseline(recreate)
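
The getters documented on this page return `None` when their baseline artifact does not exist, so they can be used to check whether `create_baseline()` still needs to run. The sketch below is illustrative only: `missing_baseline_files` and `BASELINE_GETTERS` are hypothetical names, and `mon` is assumed to be any object exposing the three getter methods.

```python
from typing import Any, List

# Getter method name -> the file that create_baseline() is documented to write
BASELINE_GETTERS = {
    "get_baseline": "baseline.csv",
    "get_constraints": "constraints.json",
    "get_statistics": "statistics.json",
}


def missing_baseline_files(mon: Any) -> List[str]:
    """List the baseline files whose getter returned None (hypothetical helper)."""
    missing = []
    for getter_name, file_name in BASELINE_GETTERS.items():
        # Each getter returns a DataFrame, or None if the artifact is absent
        if getattr(mon, getter_name)() is None:
            missing.append(file_name)
    return missing
```

A possible use is `if missing_baseline_files(mon): mon.create_baseline()`, which avoids recreating a baseline that is already in place.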

`create_monitoring_schedule(schedule='hourly', recreate=False)`

Sets up the monitoring schedule for the model endpoint.

Parameters:

Name	Type	Description	Default
`schedule`	`str`	The schedule for the monitoring job (hourly or daily, defaults to hourly).	`'hourly'`
`recreate`	`bool`	If True, recreate the monitoring schedule even if it already exists.	`False`

Source code in src/sageworks/api/monitor.py

def create_monitoring_schedule(self, schedule: str = "hourly", recreate: bool = False):
    """
    Sets up the monitoring schedule for the model endpoint.

    Args:
        schedule (str): The schedule for the monitoring job (hourly or daily, defaults to hourly).
        recreate (bool): If True, recreate the monitoring schedule even if it already exists.
    """
    super().create_monitoring_schedule(schedule, recreate)
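
The docstring names two schedule values, hourly and daily. As a small sketch, a schedule string could be normalized and checked before the call; `normalize_schedule` and `VALID_SCHEDULES` are hypothetical names, and how SageWorks itself treats other values is not stated in the source.

```python
VALID_SCHEDULES = ("hourly", "daily")  # the two values named in the docstring


def normalize_schedule(schedule: str) -> str:
    """Lower-case and trim a schedule string, rejecting unknown values."""
    cleaned = schedule.strip().lower()
    if cleaned not in VALID_SCHEDULES:
        raise ValueError(f"schedule must be one of {VALID_SCHEDULES}, got {schedule!r}")
    return cleaned
```

For example, `mon.create_monitoring_schedule(schedule=normalize_schedule("Daily"))` would pass `"daily"` to the monitor.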

`details()`

Monitor Details

Returns:

Name	Type	Description
`dict`	`dict`	A dictionary of details about the Monitor

Source code in src/sageworks/api/monitor.py

def details(self) -> dict:
    """Monitor Details

    Returns:
        dict: A dictionary of details about the Monitor
    """
    return super().details()
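
Because `details()` and `summary()` both return a plain `dict`, the result can be rendered for reading with the standard library. This is a minimal sketch: `format_info` is a hypothetical helper, and falling back to `str` for values that JSON cannot encode (timestamps, for instance) is a choice made here, not SageWorks behavior.

```python
import json
from typing import Any, Dict


def format_info(info: Dict[str, Any]) -> str:
    """Render a summary/details dictionary as indented, key-sorted JSON text."""
    # default=str converts values that json cannot encode natively into strings
    return json.dumps(info, indent=2, sort_keys=True, default=str)
```

Typical use would be `print(format_info(mon.details()))`.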

`get_baseline()`

Code to get the baseline CSV from the S3 baseline directory

Returns:

Type	Description
`Union[DataFrame, None]`	The baseline CSV as a DataFrame (None if it doesn't exist)

Source code in src/sageworks/api/monitor.py

def get_baseline(self) -> Union[pd.DataFrame, None]:
    """Code to get the baseline CSV from the S3 baseline directory

    Returns:
        pd.DataFrame: The baseline CSV as a DataFrame (None if it doesn't exist)
    """
    return super().get_baseline()

`get_constraints()`

Code to get the constraints from the baseline

Returns:

Type	Description
`Union[DataFrame, None]`	The constraints from the baseline (constraints.json) (None if it doesn't exist)

Source code in src/sageworks/api/monitor.py

def get_constraints(self) -> Union[pd.DataFrame, None]:
    """Code to get the constraints from the baseline

    Returns:
       pd.DataFrame: The constraints from the baseline (constraints.json) (None if it doesn't exist)
    """
    return super().get_constraints()

`get_latest_data_capture()`

Get the latest data capture input and output from S3.

Returns:

Type	Description
`(DataFrame, DataFrame)`	Flattened and processed DataFrames for the input and output data, in that order.

Source code in src/sageworks/api/monitor.py

def get_latest_data_capture(self) -> (pd.DataFrame, pd.DataFrame):
    """
    Get the latest data capture input and output from S3.

    Returns:
        DataFrame (input), DataFrame(output): Flattened and processed DataFrames for input and output data.
    """
    return super().get_latest_data_capture()

`get_statistics()`

Code to get the statistics from the baseline

Returns:

Type	Description
`Union[DataFrame, None]`	The statistics from the baseline (statistics.json) (None if it doesn't exist)

Source code in src/sageworks/api/monitor.py

def get_statistics(self) -> Union[pd.DataFrame, None]:
    """Code to get the statistics from the baseline

    Returns:
        pd.DataFrame: The statistics from the baseline (statistics.json) (None if it doesn't exist)
    """
    return super().get_statistics()

`summary()`

Monitor Summary

Returns:

Name	Type	Description
`dict`	`dict`	A dictionary of summary information about the Monitor

Source code in src/sageworks/api/monitor.py

def summary(self) -> dict:
    """Monitor Summary

    Returns:
        dict: A dictionary of summary information about the Monitor
    """
    return super().summary()

Examples

Initial Setup of the Endpoint Monitor

monitor_setup.py

from sageworks.api.monitor import Monitor

# Create an Endpoint Monitor Class and perform initial Setup
endpoint_name = "abalone-regression-end-rt"
mon = Monitor(endpoint_name)

# Add data capture to the endpoint
mon.add_data_capture(capture_percentage=100)

# Create a baseline for monitoring
mon.create_baseline()

# Set up the monitoring schedule
mon.create_monitoring_schedule(schedule="hourly")
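
The three setup calls above can also be grouped into one function so that the same settings are applied consistently across endpoints. This is a sketch under stated assumptions: `setup_monitor` is a hypothetical wrapper, not part of SageWorks, and `mon` is assumed to be a `Monitor` (or any object with the same three methods). The call order follows the example above.

```python
from typing import Any


def setup_monitor(
    mon: Any,
    capture_percentage: int = 100,
    schedule: str = "hourly",
    recreate: bool = False,
) -> None:
    """Run the one-time setup steps in the order shown on this page."""
    # 1. Enable data capture on the endpoint
    mon.add_data_capture(capture_percentage)
    # 2. Create the baseline (recreate=True rebuilds an existing one)
    mon.create_baseline(recreate=recreate)
    # 3. Create the monitoring schedule ("hourly" or "daily")
    mon.create_monitoring_schedule(schedule=schedule, recreate=recreate)
```

With the endpoint from the example, this would be called as `setup_monitor(Monitor("abalone-regression-end-rt"), schedule="daily")`.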

Pulling Information from an Existing Monitor

monitor_usage.py

from sageworks.api.monitor import Monitor
from sageworks.api.endpoint import Endpoint

# Construct a Monitor in one of two ways (both refer to the same endpoint)
mon = Endpoint("abalone-regression-end-rt").get_monitor()
mon = Monitor("abalone-regression-end-rt")

# Check the summary and details of the monitor
print(mon.summary())
print(mon.details())

# Check the baseline outputs (baseline, constraints, statistics)
# Each getter returns None if the baseline has not been created yet
base_df = mon.get_baseline()
if base_df is not None:
    print(base_df.head())

constraints_df = mon.get_constraints()
if constraints_df is not None:
    print(constraints_df.head())

statistics_df = mon.get_statistics()
if statistics_df is not None:
    print(statistics_df.head())

# Get the latest data capture (inputs and outputs)
input_df, output_df = mon.get_latest_data_capture()
print(input_df.head())
print(output_df.head())

SageWorks UI

Running these few lines of code creates and deploys an AWS Endpoint Monitor. The Monitor status and outputs can be viewed in the SageMaker Console interfaces or in the SageWorks Dashboard UI. SageWorks will use the monitor to track various metrics, including Data Quality and Model Bias.

[Screenshot: SageWorks Dashboard: Endpoints]

Not Finding a particular method?

The SageWorks API Classes use the 'Core' Classes internally, so for an extensive listing of all available methods, see: SageWorks Core Classes