
Async Endpoints

Same Class, Different Mode

There is no separate AsyncEndpoint API class. Async behavior lives in AsyncEndpointCore and is invoked transparently by Endpoint when the underlying SageMaker endpoint is deployed in async mode.

Async endpoints support long-running inference (up to 60 minutes per invocation) and scale to zero when idle, so you only pay for compute during active batch runs. The API is the same as for a sync Endpoint: send a DataFrame, get a DataFrame back; the S3 round-trip is handled internally.

Async endpoint flow: S3 Upload → SageMaker → Uvicorn → FastAPI → Model → S3 Result
Async endpoints add an S3 I/O layer for long-running invocations and scale to zero when idle.
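To make the S3 I/O layer concrete, here is a minimal sketch of what an async invocation looks like at the raw boto3 level. The library handles all of this internally; `build_async_request` is a hypothetical helper for illustration, and the bucket paths are made up.

```python
def build_async_request(endpoint_name: str, input_s3_uri: str,
                        content_type: str = "text/csv") -> dict:
    """Assemble the arguments for sagemaker-runtime's invoke_endpoint_async.

    Unlike a sync invocation, the payload is not sent inline: the input must
    already be uploaded to S3, and the response only contains an
    OutputLocation where the result will eventually land.
    """
    return {
        "EndpointName": endpoint_name,
        "InputLocation": input_s3_uri,  # S3 URI of the uploaded input
        "ContentType": content_type,
    }

# With boto3 this would be used roughly as (not run here):
#   runtime = boto3.client("sagemaker-runtime")
#   response = runtime.invoke_endpoint_async(
#       **build_async_request("smiles-to-3d-full-v1", "s3://my-bucket/in.csv"))
#   result_uri = response["OutputLocation"]  # poll S3 until the result appears
```

The key difference from a sync call is that the response arrives asynchronously in S3, which is why the diagram above has an S3 leg on both ends.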

Quick Example

async_inference.py
from workbench.api import Endpoint

# Endpoint auto-detects the async deployment and routes accordingly
endpoint = Endpoint("smiles-to-3d-full-v1")
results_df = endpoint.inference(df)
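Under the hood, the DataFrame-in/DataFrame-out contract amounts to a CSV round-trip through S3. The sketch below shows that serialization step in isolation (the function names are illustrative, not part of the workbench API):

```python
import io

import pandas as pd

def to_payload(df: pd.DataFrame) -> bytes:
    """Serialize the input DataFrame to CSV bytes for the S3 upload."""
    return df.to_csv(index=False).encode("utf-8")

def from_payload(payload: bytes) -> pd.DataFrame:
    """Parse the CSV result downloaded from S3 back into a DataFrame."""
    return pd.read_csv(io.BytesIO(payload))
```

Because both legs are plain CSV, column names and order survive the round-trip; this is what lets `endpoint.inference(df)` hide the S3 staging entirely.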

Deploy a New Async Endpoint

deploy_async.py
from workbench.api import Model

model = Model("smiles-to-3d-full-v1")

# Deploy behind an async SageMaker endpoint (scales to zero when idle)
model.to_endpoint(async_endpoint=True, tags=["smiles", "3d descriptors", "full"])
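For orientation, an async deployment ultimately sets an AsyncInferenceConfig on the SageMaker endpoint config. The sketch below shows the shape of that structure as defined by the SageMaker CreateEndpointConfig API; the helper name, output path, and default concurrency are illustrative, not what `to_endpoint` actually uses.

```python
def async_inference_config(s3_output_path: str, max_concurrent: int = 4) -> dict:
    """Build a SageMaker-style AsyncInferenceConfig dict.

    OutputConfig.S3OutputPath is where invocation results are written;
    ClientConfig caps concurrent invocations per instance.
    """
    return {
        "OutputConfig": {"S3OutputPath": s3_output_path},
        "ClientConfig": {"MaxConcurrentInvocationsPerInstance": max_concurrent},
    }
```

Scaling to zero then comes from an autoscaling policy on the endpoint's instance count, which is why idle async endpoints incur no compute cost.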

Full Reference

For the full method list, deployment options, scaling configuration, and advanced usage, see AsyncEndpointCore.