InferenceCache
InferenceCache is a caching wrapper around a Workbench Endpoint. It's handy when an endpoint is slow to invoke and the same inputs show up across calls — the motivating example is the 3D molecular feature endpoint smiles-to-3d-descriptors-v1, which takes real time to generate conformers and force-field optimize each molecule.
On each inference(df) call, rows whose cache-key value is already in the cache are served from S3, and only the new rows go to the underlying endpoint. Newly-computed rows are written back to the cache. The cache lives in a shared S3-backed DFStore, so once one person has computed a row, everyone gets it for free.
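The hit/miss split described above can be sketched in a few lines of pandas. This is an illustrative sketch, not the actual Workbench source; the DataFrames and logP values are hypothetical.

```python
import pandas as pd

def split_cache_hits(eval_df: pd.DataFrame, cache_df: pd.DataFrame, key: str = "smiles"):
    """Partition eval_df into cached rows (served from the cache) and
    new rows (sent to the endpoint)."""
    hit_mask = eval_df[key].isin(cache_df[key])
    return eval_df[hit_mask], eval_df[~hit_mask]

# Hypothetical cache contents: two molecules already computed
cache_df = pd.DataFrame({"smiles": ["CCO", "CCN"], "logp": [-0.31, -0.57]})
eval_df = pd.DataFrame({"smiles": ["CCO", "CCN", "CCOCC"]})

hits, misses = split_cache_hits(eval_df, cache_df)
print(f"{len(hits)}/{len(eval_df)} cache hits")  # 2/3 cache hits
```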
Not the same as workbench.cached.CachedEndpoint
CachedEndpoint caches metadata methods like summary(), details(), and health_check(). InferenceCache caches inference results. Different classes, different concerns.
Example
from workbench.api import Endpoint, FeatureSet, InferenceCache
# Wrap a slow endpoint in an InferenceCache
endpoint = Endpoint("smiles-to-3d-descriptors-v1")
cached_endpoint = InferenceCache(endpoint, cache_key_column="smiles")
# Pull a DataFrame of molecules and run inference
df = FeatureSet("feature_endpoint_fs").pull_dataframe()[:50]
# First call: slow (cache is empty, rows go to the endpoint)
results = cached_endpoint.inference(df)
# Second call with the same SMILES: near-instant (all hits)
results_again = cached_endpoint.inference(df)
# Drop a bad row so it recomputes on the next call
cached_endpoint.delete_entries("c1ccc(cc1)C(=O)O")
# Or drop many at once
cached_endpoint.delete_entries(["CCO", "CCN", "CCOCC"])
# Inspect the cache
print(cached_endpoint.cache_size())
print(cached_endpoint.cache_info())
Output (log lines)
InferenceCache[smiles-to-3d-descriptors-v1]: 0/50 cache hits
InferenceCache[smiles-to-3d-descriptors-v1]: computing 50 new rows via endpoint
InferenceCache[smiles-to-3d-descriptors-v1]: 50/50 cache hits
InferenceCache[smiles-to-3d-descriptors-v1]: removed 1 entries
InferenceCache[smiles-to-3d-descriptors-v1]: removed 3 entries
Endpoint change detection
By default, InferenceCache keeps the existing cache regardless of endpoint changes. If you want it to automatically clear the cache when the endpoint has been modified since the cache was last written, pass auto_invalidate_cache=True at construction.
A tiny sidecar manifest stores the endpoint's modified() timestamp; when auto-invalidation is enabled, the cache is cleared on the next access if the stored and current timestamps differ.
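The invalidation decision reduces to a timestamp comparison. A minimal sketch of that check (not the actual source; timestamps are hypothetical):

```python
from datetime import datetime, timezone

def should_invalidate(stored_modified, current_modified, auto_invalidate: bool) -> bool:
    """Clear the cache only when auto-invalidation is enabled and the
    endpoint's modified() timestamp differs from the one in the manifest."""
    return auto_invalidate and stored_modified != current_modified

t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)  # timestamp stored in the manifest
t1 = datetime(2024, 6, 1, tzinfo=timezone.utc)  # endpoint's current modified()

print(should_invalidate(t0, t1, auto_invalidate=True))   # endpoint changed -> True
print(should_invalidate(t0, t1, auto_invalidate=False))  # default: keep cache -> False
print(should_invalidate(t0, t0, auto_invalidate=True))   # unchanged -> False
```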
Attribute delegation
InferenceCache forwards anything it doesn't define to the wrapped endpoint, so cached_endpoint.name, cached_endpoint.details(), cached_endpoint.fast_inference(), etc. all Just Work.
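The delegation pattern behind this is Python's `__getattr__`, which is only invoked when normal attribute lookup fails on the wrapper. A minimal sketch (the wrapper and fake endpoint classes here are illustrative, not Workbench classes):

```python
class DelegatingWrapper:
    """Anything this wrapper doesn't define falls through to the wrapped object."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Called only when the attribute isn't found on the wrapper itself
        return getattr(self._wrapped, name)

class FakeEndpoint:
    name = "smiles-to-3d-descriptors-v1"

    def details(self):
        return {"status": "ok"}

wrapper = DelegatingWrapper(FakeEndpoint())
print(wrapper.name)       # -> smiles-to-3d-descriptors-v1
print(wrapper.details())  # -> {'status': 'ok'}
```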
API Reference
InferenceCache: Client-side caching wrapper around a Workbench Endpoint.
Wraps an Endpoint and stores inference results in a shared S3-backed
DFStore keyed on a cache-key column (SMILES by default). On each
inference(df) call, rows whose cache-key value is already in the cache
are served from S3, and only the remaining rows are sent to the underlying
endpoint. Newly computed rows are written back to the cache.
Motivating use case: the smiles-to-3d-descriptors-v1 feature endpoint is
slow (conformer generation + FF optimization), and the same SMILES is
frequently re-computed across calls.
Note: this is distinct from workbench.cached.CachedEndpoint, which
caches metadata methods (summary, details, health_check). This
class caches inference results.
InferenceCache
InferenceCache: Client-side caching wrapper for a Workbench Endpoint.
Common Usage
from workbench.api import Endpoint
from workbench.api.inference_cache import InferenceCache
endpoint = Endpoint("smiles-to-3d-descriptors-v1")
cached_endpoint = InferenceCache(endpoint, cache_key_column="smiles")
# Drop-in replacement for endpoint.inference()
result_df = cached_endpoint.inference(eval_df)
# Other endpoint methods still work via attribute delegation
print(cached_endpoint.name)
cached_endpoint.details()
Source code in src/workbench/api/inference_cache.py
__getattr__(name)
Delegate any unrecognized attribute access to the wrapped Endpoint.
Source code in src/workbench/api/inference_cache.py
__init__(endpoint, cache_key_column='smiles', output_key_column=None, auto_invalidate_cache=False)
Initialize the InferenceCache.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `endpoint` | `Endpoint` | The Workbench Endpoint to wrap. | required |
| `cache_key_column` | `str` | Name of the column whose values are used as the cache key. | `'smiles'` |
| `output_key_column` | `Optional[str]` | Name of the column in the endpoint's output that contains the original input key values. Some endpoints normalize/canonicalize the key column (e.g. canonical SMILES) and place the original value in a separate column (e.g. `"orig_smiles"`). When set, the cache uses this column's values as the key so future lookups with the original input values still hit. When `None` (default), the cache key column in the output is assumed to match the input unchanged. | `None` |
| `auto_invalidate_cache` | `bool` | When `True`, automatically clear the cache if the endpoint has been modified since the cache was last written. When `False` (default), the existing cache is kept regardless of endpoint changes; the manifest is reseeded on first load so subsequent calls have a consistent baseline. | `False` |
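The `output_key_column` behavior can be illustrated with a small sketch. The function and DataFrame below are hypothetical, but show why keying the cache on the original input values matters when an endpoint canonicalizes SMILES:

```python
import pandas as pd

def cache_keys(output_df, cache_key_column="smiles", output_key_column=None):
    """Sketch: which column's values key the cache. When output_key_column is
    set, use it (the caller's original inputs); otherwise use cache_key_column."""
    key_col = output_key_column or cache_key_column
    return output_df[key_col].tolist()

# Hypothetical endpoint output: "smiles" was canonicalized, and the caller's
# original string was preserved in "orig_smiles"
out = pd.DataFrame({"smiles": ["OCC"], "orig_smiles": ["CCO"], "energy": [1.2]})

print(cache_keys(out))                                   # ['OCC'] -- a future lookup with "CCO" would miss
print(cache_keys(out, output_key_column="orig_smiles"))  # ['CCO'] -- future lookups with "CCO" hit
```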
Source code in src/workbench/api/inference_cache.py
cache_info()
Summary of the cache: path, row count, columns, manifest.
Source code in src/workbench/api/inference_cache.py
cache_size()
Number of rows currently in the cache.
clear_cache()
Delete the cache (and manifest) from S3 and reset in-memory state.
Source code in src/workbench/api/inference_cache.py
delete_entries(keys)
Remove one or more entries from the cache by cache-key value(s).
Use this to drop bad results that should be recomputed on the next
inference() call.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `keys` | `Union[Any, Iterable[Any]]` | A single cache-key value, or an iterable of them. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| `int` | `int` | Number of rows removed from the cache. |
Source code in src/workbench/api/inference_cache.py
inference(eval_df, **kwargs)
Run cached inference on eval_df.
Rows whose cache_key_column value is already in the cache are
served from S3; the rest are sent to the underlying endpoint and the
new results are written back to the cache. The returned DataFrame
preserves the original row order of eval_df.
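The order-preserving assembly can be sketched with pandas. This is illustrative, not the actual source: cached and freshly computed rows are concatenated, then left-joined back onto `eval_df` so the caller's row order survives. Data is hypothetical.

```python
import pandas as pd

eval_df = pd.DataFrame({"smiles": ["CCN", "CCO", "CCOCC"]})
cached = pd.DataFrame({"smiles": ["CCO"], "energy": [1.0]})            # served from S3
fresh = pd.DataFrame({"smiles": ["CCN", "CCOCC"], "energy": [2.0, 3.0]})  # from the endpoint

# Left join onto eval_df: pandas keeps the left frame's row order
results = pd.concat([cached, fresh], ignore_index=True)
ordered = eval_df.merge(results, on="smiles", how="left")
print(ordered["smiles"].tolist())  # ['CCN', 'CCO', 'CCOCC'] -- eval_df order preserved
```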
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
eval_df
|
DataFrame
|
DataFrame to run predictions on. Must
contain |
required |
**kwargs
|
Any
|
Forwarded to the wrapped |
{}
|
Returns:
| Type | Description |
|---|---|
| `DataFrame` | `pd.DataFrame`: cached and newly computed results, left-joined on the cache-key column, preserving the original row order of `eval_df`. |