# Batch Transform

## Overview

Run inference on large datasets without deploying a persistent endpoint.
## When to Use
- Large dataset inference
- No real-time requirements
- Periodic batch predictions
- Cost optimization (no always-on endpoint)
## Creating a Batch Transform Job

```python
from sagemaker.transformer import Transformer

transformer = Transformer(
    model_name="my-model",
    instance_count=2,
    instance_type="ml.m5.xlarge",
    output_path="s3://bucket/batch-output/",
    strategy="SingleRecord",  # or "MultiRecord"
    assemble_with="Line",
    max_payload=6,  # MB
)

transformer.transform(
    data="s3://bucket/batch-input/",
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()
```
## Configuration Options
| Parameter | Options | Description |
|---|---|---|
| `strategy` | `SingleRecord`, `MultiRecord` | How records are batched into requests |
| `split_type` | `Line`, `RecordIO`, `None` | How the input is split into records |
| `assemble_with` | `Line`, `None` | How the output is assembled |
| `max_payload` | 0-100 MB | Maximum payload size per request |
| `max_concurrent_transforms` | 1-100 | Maximum parallel requests per instance |
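To make the `strategy` and `max_payload` interaction concrete, here is a minimal sketch (plain Python, not SageMaker internals) of how the two strategies change the number of requests sent to the model container:

```python
# Illustrative sketch of SingleRecord vs MultiRecord batching behavior.
# max_payload_mb mirrors the max_payload parameter (a limit in MB).

def build_payloads(records, strategy, max_payload_mb):
    """Group input records into request payloads.

    SingleRecord: one record per request.
    MultiRecord: pack as many records as fit under the payload limit.
    """
    limit = max_payload_mb * 1024 * 1024  # convert MB to bytes
    if strategy == "SingleRecord":
        return [[r] for r in records]
    payloads, current, size = [], [], 0
    for r in records:
        r_size = len(r.encode("utf-8"))
        if current and size + r_size > limit:
            payloads.append(current)
            current, size = [], 0
        current.append(r)
        size += r_size
    if current:
        payloads.append(current)
    return payloads

records = ["5.1,3.5,1.4", "4.9,3.0,1.4", "6.2,3.4,5.4"]
print(len(build_payloads(records, "SingleRecord", 6)))  # 3 requests
print(len(build_payloads(records, "MultiRecord", 6)))   # 1 request
```

With `MultiRecord`, three small CSV rows fit comfortably under the 6 MB limit and travel in a single request, which is why that strategy improves throughput for small records.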
## Data Formats

### Input

Batch Transform reads objects from an S3 prefix. Common input content types are `text/csv` and `application/jsonlines`; `split_type` must match the record delimiter in the input (for example, `Line` for CSV or JSON Lines).

### Output

One output object is written to `output_path` per input object, named `<input-key>.out`. With `assemble_with="Line"`, predictions appear one per line, in the same order as the input records.
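The end-to-end flow for one S3 object can be sketched as follows (plain Python with a stand-in `predict` function, not the actual service code):

```python
# Sketch of Line split/assemble behavior and the .out naming convention:
# the input object is split into records on newlines, each record is scored,
# and predictions are reassembled one per line at <output_path>/<key>.out.

def transform_object(key, body, predict):
    records = [line for line in body.splitlines() if line]  # split_type="Line"
    outputs = [predict(r) for r in records]
    return key + ".out", "\n".join(outputs)                 # assemble_with="Line"

out_key, out_body = transform_object(
    "batch-input/part-0001.csv",
    "5.1,3.5\n4.9,3.0\n",
    lambda row: "0.87",  # stand-in for the model's prediction
)
print(out_key)  # batch-input/part-0001.csv.out
```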
## Batch Transform vs Endpoint
| Aspect | Batch Transform | Real-time Endpoint |
|---|---|---|
| Use Case | Large batch processing | Real-time predictions |
| Latency | Minutes to hours | Milliseconds |
| Cost | Pay only while the job runs | Pay for as long as the endpoint is up |
| Scaling | Data distributed across `instance_count` instances per job | Manual or auto scaling |
| Infrastructure | Transient | Persistent |
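The cost row is easy to quantify with back-of-envelope arithmetic. Assuming a hypothetical instance rate of $0.23/hour (check current pricing for your region and instance type):

```python
# Back-of-envelope cost comparison, using an assumed $0.23/hr instance rate.
rate = 0.23                  # USD per instance-hour (assumption, not a quote)
batch = 2 * 1.0 * rate       # 2 instances for a 1-hour nightly batch job
endpoint = 1 * 24 * rate     # 1 instance kept online around the clock
print(round(batch, 2), round(endpoint, 2))  # 0.46 5.52
```

For a once-a-day workload under these assumptions, the transient batch job costs roughly a tenth of the always-on endpoint.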
## Best Practices
!!! tip "Optimization"
    1. Use the `MultiRecord` strategy for higher throughput
    2. Increase `max_concurrent_transforms` for more parallelism
    3. Choose instance types appropriate to your workload
    4. Partition input data across multiple objects for faster processing
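Tip 4 works because SageMaker distributes input objects across instances, so one large file cannot be parallelized but many smaller ones can. A minimal sketch of the sharding step (the round-robin scheme here is illustrative, not a SageMaker API):

```python
# Shard one large newline-delimited dataset into N lists, each destined to
# become its own S3 object, so the job can spread work across instances.

def shard_lines(lines, n_shards):
    """Round-robin lines into n_shards lists (one per future S3 object)."""
    shards = [[] for _ in range(n_shards)]
    for i, line in enumerate(lines):
        shards[i % n_shards].append(line)
    return shards

lines = [f"row-{i}" for i in range(10)]
shards = shard_lines(lines, 3)
print([len(s) for s in shards])  # [4, 3, 3]
```

Each shard would then be uploaded as a separate object under the job's S3 input prefix.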
## Exam Tips
!!! warning "Key Points"
    - Batch Transform is for offline, large-scale inference
    - No endpoint maintenance required
    - Cost-effective for infrequent predictions
    - Supports distributing input data across multiple instances