# SageMaker Instance Types

Quick reference for choosing the right instance type.
## Instance Categories
### General Purpose (ml.m5.*)

| Instance | vCPU | Memory | Use Case |
|---|---|---|---|
| ml.m5.large | 2 | 8 GB | Small jobs, testing |
| ml.m5.xlarge | 4 | 16 GB | Medium workloads |
| ml.m5.2xlarge | 8 | 32 GB | Larger datasets |
| ml.m5.4xlarge | 16 | 64 GB | Production training |
| ml.m5.12xlarge | 48 | 192 GB | Large-scale workloads |

**Best for:** Balanced compute and memory requirements.
### Compute Optimized (ml.c5.*)

| Instance | vCPU | Memory | Use Case |
|---|---|---|---|
| ml.c5.large | 2 | 4 GB | CPU-intensive jobs |
| ml.c5.xlarge | 4 | 8 GB | Preprocessing |
| ml.c5.2xlarge | 8 | 16 GB | Inference |
| ml.c5.4xlarge | 16 | 32 GB | Production |

**Best for:** CPU-bound algorithms (tree-based models, preprocessing).
### GPU Instances (ml.p3.*, ml.p4d.*, ml.g4dn.*, ml.g5.*)

| Instance | GPU | GPU Memory | Use Case |
|---|---|---|---|
| ml.g4dn.xlarge | 1x T4 | 16 GB | Inference, light training |
| ml.g5.xlarge | 1x A10G | 24 GB | Training/inference |
| ml.p3.2xlarge | 1x V100 | 16 GB | Deep learning training |
| ml.p3.8xlarge | 4x V100 | 64 GB | Distributed training |
| ml.p3.16xlarge | 8x V100 | 128 GB | Large models |
| ml.p4d.24xlarge | 8x A100 | 320 GB | Largest models |

**Best for:** Deep learning, computer vision, NLP.
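When you move from one GPU to distributed training, the number of data-parallel workers ("world size") is the per-instance GPU count times the instance count. A minimal sketch, using the GPU counts from the table above (`world_size` is a hypothetical helper, not a SageMaker API):

```python
# Per-instance GPU counts, taken from the GPU instance table above.
GPUS_PER_INSTANCE = {
    "ml.g4dn.xlarge": 1,
    "ml.g5.xlarge": 1,
    "ml.p3.2xlarge": 1,
    "ml.p3.8xlarge": 4,
    "ml.p3.16xlarge": 8,
    "ml.p4d.24xlarge": 8,
}

def world_size(instance_type: str, instance_count: int) -> int:
    """Number of GPU workers a data-parallel job would launch."""
    return GPUS_PER_INSTANCE[instance_type] * instance_count

# Two ml.p3.8xlarge instances -> 8 GPU workers.
print(world_size("ml.p3.8xlarge", 2))  # -> 8
```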
### Memory Optimized (ml.r5.*)

| Instance | vCPU | Memory | Use Case |
|---|---|---|---|
| ml.r5.large | 2 | 16 GB | Memory-intensive jobs |
| ml.r5.xlarge | 4 | 32 GB | Large datasets |
| ml.r5.2xlarge | 8 | 64 GB | In-memory processing |
| ml.r5.4xlarge | 16 | 128 GB | Very large datasets |

**Best for:** Large datasets that need to fit in memory.
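Since the fit-in-memory rule drives the m5-vs-r5 choice, one way to size an instance is to pick the smallest member of a family whose memory covers your estimated working set. A sketch using the memory figures from the tables above (`smallest_fitting` is an illustrative helper, not an AWS API):

```python
# Memory per instance, from the m5 and r5 tables above.
MEMORY_GB = {
    "ml.m5.large": 8,   "ml.m5.xlarge": 16,  "ml.m5.2xlarge": 32,
    "ml.m5.4xlarge": 64, "ml.m5.12xlarge": 192,
    "ml.r5.large": 16,  "ml.r5.xlarge": 32,  "ml.r5.2xlarge": 64,
    "ml.r5.4xlarge": 128,
}

def smallest_fitting(family: str, needed_gb: float):
    """Smallest instance of `family` (e.g. 'ml.r5') with enough memory."""
    candidates = sorted(
        (mem, name) for name, mem in MEMORY_GB.items()
        if name.startswith(family + ".") and mem >= needed_gb
    )
    return candidates[0][1] if candidates else None

# A ~50 GB in-memory dataset fits on ml.r5.2xlarge (64 GB).
print(smallest_fitting("ml.r5", 50))  # -> ml.r5.2xlarge
```

Leave headroom above the raw dataset size; frameworks often hold extra copies of the data during processing.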
## Instance Selection Guide

### Training

| Algorithm Type | Recommended |
|---|---|
| XGBoost, linear | ml.m5.*, ml.c5.* |
| Deep learning (small) | ml.g4dn.*, ml.g5.* |
| Deep learning (large) | ml.p3.*, ml.p4d.* |
| Large data processing | ml.r5.* |
### Inference

| Workload | Recommended |
|---|---|
| Low latency, CPU | ml.c5.* |
| Low latency, GPU | ml.g4dn.*, ml.g5.* |
| Cost-sensitive | ml.m5.large, Serverless |
| High throughput | ml.c5.* with Auto Scaling |
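For a real-time endpoint, the instance choice lands in the `ProductionVariants` block of `create_endpoint_config` (boto3). A minimal sketch of that block for the low-latency CPU row above; the model name is a placeholder:

```python
# Sketch of one ProductionVariants entry for boto3's
# sagemaker.create_endpoint_config; "my-model" is hypothetical.
def cpu_variant(model_name: str, count: int = 1) -> dict:
    """A low-latency CPU serving variant on ml.c5.*, per the table above."""
    return {
        "VariantName": "primary",
        "ModelName": model_name,
        "InstanceType": "ml.c5.xlarge",
        "InitialInstanceCount": count,
        "InitialVariantWeight": 1.0,
    }

variant = cpu_variant("my-model", count=2)
```

For the high-throughput row, you would register this endpoint as a scalable target with Application Auto Scaling rather than over-provisioning `InitialInstanceCount`.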
## Cost Optimization Tips

| Strategy | Savings | When to Use |
|---|---|---|
| Spot Training | Up to 90% | Non-urgent training |
| Savings Plans | Up to 64% | Consistent usage |
| Serverless | Variable | Unpredictable traffic |
| Right-sizing | Varies | Ongoing; review instance usage regularly |
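Managed Spot Training is a few fields on the `create_training_job` request (boto3): enable the flag, set a max wait time at least as long as the max runtime, and configure checkpointing so an interrupted job can resume. A sketch of those fields; the S3 checkpoint path is a placeholder:

```python
# Spot-related fields for boto3's sagemaker.create_training_job.
# MaxWaitTimeInSeconds must be >= MaxRuntimeInSeconds, or the
# request is rejected.
def spot_settings(max_run_s: int, extra_wait_s: int = 3600) -> dict:
    return {
        "EnableManagedSpotTraining": True,
        "StoppingCondition": {
            "MaxRuntimeInSeconds": max_run_s,
            "MaxWaitTimeInSeconds": max_run_s + extra_wait_s,
        },
        # Checkpointing lets the job resume after a spot interruption;
        # the bucket/prefix here is hypothetical.
        "CheckpointConfig": {"S3Uri": "s3://my-bucket/checkpoints/"},
    }

cfg = spot_settings(7200)  # 2 h runtime, up to 3 h total wait
```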
## Instance Limits

| Limit Type | Default | Notes |
|---|---|---|
| ml.p3 instances | 1-2 | Request increase |
| ml.p4d instances | 0 | Request increase |
| Total instances | Varies | Per account |
## Quick Decision Tree

```text
Need GPU?
├── No → Need more memory than CPU?
│   ├── Yes → ml.r5.*
│   └── No → CPU-intensive?
│       ├── Yes → ml.c5.*
│       └── No → ml.m5.*
└── Yes → Training or Inference?
    ├── Training → Model size?
    │   ├── Small → ml.g4dn.*, ml.g5.*
    │   └── Large → ml.p3.*, ml.p4d.*
    └── Inference → ml.g4dn.xlarge
```
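The decision tree above can be written out as a small function, which is handy for encoding the rule in tooling or tests. This is an illustrative helper, not an official API:

```python
# The quick decision tree above, as a hypothetical helper that
# returns an instance-family suggestion.
def suggest_family(needs_gpu: bool, training: bool = False,
                   large_model: bool = False, memory_bound: bool = False,
                   cpu_bound: bool = False) -> str:
    if not needs_gpu:
        if memory_bound:
            return "ml.r5.*"
        return "ml.c5.*" if cpu_bound else "ml.m5.*"
    if training:
        return "ml.p3.*, ml.p4d.*" if large_model else "ml.g4dn.*, ml.g5.*"
    return "ml.g4dn.xlarge"  # GPU inference starting point

print(suggest_family(needs_gpu=False, memory_bound=True))  # -> ml.r5.*
```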