# Container Deployment

## Overview

Deploying ML models as Docker containers on AWS: images are stored in Amazon ECR and served through SageMaker, ECS, or EKS.
## Amazon ECR

Amazon Elastic Container Registry (ECR) is a managed registry for storing, versioning, and pulling Docker images.
```bash
# Authenticate Docker to ECR, then build, tag, and push the image
aws ecr get-login-password --region us-east-1 | \
  docker login --username AWS --password-stdin account.dkr.ecr.us-east-1.amazonaws.com
docker build -t my-inference-image .
docker tag my-inference-image:latest account.dkr.ecr.us-east-1.amazonaws.com/my-repo:latest
docker push account.dkr.ecr.us-east-1.amazonaws.com/my-repo:latest
```
## SageMaker Container Requirements

### Directory Structure

```text
/opt/ml/
├── model/          # Model artifacts
├── input/
│   ├── config/     # Hyperparameters, resource config
│   └── data/       # Training data channels
└── output/         # Model output, failure info
```
### Inference Container

```dockerfile
FROM python:3.9-slim
RUN pip install flask gunicorn scikit-learn
COPY inference.py /opt/program/
COPY serve /opt/program/
ENV PATH="/opt/program:${PATH}"
WORKDIR /opt/program
ENTRYPOINT ["python", "serve"]
```
### Required Endpoints
| Endpoint | Method | Purpose |
|---|---|---|
| /ping | GET | Health check |
| /invocations | POST | Inference requests |
```python
# serve — Flask app implementing the two required routes
import json

import joblib
from flask import Flask, request

app = Flask(__name__)

# SageMaker extracts model.tar.gz into /opt/ml/model; the artifact
# filename (model.joblib) is an example and must match your archive.
model = joblib.load("/opt/ml/model/model.joblib")

@app.route("/ping", methods=["GET"])
def ping():
    return "", 200

@app.route("/invocations", methods=["POST"])
def invocations():
    data = request.get_json()
    prediction = model.predict(data).tolist()
    return json.dumps({"prediction": prediction})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```
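Before pushing an image, the two required routes can be exercised locally with Flask's built-in test client, with no server process and no real model. A minimal sketch, using a hypothetical `DummyModel` in place of a loaded artifact:

```python
from flask import Flask, request, jsonify

class DummyModel:
    """Stand-in for a real model artifact (hypothetical)."""
    def predict(self, rows):
        return [sum(row) for row in rows]  # toy "prediction"

model = DummyModel()
app = Flask(__name__)

@app.route("/ping", methods=["GET"])
def ping():
    # Health check: SageMaker only looks at the status code
    return "", 200

@app.route("/invocations", methods=["POST"])
def invocations():
    rows = request.get_json()["instances"]  # payload shape is illustrative
    return jsonify({"prediction": model.predict(rows)})

# Exercise both routes in-process
client = app.test_client()
assert client.get("/ping").status_code == 200
resp = client.post("/invocations", json={"instances": [[1, 2], [3, 4]]})
print(resp.get_json())  # {'prediction': [3, 7]}
```

This catches routing and serialization mistakes before the image ever reaches ECR.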
## Bring Your Own Container (BYOC)

```python
from sagemaker.model import Model

model = Model(
    image_uri="account.dkr.ecr.region.amazonaws.com/my-repo:latest",
    model_data="s3://bucket/model.tar.gz",
    role=role,
)

model.deploy(
    instance_type="ml.m5.large",
    initial_instance_count=1,
)
```
## ECS/EKS Deployment

Use ECS or EKS to run inference containers outside SageMaker.
### ECS

`task-definition.json` (trailing commas removed so the JSON is valid):

```json
{
  "family": "ml-inference",
  "containerDefinitions": [
    {
      "name": "inference",
      "image": "account.dkr.ecr.region.amazonaws.com/my-repo:latest",
      "memory": 2048,
      "cpu": 1024,
      "portMappings": [{ "containerPort": 8080 }]
    }
  ]
}
```
### EKS

```yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-inference
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ml-inference
  template:
    metadata:
      labels:
        app: ml-inference   # pod labels must match the selector above
    spec:
      containers:
        - name: inference
          image: account.dkr.ecr.region.amazonaws.com/my-repo:latest
          ports:
            - containerPort: 8080
```
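To route traffic to those pods, a Service is typically added alongside the Deployment. A minimal sketch; the Service name and external port are assumptions:

```yaml
# service.yaml — exposes the ml-inference pods inside the cluster
apiVersion: v1
kind: Service
metadata:
  name: ml-inference
spec:
  selector:
    app: ml-inference     # matches the Deployment's pod labels
  ports:
    - port: 80            # Service port (assumed)
      targetPort: 8080    # containerPort from the Deployment
```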
## Exam Tips

!!! warning "Key Points"

    - ECR stores container images
    - SageMaker containers must serve `/ping` and `/invocations`
    - BYOC enables custom frameworks
    - ECS/EKS handle non-SageMaker deployments