Key knowledge:
Detailed explanation:
Observability pillars for GenAI:
| Pillar | Tool | What it captures |
|---|---|---|
| Metrics | CloudWatch Metrics | Token usage, latency, error rates |
| Logs | CloudWatch Logs | Prompts, responses, errors |
| Traces | X-Ray | End-to-end request flow |
| Dashboards | CloudWatch Dashboards | Business + operational views |
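
The metrics row above can be illustrated with a small sketch that publishes per-request token usage and latency as custom CloudWatch metrics. The namespace `GenAI/App`, the metric names, and the helper names are assumptions chosen for illustration; only `put_metric_data` and its payload shape come from the CloudWatch API.

```python
from datetime import datetime, timezone


def build_token_metric_data(model_id, input_tokens, output_tokens, latency_ms):
    """Build a MetricData list in the shape CloudWatch put_metric_data accepts."""
    now = datetime.now(timezone.utc)
    dimensions = [{"Name": "ModelId", "Value": model_id}]

    def datum(name, value, unit):
        return {
            "MetricName": name,  # hypothetical custom metric name
            "Dimensions": dimensions,
            "Timestamp": now,
            "Value": float(value),
            "Unit": unit,
        }

    return [
        datum("InputTokens", input_tokens, "Count"),
        datum("OutputTokens", output_tokens, "Count"),
        datum("LatencyMs", latency_ms, "Milliseconds"),
    ]


def publish(metric_data, namespace="GenAI/App"):
    """Send the metrics to CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder can be used without AWS access

    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=metric_data
    )
```

Keeping the payload builder separate from the network call makes the metric shape easy to unit-test; `publish` would be called once per request, or with batched data.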
Key knowledge:
Detailed explanation:
Bedrock Model Invocation Logging:
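# Enable model invocation logging
import boto3

bedrock = boto3.client('bedrock')  # control-plane client, not 'bedrock-runtime'

bedrock.put_model_invocation_logging_configuration(
    loggingConfig={
        'cloudWatchConfig': {
            'logGroupName': '/aws/bedrock/model-invocations',
            # Replace <account-id> with the 12-digit AWS account ID
            'roleArn': 'arn:aws:iam::<account-id>:role/bedrock-logging-role',
            # Payloads too large for a CloudWatch log event are delivered to S3
            'largeDataDeliveryS3Config': {
                'bucketName': 'bedrock-logs-bucket',
                'keyPrefix': 'invocation-logs/'
            }
        },
        'textDataDeliveryEnabled': True,
        'imageDataDeliveryEnabled': True
    }
)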
Key GenAI Metrics:
| Metric | Description | Alert Threshold |
|---|---|---|
| Input tokens/request | Avg input token count | > expected baseline |
| Output tokens/request | Avg output token count | Approaching max_tokens setting (responses likely truncated) |
| Latency (TTFT) | Time to first token | > SLA threshold |
| Latency (total) | Total response time | > SLA threshold |
| Error rate | % failed invocations | > 1% |
| Throttle rate | % throttled requests | > 0.5% |
| Cost per request | Avg cost per invocation | > budget threshold |
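
As a rough illustration of how the thresholds in the table might be applied, the sketch below checks aggregated invocation counts against the 1% error-rate and 0.5% throttle-rate limits. The `InvocationStats` type, the `evaluate_alerts` function, and the optional cost limit are hypothetical names introduced here, not part of any AWS API.

```python
from dataclasses import dataclass


@dataclass
class InvocationStats:
    """Aggregated counts for one evaluation window (hypothetical structure)."""
    total: int
    failed: int
    throttled: int
    total_cost_usd: float


def evaluate_alerts(stats, max_error_rate=0.01, max_throttle_rate=0.005,
                    max_cost_per_request=None):
    """Return the names of the metrics that breached their thresholds."""
    if stats.total == 0:
        return []  # no traffic in the window, so nothing to evaluate

    alerts = []
    if stats.failed / stats.total > max_error_rate:
        alerts.append("error_rate")
    if stats.throttled / stats.total > max_throttle_rate:
        alerts.append("throttle_rate")
    # The cost threshold is budget-specific, so it is only checked when supplied.
    if (max_cost_per_request is not None
            and stats.total_cost_usd / stats.total > max_cost_per_request):
        alerts.append("cost_per_request")
    return alerts
```

In practice the same thresholds would more likely live in CloudWatch alarms; the function only makes the arithmetic behind each row explicit.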
Key knowledge:
Detailed explanation:
CloudWatch dashboard for GenAI:
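
A minimal sketch of such a dashboard, assuming the Bedrock runtime metrics `InvocationLatency`, `Invocations`, `InputTokenCount`, and `InvocationThrottles` in the `AWS/Bedrock` namespace with a `ModelId` dimension (worth verifying against the current Bedrock monitoring documentation). The widget layout, the function names, and the region default are illustrative choices.

```python
import json


def build_dashboard_body(model_id, region="us-east-1"):
    """Return a CloudWatch dashboard body (JSON string) with four metric widgets."""

    def widget(title, metric_name, stat, x, y):
        return {
            "type": "metric",
            "x": x, "y": y, "width": 12, "height": 6,
            "properties": {
                "title": title,
                "region": region,
                "stat": stat,
                "period": 300,  # 5-minute aggregation
                "metrics": [["AWS/Bedrock", metric_name, "ModelId", model_id]],
            },
        }

    widgets = [
        widget("Invocation latency (ms)", "InvocationLatency", "Average", 0, 0),
        widget("Invocations", "Invocations", "Sum", 12, 0),
        widget("Input tokens", "InputTokenCount", "Sum", 0, 6),
        widget("Throttles", "InvocationThrottles", "Sum", 12, 6),
    ]
    return json.dumps({"widgets": widgets})


def create_dashboard(name, model_id):
    """Create or update the dashboard (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the body can be built without AWS access

    boto3.client("cloudwatch").put_dashboard(
        DashboardName=name, DashboardBody=build_dashboard_body(model_id)
    )
```

Business-level widgets (for example cost per request) would need custom metrics, since Bedrock does not publish them itself.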
Key knowledge:
Detailed explanation:
Agent Tool Monitoring:
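
One possible way to monitor agent tools, sketched under the assumption that each tool is a plain Python function: a decorator records call counts, error counts, and cumulative latency per tool. The names `monitored_tool`, `tool_stats`, and the `lookup_order` tool are hypothetical; a real deployment would forward these numbers to CloudWatch or X-Ray rather than keep them in memory.

```python
import time
from collections import defaultdict
from functools import wraps

# In-memory counters per tool name (illustrative only).
tool_stats = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def monitored_tool(name):
    """Wrap a tool function so every call updates tool_stats[name]."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            stats = tool_stats[name]
            stats["calls"] += 1
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                stats["errors"] += 1
                raise  # the agent still sees the original failure
            finally:
                stats["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@monitored_tool("lookup_order")
def lookup_order(order_id):
    """Example tool: returns a canned order status."""
    if not order_id:
        raise ValueError("order_id is required")
    return {"order_id": order_id, "status": "shipped"}
```

Dividing `errors` by `calls` gives a per-tool failure rate, and `total_seconds / calls` gives average tool latency.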
Key knowledge:
Detailed explanation:
Vector Store Health Metrics:
| Metric | Description |
|---|---|
| Query latency | Time to return search results |
| Index size | Number of vectors stored |
| Recall rate | % of all relevant items that appear in the returned results |
| Index freshness | Time since last update |
| Storage utilization | Disk/memory usage |
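
The recall rate in the table can be computed offline from a labelled query set. The sketch below shows recall@k for a single query; the function name and the ID-based inputs are assumptions for illustration.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the top-k retrieved results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)
```

For example, if the relevant documents are `a`, `d`, and `e` and the store returns `a, b, c, d`, recall@3 is 1/3 and recall@4 is 2/3. Averaging this over a fixed query set after each re-index gives a trend that can be alarmed on.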
Key knowledge:
Detailed explanation:
Hallucination Detection Pipeline:
FM Response → Compare against golden dataset
↓
Semantic similarity score < threshold?
↓ Yes
Flag as potential hallucination → Alert → Human review
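
The flow above can be sketched as follows. The bag-of-words `embed` function is a stand-in used only to keep the example self-contained; a real pipeline would call an embedding model, and the 0.7 threshold is an arbitrary assumption that would need tuning against the golden dataset. Note that lexical overlap alone will miss answers that reuse the same words with one wrong fact.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (placeholder for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def check_response(response, golden_answer, threshold=0.7):
    """Flag the response when its similarity to the golden answer is too low."""
    score = cosine_similarity(embed(response), embed(golden_answer))
    return {"score": score, "flagged": score < threshold}
```

Flagged responses would then be routed to the alerting and human-review steps shown in the diagram.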
Response Consistency Testing:
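
One way to approach this, sketched under the assumption that the model is wrapped in a `generate(prompt)` callable (a hypothetical name): send the same prompt several times and average the pairwise similarity of the answers. `difflib.SequenceMatcher` is used here as a simple character-level similarity; an embedding-based score could be swapped in, and the 0.8 minimum is an arbitrary starting point.

```python
from difflib import SequenceMatcher
from itertools import combinations


def consistency_score(responses):
    """Mean pairwise similarity (0.0 to 1.0) across all response pairs."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 1.0  # zero or one response is trivially consistent
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


def run_consistency_test(generate, prompt, runs=5, min_score=0.8):
    """Call generate(prompt) several times and compare the outputs."""
    responses = [generate(prompt) for _ in range(runs)]
    score = consistency_score(responses)
    return {"score": score, "consistent": score >= min_score}
```

A low score on prompts that should have a single factual answer is a signal worth reviewing, while creative prompts are expected to vary.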