First 10 Minutes: Health Degradation¶
Symptoms¶
- Environment health transitions from Green to Yellow or Red.
- Environment shows Grey during updates and does not return to healthy state.
- Enhanced health reports severe causes such as command timeout, failed checks, or high error rates.
- One or more instances stay unhealthy after deployment or scaling event.
- Users report intermittent failures that align with health status changes.
flowchart TD
A[Health Color Changed] --> B[Run eb health and read cause]
B --> C{Color}
C -->|Grey| D[Check active deploy or command execution]
C -->|Yellow| E[Check instance-level warning patterns]
C -->|Red| F[Check target health and app availability]
D --> G[Collect events and logs]
E --> G
F --> G
G --> H[Test top hypothesis]
H --> I[Stabilize or escalate] Quick Check Commands¶
eb health --environment "$ENV_NAME" --profile "eb-ops" --refresh
eb events --environment "$ENV_NAME" --profile "eb-ops"
eb logs --environment "$ENV_NAME" --profile "eb-ops"
aws elasticbeanstalk describe-environment-health \
--environment-name "$ENV_NAME" \
--attribute-names "All" \
--profile "eb-ops" \
--region "$REGION"
aws elasticbeanstalk describe-instances-health \
--environment-name "$ENV_NAME" \
--attribute-names "All" \
--profile "eb-ops" \
--region "$REGION"
Signals to capture immediately:
- Environment health color and status reason.
- Instance health status and degraded cause per instance.
- Load balancer target health transitions during the same timestamp window.
- Recent deployment, scaling, or configuration update events.
Common Causes¶
Enhanced Health Detects Application Errors¶
- Elevated 5xx rate from application process.
- Health endpoint returns non-200 under load.
- Dependency latency pushes responses past timeout thresholds.
Instance Startup or Replacement Instability¶
- New instances fail to pass health checks after launch.
- Rolling deployment batch too large for current capacity.
- Warm-up time longer than configured health expectations.
Resource Saturation¶
- CPU or memory pressure causes slow processing and failed checks.
- Connection pools saturate and reject requests.
- Garbage collection pauses or process restarts increase error spikes.
Load Balancer Health Check Mismatch¶
- Health check path does not represent application readiness.
- Health check port/path mismatch after configuration change.
- Security group rules block target health checks.
Grey Health Prolonged¶
- Environment command running longer than expected.
- Platform hook script waiting on unavailable dependency.
- Instance reports delayed due to initialization issues.
Escalation Path¶
Escalate when any of the following remain true after first-response fixes:
- Red health persists across all instances beyond one deploy/scale cycle.
- Replaced instances repeatedly fail health checks.
- Error rate remains elevated with no clear app-level signature.
Escalation package:
- Health color timeline with exact UTC timestamps.
describe-environment-healthanddescribe-instances-healthoutputs.- Correlated events and key log excerpts.
- Recent change list (deployment, config update, scaling policy).
See Also¶
Sources¶
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/health-enhanced.html
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/health-enhanced-status.html
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.healthstatus.html
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/environments-cfg-alb.html
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.logging.html