Troubleshooting Hub
1. Summary
Central entry point for Elastic Beanstalk troubleshooting with a quick-start flow and links to all playbooks.
flowchart TD
A[Observe symptom in Elastic Beanstalk] --> B[Collect environment events]
B --> C[Collect logs and health evidence]
C --> D[Evaluate competing hypotheses]
D --> E{Evidence supports hypothesis?}
E -->|Yes| F[Apply focused mitigation]
E -->|No| G[Disprove and test next hypothesis]
F --> H[Re-check health and events]
G --> C

Section Table
| Area | Start Here | Typical Signal |
|---|---|---|
| First 10 Minutes | troubleshooting/first-10-minutes/index.md | Fresh incident and uncertain scope |
| Deployment and Availability | troubleshooting/playbooks/deployment-availability/deployment-failed.md | Deploy errors or immediate rollback |
| Performance | troubleshooting/playbooks/performance/high-latency-under-load.md | Slow responses and high saturation |
| Networking | troubleshooting/playbooks/networking/load-balancer-5xx.md | Load balancer errors and timeout symptoms |
| Hands-on Labs | troubleshooting/lab-guides/index.md | Need reproducible CloudFormation-based practice environments |
| Methodology | troubleshooting/methodology/troubleshooting-method.md | Need repeatable incident workflow |
Quick-Start Flow
- Identify whether the incident starts at deploy time, runtime, or network boundary.
- Collect Elastic Beanstalk events and logs first, then verify CloudWatch metrics and health causes.
- Validate one hypothesis at a time and apply the narrowest mitigation possible.
2. Common Misreadings
- Misreading: Green or Yellow health means no user impact. Correction: AWS guidance still requires reading health causes and events.
- Misreading: a successful deploy event means the application is healthy. Correction: the application can still fail at runtime after the deployment completes.
- Misreading: one log source is sufficient. Correction: AWS troubleshooting guidance depends on events, logs, and health evidence together.
3. Competing Hypotheses
- The issue is deployment-related and starts at application version processing or command execution.
- The issue is runtime health-related and starts after deployment when instances fail health checks.
- The issue is network-related and traffic cannot reach healthy instances or instances cannot reach dependencies.
4. What to Check First
- Review the environment health color and health causes in the Elastic Beanstalk console before changing configuration.
- Read recent environment events and deployment events to identify the first failing operation in time order.
- Request logs from the environment and inspect web server, application, and Elastic Beanstalk engine logs.
- Confirm incident scope by environment name and application version label before changing settings.
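The first and last checks above can be combined into one read-only call. This is a minimal sketch, assuming the AWS CLI is installed and configured for the right account and region; the function name `eb_snapshot` is hypothetical and the environment name is supplied by the caller.

```shell
# Hypothetical helper: print environment name, status, health color,
# enhanced health status, and deployed version label in one line.
# Read-only; it does not change the environment.
# HealthStatus is only populated when enhanced health reporting is enabled.
eb_snapshot() {
  local env_name="$1"
  aws elasticbeanstalk describe-environments \
    --environment-names "$env_name" \
    --query 'Environments[0].[EnvironmentName,Status,Health,HealthStatus,VersionLabel]' \
    --output text
}

# Usage (placeholder name): eb_snapshot "<ENVIRONMENT_NAME>"
```

Recording this line at the start of an incident gives a baseline to compare against after any mitigation.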
5. Evidence to Collect
- Environment events for the incident window, read both newest-first and oldest-first.
- Enhanced health statuses and health causes at environment and instance levels.
- Log bundles or streamed logs including web server, application, and Elastic Beanstalk engine logs.
- Deployment history with application version labels and timestamps.
- Metrics relevant to latency, error rates, CPU, and memory during the same period.
aws elasticbeanstalk describe-events \
    --application-name "<APPLICATION_NAME>" \
    --environment-name "<ENVIRONMENT_NAME>" \
    --max-records 200

aws elasticbeanstalk describe-environment-health \
    --environment-name "<ENVIRONMENT_NAME>" \
    --attribute-names "Status" "Color" "Causes" "ApplicationMetrics" "InstancesHealth"

aws elasticbeanstalk request-environment-info \
    --environment-name "<ENVIRONMENT_NAME>" \
    --info-type "tail"
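`request-environment-info` only asks the instances to upload their logs; the download links come back from a separate `retrieve-environment-info` call. The sketch below adds that retrieval step, plus a way to read events oldest-first for a bounded window. It assumes the AWS CLI is configured; the function names and the timestamp arguments are illustrative, not part of the original playbook.

```shell
# Hypothetical helper: events inside an incident window, oldest first.
# describe-events returns newest first, so the text output is sorted on its
# leading timestamp column. This assumes every EventDate uses the same UTC
# offset, so that lexical order matches chronological order.
eb_events_ascending() {
  local env_name="$1" start="$2" end="$3"
  aws elasticbeanstalk describe-events \
    --environment-name "$env_name" \
    --start-time "$start" \
    --end-time "$end" \
    --query 'Events[].[EventDate,Severity,Message]' \
    --output text | LC_ALL=C sort
}

# Hypothetical helper: list instance IDs and log URLs for tail logs that were
# previously requested with request-environment-info.
eb_tail_log_urls() {
  local env_name="$1"
  aws elasticbeanstalk retrieve-environment-info \
    --environment-name "$env_name" \
    --info-type tail \
    --query 'EnvironmentInfo[].[Ec2InstanceId,Message]' \
    --output text
}

# Usage (placeholder values):
#   eb_events_ascending "<ENVIRONMENT_NAME>" 2024-05-01T10:00:00Z 2024-05-01T11:00:00Z
#   eb_tail_log_urls "<ENVIRONMENT_NAME>"
```

If `eb_tail_log_urls` prints nothing, the upload may not have finished yet; waiting briefly and retrying is usually enough.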
6. Validation and Disproof by Hypothesis
- Validate deployment hypothesis by matching first failure event to engine log error and command phase.
- Validate health hypothesis by matching health causes to request failures, process restarts, or check failures.
- Validate networking hypothesis by matching target health, listener status, and route or security configuration.
- Disprove each hypothesis with at least one contradictory artifact before selecting root cause.
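Each hypothesis above maps to one read-only query. The following is a sketch under stated assumptions: the AWS CLI is configured, enhanced health is enabled for the instance-level call, and the environment sits behind an Application or Network Load Balancer whose target group ARN is already known (a Classic Load Balancer would need a different command). All function names are hypothetical.

```shell
# Deployment hypothesis: earliest event at ERROR severity or higher in the
# window. Compare its timestamp and message with the engine log.
eb_first_error() {
  local env_name="$1" start="$2" end="$3"
  aws elasticbeanstalk describe-events \
    --environment-name "$env_name" \
    --severity ERROR \
    --start-time "$start" \
    --end-time "$end" \
    --query 'Events[].[EventDate,Severity,Message]' \
    --output text | LC_ALL=C sort | head -n 1
}

# Health hypothesis: per-instance status and causes (enhanced health only).
eb_instance_causes() {
  local env_name="$1"
  aws elasticbeanstalk describe-instances-health \
    --environment-name "$env_name" \
    --attribute-names HealthStatus Color Causes \
    --query 'InstanceHealthList[].{Id:InstanceId,Status:HealthStatus,Color:Color,Causes:Causes}' \
    --output json
}

# Networking hypothesis: target state and reason as seen by the load balancer.
# The target group ARN is an input the caller must look up separately.
elb_target_states() {
  local target_group_arn="$1"
  aws elbv2 describe-target-health \
    --target-group-arn "$target_group_arn" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State,TargetHealth.Reason]' \
    --output text
}
```

An empty result from `eb_first_error` is itself a contradictory artifact for the deployment hypothesis, provided the window really covers the incident.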
7. Likely Root Cause Patterns
- Application startup command or dependency install failure during deployment lifecycle.
- Health check mismatch between configured path and actual application readiness behavior.
- Resource saturation under traffic causing timeouts, error bursts, and degraded health.
- Security group, listener, or routing mismatch blocking expected traffic flow.
8. Immediate Mitigations
- Stabilize by rolling back to last known healthy application version when user impact is active.
- Reduce blast radius by scaling capacity or pausing high-risk configuration changes.
- Correct failing health check or listener configuration only after evidence confirms mismatch.
- Re-run validation after mitigation and confirm event stream no longer emits failure pattern.
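A rollback by version label might look like the sketch below. It assumes the AWS CLI is configured and that the operator has already decided which label counts as the last known healthy one; the CLI does not make that judgment. The function names are hypothetical, and `eb_rollback` changes the environment, so it belongs after the evidence steps, not before.

```shell
# Hypothetical helper: list up to ten application versions, newest first,
# to help pick a rollback target. Read-only.
eb_recent_versions() {
  local app_name="$1"
  aws elasticbeanstalk describe-application-versions \
    --application-name "$app_name" \
    --query 'ApplicationVersions[].[DateCreated,VersionLabel]' \
    --output text | LC_ALL=C sort -r | head -n 10
}

# Hypothetical helper: redeploy an existing version label.
# WARNING: this triggers a deployment on the named environment.
eb_rollback() {
  local env_name="$1" version_label="$2"
  aws elasticbeanstalk update-environment \
    --environment-name "$env_name" \
    --version-label "$version_label"
}

# Usage (placeholder values):
#   eb_recent_versions "<APPLICATION_NAME>"
#   eb_rollback "<ENVIRONMENT_NAME>" "<VERSION_LABEL>"
```

After the rollback completes, re-reading events and health for the same environment confirms whether the failure pattern has stopped.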
9. Prevention
- Keep deployment and runtime checks in release process with clear rollback criteria.
- Stream logs to CloudWatch Logs and retain enough history for multi-hour incident analysis.
- Use enhanced health monitoring to detect warning trends before severe user impact.
- Document known failure signatures and required evidence for faster future triage.
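Log streaming can be switched on through environment option settings. This is a minimal sketch, assuming the AWS CLI is configured; the function name is hypothetical, and the default of 7 days is only an illustrative retention value that should be raised to match the incident-analysis window a team actually needs.

```shell
# Hypothetical helper: enable instance log streaming to CloudWatch Logs and
# set a retention period. WARNING: this updates the environment configuration.
eb_enable_log_streaming() {
  local env_name="$1" retention_days="${2:-7}"
  aws elasticbeanstalk update-environment \
    --environment-name "$env_name" \
    --option-settings \
      "Namespace=aws:elasticbeanstalk:cloudwatch:logs,OptionName=StreamLogs,Value=true" \
      "Namespace=aws:elasticbeanstalk:cloudwatch:logs,OptionName=RetentionInDays,Value=${retention_days}"
}

# Usage (placeholder name): eb_enable_log_streaming "<ENVIRONMENT_NAME>" 14
```

Keeping this setting in the environment's saved configuration or infrastructure templates, instead of applying it by hand, avoids drift between environments.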
See Also
- troubleshooting/index.md
- troubleshooting/decision-tree.md
- troubleshooting/methodology/troubleshooting-method.md
- troubleshooting/lab-guides/index.md
Sources
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/troubleshooting.html
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/using-features.logging.html
- https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/health-enhanced-status.html