Lab: High Latency Under Load¶

Create controlled request pressure that keeps the environment serving traffic but drives p95 latency and queueing high enough to practice saturation diagnosis.

Lab Metadata¶

Attribute	Value
Difficulty	Intermediate
Duration	40 minutes
Tier	Load-balanced web server environment
Failure Mode	Application responds slowly under concurrent load, causing elevated latency and intermittent `Warning` health
Skills Practiced	CloudWatch metric interpretation, ALB latency correlation, access log review, scaling signal analysis

1) Background¶

1.1 Why this lab exists¶

Latency incidents are often gradual rather than catastrophic. This lab builds the habit of proving saturation with metrics and logs before changing scaling settings.

1.2 Platform behavior model¶

As request concurrency rises, worker time, CPU, and connection pressure can push response time above the health model threshold. EB may remain available while reporting Warning or Degraded because the service is slow rather than down.

1.3 Diagram (Mermaid)¶

flowchart LR
    A[Load generator] --> B[ALB request queue]
    B --> C[EC2 instances]
    C --> D[Long app processing time]
    D --> E[High target response time]
    E --> F[EB health warning]

2) Hypothesis¶

2.1 Original hypothesis¶

High latency is caused by application saturation under concurrent load rather than networking failure.

2.2 Causal chain¶

Increased load -> longer application processing time -> ALB target response time rises -> EB health moves to Warning or Degraded -> users experience slow responses.

2.3 Proof criteria¶

CloudWatch metrics show increased response time and request count during the trigger window.
Access logs confirm slow but successful responses.
Instance CPU or request backlog rises with the same timing.

2.4 Disproof criteria¶

Latency rises without load or resource pressure, suggesting downstream dependency or network path failure instead.

3) Runbook¶

Deploy the healthy baseline and record normal latency.

eb deploy "$ENV_NAME" --staged
aws cloudwatch get-metric-statistics \
    --namespace AWS/ApplicationELB \
    --metric-name TargetResponseTime \
    --dimensions Name=LoadBalancer,Value="$LOAD_BALANCER_DIMENSION" \
    --statistics Average \
    --extended-statistics p95 \
    --start-time 2026-04-07T03:00:00Z \
    --end-time 2026-04-07T03:15:00Z \
    --period 60

Trigger controlled load.

bash "trigger.sh"

Watch health and ALB metrics during the test.

aws elasticbeanstalk describe-environment-health \
    --environment-name "$ENV_NAME" \
    --attribute-names Status Color Causes ApplicationMetrics

aws cloudwatch get-metric-statistics \
    --namespace AWS/ApplicationELB \
    --metric-name RequestCount \
    --dimensions Name=LoadBalancer,Value="$LOAD_BALANCER_DIMENSION" \
    --statistics Sum \
    --start-time 2026-04-07T03:15:00Z \
    --end-time 2026-04-07T03:30:00Z \
    --period 60

Collect access and application evidence.

eb logs --environment-name "$ENV_NAME" --all
sudo less /var/log/nginx/access.log
sudo less /var/log/web.stdout.log

Query access logs with Logs Insights if streaming is enabled.

aws logs start-query \
    --log-group-name "/aws/elasticbeanstalk/$ENV_NAME/var/log/nginx/access.log" \
    --start-time 1712477700 \
    --end-time 1712481300 \
    --query-string 'fields @timestamp, @message | filter @message like /GET/ | sort @timestamp desc | limit 50'

4) Experiment Log¶

Time (UTC)	Observation	Evidence
13:00	Baseline latency remains low	CloudWatch `TargetResponseTime`
13:05	Load generator starts sustained concurrency	`trigger.sh` output
13:09	p95 latency and request count rise together	CloudWatch metrics
13:12	EB health shifts to `Warning`	`describe-environment-health`
13:15	Access logs show slower completion times but mostly successful requests	access logs

Expected Evidence¶

Before Trigger (Baseline)¶

Low average and p95 response time.
Health is Ok.

During Incident¶

Request count and target response time rise together.
Health causes mention elevated latency.
Access logs show long request durations rather than immediate connection failures.

After Recovery¶

Removing load returns latency to baseline.
Health returns to Ok without infrastructure changes.

Evidence Timeline (Mermaid sequence diagram)¶

sequenceDiagram
    participant Load as Load Generator
    participant ALB
    participant App
    participant CW as CloudWatch
    Load->>ALB: Burst requests
    ALB->>App: Forward concurrent traffic
    App-->>ALB: Slow responses
    ALB->>CW: Publish response time metrics
    CW-->>Load: Elevated latency visible

Evidence Chain: Why This Proves the Hypothesis¶

When request volume rises, latency rises in lockstep and then falls after the load stops. That temporal relationship, combined with slow-successful access log entries, supports a saturation explanation rather than a binary failure such as a routing or certificate issue.

Clean Up¶

eb terminate "$ENV_NAME"
aws cloudformation delete-stack --stack-name "$STACK_NAME" --region "$AWS_REGION"

High Latency Under Load

Lab: High Latency Under Load¶

Lab Metadata¶

1) Background¶

1.1 Why this lab exists¶

1.2 Platform behavior model¶

1.3 Diagram (Mermaid)¶

2) Hypothesis¶

2.1 Original hypothesis¶

2.2 Causal chain¶

2.3 Proof criteria¶

2.4 Disproof criteria¶

3) Runbook¶

4) Experiment Log¶

Expected Evidence¶

Before Trigger (Baseline)¶

During Incident¶

After Recovery¶

Evidence Timeline (Mermaid sequence diagram)¶

Evidence Chain: Why This Proves the Hypothesis¶

Clean Up¶

See Also¶

Sources¶

Lab: High Latency Under Load¶

Lab Metadata¶

1) Background¶

1.1 Why this lab exists¶

1.2 Platform behavior model¶

1.3 Diagram (Mermaid)¶

2) Hypothesis¶

2.1 Original hypothesis¶

2.2 Causal chain¶

2.3 Proof criteria¶

2.4 Disproof criteria¶

3) Runbook¶

4) Experiment Log¶

Expected Evidence¶

Before Trigger (Baseline)¶

During Incident¶

After Recovery¶

Evidence Timeline (Mermaid sequence diagram)¶

Evidence Chain: Why This Proves the Hypothesis¶

Clean Up¶

Related Playbook¶

See Also¶

Sources¶