Troubleshoot Your API Faster: Using ELB Logs and Athena

Nic Lasdoce
24 Sep 2023 · 6 minutes read

This step-by-step guide is a startup's shortcut to robust logging and analytics with AWS ELB Access Logs and Athena. With minimal setup, you'll gain actionable insights to mitigate security and performance issues, freeing you to focus on business growth. This guide is your quick path to a more secure and optimized cloud infrastructure.

Introduction

Navigating logs and metrics has long been hard for developers. When an issue arises, you need to be able to dig deeper, ideally by querying the data with SQL, a language almost every developer knows. Problems like security vulnerabilities, performance bottlenecks, and unexpected user behavior often show up in subtle ways that go unnoticed until it is too late. This is where pairing AWS Elastic Load Balancer (ELB) Access Logs with AWS Athena comes in. ELB Access Logs give you a detailed record of every request sent to your load balancer, capturing data points such as client IP, request URL, status codes, and processing times. Raw logs are only the beginning, though: the real power comes when you analyze them with Athena using SQL.

Setting Up AWS ELB Access Logs

Create an S3 Bucket to Store the Logs

  1. Navigate to the Amazon S3 dashboard by visiting https://console.aws.amazon.com/s3/.
  2. Click on "Create bucket."
  3. While on the "Create bucket" interface, carry out the following steps:
    1. For the "Bucket name" field, input a distinctive name that doesn't duplicate any existing S3 bucket names. Keep in mind that some regions might impose extra constraints on naming conventions. For additional details, consult the Amazon Simple Storage Service User Guide's section on bucket limitations.
    2. In "AWS Region," opt for the geographical location where your load balancer is deployed.
    3. Under "Default encryption," go for the Amazon S3-managed keys option (SSE-S3).
    4. Click "Create bucket" to finalize the setup.
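  4. Attach the following policy to your bucket, replacing elb-account-id with the Elastic Load Balancing account ID for your load balancer's region and my-s3-arn with the ARN of the log location in your bucket (for example, arn:aws:s3:::bucket-name/prefix/AWSLogs/your-aws-account-id/*):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::elb-account-id:root"
      },
      "Action": "s3:PutObject",
      "Resource": "my-s3-arn"
    }
  ]
}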
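
If you prefer to generate the policy rather than edit it by hand, the following is a minimal Python sketch. It assumes the resource ARN follows the pattern arn:aws:s3:::bucket/prefix/AWSLogs/account-id/*, and every value passed in (the function name build_elb_log_policy, the account IDs, bucket, and prefix) is a hypothetical placeholder, not something taken from your account.

```python
import json


def build_elb_log_policy(elb_account_id, bucket, prefix, aws_account_id):
    """Build a bucket policy that lets Elastic Load Balancing write access logs.

    All four arguments are supplied by the caller; nothing is looked up in AWS.
    """
    # Logs are written under <prefix>/AWSLogs/<your-account-id>/ inside the bucket.
    path = f"{prefix}/AWSLogs" if prefix else "AWSLogs"
    resource = f"arn:aws:s3:::{bucket}/{path}/{aws_account_id}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{elb_account_id}:root"},
                "Action": "s3:PutObject",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    policy = build_elb_log_policy("111111111111", "my-alb-logs", "prod", "222222222222")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket's "Permissions" tab. Check the ELB account ID for your region in the AWS documentation before using it.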

By following these instructions, you'll have a secure and region-specific S3 bucket ready to capture and store ELB logs, a critical step in establishing a robust analytics pipeline.

Enabling Logging

  1. Navigate to the Amazon EC2 dashboard by visiting https://console.aws.amazon.com/ec2/.
  2. On the sidebar, click on "Load Balancers."
  3. Find and click on your load balancer's name to view its information page.
  4. Head over to the "Attributes" tab and select "Edit."
  5. In the "Monitoring" section, enable "Access logs."
  6. For the "S3 URI" field, input the appropriate URI where you want your logs stored. The format of the URI will depend on whether you're using a prefix or not.
    1. If you're using a prefix: s3://bucket-name/prefix
    2. Without a prefix: s3://bucket-name
  7. Click on "Save changes" to update your settings.
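
The same settings can also be applied programmatically. As a rough sketch, the helper below (a hypothetical function, access_log_attributes) splits an S3 URI into the bucket and prefix and returns the attribute list in the shape the Elastic Load Balancing API expects. The attribute keys access_logs.s3.enabled, access_logs.s3.bucket, and access_logs.s3.prefix are the load balancer attribute names; the URI used in the example is a placeholder.

```python
from __future__ import annotations


def access_log_attributes(s3_uri: str) -> list[dict]:
    """Translate an S3 URI such as s3://bucket-name/prefix into load balancer attributes."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    # Everything up to the first slash is the bucket; the rest is the optional prefix.
    bucket, _, prefix = s3_uri[len("s3://"):].partition("/")
    if not bucket:
        raise ValueError("S3 URI is missing a bucket name")
    attributes = [
        {"Key": "access_logs.s3.enabled", "Value": "true"},
        {"Key": "access_logs.s3.bucket", "Value": bucket},
    ]
    prefix = prefix.strip("/")
    if prefix:
        attributes.append({"Key": "access_logs.s3.prefix", "Value": prefix})
    return attributes


if __name__ == "__main__":
    for attribute in access_log_attributes("s3://bucket-name/prefix"):
        print(attribute)
```

With boto3 installed and credentials configured, this list could be passed as the Attributes argument of the elbv2 client's modify_load_balancer_attributes call, together with your load balancer's ARN.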

Configuring AWS Athena for Log Analysis

  1. Open the Athena console and create a new database.
  2. Paste the following statement into your query editor.
  3. Before running it, replace the value of LOCATION and of "storage.location.template"
    s3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/
    with the path to the S3 bucket you selected for ELB Access Logs, keeping the ${day} suffix at the end of the template. Then run the statement.
CREATE EXTERNAL TABLE IF NOT EXISTS alb_logs (
type string,
time string,
elb string,
client_ip string,
client_port int,
target_ip string,
target_port int,
request_processing_time double,
target_processing_time double,
response_processing_time double,
elb_status_code int,
target_status_code string,
received_bytes bigint,
sent_bytes bigint,
request_verb string,
request_url string,
request_proto string,
user_agent string,
ssl_cipher string,
ssl_protocol string,
target_group_arn string,
trace_id string,
domain_name string,
chosen_cert_arn string,
matched_rule_priority string,
request_creation_time string,
actions_executed string,
redirect_url string,
lambda_error_reason string,
target_port_list string,
target_status_code_list string,
classification string,
classification_reason string
)
PARTITIONED BY
(
day STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1',
'input.regex' =
'([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*):([0-9]*) ([^ ]*)[:-]([0-9]*) ([-.0-9]*) ([-.0-9]*) ([-.0-9]*) (|[-0-9]*) (-|[-0-9]*) ([-0-9]*) ([-0-9]*) \"([^ ]*) (.*) (- |[^ ]*)\" \"([^\"]*)\" ([A-Z0-9-_]+) ([A-Za-z0-9.-]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^\"]*)\" ([-.0-9]*) ([^ ]*) \"([^\"]*)\" \"([^\"]*)\" \"([^ ]*)\" \"([^\s]+?)\" \"([^\s]+)\" \"([^ ]*)\" \"([^ ]*)\"')
LOCATION 's3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/'
TBLPROPERTIES
(
"projection.enabled" = "true",
"projection.day.type" = "date",
"projection.day.range" = "2022/01/01,NOW",
"projection.day.format" = "yyyy/MM/dd",
"projection.day.interval" = "1",
"projection.day.interval.unit" = "DAYS",
"storage.location.template" = "s3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/${day}"
)
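
To see what the partition projection settings do, it can help to work out which S3 prefix a given day value resolves to. The Python sketch below is only an illustration: location_for_day is a hypothetical helper, and TEMPLATE repeats the placeholder path from the table definition, so substitute your own values.

```python
from datetime import date

# Placeholder template; use the same value as "storage.location.template" in the table.
TEMPLATE = "s3://your-alb-logs-directory/AWSLogs/<ACCOUNT-ID>/elasticloadbalancing/<REGION>/${day}"


def location_for_day(template: str, day: date) -> str:
    """Return the S3 prefix that a given day partition resolves to."""
    # "yyyy/MM/dd" in projection.day.format corresponds to "%Y/%m/%d" in strftime.
    return template.replace("${day}", day.strftime("%Y/%m/%d"))


if __name__ == "__main__":
    print(location_for_day(TEMPLATE, date(2022, 2, 12)))
```

If the prefix printed for a known day does not match where your log files actually sit in S3, queries filtered on that day will return no rows, which is a quick way to spot a wrong LOCATION or template.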

Analyzing Logs With Athena

Identify Specific Error Codes

Find out how many 4xx or 5xx error codes were encountered.

SELECT elb_status_code, count(*)
FROM alb_logs
WHERE elb_status_code >= 400
GROUP BY elb_status_code;

Monitoring Endpoints

Find out which endpoints are accessed most often.

SELECT request_url, count(*)
FROM alb_logs
GROUP BY request_url
ORDER BY count(*) DESC;
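
One caveat: request_url holds the full URL, including the host and query string, so the same endpoint called with different parameters is counted as separate rows. A minimal Python sketch for post-processing the results is shown below. The function count_by_path and the sample URLs are hypothetical, and it assumes you have already fetched the request_url values from Athena.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_by_path(request_urls):
    """Count requests per URL path, ignoring scheme, host, port, and query string."""
    return Counter(urlsplit(url).path or "/" for url in request_urls)


if __name__ == "__main__":
    # Hypothetical request_url values as they might come back from Athena.
    urls = [
        "https://api.example.com:443/users?page=1",
        "https://api.example.com:443/users?page=2",
        "https://api.example.com:443/orders/42",
    ]
    for path, hits in count_by_path(urls).most_common():
        print(path, hits)
```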

Filtering By Day

Restrict a query to a single day by filtering on the day partition. This also limits how much data Athena scans.

SELECT *
FROM alb_logs
WHERE day = '2022/02/12';
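
To find the IPs that made the most requests on a given day, the day filter can be combined with a GROUP BY on client_ip. The sketch below uses Python's built-in sqlite3 module as a local stand-in for Athena so the query shape can be tried without an AWS account. The table contents are hypothetical, and in Athena you would write the day as a literal instead of the ? placeholder.

```python
import sqlite3

# Same shape as an Athena query, written with a parameter placeholder for SQLite.
TOP_IPS_SQL = """
SELECT client_ip, count(*) AS requests
FROM alb_logs
WHERE day = ?
GROUP BY client_ip
ORDER BY requests DESC, client_ip
LIMIT 10
"""


def top_ips(conn, day):
    """Return (client_ip, requests) rows for one day, busiest first."""
    return conn.execute(TOP_IPS_SQL, (day,)).fetchall()


def demo_connection():
    """Build an in-memory table with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE alb_logs (day TEXT, client_ip TEXT)")
    conn.executemany(
        "INSERT INTO alb_logs VALUES (?, ?)",
        [
            ("2022/02/12", "203.0.113.10"),
            ("2022/02/12", "203.0.113.10"),
            ("2022/02/12", "198.51.100.7"),
            ("2022/02/13", "198.51.100.7"),
        ],
    )
    return conn


if __name__ == "__main__":
    for ip, requests in top_ips(demo_connection(), "2022/02/12"):
        print(ip, requests)
```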

Filtering By Request Type

Segment logs by the type of HTTP request.

SELECT request_verb, count(*)
FROM alb_logs
GROUP BY request_verb;

Use Cases to Consider

  1. Security Audits: Unusual access patterns can be indicative of a security breach.
  2. Performance Tuning: Identifying endpoints with the most errors or longest response times helps in targeted optimizations.
  3. Client Behavior Analysis: Knowing which endpoints are accessed the most can aid in UX/UI decisions.
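
As an illustration of the performance-tuning use case, the sketch below ranks endpoints by average target processing time and counts server errors per endpoint. It again uses Python's sqlite3 as a local stand-in for Athena with hypothetical rows. The column names come from the alb_logs table defined earlier, and it assumes a target_processing_time of -1 marks a request that never reached a target, so those values are left out of the average.

```python
import sqlite3

SLOW_ENDPOINTS_SQL = """
SELECT request_url,
       count(*) AS requests,
       avg(CASE WHEN target_processing_time >= 0
                THEN target_processing_time END) AS avg_seconds,
       sum(CASE WHEN elb_status_code >= 500 THEN 1 ELSE 0 END) AS server_errors
FROM alb_logs
GROUP BY request_url
ORDER BY avg_seconds DESC
"""


def slow_endpoints(conn):
    """Return (request_url, requests, avg_seconds, server_errors), slowest first."""
    return conn.execute(SLOW_ENDPOINTS_SQL).fetchall()


def demo_connection():
    """Build an in-memory table with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE alb_logs ("
        "request_url TEXT, target_processing_time REAL, elb_status_code INTEGER)"
    )
    conn.executemany(
        "INSERT INTO alb_logs VALUES (?, ?, ?)",
        [
            ("https://api.example.com:443/users", 0.25, 200),
            ("https://api.example.com:443/users", 0.75, 200),
            ("https://api.example.com:443/reports", 1.5, 200),
            ("https://api.example.com:443/reports", -1, 502),
        ],
    )
    return conn


if __name__ == "__main__":
    for row in slow_endpoints(demo_connection()):
        print(row)
```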

Conclusion

Harnessing the power of AWS ELB logs and Athena can transform your approach to analytics and monitoring, allowing you to be more proactive rather than reactive. This setup is not just an architectural decision; it's a strategic move to better understand your system's inner workings.

So, next time you think about skipping logging and analytics, don't. The insights you'll gain are well worth the initial setup investment.

Bonus

If you are a founder who needs help with your software architecture or cloud infrastructure, we offer a free assessment and will tell you whether we can do it. Feel free to contact us at any of the following:

Email: nic@triglon.tech

