AWS Interview Questions (2026)

100 real interview questions with in-depth answers — 30 basic, 40 intermediate, 30 advanced. Updated April 2026.


AWS (Amazon Web Services) is a cloud computing platform offering 200+ services from data centers worldwide. The infrastructure is organized into Regions (geographically isolated groups of data centers, e.g., us-east-1), Availability Zones (AZs: one or more discrete data centers within a Region, each with redundant power and networking), and Edge Locations (Points of Presence used by CloudFront and Route 53 for low-latency content delivery and DNS). As of 2025 AWS had more than 30 launched Regions, more than 100 AZs, and 600+ edge locations; the counts grow as new Regions launch, so check the current figures before an interview. Choosing the right Region matters for latency, compliance, and data residency.

EC2 instance families are grouped by workload: General Purpose (t4g, m7i) for balanced CPU/memory; Compute Optimized (c7g, c7i) for CPU-intensive apps like batch processing; Memory Optimized (r7g, x2idn) for in-memory databases and large datasets; Storage Optimized (i4i, d3) for I/O-heavy workloads, where i4i provides local NVMe SSD for high random IOPS and d3 provides dense HDD storage for high sequential throughput; Accelerated Computing (p5, g5, inf2) for ML training/inference and GPU workloads. The t-family uses burstable CPU credits, which suits dev environments but is risky for sustained load: once the credit balance is spent, the instance is throttled to its baseline or billed for surplus credits in unlimited mode. Always benchmark with actual workloads before committing to a family.
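
The burstable credit model is easier to reason about with a little arithmetic. The sketch below assumes the standard accounting in which one CPU credit equals one vCPU running at 100% for one minute; the function name and the figures in the usage note are illustrative and are not quoted from an instance-type table.

```python
def hours_until_credits_exhausted(vcpus, utilization_pct, credits_per_hour, starting_balance):
    """Estimate how long a burstable instance can sustain a given load.

    One CPU credit is one vCPU at 100% for one minute, so the instance spends
    vcpus * utilization_pct * 60 / 100 credits per hour while earning
    credits_per_hour. Returns None when the load is at or below baseline,
    because the balance never drains.
    """
    spent_per_hour = vcpus * utilization_pct * 60 / 100
    net_drain = spent_per_hour - credits_per_hour
    if net_drain <= 0:
        return None
    return starting_balance / net_drain
```

For a hypothetical 2-vCPU instance that earns 12 credits per hour and starts with 288 credits, `hours_until_credits_exhausted(2, 50, 12, 288)` shows how quickly a sustained 50% load drains the balance, which is the risk the answer warns about.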

You need the private key (.pem file) from the key pair specified at launch, the public IP or DNS of the instance, and port 22 open in the security group. The default username depends on the AMI: `ec2-user` for Amazon Linux/RHEL, `ubuntu` for Ubuntu, `admin` for Debian.

bash
chmod 400 my-key.pem
ssh -i my-key.pem ec2-user@<public-ip-or-dns>

For instances in private subnets, use a bastion host or AWS Systems Manager Session Manager (no open port 22 required, preferred for production).

S3 stores data as objects (up to 5 TB each) inside buckets. An object consists of the data, metadata, and a key, which is the full path-like name (e.g., `photos/2025/cat.jpg`). There is no real folder hierarchy; the slash is just part of the key. Versioning, when enabled on a bucket, keeps every version of every object instead of overwriting. Deleting an object without specifying a version ID inserts a delete marker rather than permanently removing it; you can restore the object by deleting the marker. Deleting a specific version ID removes that version permanently. Versioning protects against accidental deletion and overwrites but increases storage costs.
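
The delete-marker behaviour is easy to misread, so here is a toy in-memory model of it. This is a sketch of the semantics only, not an AWS client: the class name `VersionedBucket` and the `v1`, `v2` style version IDs are invented for illustration.

```python
import itertools

_DELETE_MARKER = object()  # sentinel standing in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket (hypothetical, in memory only)."""

    def __init__(self):
        self._versions = {}              # key -> list of (version_id, body)
        self._ids = itertools.count(1)

    def _push(self, key, body):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def put(self, key, body):
        # Every PUT adds a new version instead of overwriting the old one.
        return self._push(key, body)

    def delete(self, key, version_id=None):
        if version_id is None:
            # Simple DELETE: stack a delete marker on top; older versions remain.
            return self._push(key, _DELETE_MARKER)
        # DELETE with a version ID: permanently remove that version or marker.
        self._versions[key] = [v for v in self._versions.get(key, []) if v[0] != version_id]
        return version_id

    def get(self, key):
        stack = self._versions.get(key, [])
        if not stack or stack[-1][1] is _DELETE_MARKER:
            raise KeyError(key)          # S3 would answer 404 Not Found here
        return stack[-1][1]
```

Putting two versions, deleting without a version ID, and then deleting the returned marker ID brings the latest version back, which mirrors the restore path described above.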

S3 Standard is for frequently accessed data with low latency and high throughput. S3 Intelligent-Tiering automatically moves objects between access tiers based on usage patterns — ideal when access is unpredictable. S3 Standard-IA (Infrequent Access) costs less per GB stored but charges a retrieval fee — good for backups accessed monthly. S3 One Zone-IA is cheaper still but stores data in only one AZ (no AZ redundancy). S3 Glacier Instant Retrieval suits archival data needing millisecond access. Glacier Flexible Retrieval (formerly Glacier) offers minutes-to-hours retrieval at very low cost. Glacier Deep Archive is the cheapest tier with 12–48 hour retrieval for long-term compliance archives.

User data is a script (Bash on Linux, PowerShell on Windows) you provide at instance launch. It runs once automatically on first boot as the root user via cloud-init (Linux) or EC2Launch (Windows). Common uses: installing packages, pulling application code, configuring services. It does NOT run on subsequent reboots unless you explicitly configure cloud-init to run it on every boot. You can pass user data via the console, CLI, or launch template. Logs go to `/var/log/cloud-init-output.log`.

bash
#!/bin/bash
yum update -y
yum install -y nginx
systemctl enable --now nginx

EBS snapshots are incremental backups stored in S3 (managed by AWS, not visible in your S3 console). The first snapshot copies all data; subsequent snapshots only copy blocks changed since the last snapshot, reducing cost and time. You can create volumes from snapshots in any AZ within the same Region, or copy snapshots cross-region for DR. Amazon Data Lifecycle Manager (DLM) automates snapshot creation, retention, and deletion via policies. Example policy: take a snapshot daily, retain 7, cross-copy to us-west-2 for DR.

bash
aws ec2 create-snapshot --volume-id vol-0abc123 --description "nightly backup"

S3 lifecycle policies automate transitioning objects between storage classes and expiring (deleting) objects after a specified time. You define rules scoped by prefix or tag. Example: transition objects to Standard-IA after 30 days, to Glacier Instant Retrieval after 90 days, delete after 365 days. Lifecycle rules also manage incomplete multipart uploads and noncurrent versions when versioning is enabled. This is critical for cost control — without lifecycle rules, objects in Standard storage accumulate indefinitely.

json
{"Rules":[{"Status":"Enabled","Filter":{"Prefix":"logs/"},"Transitions":[{"Days":30,"StorageClass":"STANDARD_IA"}],"Expiration":{"Days":365}}]}
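
To check what a rule will do before applying it, the transitions can be walked by object age. The sketch below assumes objects start in S3 Standard and that a rule dictionary has the same shape as one entry of the `Rules` list; the helper name is hypothetical, and real lifecycle actions run asynchronously, so actual timing can lag the configured day counts.

```python
def storage_class_for_age(rule, age_days):
    """Return the storage class a lifecycle rule implies for an object's age.

    Returns "EXPIRED" once the expiration age is reached; otherwise returns
    the class of the latest transition that has come due.
    """
    expiration = rule.get("Expiration", {}).get("Days")
    if expiration is not None and age_days >= expiration:
        return "EXPIRED"
    current = "STANDARD"  # assumption: objects are uploaded to S3 Standard
    for transition in sorted(rule.get("Transitions", []), key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


# The example from the text: Standard-IA at 30 days, Glacier Instant
# Retrieval at 90 days, delete at 365 days.
EXAMPLE_RULE = {
    "Status": "Enabled",
    "Filter": {"Prefix": "logs/"},
    "Transitions": [
        {"Days": 30, "StorageClass": "STANDARD_IA"},
        {"Days": 90, "StorageClass": "GLACIER_IR"},
    ],
    "Expiration": {"Days": 365},
}
```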

A pre-signed URL is a temporary URL that grants time-limited access to a specific S3 object without requiring the requester to have AWS credentials. The URL is generated server-side using the credentials of an IAM principal that has access to the object. It embeds an expiry time of up to 7 days when signed with long-term IAM user credentials (Signature Version 4). A URL signed with temporary credentials (an assumed role or instance profile) stops working when those credentials expire, even if the requested expiry is later. The `aws s3 presign` command defaults to 3600 seconds. Use cases: allowing users to download private files directly from S3 without routing the bytes through your server (reduces server load and bandwidth), or allowing users to upload directly to S3 without exposing credentials.

bash
aws s3 presign s3://my-bucket/report.pdf --expires-in 3600
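
When debugging a 403 on a pre-signed URL, the first thing to check is whether it has simply expired. The sketch below reads the `X-Amz-Date` and `X-Amz-Expires` query parameters that SigV4 pre-signed URLs carry; the function names and the example URL (with a placeholder signature) are illustrative. A URL signed with temporary credentials can stop working sooner than this calculation suggests, when those credentials expire.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse, parse_qs


def presigned_url_expiry(url):
    """Return the UTC time at which a SigV4 pre-signed URL's window closes."""
    query = parse_qs(urlparse(url).query)
    signed_at = datetime.strptime(query["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ")
    signed_at = signed_at.replace(tzinfo=timezone.utc)
    return signed_at + timedelta(seconds=int(query["X-Amz-Expires"][0]))


def is_expired(url, now=None):
    """True when the signing window has closed (the signature is not verified)."""
    now = now or datetime.now(timezone.utc)
    return now >= presigned_url_expiry(url)


# Hypothetical URL; the signature value is a placeholder, not a real one.
EXAMPLE_URL = (
    "https://my-bucket.s3.amazonaws.com/report.pdf"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256"
    "&X-Amz-Date=20260401T120000Z"
    "&X-Amz-Expires=3600"
    "&X-Amz-SignedHeaders=host"
    "&X-Amz-Signature=placeholder"
)
```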

IAM policy conditions restrict when a policy statement applies. The Condition block contains condition operators mapped to keys and values. `StringEquals` requires an exact match; `StringLike` allows wildcards (`*` and `?`). `ArnLike` matches ARN patterns with wildcards. `IpAddress` restricts to a CIDR range. `Bool` checks true/false (e.g., `aws:SecureTransport`). `DateLessThan` and `DateGreaterThan` create time windows. Multiple values listed for a single condition key are OR-ed; multiple keys under one operator, and multiple operators in one Condition block, are AND-ed.

json
{"Condition":{"StringLike":{"s3:prefix":["home/${aws:username}/*"]},"Bool":{"aws:SecureTransport":"true"}}}
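
How a Condition block combines its parts is a frequent follow-up question. The sketch below models the documented logic for three operators only: values listed under one key are OR-ed, while separate keys and separate operators are AND-ed. It is a simplified illustration, not the IAM evaluation engine: it does not resolve policy variables such as `${aws:username}`, it treats a missing context key as a non-match, and `fnmatchcase` also interprets `[...]`, which IAM wildcards do not.

```python
from fnmatch import fnmatchcase

# Simplified stand-ins for three IAM condition operators.
OPERATORS = {
    "StringEquals": lambda actual, expected: actual == expected,
    "StringLike": lambda actual, expected: fnmatchcase(actual, expected),
    "Bool": lambda actual, expected: actual.lower() == expected.lower(),
}


def condition_matches(condition, context):
    """Evaluate a Condition block against a request context dictionary."""
    for operator, keys in condition.items():           # AND across operators
        test = OPERATORS[operator]
        for key, expected in keys.items():             # AND across keys
            values = expected if isinstance(expected, list) else [expected]
            actual = context.get(key)
            # OR across the values listed for this one key
            if actual is None or not any(test(actual, v) for v in values):
                return False
    return True


EXAMPLE_CONDITION = {
    "StringLike": {"s3:prefix": ["home/alice/*", "shared/*"]},
    "Bool": {"aws:SecureTransport": "true"},
}
```

A request for `shared/readme.txt` over TLS matches because either prefix value is enough, while the same request over plain HTTP fails because the `Bool` operator must also hold.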

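An ALB listener forwards traffic to target groups based on rules evaluated in priority order (lowest number first), with the default rule as the fallback. Each rule has conditions (path, host header, HTTP header, HTTP method, query string, source IP) and an action (forward, redirect, fixed-response, authenticate via OIDC or Cognito). Path-based routing sends `/api/*` to an ECS service target group and `/static/*` to a separate target group of static-content servers from the same listener port. An ALB cannot target an S3 bucket directly, so static assets stored in S3 are usually served through CloudFront instead. ALB target groups contain EC2 instances, IP addresses, or a Lambda function (an ALB can itself be a target only of a Network Load Balancer), with health checks to route around unhealthy targets. This enables a single ALB to front an entire microservices application.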

bash
aws elbv2 create-rule --listener-arn <arn> --priority 10 \
  --conditions Field=path-pattern,Values="/api/*" \
  --actions Type=forward,TargetGroupArn=<tg-arn>
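
Priority ordering is where path-based routing usually goes wrong, so it helps to model it. The sketch below is a simplified stand-in for listener rule evaluation that handles path patterns only; the rule dictionaries, target group names, and the `default-tg` fallback are hypothetical.

```python
from fnmatch import fnmatchcase


def pick_target_group(rules, path, default="default-tg"):
    """Return the target group of the first matching rule, lowest priority number first."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if any(fnmatchcase(path, pattern) for pattern in rule["path_patterns"]):
            return rule["target_group"]
    return default  # the listener's default action


EXAMPLE_RULES = [
    {"priority": 20, "path_patterns": ["/static/*"], "target_group": "static-tg"},
    {"priority": 10, "path_patterns": ["/api/*"], "target_group": "api-tg"},
    # This rule is shadowed: /api/* at priority 10 always matches first.
    {"priority": 30, "path_patterns": ["/api/admin/*"], "target_group": "admin-tg"},
]
```

In this example `/api/admin/users` lands on `api-tg`, not `admin-tg`, because the broader pattern has the lower priority number. More specific patterns need lower numbers than the general ones they overlap.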

Lifecycle hooks pause an EC2 instance during launch (Pending:Wait) or termination (Terminating:Wait) so you can run custom actions before the instance becomes active or is deleted. Use cases during launch: register the instance with a service mesh, configure monitoring agents, warm up caches. Use cases during termination: drain connections, deregister from service discovery, copy logs to S3. The instance remains paused until you send `complete-lifecycle-action` or the heartbeat timeout expires (default 1 hour, configurable from 30 to 7,200 seconds). Calling `record-lifecycle-action-heartbeat` extends the wait, up to a maximum of 48 hours or 100 times the heartbeat timeout, whichever is smaller. If the timeout expires first, the hook's default result applies (ABANDON unless you set it to CONTINUE). Hooks publish events to SNS or SQS, or trigger EventBridge rules.

bash
aws autoscaling complete-lifecycle-action --lifecycle-hook-name my-hook \
  --auto-scaling-group-name my-asg --lifecycle-action-result CONTINUE \
  --instance-id i-0abc123

A CloudWatch alarm watches a single metric or metric math expression over a time period and changes state (OK, ALARM, INSUFFICIENT_DATA) when the value crosses a threshold for a specified number of evaluation periods. Actions on ALARM state: notify SNS, trigger Auto Scaling, stop/terminate/reboot/recover an EC2 instance, create a Systems Manager OpsItem. Composite alarms combine multiple alarms using Boolean logic (AND/OR/NOT) into a single alarm, which reduces alert noise. Example: page on-call only if CPU > 90% AND error_rate > 5% at the same time. This prevents false pages when one metric spikes transiently.

bash
aws cloudwatch put-composite-alarm --alarm-name high-cpu-and-errors \
  --alarm-rule "ALARM(cpu-alarm) AND ALARM(error-alarm)"
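
The interaction between evaluation periods and a composite rule can be sketched in a few lines. This is a simplified model under stated assumptions: one datapoint per period, a greater-than threshold, and too few datapoints reported as INSUFFICIENT_DATA. Real CloudWatch missing-data handling is configurable and more nuanced, and both function names are hypothetical.

```python
def alarm_state(datapoints, threshold, evaluation_periods, datapoints_to_alarm=None):
    """Return OK, ALARM, or INSUFFICIENT_DATA for the most recent periods."""
    if datapoints_to_alarm is None:
        datapoints_to_alarm = evaluation_periods   # every period must breach
    window = datapoints[-evaluation_periods:]
    if len(window) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


def composite_and(*child_states):
    """Mimic a rule such as ALARM(cpu-alarm) AND ALARM(error-alarm)."""
    return "ALARM" if all(state == "ALARM" for state in child_states) else "OK"
```

With CPU readings of 95, 96, and 97 against a threshold of 90 but only one error-rate reading above 5, the CPU alarm fires while the composite stays OK, so nobody is paged for the CPU spike alone.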

Service Control Policies should usually be written as deny-list policies (keep FullAWSAccess attached at the root, then add targeted Deny SCPs at lower OUs or accounts) rather than allow-list policies, which implicitly block every service not explicitly listed and tend to break access whenever a team adopts a new service. SCPs never grant permissions; they only cap what IAM policies in member accounts can allow. Common guardrails: deny leaving the organization (`organizations:LeaveOrganization`), deny disabling CloudTrail or Config, deny creating IAM users with access keys in production, require encryption for all S3 puts (`s3:x-amz-server-side-encryption` condition), restrict regions to approved ones only, prevent creation of public S3 buckets. SCPs should be applied to OUs rather than individual accounts where possible, because account-level SCPs are hard to audit at scale.

json
{"Effect":"Deny","Action":["cloudtrail:StopLogging","cloudtrail:DeleteTrail"],"Resource":"*"}
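
The way an SCP caps IAM permissions can be shown with a small evaluator. The sketch below is a deliberately reduced model: it looks at actions only (no resources, conditions, or OU hierarchy), uses case-sensitive matching although IAM action names are case-insensitive, and the function name and inputs are hypothetical. An action is allowed only when no SCP statement denies it, some SCP statement allows it, and the principal's IAM policy also allows it.

```python
from fnmatch import fnmatchcase


def is_allowed(action, iam_allowed_actions, scp_statements):
    """Return True when both the SCPs and the IAM policy permit the action."""
    def matches(patterns):
        return any(fnmatchcase(action, pattern) for pattern in patterns)

    # An explicit Deny in any SCP always wins.
    if any(s["Effect"] == "Deny" and matches(s["Action"]) for s in scp_statements):
        return False
    # The SCPs must allow the action (FullAWSAccess allows everything)...
    scp_allows = any(s["Effect"] == "Allow" and matches(s["Action"]) for s in scp_statements)
    # ...and IAM must still grant it, because SCPs never grant permissions.
    return scp_allows and matches(iam_allowed_actions)


EXAMPLE_SCPS = [
    {"Effect": "Allow", "Action": ["*"]},  # stand-in for FullAWSAccess
    {"Effect": "Deny", "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"]},
]
```

Even an administrator whose IAM policy allows `cloudtrail:*` cannot stop logging under these SCPs, which is the point of the guardrail.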

VPC Flow Logs capture metadata about accepted and rejected IP traffic at the ENI, subnet, or VPC level. The default record format contains: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action (ACCEPT/REJECT), log-status. To troubleshoot: filter for `REJECT` action on the destination IP and port to confirm a security group or NACL is blocking. Query in CloudWatch Logs Insights or Athena (partition logs in S3 by account/region/date for cost-efficient querying). Common patterns: an `ACCEPT` for the inbound request paired with a `REJECT` for the response on an ephemeral port points to a stateless NACL blocking return traffic (security groups are stateful and allow replies automatically); short-duration, high-volume flows from unexpected internal IPs can indicate lateral movement. VPC Reachability Analyzer is the proactive complement: it validates network path configuration without sending actual traffic.

sql
SELECT srcaddr, dstaddr, dstport, action, COUNT(*) as c
FROM vpc_flow_logs WHERE action = 'REJECT' GROUP BY 1,2,3,4 ORDER BY c DESC
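
For a quick look at a downloaded log file without setting up Athena, the same aggregation can be done locally. The sketch below assumes records in the default (version 2) space-separated format; the function names are illustrative, and custom log formats would need a different field list.

```python
from collections import Counter

# Field order of the default flow log record format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_record(line):
    """Split one default-format flow log line into a field dictionary."""
    return dict(zip(FIELDS, line.split()))


def top_rejects(lines, limit=5):
    """Count REJECT records per (srcaddr, dstaddr, dstport), most frequent first."""
    counts = Counter()
    for line in lines:
        record = parse_record(line)
        if record.get("action") == "REJECT":
            counts[(record["srcaddr"], record["dstaddr"], record["dstport"])] += 1
    return counts.most_common(limit)
```

A source that repeatedly hits one destination port with `REJECT` is usually the rule to inspect first.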

ECR (Elastic Container Registry) lifecycle policies automate the expiry and deletion of container image versions from a repository. Without them, repositories accumulate thousands of images, consuming storage and increasing the attack surface (old images may have unpatched vulnerabilities). Lifecycle rules select images by tag status (`tagged`, `untagged`, or `any`), optionally narrowed by tag prefix or wildcard pattern (e.g., `prod-*`, `dev-*`), and expire them by count (`imageCountMoreThan`: keep only the newest N images by push date) or by age (`sinceImagePushed`: expire images older than X days). Rules are evaluated in priority order. Best practice: keep the last 5 production images (rollback ability), delete all untagged images after 1 day, and delete feature-branch images after 30 days.

json
{"rules":[{"rulePriority":1,"selection":{"tagStatus":"untagged","countType":"sinceImagePushed","countUnit":"days","countNumber":1},"action":{"type":"expire"}}]}
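
Before enabling a policy it is worth simulating which images it would remove. The sketch below models two of the practices described above (expire untagged images after N days, keep only the newest N images carrying a tag prefix) against a plain list of image dictionaries. The data shape, the `prod-` prefix, and the function name are assumptions for illustration, and ECR's own lifecycle policy preview remains the authoritative check.

```python
from datetime import datetime, timedelta, timezone


def images_to_expire(images, now, untagged_max_age_days=1, keep_tagged=5, tag_prefix="prod-"):
    """Return digests that the two simulated rules would expire, in rule order."""
    expired = []
    cutoff = now - timedelta(days=untagged_max_age_days)

    # Rule 1: untagged images pushed before the cutoff.
    for image in images:
        if not image["tags"] and image["pushed_at"] < cutoff:
            expired.append(image["digest"])

    # Rule 2: among images with the tag prefix, keep only the newest `keep_tagged`.
    tagged = [i for i in images if any(t.startswith(tag_prefix) for t in i["tags"])]
    tagged.sort(key=lambda i: i["pushed_at"], reverse=True)
    expired.extend(i["digest"] for i in tagged[keep_tagged:])
    return expired
```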

A production-grade serverless data pipeline: (1) Producers send events to Kinesis Data Streams, sized by shard (each shard accepts 1 MB/s or 1,000 records/s in and serves 2 MB/s out). (2) An enhanced fan-out consumer Lambda processes records in real time: it validates schema, enriches (reverse geocode, user lookup from DynamoDB), and batches records. (3) Lambda writes Parquet files to S3 using a date-partitioned prefix (`year=/month=/day=/hour=`) for Athena query efficiency. (4) Alternatively, use Amazon Data Firehose (formerly Kinesis Data Firehose) with a Lambda transformation function for the S3 delivery; Firehose buffers, compresses (GZIP/Snappy), and partitions automatically. (5) AWS Glue Crawler (or manual DDL) creates/updates the Athena table schema. (6) Athena queries the Parquet data with SQL, scanning only relevant partitions. (7) CloudWatch alarms monitor IteratorAge (Kinesis lag) and Lambda error rates.

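sql
-- Partition projection reduces Glue Crawler dependency (columns and date range are illustrative)
CREATE EXTERNAL TABLE events (event_id string, payload string)
PARTITIONED BY (dt string) STORED AS PARQUET
LOCATION 's3://bucket/events/'
TBLPROPERTIES ('projection.enabled'='true', 'projection.dt.type'='date',
  'projection.dt.format'='yyyy-MM-dd', 'projection.dt.range'='2024-01-01,NOW')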
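
Two pieces of the pipeline reduce to simple calculations: how many shards the stream needs, and which S3 prefix a record belongs under. The sketch below assumes the per-shard limits of 1 MB/s or 1,000 records/s for writes and 2 MB/s for shared reads, and it builds the `year=/month=/day=/hour=` prefix in UTC. Both function names are hypothetical, and a real sizing exercise should add headroom for traffic spikes.

```python
import math
from datetime import datetime, timezone


def shards_needed(ingest_mb_per_s, records_per_s, egress_mb_per_s):
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(ingest_mb_per_s / 1.0),    # 1 MB/s written per shard
        math.ceil(records_per_s / 1000),     # 1,000 records/s written per shard
        math.ceil(egress_mb_per_s / 2.0),    # 2 MB/s read per shard (shared throughput)
    )


def partition_prefix(event_time):
    """Build the Hive-style, date-partitioned S3 prefix for an event timestamp."""
    t = event_time.astimezone(timezone.utc)
    return f"year={t:%Y}/month={t:%m}/day={t:%d}/hour={t:%H}/"
```

Note that a stream of many small records can be bound by the record-count limit rather than by bytes, which is why all three limits are compared.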

Frequently Asked Questions

Do I need an AWS certification for interviews?

Helpful for breaking in but not required. Hands-on project experience trumps certs at senior levels.

Which services come up most?

EC2, S3, IAM, VPC, RDS, Lambda, DynamoDB, CloudWatch, CloudFormation/Terraform. Know these cold.

How important is IAM?

Very — most cloud security questions trace back to IAM policies, roles, and trust relationships.

Terraform or CloudFormation?

Terraform is more widely used; CloudFormation/CDK is AWS-native. Know one well and the concepts translate.

What about cost optimization?

Senior AWS roles always ask. Understand reserved instances, spot, S3 storage classes, and common anti-patterns (NAT gateway egress).
