Friday, 15 August 2025

AWS scenario-based interview Q&A

## 1. Security & Identity Management


Q1: A developer accidentally deletes an S3 bucket in the dev account. How would you prevent such incidents across all accounts without impacting productivity?

Testing: Knowledge of SCPs, IAM best practices, and least privilege.

Answer:


 Use AWS Organizations → Service Control Policies (SCP) to restrict destructive actions (e.g., `s3:DeleteBucket`) in non-prod accounts.

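 Enable versioning with MFA Delete on critical buckets so object versions cannot be permanently deleted without MFA (a bucket must be emptied before it can be deleted).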

 Configure CloudTrail + EventBridge alerts for bucket deletion events.
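
As a rough sketch of the first and third points above, the snippet below builds an SCP document that denies `s3:DeleteBucket` and an EventBridge event pattern that matches bucket deletions recorded by CloudTrail. The exempt role name `BreakGlassAdmin` and both function names are assumptions for illustration; attaching the policy and creating the rule are left out.

```python
import json

# Hypothetical emergency role that stays exempt from the deny (assumption).
EXEMPT_ROLE_ARN = "arn:aws:iam::*:role/BreakGlassAdmin"


def deny_bucket_deletion_scp(exempt_role_arn: str = EXEMPT_ROLE_ARN) -> dict:
    """Build an SCP that denies s3:DeleteBucket for everyone but one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyS3BucketDeletion",
                "Effect": "Deny",
                "Action": ["s3:DeleteBucket"],
                "Resource": "*",
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": exempt_role_arn}
                },
            }
        ],
    }


def bucket_deletion_event_pattern() -> dict:
    """EventBridge pattern for DeleteBucket calls recorded by CloudTrail."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["s3.amazonaws.com"],
            "eventName": ["DeleteBucket"],
        },
    }


if __name__ == "__main__":
    print(json.dumps(deny_bucket_deletion_scp(), indent=2))
    print(json.dumps(bucket_deletion_event_pattern(), indent=2))
```

Scoping the deny with a condition rather than removing it per account keeps one policy for the whole organization while still leaving a controlled escape hatch.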


Q2: How do you implement least privilege for a team of developers?

Testing: IAM role & policy design.

Answer:


 Assign roles with service-specific permissions rather than using root/admin.

 Start from AWS managed policies, then copy them into customer managed policies and remove unused actions (AWS managed policies themselves cannot be edited).

 Monitor with IAM Access Advisor to tighten permissions over time.
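
One way to act on last-accessed data is sketched below: given a mapping of service namespace to last-used timestamp (the kind of information Access Advisor reports), it lists the services that look safe to review for removal. The input shape, the function name, and the 90-day threshold are assumptions, not an AWS API.

```python
from datetime import datetime, timedelta, timezone


def unused_services(last_accessed: dict, now: datetime, max_idle_days: int = 90) -> list:
    """Return service namespaces not used within max_idle_days.

    last_accessed maps a service namespace (for example "s3") to the datetime
    it was last used, or None if it was never used.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        service
        for service, last_used in last_accessed.items()
        if last_used is None or last_used < cutoff
    )


if __name__ == "__main__":
    now = datetime(2025, 8, 15, tzinfo=timezone.utc)
    # Hypothetical sample data for one developer role.
    sample = {
        "s3": now - timedelta(days=3),
        "ec2": now - timedelta(days=200),
        "sqs": None,
    }
    print(unused_services(sample, now))
```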


---


## 2. Networking


Q1: Your application in VPC A needs to connect to a database in VPC B in the same region without using public internet. How would you do it?

Testing: VPC connectivity, security best practices.

Answer:


 Use VPC Peering if it’s a simple one-to-one link.

 Use PrivateLink if only specific services need access.

 Avoid routing through IGW or NAT for security.
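
VPC Peering only works when the two VPCs have non-overlapping CIDR blocks, so it is worth checking that before choosing it. A minimal check using the standard library (the function name and CIDR values are illustrative):

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two VPC CIDR blocks do not overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)


if __name__ == "__main__":
    print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # distinct ranges
    print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # second range sits inside the first
```

If the ranges overlap, PrivateLink is usually the option to look at, since it exposes a single service endpoint rather than routing between the networks.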


Q2: You need to connect 10 VPCs across 3 regions. How would you design the network to minimize complexity?

Testing: Multi-VPC & multi-region architecture.

Answer:


 Use AWS Transit Gateway to centralize connectivity.

 Enable inter-region peering between TGWs.

 Configure route tables carefully per environment to control traffic flow.
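
The complexity argument can be made concrete with a quick count. The sketch below compares a full mesh of VPC peering connections with a Transit Gateway design, assuming one TGW per region and a full mesh of peering between the TGWs (both assumptions for this example).

```python
from math import comb


def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    return comb(vpc_count, 2)


def transit_gateway_links(vpc_count: int, region_count: int) -> dict:
    """Links needed with one TGW per region and full-mesh TGW peering."""
    return {
        "vpc_attachments": vpc_count,
        "tgw_peerings": comb(region_count, 2),
    }


if __name__ == "__main__":
    print(full_mesh_peerings(10))        # 45 peering connections
    print(transit_gateway_links(10, 3))  # 10 attachments plus 3 TGW peerings
```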


---


## 3. Compute Layer


Q1: During a flash sale, EC2 instances behind an ALB reach 90% CPU. What’s your immediate plan?

Testing: Auto Scaling & load management.

Answer:


 Check Auto Scaling policies and trigger scale-out.

 Temporarily add Spot or On-Demand Instances.

 Implement caching (CloudFront/ElastiCache) for future spikes.
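
To reason about how far to scale out, a simple proportional estimate helps. This is a simplified model, not the exact algorithm Auto Scaling uses, and the numbers (8 instances, 60% target) are assumptions for illustration.

```python
import math


def estimate_desired_capacity(current: int, metric: float, target: float, max_capacity: int) -> int:
    """Estimate instances needed to bring the average metric back to target.

    Only scales out: the result is never below the current capacity and
    never above max_capacity.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    desired = math.ceil(current * metric / target)
    return max(current, min(desired, max_capacity))


if __name__ == "__main__":
    # 8 instances at 90% CPU, aiming for 60% average CPU.
    print(estimate_desired_capacity(current=8, metric=90, target=60, max_capacity=20))
```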


Q2: How do you ensure zero downtime during EC2 app updates?

Testing: Deployment strategies.

Answer:


 Use Auto Scaling group rolling updates.

 Implement blue/green deployment via CodeDeploy.

 Gradually switch traffic with weighted ALB target groups.
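
A gradual traffic switch comes down to a listener forward action with two weighted target groups. The sketch below builds that action in the shape the Elastic Load Balancing API expects; the target group ARNs are placeholders and the step sizes are an assumption.

```python
def weighted_forward_action(blue_arn: str, green_arn: str, green_weight: int) -> dict:
    """Build an ALB forward action that sends green_weight percent to green."""
    if not 0 <= green_weight <= 100:
        raise ValueError("green_weight must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_arn, "Weight": 100 - green_weight},
                {"TargetGroupArn": green_arn, "Weight": green_weight},
            ]
        },
    }


def shift_plan(blue_arn: str, green_arn: str, steps=(10, 50, 100)) -> list:
    """Return one forward action per step of the rollout."""
    return [weighted_forward_action(blue_arn, green_arn, w) for w in steps]


if __name__ == "__main__":
    # Placeholder ARNs; real ones come from the target groups in your account.
    for action in shift_plan("arn:blue-target-group", "arn:green-target-group"):
        print(action["ForwardConfig"]["TargetGroups"])
```

Each step would be applied to the listener only after health checks and alarms stay green for the previous one.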


---


## 4. Database Layer


Q1: RDS MySQL is experiencing slow queries at peak traffic. How do you troubleshoot and optimize?

Testing: Performance tuning.

Answer:


 Check CloudWatch metrics for CPU, memory, IOPS.

 Enable Performance Insights to identify slow queries.

 Add indexes and use Read Replicas for read-heavy workloads.

 Consider Provisioned IOPS if storage is the bottleneck.


Q2: How do you migrate a large production database from on-prem to AWS with minimal downtime?

Testing: Database migration strategy.

Answer:


 Use AWS DMS with Change Data Capture (CDC).

 Pre-create schema on target RDS/Aurora.

 Perform full load first, then replicate ongoing changes until cutover.
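
In DMS terms, "full load, then ongoing replication" is a single task with migration type `full-load-and-cdc`. A minimal sketch of the task parameters follows; the task identifier is hypothetical, and the endpoint and replication instance ARNs that a real task also needs are deliberately omitted.

```python
import json


def dms_task_settings(task_id: str, schema: str = "%") -> dict:
    """Build core parameters for a DMS task doing full load plus CDC."""
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all-tables",
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_id,
        "MigrationType": "full-load-and-cdc",
        # DMS expects the table mappings as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


if __name__ == "__main__":
    print(json.dumps(dms_task_settings("onprem-to-aurora"), indent=2))
```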


---


## 5. Automation & DevOps


Q1: Manual deployments lead to wrong versions being deployed. How would you automate?

Testing: CI/CD implementation.

Answer:


 Use CodePipeline + CodeDeploy or integrate Jenkins/GitHub Actions.

 Bake versioned AMIs using Packer.

 Automate rollback strategies in deployment configs.


Q2: How would you automate IAM key rotation?

Testing: Security automation.

Answer:


 Store keys in AWS Secrets Manager.

 Rotate keys automatically using Lambda triggered via EventBridge.

 Send notifications on rotation completion.
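
The core decision inside such a rotation Lambda is which keys are due. The sketch below filters a list shaped like the access key metadata returned by IAM (`AccessKeyId`, `Status`, `CreateDate`); the 90-day limit and the sample key IDs are assumptions.

```python
from datetime import datetime, timedelta, timezone


def keys_due_for_rotation(access_keys: list, now: datetime, max_age_days: int = 90) -> list:
    """Return IDs of active access keys older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in access_keys
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]


if __name__ == "__main__":
    now = datetime(2025, 8, 15, tzinfo=timezone.utc)
    # Hypothetical key metadata for illustration only.
    keys = [
        {"AccessKeyId": "AKIAEXAMPLEOLD", "Status": "Active", "CreateDate": now - timedelta(days=120)},
        {"AccessKeyId": "AKIAEXAMPLENEW", "Status": "Active", "CreateDate": now - timedelta(days=10)},
        {"AccessKeyId": "AKIAEXAMPLEOFF", "Status": "Inactive", "CreateDate": now - timedelta(days=400)},
    ]
    print(keys_due_for_rotation(keys, now))
```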


---


## 6. Serverless


Q1: Your Lambda function has cold start issues affecting response time. How do you mitigate?

Testing: Serverless performance optimization.

Answer:


 Use Provisioned Concurrency.

 Keep function code and dependencies small.

 Attach the function to a VPC only if it must reach VPC resources; in that case, VPC endpoints let it call AWS services privately without a NAT hop.


Q2: How would you secure a serverless REST API using API Gateway + Lambda?

Testing: Serverless security best practices.

Answer:


 Enable IAM authorization or Cognito user pools.

 Enable WAF to filter malicious requests.

 Use usage plans with API keys and throttling to prevent abuse (API keys identify clients; they are not an authentication mechanism).
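
API Gateway throttling follows a token bucket model with a steady rate and a burst size. The toy implementation below is only meant to show how those two settings interact; it is not API Gateway's code, and the class name and time handling are assumptions.

```python
class TokenBucket:
    """Minimal token bucket: `rate` tokens per second, up to `burst` tokens."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous call

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` is allowed."""
        elapsed = now - self.last
        self.last = now
        # Refill for the elapsed time, never above the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1, burst=2)
    # Three requests at t=0, then one a second later.
    print([bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)])
```

The burst setting absorbs short spikes, while the rate decides what sustained load a client can send before receiving throttling errors.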


---


## 7. Monitoring & Logging


Q1: How do you monitor Lambda functions for performance and errors?

Testing: CloudWatch and observability.

Answer:


 Enable CloudWatch Logs for function output.

 Use CloudWatch Metrics & Alarms for errors and duration.

 Use X-Ray for distributed tracing.
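
As an illustration of the alarm bullet, the function below assembles the parameters for a CloudWatch alarm on a function's `Errors` metric. The function name, SNS topic ARN, and threshold are placeholders; the resulting dict would be passed to whatever tooling creates the alarm.

```python
def lambda_error_alarm(function_name: str, sns_topic_arn: str, threshold: int = 1) -> dict:
    """Build CloudWatch alarm parameters for Lambda invocation errors."""
    return {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,              # evaluate one-minute sums
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations means no errors
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    # Hypothetical function name and topic ARN.
    print(lambda_error_alarm("orders-handler", "arn:aws:sns:us-east-1:111122223333:ops-alerts"))
```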


Q2: How do you set up centralized logging for multiple AWS accounts?

Testing: Cross-account logging & monitoring.

Answer:


 Aggregate logs into a central S3 bucket via CloudWatch Logs subscription filters or Kinesis Data Firehose.

 Enable an organization-wide CloudTrail trail that delivers logs from all accounts.

 Integrate with SIEM tools like Splunk.


---


## 8. Storage


Q1: Your S3 bucket stores critical backups. How do you ensure data durability, compliance, and disaster recovery?

Testing: Storage architecture & DR planning.

Answer:


 Enable versioning and MFA Delete.

 Use S3 Cross-Region Replication (CRR) for DR.

 Apply server-side encryption (SSE-KMS).

 Set lifecycle policies to move older data to Glacier/Deep Archive for low-cost long-term retention; use S3 Object Lock where compliance requires immutable (WORM) backups.
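
The lifecycle bullet can be sketched as an S3 lifecycle configuration document. The rule ID and the 30-day and 180-day thresholds are assumptions; real values depend on the retention requirements.

```python
def backup_lifecycle_configuration(glacier_after_days: int = 30, deep_archive_after_days: int = 180) -> dict:
    """Build an S3 lifecycle configuration that archives backups over time."""
    if deep_archive_after_days <= glacier_after_days:
        raise ValueError("Deep Archive transition must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-backups",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies to the whole bucket
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(backup_lifecycle_configuration())
```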


Q2: EBS volume performance is low for a high-I/O database. What’s your solution?

Testing: Block storage optimization.

Answer:


 Use Provisioned IOPS SSD (io2/io1).

 Stripe multiple volumes with RAID 0 if necessary.

 Enable EBS optimization on EC2.
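
Striping only helps up to the instance's own EBS limit, which is why the last bullet matters. A back-of-the-envelope sketch, with all IOPS figures being made-up inputs rather than published limits:

```python
def striped_iops(volume_iops: list, instance_iops_limit: int) -> int:
    """Estimate usable IOPS of a RAID 0 stripe, capped by the instance limit."""
    return min(sum(volume_iops), instance_iops_limit)


if __name__ == "__main__":
    # Four hypothetical volumes behind an instance with a hypothetical cap.
    print(striped_iops([16000, 16000, 16000, 16000], instance_iops_limit=40000))
```

When the result equals the instance limit, adding more volumes does nothing; a larger instance type is the next lever.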


Q3: When would you use EFS over S3?

Answer:


 For shared file system access by multiple EC2 instances.

 When NFS file semantics are required.

 For low-latency file storage rather than object storage.


---


## 9. Databases


Q1: Your read-heavy RDS MySQL instance is under load. How do you scale reads efficiently?

Answer:


 Implement Read Replicas.

 Consider migrating to Aurora and using Aurora Global Database for multi-region reads.

 Cache frequently read data in ElastiCache to take load off the database.


Q2: How would you optimize DynamoDB for high traffic spikes?

Answer:


 Enable on-demand mode or auto-scaling read/write capacity.

 Use partition keys with high cardinality.

 Enable DAX (DynamoDB Accelerator) for caching.
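
When a natural key has low cardinality (for example, one very hot tenant or date), a common workaround is write sharding: appending a calculated suffix to spread items across partitions. A minimal sketch, where the key format, shard count, and function name are assumptions:

```python
import hashlib


def sharded_partition_key(base_key: str, item_id: str, shard_count: int = 10) -> str:
    """Append a deterministic shard suffix derived from the item ID."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


if __name__ == "__main__":
    # The same item always maps to the same shard, so it can be read back.
    print(sharded_partition_key("2025-08-15", "order-1001"))
    keys = {sharded_partition_key("2025-08-15", f"order-{i}") for i in range(1000)}
    print(len(keys))  # number of distinct partition key values in use
```

The trade-off is on the read side: fetching everything for the base key means querying each shard suffix and merging the results.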


Q3: What is the difference between Multi-AZ and Read Replica in RDS?

Answer:


 Multi-AZ: High availability and failover, synchronous replication.

 Read Replica: Scale reads, asynchronous replication.


---


## 10. DevOps Integration


Q1: How do you implement zero-downtime deployments with AWS CodeDeploy?

Answer:


 Use Blue/Green deployment to switch traffic to the new version gradually.

 Validate using health checks and rollback on failure.

 Integrate ALB target group weighting for incremental traffic shift.


Q2: How would you automate AMI creation for multiple environments?

Answer:


 Use Packer templates to build AMIs with required packages/config.

 Automate builds with CodePipeline or Lambda triggers.

 Version AMIs for dev/test/prod environments.


Q3: How do you integrate CI/CD with containerized apps in ECS/EKS?

Answer:


 Build Docker images using CodeBuild or Jenkins.

 Push to ECR.

 Deploy to ECS/Fargate or EKS using CodePipeline or ArgoCD.


---


## 11. Migration & Hybrid


Q1: You need to migrate petabytes of on-premises data to AWS quickly. What options do you use?

Answer:


 AWS Snowball/Snowmobile for large-scale offline transfer.

 Direct Connect for high-bandwidth online transfer.

 Use AWS DataSync for incremental sync.


Q2: How do you design a hybrid cloud setup for on-prem apps using AWS?

Answer:


 Establish VPN or Direct Connect for secure connectivity.

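 Use Transit Gateway for centralized routing between VPCs and on-prem networks (VPC Peering is non-transitive and suits only simple one-to-one links).

 Extend Active Directory via AWS Managed Microsoft AD (or AD Connector) for authentication.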


---


## 12. Cost Optimization


Q1: Your AWS monthly bill is high. How would you identify and optimize costs?

Answer:


 Use AWS Cost Explorer & Trusted Advisor for resource recommendations.

 Analyze underutilized EC2 instances and EBS volumes.

 Switch to Reserved Instances or Savings Plans for predictable workloads.

 Use Spot Instances for non-critical workloads.


Q2: How would you reduce costs in a multi-region deployment?

Answer:


 Review cross-region replication usage; replicate only necessary data.

 Optimize data transfer costs by using CloudFront for caching.

 Right-size instances and remove idle resources.


Q3: How do you manage cost in serverless applications?

Answer:


 Monitor Lambda invocation metrics.

 Optimize memory allocation and execution time.

 Use DynamoDB on-demand mode for unpredictable workloads.
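
Lambda cost is driven by requests and by GB-seconds (memory multiplied by execution time), which is why tuning memory and duration matters. The estimator below takes prices as parameters; the values used in the example are placeholders, not current AWS pricing.

```python
def lambda_monthly_cost(
    invocations: int,
    avg_duration_ms: float,
    memory_mb: int,
    price_per_gb_second: float,
    price_per_million_requests: float,
) -> float:
    """Estimate Lambda cost from request count and GB-seconds consumed."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute_cost + request_cost


if __name__ == "__main__":
    # Placeholder prices; check the pricing page for real numbers.
    cost = lambda_monthly_cost(
        invocations=1_000_000,
        avg_duration_ms=100,
        memory_mb=512,
        price_per_gb_second=0.00002,
        price_per_million_requests=0.20,
    )
    print(round(cost, 2))
```

Because more memory also means more CPU, a higher memory setting can shorten duration enough to lower the total, so it is worth measuring a few settings rather than assuming the smallest is cheapest.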


---


## 13. Advanced Architecture / High Availability


Q1: Design a multi-region highly available web application. What’s your approach?

Answer:


 Deploy ALB + EC2/ECS in multiple AZs per region.

 Use Route 53 latency-based routing across regions.

 Replicate databases using Aurora Global DB or DynamoDB global tables.

 Use CloudFront for global content caching.
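
For the Route 53 bullet, each region gets its own latency record pointing at that region's ALB. The sketch below builds one such record set; the domain, DNS names, and hosted zone IDs are placeholders.

```python
def latency_alias_record(domain: str, region: str, alb_dns_name: str, alb_hosted_zone_id: str) -> dict:
    """Build a Route 53 latency-based alias record for one region's ALB."""
    return {
        "Name": domain,
        "Type": "A",
        "SetIdentifier": f"{domain}-{region}",  # must be unique per record
        "Region": region,                        # enables latency-based routing
        "AliasTarget": {
            "HostedZoneId": alb_hosted_zone_id,
            "DNSName": alb_dns_name,
            "EvaluateTargetHealth": True,        # skip the region if the ALB is unhealthy
        },
    }


if __name__ == "__main__":
    # Placeholder values for two regions.
    for region in ("us-east-1", "eu-west-1"):
        print(latency_alias_record("app.example.com", region, f"alb.{region}.example.com", "ZPLACEHOLDER"))
```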


Q2: How do you design for Disaster Recovery (DR) in AWS?

Answer:


 Backup & restore: Snapshots, S3/Glacier backups.

 Pilot light: Minimal resources in DR region, scale when needed.

 Warm standby: Scaled-down environment running continuously.

 Multi-site: Fully active-active in multiple regions.


Q3: Define RTO and RPO.

Answer:


 RTO (Recovery Time Objective): Max tolerable downtime.

 RPO (Recovery Point Objective): Max tolerable data loss.

 Choose DR strategy based on business requirements.
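
A quick way to test a DR plan against these definitions: with periodic backups, the worst-case data loss is roughly one backup interval, and the measured restore time has to fit inside the RTO. The figures in the example are hypothetical.

```python
from datetime import timedelta


def meets_objectives(
    backup_interval: timedelta,
    measured_recovery_time: timedelta,
    rpo: timedelta,
    rto: timedelta,
) -> dict:
    """Check a backup-based DR plan against its RPO and RTO."""
    return {
        # Worst case, a failure happens just before the next backup.
        "rpo_met": backup_interval <= rpo,
        "rto_met": measured_recovery_time <= rto,
    }


if __name__ == "__main__":
    result = meets_objectives(
        backup_interval=timedelta(hours=24),
        measured_recovery_time=timedelta(hours=3),
        rpo=timedelta(hours=1),
        rto=timedelta(hours=4),
    )
    print(result)  # daily backups miss a one-hour RPO even though the RTO is met
```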


