Gupta: AWS scenario-based interview Q&A

## 1. Security & Identity Management

Q1: A developer accidentally deletes an S3 bucket in the dev account. How would you prevent such incidents across all accounts without impacting productivity?

Testing: Knowledge of SCPs, IAM best practices, and least privilege.

Answer:

Use AWS Organizations → Service Control Policies (SCP) to restrict destructive actions (e.g., `s3:DeleteBucket`) in non-prod accounts.

Enable versioning with MFA Delete on critical buckets so that object versions cannot be permanently deleted without a second factor.

Configure CloudTrail + EventBridge alerts for bucket deletion events.
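
As a minimal sketch of the SCP idea above, the following Python builds a policy document that denies `s3:DeleteBucket` unless the caller is a designated break-glass role. The role name `BreakGlassAdmin` and the helper name are hypothetical; creating and attaching the policy in AWS Organizations is left out.

```python
import json


def build_deny_bucket_delete_scp(exempt_role_name="BreakGlassAdmin"):
    """Return an SCP document (as a dict) that denies S3 bucket deletion.

    exempt_role_name is a hypothetical break-glass role that stays allowed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyS3BucketDelete",
                "Effect": "Deny",
                "Action": ["s3:DeleteBucket"],
                "Resource": "*",
                # Exempt one role so an administrator can still clean up.
                "Condition": {
                    "ArnNotLike": {
                        "aws:PrincipalArn": f"arn:aws:iam::*:role/{exempt_role_name}"
                    }
                },
            }
        ],
    }


scp = build_deny_bucket_delete_scp()
print(json.dumps(scp, indent=2))
```

Because an SCP only sets the outer limit of what an account may do, developers keep their normal IAM permissions for everything else.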

Q2: How do you implement least privilege for a team of developers?

Testing: IAM role & policy design.

Answer:

Assign roles with service-specific permissions rather than using root/admin.

Start from AWS managed policies, then replace them with customer managed policies that drop unused actions (AWS managed policies themselves cannot be edited).

Monitor with IAM Access Advisor to tighten permissions over time.
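
The "tighten over time" step can be sketched as a set difference between what a policy grants and what was actually used. The usage list here is a hand-written stand-in for last-accessed data, and the function name and sample actions are illustrative only.

```python
def tighten_actions(granted_actions, used_actions):
    """Split granted actions into those to keep and those that look unused."""
    granted = set(granted_actions)
    used = set(used_actions)
    keep = sorted(granted & used)
    candidates_for_removal = sorted(granted - used)
    return keep, candidates_for_removal


granted = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "sqs:SendMessage"]
used = ["s3:GetObject", "s3:PutObject"]  # assumed usage data, not a real report

keep, remove = tighten_actions(granted, used)
print("keep:", keep)
print("review for removal:", remove)
```

Treat the removal list as candidates to review, since rarely used actions (for example, year-end jobs) may not show up in a short observation window.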

---

## 2. Networking

Q1: Your application in VPC A needs to connect to a database in VPC B in the same region without using public internet. How would you do it?

Testing: VPC connectivity, security best practices.

Answer:

Use VPC Peering if it’s a simple one-to-one link.

Use PrivateLink if only specific services need access.

Avoid routing through IGW or NAT for security.
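
One check worth doing before choosing VPC Peering: the two VPC CIDR blocks must not overlap. A small sketch using only the standard library (the CIDR values are made-up examples):

```python
import ipaddress


def can_peer(cidr_a, cidr_b):
    """VPC peering requires that the two VPC CIDR blocks do not overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)


print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # disjoint ranges
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # second range sits inside the first
```

If the ranges overlap, PrivateLink is usually the fallback, because it exposes a single service endpoint rather than routing between the two networks.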

Q2: You need to connect 10 VPCs across 3 regions. How would you design the network to minimize complexity?

Testing: Multi-VPC & multi-region architecture.

Answer:

Use AWS Transit Gateway, one per region, to centralize connectivity for the VPCs in that region.

Enable inter-region peering between TGWs.

Configure route tables carefully per environment to control traffic flow.
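
To see why a hub design reduces complexity, compare the number of connections. This is simple arithmetic under the assumption of one Transit Gateway per region with the gateways peered in a full mesh:

```python
def full_mesh_links(n):
    """Number of pairwise links needed to fully connect n nodes."""
    return n * (n - 1) // 2


vpcs, regions = 10, 3

peering_connections = full_mesh_links(vpcs)  # every VPC peered with every other VPC
tgw_attachments = vpcs                       # one attachment per VPC to its regional TGW
tgw_peerings = full_mesh_links(regions)      # TGWs peered with each other across regions

print("VPC peering full mesh:", peering_connections, "connections")
print("Transit Gateway design:", tgw_attachments, "attachments +", tgw_peerings, "TGW peerings")
```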

---

## 3. Compute Layer

Q1: During a flash sale, EC2 instances behind an ALB reach 90% CPU. What’s your immediate plan?

Testing: Auto Scaling & load management.

Answer:

Check Auto Scaling policies and trigger scale-out.

Temporarily add spot or on-demand instances.

Implement caching (CloudFront/ElastiCache) for future spikes.
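
For the immediate scale-out, a rough proportional estimate helps decide how many instances to add. This is a back-of-the-envelope sketch, not the exact algorithm EC2 Auto Scaling uses; the 50% target and the instance counts are assumptions.

```python
import math


def estimate_desired_capacity(current_instances, current_cpu, target_cpu, max_instances):
    """Estimate the instance count that brings average CPU near target_cpu.

    Assumes load spreads evenly, so CPU scales inversely with instance count.
    Never suggests scaling in, and never exceeds max_instances.
    """
    raw = math.ceil(current_instances * current_cpu / target_cpu)
    return min(max(raw, current_instances), max_instances)


# 8 instances at 90% CPU, aiming for roughly 50%, group capped at 30.
print(estimate_desired_capacity(8, 90.0, 50.0, 30))
```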

Q2: How do you ensure zero downtime during EC2 app updates?

Testing: Deployment strategies.

Answer:

Use Auto Scaling group rolling updates.

Implement blue/green deployment via CodeDeploy.

Gradually switch traffic with weighted ALB target groups.
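
The weighted switch can be sketched as a function that builds a forward action splitting traffic between two target groups. The dict follows the shape of an ALB forward action with weighted target groups; the ARNs are placeholders and the 10/50/100 step schedule is an assumption.

```python
def weighted_forward_action(blue_tg_arn, green_tg_arn, green_percent):
    """Build a listener forward action sending green_percent of traffic to green."""
    if not 0 <= green_percent <= 100:
        raise ValueError("green_percent must be between 0 and 100")
    return {
        "Type": "forward",
        "ForwardConfig": {
            "TargetGroups": [
                {"TargetGroupArn": blue_tg_arn, "Weight": 100 - green_percent},
                {"TargetGroupArn": green_tg_arn, "Weight": green_percent},
            ]
        },
    }


BLUE_ARN = "blue-target-group-arn"    # placeholder, not a real ARN
GREEN_ARN = "green-target-group-arn"  # placeholder, not a real ARN

for step in (10, 50, 100):
    action = weighted_forward_action(BLUE_ARN, GREEN_ARN, step)
    weights = [tg["Weight"] for tg in action["ForwardConfig"]["TargetGroups"]]
    print(f"green {step}% -> weights {weights}")
```

In practice each step would be applied to the listener only after health checks on the green target group stay clean for a while.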

---

## 4. Database Layer

Q1: RDS MySQL is experiencing slow queries at peak traffic. How do you troubleshoot and optimize?

Testing: Performance tuning.

Answer:

Check CloudWatch metrics for CPU, memory, IOPS.

Enable Performance Insights to identify slow queries.

Add indexes, use Read Replicas for read-heavy workloads.

Consider Provisioned IOPS if storage is the bottleneck.

Q2: How do you migrate a large production database from on-prem to AWS with minimal downtime?

Testing: Database migration strategy.

Answer:

Use AWS DMS with Change Data Capture (CDC).

Pre-create schema on target RDS/Aurora.

Perform full load first, then replicate ongoing changes until cutover.
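
The full-load-then-CDC pattern can be illustrated with a toy model. This is not DMS itself: in-memory dicts stand in for tables, and the change log is a hand-written list of the changes that arrived while the full load was running.

```python
def full_load(source_rows):
    """Copy a point-in-time snapshot of the source table to the target."""
    return dict(source_rows)


def apply_changes(target, change_log):
    """Replay captured changes (insert/update/delete) on the target, in order."""
    for op, key, value in change_log:
        if op in ("insert", "update"):
            target[key] = value
        elif op == "delete":
            target.pop(key, None)
    return target


source_snapshot = {1: "alice", 2: "bob"}
target = full_load(source_snapshot)

# Changes that hit the source during and after the full load.
changes = [("update", 2, "bobby"), ("insert", 3, "carol"), ("delete", 1, None)]
target = apply_changes(target, changes)
print(target)
```

Cutover happens once the change log is drained and the target matches the source, which is why downtime is limited to the final switch.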

---

## 5. Automation & DevOps

Q1: Manual deployments lead to wrong versions being deployed. How would you automate?

Testing: CI/CD implementation.

Answer:

Use CodePipeline + CodeDeploy or integrate Jenkins/GitHub Actions.

Bake versioned AMIs using Packer.

Automate rollback strategies in deployment configs.

Q2: How would you automate IAM key rotation?

Testing: Security automation.

Answer:

Store keys in AWS Secrets Manager.

Rotate keys automatically using Lambda triggered via EventBridge.

Send notifications on rotation completion.
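
The core of a rotation Lambda is deciding which keys are due. A minimal sketch, assuming a 90-day maximum age; the field names mirror the access key metadata IAM returns (`AccessKeyId`, `Status`, `CreateDate`), and the sample keys are made up.

```python
from datetime import datetime, timedelta, timezone


def keys_due_for_rotation(access_keys, max_age_days=90, now=None):
    """Return IDs of active keys older than max_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in access_keys
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]


now = datetime(2025, 8, 15, tzinfo=timezone.utc)
sample_keys = [
    {"AccessKeyId": "AKIAEXAMPLEOLD", "Status": "Active",
     "CreateDate": datetime(2025, 1, 1, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLENEW", "Status": "Active",
     "CreateDate": datetime(2025, 8, 1, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLEOFF", "Status": "Inactive",
     "CreateDate": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]

due = keys_due_for_rotation(sample_keys, max_age_days=90, now=now)
print(due)
```

A full rotation would then create a new key, update the stored secret, deactivate the old key, and delete it after a grace period.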

---

## 6. Serverless

Q1: Your Lambda function has cold start issues affecting response time. How do you mitigate?

Testing: Serverless performance optimization.

Answer:

Use Provisioned Concurrency.

Keep function code and dependencies small.

Attach the function to a VPC only if it needs VPC resources; in that case, use VPC endpoints for AWS service calls so traffic avoids a NAT hop.
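
To size Provisioned Concurrency, a common starting estimate is requests per second multiplied by average duration. The traffic numbers below are assumptions for illustration.

```python
import math


def estimate_concurrency(requests_per_second, avg_duration_seconds):
    """Estimate concurrent executions: arrival rate times time in the function."""
    return math.ceil(requests_per_second * avg_duration_seconds)


# 200 requests/second at 250 ms average duration.
print(estimate_concurrency(200, 0.25))
```

Provisioning for the steady baseline and letting on-demand concurrency absorb bursts keeps the cost of pre-warmed environments in check.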

Q2: How would you secure a serverless REST API using API Gateway + Lambda?

Testing: Serverless security best practices.

Answer:

Enable IAM authorization or Cognito user pools.

Enable WAF to filter malicious requests.

Use usage plans with API keys and throttling to limit abuse; API keys identify clients and are not a substitute for authorization.
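
API Gateway throttling is based on the token bucket algorithm, with a steady rate and a burst limit. The sketch below is a simplified, single-process illustration of that algorithm, not API Gateway's implementation; timestamps are passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Simplified token bucket: refills at `rate` tokens/second up to `burst`."""

    def __init__(self, rate_per_second, burst):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now):
        """Return True if a request at time `now` (seconds) is allowed."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # would be answered with HTTP 429 by a real gateway


bucket = TokenBucket(rate_per_second=2, burst=3)
burst_results = [bucket.allow(now=0.0) for _ in range(5)]
after_one_second = bucket.allow(now=1.0)
print(burst_results, after_one_second)
```

The first three requests use up the burst, the next two are rejected, and a request one second later succeeds because the bucket has refilled.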

---

## 7. Monitoring & Logging

Q1: How do you monitor Lambda functions for performance and errors?

Testing: CloudWatch and observability.

Answer:

Enable CloudWatch Logs for function output.

Use CloudWatch Metrics & Alarms for errors and duration.

Use X-Ray for distributed tracing.

Q2: How do you set up centralized logging for multiple AWS accounts?

Testing: Cross-account logging & monitoring.

Answer:

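Aggregate logs into a central S3 bucket using CloudWatch Logs subscription filters, typically delivered through Kinesis Data Streams or Firehose.

Enable an organization-wide CloudTrail trail that delivers events from all accounts to the central bucket.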

Integrate with SIEM tools like Splunk.

---

## 8. Storage

Q1: Your S3 bucket stores critical backups. How do you ensure data durability, compliance, and disaster recovery?

Testing: Storage architecture & DR planning.

Answer:

Enable versioning and MFA Delete.

Use S3 Cross-Region Replication (CRR) for DR.

Apply server-side encryption (SSE-KMS).

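Set lifecycle policies to move data to Glacier/Deep Archive for low-cost long-term retention; where compliance requires write-once-read-many (WORM) retention, add S3 Object Lock.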
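
A lifecycle rule of this kind can be sketched as a configuration dict in the shape S3 expects for a bucket lifecycle configuration. The rule ID and the 30/180-day thresholds are assumptions; pick values that match your retention policy.

```python
def build_backup_lifecycle(glacier_after_days=30, deep_archive_after_days=180):
    """Build a lifecycle configuration that tiers backups to colder storage."""
    if deep_archive_after_days <= glacier_after_days:
        raise ValueError("Deep Archive transition must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-backups",   # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies to the whole bucket
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


lifecycle = build_backup_lifecycle()
print(lifecycle["Rules"][0]["Transitions"])
```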

Q2: EBS volume performance is low for a high-I/O database. What’s your solution?

Testing: Block storage optimization.

Answer:

Use Provisioned IOPS SSD (io2/io1).

Stripe multiple volumes with RAID 0 if necessary.

Enable EBS optimization on EC2.

Q3: When would you use EFS over S3?

Answer:

For shared file system access by multiple EC2 instances.

When NFS file semantics are required.

For low-latency file storage rather than object storage.

---

## 9. Databases

Q1: Your read-heavy RDS MySQL instance is under load. How do you scale reads efficiently?

Answer:

Implement Read Replicas.

If multi-region reads are needed, consider migrating to Aurora and using Aurora Global Database.

Offload repeated reads to an ElastiCache cache.

Q2: How would you optimize DynamoDB for high traffic spikes?

Answer:

Enable on-demand mode or auto-scaling read/write capacity.

Use partition keys with high cardinality.

Enable DAX (DynamoDB Accelerator) for caching.
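
When the natural partition key has low cardinality (for example, a date), one way to spread writes is to append a calculated shard suffix. The sketch below shows the idea only; the key format, shard count, and item IDs are assumptions.

```python
import hashlib
from collections import Counter


def sharded_partition_key(base_key, item_id, shard_count=10):
    """Append a deterministic shard suffix derived from the item ID."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shard_count
    return f"{base_key}#{shard}"


# 1000 orders written on the same day no longer share one partition key value.
keys = [sharded_partition_key("2025-08-15", f"order-{i}") for i in range(1000)]
distribution = Counter(keys)
print(len(distribution), "distinct partition key values")
```

The trade-off is on the read side: fetching everything for one day means querying each shard suffix and merging the results.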

Q3: What is the difference between Multi-AZ and Read Replica in RDS?

Answer:

Multi-AZ: High availability and failover, synchronous replication.

Read Replica: Scale reads, asynchronous replication.

---

## 10. DevOps Integration

Q1: How do you implement zero-downtime deployments with AWS CodeDeploy?

Answer:

Use Blue/Green deployment to switch traffic to the new version gradually.

Validate using health checks and rollback on failure.

Integrate ALB target group weighting for incremental traffic shift.

Q2: How would you automate AMI creation for multiple environments?

Answer:

Use Packer templates to build AMIs with required packages/config.

Automate builds with CodePipeline or Lambda triggers.

Version AMIs for dev/test/prod environments.

Q3: How do you integrate CI/CD with containerized apps in ECS/EKS?

Answer:

Build Docker images using CodeBuild or Jenkins.

Push to ECR.

Deploy to ECS/Fargate or EKS using CodePipeline or ArgoCD.

---

## 11. Migration & Hybrid

Q1: You need to migrate petabytes of on-premises data to AWS quickly. What options do you use?

Answer:

AWS Snowball Edge devices for large-scale offline transfer (Snowmobile, the truck-sized option, has since been retired).

Direct Connect for high-bandwidth online transfer.

Use AWS DataSync for incremental sync.
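
A quick calculation shows why offline transfer is considered at petabyte scale. The link speed and the 80% utilization figure are assumptions; real throughput depends on the network path and the transfer tool.

```python
def transfer_days(data_bytes, link_gbps, utilization=0.8):
    """Days needed to move data_bytes over a link at the given utilization."""
    bits = data_bytes * 8
    effective_bits_per_second = link_gbps * 1e9 * utilization
    return bits / effective_bits_per_second / 86400


one_petabyte = 10**15  # decimal petabyte, in bytes
print(round(transfer_days(one_petabyte, link_gbps=10), 1), "days per PB on a 10 Gbps link")
```

Multiply by the number of petabytes to compare against the turnaround time of shipping devices.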

Q2: How do you design a hybrid cloud setup for on-prem apps using AWS?

Answer:

Establish VPN or Direct Connect for secure connectivity.

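Use Transit Gateway for centralized routing between on-prem and multiple VPCs; VPC Peering is not transitive, so it suits only direct VPC-to-VPC links.

Extend Active Directory via AWS Managed Microsoft AD (with a trust to the on-prem domain) for authentication.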

---

## 12. Cost Optimization

Q1: Your AWS monthly bill is high. How would you identify and optimize costs?

Answer:

Use AWS Cost Explorer & Trusted Advisor for resource recommendations.

Analyze underutilized EC2 instances and EBS volumes.

Switch to Reserved Instances or Savings Plans for predictable workloads.

Use Spot Instances for non-critical workloads.

Q2: How would you reduce costs in a multi-region deployment?

Answer:

Review cross-region replication usage; replicate only necessary data.

Optimize data transfer costs by using CloudFront for caching.

Right-size instances and remove idle resources.

Q3: How do you manage cost in serverless applications?

Answer:

Monitor Lambda invocation metrics.

Optimize memory allocation and execution time.

Use DynamoDB on-demand mode for unpredictable workloads.
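
The effect of memory and duration on the Lambda bill can be estimated from GB-seconds plus request charges. The prices are passed in as parameters because they vary by region and architecture; the values used below are assumed example rates, and the free tier is ignored.

```python
def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_gb_second, price_per_million_requests):
    """Estimate monthly Lambda cost from compute (GB-seconds) and request charges."""
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = invocations / 1_000_000 * price_per_million_requests
    return compute_cost + request_cost


# Assumed example rates; check the current pricing page for real numbers.
RATE_GB_SECOND = 0.0000166667
RATE_PER_MILLION = 0.20

before = lambda_monthly_cost(50_000_000, 400, 1024, RATE_GB_SECOND, RATE_PER_MILLION)
after = lambda_monthly_cost(50_000_000, 200, 1024, RATE_GB_SECOND, RATE_PER_MILLION)
print(f"before tuning: ${before:.2f}, after halving duration: ${after:.2f}")
```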

---

## 13. Advanced Architecture / High Availability

Q1: Design a multi-region highly available web application. What’s your approach?

Answer:

Deploy ALB + EC2/ECS in multiple AZs per region.

Use Route 53 latency-based routing across regions.

Replicate databases using Aurora Global DB or DynamoDB global tables.

Use CloudFront for global content caching.

Q2: How do you design for Disaster Recovery (DR) in AWS?

Answer:

Backup & restore: Snapshots, S3/Glacier backups.

Pilot light: Minimal resources in DR region, scale when needed.

Warm standby: Scaled-down environment running continuously.

Multi-site: Fully active-active in multiple regions.

Q3: Define RTO and RPO.

Answer:

RTO (Recovery Time Objective): Max tolerable downtime.

RPO (Recovery Point Objective): Max tolerable data loss.

Choose DR strategy based on business requirements.
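
A simple way to test a plan against these objectives: with periodic backups, worst-case data loss equals the backup interval, and downtime is at least the restore time. The numbers below are made-up examples.

```python
def meets_objectives(backup_interval_hours, restore_hours, rpo_hours, rto_hours):
    """Check a backup-and-restore plan against RPO and RTO targets."""
    # Worst case: the failure happens just before the next backup would run.
    worst_case_data_loss = backup_interval_hours
    return {
        "rpo_met": worst_case_data_loss <= rpo_hours,
        "rto_met": restore_hours <= rto_hours,
    }


# Daily backups, 6-hour restore, against a 4-hour RPO and an 8-hour RTO.
result = meets_objectives(backup_interval_hours=24, restore_hours=6,
                          rpo_hours=4, rto_hours=8)
print(result)
```

Here the RTO is met but the RPO is not, which points toward more frequent backups or a replication-based strategy such as pilot light or warm standby.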

Gupta

Friday, 15 August 2025
