Most small and mid-size businesses overspend on AWS by 25-40%. The reasons are consistent: resources provisioned during a sprint and never cleaned up, default instance sizes that far exceed actual demand, and a lack of commitment-based discounts for stable workloads. The good news is that most of these savings do not require architectural changes or downtime. This checklist organizes 25+ specific actions by category and estimated impact so your team can work through them systematically.
We built this checklist from hundreds of AWS cost reviews across SaaS companies, healthcare organizations, e-commerce platforms, and professional services firms. Every item is something we have seen deliver real savings in production environments.
1. Quick Wins You Can Do Today
These are the low-hanging fruit. Each item takes minutes to identify and execute, yet we routinely find hundreds of dollars per month hiding in these categories. Start here before tackling anything else.
1.1 Delete Unused EBS Volumes
When you terminate an EC2 instance, its attached EBS volumes are not always deleted automatically. These orphaned volumes sit in your account accruing charges at the provisioned rate, even though nothing is reading from or writing to them. A single forgotten 500GB gp3 volume costs about $40/month. Check the EC2 console under Volumes, filter for "available" status, and delete anything not attached to a running instance.
Estimated impact: $50-500/month depending on environment size.
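As a quick illustration of this audit, the sketch below filters volume records for "available" status and totals the monthly waste. The volume records are hypothetical sample data shaped like the EC2 API response, and the $0.08/GB-month gp3 rate is the us-east-1 list price at time of writing; check your own region.

```python
# Sketch: find orphaned ("available") EBS volumes and estimate their monthly cost.
# In a real account you would pull the list with boto3, e.g.:
#   volumes = boto3.client("ec2").describe_volumes(
#       Filters=[{"Name": "status", "Values": ["available"]}])["Volumes"]
GP3_RATE_PER_GB_MONTH = 0.08  # us-east-1 gp3 storage price; assumption

volumes = [
    {"VolumeId": "vol-0aaa", "State": "available", "Size": 500, "VolumeType": "gp3"},
    {"VolumeId": "vol-0bbb", "State": "in-use",    "Size": 100, "VolumeType": "gp3"},
    {"VolumeId": "vol-0ccc", "State": "available", "Size": 250, "VolumeType": "gp3"},
]

orphaned = [v for v in volumes if v["State"] == "available"]
monthly_waste = sum(v["Size"] * GP3_RATE_PER_GB_MONTH for v in orphaned)

for v in orphaned:
    print(f'{v["VolumeId"]}: {v["Size"]} GB, ~${v["Size"] * GP3_RATE_PER_GB_MONTH:.2f}/month')
print(f"Total estimated waste: ${monthly_waste:.2f}/month")
```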
1.2 Release Unused Elastic IPs
AWS charges $3.60/month ($0.005/hour) for every Elastic IP that is allocated but not associated with a running instance. Since early 2024, the same hourly rate applies to all public IPv4 addresses, so an idle EIP is pure waste. This is easy to overlook after decommissioning an environment. Navigate to VPC > Elastic IPs and release any addresses not currently attached. Beyond cost, unused EIPs also represent a small security surface area you can eliminate.
Estimated impact: $5-50/month per unattached IP.
1.3 Clean Up Old EBS Snapshots
EBS snapshots are incremental, but they still accumulate cost over time, especially if you are snapshotting large databases daily. Audit your snapshot inventory and delete anything older than your recovery point objective (RPO) requires. Many teams discover thousands of dollars in snapshot storage once they look. Use AWS Backup lifecycle policies or a simple script to automate retention going forward.
Estimated impact: $100-1,000/month for teams with extensive snapshot histories.
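A simple retention script comes down to comparing each snapshot's start time against a cutoff. The sketch below shows that logic on hypothetical sample data; a real inventory would come from boto3's describe_snapshots, and the 30-day window is an assumption you should replace with your actual RPO.

```python
from datetime import datetime, timedelta, timezone

# Sketch: flag EBS snapshots older than your retention window.
# A real inventory would come from:
#   boto3.client("ec2").describe_snapshots(OwnerIds=["self"])["Snapshots"]
RETENTION_DAYS = 30  # assumption: set this to your actual RPO

now = datetime(2025, 6, 1, tzinfo=timezone.utc)  # fixed "now" for the example
snapshots = [
    {"SnapshotId": "snap-0aaa", "StartTime": now - timedelta(days=400)},
    {"SnapshotId": "snap-0bbb", "StartTime": now - timedelta(days=7)},
    {"SnapshotId": "snap-0ccc", "StartTime": now - timedelta(days=90)},
]

cutoff = now - timedelta(days=RETENTION_DAYS)
expired = [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]
print("Candidates for deletion:", expired)
```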
1.4 Terminate Long-Stopped EC2 Instances
Stopped instances do not incur compute charges, but their attached EBS volumes, Elastic IPs, and any associated resources continue billing. If an instance has been stopped for more than two weeks, it is almost certainly not needed. Create an AMI if you want to preserve the configuration, then terminate the instance and clean up its storage.
Estimated impact: $50-300/month per stopped instance with attached storage.
1.5 Remove Unused Load Balancers
Application Load Balancers cost roughly $16/month even with zero traffic, and Network Load Balancers bill at a comparable hourly rate. After decommissioning services or consolidating environments, old load balancers frequently remain. Check each ALB and NLB for registered targets. If a load balancer has no healthy targets and no recent traffic in CloudWatch, delete it.
Estimated impact: $16-100/month per unused load balancer.
1.6 Delete Unused NAT Gateways
NAT Gateways cost $32/month each before any data processing charges. If you have multiple VPCs or AZs with NAT Gateways that no longer serve active subnets, removing them is an immediate win. Verify by checking route tables and flow logs before deletion.
Estimated impact: $32-100/month per unused NAT Gateway.
2. Rightsizing: Match Resources to Actual Demand
Rightsizing is consistently the single highest-impact optimization for SMBs. Most teams provision for peak load and then never revisit, which means instances run at 10-20% average utilization. AWS Compute Optimizer and CloudWatch metrics make this analysis straightforward.
2.1 Analyze CPU and Memory Utilization
Enable AWS Compute Optimizer for your account (it is free) and review its recommendations. Any instance consistently running below 40% average CPU utilization is a candidate for downsizing. For memory analysis, you will need the CloudWatch agent installed. Look at 14-day trends rather than point-in-time snapshots to account for periodic spikes.
Estimated impact: Identifies 30-60% of instances as over-provisioned.
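The trend analysis itself is simple once the metrics are in hand. The sketch below flags instances whose 14-day average CPU falls under the 40% cutoff mentioned above; the datapoints are hypothetical, standing in for what CloudWatch's get_metric_statistics would return for CPUUtilization over a 14-day window.

```python
# Sketch: flag instances whose 14-day average CPU is below a downsizing threshold.
# Each list holds one daily average per day, as CloudWatch would report it.
THRESHOLD = 40.0  # percent; the cutoff suggested above

daily_avg_cpu = {
    "i-0web": [12, 15, 11, 14, 13, 10, 12, 16, 11, 13, 12, 14, 15, 12],
    "i-0api": [55, 62, 58, 60, 57, 59, 61, 63, 56, 58, 60, 62, 59, 57],
}

for instance_id, points in daily_avg_cpu.items():
    avg = sum(points) / len(points)
    verdict = "downsize candidate" if avg < THRESHOLD else "leave as-is"
    print(f"{instance_id}: {avg:.1f}% avg CPU -> {verdict}")
```

Averaging over the full window, rather than eyeballing a point-in-time graph, keeps periodic spikes from masking chronic over-provisioning.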
2.2 Downsize Over-Provisioned Instances
Once you have utilization data, downsize incrementally. Moving from an m5.xlarge to an m5.large cuts your cost in half. Perform changes during maintenance windows and monitor for 48 hours afterward. Many teams find they can drop two size categories without any performance impact. For databases, start with read replicas and non-production instances first.
Estimated impact: 30-50% reduction per rightsized instance.
2.3 Switch to Current-Generation Instance Families
Older instance families like m4 and c4 cost more per unit of compute than their current-generation equivalents (m7i, c7i). AWS pricing rewards you for staying current. A move from m5.large to m7i.large typically delivers 15-20% better price-performance with no application changes required.
Estimated impact: 10-20% savings per migrated instance.
2.4 Evaluate ARM-Based Graviton Instances
AWS Graviton (m7g, c7g, r7g) instances deliver up to 40% better price-performance than x86 equivalents for most workloads. If your application runs on Linux and does not depend on x86-specific binaries, Graviton is typically a straightforward migration. Containerized workloads and applications on interpreted or managed runtimes (Python, Node.js, the JVM) are particularly good candidates, since they have no architecture-specific binaries to rebuild.
Estimated impact: 20-40% cost reduction per instance.
2.5 Rightsize RDS Instances
Database instances are among the most frequently over-provisioned resources. Many teams launch with db.r5.2xlarge during initial setup and never revisit. Check Performance Insights and CloudWatch for CPU, memory, and connection count. If your database runs under 30% CPU and has ample free memory, downsize it. For non-production RDS instances, consider db.t4g burstable classes.
Estimated impact: $200-2,000/month per over-provisioned database.
3. Commitment Strategies: Lock In Discounts
Once you have rightsized your resources, you have a clear picture of your baseline demand. This is when commitment-based discounts become powerful. The key principle: only commit to what you know you will use.
3.1 Start with Compute Savings Plans
Compute Savings Plans offer the most flexibility. They apply automatically across EC2, Fargate, and Lambda in any region or instance family. Commit to your sustained baseline (the minimum compute you always run) with a 1-year no-upfront plan. This alone delivers 20-30% savings with minimal risk. Use the AWS Cost Explorer Savings Plans recommendations page to see your optimal commitment level.
Estimated impact: 20-30% on committed compute spend.
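The commitment sizing reduces to straightforward arithmetic. The sketch below applies an assumed 28% discount (illustrative; actual 1-year no-upfront rates vary by instance family and region) to a hypothetical $2.00/hour sustained baseline. Pull your real figures from Cost Explorer's recommendations page rather than guessing.

```python
# Sketch of the commitment math: commit to your sustained hourly baseline and
# estimate savings at an assumed Compute Savings Plan discount rate.
HOURS_PER_MONTH = 730
SAVINGS_PLAN_DISCOUNT = 0.28  # assumption; real rates vary by family and region

baseline_hourly_on_demand = 2.00  # hypothetical: compute you run 24/7, in on-demand $/hour

committed_hourly = baseline_hourly_on_demand * (1 - SAVINGS_PLAN_DISCOUNT)
monthly_savings = (baseline_hourly_on_demand - committed_hourly) * HOURS_PER_MONTH
print(f"Commit ${committed_hourly:.2f}/hour; save about ${monthly_savings:.0f}/month")
```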
3.2 Layer in EC2 Instance Savings Plans for Stable Workloads
For workloads that will stay on a specific instance family in a specific region (such as a production database cluster), EC2 Instance Savings Plans offer deeper discounts than Compute Savings Plans, typically 30-40% on 1-year terms. Use these to cover your most predictable resources after your Compute Savings Plan covers the flexible baseline.
Estimated impact: Additional 5-10% beyond Compute Savings Plans for stable workloads.
3.3 Use Reserved Instances for RDS and ElastiCache
Savings Plans do not cover RDS, ElastiCache, Redshift, or OpenSearch. For these services, Reserved Instances remain the discount mechanism. If your production database has been running the same instance type for six months, a 1-year no-upfront RI will save you 30-40%. Always rightsize before committing.
Estimated impact: 30-40% on covered database and cache spend.
3.4 Know When NOT to Commit
Do not purchase commitments for workloads that may change significantly in the next 12 months. This includes instances you plan to migrate to containers, workloads under active refactoring, development environments that scale to zero on weekends, and any service you are evaluating alternatives for. Over-committing wastes money just as surely as under-committing does.
Estimated impact: Avoids locking in $1,000s on resources you may not need.
4. Architecture Optimization
These changes require more planning but deliver outsized returns. Each one addresses a structural pattern that causes ongoing waste.
4.1 Use Spot Instances for Fault-Tolerant Workloads
Spot instances offer 60-90% discounts compared to on-demand pricing. They are well-suited for batch processing, CI/CD pipelines, data analysis jobs, and any workload that can tolerate interruption. Use Spot Fleet or EC2 Auto Scaling with capacity-optimized allocation to minimize interruptions. Many teams start with their CI/CD runners, which is a low-risk entry point.
Estimated impact: 60-90% reduction for eligible workloads.
4.2 Implement S3 Lifecycle Policies
Most S3 data is accessed frequently for the first 30 days, then rarely. Configure lifecycle rules to transition objects to S3 Standard-IA after 30 days, Glacier Flexible Retrieval after 90 days, and Glacier Deep Archive after 180 days. For buckets with millions of objects, use S3 Intelligent-Tiering which automates transitions based on access patterns for a small monitoring fee.
Estimated impact: 40-70% reduction on S3 storage costs.
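The tiering schedule above can be expressed as a single lifecycle rule. The sketch below builds one in the shape boto3's put_bucket_lifecycle_configuration expects; the bucket name and "logs/" prefix are hypothetical, and the day thresholds should match your own access patterns.

```python
# Sketch of a lifecycle rule matching the 30/90/180-day tiering schedule above.
lifecycle_config = {
    "Rules": [
        {
            "ID": "tier-down-then-archive",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # hypothetical prefix
            "Transitions": [
                {"Days": 30,  "StorageClass": "STANDARD_IA"},
                {"Days": 90,  "StorageClass": "GLACIER"},
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

# Applying it would look like:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=lifecycle_config)
print([t["StorageClass"] for t in lifecycle_config["Rules"][0]["Transitions"]])
```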
4.3 Choose the Right Storage Tier
Beyond S3 lifecycle policies, review your EBS volume types. Many workloads run on gp2 volumes that should be gp3 (20% cheaper with better baseline performance). Databases that need high IOPS might be on io2 when gp3 with provisioned IOPS would suffice. And if you are using EFS, check whether your access patterns justify the per-GB premium over S3 or FSx.
Estimated impact: 20-50% reduction on storage costs.
4.4 Go Serverless for Intermittent Workloads
If you have workloads that run for a few minutes per hour or handle bursty traffic, Lambda, Fargate, or Step Functions eliminate the cost of idle compute. A cron job running on a dedicated t3.medium costs about $30/month. The same job on Lambda might cost $0.50/month. API endpoints with variable traffic are another strong candidate for serverless migration.
Estimated impact: 70-95% reduction for suitable workloads.
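To make the cron-job comparison above concrete, the sketch below prices a hypothetical job that runs once an hour for 30 seconds at 512 MB. The rates are us-east-1 list prices at time of writing; treat them as assumptions and re-check current pricing.

```python
# Sketch of the cost comparison: an hourly cron job on a dedicated t3.medium
# versus the same job on Lambda.
T3_MEDIUM_HOURLY = 0.0416                 # on-demand $/hour; assumption
LAMBDA_GB_SECOND = 0.0000166667           # $/GB-second; assumption
LAMBDA_PER_REQUEST = 0.20 / 1_000_000     # $/request; assumption
HOURS_PER_MONTH = 730

# Assumed job profile: runs once an hour for 30 seconds at 512 MB.
invocations = HOURS_PER_MONTH
duration_s, memory_gb = 30, 0.5

ec2_cost = T3_MEDIUM_HOURLY * HOURS_PER_MONTH
lambda_cost = (invocations * duration_s * memory_gb * LAMBDA_GB_SECOND
               + invocations * LAMBDA_PER_REQUEST)
print(f"EC2: ${ec2_cost:.2f}/month, Lambda: ${lambda_cost:.2f}/month")
```

The gap is two orders of magnitude because the EC2 instance bills for every idle hour while Lambda bills only for the seconds the job actually runs.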
4.5 Schedule Non-Production Environments
Development, staging, and QA environments rarely need to run 24/7. Use AWS Instance Scheduler or a simple Lambda function to shut down non-production resources outside business hours and on weekends. Running environments only during business hours (10 hours/day, 5 days/week) cuts their cost by roughly 70%.
Estimated impact: 65-70% reduction on non-production compute.
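The core of a scheduler Lambda is a pure time-window check, sketched below with an assumed 08:00-18:00 Monday-Friday window. A real handler would follow the decision with ec2.stop_instances or start_instances calls via boto3, typically scoped by a tag like Environment=dev.

```python
from datetime import datetime

# Sketch of scheduler decision logic: keep non-production instances up only
# during an assumed 10-hour weekday business window.
def should_be_running(now: datetime) -> bool:
    is_weekday = now.weekday() < 5   # Monday=0 .. Friday=4
    in_hours = 8 <= now.hour < 18    # 08:00-18:00 local time
    return is_weekday and in_hours

print(should_be_running(datetime(2025, 6, 2, 10, 0)))  # a Monday morning
print(should_be_running(datetime(2025, 6, 7, 10, 0)))  # a Saturday morning
```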
4.6 Consolidate Idle Databases
Many teams maintain separate RDS instances for each microservice or environment. If several databases run at less than 5% utilization, consider consolidating them onto shared instances or migrating non-production databases to Aurora Serverless v2, which scales to zero during inactivity.
Estimated impact: $200-1,500/month per consolidated database.
5. Monitoring and Governance
Optimization is not a one-time project. Without ongoing monitoring and governance, costs drift back up within months. These items establish the visibility and accountability framework that keeps savings permanent.
5.1 Set Up AWS Billing Alerts
At minimum, create CloudWatch billing alarms at 80%, 100%, and 120% of your expected monthly spend. This catches runaway costs from misconfigured auto-scaling, forgotten test resources, or data transfer spikes before they become painful. Configure alerts to notify your team via SNS (email, Slack, or PagerDuty).
Estimated impact: Prevents surprise bills; typically saves $500-5,000 per incident caught early.
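The three alarms can be generated from one loop. The sketch below builds them as the keyword arguments boto3's put_metric_alarm accepts; billing metrics live in us-east-1 under the AWS/Billing namespace. The $8,000 expected spend and the SNS topic ARN are hypothetical placeholders.

```python
# Sketch: the 80/100/120% billing alarms described above, as put_metric_alarm kwargs.
EXPECTED_MONTHLY_SPEND = 8000  # hypothetical expected monthly bill in USD

alarms = []
for pct in (80, 100, 120):
    alarms.append({
        "AlarmName": f"billing-{pct}pct",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # the billing metric updates roughly every 6 hours
        "EvaluationPeriods": 1,
        "ComparisonOperator": "GreaterThanThreshold",
        "Threshold": EXPECTED_MONTHLY_SPEND * pct / 100,
        "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # hypothetical
    })

# Each dict would be passed as boto3.client("cloudwatch").put_metric_alarm(**alarm).
print([a["Threshold"] for a in alarms])
```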
5.2 Configure AWS Budgets
AWS Budgets goes beyond simple billing alerts. Create budgets per team, per project, and per environment. Set both actual and forecasted spend thresholds. Configure automated actions to restrict IAM permissions or stop specific resources when budgets are exceeded. This gives you proactive cost governance without requiring manual monitoring.
Estimated impact: Provides early warning; reduces budget overruns by 30-50%.
5.3 Implement a Tagging Strategy
You cannot optimize what you cannot measure. Require tags for Environment (production, staging, dev), Team or Owner, Project, and Cost Center on every resource. Use AWS Organizations Tag Policies to enforce compliance. Accurate tagging is the foundation of cost allocation, chargeback models, and anomaly detection. Without it, every other governance initiative is guesswork.
Estimated impact: Enables all other optimization efforts; foundational.
5.4 Conduct Monthly Cost Reviews
Schedule a 30-minute monthly review with your engineering and finance leads. Walk through Cost Explorer trends, review the top 10 cost drivers, flag anomalies, and track progress against optimization goals. Teams that conduct regular reviews maintain their savings; teams that skip them see costs creep back up within two to three months.
Estimated impact: Maintains all other savings; prevents 10-20% cost drift.
5.5 Enable AWS Cost Anomaly Detection
AWS Cost Anomaly Detection uses machine learning to identify unusual spending patterns. Configure monitors for each service and linked account. It catches issues like a misconfigured Lambda generating millions of invocations or an auto-scaling group that scaled up and never came back down. There is no additional cost for this service.
Estimated impact: Early detection prevents $500-10,000+ in runaway spending.
6. Data Transfer Optimization
Data transfer costs are the most opaque charges on most AWS bills. They show up across dozens of line items and are easy to ignore until they become a significant percentage of your total spend. These strategies address the most common sources of transfer waste.
6.1 Use VPC Endpoints for AWS Service Traffic
Every API call from your VPC to an AWS service (S3, DynamoDB, SQS, etc.) that traverses a NAT Gateway incurs data processing charges. Gateway VPC Endpoints for S3 and DynamoDB are free and eliminate these charges entirely. Interface VPC Endpoints for other services cost about $7.20/month per AZ but save significantly more in NAT Gateway processing fees for high-traffic services.
Estimated impact: $100-2,000/month depending on traffic volume.
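The break-even math here is worth seeing in numbers. The sketch below prices a hypothetical 2 TB/month of S3 traffic currently routed through a NAT Gateway against a free Gateway VPC Endpoint; the $0.045/GB processing rate is the us-east-1 list price at time of writing.

```python
# Sketch: S3 traffic through a NAT Gateway versus a free Gateway VPC Endpoint.
NAT_PROCESSING_PER_GB = 0.045  # us-east-1 NAT data processing rate; assumption

monthly_s3_gb = 2000  # hypothetical S3 traffic currently routed via NAT
nat_processing_cost = monthly_s3_gb * NAT_PROCESSING_PER_GB
print(f"NAT processing for S3 traffic: ${nat_processing_cost:.2f}/month")
print("Gateway endpoint cost for the same traffic: $0.00/month")
```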
6.2 Deploy CloudFront for Content Delivery
CloudFront data transfer out pricing is 30-50% cheaper than direct EC2 or S3 data transfer. Even if your users are all in one region, serving static content and API responses through CloudFront reduces your data transfer costs. The caching layer also reduces load on your origin servers. Enable compression to further reduce bytes transferred.
Estimated impact: 30-50% reduction on outbound data transfer costs.
6.3 Keep Communicating Resources in the Same Availability Zone
Cross-AZ data transfer costs $0.01/GB in each direction. For services that communicate heavily (such as an application server and its database), placing them in the same AZ eliminates this cost. This does not mean sacrificing availability. Use multi-AZ for your database failover replica, but route primary application traffic to same-AZ endpoints where possible.
Estimated impact: $50-500/month for chatty architectures.
6.4 Audit Cross-Region Data Transfer
Cross-region data transfer is significantly more expensive than within-region transfer. Review whether you have S3 replication, database replication, or service calls going across regions unnecessarily. If your disaster recovery region does not need real-time replication, consider less frequent batch synchronization to reduce ongoing transfer costs.
Estimated impact: $100-1,000/month per unnecessary cross-region flow.
Putting It All Together
The most effective approach is to work through this checklist in order. Quick wins deliver immediate savings and build momentum. Rightsizing establishes your true baseline. Commitment strategies lock in discounts on that baseline. Architecture changes address structural waste. And governance ensures everything stays optimized over time.
For a typical SMB spending $10,000-50,000/month on AWS, this checklist identifies $2,000-20,000/month in savings. The quick wins and rightsizing alone usually account for 60-70% of the total opportunity, and they can be implemented within a few weeks.
The hardest part is not the technical implementation. It is finding the time and focus to work through these items systematically when your team is already stretched thin building product features and handling production issues. That is exactly where outside expertise pays for itself many times over.