10 AWS Cost Mistakes Startups Make Every Month
These are not edge cases. They are the same patterns found in account after account - over-provisioned compute, no commitments, silent NAT Gateway charges, and no one looking at the bill. Together they typically account for 30–40% of total AWS spend.
Why the Same Mistakes Keep Showing Up
AWS makes provisioning fast and cheap to start. It charges nothing upfront and everything ongoing. The default settings are generous at small scale and expensive at Series A–C scale. Every team that moves fast in their first 18 months accumulates the same set of inefficiencies.
Right-sizing gets deferred because "prod is stable, don't touch it." Savings Plans get deferred because "we're not sure what we'll need next year." Tagging never gets retrofitted because it feels low-priority. None of these are engineering failures - they are the predictable result of a team focused on shipping product.
Published benchmark: a Series A SaaS startup ($47,200/month AWS) cut its bill to $12,100/month - a 74% reduction - by addressing the same 10 mistake categories below. Source: ZeonEdge case study.
Over-provisioned EC2 instances
$2,000–8,000/month
Why it happens
Teams provision for peak load that arrives 2% of the time. Lift-and-shift migrations land on the same instance sizes as on-prem - and nobody revisits. AWS Compute Optimizer reports go unread.
How to fix
Enable Compute Optimizer and review CPU and memory utilisation over 14 days. Most startups average 8–15% CPU - drop one instance size (e.g. m5.2xlarge → m5.xlarge), or move to a cheaper family Compute Optimizer recommends. Right-sizing alone cuts compute costs 30–60% with zero user impact.
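The 8–15% figure makes the decision rule easy to automate. A minimal sketch, assuming 14-day average utilisation figures pulled from Compute Optimizer or CloudWatch (CPUUtilization, plus mem_used_percent from the CloudWatch agent); the thresholds are illustrative, not official Compute Optimizer logic:

```python
def downsize_candidate(avg_cpu_pct: float, avg_mem_pct: float,
                       cpu_max: float = 20.0, mem_max: float = 40.0) -> bool:
    """True when both CPU and memory averages sit well below capacity
    for the whole observation window (thresholds are assumptions)."""
    return avg_cpu_pct < cpu_max and avg_mem_pct < mem_max

print(downsize_candidate(12.0, 25.0))  # typical startup instance -> True
print(downsize_candidate(65.0, 70.0))  # genuinely busy -> False
```

Requiring both metrics to be low avoids downsizing memory-bound services that merely look idle on CPU.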
Paying on-demand for workloads that never turn off
$3,000–10,000/month
Why it happens
"We'll sort commitments later" becomes the permanent state. Confusion between Reserved Instances and Savings Plans leads to inaction. Meanwhile, EC2, RDS, and Fargate run 24/7 on full on-demand pricing.
How to fix
After rightsizing, buy 1-year no-upfront Compute Savings Plans for your steady-state baseline. A 36% discount on on-demand pricing applies automatically across EC2, Lambda, and Fargate. One startup saved $8,600/month from this alone.
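The sizing maths can be sketched in a few lines. The 70% baseline fraction and the 36% discount below are assumptions to tune against your own Cost Explorer data, not guaranteed rates:

```python
HOURS_PER_MONTH = 730  # AWS billing convention

def savings_plan_estimate(monthly_on_demand: float,
                          baseline_fraction: float = 0.70,
                          discount: float = 0.36) -> dict:
    """Rough hourly commitment and monthly saving for a 1-year no-upfront
    Compute Savings Plan covering only the steady-state baseline."""
    baseline = monthly_on_demand * baseline_fraction
    return {
        "hourly_commitment": round(baseline * (1 - discount) / HOURS_PER_MONTH, 2),
        "monthly_savings": round(baseline * discount, 2),
    }

print(savings_plan_estimate(10_000))
# {'hourly_commitment': 6.14, 'monthly_savings': 2520.0}
```

Committing only to the baseline (not peak) means bursty traffic still runs on-demand, so you never pay for unused commitment.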
Non-production environments running 24/7
$800–3,000/month
Why it happens
Dev and staging environments are provisioned at production-equivalent size for convenience, then left running nights, weekends, and bank holidays. Nobody owns the cost and nobody notices until a budget review.
How to fix
Use AWS Instance Scheduler or a Lambda cron to stop non-production EC2 and RDS outside business hours (8am–8pm weekdays). This cuts non-production compute by ~65%. One startup went from $1,200 to $250/month on staging alone.
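The schedule logic itself is tiny. A sketch of the decision function a Lambda cron could use (the `Environment=staging` tag key in the comment is an assumption - use whatever convention you enforce):

```python
from datetime import datetime

def in_business_hours(now: datetime) -> bool:
    # 8am-8pm local, Monday (0) through Friday (4); weekends always off
    return now.weekday() < 5 and 8 <= now.hour < 20

# An hourly EventBridge rule can invoke a Lambda that calls this, then
# ec2.stop_instances(...) / rds.stop_db_instance(...) for resources
# tagged Environment=staging.
print(in_business_hours(datetime(2024, 6, 3, 10)))  # Monday 10:00 -> True
print(in_business_hours(datetime(2024, 6, 1, 10)))  # Saturday -> False
```

Running only 60 of 168 weekly hours is where the ~65% figure comes from.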
NAT Gateway used as a free pipe
$1,500–5,000/month
Why it happens
Default VPC architectures route all outbound traffic through a NAT Gateway. S3 and DynamoDB calls from private subnets incur $0.045/GB in data-processing charges on top of the gateway's per-hour baseline. At any scale, this compounds fast.
How to fix
Add free S3 and DynamoDB Gateway Endpoints - these route traffic directly from your VPC without traversing NAT. For ECR and CloudWatch Logs, add Interface Endpoints (these carry a small hourly charge but typically cost far less than the NAT processing they replace). One team saved $3,200/month from this single change.
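The size of the waste is easy to estimate before touching any infrastructure. A quick sketch, using the $0.045/GB us-east-1 data-processing rate (an assumption - rates vary by region):

```python
def nat_processing_cost(gb_per_month: float, rate_per_gb: float = 0.045) -> float:
    """Monthly NAT data-processing charge for traffic that a free
    S3/DynamoDB Gateway Endpoint would bypass entirely."""
    return round(gb_per_month * rate_per_gb, 2)

print(nat_processing_cost(20_000))  # 20 TB/month of S3 traffic -> 900.0
```

Compare that figure against VPC Flow Logs for your NAT ENIs to see what share of NAT traffic is actually S3 or DynamoDB.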
Orphaned and zombie resources quietly billing
$500–4,000/month
Why it happens
Cloud provisioning takes seconds; cleanup is nobody's job. Test EC2 instances from sprints, unattached EBS volumes from terminated instances, load balancers pointing at nothing, and Elastic IPs reserved for instances that no longer exist - all keep billing indefinitely.
How to fix
List unattached EBS volumes with `aws ec2 describe-volumes --filters Name=status,Values=available`. Check ALB RequestCount in CloudWatch over 30 days - zero-traffic ALBs at $16–30/month each should be deleted. Enforce Owner and Environment tags to prevent future accumulation.
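To put a number on what that describe-volumes query surfaces, a small illustrative calculation ($0.08/GB-month is the us-east-1 gp3 rate - an assumption; adjust for your region and volume types):

```python
def orphaned_ebs_cost(volume_sizes_gb, rate_per_gb: float = 0.08) -> float:
    """Monthly bill for a list of unattached volume sizes, in GB."""
    return round(sum(volume_sizes_gb) * rate_per_gb, 2)

print(orphaned_ebs_cost([100, 500, 50]))  # 650 GB forgotten -> 52.0/month
```

Even modest leftovers add up: a dozen forgotten 100 GB volumes is roughly $100/month, every month, forever.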
S3 stored entirely in Standard class
$300–2,000/month
Why it happens
"S3 is cheap" is true at small scale. At scale, 80% of objects are typically not accessed in 90+ days. Without lifecycle policies, years of backups, logs, and media accumulate at $0.023/GB/month with no automated tiering.
How to fix
Enable S3 Intelligent-Tiering for files accessed unpredictably. Add a lifecycle rule to move files older than 90 days to Glacier Instant Retrieval (67% cheaper). Objects older than 180 days move to Glacier Deep Archive (96% cheaper).
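The lifecycle rule described above, expressed in the shape boto3's `put_bucket_lifecycle_configuration` expects. The rule ID and bucket name are placeholders:

```python
lifecycle = {
    "Rules": [{
        "ID": "tier-old-objects",
        "Status": "Enabled",
        "Filter": {"Prefix": ""},  # apply to the whole bucket
        "Transitions": [
            {"Days": 90, "StorageClass": "GLACIER_IR"},
            {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
        ],
    }]
}

# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-app-backups", LifecycleConfiguration=lifecycle)
print(lifecycle["Rules"][0]["Transitions"][0]["StorageClass"])
```

Before enabling the Deep Archive transition, confirm nothing needs those objects quickly - retrieval takes hours, not milliseconds.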
Oversized RDS instances never revisited
$1,000–4,000/month
Why it happens
Databases are provisioned conservatively - a db.r5.2xlarge "just in case" - and never touched again for fear of downtime. Performance Insights goes unchecked. Read replicas sit at near-zero query load.
How to fix
Check RDS Performance Insights for CPU and memory utilisation over 14 days. A db.r5.2xlarge running at 6% CPU and 18% memory can be safely downgraded to a db.r6g.xlarge (Graviton) - which is both cheaper and faster. Migrate storage from gp2 to gp3 for a free 20% saving.
No cost visibility - no tags, no budgets, no alerts
Costs 20–40% more than necessary
Why it happens
Teams skip tagging in the early days and never retrofit it. Without Owner, Environment, and Service tags, there's no cost attribution by team or product. Nobody sets AWS Budgets alerts, so the first signal is a shocking invoice.
How to fix
Implement 3–5 mandatory tags: Owner, Environment, Service, CostCentre. Activate them as cost allocation tags in the Billing console. Set budget alerts at 80% actual and 100% forecasted. Enable AWS Cost Anomaly Detection - free, catches spikes within 24 hours.
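The 80%-actual / 100%-forecast alerts above, shaped for boto3's `budgets.create_budget` call. The limit amount and email address are placeholders:

```python
budget = {
    "BudgetName": "monthly-aws-spend",
    "BudgetLimit": {"Amount": "10000", "Unit": "USD"},
    "TimeUnit": "MONTHLY",
    "BudgetType": "COST",
}
notifications_with_subscribers = [
    {"Notification": {"NotificationType": "ACTUAL",
                      "ComparisonOperator": "GREATER_THAN",
                      "Threshold": 80.0, "ThresholdType": "PERCENTAGE"},
     "Subscribers": [{"SubscriptionType": "EMAIL",
                      "Address": "team@example.com"}]},
    {"Notification": {"NotificationType": "FORECASTED",
                      "ComparisonOperator": "GREATER_THAN",
                      "Threshold": 100.0, "ThresholdType": "PERCENTAGE"},
     "Subscribers": [{"SubscriptionType": "EMAIL",
                      "Address": "team@example.com"}]},
]
print([n["Notification"]["Threshold"] for n in notifications_with_subscribers])
```

The FORECASTED notification is the valuable one: it fires mid-month, while there is still time to react.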
CloudWatch logs growing forever
$300–2,000/month
Why it happens
CloudWatch log groups default to "never expire". Verbose DEBUG-level logging in production generates gigabytes daily at $0.50/GB ingestion. ALB access logs routed to CloudWatch cost 20× more than routing them to S3.
How to fix
Set log retention on every log group - 30 days for production, 7 days for non-production. Switch logging level from DEBUG to INFO. Route ALB and VPC Flow logs to S3, not CloudWatch. Move custom metrics to self-hosted Prometheus on a t3.medium ($30/month).
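The retention policy is simple enough to enforce in a loop. A sketch - the `env_for()` helper in the comment, mapping a log group to its environment via tags or a naming convention, is hypothetical and left to you:

```python
def retention_days(environment: str) -> int:
    # Policy from the text: 30 days for production, 7 for everything else.
    return 30 if environment == "production" else 7

# Applied account-wide with boto3:
# logs = boto3.client("logs")
# for group in logs.describe_log_groups()["logGroups"]:
#     logs.put_retention_policy(logGroupName=group["logGroupName"],
#                               retentionInDays=retention_days(env_for(group)))
print(retention_days("production"), retention_days("staging"))
```

Both 7 and 30 are values CloudWatch Logs accepts for `retentionInDays`, so the loop never needs to round.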
Treating cost optimisation as a one-time project
Savings erode within 3–6 months
Why it happens
A one-time cost sprint saves 30–40%, then costs drift back as new resources are provisioned without cost awareness. Reserved Instances expire silently and revert to on-demand. No FinOps process means no one catches the creep until the bill is already up 40%.
How to fix
Set up monthly cost reviews using Cost Explorer by team or service. Schedule automated alerts for any spend increase greater than 20% month-over-month. Track Savings Plans coverage - below 70% means you're leaving money on the table.
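The month-over-month check is one comparison, suitable for a scheduled job fed by Cost Explorer data. The 20% threshold comes from the text:

```python
def spend_spike(previous: float, current: float,
                threshold: float = 0.20) -> bool:
    """Flag a month-over-month increase above the threshold."""
    return previous > 0 and (current - previous) / previous > threshold

print(spend_spike(10_000, 12_500))  # +25% -> True
print(spend_spike(10_000, 10_800))  # +8%  -> False
```

Run it per team or per service tag rather than on the account total, so one team's growth can't mask another's waste.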
What to Fix First
Not all mistakes are equal. This is the order that maximises savings per hour of engineering time.
1. Right-size EC2 and RDS using Compute Optimizer data - 16% of total bill on average
2. Buy Compute Savings Plans for your rightsized baseline - additional 18% on steady-state compute
3. Add S3 and DynamoDB Gateway Endpoints - eliminates most NAT Gateway data processing charges
4. Schedule non-production environment shutdowns - 65% reduction on dev and staging costs
5. Implement tagging, budgets, and anomaly detection - prevents 20–40% drift within 6 months