Senior 3 min · March 09, 2026

AWS Egress $28,000 — GCP Global VPC Cuts Cost 40%

AWS inter-region egress $0.

Naren · Founder
Quick Answer
  • AWS (200+ services): broadest ecosystem, mature tools, complex pricing. EC2 spot 90% discount, S3 standard $0.023/GB.
  • Azure (Entra ID integration): best for Windows/.NET workloads, Hybrid Benefit saves Windows licensing costs. VNet peering, Blob Hot $0.018/GB.
  • GCP (GKE, global VPC): container-native, best data/AI tools, automatic sustained-use discounts (20-30% without commitment). Cloud Storage $0.020/GB.
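  • Performance: GCP's global VPC keeps inter-region traffic on Google's backbone, which this guide treats as avoiding the inter-region egress fee AWS charges ($0.08/GB in this example). At 2 PB/month (2,000,000 GB), that is a $160,000 monthly difference, with 30-50% lower latency.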
  • Production trap: choosing a provider without modelling egress costs. Inter-region transfer dominates bills. Always use CDN as first layer.
  • Biggest mistake: treating cloud providers as interchangeable. S3 bucket policies (AWS), Blob container ACLs (Azure), IAM roles (GCP) differ significantly — blind porting fails.
Plain-English First

Think of GCP, AWS, and Azure as the 'Big Three' utility companies for the digital age. AWS is like the established power giant with a tool for every niche; Azure is the massive corporate provider that integrates perfectly with the office equipment you already own; and GCP is the high-tech, specialised firm that offers the fastest, most advanced smart-grid technology. Understanding the differences helps you decide which 'grid' will power your application most efficiently.

Choosing a cloud provider is no longer just about virtual machines; it's about choosing an ecosystem. AWS, Azure, and GCP each offer a unique philosophy toward infrastructure, data, and developer experience. While they all provide the fundamental building blocks of modern computing—compute, storage, and networking—the way they implement identity, global networking, and managed services varies significantly.

In this guide, we'll break down the architectural nuances of the 'Big Three,' why they were designed with different priorities, and how to navigate their CLI tools to manage resources. By the end, you'll have the technical perspective needed to make an informed multi-cloud or single-cloud decision for your production workloads.

The most important insight that separates senior engineers from the rest? Egress pricing. AWS charges up to $0.09/GB for inter-region transfer in the examples used here. GCP's global VPC keeps that traffic on Google's backbone, which this guide counts as eliminating the charge. For a 2 PB/month workload (2,000,000 GB), even at $0.08/GB that is a $160,000 difference every month. Not a rounding error: a hiring decision.
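The arithmetic behind egress figures like these is easy to check by hand. The following is a minimal sketch, assuming decimal units (1 PB = 1,000,000 GB) and the per-GB rates quoted in this article rather than live prices; the function names `egress_cost` and `pb_to_gb` are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: monthly egress cost = volume (GB) x per-GB rate.
# Rates are the figures quoted in this article, not live prices.

# egress_cost GB RATE_PER_GB -> prints the cost in USD with two decimals
egress_cost() {
  awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# pb_to_gb PB -> prints decimal gigabytes (1 PB = 1,000,000 GB)
pb_to_gb() {
  awk -v pb="$1" 'BEGIN { printf "%d\n", pb * 1000000 }'
}

volume_gb=$(pb_to_gb 2)
echo "2 PB/month at \$0.08/GB: \$$(egress_cost "$volume_gb" 0.08)"
echo "2 PB/month at \$0.09/GB: \$$(egress_cost "$volume_gb" 0.09)"
```

Running the same two functions against your own traffic estimate, before picking regions, is the cheapest form of egress modelling.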

Core Philosophy and Market Position

Each cloud provider started from a different origin, and that history drives their current strengths and weaknesses.

AWS (Amazon, 2006): Grew out of the infrastructure Amazon built for its own retail operations. The philosophy is 'primitive-first': offer building blocks that can be composed any way you like. This leads to breadth over simplicity. AWS has over 200 services, from machine learning (SageMaker) to satellite ground stations (Ground Station). The downside is a steep learning curve and complex pricing. AWS holds the largest share of cloud infrastructure spend, roughly a third of the market.

Azure (Microsoft, 2010): Built to leverage Microsoft's enterprise footprint. The philosophy is 'hybrid-first': seamless integration with on-premises Active Directory (through Microsoft Entra ID, formerly Azure Active Directory), Windows Server, SQL Server, and Office 365. Ideal for organizations with existing Microsoft Enterprise Agreements (EAs). The Azure Hybrid Benefit can reduce Windows Server and SQL Server licensing costs by up to 80% compared to other clouds. Azure is the second-largest cloud provider and is dominant among the Fortune 500.

GCP (Google, 2011): Born from Google's internal infrastructure (Borg, Colossus, Spanner). The philosophy is 'data-first': leverage Google's expertise in AI/ML, big data, and container orchestration. Google created Kubernetes (K8s), drawing on its experience with Borg, and open-sourced it in 2014. The networking layer (global VPC) is a standout, keeping traffic on Google's private fiber backbone. GCP is the third-largest cloud provider but is growing fastest in data analytics and AI.

io/thecodeforge/cloud/MultiCloudCLI.sh (Bash)
# io.thecodeforge: Standardizing Resource Creation across CLIs

# AWS: Create an EC2 Instance (t3.micro is the modern burstable standard)
aws ec2 run-instances \
    --image-id ami-0abcdef1234567890 \
    --count 1 \
    --instance-type t3.micro \
    --key-name ForgeKeyPair \
    --security-group-ids sg-0858102434db6c694

# Azure: Create a VM with a focused Resource Group
az vm create \
    --resource-group ForgeProdRG \
    --name ForgeWorkerVM \
    --image Ubuntu2204 \
    --size Standard_B1s \
    --admin-username forgeadmin \
    --generate-ssh-keys

# GCP: Create a GCE Instance with high-performance networking
gcloud compute instances create forge-app-node \
    --project=thecodeforge-prod \
    --zone=us-central1-a \
    --machine-type=e2-micro \
    --network-interface=network-tier=PREMIUM,stack-type=IPV4_ONLY \
    --image-family=debian-11 \
    --image-project=debian-cloud
Output
Instances starting on AWS, Azure, and GCP...
Cloud Provider Origins Shape Their DNA
  • AWS: Primitive-first, build anything, at the cost of complexity.
  • Azure: Enterprise-first, hybrid-cloud, best for Windows/.NET shops.
  • GCP: Data-first, AI/ML leadership, best global network.
  • AWS has the most services (200+), GCP has the most advanced services (Spanner, BigQuery, GKE).
  • Azure's secret weapon: existing Microsoft enterprise agreements (discounts up to 80% for Windows/SQL).
Production Insight
AWS leads in market share (32% of cloud spend), but GCP is catching up in AI/ML (45% of ML workloads on GCP).
Azure dominates Fortune 500 (95% of Fortune 500 use Azure for some workloads, mostly identity and Windows apps).
Rule: If you're a startup building AI or containers, start with GCP. If you're Windows/.NET, start with Azure. If you need breadth and talent pool, start with AWS.
Key Takeaway
AWS = breadth (200+ services), Azure = enterprise (Entra ID integration), GCP = data/AI (GKE, BigQuery).
Choose based on team expertise, existing contracts, and workload type — not just price per hour.

Compute Comparison: EC2 vs Azure VM vs GCE — Spot Instances and Burstable Pricing

Each provider's compute service reflects its design goals. AWS EC2 offers the broadest selection of instance families, including FPGA (F1), GPU (P4), and Graviton ARM instances. Azure VMs integrate deeply with Windows licensing: Reserved Instances can be combined with Azure Hybrid Benefit to reduce Windows Server costs. GCE stands out with custom machine types (pick the exact vCPU/memory ratio), sustained-use discounts (applied automatically as monthly usage grows), and preemptible VMs at up to 90% discount.

Pricing models differ significantly:
  • AWS: On-demand, Reserved (1- or 3-year terms, up to 72% off), Spot (up to 90% off, 2-minute interruption notice), Savings Plans (flexible across instance families).
  • Azure: On-demand, Reserved (1- or 3-year terms), Spot VMs (up to 90% off, 30-second eviction notice), Hybrid Benefit (bring on-prem Windows/SQL licenses to the cloud).
  • GCP: On-demand, Committed Use Discounts (1- or 3-year terms, up to 70% off), Preemptible VMs (up to about 90% off, 30-second notice, 24-hour maximum runtime), Sustained Use (automatic 20-30% discount once an instance runs for more than 25% of the month).
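To see what these percentages mean on a bill, the sketch below applies the maximum discounts listed above to a single on-demand price. The $0.10/hour rate and the 730-hour month are assumptions chosen for round numbers, and real discounts depend on instance family, region, and term.

```bash
#!/usr/bin/env bash
# Sketch: effective price under each pricing model, using the
# "up to" discount percentages from the list above.

# discounted_rate ON_DEMAND_HOURLY DISCOUNT_PERCENT -> effective hourly rate
discounted_rate() {
  awk -v rate="$1" -v pct="$2" 'BEGIN { printf "%.4f\n", rate * (1 - pct / 100) }'
}

# monthly_cost HOURLY_RATE -> cost for a 730-hour month
monthly_cost() {
  awk -v rate="$1" 'BEGIN { printf "%.2f\n", rate * 730 }'
}

on_demand=0.10   # hypothetical on-demand price in USD/hour
for model in "on-demand:0" "aws-reserved:72" "gcp-committed-use:70" "spot:90"; do
  name=${model%%:*}   # text before the colon
  pct=${model##*:}    # text after the colon
  rate=$(discounted_rate "$on_demand" "$pct")
  echo "$name: \$$rate/hour, \$$(monthly_cost "$rate")/month"
done
```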

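Burstable performance: the AWS T-family (t3, t4g) and the Azure B-series both bank CPU credits while idle and spend them when bursting. GCP's shared-core E2 types (e2-micro, e2-small, e2-medium) expose no credit balance: each gets a fractional vCPU share and can burst above it only for short periods. T3 unlimited mode allows bursting beyond the credit balance at extra cost.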

For containerized workloads, GKE runs most efficiently thanks to its Borg lineage; AWS EKS and Azure AKS are close competitors but require more manual tuning for pod density. GKE Autopilot (serverless Kubernetes) removes node management entirely; the closest AWS equivalent is EKS on Fargate.

io/thecodeforge/cloud/ComputeComparison.tf (HCL)
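# io.thecodeforge: Compute resource definitions for comparison
# Assumes provider blocks, azurerm_resource_group.example, and
# azurerm_network_interface.example are defined elsewhere.

resource "aws_instance" "forge_app" {
  ami           = "ami-0abcdef1234567890" # placeholder AMI ID
  instance_type = "t3.large"
  # AWS has no "preemptible" flag; request Spot capacity instead,
  # for example with an instance_market_options block.
}

resource "azurerm_linux_virtual_machine" "forge_app" {
  name                  = "forge-app-vm"
  resource_group_name   = azurerm_resource_group.example.name
  location              = "East US"
  size                  = "Standard_DS2_v2"
  admin_username        = "forgeadmin"
  network_interface_ids = [azurerm_network_interface.example.id]

  # SSH key, OS disk, and image are required by this resource type.
  admin_ssh_key {
    username   = "forgeadmin"
    public_key = file("~/.ssh/id_rsa.pub")
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }
}

resource "google_compute_instance" "forge_app" {
  name                      = "forge-app-instance"
  machine_type              = "e2-standard-2"
  zone                      = "us-central1-a"
  allow_stopping_for_update = true

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  # At least one network interface is required.
  network_interface {
    network = "default"
  }
}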
Output
Three VMs of similar spec, each with provider-specific scaling options.
Compute Philosophy — The Rental Market Analogy
  • AWS: hundreds of instance types → pick the perfect one, or pay for generic.
  • Azure: Reserved Instances + Hybrid Benefit = Windows cost leader (up to 80% savings).
  • GCP: custom machine types + sustained use discounts = most flexible pricing for custom workloads.
  • Preemptible/Spot VMs: GCP's 90% discount best for fault-tolerant batch, but 30-sec eviction notice.
  • Kubernetes: GKE Autopilot eliminates node management; EKS and AKS require more operational overhead.
Production Insight
AWS Spot instances are up to 90% cheaper but can be reclaimed with a 2-minute interruption notice, so design for interruption.
Azure Spot VMs give only a 30-second eviction notice, so keep stateful workloads off them unless state is checkpointed externally.
GCP preemptible VMs offer a comparable discount with a 30-second notice and run for at most 24 hours.
Rule: use spot for stateless batch (CI/CD, data processing), reserved for persistent services, on-demand for spiky traffic.
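"Design for interruption" is easier to reason about with a concrete loop. The sketch below polls the EC2 instance metadata service (IMDSv2) for a Spot interruption notice; the `spot/instance-action` path returns 404 until an interruption is scheduled. The `drain_and_checkpoint` hook, the 5-second poll interval, and the `--watch` flag are illustrative assumptions, not part of any AWS tooling.

```bash
#!/usr/bin/env bash
# Sketch: watch for an EC2 Spot interruption notice via IMDSv2.

IMDS="http://169.254.169.254/latest"

# interruption_pending HTTP_STATUS -> succeeds when a notice is present.
interruption_pending() {
  [ "$1" = "200" ]
}

# Hypothetical application hook: stop taking work and flush state.
drain_and_checkpoint() {
  echo "interruption notice received: draining and checkpointing"
}

# Fetch a session token, then print the HTTP status of the
# instance-action endpoint (404 means no interruption is scheduled).
spot_action_status() {
  token=$(curl -s -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
  curl -s -o /dev/null -w "%{http_code}" \
    -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS/meta-data/spot/instance-action"
}

# Only start polling when invoked with --watch on an EC2 Spot instance.
if [ "${1:-}" = "--watch" ]; then
  while true; do
    if interruption_pending "$(spot_action_status)"; then
      drain_and_checkpoint
      break
    fi
    sleep 5
  done
fi
```

Because the notice window is short on every provider, the hook should only finish work that fits comfortably inside it; longer jobs need periodic checkpoints that do not wait for the notice.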
Key Takeaway
EC2: broadest family, Azure: Windows-friendly, GCE: custom & automatic discounts.
For Kubernetes, GKE leads in managed experience; EKS and AKS require more infra.
Choose compute based on workload pattern, not just price per hour — spot for batch, reserved for persistent, on-demand for spiky.
Choose Compute Type Based on Workload
If: Fault-tolerant batch processing (CI/CD, data pipelines, transcoding)
Use: Spot/preemptible VMs. AWS Spot (most capacity, 2-minute notice) > GCP preemptible (up to 90% discount, 24-hour maximum) > Azure Spot.
If: Windows/.NET workloads with existing licenses
Use: Azure with Hybrid Benefit, for up to 80% savings on Windows Server and SQL Server licensing.
If: Custom ML training (TensorFlow, PyTorch) with variable run time
Use: GCP preemptible VMs + TPU/GPU for the best price/performance on AI, but checkpoint frequently: the eviction notice is only 30 seconds.
If: Kubernetes workloads with variable scale
Use: GKE Autopilot (no node management, pay per pod) for simplicity, EKS with Fargate for AWS-integrated stacks, AKS for Windows containers.
If: Predictable 24/7 workloads (databases, web servers)
Use: Reserved Instances or Committed Use Discounts. AWS Savings Plans are the most flexible; GCP CUDs apply per project, Azure RIs per region.
Production incident · Post-mortem · Severity: high

The $28,000 Egress Shock That Sent the CFO to the ER

Symptom
Monthly AWS bill jumped from $2,500 to $28,000, almost all of it from Data Transfer out ($0.09/GB inter-region). The engineering dashboard showed high traffic to Europe, but cost anomaly detection hadn't been configured. The CFO received the AWS invoice email on a Friday evening; no budget alert had fired because the team had never set one up.
Assumption
The team assumed ingress/egress pricing was similar across providers and across regions. They didn't know that inter-region egress costs can dominate compute costs. They also assumed CloudFront would handle caching — but their dynamic API responses couldn't be cached, so every request still incurred inter-region transfer.
Root cause
The architectural decision to run the database in us-east-1 and the application servers in eu-west-1. Each user request to the European region required the app server to fetch data from the US database. At an average of 1,000 requests/second, this moved about 200 TB/month across regions. Egress cost: 200,000 GB × $0.09 = $18,000 just for DB-to-app transfer. Plus API responses to users: 80,000 GB × $0.08 = $6,400 extra. Total: $24,400 in egress and $3,600 in compute and storage, for a $28,000 bill. The team had optimised compute cost (using spot instances) but completely ignored data transfer. They also didn't realise that GCP's global VPC would have kept this traffic on Google's backbone, which this guide counts as eliminating the inter-region egress fee.
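The bill can be reconstructed line by line. The sketch below uses the post-mortem's rates and totals; note that the $6,400 API line at $0.08/GB corresponds to 80,000 GB, which is the volume assumed here. The helper names are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: rebuild the $28,000 bill from its three components.

# line_cost GB RATE_PER_GB -> USD with two decimals
line_cost() {
  awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

# sum_costs A B C ... -> prints the total of all arguments
sum_costs() {
  printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.2f\n", total }'
}

db_to_app=$(line_cost 200000 0.09)   # 200 TB of cross-region database reads
api_out=$(line_cost 80000 0.08)      # 80 TB of API responses (assumed volume)
compute_storage=3600.00              # compute and storage, from the post-mortem

echo "DB-to-app egress:  \$$db_to_app"
echo "API response egress: \$$api_out"
echo "Compute + storage: \$$compute_storage"
echo "Total:             \$$(sum_costs "$db_to_app" "$api_out" "$compute_storage")"
```

Egress is roughly 87% of the total here, which is why tuning compute alone could not have prevented the spike.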
Fix
1. Deployed CloudFront with regional edge caches in Europe; caching static assets reduced egress to 40% of the original volume.
2. For dynamic API responses, moved the database to a multi-region Aurora Global Database with read replicas in eu-west-1. Local reads eliminated cross-region transfer.
3. Added budget alerts at $5,000, $10,000, and $20,000 thresholds.
4. Switched inter-region transfer to use AWS Direct Connect with GCP Partner Interconnect, routing traffic through private peering to reduce egress costs (still not free, but negotiated to under $0.05/GB).
5. For future projects, evaluated GCP for global deployments: its global VPC eliminates cross-region egress at the network layer, not just for cached assets.
Key lesson
  • Egress pricing varies 3-5x between providers — GCP is cheapest for inter-region (global VPC), AWS most expensive.
  • Always model data transfer costs before selecting a primary region. Egress can exceed compute bill by 3x.
  • Use CDN (CloudFront/Azure CDN/Cloud CDN) as the first layer of egress control.
  • For dynamic traffic, use multi-region databases (Aurora Global, Spanner) to localise reads, not cross-region replication.
  • Set up budget alerts on day one. A $28,000 bill without warning is a career-limiting event.
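The three alert thresholds from the fix can be expressed as one AWS Budgets definition. The sketch below only generates the two JSON files that `aws budgets create-budget` accepts; the budget name, account ID, and email address are placeholders, and the $20,000 limit is an assumption that makes the thresholds come out as 25%, 50%, and 100%.

```bash
#!/usr/bin/env bash
# Sketch: AWS Budgets files with alerts at $5,000, $10,000 and $20,000.

# threshold_pct AMOUNT LIMIT -> alert threshold as a percentage of the limit
threshold_pct() {
  awk -v a="$1" -v l="$2" 'BEGIN { printf "%.0f\n", a / l * 100 }'
}

# notification_json PERCENT EMAIL -> one notification-with-subscribers entry
notification_json() {
  printf '{"Notification":{"NotificationType":"ACTUAL","ComparisonOperator":"GREATER_THAN","Threshold":%s,"ThresholdType":"PERCENTAGE"},"Subscribers":[{"SubscriptionType":"EMAIL","Address":"%s"}]}' "$1" "$2"
}

# write_budget_files DIR LIMIT EMAIL -> writes budget.json and notifications.json
write_budget_files() {
  dir=$1; limit=$2; email=$3
  printf '{"BudgetName":"forge-monthly-cost","BudgetLimit":{"Amount":"%s","Unit":"USD"},"TimeUnit":"MONTHLY","BudgetType":"COST"}\n' \
    "$limit" > "$dir/budget.json"
  {
    printf '['
    sep=""
    for amount in 5000 10000 20000; do
      printf '%s' "$sep"
      notification_json "$(threshold_pct "$amount" "$limit")" "$email"
      sep=","
    done
    printf ']\n'
  } > "$dir/notifications.json"
}

# Usage (requires AWS credentials; placeholder account ID):
#   write_budget_files . 20000 finops@example.com
#   aws budgets create-budget --account-id 111122223333 \
#     --budget file://budget.json \
#     --notifications-with-subscribers file://notifications.json
```

Azure Cost Management and GCP Billing Budgets offer equivalent threshold alerts; the point is that the definition lives in a file that can be reviewed and versioned.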
Production debug guide: symptom → action guide for common cloud provider issues
Symptom · 01
Application inside AWS needs to read data from GCP Cloud Storage — latency > 1 second
Fix
Check if using AWS Direct Connect + GCP Partner Interconnect. Without direct peering, traffic goes over public internet → 100-300ms add. Use GCP's storage transfer service or replicate to S3.
Symptom · 02
Azure VM can't resolve the hostname of an AWS EC2 instance (no cross-cloud connectivity exists)
Fix
Azure VNets and AWS VPCs cannot be peered directly. Use Azure VPN Gateway with an AWS Site-to-Site VPN connection, or a third-party transit network. Private hostnames will not resolve across clouds without DNS forwarding, so verify public DNS resolution first.
Symptom · 03
Billing alert triggered — spend 3x normal on a single day
Fix
Check for DDoS, egress spikes, misconfigured load balancers (AWS NLB with cross-zone disabled → unnatural traffic patterns). Use AWS Cost Anomaly Detection, Azure Cost Management, GCP Billing Budgets.
Symptom · 04
IAM role assumed in AWS fails to access GCP resource
Fix
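Cloud providers have incompatible identity systems. Use federation: GCP Workload Identity Federation lets an AWS IAM role exchange its credentials for short-lived GCP tokens (Workforce Identity Federation is the equivalent for human users). Alternatively, use a GCP service account with delegated access.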
Symptom · 05
Kubernetes cluster in GKE costs 30% more than EKS for same workload
Fix
Check the node pool machine type first: pools still on older-generation n1-standard machines can move to e2-standard (about 20% cheaper) or c3 (compute-optimised). Also check whether the cluster is regional rather than zonal: a regional cluster spreads the control plane and, by default, the node pool across three zones, which can triple the node count.
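For Symptom 03, "3x normal" is worth encoding as a check rather than judging by eye. The sketch below compares one day's spend against the average of recent days; the spend figures are made-up sample values, and in practice they would come from a billing export or the cost tools named above.

```bash
#!/usr/bin/env bash
# Sketch: flag a day whose spend is at least FACTOR x the recent average.

# daily_average V1 V2 ... -> mean of the values, two decimals
daily_average() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# is_spend_spike TODAY BASELINE [FACTOR] -> succeeds when TODAY >= FACTOR x BASELINE
is_spend_spike() {
  awk -v today="$1" -v base="$2" -v factor="${3:-3}" \
    'BEGIN { exit !(today >= base * factor) }'
}

baseline=$(daily_average 80 85 90 85)   # hypothetical daily spend in USD
today=270                               # hypothetical spend for today

if is_spend_spike "$today" "$baseline"; then
  echo "spike: \$$today vs baseline \$$baseline, check egress and load balancers"
else
  echo "spend within normal range"
fi
```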
Cloud Provider Cost & Performance Debug: fast diagnostics for cost spikes and performance issues across AWS, Azure, and GCP.
AWS bill skyrocketed — check egress first
Immediate action
Check Data Transfer out costs in Cost Explorer
Commands
# Note: the End date is exclusive, so End=2026-05-01 covers all of April.
aws ce get-cost-and-usage --time-period Start=2026-04-01,End=2026-05-01 --granularity MONTHLY --metrics "UnblendedCost" --filter "{\"Dimensions\":{\"Key\":\"SERVICE\",\"Values\":[\"AWS Data Transfer\"]}}"
aws ce get-cost-and-usage --time-period Start=2026-04-01,End=2026-05-01 --granularity DAILY --metrics "UnblendedCost" --group-by Type=DIMENSION,Key=REGION
Fix now
Deploy CloudFront, use multi-region database replicas, set up budget alerts for egress > $1000.
Azure cost spike: check inter-region VNet peering
Immediate action
Check VNet peering data transfer costs
Commands
az consumption usage list --query "[?contains(instanceName, 'VNet')]" | jq '.[] | {usageName, pretaxCost}'
az network vnet peering list --resource-group ForgeProdRG --vnet-name ForgeVNet
Fix now
For cross-region traffic, use Azure Front Door or CDN. Reduce unnecessary peering. Move resources to same region.
GCP bill higher than expected: check sustained use discounts
Immediate action
Verify if sustained use discounts applied automatically
Commands
gcloud compute instances list --format='table(name,zone,machineType,status)'
gcloud beta billing accounts get-iam-policy ACCOUNT_ID
Fix now
GCP automatically applies sustained use discounts (20-30%) for long-running instances. If not seeing them, check if your instance is preemptible or if you're using committed use discounts incorrectly.
Multi-region latency > 100ms: check global VPC vs peering
Immediate action
Identify if traffic is going over public internet or private backbone
Commands
traceroute -n ec2.us-east-1.amazonaws.com
mtr --report --report-cycles 10 gcp-eu-west1.googleapis.com
Fix now
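GCP's global VPC keeps traffic on Google's private backbone (lower latency). AWS inter-region VPC peering also stays on the AWS backbone; if the trace shows public internet hops, the traffic is leaving through public endpoints instead of a peering connection or Direct Connect.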
AWS vs Azure vs GCP — Feature Comparison
Feature | AWS (Amazon) | Azure (Microsoft) | GCP (Google)
Market Position | Pioneer & Market Leader (largest ecosystem, 32% market share) | Enterprise Staple (hybrid cloud, 22% market share) | Data & Innovation Leader (cloud native, 11% market share)
Primary Compute | EC2 (Elastic Compute Cloud), 500+ instance types | Azure Virtual Machines, 300+ instance types | Compute Engine (GCE), custom machine types supported
Burstable Compute | T3/T4g (CPU credits, unlimited mode available) | B-series (credits, no unlimited mode) | Shared-core E2 (e2-micro/small/medium): no credit balance, short bursts only
Kubernetes Service | EKS: $0.10/cluster/hour + worker nodes | AKS: free control plane, pay only for nodes | GKE: $0.10/cluster/hour (zonal) or $0.30 (regional) + nodes
Object Storage | S3: $0.023/GB standard, 11 nines durability | Blob Storage: $0.018/GB hot tier | Cloud Storage: $0.020/GB standard
Object Storage Egress | $0.09/GB (inter-region), $0.09/GB to internet (first 10 GB free) | $0.07/GB (inter-region), $0.087/GB to internet | $0.08/GB (inter-region), $0.12/GB to internet; global VPC eliminates inter-region fees
Global Networking | Region/AZ-based: VPC per region, peering + Transit Gateway | VNet per region: VNet peering, Global VNet Peering ($) | Global VPC: single VPC spans all regions, traffic on Google backbone (no egress)
Serverless Compute | Lambda: 1M free requests/month, $0.20/1M thereafter | Functions: 1M free requests/month, $0.20/1M thereafter | Cloud Functions: 2M free requests/month, $0.40/1M thereafter
Managed Database | RDS (Aurora, PostgreSQL, MySQL), Aurora Serverless v2 | SQL Database (MSSQL, PostgreSQL, MySQL), Hyperscale tier | Cloud SQL (PostgreSQL, MySQL, SQL Server) + Spanner (global consistency)
Best for | Variety, talent pool, specialized services | Windows/.NET, enterprise compliance, hybrid cloud | Data analytics, AI/ML, Kubernetes, global apps

Key takeaways

1. AWS is the most mature platform, ideal for teams needing the widest variety of specialised tools and a massive talent pool.
2. Azure is the strategic choice for organisations with existing Microsoft Enterprise Agreements and deep Entra ID (formerly Azure Active Directory) integration.
3. GCP offers the most advanced Kubernetes experience (GKE) and a superior global network, often delivering better price-to-performance for data analytics and AI workloads.
4. Multi-cloud isn't just a buzzword: it requires Infrastructure as Code (IaC) to manage the operational complexity of diverse providers reliably.
5. Always optimise for 'Managed Services' (PaaS) over 'Virtual Machines' (IaaS) to reduce the operational burden of patching and scaling.
6. The secret to cloud cost control: model egress costs before compute costs. At $0.08/GB, GCP's global VPC can save $160k per month on 2 PB/month of inter-region traffic.

Common mistakes to avoid


Not modelling egress costs before choosing a region

Symptom
AWS monthly bill jumps from $2,500 to $28,000, driven by Data Transfer out ($0.09/GB inter-region). The CFO gets the invoice on a Friday evening. No budget alert fired because the team had never set one up.
Fix
Always calculate data transfer costs before deployment. Use provider calculators. GCP global VPC eliminates inter-region egress. For AWS, place compute and storage in same region, use CloudFront for caching, and set budget alerts.

Not utilising the 'Free Tier' correctly — leaving resources running

Symptom
AWS free tier expires after 12 months, but a t2.micro left running continues accruing charges at standard rate ($8-15/month), causing unexpected monthly bills. Azure and GCP have similar traps (e2-micro free, but only one per region).
Fix
Set up a budget alert on each provider on day one. For AWS, monitor free tier usage in Cost Explorer. For GCP, use 'Always Free' products like Cloud Functions and Cloud Storage but limit request count. Terminate test instances, don't just stop them.

Treating S3-compatible APIs as identical across providers

Symptom
An app written with S3 pre-signed URLs works fine on AWS, but the logic breaks when it is migrated to Azure or GCP: Blob Storage uses an account-plus-container namespace (different URL patterns), and GCS uses a flat bucket namespace similar to S3 but a different signing mechanism.
Fix
Use abstraction libraries like the MinIO SDK (S3-compatible) or provider-agnostic SDKs (Apache jclouds). For multi-cloud object storage, an S3-compatible translation layer such as a MinIO gateway has been a common choice, but gateway mode has been removed from current MinIO releases, so prefer SDK-level abstraction for new work. Never hardcode provider-specific URL structures.
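The namespace difference is visible in the plain (unsigned) object URLs alone. The sketch below builds the standard public endpoint form for each provider behind one dispatcher; the bucket, account, and object names are hypothetical, and signing is deliberately left out because each provider does it differently.

```bash
#!/usr/bin/env bash
# Sketch: one entry point, three provider-specific URL layouts.

# s3_url BUCKET REGION KEY (virtual-hosted style)
s3_url() {
  printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$1" "$2" "$3"
}

# azure_blob_url ACCOUNT CONTAINER BLOB
azure_blob_url() {
  printf 'https://%s.blob.core.windows.net/%s/%s\n' "$1" "$2" "$3"
}

# gcs_url BUCKET OBJECT
gcs_url() {
  printf 'https://storage.googleapis.com/%s/%s\n' "$1" "$2"
}

# object_url PROVIDER ARGS... -> dispatches to the provider-specific builder
object_url() {
  provider=$1
  shift
  case "$provider" in
    aws)   s3_url "$@" ;;
    azure) azure_blob_url "$@" ;;
    gcp)   gcs_url "$@" ;;
    *)     echo "unknown provider: $provider" >&2; return 1 ;;
  esac
}

object_url aws forge-assets eu-west-1 img/logo.png
object_url azure forgestorage assets img/logo.png
object_url gcp forge-assets img/logo.png
```

Note that the argument lists differ: AWS needs a region, Azure needs a storage account and a container, and GCS needs only a bucket. Application code that concatenates URLs by hand bakes one of these shapes in.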

Assuming IAM roles, policies, and service accounts are interchangeable

Symptom
An AWS IAM role assumed in one account fails to access a GCP resource. AWS uses resource-based policies (S3 bucket policies) and identity-based policies (user/group/role). GCP uses hierarchical permissions (Organization → Folder → Project → Resource) with service accounts. Azure uses RBAC (role-based access control) with Entra ID.
Fix
Use federation for cross-provider access. For workloads, GCP Workload Identity Federation accepts AWS IAM roles and Azure identities directly; for human users, federate through Workforce Identity Federation or Entra ID. Avoid creating separate long-lived identities per provider wherever federation is possible.

Manual resource management via 'ClickOps' — no Infrastructure as Code

Symptom
After a disaster recovery drill, engineers cannot reproduce the exact infrastructure because resources were created through the console without version control, leading to configuration drift. A team spent 3 weeks rebuilding a production environment after an outage because their 'critical' configuration was only in the console, not in code.
Fix
Use Infrastructure as Code (Terraform, Pulumi, AWS CDK, Azure Bicep, Google Deployment Manager) to manage all resources. Store state files securely in a shared backend (S3, Azure Storage, GCS) with version control. Never click in the console for production resources.
INTERVIEW PREP · PRACTICE MODE

Interview Questions on This Topic

Q01 · Senior: Google Spanner vs AWS Aurora Global Database: When would you choose one ...
Q02 · Senior: Explain the difference in IAM philosophy: AWS Resource-Based Policies vs...
Q03 · Senior: Scenario: A client is heavily invested in Active Directory (on-premises)...
Q04 · Senior: Compare the networking models: What is the technical advantage of GCP's ...
Q05 · Senior: How do 'Preemptible VMs' (GCP) or 'Spot Instances' (AWS/Azure) work, and...
Q06 · Senior: What is 'Egress' and how do you architect a system to minimise data tran...
Q01 of 06 · Senior

Google Spanner vs AWS Aurora Global Database: When would you choose one over the other for a global financial application needing strong consistency?

ANSWER
Spanner provides strong global consistency with ACID transactions across continents, using TrueTime and synchronised clocks. It's the only cloud database offering external consistency (linearizability) globally. Aurora Global Database offers read replicas across regions with ~1 second replication lag, but writes are only at the primary region. For a financial application where a user in Asia and a user in Europe might update the same account concurrently, Spanner is necessary to avoid conflicts. For workloads where reads can be slightly stale and writes are localised to a single region, Aurora Global with read replicas is more cost-effective. Spanner example: credit card transaction processing globally. Aurora Global example: social media feed (eventual consistency acceptable).
FAQ · 6 QUESTIONS

Frequently Asked Questions

01. Which cloud provider is cheapest for general compute workloads?
02. Can I use multiple cloud providers together?
03. Which cloud provider has the best developer experience?
04. How do I choose between AWS, Azure, and GCP for my startup?
05. How do I handle identity and access across multiple cloud providers?
06. What is the exit cost: how hard is it to leave a cloud provider once you're locked in?