AWS (Amazon Web Services) charges its users for three primary components across its services: Compute, Bandwidth and Storage. In this article we will first understand what Data Transfer comprises and then look at a few tried and tested ways of reducing your Data Transfer costs on AWS.
AWS transfer costs can add up very quickly, so it is important to understand how these costs work and the basic ways in which they can be optimized.
How AWS looks at Data Transfer and associated Costs
Before we start reducing our usage bill, we need to know how AWS looks at Data Transfer. There are two pieces to this puzzle. First, there is data transfer that happens within AWS, i.e. within the same data centre (or, let's say, the same AWS region); this could be for accessing your database on another server or accessing a service like Kinesis. Second, there is the bandwidth consumed between AWS and the public internet.
When it comes to networking, we describe the direction of traffic using terms like ingress, midgress and egress.
- ingress traffic refers to the data that makes its way into AWS from the public internet. This is usually the data that is not a result of a request made from within the VPC.
- midgress traffic here refers to traffic that stays within AWS, for example a request made from your EC2 instance to Amazon S3. Within AWS, there is the concept of a Region and, within each region, the concept of Availability Zones. So while theoretically you can say that all traffic within AWS is midgress, pricing rules take into account whether the traffic crosses Regions or Availability Zones.
- egress traffic refers to the traffic making its way out of the AWS region to the public internet. This can be because of a script running on an EC2 instance that pings some public server, or it could be traffic going out of a VPC (in which case it is egress for the VPC).
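The three terms above can be sketched as a small classifier. This is only an illustration of the definitions in this article: the `Endpoint` class, the `classify` function and the label strings are hypothetical names made up for the example, not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of one end of a connection."""
    in_aws: bool                  # False means "somewhere on the public internet"
    region: Optional[str] = None  # e.g. "us-east-1"; None outside AWS
    az: Optional[str] = None      # e.g. "us-east-1a"; None for regional services


def classify(src: Endpoint, dst: Endpoint) -> str:
    """Label traffic flowing from src to dst using the article's terms."""
    if not src.in_aws and not dst.in_aws:
        return "outside AWS"
    if not src.in_aws:
        return "ingress"
    if not dst.in_aws:
        return "egress"
    # Both ends are inside AWS, so this is midgress; pricing depends on
    # whether the traffic crosses a Region or Availability Zone boundary.
    if src.region != dst.region:
        return "midgress (inter-region)"
    if src.az is None or dst.az is None:
        return "midgress (same region)"
    if src.az != dst.az:
        return "midgress (inter-AZ)"
    return "midgress (same AZ)"


web = Endpoint(in_aws=True, region="us-east-1", az="us-east-1a")
db = Endpoint(in_aws=True, region="us-east-1", az="us-east-1b")
user = Endpoint(in_aws=False)

print(classify(user, web))  # ingress
print(classify(web, db))    # midgress (inter-AZ)
print(classify(web, user))  # egress
```

Keeping the region and Availability Zone on each endpoint makes it easy to see which pricing rule is likely to apply to a given connection.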
For certain AWS services, data movement cost is part of the service cost itself, instead of being a separate line item.
Data Transfer within AWS
This is the midgress use-case we discussed earlier, where data travels between AWS services across regions, across availability zones within the same region, or within a single availability zone.
- At the time of writing this article, data transferred between services such as EC2, RDS, Redshift, ElastiCache, Elastic Network Interfaces or VPC Peering connections across Availability Zones in the same AWS Region is charged at $0.01/GB in each direction.
- If you make use of AWS services like EC2, S3, Glacier, DynamoDB, SES, SQS, Kinesis, ECR, SNS or SimpleDB, note that data transferred between these services within the same region is free.
- Also note that within the same availability zone (roughly, the same data centre or a tightly connected group of data centres), most AWS services charge no fee for data transfer.
- Data Transfer from one region to another is classified as InterRegion Inbound (data transfer in) and InterRegion Outbound on your AWS monthly bill. AWS charges inter-region data transfer at the data transfer rates of the source region.
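As a rough illustration of the rules above, here is a minimal estimator. It assumes that "in each direction" means both the GB a resource sends and the GB it receives across Availability Zones are billed at the quoted $0.01/GB. The function names are hypothetical, and the inter-region rate is deliberately left as a parameter because it depends on the source region; the value used in the example is a placeholder, not an actual AWS price.

```python
CROSS_AZ_RATE = 0.01  # USD per GB, per direction (figure quoted in this article)


def cross_az_cost(gb_out: float, gb_in: float,
                  rate_per_gb: float = CROSS_AZ_RATE) -> float:
    """Cost for one resource talking to another Availability Zone.

    Assumption: outbound and inbound GB are each billed at rate_per_gb.
    """
    return (gb_out + gb_in) * rate_per_gb


def inter_region_cost(gb_out: float, source_region_rate: float) -> float:
    """Inter-region cost, billed at the source region's rate."""
    return gb_out * source_region_rate


# An app server that sends 500 GB to, and receives 200 GB from, a database
# in another Availability Zone of the same region.
print(f"cross-AZ: ${cross_az_cost(500, 200):.2f}")

# 0.02 is a placeholder rate for illustration only; look up the real
# figure for your source region.
print(f"inter-region: ${inter_region_cost(500, source_region_rate=0.02):.2f}")
```

Running numbers like these for each hop in your architecture quickly shows which connections are worth moving into the same Availability Zone.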
Data Transfer between AWS and Public Internet
If you use any service which transfers data out to the public internet from an AWS region, you are generally billed based on the pricing for that specific region. Data transfer rates are tiered, so the rate you pay depends on the usage bucket your monthly volume falls into.
The image below shows the pricing you will incur for sending data out of EC2 to the internet. All ingress is free. Refer to the latest AWS data transfer pricing here.
AWS Free Tier gives you 15GB of free Data Transfer Out each month for one year. After the first year, you get 1GB Data Transfer Out to internet free per month per region.
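To make the tiering concrete, here is a small sketch of how a tiered bill with a free allowance could be computed. The tier sizes and rates in `EXAMPLE_TIERS` are placeholders invented for the example, not actual AWS prices; substitute the figures for your region from the current pricing page. The function name and the default 1 GB allowance (the post-first-year figure mentioned above) are likewise assumptions of this sketch.

```python
from typing import List, Optional, Tuple

# (tier size in GB, USD per GB). A size of None marks the unbounded last tier.
# Placeholder numbers for illustration only, NOT actual AWS rates.
EXAMPLE_TIERS: List[Tuple[Optional[float], float]] = [
    (10_000, 0.09),
    (40_000, 0.085),
    (None, 0.07),
]


def egress_cost(total_gb: float,
                tiers: List[Tuple[Optional[float], float]] = EXAMPLE_TIERS,
                free_gb: float = 1.0) -> float:
    """Estimate a monthly data-transfer-out bill under tiered pricing."""
    remaining = max(total_gb - free_gb, 0.0)  # free allowance comes off first
    cost = 0.0
    for size, rate in tiers:
        if remaining <= 0:
            break
        billed = remaining if size is None else min(remaining, size)
        cost += billed * rate
        remaining -= billed
    if remaining > 0:
        raise ValueError("usage exceeds the tiers provided")
    return cost


for gb in (1, 101, 15_001):
    print(f"{gb:>6} GB out -> ${egress_cost(gb):,.2f}")
```

Because each tier only applies to the GB that fall inside it, the average price per GB drops gradually as volume grows rather than all at once.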
Tips for Saving AWS Data Transfer Costs
- Plan your architecture such that you factor in the routes the data will travel. Data transfer costs within AWS are highest for inter-region connectivity, followed by traffic between Availability Zones in the same region, and lowest within the same Availability Zone.
- Back in 2015, we had a high-traffic application hosted on one EC2 instance that called another EC2 server (within the same availability zone) over its public internet address. This added to both latency and data transfer cost for us. We were able to get rid of both by using the private IP address. The lesson here is to use private IP addresses whenever your application architecture allows. This is cheaper than using either a public or an Elastic IP address.
- We have been able to come up with interesting solutions by complementing our data transfer modeling with the AWS Calculator as well. One interesting solution we found was using CloudFront with Amazon EC2. If you transfer a lot of data to your end users (images, videos, large files etc.), do consider using CloudFront. Transfers from EC2 to CloudFront, i.e. from the origin to edge locations (these are called “origin fetches”), are free of charge. You will, however, pay object delivery charges based on the regions selected for CloudFront.
- Look at your WAN configurations and Multi-AZ setups; there are often leakages that can be plugged here.
- If you are using a VPN, you might want to consider AWS Direct Connect for connectivity between AWS and your own corporate data centre. This can have a significant cost benefit, as you are only charged once: for data that egresses from the AWS region to your Direct Connect location, at the region to Direct Connect location rate.
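The private IP tip above is easy to check automatically. The sketch below uses Python's standard `ipaddress` module to flag configured endpoints that point at a public address. The `find_public_ip_endpoints` helper, the endpoint names and the addresses are all made up for the example, and DNS names are skipped rather than resolved.

```python
import ipaddress
from typing import Dict, List


def find_public_ip_endpoints(endpoints: Dict[str, str]) -> List[str]:
    """Return the names of endpoints configured with a public IP address."""
    flagged = []
    for name, address in endpoints.items():
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # not an IP literal (e.g. a DNS name); skipped here
        # is_private is also True for loopback and link-local addresses,
        # which is fine for this check because none of them are public.
        if not ip.is_private:
            flagged.append(name)
    return flagged


# Hypothetical service configuration.
config = {
    "database": "10.0.1.25",        # private VPC address
    "cache": "54.12.34.56",         # public address: candidate for a fix
    "search": "search.internal",    # hostname, not checked by this sketch
}

print(find_public_ip_endpoints(config))  # ['cache']
```

A check like this could run as part of a deployment review, so that a server-to-server call over a public address is caught before it shows up on the bill.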