One of the more fascinating areas I encounter in client work is how organizations approach managing costs when using Amazon Web Services (AWS). I’ve worked with clients across the entire spectrum, ranging from what appears to be unlimited spending down to controlling expenses with an iron fist. While many clients understand the concept of “paying only for what you use” with AWS, an area that is often overlooked is how to properly optimize costs within AWS.
Over the past year, I’ve received many inquiries from clients about how they can best optimize and effectively manage their cloud expenditures. Clients have noticed that their monthly bill continues to grow due to the ease of launching cloud resources for Proofs-of-Concept (POCs), the freedom given to end users to try new services, and a general increase in production workloads running on AWS within the organization. Suddenly, one of the primary reasons for moving to the cloud, saving money, no longer holds.
The first solution I see clients reach for is a knee-jerk reaction: simply shut things down and put more stringent controls on who can launch resources so that the bill drops immediately the next month. While this does save money in the short term, it is not how cost optimization should be handled within AWS. Without proper cost-optimization practices, clients end up right back in the same spot with their expenditures a few months down the road.
AWS cost optimization is about effectively utilizing your cloud spending to deliver scalable, resilient, and modernized applications that set up your organization for long-term success with cloud expenses. It is not only about reducing costs, but about ensuring that cloud solutions are being used efficiently. This area typically does not get the attention it deserves from the organizations that see their AWS bill increasing month after month.
Cost optimization is so important at AWS that an entire pillar of the AWS Well-Architected Framework is devoted to it. AWS provides a plethora of free tools to help organizations optimize their costs within the platform, as well as purchasing options that lower costs in exchange for a commitment period. The challenge, however, is knowing which tools to use and how to identify the areas that can be optimized across your AWS account(s).
AWS breaks down cost optimization into the following five principles:
- Practice Cloud Financial Management
- Expenditure and usage awareness
- Cost-effective resources
- Manage demand and supply resources
- Optimize over time
These five principles are the holy grail of AWS cost optimization and the key to long-term success. You can read more about them in AWS Cost Optimization Design Principles.
However, most clients who come to Unicon for help need to address both short-term and long-term success. I break the work down into similar concepts using what I like to call the FUMI methodology:
- Find what is running in the account and what the resources currently cost
- Understand what cost optimization options are available for the findings and apply them
- Monitor the cloud spend and implement a continual financial management process
- Identify modernization options and plan for implementation
Step 1: Find what is running in the account and what the resources currently cost
The first step is to find out what is running in the account and what is actually incurring costs. As with any financial analysis, you need to know where your money is going before you can do anything about cost savings. This step is typically a combination of using AWS financial management tools and interacting with stakeholders of the cloud workloads.
For example, by using AWS Cost Explorer, you can gather information about the services incurring the largest costs across the account.
This report can be adjusted to cover a variety of time periods to view trends as well. A review of the AWS bill for any previous month also provides a quick way to see which services are running and their overall costs. Another example is using S3 Storage Lens to identify the S3 buckets that are consuming the most storage and, consequently, likely incurring the most costs.
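If you prefer to pull these numbers programmatically rather than through the console, here is a minimal sketch using boto3, the AWS SDK for Python, that groups one month’s unblended costs by service. The date range and output formatting are illustrative assumptions, not a prescribed workflow:

```python
import boto3

# The Cost Explorer API endpoint lives in us-east-1.
ce = boto3.client("ce", region_name="us-east-1")

# Group one month's unblended costs by service to see where money is going.
# The date range below is illustrative; adjust it for your billing period.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-01-01", "End": "2023-02-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

# Print each service and its cost, largest first.
for group in sorted(
    response["ResultsByTime"][0]["Groups"],
    key=lambda g: float(g["Metrics"]["UnblendedCost"]["Amount"]),
    reverse=True,
):
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    print(f"{group['Keys'][0]}: ${amount:,.2f}")
```

A script like this is handy for producing the per-service spreadsheet described at the end of this step.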
Once you have this information, you can begin to work with the stakeholders of an application to ask for more information that will be key to determining what you can actually do to optimize costs. For example, questions that are asked in this step are as follows:
- Do these cloud workloads need to be available 24x7x365? What are the non-production environment availability needs?
- Do you need to keep all of the data in an S3 bucket in the standard storage tier? How are these objects used?
- How does this application use AWS services?
The third question above is one that I always ask, as it opens the door for a discussion on whether or not the application is truly using the most efficient cloud architecture. This becomes a key knowledge area for later discussions on long-term cost optimization.
The output of this step is typically a spreadsheet or report detailing, by service, what is running and what it costs per month. Taking this data, you can then find what cost-saving options are available for the resources identified. In practice, the client ends up not only discovering workloads that do not need to be running, but also learning more about how cloud resources are actually being used in their workloads.
Step 2: Understand what AWS cost optimization options are available for the findings and apply them
Once you know what is running and have discovered your most expensive resources, you can begin to look into what cost-saving measures can be implemented. This includes taking advantage of pricing discounts AWS offers in exchange for a usage commitment (e.g., Savings Plans), running resources only when needed, and perhaps even migrating to a lower-cost resource option.
Within the AWS cost optimization toolset, AWS provides Savings Plans recommendations to save you money based on actual resource utilization. If you are not familiar with Savings Plans, they are a way to pay a lower price for EC2, Fargate, Lambda, and SageMaker usage in exchange for committing to a consistent amount of usage over a one- or three-year term. AWS charges a lower rate for the committed usage; with an all-upfront payment option, the effective hourly rate on your bill can even show as $0.00/hr. You can save up to 72% using these Savings Plans. An example of retrieving a Savings Plans recommendation is shown below.
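Here is a minimal boto3 sketch for pulling these recommendations from the Cost Explorer API. The Savings Plans type, term, payment option, and lookback period shown are illustrative choices, not the only ones available:

```python
import boto3

ce = boto3.client("ce", region_name="us-east-1")

# Ask for Compute Savings Plans recommendations based on the last 30 days
# of usage. A one-year, no-upfront plan is an illustrative assumption.
response = ce.get_savings_plans_purchase_recommendation(
    SavingsPlansType="COMPUTE_SP",
    TermInYears="ONE_YEAR",
    PaymentOption="NO_UPFRONT",
    LookbackPeriodInDays="THIRTY_DAYS",
)

summary = response.get("SavingsPlansPurchaseRecommendation", {}).get(
    "SavingsPlansPurchaseRecommendationSummary", {}
)
print("Estimated monthly savings:", summary.get("EstimatedMonthlySavingsAmount"))
print("Hourly commitment to purchase:", summary.get("HourlyCommitmentToPurchase"))
```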
AWS offers similar savings and recommendations through reservations for RDS, ElastiCache, Redshift, and OpenSearch. These Savings Plans and reservations are most helpful for workloads that always need to be available.
Another cost optimization option is to use S3 Lifecycle policies to automatically move objects to lower-cost S3 storage classes on a schedule that you define. Clients often have terabytes of data in S3 that simply sits there and is accessed only a few times a month, so lifecycle policies are often helpful for gaining ‘quick wins’ on S3 storage costs. One area to pay particular attention to is buckets used for multipart uploads: I have seen workloads accumulate terabytes of multipart uploads that never complete and sit there consuming storage. A lifecycle policy can automatically remove these failed uploads to avoid unnecessary costs.
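As a rough illustration of both ideas, here is a boto3 sketch that transitions aging objects to cheaper storage classes and aborts incomplete multipart uploads. The bucket name, day counts, and storage classes are assumptions to adapt to your own access patterns:

```python
import boto3

s3 = boto3.client("s3")

# Hypothetical bucket name for illustration.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-data-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                # Move rarely accessed objects to cheaper storage over time.
                "ID": "tier-down-old-objects",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # apply to the whole bucket
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
            },
            {
                # Clean up failed multipart uploads that silently consume storage.
                "ID": "abort-incomplete-multipart-uploads",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            },
        ]
    },
)
```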
Shutting down resources that are not needed 24x7x365, or even moving from a provisioned Aurora cluster to an Aurora Serverless option, can also contribute to cost optimization. This step is also when stakeholders typically decide whether to terminate resources that are no longer needed.
“Right sizing” also takes place in this step. In the same AWS cost optimization toolset, AWS provides recommendations on which resources could be modified to use lower-cost instance types.
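These rightsizing recommendations can also be pulled from the Cost Explorer API. A minimal sketch, assuming EC2 is the service of interest:

```python
import boto3

ce = boto3.client("ce", region_name="us-east-1")

# Fetch rightsizing recommendations for EC2 instances.
response = ce.get_rightsizing_recommendation(Service="AmazonEC2")

for rec in response.get("RightsizingRecommendations", []):
    resource_id = rec.get("CurrentInstance", {}).get("ResourceId")
    # RightsizingType is either MODIFY (change instance type) or TERMINATE.
    print(resource_id, rec.get("RightsizingType"))
```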
It is important to realize that some of the cost optimization options discovered in this step may be longer-term efforts. While shutting down resources has an immediate effect, moving an EC2 workload to a containerized solution must be done over time. For these longer-term approaches, it is necessary to weigh the cost of moving to the lower-cost solution against the cost of keeping things as they are.
Step 3: Monitor the cloud spend and implement a continual financial management process
Once you implement your cost optimization plans, you need a way to know that your cloud spend is continuing to be efficient and providing value without extraneous expenditures. This is where you work to prevent your cloud spend from getting out of control again.
The first thing I tell clients to do in this step is to make use of AWS Budgets and set up alerts for when monthly spending approaches a limit, so that immediate action can be taken to understand why costs are increasing. It’s best to know early whether an increase is expected; if it is not, the issue can be corrected before it grows.
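For repeatability, budgets and their alerts can be created with boto3 as well. In this minimal sketch, the account ID, budget name, dollar amount, threshold, and email address are all hypothetical placeholders:

```python
import boto3

budgets = boto3.client("budgets")

# Create a monthly cost budget with an alert at 80% of actual spend.
# Account ID, amount, and email address are hypothetical placeholders.
budgets.create_budget(
    AccountId="123456789012",
    Budget={
        "BudgetName": "monthly-cost-budget",
        "BudgetLimit": {"Amount": "5000", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "finops@example.com"}
            ],
        }
    ],
)
```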
Another excellent tool is AWS Cost Anomaly Detection. This tool uses machine learning to identify when your cloud expenditure begins to deviate from its normal pattern. Best of all, it is a free service and can even provide root-cause analysis for a cost anomaly, helping you pinpoint where the cost issue is occurring.
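Assuming you already have anomaly monitors configured, detected anomalies can also be listed programmatically. A minimal sketch, with an illustrative date range:

```python
import boto3

ce = boto3.client("ce", region_name="us-east-1")

# List cost anomalies detected in a recent window; dates are illustrative.
response = ce.get_anomalies(
    DateInterval={"StartDate": "2023-01-01", "EndDate": "2023-01-31"},
    MaxResults=10,
)

for anomaly in response.get("Anomalies", []):
    impact = anomaly.get("Impact", {}).get("TotalImpact")
    print(anomaly.get("AnomalyId"), "estimated dollar impact:", impact)
```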
This step is also where I work with the client to develop their cloud financial management plan. By the time you get here, you’ve uncovered many of the things that led to the extraneous spending, which allows you to develop a solid cost management plan that controls costs over the long term. Plans are made for who is responsible for implementing and enforcing cost allocation tags, which let users attribute costs to specific resources within reports. Automated reporting is set up in this step to notify the proper stakeholders of cloud spend, and processes are developed to regularly use the AWS Cost Management tools to monitor costs.
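As one small example of enforcing a tagging standard programmatically, the sketch below applies tags to an existing resource through the Resource Groups Tagging API. The ARN and tag values are hypothetical, and note that a tag key must still be activated as a cost allocation tag in the Billing console before it appears in cost reports:

```python
import boto3

tagging = boto3.client("resourcegroupstaggingapi")

# Apply cost allocation tags to an existing resource.
# The ARN and tag values are hypothetical; adapt to your tagging standard.
tagging.tag_resources(
    ResourceARNList=[
        "arn:aws:ec2:us-east-1:123456789012:instance/i-0abc123def4567890"
    ],
    Tags={"CostCenter": "analytics", "Project": "data-pipeline"},
)
```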
Step 4: Identify modernization options and plan for implementation
This step involves putting application modernization plans in place, or at the very least discussing what application modernization options are available for workloads that could benefit from adjustments to lower costs. An example would be taking a process running on an EC2 instance (think a scheduled cron job running a batch process) and modernizing it to use an Amazon EventBridge scheduled rule that invokes a Lambda function to do the task, as sketched below. This is also the time to revisit the information gathered in Step 1 about a cloud workload’s current architecture and determine whether it is the right one for long-term cost optimization.
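A minimal boto3 sketch of that pattern follows; the rule name, schedule expression, and Lambda function ARN are hypothetical placeholders:

```python
import boto3

# Hypothetical ARN of an existing Lambda function that performs the batch task.
LAMBDA_ARN = "arn:aws:lambda:us-east-1:123456789012:function:nightly-batch-job"

events = boto3.client("events")
lambda_client = boto3.client("lambda")

# Create a scheduled rule that fires every night at 2:00 AM UTC,
# replacing a cron job that previously ran on an always-on EC2 instance.
rule = events.put_rule(
    Name="nightly-batch-schedule",
    ScheduleExpression="cron(0 2 * * ? *)",
    State="ENABLED",
)

# Allow EventBridge to invoke the Lambda function.
lambda_client.add_permission(
    FunctionName=LAMBDA_ARN,
    StatementId="allow-eventbridge-nightly-batch",
    Action="lambda:InvokeFunction",
    Principal="events.amazonaws.com",
    SourceArn=rule["RuleArn"],
)

# Point the rule at the Lambda function.
events.put_targets(
    Rule="nightly-batch-schedule",
    Targets=[{"Id": "nightly-batch-target", "Arn": LAMBDA_ARN}],
)
```

With this change, you pay only for the seconds the Lambda function runs instead of keeping an EC2 instance running around the clock.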
This step requires someone experienced in AWS solutions to determine what is actually viable for modernization and what the return on investment can be over time. Typically, long-term cost optimization options require some investment in modernizing the workloads, and an experienced AWS cloud architect will help you take full advantage of this step.
Conclusion
This article only scratches the surface of how to implement cost optimization, but the FUMI method has helped many of our clients gain short-term cost savings and put themselves in a better long-term position. I recently completed a cost-optimization project for one of our clients using the FUMI methodology, and after two months their AWS bill is 55% lower. Cost optimization works, and in some cases it can be achieved very quickly. If you are looking to get a better handle on your AWS cloud spend, become familiar with the AWS Cost Management tools within the links below, implement the FUMI methodology, and if you need expert help, feel free to reach out.