Reserved Instances provide guaranteed compute capacity and are available at various discounts in exchange for long-term commitments, while Spot Instances provide spare compute capacity at a very low price, but only when your bid price is high enough. Although they work differently, both offer opportunities for significant savings when used properly. In this, the final part of the series, we will discuss Auto Scaling Groups, a free AWS service that controls groups of instances using various features, and can also be used to reduce overall spending by shutting down unused resources. Let’s look at how it works.
AWS Auto Scaling Groups Overview

An AWS Auto Scaling Group contains a collection of EC2 instances (On-Demand, Reserved, or Spot) defined by a Launch Configuration (a template specifying the AMI, instance type, security groups, etc.), which the group controls in different ways depending on your needs. The basic configuration lets you choose the desired, minimum, and maximum number of instances; the load balancer (if any) to place in front of them; and the subnets and Availability Zones to run them in.

Auto Scaling Groups can be used to control backend resources behind an ELB, provide self-healing (when an instance crashes, the group immediately provisions a new one to maintain the desired capacity), simplify deployments (regular releases, blue/green deployments, etc.), and for many other use cases. What we are really interested in here, however, is how to use them to reduce EC2 spending.

Unnecessary spending on EC2 is usually caused by unused or underused compute resources that inflate your monthly bill. This is an age-old problem: you provision more than you need to make sure you can handle not only the expected traffic, but also the unexpected. An Auto Scaling Group solves this by handling the scalability requirements for you.

Let’s say you need compute capacity equivalent to one m4.4xlarge instance. Looking at the actual CPU usage (the CloudWatch metric), however, you notice that it varies between 20% and 80% during the day. Not very efficient, is it? So how can an Auto Scaling Group help? One of the features it provides is the Scaling Policy, which lets you scale resources based on demand. So instead of running one m4.4xlarge instance, you simply create an Auto Scaling Group of four m4.xlarge instances, along with a policy that reduces the number of running instances by two when CPU usage drops below 50%.
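The decision logic of that policy can be sketched in a few lines. This is only an illustration of the step thresholds described above, not the AWS API; the thresholds and instance counts are the ones from the example:

```python
def desired_capacity(avg_cpu, current=4):
    """Sketch of the step-scaling policy from the example: start
    from four m4.xlarge instances, drop to two when average CPU
    falls below 50%, and to one when it falls below 25%.
    Thresholds and counts are illustrative, not an AWS API."""
    if avg_cpu < 25:
        return 1          # minimal footprint at very low demand
    if avg_cpu < 50:
        return 2          # halve capacity at moderate demand
    return current        # keep full capacity under load

print(desired_capacity(70))  # high load -> 4
print(desired_capacity(40))  # moderate load -> 2
print(desired_capacity(10))  # low load -> 1
```

In the real service, the CPU value would come from a CloudWatch alarm, and the capacity change would be applied by the Auto Scaling Group itself, subject to its configured minimum and maximum size.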
When CPU usage drops below 25%, the policy reduces the number of running instances to just one. With this simple policy (which can be refined with more fine-tuning), you ensure that when demand for compute capacity is low, fewer resources are provisioned, thereby reducing the cost. Moreover, adding a policy to scale back up when demand is higher creates a fully automated environment that scales dynamically without your intervention.

If you have predictable workloads, e.g., peak usage between 1pm and 6pm on weekdays, you can also use Scheduled Actions to set up recurring scaling during the desired time period. This feature is good for scaling down when you know you won’t need compute resources, and also for pre-emptive scaling up ahead of events such as Black Friday. Here are a few tips to keep in mind when getting started with scaling policies:
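A recurring Scheduled Action like the weekday-peak example can be thought of as a simple function of the clock. The sketch below mimics that behavior; the capacities and the 1pm–6pm window are illustrative values, not a real AWS configuration:

```python
from datetime import datetime

PEAK_CAPACITY = 4      # instances during the weekday peak (illustrative)
OFF_PEAK_CAPACITY = 1  # instances outside the peak window

def scheduled_capacity(now: datetime) -> int:
    """Mimic a recurring Scheduled Action: scale up for the
    1pm-6pm weekday peak, scale down the rest of the time."""
    is_weekday = now.weekday() < 5   # Mon=0 .. Fri=4
    in_peak = 13 <= now.hour < 18    # 1pm-6pm
    return PEAK_CAPACITY if (is_weekday and in_peak) else OFF_PEAK_CAPACITY

print(scheduled_capacity(datetime(2024, 1, 3, 14)))  # Wed 2pm -> 4
print(scheduled_capacity(datetime(2024, 1, 6, 14)))  # Sat 2pm -> 1
```

In AWS itself you would express the same schedule as a cron-style recurrence on the Scheduled Action, and the group would adjust its desired capacity at those times automatically.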
- When you first start creating scaling policies, start small and monitor your scaling activities using Activity History to make sure your policies don’t cause instability in your environment.
- Scale up quickly to meet the demand, and scale down slowly to make sure you don’t remove resources too soon.
- Scale up and down by symmetrical amounts; this prevents inconsistent scaling and ensures you end up with exactly the amount of resources you planned for.
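The "scale up quickly, scale down slowly" tip is often implemented with asymmetric cooldowns: the step size stays the same in both directions, but the group waits much longer before removing capacity than before adding it. A minimal sketch, with illustrative cooldown values:

```python
from datetime import datetime, timedelta

SCALE_UP_COOLDOWN = timedelta(minutes=2)     # react fast to load spikes
SCALE_DOWN_COOLDOWN = timedelta(minutes=15)  # shed capacity cautiously

def can_scale(direction: str, last_action: datetime, now: datetime) -> bool:
    """Allow a scaling action only after its cooldown has elapsed.
    Symmetric step sizes, asymmetric timing: scale-in waits far
    longer, so resources aren't removed too soon."""
    cooldown = SCALE_UP_COOLDOWN if direction == "up" else SCALE_DOWN_COOLDOWN
    return now - last_action >= cooldown

t0 = datetime(2024, 1, 1, 12, 0)
print(can_scale("up", t0, t0 + timedelta(minutes=3)))    # True
print(can_scale("down", t0, t0 + timedelta(minutes=3)))  # False
```

AWS exposes the same idea through the group's cooldown settings and per-policy warm-up periods; the exact numbers should come from watching your own Activity History.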