Scale in at your own pace with Compute Engine autoscaler controls
In a Compute Engine environment, managed instance groups (MIGs) offer autoscaling that lets you automatically change an instance group’s capacity based on current load, so you can rightsize your environment—and your costs. Autoscaling adds more virtual machines (VMs) to a MIG when there is more load (scaling out), and deletes VMs when the need for VMs is lesser (scaling in).
When load declines, the autoscaler removes all unused capacity. This allows you to save costs but might cause the autoscaler to scale in abruptly. For example, if the load goes down by 50%, the autoscaler removes ~50% of your VMs immediately after a 10-minute stabilization period. Deleting VMs so abruptly might not work well for some workloads. For example, if your VMs take many minutes to initialize you might want to slow down the rate at which you scale in to maintain capacity for imminent load spikes.
Introducing scale-in controls
New scale-in controls in Compute Engine let you limit the VM deletion rate by preventing the autoscaler from reducing a MIG’s size by more VM instances than your workload can tolerate to lose.
When you configure autoscaler scale-in controls, you control the speed at which you scale in. The autoscaler never scales in faster than your configured rate, as shown in this diagram:
When load declines, the autoscaler maintains the size for the group at a level required to serve the peak load observed in the last 10 minutes (the stabilization period). This works the same with and without scale-in controls.
An autoscaler without scale-in controls keeps only enough instances required to handle the recently observed load. After the stabilization period, the autoscaler removes all unneeded instances in one step. A sudden drop in load can lead to a dramatic reduction of instance group size.
An autoscaler with scale-in controls limits how many VM instances can be removed in a given period of time (here 10 VMs in 20 minutes). This slows down the instance reduction rate.
With the load spikes again, the autoscaler adds new instances. However, because VMs take a long time to initialize, the new VMs aren’t ready to serve the load. With scale-in controls, the previous capacity wasn’t deleted yet, allowing existing VMs to serve the spike.
You can set up scale-in controls in the Google Cloud Console. Select an autoscaled MIG from the instance groups page and click Edit group. Under the Autoscaling section, set your scale-in controls:
You can also configure scale-in controls programmatically. Here’s the same command written using CLI (gcloud):
For more details including configuration using API refer to the documentation.
Try scale-in controls today
Scale-in controls are generally available across all regions. For more information on how you can control the scale-in rate, check out the docs.
Related Google News:
- Scaling deep retrieval with TensorFlow Recommenders and Vertex AI Matching Engine May 1, 2023
- Unleash your Google Cloud data with ThoughtSpot, Looker, and BigQuery May 1, 2023
- Track, Trace and Triumph: How Utah Division of Wildlife Resources is harnessing Google Cloud to… May 1, 2023
- Seeing the World: Vertex AI Vision Developer Toolkit May 1, 2023
- BBC: Keeping up with a busy news day with an end-to-end serverless architecture May 1, 2023
- Scalable electronic trading on Google Cloud: A business case with BidFX May 1, 2023
- Google Cloud and Equinix: Building Excellence in ML Operations (MLOps) May 1, 2023
- Effingo: the internal Google copy service moving data at scale May 1, 2023