When you create a Compute Engine instance, you must choose the method, called the provisioning model, that Compute Engine uses to obtain your requested resources. The provisioning model determines the availability, lifespan, and pricing of your compute instances.
This document explains the different provisioning models that you can specify when you create compute instances. By understanding these models, you can choose the best option for your workload.
Available provisioning models
When you create a compute instance, you can specify one of the following provisioning models. If you don't specify a provisioning model, then Compute Engine uses the standard provisioning model by default.
Standard
Spot
Flex-start
Reservation-bound
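
As a rough illustration of how the choice and its default might look in a request, the following Python sketch builds only the scheduling fragment of an instance definition. It assumes that the provisioning model is set through a `provisioningModel` field under `scheduling`, with the values `STANDARD`, `SPOT`, `FLEX_START`, and `RESERVATION_BOUND`; verify these names against the Compute Engine API reference before relying on them. The helper name `scheduling_for` is hypothetical.

```python
from typing import Optional

# Assumed API values for the four provisioning models listed above.
PROVISIONING_MODELS = {
    "standard": "STANDARD",
    "spot": "SPOT",
    "flex-start": "FLEX_START",
    "reservation-bound": "RESERVATION_BOUND",
}


def scheduling_for(model: Optional[str] = None) -> dict:
    """Return the scheduling fragment for a provisioning model (hypothetical helper).

    When no model is given, the fragment is left empty, which mirrors the
    documented behavior that Compute Engine defaults to the standard model.
    """
    if model is None:
        return {}
    key = model.strip().lower()
    if key not in PROVISIONING_MODELS:
        raise ValueError(f"unknown provisioning model: {model!r}")
    return {"provisioningModel": PROVISIONING_MODELS[key]}


default_fragment = scheduling_for()       # no explicit model: standard by default
spot_fragment = scheduling_for("Spot")    # explicit Spot request
```

Leaving the fragment empty for the default case keeps the sketch aligned with the text: the standard model applies only because nothing else was specified.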
The following table helps you compare the use cases and pricing for each provisioning model:
| | Standard | Spot | Flex-start | Reservation-bound |
|---|---|---|---|---|
| Summary | | | | |
| Use cases | Ideal for workloads that require stability and continuous operation, such as the following workloads: | Ideal for workloads that can tolerate interruptions, such as the following workloads: | Workloads that require stability and need to run for no more than seven days, such as the following workloads: | Ideal for workloads that require stability and a specific run time, such as the following: |
| Resource allocation | Best-effort. Compute Engine physically places resources close to each other on a best-effort basis. To control placement, you can optionally use placement policies. | Best-effort. Compute Engine physically places resources close to each other on a best-effort basis. To control placement, you can optionally use placement policies. | Resource allocation varies based on how you create compute instances: | Dense. Compute Engine physically places resources on tightly coupled hosts connected by a high-speed network fabric to minimize network latency. |
| Pricing | | | | |
| Quota | When you create a compute instance, standard quota is consumed. | When you create a compute instance, preemptible quota is consumed. If your project lacks preemptible quota, then standard quota is consumed. Google Cloud Free Tier credits don't apply to Spot VMs. | When a managed instance group (MIG) adds compute instances to the group, preemptible quota is consumed. If your project lacks preemptible quota, then standard quota is consumed. | Quota doesn't apply to the reservation-bound provisioning model. However, you still need quota for the resources that aren't part of your reserved capacity, such as disks and IP addresses. |
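
The quota rules in the last row of the table can be summarized as a small decision function. The following Python sketch is a simplified model of that row only, not an API call; the function name `quota_consumed` and the returned labels are hypothetical.

```python
def quota_consumed(model: str, has_preemptible_quota: bool = True) -> str:
    """Return which quota type instance creation consumes (simplified model).

    Returns "standard", "preemptible", or "none". For reservation-bound
    instances, "none" covers only the reserved capacity; resources outside
    the reservation, such as disks and IP addresses, still need quota.
    """
    key = model.strip().lower()
    if key == "standard":
        return "standard"
    if key in ("spot", "flex-start"):
        # Preemptible quota is used first; standard quota is the fallback
        # when the project has no preemptible quota.
        return "preemptible" if has_preemptible_quota else "standard"
    if key == "reservation-bound":
        return "none"
    raise ValueError(f"unknown provisioning model: {model!r}")


spot_with_quota = quota_consumed("Spot", has_preemptible_quota=True)
spot_without_quota = quota_consumed("Spot", has_preemptible_quota=False)
```

The fallback branch is the part that is easy to overlook: a project without preemptible quota can still create Spot or Flex-start instances, but doing so draws on standard quota.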
Compute instance availability and lifespan
The following table shows compute instance availability and lifespan for each provisioning model:
| | Standard | Spot | Flex-start | Reservation-bound |
|---|---|---|---|---|
| Creation prerequisites | No creation prerequisites. | No creation prerequisites. | No creation prerequisites. | To create compute instances, you must first reserve capacity using one of the following methods: At your chosen delivery date and time, Compute Engine provisions your requested capacity. Then, you can consume the capacity by creating compute instances. |
| Supported machine series | You can use any machine series, except A4X Max, A4X, A4, and A3 Ultra. | You can use any machine series, except A4X instances and any bare metal instances (A4X Max, C4D, C4, C3, X4, and Z3). | You can only use the following machine series: | Based on how you reserve capacity to create VMs, you can only use the following machine series: |
| Compute instance availability | You can create compute instances at any time, as long as your requested resources are available. | You can create compute instances at any time, as long as your requested resources are available. | You can create compute instances as follows: Compute Engine uses Dynamic Workload Scheduler (DWS) to schedule the provisioning of your requested capacity based on resource availability. DWS helps you obtain high-demand resources like GPUs. | You can only create compute instances after reserving capacity for a future date. On your requested date, Compute Engine delivers your requested capacity, which you can then use to create compute instances. If you reserve resources using future reservations in calendar mode, then Compute Engine uses DWS to provision your requested capacity. DWS helps you obtain high-demand resources like GPUs. |
| Capacity assurance | Based on the creation method. Capacity assurance varies based on the method that you use to create compute instances as follows: | Best-effort. When you create Spot VMs, Compute Engine makes best-effort attempts to provision your requested capacity. | Best-effort. When you create a MIG resize request, Compute Engine makes best-effort attempts to schedule the provisioning of your requested capacity. | Very high. If Google Cloud approves your reservation request, then you have very high assurance that Compute Engine provisions your reserved capacity at your chosen delivery date and time. You have exclusive access to your reserved capacity for the reservation period. |
| Compute instance lifespan | You can control when to stop or delete a compute instance. However, if the machine type that the compute instance uses doesn't support live migration, then Compute Engine stops the compute instance during host maintenance events. | You can control when to stop or delete a compute instance, except in the following cases: | Before a compute instance reaches the end of its run duration, you can do the following: When a compute instance reaches the end of its run duration, Compute Engine deletes it. | You can control when to stop or delete a compute instance, except in the following cases: |
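
To make the Flex-start lifespan concrete, the following Python sketch estimates when Compute Engine would delete an instance, given a start time and a requested run duration. It relies only on two statements from the tables: Flex-start workloads run for no more than seven days, and the instance is deleted when its run duration ends. Treating exactly seven days as allowed is an assumption, and the function name `flex_start_end_time` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Upper bound taken from the Flex-start use case description (seven days).
MAX_FLEX_START_RUN = timedelta(days=7)


def flex_start_end_time(started_at: datetime, run_duration: timedelta) -> datetime:
    """Return the time at which the run duration ends (hypothetical helper).

    Raises ValueError if the duration is not positive or exceeds seven days.
    """
    if run_duration <= timedelta(0):
        raise ValueError("run duration must be positive")
    if run_duration > MAX_FLEX_START_RUN:
        raise ValueError("Flex-start run duration can't exceed seven days")
    return started_at + run_duration


start = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
end = flex_start_end_time(start, timedelta(days=2, hours=6))
```

Because the instance is deleted rather than stopped at that time, a workload should save its results to durable storage before the computed end time.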
What's next
Learn more about Spot VMs.
Learn more about Flex-start VMs.
Learn more about compute instances that use the reservation-bound provisioning model.