Google Compute Engine

from Wikipedia
Original author(s): Google, Inc.
Developer: Google
Initial release: June 28, 2012[1]
Operating systems: Linux, Windows
Available in: English
Type: Virtual private server
License: Proprietary software
Website: cloud.google.com/compute/

Google Compute Engine (GCE) is the infrastructure as a service (IaaS) component of Google Cloud Platform, built on the global infrastructure that runs Google's search engine, Gmail, YouTube and other services. Google Compute Engine enables users (utilising OAuth 2.0 authentication) to launch virtual machines (VMs) on demand. VMs can be launched from standard images or from custom images created by users. Google Compute Engine can be accessed via the Developer Console, a RESTful API or a command-line interface (CLI).

History


Google announced Compute Engine on June 28, 2012, at Google I/O 2012 in limited preview mode. On February 25, 2013, Google announced that RightScale was its first reseller.[2] In April 2013, GCE was made available to customers with the Gold Support Package. During Google I/O 2013, many features were announced, including sub-hour billing, shared-core instance types, larger persistent disks, enhanced SDN-based networking capabilities and ISO/IEC 27001 certification. GCE became available to everyone on May 15, 2013. Layer 3 load balancing came to GCE on August 7, 2013. Finally, on December 2, 2013, Google announced that GCE was generally available; it also expanded OS support, enabled live migration of VMs, added 16-core instances and faster persistent disks, and lowered the price of standard instances.

At the Google Cloud Platform Live event on March 25, 2014, Urs Hölzle, Senior VP of Technical Infrastructure, announced sustained usage discounts, support for Microsoft Windows Server 2008 R2, Cloud DNS and Cloud Deployment Manager. On May 28, 2014, Google announced optimizations for LXC containers along with dynamic scheduling of Docker containers across a fleet of VM instances.[3]

Google Compute Engine Unit


The Google Compute Engine Unit (GCEU), pronounced "GQ", is an abstraction of computing resources. According to Google, 2.75 GCEUs represent the minimum power of one logical core (a hardware hyper-thread) on the Sandy Bridge platform. The GCEU was created by Anthony F. Voellm out of a need to compare the performance of virtual machines offered by Google. It is approximated by the CoreMark benchmark run as part of PerfKit Benchmarker, an open-source benchmarking suite created by Google in partnership with several cloud providers.

Persistent disks


Every Google Compute Engine instance starts with a disk resource called a persistent disk. The persistent disk provides the disk space for an instance and contains the root filesystem from which the instance boots. Persistent disks can also be used as raw block devices. By default, Google Compute Engine uses SCSI for attaching persistent disks. Persistent disks provide straightforward, consistent and reliable storage at a predictable price, removing the need for a separate local ephemeral disk. Persistent disks must be created before launching an instance; once attached to an instance, they can be formatted with a native filesystem. A single persistent disk can be attached to multiple instances in read-only mode. Each persistent disk can be up to 10 TB in size. Google Compute Engine encrypts persistent disks with AES-128-CBC, and this encryption is applied before the data leaves the virtual machine monitor and reaches the disk. Encryption is always enabled and is transparent to Google Compute Engine users. The integrity of persistent disks is maintained via an HMAC scheme.

On June 18, 2014, Google announced support for SSD persistent disks. These disks deliver up to 30 IOPS per GB which is 20x more write IOPS and 100x more read IOPS than the standard persistent disks.
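
Creating a disk before launching an instance, as described above, can be done programmatically. The following is a minimal sketch using the google-cloud-compute Python client; the project, zone, and disk names are illustrative placeholders, and the pd-ssd type corresponds to the SSD persistent disks mentioned in this section.

```python
# Sketch: create a standalone SSD persistent disk that can later be
# attached to an instance. Names are placeholders.
from google.cloud import compute_v1

def create_ssd_disk(project: str, zone: str, name: str, size_gb: int):
    disk = compute_v1.Disk()
    disk.name = name
    disk.size_gb = size_gb
    # Disk type is a zonal resource path; pd-ssd selects an SSD persistent disk.
    disk.type_ = f"zones/{zone}/diskTypes/pd-ssd"

    client = compute_v1.DisksClient()
    operation = client.insert(project=project, zone=zone, disk_resource=disk)
    operation.result()  # block until the zonal operation completes
    return client.get(project=project, zone=zone, disk=name)

disk = create_ssd_disk("my-project", "us-central1-a", "scratch-disk", 100)
print(disk.self_link)
```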

Images


An image is a persistent disk that contains the operating system and root file system necessary for starting an instance. An image must be selected when creating an instance or when creating a root persistent disk. By default, Google Compute Engine installs the root filesystem defined by the image on a root persistent disk. Google Compute Engine provides CentOS and Debian images as standard Linux images. Red Hat Enterprise Linux (RHEL) and Microsoft Windows Server 2008 R2 images are part of the premium operating system images, which are available for an additional fee. Container Linux (formerly CoreOS), a lightweight Linux OS based on Chromium OS, is also supported on Google Compute Engine.

Machine types


Google Compute Engine uses KVM as the hypervisor[4] and supports guest images running Linux and Microsoft Windows, which are used to launch virtual machines based on the 64-bit x86 architecture. VMs boot from a persistent disk that has a root filesystem. The number of virtual CPUs and the amount of memory supported by a VM depend on the machine type selected.

Billing and discounts


Google Compute Engine offers sustained use discounts. Once an instance runs for over 25% of a billing cycle, the price starts to drop (see the sketch after this list):

  • If an instance is used for 50% of the month, it receives a 10% discount off the on-demand price
  • If an instance is used for 75% of the month, it receives a 20% discount off the on-demand price
  • If an instance is used for 100% of the month, it receives a 30% discount off the on-demand price
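
As a concrete illustration of the tiers above, here is a small Python sketch. It assumes the simplified month-fraction tiers listed in this article; the production billing model applies discounts incrementally per usage bracket rather than as a flat rate.

```python
# Sketch of the tiered sustained-use discount described above.
def effective_hourly_rate(on_demand_rate: float, fraction_of_month: float) -> float:
    if fraction_of_month >= 1.0:
        discount = 0.30
    elif fraction_of_month >= 0.75:
        discount = 0.20
    elif fraction_of_month >= 0.50:
        discount = 0.10
    else:
        discount = 0.0
    return on_demand_rate * (1.0 - discount)

# An n1-standard-1 at $0.070/hour running the full month:
print(effective_hourly_rate(0.070, 1.0))  # 0.049
```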

Machine type comparison


Google provides several machine types:

  • Standard machine: 3.75 GB of RAM per virtual CPU
  • High-memory machine: 6.5 GB of RAM per virtual CPU
  • High-CPU machine: 0.9 GB of RAM per virtual CPU
  • Shared machine: CPU and RAM are shared between customers
  • Memory-optimized machine: greater than 14 GB of RAM per vCPU

The prices listed below[5] are for running standard Debian or CentOS Linux virtual machines (VMs); VMs running proprietary operating systems are charged more.

Machine type | Machine name | Virtual cores | Memory | Cost per hour (US hosted) | Cost per hour (Europe hosted)
Standard | n1-standard-1 | 1 | 3.75 GB | $0.070 | $0.077
Standard | n1-standard-2 | 2 | 7.5 GB | $0.140 | $0.154
Standard | n1-standard-4 | 4 | 15 GB | $0.280 | $0.308
Standard | n1-standard-8 | 8 | 30 GB | $0.560 | $0.616
Standard | n1-standard-16 | 16 | 60 GB | $1.120 | $1.232
High Memory | n1-highmem-2 | 2 | 13 GB | $0.164 | $0.180
High Memory | n1-highmem-4 | 4 | 26 GB | $0.328 | $0.360
High Memory | n1-highmem-8 | 8 | 52 GB | $0.656 | $0.720
High Memory | n1-highmem-16 | 16 | 104 GB | $1.312 | $1.440
High CPU | n1-highcpu-2 | 2 | 1.80 GB | $0.088 | $0.096
High CPU | n1-highcpu-4 | 4 | 3.60 GB | $0.176 | $0.192
High CPU | n1-highcpu-8 | 8 | 7.20 GB | $0.352 | $0.384
High CPU | n1-highcpu-16 | 16 | 14.40 GB | $0.704 | $0.768
Shared Core | f1-micro | 0.2 | 0.60 GB | $0.013 | $0.014
Shared Core | g1-small | 0.5 | 1.70 GB | $0.035 | $0.0385
Memory-optimized | n1-ultramem-40 | 40 | 938 GB | $6.3039 | $6.9389
Memory-optimized | n1-ultramem-80 | 80 | 1922 GB | $12.6078 | $13.8779
Memory-optimized | n1-megamem-96 | 96 | 1433.6 GB | $10.6740 | $11.7430
Memory-optimized | n1-ultramem-160 | 160 | 3844 GB | $25.2156 | $27.7557

Resources


Compute Engine connects various entities called resources that will be a part of the deployment. Each resource performs a different function. When a virtual machine instance is launched, an instance resource is created that uses other resources, such as disk resources, network resources and image resources. For example, a disk resource functions as data storage for the virtual machine, similar to a physical hard drive, and a network resource helps regulate traffic to and from the instances.

Image


An image resource contains an operating system and root file system necessary for starting an instance. Google maintains and provides ready-to-use images, and users can also customize an image and use it as the image of choice when creating instances. Depending on their needs, users can also apply an image to a persistent disk and use that persistent disk as the root file system.

Machine type


An instance's machine type determines the number of cores, the memory, and the I/O operations supported by the instance.

Disk


Persistent disks are independent of the virtual machines and outlive an instance's lifespan. All information stored on the persistent disks is encrypted before being written to physical media, and the keys are tightly controlled by Google.

Type | Price (per GB/month)
Standard provisioned space | $0.04
SSD provisioned space | $0.17
Snapshot storage | $0.026
IO operations | No additional charge

Each instance can attach only a limited amount of total persistent disk space (up to 64 TB on most instances) and a limited number of individual persistent disks (up to 16 independent persistent disks on most instances).

Regional persistent disks can be replicated between two zones in a region for higher availability.[6]

Snapshot


Persistent disk snapshots let users copy data from an existing persistent disk and apply it to new persistent disks. This is especially useful for creating backups of persistent disk data in cases of unexpected failures and zone maintenance events.

Instance


A Google Compute Engine instance is a virtual machine running a Linux or Microsoft Windows configuration. Users can modify instances, including customizing the hardware, OS, disk, and other configuration options.

Network


A network defines the address range and gateway address of all instances connected to it. It defines how instances communicate with each other, with other networks, and with the outside world. Each instance belongs to a single network and any communication between instances in different networks must be through a public IP address.

A Cloud Platform Console project can contain multiple networks, and each network can have multiple instances attached to it. A network allows the user to define a gateway IP and the network range for the instances attached to that network. By default, every project is provided with a default network with preset configurations and firewall rules. Users can choose to customize the default network by adding or removing rules, or they can create new networks in that project. Generally, most users only need one network, although there can be up to five networks per project by default.

A network belongs to only one project, and each instance can only belong to one network. All Compute Engine networks use the IPv4 protocol. Compute Engine currently does not support IPv6. However, Google is a major advocate of IPv6 and it is an important future direction.

Address


When an instance is created, an ephemeral external IP address is automatically assigned to it by default. This address is attached for the life of the instance and is released once the instance is terminated. GCE also provides a mechanism to reserve static IPs and attach them to VMs. An ephemeral IP address can be promoted to a static IP address, as sketched below.
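
A minimal sketch of the promotion mechanism with the google-cloud-compute Python client: reserving an address whose value is the instance's current ephemeral IP converts it to a static address. Project, region, and IP values are placeholders.

```python
# Sketch: promote an instance's ephemeral external IP to a static address
# by reserving it as a regional address resource.
from google.cloud import compute_v1

def promote_ephemeral_ip(project: str, region: str, name: str, ephemeral_ip: str):
    address = compute_v1.Address()
    address.name = name
    address.address = ephemeral_ip  # reserving an in-use IP promotes it to static

    client = compute_v1.AddressesClient()
    operation = client.insert(project=project, region=region, address_resource=address)
    operation.result()
    return client.get(project=project, region=region, address=name)

static = promote_ephemeral_ip("my-project", "us-central1", "web-ip", "203.0.113.10")
print(static.status)  # IN_USE, since the IP remains attached to the VM
```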

Firewall


A firewall resource contains one or more rules that permit connections into instances. Every firewall resource is associated with one and only one network. It is not possible to associate one firewall with multiple networks. No communication is allowed into an instance unless a firewall resource permits the network traffic, even between instances on the same network.

Route


Google Compute Engine offers a routing table to manage how traffic destined for a certain IP range should be routed. Similar to a physical router in the local area network, all outbound traffic is compared to the routes table and forwarded appropriately if the outbound packet matches any rules in the routes table.

Regions and zones


A region refers to the geographic location of a Google infrastructure facility. Users can choose to deploy their resources in one of the available regions based on their requirements. As of June 1, 2014, Google Compute Engine is available in the Central US, Western Europe and Asia East regions.

A zone is an isolated location within a region. Zones have high-bandwidth, low-latency network connections to other zones in the same region. To deploy fault-tolerant applications with high availability, Google recommends deploying applications across multiple zones in a region. This helps protect against unexpected failures of components, up to and including a single zone. As of August 5, 2014, there are eight zones: three each in the Central US and Asia East regions, and two in the Western Europe region.

Scope of resources


All resources within GCE belong to the global, regional, or zonal plane. Global resources are accessible from all regions and zones; for example, images are a global resource, so users can launch a VM in any region based on a global image. An address, by contrast, is a regional resource that is available only to instances launched in one of the zones within that region. Instances are zonal resources: the zone must be specified as part of all requests made to an instance.

The table below summarises the scope of GCE resources:

Scope | Resource
Global | Image
Global | Snapshot
Global | Network
Global | Firewall
Global | Route
Region | Address
Zone | Instance
Zone | Machine Type
Zone | Disk

Features


Billing and pricing model


Google bills VMs for a minimum of 10 minutes. After the tenth minute, instances are charged in 1-minute increments, rounded up to the nearest minute.[7] Sustained-usage-based pricing credits discounts to customers based on monthly utilisation.[8][9] Users need not pay an upfront commitment fee to receive discounts on the regular, on-demand pricing. A worked example of the minimum-charge rule follows.
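
A small Python sketch of the billing rule just described, assuming the 10-minute minimum and per-minute rounding stated above:

```python
# Sketch: 10-minute minimum charge, then per-minute billing rounded up.
import math

def billable_minutes(runtime_seconds: float) -> int:
    minutes = math.ceil(runtime_seconds / 60.0)
    return max(minutes, 10)  # minimum charge of 10 minutes

def charge(runtime_seconds: float, hourly_rate: float) -> float:
    return billable_minutes(runtime_seconds) * hourly_rate / 60.0

print(billable_minutes(4 * 60))          # 10 -> a 4-minute run bills as 10 minutes
print(round(charge(61 * 60, 0.070), 6))  # 61 minutes of n1-standard-1 at $0.070/hour
```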

VM performance


Compute Engine VMs boot within 30 seconds,[10] reportedly 4–10x faster than the competition.

Disk performance


Compute Engine's persistent disks consistently deliver high IOPS.[11] Because the cost of provisioned IOPS is included in the cost of storage, users need not pay separately for IOPS.[12]

Global scope for images and snapshots


Images and disk snapshots belong to the global scope, which means they are implicitly available across all regions and zones of Google Cloud Platform.[13] This avoids the need to export and import images and snapshots between regions.

Transparent maintenance


During scheduled maintenance of Google data centers, Compute Engine can automatically migrate VMs from one host to another without requiring any action from users. This delivers better uptime for applications.[14][15]

from Grokipedia
Google Compute Engine is an infrastructure-as-a-service (IaaS) offering from Google Cloud that enables users to create and manage virtual machine (VM) instances and bare metal servers on Google's global data center infrastructure. It provides scalable compute resources, allowing customers to run diverse workloads such as web applications, databases, machine learning, and batch processing without managing underlying hardware. Launched in preview on June 28, 2012, and achieving general availability on December 2, 2013, Compute Engine utilizes a KVM-based hypervisor to deliver reliable, self-managed instances with options for Linux and Windows operating systems, while bare metal servers provide direct hardware access without a virtualization layer.

Compute Engine supports a variety of machine types tailored to specific needs, including general-purpose (e.g., the N series and E2), compute-optimized (e.g., C2), memory-optimized (e.g., the M series), and accelerator-optimized instances equipped with GPUs, Google's custom Tensor Processing Units (TPUs), or Arm-based processors such as Axion for AI and machine learning tasks. Users can customize configurations for vCPUs, memory, and storage, with options like Persistent Disk for block storage, Local SSD for high-performance temporary storage, and Hyperdisk for advanced throughput. The service guarantees 99.9% uptime for most instances and 99.95% for memory-optimized VMs, featuring live migration to minimize downtime during maintenance.

Integrated seamlessly with other Google Cloud services, Compute Engine facilitates container orchestration via Google Kubernetes Engine (GKE), data analytics with BigQuery, and storage with Cloud Storage, enabling hybrid and multi-cloud architectures. Pricing models include pay-as-you-go, Spot VMs for up to 91% discounts on interruptible workloads, and committed use discounts for predictable savings, with a free tier offering one e2-micro instance monthly. Available across 42 regions and 127 zones worldwide as of 2025, it emphasizes security through features like shielded VMs, customer-managed encryption keys, and compliance with standards such as SOC, PCI DSS, and HIPAA.

History

Launch and Early Development

Google Compute Engine was announced on June 28, 2012, during the Google I/O developer conference as a limited-preview service within the Google Cloud Platform (GCP). This launch marked Google's entry into the infrastructure-as-a-service (IaaS) market, offering users the ability to provision and manage virtual machines (VMs) on its global infrastructure without the need to handle underlying hardware. The service was positioned to compete with offerings like Amazon EC2, emphasizing Google's strengths in performance, scalability, and cost-efficiency, with claims of providing 50% more compute power per dollar compared to competitors.

At launch, Google Compute Engine focused on delivering KVM-based virtual machines primarily for Linux operating systems, enabling developers and businesses to run large-scale workloads such as web applications and batch processing. Initial VM configurations supported up to 8 virtual CPUs and 3.75 GB of RAM per core, with persistent block storage for data durability. Key early integrations with other GCP services, such as Google Cloud Storage, allowed users to store and access unstructured data directly from VMs, facilitating seamless workflows for applications requiring object storage alongside compute resources. Access to the limited preview required sign-up and was initially restricted to selected developers, with no public pricing or general availability timeline disclosed.

In early 2013, the service transitioned from limited preview to a broader beta phase, ending invitation-only access and requiring users to provide billing details for continued access. By May 2013, Google opened the beta to all users via the Google Cloud Console, expanding availability and introducing initial machine type offerings like the n1-standard series, which balanced CPU and memory for general-purpose workloads (e.g., n1-standard-1 with 1 vCPU and 3.75 GB RAM). During this beta period, support for additional Linux distributions grew, and foundational features like live migration for maintenance were tested to ensure reliability. Windows support was introduced in limited preview later in development, broadening OS compatibility.

The service reached general availability on December 2, 2013, with a 99.95% monthly uptime SLA, 24/7 support, and reduced pricing to encourage broader adoption. This milestone solidified Google Compute Engine's role in GCP, transitioning it from experimental preview to a production-ready IaaS platform capable of supporting enterprise-scale deployments.

Major Milestones and Updates

Following its general availability in December 2013, Google Compute Engine saw key enhancements in operating system support and pricing models. In April 2014, the service introduced sustained use discounts, which automatically apply up to a 30% reduction for instances running more than 25% of a billing month, optimizing costs for long-running workloads without requiring commitments. Microsoft Windows Server support launched in limited preview that same year, enabling users to run Windows workloads on the platform, with expanded capabilities—including license mobility for existing on-premises licenses—announced on December 8, 2014.

Compute options diversified further in 2015 and 2016 to address interruptible and accelerated workloads. Preemptible virtual machines (now known as Spot VMs) debuted in beta on May 18, 2015, offering up to 70% discounts compared to on-demand pricing for batch jobs tolerant of interruptions (current Spot VMs offer up to 91% discounts), and achieved general availability in September 2015. Initial GPU-accelerated instances were announced on November 16, 2016, powered by NVIDIA Tesla K80 cards, and became available worldwide in early 2017 to support machine learning, data analytics, and scientific computing tasks.

Infrastructure growth accelerated through the late 2010s, with regions and zones expanding to enhance global availability and reduce latency. By mid-2020, Google Cloud had grown to 24 regions across 73 zones in 17 countries, up from just a handful at launch, facilitating broader adoption for distributed applications. Security integration advanced notably in 2020, when Confidential Computing launched with Confidential VMs on Compute Engine; these use hardware-based trusted execution environments to encrypt data in use, protecting sensitive AI/ML models and processing without performance overhead.

Recent updates from 2024 to 2025 emphasize performance for AI-driven and specialized workloads. In July 2024, Hyperdisk ML entered general availability as a high-throughput block storage option tailored for machine learning, delivering up to 1,200,000 MBps read throughput per volume to accelerate data loading for training pipelines across up to 2,500 attached VMs. September 2025 brought general availability of Flex-start VMs, which support short-duration tasks up to seven days using a flexible provisioning model that consumes Spot quota for cost savings on bursty or experimental workloads. The G4 accelerator-optimized machine series followed in October 2025, featuring NVIDIA RTX PRO 6000 Blackwell GPUs for graphics-intensive applications like virtual desktops and Omniverse simulations, available in multiple regions with low-latency networking.

November 2025 marked further hardware innovations, with the N4D VM series achieving general availability on November 7, powered by fifth-generation AMD EPYC processors and offering up to 96 vCPUs, 768 GB of DDR5 memory, and Titanium I/O for general-purpose tasks in regions like us-central1. On November 6, the N4A series entered preview, utilizing Google's custom Arm-based Axion processors, with configurations up to 64 vCPUs and 512 GB of DDR5 memory for efficient, scalable AI inference and web serving in limited regions such as us-central1 and europe-west3. These developments underscore ongoing efforts to balance cost, performance, and security in cloud infrastructure.

Overview and Core Concepts

Virtual Machine Instances

A virtual machine (VM) instance in Google Compute Engine is a self-managed virtual server that runs on Google's infrastructure using a KVM-based hypervisor, allowing users to deploy and operate workloads on customizable compute resources. These instances support both Linux and Windows operating systems and can be configured for a wide range of applications, from web servers to machine learning tasks.

The lifecycle of a Compute Engine VM instance progresses through distinct states, including provisioning (where resources are allocated), running (when the instance is active and operational), stopping (where the instance is shut down but resources are preserved), and terminating (where the instance is deleted and resources are released). Users can monitor and manage these states to ensure efficient resource utilization and application availability throughout the instance's duration. Instances are created through the Google Cloud Console for a graphical interface, the gcloud CLI for command-line automation, or the Compute Engine API for programmatic integration, with key steps involving selection of a machine type, bootable image, and deployment zone. This process enables rapid deployment tailored to specific workload requirements, such as compute capacity and geographic placement.

For scalable deployments, Compute Engine supports instance groups, which manage collections of identical VMs; managed instance groups (MIGs) provide advanced features like automatic healing, rolling updates, and autoscaling based on metrics such as CPU utilization or custom load balancing. MIGs ensure high availability by distributing instances across multiple zones and dynamically adjusting group size to match demand.

In September 2025, Google introduced Flex-start VMs in general availability, a feature for single-instance deployments with runtime limits up to seven days, optimized for bursty workloads like AI training or batch processing through a queuing system that improves resource access efficiency. Compute Engine also offers bare metal instances, which provide direct hardware access without hypervisor overhead, catering to low-latency applications such as financial trading or real-time analytics that require maximal performance and minimal interference. VM instances can attach to persistent storage options for durable data management, with details on these attachments covered in dedicated storage sections.
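
The creation steps described above (machine type, boot image, zone) map directly onto the Compute Engine API. Below is a minimal sketch using the google-cloud-compute Python client; project, zone, and instance names are placeholders, and e2-medium with a Debian 12 family image is just one possible selection.

```python
# Sketch: programmatic instance creation via the Compute Engine API.
from google.cloud import compute_v1

def create_instance(project: str, zone: str, name: str):
    # Boot disk initialized from a public image family (resolves to latest image).
    boot_disk = compute_v1.AttachedDisk()
    params = compute_v1.AttachedDiskInitializeParams()
    params.source_image = "projects/debian-cloud/global/images/family/debian-12"
    params.disk_size_gb = 10
    boot_disk.initialize_params = params
    boot_disk.boot = True
    boot_disk.auto_delete = True

    # Attach to the project's default VPC network.
    nic = compute_v1.NetworkInterface()
    nic.network = "global/networks/default"

    instance = compute_v1.Instance()
    instance.name = name
    instance.machine_type = f"zones/{zone}/machineTypes/e2-medium"
    instance.disks = [boot_disk]
    instance.network_interfaces = [nic]

    client = compute_v1.InstancesClient()
    operation = client.insert(project=project, zone=zone, instance_resource=instance)
    operation.result()  # wait until provisioning completes
    return client.get(project=project, zone=zone, instance=name)

vm = create_instance("my-project", "us-central1-a", "demo-vm")
print(vm.status)  # e.g. RUNNING
```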

Basic Resource Units

In Google Compute Engine, the fundamental resources are measured primarily in terms of virtual CPUs (vCPUs) and gigabytes (GB) of memory, which form the core building blocks for virtual machine instances. Historically, Google introduced the Google Compute Engine Unit (GCEU) as an abstraction for CPU capacity, where 2.75 GCEUs represented the compute power equivalent to one logical CPU core on an n1-standard-1 instance; however, this metric has been largely superseded in modern usage by direct vCPU and memory allocations for simplicity and alignment with hardware capabilities.

A vCPU in Compute Engine represents a single hardware hyper-thread (or thread) on the underlying physical processors, which include Intel Xeon Scalable, AMD EPYC, and Arm-based CPUs. By default, simultaneous multithreading (SMT, also known as hyper-threading) is enabled, allowing two vCPUs to share one physical core, thereby providing efficient resource utilization without dedicating full cores unless specified otherwise via configuration options. vCPUs can be allocated from 1 up to 384 per instance, depending on the machine type and series, with the exact mapping to physical hardware determined by the selected CPU platform.

Memory is allocated in increments of GB and is closely tied to vCPU counts, with predefined ratios varying by machine family to balance performance needs. For general-purpose standard machine types, the typical ratio is 4 GB of memory per vCPU, though ranges can extend from 3 to 7 GB per vCPU; specialized families like high-memory types offer up to 24 GB per vCPU, while high-CPU types provide as little as 0.9 GB per vCPU to prioritize processing power. Custom allocations allow flexibility within these bounds, ensuring memory scales proportionally to computational demands.

Disk resources are provisioned as block storage in GB, with Persistent Disks serving as the primary unit for durable, scalable storage attached to instances; quotas limit total disk size per project and region, with default limits often starting in the terabyte range for standard Persistent Disk, encompassing both SSD and HDD variants. Network bandwidth is another key allocatable unit, measured in Gbps for ingress and egress; while ingress is unlimited, egress bandwidth is capped per instance based on machine type—ranging from 1 Gbps for small instances to 200 Gbps for high-performance series—with premium Tier_1 networking options enabling higher sustained throughput for data-intensive workloads.

Compute Engine enforces quotas to manage resource availability, with default limits applied per project and per region to prevent overuse; for example, the standard CPU quota (total vCPUs) often starts at 8–24 per region for new projects as of early 2025, alongside corresponding memory quotas, and boot disks have a minimum size of 10 GB. These quotas are visible and adjustable via the Google Cloud console, where users can request increases through a form-based process, typically approved based on usage history and justification to accommodate scaling needs.

Infrastructure and Locations

Regions and Zones

Google Compute Engine organizes its infrastructure into regions and zones to provide geographical distribution, fault tolerance, and compliance options for deployments. A region is an independent geographic area, such as us-central1 in Council Bluffs, Iowa, that spans one or more physical locations and contains multiple zones. Each region operates independently, allowing users to select locations based on specific needs while ensuring resources within a region can communicate with low latency. As of November 2025, Google Cloud operates 42 regions worldwide, with expansions including new facilities in Europe, such as Stockholm, Sweden (europe-north2), and in North America, such as Querétaro, Mexico (northamerica-south1).

Zones represent isolated locations within a region, designed to enhance reliability by isolating failures such as power outages or network issues to a single zone without affecting others in the same region. For example, the us-central1 region includes zones like us-central1-a, us-central1-b, and us-central1-c, each hosting a portion of the region's capacity. With 127 zones available as of 2025, users can deploy instances across multiple zones within a region to achieve high availability, as resources in different zones are engineered to be failure-independent.

Selecting regions and zones involves evaluating factors like latency, data residency, and service availability to optimize performance and meet legal requirements. For instance, to minimize latency for users in Europe, one might choose the europe-west1 region in Belgium, while data residency rules such as the EU's GDPR may necessitate deploying in European regions to keep data within the continent. Availability considerations include checking zone-specific quotas and maintenance schedules to ensure uninterrupted operations. Multi-regional resources, such as replicated storage buckets, enable global replication across multiple regions for enhanced durability and accessibility, though their use ties into broader resource scoping policies.

Resource Scopes and Placement Policies

In Google Compute Engine, resources are organized into scopes that determine their availability and accessibility across the infrastructure. Zonal resources, such as instances, are confined to a single zone within a region and can only interact with other resources in that same zone. Regional resources, including managed instance groups (MIGs), span multiple zones within a single region, enabling broader distribution for improved resilience. Global resources, like custom images and snapshots, are accessible across all regions and zones, facilitating reuse without location-specific constraints.

Placement policies in Compute Engine allow users to control the physical distribution of virtual machines to optimize for reliability, performance, or latency. The compact placement policy groups instances closely together on the same underlying hardware or within the same zone, reducing inter-instance communication latency, which is particularly useful for tightly coupled workloads like high-performance computing (HPC) applications. In contrast, the spread placement policy distributes instances across distinct hardware to minimize the risk of correlated failures from hardware or zonal outages, enhancing overall resilience for mission-critical services. The default "any" policy imposes no specific constraints, allowing the system to place instances based on available capacity. These placement policies effectively implement affinity and anti-affinity principles for instance placement: compact policies enforce affinity by co-locating instances to promote low-latency interactions, while spread policies apply anti-affinity by separating them to avoid single points of failure, thereby supporting strategies for high availability without requiring custom scripting.

At a higher level, Compute Engine resources are managed within a hierarchical structure that aligns with Google Cloud's overall organization. Projects serve as the primary containers for resources, where all Compute Engine instances, disks, and networks are created and billed. Folders provide optional intermediate grouping for projects, enabling structured organization by department or environment, while the organization node at the top represents the root for an entire enterprise, enforcing policies and access controls across the hierarchy. This structure ensures isolated, scalable management of resources while inheriting permissions downward.

A recent enhancement to regional MIGs, introduced in public preview as of November 2025, allows automatic repair of failed virtual machines in an alternate zone within the same region when the primary zone is unavailable. This feature requires enabling update-on-repair and helps maintain instance group health during zonal disruptions, further bolstering availability without manual intervention.

Compute Resources

Machine Types

Google Compute Engine offers a variety of predefined machine type families tailored to different workload requirements, balancing vCPU, memory, and other resources for optimal performance and cost-efficiency. These families include general-purpose, compute-optimized, memory-optimized, accelerator-optimized, and storage-optimized types, each with specific series designed for common use cases such as web serving, batch processing, in-memory databases, machine learning inference, and high-I/O workloads. Machine types determine the vCPU-to-memory ratios, networking bandwidth, and other capabilities, allowing users to select configurations that align with their application's needs without custom modifications.

The general-purpose machine family, suitable for versatile workloads like web servers, containerized applications, and development environments, encompasses the N1, N2, and N4 series. The N1 series, an earlier generation, supports up to 96 vCPUs with up to 6.5 GB of memory per vCPU and networking bandwidth up to 32 Gbps, providing balanced performance for standard tasks. The N2 series, powered by Intel Cascade Lake processors (with Ice Lake for instances over 80 vCPUs), scales to 128 vCPUs at 8 GB per vCPU and up to 32 Gbps networking, offering improved price-performance for medium-scale applications. The N4 series extends this with up to 80 vCPUs at 8 GB per vCPU and 50 Gbps networking, while the N4D variant, based on AMD EPYC processors, reaches 96 vCPUs with the same memory ratio and became generally available in November 2025 for enhanced flexibility in general workloads.

Compute-optimized machine types, such as the C2 and C3 series, prioritize high-frequency CPUs for demanding tasks including high-performance computing (HPC), media transcoding, and game servers. The C2 series delivers up to 60 vCPUs with 4 GB of memory per vCPU and sustained all-core turbo frequencies up to 3.8 GHz, paired with up to 32 Gbps networking for compute-intensive operations. The C3 series advances this capability to 176 vCPUs at 8 GB per vCPU, supporting even larger-scale HPC and AI training workloads with networking bandwidth up to 100 Gbps.

Memory-optimized types like the M1 and M2 series are engineered for applications requiring substantial RAM, such as in-memory databases, caching layers, and SAP HANA deployments. The M1 series accommodates up to 160 vCPUs with up to 24 GB of memory per vCPU (totaling over 3.8 TB), and networking up to 32 Gbps to handle data-heavy queries efficiently. The M2 series focuses on ultra-high-memory configurations, supporting 208–416 vCPUs with as much as 12 TB of total memory (approximately 28 GB per vCPU in larger instances), ideal for in-memory analytics and real-time processing with the same networking bandwidth.

Accelerator-optimized machine types, including the A2, A3, and G2 series, integrate GPUs for graphics rendering, machine learning inference, and generative AI tasks. The A2 series pairs up to 96 vCPUs with 16 NVIDIA A100 GPUs and up to 100 Gbps networking, optimized for large-scale ML training. The A3 series scales to 224 vCPUs with 8 H100 GPUs and exceptional 3,200 Gbps networking, targeting advanced AI workloads. The G2 series, featuring NVIDIA L4 GPUs, supports up to 96 vCPUs with 8 GPUs per instance and 100 Gbps networking, particularly suited for graphics-intensive applications like remote visualization and video transcoding.

Storage-optimized machine types, represented by the Z3 series, cater to high-I/O workloads such as SQL/NoSQL databases, data analytics, and vector databases requiring rapid local storage access. These instances provide up to 176 vCPUs with 36 TiB of local SSD storage and networking bandwidth up to 100 Gbps, enabling low-latency data throughput for scale-out storage systems.
Machine family | Key series | vCPU range | Memory ratio (GB/vCPU) | Max networking bandwidth | Primary use cases
General-purpose | N1, N2, N4/N4D | Up to 128 | 6.5–8 | 32–50 Gbps | Web servers, containerized applications
Compute-optimized | C2, C3 | Up to 176 | 4–8 | Up to 100 Gbps | HPC, AI/ML batch jobs
Memory-optimized | M1, M2 | Up to 416 | 14–28 | Up to 32 Gbps | In-memory databases
Accelerator-optimized | A2, A3, G2 | Up to 224 | Varies (8–16 base) | 100–3,200 Gbps | ML training, graphics
Storage-optimized | Z3 | Up to 176 | Varies | Up to 100 Gbps | High-I/O databases, analytics

Custom Configurations

Google Compute Engine allows users to create custom machine types that enable precise specification of virtual CPUs (vCPUs) and memory to match specific workload requirements, offering greater flexibility than predefined machine types. For example, a user can configure an instance with exactly 10 vCPUs and 60 GB of memory using the format custom-10-61440 (where memory is specified in MB), which is particularly useful for applications needing non-standard resource ratios, such as memory-intensive databases or compute-light services. Memory allocations must be in multiples of 256 MB, and the total configuration must align with the supported machine series, such as N2 or E2.

Constraints on custom machine types ensure compatibility with underlying hardware. In standard configurations, memory per vCPU ranges from 0.9 GB to 6.5 GB, though this varies by series—for instance, the N1 series supports 0.922 GB to 6.656 GB per vCPU. Extended memory options, available for series like N4, N4A, N2, and N1, remove the per-vCPU upper limit, allowing up to 8 GB or more per vCPU (e.g., up to 624 GB total for N1), billed at a premium rate to support workloads like large-scale analytics. vCPUs can generally be specified in multiples of 1 starting from 1, except for certain series like E2, which require multiples of 2 up to 32 vCPUs.

Sole-tenant nodes provide dedicated physical hardware isolation for custom machine types, ensuring that VMs run exclusively on servers reserved for a single project to meet compliance or security needs. These nodes support custom configurations in compatible series like N2, where VMs must match the node's machine series but can vary in size within the node's total capacity (e.g., up to 80 vCPUs and 640 GB memory for an n2-standard-80 node).

Custom machine types integrate with accelerators for enhanced performance in AI and machine learning. Users can attach GPUs (e.g., A100 or T4) or Google TPUs to custom VMs in supported series like N1 or A2, enabling tailored setups such as a custom-32-225280 instance with four T4 GPUs for model training. As of November 2025, the Arm-based N4A series, powered by Google's Axion processor, supports custom machine types in preview, offering up to 64 vCPUs and 512 GB of DDR5 memory with extended memory options for cost-effective general-purpose workloads.
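
The naming rules above (memory in MB, multiples of 256 MB, the custom-VCPUS-MEMORY format, and the extended-memory suffix) can be captured in a small helper. This is a sketch under those stated rules; the series prefix convention shown (bare "custom-" for N1, "n2-custom-" for N2) follows the documented naming, but series-specific vCPU/memory bounds are not validated here.

```python
# Sketch: build the machine-type path for a custom configuration such as
# the custom-10-61440 example above (10 vCPUs, 60 GB of memory).
def custom_machine_type(zone: str, vcpus: int, memory_gb: float,
                        series: str = "n1", extended: bool = False) -> str:
    memory_mb = int(memory_gb * 1024)
    if memory_mb % 256 != 0:
        raise ValueError("memory must be a multiple of 256 MB")
    # N1 uses the bare "custom-" prefix; newer series prepend their name.
    prefix = "custom" if series == "n1" else f"{series}-custom"
    name = f"{prefix}-{vcpus}-{memory_mb}"
    if extended:
        name += "-ext"  # extended memory lifts the per-vCPU memory ceiling
    return f"zones/{zone}/machineTypes/{name}"

print(custom_machine_type("us-central1-a", 10, 60))
# zones/us-central1-a/machineTypes/custom-10-61440
```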

Storage Options

Persistent Disks

Persistent Disks are block storage devices in Google Compute Engine that provide durable, high-availability storage independent of virtual machine (VM) instances, allowing data to persist even if the instance is stopped or terminated. They function like physical hard drives but are managed by Google Cloud, offering features such as live attachment and detachment to running VMs without downtime.

Google Compute Engine offers several types of Persistent Disks to suit different workloads, balancing cost, performance, and latency requirements. Standard Persistent Disks (pd-standard) use hard disk drives (HDDs) and are optimized for large-scale sequential read/write operations, such as media serving or data analytics, with IOPS scaling at 0.75 read and 1.5 write IOPS per GiB of provisioned space, up to a maximum of 7,500 read and 15,000 write IOPS per instance on larger machines. Balanced Persistent Disks (pd-balanced) employ solid-state drives (SSDs) for a cost-effective mix of performance and price, delivering up to 6 IOPS (read and write) per GiB, with a baseline of 3,000 IOPS and maximums reaching 80,000 IOPS per instance, suitable for general-purpose applications like web servers. SSD Persistent Disks (pd-ssd) provide high-performance storage for demanding workloads such as databases, offering up to 30 IOPS (read and write) per GiB and peaking at 100,000 IOPS per instance, with throughput limits of 1,200 MiBps for reads and writes. For workloads requiring predictable latency, Extreme Persistent Disks (pd-extreme) allow provisioning of up to 120,000 IOPS and 4,000 MiBps throughput for reads, ensuring consistent performance without scaling solely on disk capacity. Across all types, performance scales with disk size and the number of vCPUs in the attached VM instance, but is capped by per-instance limits to prevent overload.

Persistent Disks can be sized from a minimum of 10 GB to a maximum of 64 TB per volume, with sizes adjustable in 1 GB increments; for greater capacity, multiple disks can be combined using software configurations within the VM. Up to 128 Persistent Disk volumes (including the boot disk) can be attached to a single VM instance, supporting a total attached capacity of up to 257 TiB, which enables scalable storage setups for complex applications.

All Persistent Disks are encrypted at rest by default using Google-managed encryption keys, ensuring data security without additional configuration; alternatively, users can opt for customer-supplied encryption keys (CSEK) to manage their own 256-bit AES keys, providing greater control over encryption for compliance needs, though Google does not store these keys and data becomes inaccessible if they are lost. Data in transit between the disk and VM is also encrypted. For backup, Persistent Disks support incremental snapshots that can be created and managed separately.

Local SSD and Hyperdisk

Google Compute Engine offers Local SSD as an ephemeral storage option that provides high-performance, low-latency block storage physically attached to the host machine running the instance. This storage uses NVMe or SCSI interfaces and is designed for temporary workloads where data persistence is not required, as all data on Local SSD disks is lost when the instance stops, is preempted, or the host encounters an error. Unlike persistent disks, which maintain data independently of the instance lifecycle, Local SSD emphasizes speed over durability and cannot be detached from the instance or used for snapshots.

Local SSD disks come in standard and Titanium SSD variants, with each standard disk offering 375 GiB of capacity, though Titanium SSD supports up to 6 TiB per disk on certain bare metal configurations. Instances can attach multiple disks, enabling up to 72 TiB of total Local SSD capacity, depending on the machine type and series (e.g., the Z3 series allows 12 disks of 6 TiB each). Performance scales with the number of disks and interface; for example, Titanium SSD on NVMe can deliver up to 9,000,000 read IOPS, 6,000,000 write IOPS, 36,000 MiB/s read throughput, and 30,000 MiB/s write throughput. Common use cases include caching, scratch space for high-I/O applications like databases (e.g., temporary tables in SQL Server), and transient data processing. Limitations include incompatibility with shared-core machine types, the inability to add disks after instance creation, and no support for customer-managed encryption keys or data preservation beyond preview features for live migrations.

Hyperdisk provides a family of durable, high-performance block storage volumes that can be customized for IOPS and throughput independently of capacity, making it suitable for demanding workloads while maintaining data persistence across instance restarts. Available in several types—Balanced, Balanced High Availability, Extreme, Throughput, and ML—Hyperdisk volumes attach directly to instances like physical disks and support features such as regional replication for high availability and, for the ML variant, sharing across multiple read-only instances, with attachment limits varying by volume size (up to 2,500 for volumes ≤256 GiB and lower for larger volumes). The Balanced type offers a general-purpose balance with up to 160,000 IOPS and 2,400 MiB/s throughput; Extreme prioritizes IOPS at up to 350,000 with 5,000 MiB/s throughput; and Throughput focuses on sequential bandwidth with up to 2,400 MiB/s at lower IOPS.

Hyperdisk ML, optimized for AI and machine learning workloads, delivers the highest performance in the family, with up to 1,200,000 MiB/s throughput and 19,200,000 IOPS, enabling faster model loading and reduced idle time for accelerators in inference and training scenarios. This type supports volumes from 4 GiB to 64 TiB and is particularly useful for immutable datasets like model weights, where multiple instances can access the same volume in read-only mode for large-scale HPC or analytics tasks such as those in Hadoop or Spark. It became generally available in 2024, enhancing support for AI-driven applications. Limitations for Hyperdisk include restrictions on using Extreme, ML, or Throughput types as boot disks, zonal-only availability for ML volumes, and the need to adjust performance settings in increments (e.g., throughput every 6 hours).
Hyperdisk type | Max IOPS | Max throughput (MiB/s) | Key focus
Balanced | 160,000 | 2,400 | General-purpose workloads
Extreme | 350,000 | 5,000 | High random I/O
Throughput | 9,600 | 2,400 | Sequential access
ML | 19,200,000 | 1,200,000 | AI/ML data loading

Networking and Connectivity

Virtual Private Cloud

Google Compute Engine (GCE) utilizes Virtual Private Cloud (VPC) networks as the foundational networking layer, providing isolated, scalable virtual environments for resources like virtual machine (VM) instances. A VPC network acts as a global, virtual version of a physical network, spanning multiple regions without the need for physical cabling, and enables users to define subnets within specific regions for logical segmentation of resources. These networks support auto-mode or custom-mode configurations, where auto mode automatically creates subnets in every region, while custom mode allows manual definition of IP ranges and subnet placements to suit workload requirements.

IP addressing in GCE VPCs includes internal IPv4 and IPv6 addresses assigned to instances, with support for both dual-stack and IPv6-only configurations to accommodate modern networking needs. External IPv4 addresses can be optionally attached to instances for public reachability, while alias IP ranges enable secondary IP assignments to VMs or load balancers without additional network interfaces. Subnets are associated with primary IP ranges (CIDR blocks) and can include secondary ranges for alias IPs, ensuring efficient address management across regional deployments. IPv6 support, introduced in 2022 and expanded thereafter, allows global addresses for improved scalability in IPv6-enabled workloads.

Firewall rules in VPC networks control ingress and egress traffic using distributed, stateful firewalls that apply to all instances within the network. Rules are defined by priority (lower numbers take precedence), direction, and action (allow or deny), with matching based on IP protocols, ports, source/destination IP ranges, and instance tags or service accounts for granular control. For example, a common rule might allow HTTP (port 80) from any source to instances tagged "web-server", while denying all other ingress to minimize exposure; a sketch of such a rule follows this section. These rules are enforced at the instance level but defined at the VPC level, with a default quota of 1000 rules per project, and hierarchical firewall policies available for enterprise-scale management.

Routes in VPC networks direct traffic, including a default route for internet-bound traffic and custom static routes for on-premises or peered network connectivity. The system-generated default route (0.0.0.0/0) handles outbound traffic to the internet via Google's edge routers, while custom routes can specify next-hop types such as VM instances, VPN tunnels, or interconnects with metrics to prioritize paths. Route propagation is automatic for connected networks, ensuring dynamic updates without manual intervention in most cases.

VPC Network Peering enables secure, low-latency connectivity between multiple VPC networks, either within the same project, across projects, or between different organizations, without requiring gateways or VPNs. Peering connections exchange routes automatically (unless disabled), allowing instances in peered networks to communicate using internal IP addresses as if they were in the same network, which is particularly useful for multi-project architectures or hybrid cloud setups. Limitations include no transitive peering (direct connections only) and non-overlapping IP ranges to prevent conflicts. Shared VPCs extend this capability by allowing centralized network administration across projects, where a host project owns the VPC and subnets are shared with attached service projects for deployment.
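
A minimal sketch of the example rule described above (allow HTTP from any source to instances tagged "web-server"), using the google-cloud-compute Python client; the project and rule names are placeholders.

```python
# Sketch: an ingress rule allowing TCP/80 to "web-server"-tagged instances.
from google.cloud import compute_v1

def allow_http(project: str, network: str = "global/networks/default"):
    rule = compute_v1.Firewall()
    rule.name = "allow-http-web-server"
    rule.network = network
    rule.direction = "INGRESS"
    rule.priority = 1000  # lower numbers take precedence
    rule.source_ranges = ["0.0.0.0/0"]
    rule.target_tags = ["web-server"]  # applies only to instances with this tag

    allowed = compute_v1.Allowed()
    allowed.I_p_protocol = "tcp"  # field name as exposed by the Python client
    allowed.ports = ["80"]
    rule.allowed = [allowed]

    client = compute_v1.FirewallsClient()
    client.insert(project=project, firewall_resource=rule).result()

allow_http("my-project")
```

Because VPC firewalls are stateful, return traffic for connections matched by this rule is allowed automatically; all other ingress remains blocked by the implied deny rule.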

Load Balancing and IP Management

Google Cloud Load Balancing provides scalable traffic distribution for Compute Engine instances, supporting various types tailored to different protocols and scopes. The platform offers the external Application Load Balancer for HTTP(S) traffic, which operates globally using proxy-based distribution to handle content-based routing and SSL offloading. The external passthrough Network Load Balancer supports TCP/SSL/UDP protocols with non-proxied forwarding for low-latency applications, while internal Application and passthrough Network Load Balancers manage intra-VPC traffic for private services. These load balancers integrate with managed instance groups (MIGs) as backend services, enabling automatic distribution of traffic across multiple VM instances for high availability and scalability.

IP address management in Compute Engine distinguishes between ephemeral and static external IPs to support reliable external connectivity. Ephemeral external IPs are automatically assigned from Google's pool upon VM creation and released when the instance stops or terminates, making them suitable for temporary workloads but unsuitable for services requiring persistent addressing. Static external IPs can be reserved in advance or promoted from an existing ephemeral IP, ensuring consistent public access for DNS records or external integrations, with options for regional or global scopes. Reservations allow pre-allocation without attachment to a specific instance, facilitating flexible assignment across projects or regions.

Global load balancing leverages anycast IP addressing to route traffic to the nearest healthy backend across worldwide regions, minimizing latency for multi-region deployments. A single anycast IP serves as the frontend, with Google's edge network directing packets based on proximity and backend health, supporting both premium and standard network tiers for optimized performance. This approach enables seamless failover and content delivery integration, such as with Cloud CDN, for applications spanning multiple zones or regions.

Autoscaling integrates with load balancing through backend services and health checks to dynamically adjust instance counts based on traffic demands. Health checks probe instance groups at configurable intervals to verify responsiveness, removing unhealthy backends from load distribution and triggering autoscaling policies. Autoscalers can base decisions on load-balancing capacity metrics, such as serving capacity or HTTP request rates, ensuring resources scale in tandem with incoming traffic while integrating with MIGs for rolling updates.

Recent enhancements to global load balancing include traffic isolation policies, introduced in May 2025, which route requests preferentially to the closest region for multi-region applications, reducing latency in preview mode. Additionally, failover capabilities for global external Application Load Balancers, reaching general availability in November 2024, provide regional backup backends for improved resilience in distributed setups. These updates build on prior optimizations like service load balancing policies from July 2024, enhancing multi-region traffic management.

Images and Snapshots

Operating System Images

Google Compute Engine provides a variety of preconfigured operating system (OS) images that users can select to boot virtual machine (VM) instances, ensuring compatibility with Google's infrastructure. These images include popular Linux distributions and Windows Server editions, all optimized for cloud workloads with built-in support for features like automatic security updates and networking. Most Linux images are maintained by Google or partners and are available at no additional licensing cost, while Windows Server and premium OS images incur on-demand licensing fees.

Among the supported Linux public images are Debian (versions 13, 12, and 11), Ubuntu (such as 24.04 and 22.04), and Rocky Linux (10 and 9), each with default disk sizes ranging from 10 GB to 20 GB and configurations that disable root password login for enhanced security. For Windows, public images encompass Server 2022, 2019, 2016, and the 2025 edition, which achieved general availability in late 2024 and supports extended updates until November 2034; these images enable automatic updates and integration with Google Cloud tools like the guest environment for metadata access. Users can list and select these images via the Google Cloud console or gcloud CLI commands, with regular patches applied for critical vulnerabilities.

In addition to public images, users can create custom images to tailor environments with specific software or configurations. Custom images are generated from existing boot disks, snapshots, or imported virtual disks stored in Cloud Storage, allowing for the pre-installation of applications before launching instances. This approach supports scenarios like migrating on-premises workloads or standardizing VM setups across deployments.

To manage version updates efficiently, Google Compute Engine uses image families, which are logical groupings of related images within projects like debian-cloud or ubuntu-os-cloud. For instance, the debian-11 family always references the latest non-deprecated Debian 11 image, enabling rolling updates without manual intervention; if issues arise, administrators can deprecate the current image to revert to a prior stable version. This mechanism ensures access to the most recent stable releases while avoiding end-of-life versions.

Deprecated images, such as those for older Debian and Ubuntu releases, enter an end-of-support phase in which Google ceases updates and eventually deletes them from public availability, prompting users to migrate to supported alternatives. During deprecation, image families automatically exclude these versions, and existing VMs can continue running, but without patches or compatibility guarantees; extended paid support may be available for select OSes like Windows via Microsoft's programs.

Images in Google Compute Engine operate on a global scope, allowing them to be shared seamlessly across projects and regions without duplication. Public images are inherently accessible project-wide, while custom images can be exported to Cloud Storage or shared via IAM policies for use in other projects, facilitating consistent deployments in multi-project environments.
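
The family-resolution mechanism described above can be exercised directly through the API. A minimal sketch with the google-cloud-compute Python client, using the real public debian-cloud project and debian-12 family:

```python
# Sketch: resolve an image family to its latest non-deprecated image.
from google.cloud import compute_v1

client = compute_v1.ImagesClient()
image = client.get_from_family(project="debian-cloud", family="debian-12")
print(image.name)       # e.g. a dated debian-12 release
print(image.self_link)  # full URL usable as a boot-disk source_image
```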

Snapshots and Data Backup

Google Compute Engine provides disk snapshots as a mechanism for backing up data from persistent disks and Hyperdisks. These snapshots capture the contents of a disk at a specific point in time and serve as incremental backups, storing only the data that has changed since the previous snapshot to optimize storage efficiency. All disk snapshots are encrypted at rest using Google-managed keys, and they are stored in Cloud Storage, with options for multi-regional or regional locations to ensure durability and availability.

Disk snapshots can be created manually through the Google Cloud console, gcloud CLI, or APIs, allowing users to initiate backups on demand. Automated creation is supported via snapshot schedules, which enable periodic backups at user-defined intervals, such as daily or weekly. Retention policies can be applied to manage snapshot lifecycle, with standard snapshots suitable for short- to medium-term retention and archive snapshots designed for long-term storage at lower costs; snapshots persist independently of the source disk and can be retained indefinitely until manually deleted.

To restore data from a disk snapshot, users create a new persistent disk or Hyperdisk from the snapshot, which must be at least as large as the original source disk. This new disk can then be attached to a running or new instance using the console, gcloud commands, or APIs, after which the disk is mounted to access the restored data. The restoration process supports both zonal and regional disks, enabling quick recovery without downtime for the original instance.

In addition to disk-level backups, Google Compute Engine offers machine images, which provide full backups of an entire instance. A machine image captures the instance's configuration, metadata, permissions, operating system, and data from all attached disks in a crash-consistent manner, using differential snapshots for subsequent images to store only changes from prior versions. These are particularly useful for cloning instances, disaster recovery, or replicating environments across projects.

Disk and machine image snapshots support global scope, allowing them to be created and restored in any region or zone within the project. Standard and archive snapshots are automatically replicated across multiple regions for high durability (up to 99.999999999% over a year), facilitating disaster recovery by enabling quick restoration in a secondary region during outages.
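
The snapshot-then-restore workflow described above, as a minimal sketch with the google-cloud-compute Python client; project, zone, and disk names are placeholders, and the restored disk inherits the snapshot's size unless a larger size is specified.

```python
# Sketch: snapshot a zonal persistent disk, then restore it to a new disk.
from google.cloud import compute_v1

def snapshot_and_restore(project: str, zone: str, disk: str):
    disks = compute_v1.DisksClient()

    # Create the snapshot (incremental relative to any prior snapshot).
    snap = compute_v1.Snapshot()
    snap.name = f"{disk}-snap"
    disks.create_snapshot(project=project, zone=zone, disk=disk,
                          snapshot_resource=snap).result()

    # Restore: build a new disk from the snapshot, at least as large as the source.
    restored = compute_v1.Disk()
    restored.name = f"{disk}-restored"
    restored.source_snapshot = f"projects/{project}/global/snapshots/{snap.name}"
    disks.insert(project=project, zone=zone, disk_resource=restored).result()

snapshot_and_restore("my-project", "us-central1-a", "data-disk")
```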

Features and Capabilities

Performance Optimizations

Google Compute Engine offers various CPU platforms to optimize virtual machine (VM) performance based on workload requirements, supporting Intel processors such as Granite Rapids and Emerald Rapids for general-purpose and compute-intensive tasks, AMD EPYC processors including Genoa and Turin for cost-effective scale-out applications, and Arm-based processors like Google Axion and NVIDIA Grace for energy-efficient AI and cloud-native workloads. These platforms enable users to select machine series tailored to specific architectures, with Intel providing advanced vector extensions such as AVX-512 for compute-heavy code, AMD offering strong multi-threaded performance for scale-out applications, and Arm delivering up to 50% better price-performance in certain inference scenarios.

Transparent maintenance in Compute Engine is achieved through automatic live migration, which seamlessly transfers running VMs to healthy hosts during infrastructure events like hardware repairs or software updates without downtime, reboots, or changes to instance configurations such as IP addresses or attached storage. This process involves a brief blackout period of under one second, during which the VM's memory state is copied to the target host, ensuring continuity for most workloads while excluding specialized setups like those with attached GPUs or large local SSDs.

Disk performance can be enhanced by tuning provisioned IOPS and throughput, particularly with Hyperdisk volumes that allow dynamic adjustments every four to six hours without detaching the disk, enabling workloads to scale from baseline levels up to 350,000 IOPS for Hyperdisk Extreme or 1,200,000 MiB/s throughput for Hyperdisk ML. For read-heavy applications, Hyperdisk supports asynchronous replication to create read replicas in a secondary region, providing low-latency access to duplicated data while maintaining primary write performance. Optimization techniques include aligning application I/O patterns with provisioned limits and using tools like fio for benchmarking to avoid contention across multiple disks.

Acceleration for machine learning workloads is facilitated by attaching GPUs or TPUs to VMs, with GPU-enabled machine types like the A4 series integrating up to eight NVIDIA B200 GPUs for training large models and the G2 series supporting up to eight L4 GPUs for efficient inference. TPUs, optimized for tensor operations, can be deployed as TPU VMs directly connected to Compute Engine instances for hybrid setups, accelerating frameworks like TensorFlow and JAX with up to 10x peak performance gains over previous generations in training and serving tasks. These attachments require compatible machine types and zones, ensuring seamless integration for data processing and graphics-intensive applications.

As of November 2025, the N4D machine series, powered by fifth-generation AMD EPYC (Turin) processors and featuring up to 768 GB of DDR5 memory, delivers up to 3.5x better price-performance for web-serving workloads, 50% higher performance for general computing, and 70% higher performance for Java-based workloads compared to the prior N2D series. This update supports custom machine types with up to 96 vCPUs and a 4.1 GHz max-boost frequency, providing substantial gains in memory-bound scenarios without altering existing migration policies.
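
A hedged sketch of these tuning knobs with gcloud, using hypothetical resource names and illustrative values:

# Provision a Hyperdisk Extreme volume with explicit IOPS
gcloud compute disks create fast-disk \
    --zone=us-central1-a \
    --type=hyperdisk-extreme \
    --size=1TB \
    --provisioned-iops=100000

# Retune provisioned performance in place (subject to the 4-6 hour adjustment window)
gcloud compute disks update fast-disk \
    --zone=us-central1-a \
    --provisioned-iops=150000

# Launch a G2 VM whose machine type bundles an NVIDIA L4 GPU; GPU VMs are
# excluded from live migration, so host maintenance must terminate them
gcloud compute instances create inference-vm \
    --zone=us-central1-a \
    --machine-type=g2-standard-8 \
    --maintenance-policy=TERMINATE \
    --image-family=debian-12 \
    --image-project=debian-cloud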

Management and Automation Tools

Google Compute Engine provides several tools for orchestrating, monitoring, and automating the lifecycle of instances, enabling efficient management at scale. Cloud Deployment Manager is an infrastructure-as-code service that automates the creation, updating, and deletion of Compute Engine resources through declarative configuration files written in YAML, with Jinja2 or Python templates. It supports deploying instance groups, networks, and disks by leveraging the underlying Compute Engine APIs, allowing users to version and reuse infrastructure definitions for consistent environments. Note that Cloud Deployment Manager will reach end of support on March 31, 2026, with recommendations to migrate to alternatives like Infrastructure Manager.

For broader orchestration, Google Compute Engine integrates seamlessly with Terraform, an open-source infrastructure-as-code tool from HashiCorp. Users can provision and manage Compute Engine resources, such as virtual machines and autoscalers, using Terraform's declarative language and the official Google provider, which translates configurations into API calls for creating resources like google_compute_instance. This integration supports complex setups, including state management and dependency handling, to ensure reproducible deployments across projects.

Monitoring and logging capabilities are essential for maintaining Compute Engine operations, with Cloud Monitoring collecting metrics such as CPU utilization, disk I/O, and network traffic from virtual machine instances via the Ops Agent. Users can create dashboards, set alerting policies based on thresholds, and visualize performance trends to proactively address issues. Complementing this, Cloud Logging aggregates and analyzes logs from Compute Engine VMs, including system events and application outputs, enabling real-time search, filtering, and export for compliance and troubleshooting. These tools integrate with other Google Cloud services to provide unified observability across hybrid environments.

Autoscaling in Compute Engine is handled through managed instance groups (MIGs), which automatically adjust the number of virtual machine instances based on predefined policies tied to metrics like CPU usage, memory consumption, or custom signals from Cloud Monitoring. For example, a policy can scale out by adding instances when average CPU exceeds 60% and scale in when it drops below 40%, ensuring resource elasticity without manual intervention. This feature supports both zonal and regional MIGs, with options for predictive scaling based on historical load patterns to minimize latency during traffic spikes.

Operating system management is streamlined via OS Config, a service within VM Manager that automates patching, compliance reporting, and configuration enforcement on Compute Engine instances. It applies OS updates using native mechanisms for supported images such as Debian, RHEL, and Windows, with scheduling options to patch during maintenance windows and assess compliance against baselines such as CIS benchmarks. The OS Config agent, installed on VMs, reports patch status and vulnerabilities, enabling fleet-wide remediation without downtime risks.

Automation is further enhanced through the gcloud CLI, part of the Google Cloud SDK, which provides scripting-friendly commands for managing Compute Engine resources programmatically. Commands like gcloud compute instances create and gcloud compute instance-groups managed list-instances support batch operations, filtering, and output formatting in JSON or YAML for integration into pipelines or shell scripts.
As of November 10, 2025, observability fields for reservations became generally available, allowing users to query via the API or gcloud CLI which reservations a VM consumes and to list the VMs attached to specific reservations, improving visibility into committed resource utilization.
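
For instance, autoscaling and scripted inventory queries of the kind described above might look like the following, with hypothetical group names:

# Enable autoscaling on a managed instance group keyed to average CPU utilization
gcloud compute instance-groups managed set-autoscaling web-mig \
    --zone=us-central1-a \
    --min-num-replicas=2 \
    --max-num-replicas=10 \
    --target-cpu-utilization=0.60 \
    --cool-down-period=90s

# Machine-readable output for pipelines: list running VMs as JSON
gcloud compute instances list \
    --filter="status=RUNNING" \
    --format="json(name,zone,machineType)"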

Billing and Pricing

Pricing Models

Google Compute Engine employs a pay-as-you-go model for virtual machine (VM) instances, where costs are calculated based on the resources consumed, including vCPUs, memory, and storage. Billing occurs per second after a one-minute minimum charge, allowing for flexible usage without long-term commitments. For on-demand instances, vCPUs and memory are priced separately; for example, in the us-central1 region, an N2 vCPU may cost approximately $0.03465 per hour, while memory is around $0.003938 per GiB-hour, with rates varying by machine family and region.

Spot VMs offer a cost-effective alternative for interruptible workloads, providing discounts typically up to 91% (previously 60-91%) compared to on-demand prices, with dynamic pricing that Google may adjust periodically based on supply and demand. Starting October 28, 2025, discounts may be less than 60% off on-demand prices. These instances can be preempted by Google with a 30-second notice, making them suitable for fault-tolerant applications like batch processing. For instance, a Spot VM equivalent to an on-demand N2 instance might cost as low as $0.003465 per vCPU-hour under optimal conditions.

Certain operating systems incur premium charges in addition to base VM costs, particularly for licensed images such as Windows Server or specialized Linux distributions. Windows licensing, for example, adds fees that scale with vCPU count, such as $0.006 per core per hour for instances with 9-127 vCPUs, billed on-demand through Google Cloud. Similarly, Red Hat Enterprise Linux (RHEL) adds approximately $0.06 per hour for instances with 1-4 vCPUs and $0.13 per hour for those with 5+ vCPUs (rates as of 2023; verify current pricing), while SUSE Linux Enterprise Server (SLES) charges $0.02-$0.11 per hour depending on the machine type. These fees ensure compliance with vendor licensing while integrating seamlessly with Compute Engine billing.

Network egress traffic, which includes data leaving the Google Cloud network, follows tiered pricing to encourage efficient data management. Inbound traffic is free, but outbound traffic to the internet is charged per GiB; for example, in North America, the first 1 GiB is free monthly, followed by $0.12 per GiB for the next 1 TiB, decreasing to $0.08 per GiB beyond 10 TiB. Inter-region transfers within the same continent cost $0.01 per GiB, while cross-continent egress starts at $0.02-$0.14 per GiB depending on the destination.

Persistent disk storage contributes to overall costs, with pricing based on provisioned capacity rather than usage. Standard persistent disks (HDD) cost $0.04 per GiB-month, while SSD-backed disks are $0.17 per GiB-month, prorated per second for the full provisioned amount.

To illustrate total cost calculation for a simple on-demand VM: consider an instance with 2 vCPUs at $0.03465 per hour each, 8 GiB of memory at $0.003938 per GiB-hour, and a 100 GiB SSD disk at $0.17 per GiB-month (approximately $0.000235 per GiB-hour), running for 730 hours (one month). The monthly total would be (2 × 0.03465 × 730) + (8 × 0.003938 × 730) + (100 × 0.17) ≈ $50.59 (vCPUs) + $23.00 (memory) + $17.00 (disk) = $90.59, excluding any premium OS or egress fees. This formula—total = (vCPU-hours × vCPU-rate) + (GiB-hours × memory-rate) + (provisioned GiB-months × disk-rate)—highlights the granular, resource-based billing structure.
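
As an illustrative sketch, a Spot VM of the kind priced above can be requested with the provisioning-model flag; the instance name and zone are hypothetical:

# Create a Spot VM; Google may reclaim it with 30 seconds' notice
gcloud compute instances create batch-worker \
    --zone=us-central1-a \
    --machine-type=n2-standard-2 \
    --provisioning-model=SPOT \
    --instance-termination-action=STOP \
    --image-family=debian-12 \
    --image-project=debian-cloud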

Discounts and Reservations

Google Compute Engine offers several mechanisms to reduce costs for long-term or predictable workloads, including sustained use discounts, committed use discounts, reservations, and promotional credits. These options allow users to optimize expenses without altering their workloads, by applying automatic reductions or guaranteeing resource availability.

Sustained use discounts (SUDs) are applied automatically to eligible resources that run for more than 25% of a billing month, providing tiered savings that increase with higher utilization, up to 30% off on-demand prices for full-month usage. SUDs apply to vCPU and memory usage on eligible machine types, but only if no other discounts like committed use are active on the same usage.

Committed use discounts (CUDs) enable deeper savings through contractual commitments to specific usage over 1- or 3-year periods, without upfront payments. Resource-based CUDs target predictable workloads on particular machine types, offering discounts of up to 57% compared to on-demand prices, depending on the commitment term and machine series. In contrast, flexible CUDs are spend-based commitments that apply broadly across Compute Engine, Google Kubernetes Engine, and Cloud Run, providing a flat 28% discount for 1-year terms and 46% for 3-year terms on eligible vCPU and memory usage. These can be combined with reservations to ensure capacity while maximizing savings, and they automatically cover the highest-discount eligible usage first.

Reservations allow users to pre-provision capacity in specific zones or regions, guaranteeing availability for critical workloads even during high demand. Users are charged at standard on-demand rates for reserved resources, but reservations can be paired with CUDs or SUDs for additional discounts, and they support features like future reservations for planning up to one year ahead. To aid management, Compute Engine provides reservation recommendations based on historical usage, helping identify opportunities to optimize idle capacity, alongside the reservation observability fields described above.

New customers receive $300 in free credits upon signup, applicable to Compute Engine and other Google Cloud services for the first 90 days, alongside an always-free tier that includes one e2-micro VM instance per month in select regions.
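
A minimal sketch of reserving capacity and consuming it from a VM, assuming hypothetical resource names:

# Reserve zonal capacity for ten n2-standard-4 VMs; only VMs that explicitly
# target this reservation may consume it
gcloud compute reservations create web-capacity \
    --zone=us-central1-a \
    --vm-count=10 \
    --machine-type=n2-standard-4 \
    --require-specific-reservation

# Launch a VM that consumes the specific reservation
gcloud compute instances create web-1 \
    --zone=us-central1-a \
    --machine-type=n2-standard-4 \
    --reservation-affinity=specific \
    --reservation=web-capacity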
