Let’s evaluate the compute offerings of AWS, Azure, Google, and IBM SoftLayer. For a high-level view of the differences (in compute, network, storage, database, analytics, and other services) among these cloud providers, see the RightScale comparison tool.
1- Amazon Web Services
AWS was first to market with a cloud compute offering, and it gained a sizable head start. Today AWS Elastic Compute Cloud (EC2) has approximately 40 different instance sizes in its current generation (“instance” is the term AWS and Google use for what others call a “virtual machine,” “VM,” “virtual server,” or “VS”). The previous generations of instance types (including the aforementioned M1) are still available, although they are not “above the fold” on any AWS price sheets or product description literature. There are about 15 instance sizes in the previous generations. While they are currently fully supported, it would not be surprising if AWS looks to sunset these instance types at some point in the future.
Focusing on the current generation, some of these instance types come with attached “ephemeral” storage (storage that is deprovisioned when the instance is terminated), while many others come with no attached volumes and instead specify “EBS only” with regard to storage. This means you must separately provision, attach, and pay for the storage. (EBS is AWS’s Elastic Block Store offering, which will be discussed in a future article in this series.)
The current generation of instances is organized into instance families that are optimized for certain use cases. Some of the current instances address general-purpose workloads, while others are tailored for computationally intensive applications. Still others are optimized for workloads with high memory requirements or for applications that require high amounts of storage (up to 48TB). Some instances provide GPUs that can be used for video rendering, graphics computations, and streaming.
Additionally, some instance families support “burstable” performance. These provide a baseline CPU performance, but can burst to higher CPU rates for finite periods of time provided by “CPU credits” that the instance accumulates during periods of low CPU utilization. Evaluate your use case and workload carefully before deciding upon burstable instance types.
It is important to benchmark your application to ensure that on average it stays at or below the baseline. Not only that, you want to ensure that the CPU bursts are not so long they exhaust your credits, and the CPU valleys are sufficiently long to allow for credit replenishment. If you exhaust your CPU credits, your application may run in a “CPU starved” state that will obviously hinder performance. Burstable instances are a great tool for the right application, but they can prove very problematic when used incorrectly.
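To build intuition for that benchmarking exercise, here is a toy per-minute model of a credit bank. The baseline rate and starting balance are invented for illustration and do not match any actual AWS instance type:

```python
def simulate_burstable(minute_loads, baseline=0.10, credits=5.0):
    """Toy model of a burstable instance's CPU credit bank.

    One credit = one vCPU-minute at 100%. Each minute the instance
    earns `baseline` credits; the CPU it can actually deliver is capped
    by the credits available, so a drained bank throttles the instance
    back toward its baseline rate.
    """
    achieved = []
    for demand in minute_loads:      # demand: fraction of a vCPU wanted
        credits += baseline          # credits accrue at the baseline rate
        used = min(demand, credits)  # can't spend more than the bank holds
        credits -= used
        achieved.append(used)
    return achieved, credits

# A sustained 100% load drains the bank after a few minutes, after
# which the instance is throttled to roughly its 10% baseline.
achieved, remaining = simulate_burstable([1.0] * 20)
```

The first few minutes run at full speed on banked credits; the later minutes hover near the baseline, which is exactly the “CPU starved” state described above.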
AWS instance types can be optionally configured to meet specific use cases, performance targets, or compliance regulations. For example, certain instance types can be configured in an enhanced networking model that allows for increased packet rates, lowers latencies between instances, and decreases network jitter. Additionally, instances can be launched into high-performance computing (HPC) clusters or deployed on dedicated hardware that allows for single-tenant configurations, which may be required for certain security or compliance regulations.
There are also different pricing structures and deployment models that can be used within AWS EC2. The standard deployment model is “on-demand,” which means, as the name implies, you launch instances when you need them. On-demand instances run for a fixed hourly cost (fractional hours are rounded up to the next hour) until you explicitly terminate them. There are also “spot instances,” which allow you to bid for any excess compute capacity AWS may have at any given time. Spot instances can often be obtained for a fraction of the on-demand cost (savings in excess of 80 percent are not uncommon).
However, they come with the caveat that they may be terminated at any time if the current spot price exceeds your bid price. It is a real-time marketplace in which the highest bid (the price you are willing to pay per hour for the instance) “wins.” You can achieve tremendous cost savings with spot instances, but they are only suited for workloads that can be interrupted (processing items from an external queue, for example).
AWS offers “spot blocks,” which are similar to spot instances in that you specify the price you are willing to pay, but you also specify the number of instances you want at that price, and a duration in hours up to a maximum of six. If your bid is accepted, your desired number of instances will run for the time specified without interruption, but they will be terminated when the time period expires. This deployment model is useful for predictable, finite workloads such as batch processing tasks.
AWS offers discounts through reserved instances (RIs), which require you to commit to a specific instance type running a specific operating system in a specific availability zone (AZ) of your desired AWS region. You must commit to a one- or three-year term, and in return your hourly cost for the instance will be greatly reduced (up to 75 percent for a three-year commitment).
However, you are generally constrained to the instance type, operating system, and AZ that you selected for the duration of the contract, so careful planning is essential. You can request modifications within certain limitations, but those requests are subject to approval by AWS based on available capacity. Clearly, committing to one or three years of reserved instances isn’t for everyone. Other providers have similar discounting policies that are far simpler to implement and don’t require having years of visibility into your workload (Google’s Sustained Use Discounts, for example, which will be described shortly).
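A quick break-even calculation can help decide whether a reservation is worthwhile. The sketch below treats the reservation as a fixed cost for the whole term and uses hypothetical rates chosen only to mirror the roughly 75 percent discount mentioned above:

```python
def breakeven_utilization(on_demand_hourly: float, ri_effective_hourly: float) -> float:
    """Fraction of the term an instance must actually run before the
    reserved rate beats paying on demand for the same hours.

    Simplification: the reservation is modeled as a fixed cost for the
    full term; real RIs split cost between upfront and hourly portions.
    """
    return ri_effective_hourly / on_demand_hourly

# Hypothetical rates: $0.10/hr on demand vs. an effective $0.025/hr
# with a three-year reservation (a 75 percent discount). The instance
# must run more than a quarter of the term for the RI to pay off.
utilization = breakeven_utilization(0.10, 0.025)
```

If your workload runs well below that utilization threshold, on-demand (or spot) pricing is the cheaper choice despite the higher hourly rate.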
AWS has the most complete set of offerings in the compute arena, but it doesn’t have a lock on unique and interesting features. Other vendors are continually adding new compute options that make them attractive alternatives for many use cases.
2- Microsoft Azure
Microsoft takes a similar approach to compute instance types with Azure, but uses slightly different nomenclature. Instances are called virtual machines (VMs), although you will see the word “instance” sprinkled throughout the online documentation. VMs are grouped into seven different series with between five and a dozen different sizes in each group. Each series is optimized for a particular type of workload, including general-purpose use cases, computationally intensive applications, and workloads with high memory requirements. An eighth group (the “N” series), composed of GPU-enabled instances, was released for general availability this month.
All told, Azure has approximately 70 different VM sizes, covering a wide array of use cases and workload requirements. All VM types in Azure come with attached ephemeral storage, varying from about 7GB to about 7TB. (Azure measures attached storage in gibibytes, not gigabytes, so the numbers don’t come out as neat and clean as we are typically used to.)
As the maximum capacity of attached storage for an Azure VM is considerably less than for an AWS EC2 instance (the aforementioned 48TB, for example), you may want to provision additional storage. This can be allocated from an Azure Storage account associated with your Azure subscription (“subscription” is the Azure term for what is generally known as an “account”). Azure provides both a standard storage option (HDD) and a “premium” storage option (SSD). I’ll discuss these in more detail in a later post in this series.
Azure also provides a VM size (the A0) that is intentionally oversubscribed with regard to the underlying physical hardware. This means the CPU performance on this VM type can be affected by “noisy neighbors” running VMs on the same physical node. Azure specifies an expected baseline performance, but acknowledges that performance may vary as much as 15 percent from that baseline. The A0 is a very inexpensive VM, and if a particular workload can tolerate the variability, it may be an attractive option.
Azure charges for VMs on a per-minute basis rather than at an hourly rate as AWS does. Thus, a VM that runs for 61 minutes on Azure is charged for 61 minutes, whereas AWS would charge you for a full two hours. Azure has an offering similar to AWS’s reserved instances. Called the “Azure Compute prepurchase plan,” it allows you to reap significant discounts (as much as 63 percent) by making an upfront prepurchase, with a one-year commitment, on a particular VM family, size, region, and operating system.
However, the prepurchase plan is available only to customers holding an active Enterprise Agreement (EA) with Microsoft. Because an EA can greatly influence your pricing model, VM pricing on Azure is kind of like the pricing of airline seats on any particular flight: No two people pay the same price, though they are all sitting in the same type of seat and going to the same place. If you have an EA with Microsoft, be sure to speak to your sales representative about your Azure usage.
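The per-minute versus per-hour billing difference described above is easy to quantify. A minimal sketch, using a hypothetical $0.10-per-hour rate (the rate and function names are illustrative, not either provider’s actual pricing):

```python
import math

HOURLY_RATE = 0.10  # hypothetical $/hour for a comparable instance


def aws_style_charge(minutes: float) -> float:
    # Fractional hours are rounded up to the next full hour.
    return math.ceil(minutes / 60) * HOURLY_RATE


def azure_style_charge(minutes: float) -> float:
    # Billed per minute at the equivalent per-minute rate.
    return minutes * (HOURLY_RATE / 60)


# The 61-minute example from above: two full hours vs. 61 minutes.
aws_cost = aws_style_charge(61)      # 0.20
azure_cost = azure_style_charge(61)  # ~0.1017
```

For short-lived or bursty fleets of VMs, that rounding difference compounds quickly; for long-running, always-on workloads it largely washes out.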
Microsoft has made great strides in IaaS over the last few years. Azure has started to close the overall gap with AWS, particularly the gaps in its compute offering. As many enterprises are already engaged with Microsoft on some level (or multiple levels), it would not be surprising to see this trend continue.
3- Google Cloud Platform
Google Compute Engine (GCE), the service within Google Cloud Platform that manages IaaS compute resources, also provides numerous options for launching virtual machines. Like AWS, GCE calls the VMs “instances” and the different options “machine types.” These are grouped into several categories (standard, high CPU, and high memory), with multiple sizes within each category.
Currently you’ll find approximately 20 different predefined machine types in GCE, with available memory ranging from 600MB to 208GB. None of these predefined machine types provides ephemeral storage, which is a change from the early days of GCE when ephemeral storage was an option. Ephemeral storage was a casualty of GCE’s live migration (or “transparent maintenance”) service, which enables a VM to be migrated from one physical node to another without any interaction by (or even the knowledge of) the customer. This feature is unique to GCE and a powerful differentiator from AWS.
Another unique feature of GCE is the ability to create custom machine types. That is, you can specify the configuration of virtual CPUs and available memory if none of the predefined machine types fits your needs. There are limitations to what can be configured, and prices for custom machine types are higher than for predefined instances, but for certain use cases and workloads, custom machines may be an attractive option.
GCE also provides a few “shared core” machine types, which are similar in concept to the oversubscribed VM sizes in Azure. These machine types provide “opportunistic” bursting, which allows the instance to consume additional CPU cycles when they are not being consumed by other workloads on the same physical CPU. GCE does not use a “CPU credit” system such as AWS uses to balance peaks and valleys of utilization. Instead, bursts occur whenever application demand and spare CPU cycles happen to align.
Similar to Azure (but unlike AWS), GCE employs a per-minute pricing model, rounded up to the next minute, with a floor of 10 minutes. Thus, an instance that runs for three minutes is charged for 10 minutes of execution, while an instance that was operational for 12.5 minutes would be charged for 13.
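That rounding rule is simple to express in code. This is a sketch of the billing arithmetic just described, not an official formula:

```python
import math

def gce_billable_minutes(runtime_minutes: float) -> int:
    # Per-minute billing, rounded up to the next whole minute,
    # with a 10-minute minimum charge.
    return max(10, math.ceil(runtime_minutes))

assert gce_billable_minutes(3) == 10     # the three-minute example above
assert gce_billable_minutes(12.5) == 13  # rounded up to the next minute
```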
GCE also provides a mechanism to access unused capacity at a reduced rate, somewhat similar to AWS spot instances. Google calls these preemptible VM instances and provides them at an 80 percent discount as compared to the fixed, on-demand hourly rate. Like AWS spot instances, a GCE preemptible instance may be terminated (“preempted”) at any time. However, whereas spot instances could (in theory) operate indefinitely, preemptible instances will always be terminated after 24 hours. Preemptible instances are not covered under GCE’s SLA, and they cannot be live migrated, so the use cases may be limited. But if you have an appropriate use for them, they can deliver great cost savings.
A compelling feature of GCE’s pricing model is the aforementioned Sustained Use Discount (SUD). In the SUD model, any instance in use for more than 25 percent of the monthly billing cycle is automatically discounted for every minute beyond that initial 25 percent, with no interaction required by the customer. The discount is 20 percent for usage between 25 and 50 percent of the full month, 40 percent for usage between 50 and 75 percent of the month, and 60 percent for usage above 75 percent of the month. In addition, this discount is not limited to a specific instance but can apply to multiple instances of the same machine type.
In other words, if you have two instances that run for a quarter of the month, you get a discount of 40 percent, not 20 percent. Further, it’s a simple and elegant pricing model that doesn’t require the user to make any upfront commitments or predictions about future utilization. You simply launch your instances, and if they are operational for more than 25 percent of the billing cycle, a discount is automatically applied.
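The tiered arithmetic described above can be sketched as follows. This is a simplified model of the published tiers; it ignores the details of how GCE combines partial-month instances of the same machine type:

```python
def sud_monthly_cost(usage_fraction: float, full_month_price: float) -> float:
    """Cost under the tiered sustained use discounts described above:
    no discount on the first quarter of the month, then 20, 40, and
    60 percent off each successive quarter of usage."""
    tiers = [0.0, 0.20, 0.40, 0.60]  # discount for each quarter-month tier
    cost = 0.0
    remaining = usage_fraction
    for discount in tiers:
        portion = min(remaining, 0.25)
        cost += portion * (1 - discount) * full_month_price
        remaining -= portion
    return cost

# Running the full month nets a 30 percent blended discount
# (25 + 20 + 15 + 10 on a $100 list price).
full_month = sud_monthly_cost(1.0, 100.0)  # ≈ 70.0
half_month = sud_monthly_cost(0.5, 100.0)  # ≈ 45.0
```

Note how the discount is purely a function of usage: there is no upfront commitment to model, which is the simplicity the paragraph above highlights.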
Although Google was the second major player to enter the IaaS game behind AWS, it has seen slower adoption among enterprise users than Azure, most likely due to Microsoft’s long-established stronghold on enterprises. However, as Google continues to expand and innovate the GCE product suite and introduce unique offerings, its footprint in these organizations continues to expand.
4- IBM SoftLayer
SoftLayer takes a slightly different approach to compute instances (“virtual servers” or “VSes” in SoftLayer parlance) in that there are no predefined sizes. Similar to GCE’s custom machine types, you can configure a virtual server to your own specs, drawing on core/RAM configurations from one core with 1GB of RAM to 56 cores with 242GB. Not every combination is available (you can’t select one core with 242GB of RAM, for example), but there is a vast array of options. As such, SoftLayer does not have “families” or “series” of VSes, but with the ability to customize the CPU count and RAM capacity, you can effectively build your own high-CPU or high-memory virtual servers.
SoftLayer does not offer burstable virtual servers or a spot market for unused capacity, but it does have something akin to AWS’s reserved instances. Make a monthly commitment to a specific VS, and you get a lower effective monthly rate. In this model you are paying for an entire month of VS usage, so it only makes sense if you have an application that will require an always-on configuration for the month. The savings you get in return are typically in the eight to 10 percent range; unless you are sure of your usage needs, the incentive to forgo the flexibility that hourly billing affords is not substantial.
A unique service that SoftLayer provides is the ability to provision bare-metal servers. This is done via the SoftLayer portal (or the SoftLayer API), so the ordering experience is the same as for VSes. The difference is that bare-metal servers can take up to four hours to be provisioned, whereas VSes typically boot within 10 minutes. Nevertheless, considering the additional complexity involved behind the scenes, this seems a reasonable turnaround time.
As with VSes, bare-metal servers are available in a variety of configurations, and you can choose between an hourly usage model or a monthly commitment. Monthly commitments open the door to a far wider array of server configurations than is available under the hourly usage model. Bare-metal servers offer the advantage of single-tenant configurations (similar to AWS’s dedicated instances and hosts), which may be required for security or compliance reasons, or to accommodate more restrictive software licensing (IP-locked, MAC-locked, and so on).
SoftLayer does not have as strong a foothold in the IaaS market as the other vendors discussed here, but it has some differentiating offerings (many in the network and appliance arenas, which are outside the scope of this article) that make it attractive for specific use cases and workloads. Because SoftLayer is an IBM company, it enjoys established relationships with many large enterprise customers.