With traditional colocation, a data center rents out space to many different customers, and each customer brings in their own server hardware. One trend among data centers is for the data center to purchase and manage all of the hardware itself. This creates various economies of scale and benefits such as:
- Standardization of hardware within a data center allows for elasticity and rapid provisioning of a server. This opens up a new market for customers who want to rent servers. Customers can have an operational server within minutes or hours rather than weeks.
- Software can automate the process of setting up servers.
- Centralizing the purchasing and management of hardware reduces the inventory of spare parts that must be kept on site. It also allows for repairs to have a much faster turnaround time because clients do not need to physically go to the data center to repair their servers. These are very minor efficiencies.
- If clients do not need to actually visit their servers, certain efficiencies are possible. Vertical integration between data center design and server design allows for various capex and opex savings from power efficiency and cheaper servers.
- Some infrastructure providers offer managed IT services. They help customers secure their servers, maintain the servers' software, and so on.
From the customers’ perspective, the main benefits of centralized management are:
- The rental model. For startups, this is highly attractive. They no longer have to accurately forecast their infrastructure needs. Renting infrastructure is also a form of cheap financing for cash-strapped startups. As well, startups can quickly test their software on different hardware configurations without having to buy servers.
- Elasticity. There are workloads where money can be saved by renting servers for only part of the day, part of the year, or in response to sudden spikes in demand.
- Value-added software can save time setting up and administering servers.
- Sometimes capex and opex efficiencies are passed on to the customer, though in practice they often aren't.
- Managed IT services can be attractive because customers can get the expertise of a full-time specialist on a part-time basis.
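The rental-versus-buy tradeoff above can be illustrated with a quick break-even calculation. The prices below are made-up illustrations, not real quotes from any provider:

```python
# Hypothetical break-even sketch: renting vs. buying a server.
# All dollar figures are illustrative assumptions.

def breakeven_months(purchase_price, owned_monthly_opex, rental_monthly):
    """Months of use after which owning becomes cheaper than renting."""
    monthly_savings = rental_monthly - owned_monthly_opex
    if monthly_savings <= 0:
        return float("inf")  # renting never costs more; buying never pays off
    return purchase_price / monthly_savings

# Assumed numbers: a $6,000 server with $100/month in colocation and power,
# versus $350/month to rent a comparable machine.
months = breakeven_months(6000, 100, 350)
print(round(months, 1))  # 24.0 months
```

Under these assumptions, owning only wins if the workload runs steadily for about two years, which is exactly the forecast a startup struggles to make.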
The major downside of centralized management is that there is no hardware flexibility. Some customers need customized hardware because the "one-size-fits-all" solution doesn't cut it for them. They may have unique needs when it comes to:
- A very specific type of performance, e.g. storage throughput, IOPS, network latency, and so on. The rising popularity of SSD drives is reducing the need for specialized storage solutions that deliver high throughput and/or IOPS.
- Highly-specialized types of hardware such as high-performance networking (for supercomputers), FPGAs, power-efficient microservers, and GPUs. Over time, it could be the case that more infrastructure providers will offer GPU options. Amazon currently offers servers with GPUs. However, I do not predict that infrastructure providers will offer almost every type of hardware.
- Expensive enterprise-grade hardware with very high reliability. Many legacy systems were designed with the assumption that the hardware is extremely reliable. The current trend is to add software on top of the hardware to solve reliability problems through redundant hardware. However, the software itself is a point of failure and can cause reliability issues. In practice, Amazon Web Services and its competitors have had numerous major outages (e.g. Amazon had issues with its Elastic Block Store).
- Very high availability.
While centralized management has been around for a long time, it is becoming more competitive with the traditional model of customers buying their own hardware and placing it in a colocation facility. Over time I expect the market share to slowly trend in favour of centralized management.
Data center integration and economies of scale
If clients do not need to visit the data center, certain efficiencies are possible.
- Google figured out that servers don’t need cases. They also realized that cases impede airflow, making cooling less efficient.
- Because clients do not bring their own hardware, the infrastructure provider can purchase custom-built hardware. The infrastructure provider can get rid of graphics chips, IPMI (to remotely control the computer), peripherals, CD/DVD-ROM drives, etc. In other words, they can find (minor) efficiencies with highly-specialized servers unsuitable for broader markets. According to The Register, Rackspace saves between 18 and 22 percent by buying custom servers rather than buying servers from HP and Dell.
- Google data centers do not have an Uninterruptible Power Supply (UPS) for their servers. Instead, Google's servers come with batteries built into them for temporary backup power. These batteries save power because they avoid the direct current → alternating current → direct current conversion of a traditional UPS.
- Real estate. Data centers are often located in cities (sometimes in the downtown core) so that they are a reasonable driving distance for the clients. If the clients don’t need to visit the servers, there is a lot more flexibility in deciding where to put the data center. Google places many of its data centers in locations with cheap electricity. Google also considers factors such as access to cold water (for cooling) and the climate at the location (which affects the efficiency of their custom-designed cooling systems).
- The data center and the servers inside them can be designed so that the server rooms run at higher temperatures. The data center can be designed so that the "hot" aisle can be very hot (e.g. 120°F or 49°C) and uncomfortable for human beings. This means servers must be designed so that human beings do not have to stand behind them to maintain them. Conventional servers do not have such a design.
- Economies of scale allow repair staff to become highly experienced at their jobs. Inventory needed for spare parts is much lower.
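The UPS point above comes down to compounding conversion losses. A rough sketch, using per-stage efficiency figures that are purely illustrative assumptions rather than measured values:

```python
# Sketch of the double-conversion loss a traditional online UPS incurs.
# Per-stage efficiency figures below are illustrative assumptions.

def chain_efficiency(*stage_efficiencies):
    """Overall efficiency of power-conversion stages wired in series."""
    eff = 1.0
    for e in stage_efficiencies:
        eff *= e
    return eff

# Assumed DC -> AC -> DC path through a UPS: rectifier then inverter stages.
ups_path = chain_efficiency(0.94, 0.92)
print(round(ups_path, 4))  # 0.8648

# An on-board battery sitting near the server's DC rail skips most of this,
# so almost all of the double-conversion loss is avoided.
```

Even small per-stage losses multiply, so cutting a conversion stage out of the path saves power on every watt the server draws.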
The biggest data center companies will see various economies of scale from all of these little efficiencies. I believe that Google is currently the leader in this area.
- PUE (power usage effectiveness) is the ratio of a data center's total power draw to the power delivered to its IT equipment. Lower is better, and a PUE of 1.0 is the theoretical minimum. Google's PUE is incredibly low at around 1.12. Amazon may have been at around 1.45 a few years ago, while Microsoft is aiming for below 1.25 (source). Australian government guidelines set a goal of less than 1.9. Many conventional data centers are well above 2.0.
- Google is designing its own SSD drives and networking switches. Presumably Google is trying to optimize for power consumption, cost, and performance. Its proprietary switches are presumably related to its work on Software-Defined Networking (SDN), an approach that separates the network's control plane from its data plane so the network can be programmed centrally.
- Many other data center companies do not seem to be buying custom-made servers with on-board batteries. However, Facebook and Rackspace are following Google down the path of unconventional server designs with their sponsorship of the Open Compute Project.
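The PUE figures above are just a ratio, which a short sketch makes concrete. The wattage numbers here are made-up illustrations:

```python
# PUE sketch: total facility power divided by IT equipment power.
# The kW figures below are made-up illustrations, not measured data.

def pue(total_facility_kw, it_equipment_kw):
    """Power usage effectiveness: 1.0 is the theoretical minimum."""
    return total_facility_kw / it_equipment_kw

# A facility drawing 1,120 kW in total to deliver 1,000 kW to its servers
# (roughly the ratio behind Google's reported ~1.12):
print(round(pue(1120.0, 1000.0), 2))  # 1.12

# A conventional facility spending as much on cooling and overhead
# as on the servers themselves:
print(round(pue(2000.0, 1000.0), 2))  # 2.0
```

Every point of PUE above 1.0 is power spent on cooling, conversion, and overhead rather than computation, which is why the gap between 1.12 and 2.0 translates directly into operating cost.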
The bottom line
The bottom line is that centralized management allows data centers to find efficiencies in their operations. Future posts in this series will look at ways in which centralized management is opening up new markets and fundamentally different ways of operating a data center.
*Disclosure: No position.