Looking to HPC to solve (some of) the data center’s problems
Today’s data centers are having to evolve quickly because of the loads placed on them by applications that are hungry for both speed and power. New technologies will soon either power the high-load apps on which we rely or provide the blueprint for a new generation of data center technologies.
The world of HPC (high-performance computing) seems a long way from the everyday tech that many of us use in our working lives. However, the technologies deployed at this cutting edge of computer science are beginning to solve some of the issues which are causing bottlenecks in the data centers of the world’s major cloud providers – Google, Amazon’s AWS platform, and a host of smaller, specialized cloud platforms.
Perhaps the best way to consider the problem suffered by cloud providers in commercial terms is to imagine a company which has developed a new platform that it wants to deploy in the cloud. Among the imaginary platform’s features are:
- Simulation of complex industrial processes across the full range of a business, from tracking physical inputs, through many thousands of devices, machines and processes, and outputting multiple products.
- Analytics of real-world data from monitor and control systems across the business, to find trends, bottlenecks, opportunities, and efficiencies.
- Deployment of artificial intelligence-type routines to dig deeply into very large amounts of data in legacy storage and new information as it appears.
To run this complex set of requirements, an organization may feel it will achieve the most efficient results (fastest compute, lowest cost, most secure environment, most reliable service) by splitting up the different elements of the platform across different processor and storage types.
For instance, AI (artificial intelligence) functions may be best deployed on arrays of GPUs (graphics processing units), while real-time signal processing from the factory floor may be pushed to FPGAs (field-programmable gate arrays). Similarly, data analytics may use HDDs, while highly variable computing loads may be supported by burst buffers and high-end flash, such as NVMe arrays.
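As a rough illustration of that kind of placement, the pairing of workloads with hardware can be sketched as a simple lookup table. This is only a sketch under stated assumptions: the workload labels, the `place` function, and the general-purpose CPU fallback are hypothetical names chosen for the example, not part of any vendor's platform.

```python
# Hypothetical workload labels mapped to the hardware suggested in the text.
PLACEMENT = {
    "ai_functions": "GPU array",
    "realtime_signal_processing": "FPGA",
    "data_analytics": "HDD storage",
    "variable_compute_load": "burst buffer + NVMe flash",
}


def place(workload: str, default: str = "general-purpose CPU") -> str:
    """Return the suggested hardware for a workload.

    The default fallback is an assumption made for this example only.
    """
    return PLACEMENT.get(workload, default)


if __name__ == "__main__":
    # Print the suggested placement for each known workload.
    for name in PLACEMENT:
        print(f"{name}: {place(name)}")
```

A real scheduler would weigh cost, security and reliability as well, but the table shows why a platform that spans several processor and storage types is attractive.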
Supercomputer manufacturers are developing new product ranges which, it is hoped, will run and scale all these different workloads, using multiple processor types, on a unified file system.
The goal of the new designs is a unified platform that combines the infrastructure required to run different workloads, now and in the future, on multiple processor architectures and disparate storage media.
The same manufacturers are also working on interconnection systems to remedy the data center’s major bottleneck: latency caused by “traditional” Ethernet. While new, fast interconnectivity is the goal, companies know that any new platform will first have to fit into existing data centers. Only later, as those centers are upgraded, will they adopt the faster interconnects that are currently more common in HPC environments.
Unified platforms – capable of taking over multiple workload requirements – mean that companies wishing to deploy complex applications will be able to pare down their supplier list: cloud platforms will be able to hit performance benchmarks and adapt to wildly changing requirements, using the same physical data centers.
But before we fixate on the increasing need for speed and efficiency in hosting our distributed applications, there are other pressing concerns for large cloud providers in the race for better performance, foremost among them the need for lower power consumption and more efficient cooling.
As cloud providers reach exascale, driven by demand for their services, they seek faster, more efficient computing hardware that draws less power and generates less waste heat.
With data centers now having a carbon footprint comparable with that of the airline industry, our hunger for ever-improving results needs to be satiated with care.
2 July 2022