From the December 2013 Issue of Cabling Installation & Maintenance Magazine
In high-performance computing environments, competing interconnect technologies battle for market share, measured both by system deployments and by performance.
by Patrick McLaughlin
The world of high-performance computing (HPC) or supercomputing (SC) generally is considered an industry of its own. It can be viewed as "intersecting" with data centers because the facilities in which supercomputing takes place really are data centers. But the primary difference between what happens in an enterprise data center or colocation facility and what happens in a supercomputing environment is the computing itself. The website insidehpc.com provides a technical definition, involving the aggregation of computing power, and then explains in more layman's terms: "It turns out that defining 'HPC' is kind of like defining the word 'car'--you probably know what a car is, but I bet you'd be hard pressed to write a concise, simple definition of one that means anything. Also note that HPC is actually used in two ways: it can either mean 'high performance computing' or 'high performance computer.'"
One characteristic of HPC or SC environments of particular interest to network professionals is the communication technology through which the elements of a supercomputer connect and communicate. Different speeds of Ethernet and InfiniBand are favorites in this realm. That fact was on display, literally, when the 25th annual Supercomputing Conference (SC) took place in November. SC13 was held in Denver, CO, the week of November 17. Several of the announcements made during the event indicate the positions of different connectivity technologies within the SC space.
In the Ether
The Ethernet Alliance (www.ethernetalliance.org) had a presence at SC13, where it explained its demonstration "integrated a range of cutting-edge technologies from across the Ethernet ecosystem, including HPC, networking and storage." Mario Chow, senior marketing engineer with Dell, is the Ethernet Alliance's SC13 technical lead. He noted, "With the ever-increasing interest in emerging technologies, convergence, and enhancements to existing infrastructure in both HPC and data center environments, the Ethernet Alliance is excited to bring a live demonstration of a real-world data center fabric to SC13.
"Our SC13 demo integrates the best Ethernet technologies and advancements in a mixed-vendor environment. We have an operational Ethernet fabric that includes 10-GbE and 40-GbE interfabric links, LAN/SAN convergence with data center bridging [DCB], and RDMA over Converged Ethernet [RoCE], 10/40-GbE line-rate access performance, and full Layer 1 interoperability. This demo illustrates Ethernet's growing command of HPC and other relevant high-performance fabric requirements, and sets the pace for its continued expansion within the supercomputing and advanced research community."
The demonstration included 10-GbE-attached servers in the access layer linked to multiple 10/40-GbE access-layer switches, which were aggregated by two 10/40-GbE core switches.
Scott Kipp, the Ethernet Alliance's president and a senior technologist with Brocade, added, "Ethernet is steadily making significant inroads throughout the global HPC community. As a market-driven technology, Ethernet is continually evolving and adapting to meet changing demands not only in HPC, but across a diverse range of industries and applications. The prevalence of Gigabit and 10-GbE systems on the most recent Top500 list demonstrates Ethernet is proving itself a reliable and cost-effective means for meeting today's demanding supercomputing performance needs. And with 40-GbE, 100-GbE and the era of 400G drawing closer, we expect to see this shift toward Ethernet-based supercomputing accelerate."
Supercomputing and InfiniBand
Kipp's mention of the Top500 refers to a list of the world's most powerful supercomputers, which has been compiled semiannually for more than 20 years. The list is available at the website top500.org. When the most recent list was released, the accompanying announcement explained, "The first version of what became today's Top500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be on to something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event. The Top500 list is compiled by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee, Knoxville."
Topping the November 2013 list, as it did previously, is Tianhe-2, a supercomputer developed by China's National University of Defense Technology. Supercomputers are ranked according to the Linpack benchmark, which was developed by Dongarra. Performance is expressed in petaflops (Pflop/sec)--quadrillions of floating-point calculations per second. Tianhe-2's performance is 33.86 Pflop/sec. When announcing and detailing the list, its creators explained, "The total combined performance of all 500 systems on the list is 250 Pflop/sec. Half of the total performance is achieved by the top 17 systems on the list, with the other half of total performance spread among the remaining 483 systems."
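For context, the Linpack benchmark times the solution of a dense system of linear equations and converts the elapsed time into a floating-point rate. The short Python sketch below illustrates that calculation in miniature; it is only a toy illustration of the principle, not the HPL code used for the official rankings, and the matrix size chosen here is arbitrary.

```python
# Toy illustration of the Linpack idea: time a dense solve of Ax = b and report
# the achieved floating-point rate. This is NOT the official HPL benchmark; the
# matrix size n is arbitrary and far smaller than what Top500 runs use.
import time
import numpy as np

n = 4096
A = np.random.rand(n, n)
b = np.random.rand(n)

start = time.perf_counter()
x = np.linalg.solve(A, b)                   # LU factorization plus triangular solves
elapsed = time.perf_counter() - start

flops = (2.0 / 3.0) * n**3 + 2.0 * n**2     # standard operation count for a dense solve
gflops = flops / elapsed / 1e9
print(f"Achieved {gflops:.1f} Gflop/sec; 1 Pflop/sec = 1,000,000 Gflop/sec")
```

At 33.86 Pflop/sec, Tianhe-2 sustains nearly 34 quadrillion such floating-point operations every second on that benchmark.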
Other noteworthy statistics from the most recent Top500 list follow.
- A total of 31 systems have performance greater than 1 Pflop/sec; the previous list, compiled in June 2013, included 26 such systems.
- The number of systems on the list installed in China is 63, compared to 65 in June 2013. China is the number-two user of HPC, behind the United States (which has 265 of the 500 systems) and ahead of Japan (28 systems), the United Kingdom (23 systems), France (22 systems) and Germany (20 systems).
- The system at number 500 on the most-recent list occupied spot 363 on the June 2013 list.
The top500.org site also includes statistics on the interconnect technologies used in these 500 HPCs. For the November 2013 list, Gigabit Ethernet holds a 27-percent share, as it is deployed in 135 of the 500 systems. Next is InfiniBand FDR (14 Gbits/sec per lane) with a 16-percent share (80 systems). Close behind InfiniBand FDR is 10G Ethernet, at 15.4 percent (77 systems)--itself followed closely by InfiniBand QDR (10 Gbits/sec per lane) with 15.2 percent (76 systems). Custom interconnects make up 7.4 percent (37 systems), followed by InfiniBand (2.5 Gbits/sec per lane) at 7.2 percent (36 systems). Other interconnect systems, including Cray Gemini, Aries and other forms of InfiniBand, combine for the remaining 11.8-percent share.
When measured by performance rather than number of systems deployed, the rankings change fairly dramatically. Custom interconnects, which make up only 7.4 percent of the systems deployed, account for 24.8 percent of the computing performance, with nearly 62 Pflop/sec. Next is InfiniBand FDR, at more than 38.8 Pflop/sec. Occupying the third spot is Tianhe Express-2, the interconnect designed for and specific to the number-one supercomputer on the list; its use in that one system accounts for more than 33.8 Pflop/sec of computing.
Gigabit Ethernet--first in system deployments with 135--ranks sixth in performance share, with more than 23.6 Pflop/sec. The seventh position in the performance ranking is occupied by 10G Ethernet, with slightly more than 14 Pflop/sec. By comparison, InfiniBand QDR ranks fifth in performance, with more than 24.7 Pflop/sec.
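Those percentages follow directly from the raw counts and the 250-Pflop/sec combined total quoted earlier. For readers who want to sanity-check the shares, the brief Python sketch below recomputes them from the figures cited in this article; only the interconnect families mentioned above are included.

```python
# Recompute the interconnect shares quoted above from the November 2013 figures
# cited in this article: 500 systems in total and 250 Pflop/sec combined performance.
TOTAL_SYSTEMS = 500
TOTAL_PFLOPS = 250.0

systems_by_interconnect = {          # number of Top500 systems per family
    "Gigabit Ethernet": 135,
    "InfiniBand FDR": 80,
    "10G Ethernet": 77,
    "InfiniBand QDR": 76,
    "Custom": 37,
    "InfiniBand": 36,
}
pflops_by_interconnect = {           # aggregate Linpack performance (Pflop/sec)
    "Custom": 62.0,
    "InfiniBand FDR": 38.8,
    "Tianhe Express-2": 33.8,
    "InfiniBand QDR": 24.7,
    "Gigabit Ethernet": 23.6,
    "10G Ethernet": 14.0,
}

for name, count in systems_by_interconnect.items():
    print(f"{name:18} {100 * count / TOTAL_SYSTEMS:5.1f}% of systems deployed")
for name, pflops in pflops_by_interconnect.items():
    print(f"{name:18} {100 * pflops / TOTAL_PFLOPS:5.1f}% of total performance")
```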
Cool Technology
The SC13 exhibition also was the platform upon which an innovative cooling technology was introduced. 3M and Allied Control jointly unveiled a passive two-phase liquid immersion cooling system. The Allied Control system employs 3M's Novec Engineered Fluids. A release from the two organizations said the technology "enables much tighter component packaging for greater computing power in less space."
The system already has been deployed at a Hong Kong data center, about which 3M and Allied Control released some details. They said, "Using 3M Novec Engineered Fluids as the primary coolant, Allied Control completed the new data center project on behalf of their client from design to deployment in just six months. The first-of-its-kind data center is now Asia and Hong Kong's most energy-efficient. The success of this 500-kW facility has much broader implications for widespread use in the data center market. The advances hold significant potential for organizations building supercomputers or high-performance computing platforms, central processing unit or graphics processing unit manufacturers, computing clouds, original equipment manufacturers and green facilities."
They further explained that using this immersion cooling system, the data center "is capable of achieving a Power Usage Effectiveness [PUE] of 1.02 in the hot and humid climate of Hong Kong, without taking any free cooling into account. … The data center, despite being housed in Hong Kong's sticky climate, saved more than 95 percent of its cooling electricity energy. This represents $64,000 savings per month in the 500-kW facility by eliminating chillers and air-conditioning units. Additionally, the IT equipment in the data center uses 10 times less space than traditional data centers, requiring less than 160 square feet."
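PUE (Power Usage Effectiveness) is the ratio of total facility power to the power drawn by the IT equipment alone, so a PUE of 1.02 means only about 2 percent overhead goes to cooling and power distribution. The sketch below works through that arithmetic, assuming for illustration that the 500-kW figure refers to the IT load; the article does not specify which load it is.

```python
# What a PUE of 1.02 implies, assuming the 500-kW figure is the IT load
# (an assumption for illustration; the article does not say which load it refers to).
it_load_kw = 500.0
pue = 1.02                                    # PUE = total facility power / IT power

total_facility_kw = it_load_kw * pue
overhead_kw = total_facility_kw - it_load_kw  # cooling, power distribution, lighting, etc.
print(f"Total facility load: {total_facility_kw:.0f} kW")
print(f"Non-IT overhead:     {overhead_kw:.0f} kW ({100 * (pue - 1):.0f}% of the IT load)")
```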
Allied Control's vice president of operations Kar-Wing Lau said the project "demonstrates the elegance of immersion cooling and showcases that it has what it takes to be the new gold standard in the industry. Many of the companies that we partner with see the immediate benefits--technically and economically--and that it's already commercially available today. There is no denying that immersion cooling will play an important role in the future, and that it has great potential for growth."
3M's electronics markets materials division business director Joe Koch said, "The advancements that Allied Control has achieved are a major thrust for the data center industry. Their accomplishments are inspiring other industry leaders to find better ways to address energy efficiency, space constraints and increased computing power in data centers."
3M developed passive, two-phase (evaporative) immersion cooling in response to the growing need for more-energy-efficient means of cooling electronic components in large installations like data centers, the company explains. It employs semi-open baths of non-electrically conductive fluids--3M Novec Engineered Fluids.
"In this technique," 3M says, "component racks are completely submerged in a bath of Novec fluid." 3M advanced application development specialist Phil Tuma further explains, "Novec fluids remove heat through direct contact with the chip or other heat source. This raises the fluid to its boiling point. The vapor thus generated condenses back to a liquid by exposure to a condenser coil, then falls back into the bath. No energy is required to move the vapor and no chiller is needed for the condenser, which is cooled by normal facility water."
He estimates that this immersion cooling technique can decrease power use by 90 percent when compared to conventional air-cooling methods. "It also eliminates the need for the kinds of connectors, plumbing, pumps and cold plates associated with conventional cooling," he added.
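To get a feel for why no pumps or chillers are needed, consider the scale of the passive heat transfer involved. The rough estimate below assumes a latent heat of vaporization on the order of 100 kJ/kg, which is an assumed, order-of-magnitude figure typical of engineered dielectric fluids (the article gives no fluid properties), applied to a 500-kW heat load like the Hong Kong facility's.

```python
# Back-of-the-envelope estimate of the vapor cycle in passive two-phase immersion
# cooling. The latent heat value is an assumption (order-of-magnitude typical for
# engineered dielectric fluids); the article does not give fluid properties.
heat_load_kw = 500.0              # IT heat load to be removed, in kW (kJ/sec)
latent_heat_kj_per_kg = 100.0     # assumed heat absorbed per kg of fluid vaporized

vapor_rate_kg_per_s = heat_load_kw / latent_heat_kj_per_kg
print(f"~{vapor_rate_kg_per_s:.1f} kg of fluid boils off per second,")
print("condenses on facility-water-cooled coils, and drains back by gravity --")
print("no energy is spent moving the vapor and no chiller serves the condenser.")
```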
Whether the consideration is computing power, interconnect technology, equipment cooling or other concerns, high-performance computing environments require just that--high performance--of all components and systems.
Patrick McLaughlin is our chief editor.