Physically Relocating Compute Nodes over 3km Fiber Link Results in Increased Deployment Flexibility with no Performance Degradation

Obsidian Research Corporation, the leader in InfiniBand range extension, today announced that its Longbow Campus products have enabled NASA to relocate 15% (1,536 processors) of its high-ranking SGI® Altix®-based Columbia Supercomputer to another facility and seamlessly connect both locations without any performance degradation. The move allows NASA to free up building space in its compute center, to reduce the power and cooling requirements at any single location, and to create multi-site supercomputer clusters that scale, perform, and can be managed as if they resided in one physical location, since the Obsidian Longbow solution adds only 840ns of latency beyond the optical flight time between the two locations.

By deploying eight Longbow Campus products at each end, NASA transparently extends its InfiniBand fabric 3km and provides 8 GB/sec of bandwidth between the two locations, while adding only 2RU worth of equipment that consumes a negligible 150W.

“The Columbia Supercomputer tackles some of the most important science and engineering projects in the world,” said Dr. David Southwell, CEO of Obsidian Research Corporation. “Obsidian is pleased that its Longbow Campus products have created further flexibility for NASA by allowing it to quickly and seamlessly extend the supercomputer across multiple locations without any loss in performance, scalability, or manageability.”

InfiniBand Range Extension Allows NASA to Maximize Columbia Resources

NASA’s Columbia supercomputer, located in Mountain View, California, is a 10,240-core system composed of twenty 512-core nodes, based on SGI® Altix® 4700 and 3700 systems. Within each Altix node, processors are connected via NUMAlink, while node-to-node communication runs over 10Gb/s InfiniBand. Ranked the 13th most powerful supercomputer in the world, Columbia is particularly well suited for large-scale applications that involve substantial inter-processor communication and I/O, and is used to tackle some of NASA’s, and America’s, toughest science and engineering problems.
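The figures quoted in this release can be cross-checked with simple arithmetic. The short Python sketch below is illustrative only and is not part of the NASA or Obsidian deployment. It assumes that the 10Gb/s InfiniBand links use 8b/10b line coding (as 4x SDR InfiniBand does), so that each link carries 8 Gb/s of data, and that each of the eight Longbow units at an end carries one such link. All function names are hypothetical.

```python
# Figures taken from the release.
TOTAL_CORES = 10_240
CORES_PER_NODE = 512
RELOCATED_CORES = 1_536
LINKS = 8                     # Longbow Campus units at each end
SIGNALING_GBIT_PER_S = 10.0   # per-link InfiniBand signaling rate

# Assumption (not stated in the release): 8b/10b line coding,
# so only 8 of every 10 bits on the wire are payload.
ENCODING_EFFICIENCY = 8 / 10


def relocated_nodes() -> int:
    """Number of 512-core nodes that make up the relocated processors."""
    return RELOCATED_CORES // CORES_PER_NODE


def relocated_fraction() -> float:
    """Share of the whole system that was relocated."""
    return RELOCATED_CORES / TOTAL_CORES


def aggregate_bandwidth_gbyte_per_s(links: int = LINKS) -> float:
    """Aggregate data bandwidth in GB/s across all links, one direction."""
    data_gbit_per_link = SIGNALING_GBIT_PER_S * ENCODING_EFFICIENCY
    return links * data_gbit_per_link / 8  # 8 bits per byte


if __name__ == "__main__":
    print(f"Relocated nodes:     {relocated_nodes()}")
    print(f"Relocated fraction:  {relocated_fraction():.0%}")
    print(f"Aggregate bandwidth: {aggregate_bandwidth_gbyte_per_s():.1f} GB/s")
```

Under these assumptions, 1,536 processors correspond to three 512-core nodes, or 15% of the system, and eight links yield the 8 GB/sec quoted above.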

NASA needed to free up floor space in the compute center that houses Columbia to make room for new supercomputing resources, including a 2,048-core, 4TB SGI® Altix® 4700 system recently selected as part of the NASA Technology Refresh program and expected to be installed later this year. However, moving parts of the supercomputer to a nearby location posed several problems, as NASA did not want to sacrifice performance, manageability, scalability, or security. While InfiniBand offers the highest-performance, lowest-latency solution, standard InfiniBand copper cabling reaches only 20 meters, and more expensive optical solutions span only 100 meters. Interworking from InfiniBand to Ethernet proved to be not only an expensive proposition but also one that substantially increased the latency of the system, thereby reducing performance and scalability.

“We needed a solution that allowed us to relocate nodes from our Columbia supercomputer to a facility 3km away, yet perform, scale, and be managed as if all the nodes continued to reside in a single location,” said Alan Powers of Computer Sciences Corporation and High-End Computing Lead at NASA Ames. “With its ability to transparently extend IB fabrics over 10km of dark fiber while retaining the InfiniBand performance and semantics, the Obsidian Longbow Campus products offered a plug-and-play solution that allowed us to achieve our goals.”

Future Growth Potential without Power, Cooling, or Space Constraints

Currently, 15% of the Columbia system (1,536 processors) resides in a different building on NASA’s campus, with eight Longbow Campus systems on each side of the link. This production environment can be managed, and can perform and scale, seamlessly because each Longbow system presents as a two-port InfiniBand switch and adds only 840ns of latency (an order of magnitude less than Ethernet interworking) beyond the optical flight time between the two buildings.
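To put the 840ns figure in context, the optical flight time over the 3km link can be estimated from the speed of light in glass. The Python sketch below is a rough illustration, not a measurement. It assumes a refractive index of about 1.468, a typical value for single-mode fiber that is not stated in this release, and it treats the 840ns as the total latency added on top of the flight time, as the release describes. The function names are hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in vacuum, exact by definition
LINK_LENGTH_M = 3_000                 # from the release
LONGBOW_ADDED_LATENCY_NS = 840        # from the release

# Assumption (not from the release): typical single-mode fiber.
FIBER_REFRACTIVE_INDEX = 1.468


def fiber_flight_time_ns(length_m: float,
                         refractive_index: float = FIBER_REFRACTIVE_INDEX) -> float:
    """One-way propagation delay through the fiber, in nanoseconds."""
    seconds = length_m * refractive_index / SPEED_OF_LIGHT_M_PER_S
    return seconds * 1e9


def one_way_latency_ns(length_m: float,
                       added_ns: float = LONGBOW_ADDED_LATENCY_NS) -> float:
    """Flight time plus the latency the range extenders add."""
    return fiber_flight_time_ns(length_m) + added_ns


if __name__ == "__main__":
    flight = fiber_flight_time_ns(LINK_LENGTH_M)
    total = one_way_latency_ns(LINK_LENGTH_M)
    print(f"Optical flight time: {flight / 1000:.2f} us")
    print(f"With range extension: {total / 1000:.2f} us")
    print(f"Share added by Longbow: {LONGBOW_ADDED_LATENCY_NS / total:.1%}")
```

Under these assumptions the fiber itself accounts for roughly 14.7 microseconds one way, so the distance between the buildings, rather than the range-extension equipment, dominates the added delay.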

By distributing Columbia’s compute power across two locations while maintaining its desired performance, NASA achieves additional flexibility that eases future expansion. As compute requirements continue to grow, so do the physical power, cooling and space burdens that many facilities have difficulty accommodating. Obsidian Longbow Campus products remove these barriers by facilitating distributed supercomputer systems that perform as a single co-located system but can scale beyond any one location’s power, cooling, and space budget. As NASA continues to push the limits of its understanding in the areas of aeronautics, exploration, science, and space operations, Columbia will have the ability to tackle these problems and grow unimpeded by these physical constraints.

About Obsidian:

Obsidian Research Corporation and the Obsidian Longbow LP are the developers of Longbow, a series of InfiniBand range extension products. Longbow technology allows an InfiniBand fabric, normally a short-range network used in high-performance computing, to be extended via optical fiber over varying distances. Longbow connects across Campus, Metro or Global networks to offer unparalleled high-bandwidth, low-latency access to InfiniBand compute and storage resources. Obsidian is available online at www.obsidianresearch.com.

Contact

Obsidian Research Corporation

David Southwell, CEO, 780-964-3283

dsouthwell@obsidianresearch.com