Summit (supercomputer)
| Sponsors | United States Department of Energy |
|---|---|
| Operators | IBM |
| Architecture | 9,216 POWER9 22-core CPUs; 27,648 Nvidia Tesla V100 GPUs[1] |
| Power | 13 MW[2] |
| Operating system | Red Hat Enterprise Linux (RHEL)[3][4] |
| Storage | 250 PB |
| Speed | 200 petaFLOPS (peak) |
| Ranking | TOP500: 7 (1H2024) |
| Purpose | Scientific research |
| Website | www |


Summit, or OLCF-4, was a supercomputer developed by IBM for use at the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory in the United States. It held the number 1 position on the TOP500 list from June 2018 to June 2020.[5][6] As of June 2024, its LINPACK benchmark was measured at 148.6 petaFLOPS.[7] Summit was decommissioned on November 15, 2024.[8]
As of November 2019, the supercomputer ranked as the 5th most energy-efficient in the world, with a measured power efficiency of 14.668 gigaFLOPS/watt.[9] Summit was the first supercomputer to reach exaflop speed (a quintillion operations per second) on a non-standard metric, achieving 1.88 exaflops during a genomic analysis, and was projected to reach 3.3 exaflops using mixed-precision calculations.[10]
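The 3.3-exaflops projection can be cross-checked against the Tensor Core throughput of Summit's V100 GPUs; the ~120 teraFLOPS per-GPU mixed-precision rate used below is NVIDIA's published specification, not a figure from this article:

```python
# Rough sanity check of the projected mixed-precision peak, assuming
# NVIDIA's published ~120 teraFLOPS Tensor Core rate per Tesla V100.
NUM_GPUS = 27_648                 # 4,608 nodes x 6 V100 GPUs each
TENSOR_TFLOPS_PER_GPU = 120       # FP16/FP32 mixed-precision Tensor Core peak

peak_exaflops = NUM_GPUS * TENSOR_TFLOPS_PER_GPU * 1e12 / 1e18
print(f"{peak_exaflops:.2f} exaFLOPS")  # prints "3.32 exaFLOPS", near the 3.3 projection
```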
History
The United States Department of Energy awarded a $325 million contract in November 2014 to IBM, Nvidia and Mellanox. The effort resulted in construction of Summit and Sierra. Summit is tasked with civilian scientific research and is located at the Oak Ridge National Laboratory in Tennessee. Sierra is designed for nuclear weapons simulations and is located at the Lawrence Livermore National Laboratory in California.[11]
Summit was estimated to cover 5,600 square feet (520 m2)[12] and require 219 kilometres (136 mi) of cabling,[13] and was designed to be used for research in diverse fields such as cosmology, medicine, and climatology.[14]
In 2015, the project called Collaboration of Oak Ridge, Argonne and Lawrence Livermore (CORAL) included a third supercomputer named Aurora and was planned for installation at Argonne National Laboratory.[15] By 2018, Aurora was re-engineered with completion anticipated in 2021 as an exascale computing project along with Frontier and El Capitan to be completed shortly thereafter.[16] Aurora was completed in late 2022.[17]
Uses
The Summit supercomputer was built for research in energy, artificial intelligence, human health, and other areas.[18] It has been used in earthquake simulation, extreme weather simulation, materials science, genomics, and predicting the lifetime of neutrinos.[19]
Design
Each of Summit's 4,608 nodes consists of two IBM POWER9 CPUs and six Nvidia Tesla GPUs,[20] with over 600 GB of coherent memory (96 GB HBM2 plus 512 GB DDR4) addressable by all CPUs and GPUs, plus 800 GB of non-volatile RAM that can be used as a burst buffer or as extended memory.[21] The POWER9 CPUs and Nvidia Volta GPUs are connected using Nvidia's high-speed NVLink, allowing for a heterogeneous computing model.[22]
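The per-node memory figures add up as a quick check: six GPUs at 16 GB of HBM2 each (the per-GPU capacity from the V100 specification, an assumption here) plus the DDR4 pool give the "over 600 GB" of coherent memory quoted above:

```python
# Per-node coherent memory on Summit, from the figures in the text.
GPUS_PER_NODE = 6
HBM2_PER_GPU_GB = 16         # 6 x 16 GB = the 96 GB of HBM2 quoted above
DDR4_PER_NODE_GB = 512

coherent_gb = GPUS_PER_NODE * HBM2_PER_GPU_GB + DDR4_PER_NODE_GB
print(coherent_gb)           # prints 608, i.e. "over 600 GB"
```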
To provide a high rate of data throughput, the nodes are connected in a non-blocking fat-tree topology using a dual-rail Mellanox EDR InfiniBand interconnect for both storage and inter-process communications traffic, which delivers both 200 Gbit/s bandwidth between nodes and in-network computing acceleration for communications frameworks such as MPI and SHMEM/PGAS.
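The quoted 200 Gbit/s follows from the link technology: EDR InfiniBand signals at 100 Gbit/s per rail, so the dual-rail configuration gives each node 25 GB/s of injection bandwidth, or roughly 115 TB/s aggregated across the machine (a back-of-envelope figure, not one stated in the article):

```python
# Injection bandwidth implied by dual-rail EDR InfiniBand.
EDR_GBIT_PER_RAIL = 100            # EDR InfiniBand signaling rate per rail
RAILS_PER_NODE = 2
NODES = 4_608

node_gbit = EDR_GBIT_PER_RAIL * RAILS_PER_NODE   # 200 Gbit/s, as quoted
node_gbyte = node_gbit / 8                       # 25.0 GB/s per node
aggregate_tb_s = node_gbyte * NODES / 1000       # ~115.2 TB/s machine-wide
print(node_gbit, node_gbyte, aggregate_tb_s)
```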
The storage for Summit[23] has a fast in-system layer and a center-wide parallel filesystem layer. The in-system layer is optimized for fast storage with SSDs on each node, while the center-wide parallel file system provides easy access to data stored on hard drives. The two layers work together seamlessly, so users do not have to differentiate their storage needs. The center-wide parallel file system is GPFS (IBM Storage Scale) and provides 250 PB of storage. The cluster delivers 2.5 TB/s of peak single-stream read throughput and 1 TB/s of throughput on 1 MiB files. It was one of the first supercomputers to require extremely fast metadata performance to support AI/ML workloads, exemplified by the 2.6 million 32 KiB file creates per second it delivers.
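The metadata benchmark also implies a substantial small-file data rate; assuming the quoted file size means 32 KiB (an interpretation, not stated explicitly), the create rate corresponds to roughly 85 GB/s of writes:

```python
# Data rate implied by the quoted metadata benchmark of
# 2.6 million file creates per second at 32 KiB each (assumed KiB).
CREATES_PER_SEC = 2_600_000
FILE_SIZE_BYTES = 32 * 1024

bytes_per_sec = CREATES_PER_SEC * FILE_SIZE_BYTES
print(f"{bytes_per_sec / 1e9:.1f} GB/s")  # prints "85.2 GB/s"
```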
See also
- Titan (supercomputer) – OLCF-3
- Frontier (supercomputer) – OLCF-5
- TOP500
- OpenBMC
- Red Hat Enterprise Linux
References
- ^ "ORNL Launches Summit Supercomputer". www.ornl.gov. June 8, 2018. Archived from the original on August 8, 2019. Retrieved June 12, 2018.
- ^ Liu, Zhiye (26 June 2018). "US Dethrones China With IBM Summit Supercomputer". Tom's Hardware. Archived from the original on 5 July 2018. Retrieved 19 July 2018.
- ^ Kerner, Sean Michael (8 June 2018). "IBM Unveils Summit, the World's Fastest Supercomputer (For Now)". Server Watch. Archived from the original on 10 June 2018. Retrieved 24 February 2020.
- ^ Nestor, Marius (11 June 2018). "Meet IBM Summit, World's Fastest and Smartest Supercomputer Powered by Linux". Softpedia News. Archived from the original on 24 February 2020. Retrieved 24 February 2020.
- ^ Lohr, Steve (8 June 2018). "Move Over, China: U.S. Is Again Home to World's Speediest Supercomputer". The New York Times. Archived from the original on 8 June 2018. Retrieved 19 July 2018.
- ^ "Top 500 List - November 2022". TOP500. November 2022. Archived from the original on 16 November 2022. Retrieved 13 April 2022.
- ^ "November 2022 | TOP500 Supercomputer Sites". TOP500. Retrieved 13 April 2022.
- ^ "2024 Notable System Changes — OLCF User Documentation". 31 October 2024. Archived from the original on 1 January 2025. Retrieved 1 January 2025.
- ^ "Green500 List - November 2019". TOP500. Archived from the original on 18 November 2019. Retrieved 7 April 2020.
- ^ Holt, Kris (8 June 2018). "The US again has the world's most powerful supercomputer". Engadget. Archived from the original on 8 June 2018. Retrieved 20 July 2018.
- ^ Shankland, Steven (14 September 2015). "IBM, NVIDIA land $325M supercomputer deal". C|Net. Archived from the original on 3 March 2016. Retrieved 29 December 2015.
- ^ "America's most powerful supercomputer is a machine for scientific discovery" (PDF). www.olcf.ornl.gov. Archived (PDF) from the original on 2023-10-17. Retrieved 2023-12-11.
- ^ Alcorn, Paul (20 November 2017). "Regaining America's Supercomputing Supremacy With The Summit Supercomputer". Tom's Hardware. Archived from the original on 23 November 2017. Retrieved 20 November 2017.
- ^ Noyes, Katherine (16 March 2015). "IBM, NVIDIA rev HPC engines in next-gen supercomputer push". PC World. Archived from the original on 21 December 2015. Retrieved 29 December 2015.
- ^ R. Johnson, Colin (15 April 2015). "IBM vs. Intel in Supercomputer Bout". EE Times. Archived from the original on 16 December 2015. Retrieved 29 December 2015.
- ^ Morgan, Timothy Prickett (9 April 2018). "Bidders Off And Running After $1.8 Billion DOE Exascale Super Deals". The Next Platform. Archived from the original on 16 June 2019. Retrieved 20 July 2018.
- ^ Hemsoth, Nicole (2021-09-23). "A Status Check on Global Exascale Ambitions". The Next Platform. Archived from the original on 2021-10-16. Retrieved 2021-10-15.
- ^ "Introducing Summit". Archived from the original on 29 November 2014. Retrieved 24 December 2019.
- ^ "Summit Supercomputer is Already Making its Mark on Science". 20 September 2018. Archived from the original on 26 September 2019. Retrieved 5 August 2020.
- ^ "The most powerful computers on the planet - Summit and Sierra". IBM. 6 June 2018. Archived from the original on 16 July 2019. Retrieved 4 April 2019.
- ^ Lilly, Paul (January 25, 2017). "NVIDIA 12nm FinFET Volta GPU Architecture Reportedly Replacing Pascal In 2017". HotHardware. Archived from the original on April 17, 2017. Retrieved April 17, 2017.
- ^ "Summit and Sierra Supercomputers: An Inside Look at the U.S. Department of Energy's New Pre-Exascale Systems" (PDF). November 1, 2014. Archived (PDF) from the original on April 21, 2017. Retrieved April 17, 2017.
- ^ Oral, Sarp; Vazhkudai, Sudharshan; Wang, Feiyi; Zimmer, Christopher; Brumgard, Christopher; Hanley, Jesse; Markomanolis, George; Miller, Ross; Leverman, Dustin B. (2019-11-01). End-to-end I/O portfolio for the summit supercomputing ecosystem (Report). Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). OSTI 1619016. Archived from the original on 2024-01-08. Retrieved 2024-01-08.
Development and Procurement
Origins and Funding
The U.S. Department of Energy (DOE) initiated planning for next-generation supercomputers in the early 2010s to sustain American leadership in high-performance computing amid intensifying international rivalry, particularly after China's Tianhe-2 claimed the top position on the TOP500 list in November 2013. At Oak Ridge National Laboratory (ORNL), the existing Titan system, operational since 2012, required replacement to address escalating computational demands in scientific simulation and data analysis. In early 2014, DOE formalized the Collaboration of Oak Ridge, Argonne, and Livermore (CORAL) program to coordinate procurement across its national laboratories, optimize resource allocation, and accelerate development of pre-exascale systems capable of supporting advanced research in energy, materials, and national security.[10] On November 14, 2014, DOE awarded IBM a $325 million contract, shared between Summit for ORNL and a companion system, Sierra, for Lawrence Livermore National Laboratory, to design and build machines incorporating hybrid architectures as stepping stones toward exascale computing.[11][12] This selection followed a competitive request for proposals under CORAL, prioritizing vendors able to deliver scalable performance exceeding 100 petaflops while integrating emerging technologies for broader applicability in DOE missions.[13] Funding for Summit derived primarily from DOE's Office of Science budget, appropriated through congressional allocations of federal taxpayer dollars to maintain U.S. computational infrastructure for open science and classified applications.[14] The investment reflected strategic priorities to counter foreign advances in supercomputing, which had implications for technological edge in fields like nuclear simulation and climate modeling, without relying on classified export-restricted hardware.[15]
IBM Partnership and Construction
IBM led the design and construction of Summit in collaboration with NVIDIA for GPU acceleration and Mellanox Technologies for networking, as announced by Oak Ridge National Laboratory on November 14, 2014. The system utilized IBM's Power System AC922 architecture, with each of the 4,608 compute nodes equipped with two 22-core POWER9 processors and six NVIDIA Tesla V100 GPUs, interconnected intra-node via NVIDIA's NVLink 2.0 at 50 GB/s between CPUs and GPUs.[16][1][2] Assembly of the nodes occurred primarily at IBM facilities, with initial deliveries enabling installation phases at the Oak Ridge Leadership Computing Facility to commence in August 2017. The inter-node fabric employed a non-blocking Mellanox EDR InfiniBand fat-tree topology, providing 100 Gb/s per rail across dual rails per node to support massive parallelism without bottlenecks.[17][2][18] Key engineering milestones included validating hybrid CPU-GPU scaling during phased rollouts and addressing the integration demands of over 27,000 GPUs and 9,000 CPUs through rigorous testing of NVLink coherence and InfiniBand latency under full load. The process achieved initial operational readiness by June 8, 2018, marking the transition from construction to acceptance verification by ORNL and IBM teams.[1][19]
Technical Design
Hardware Architecture
Summit employs a hybrid CPU-GPU architecture comprising 4,608 compute nodes interconnected via a dual-rail Mellanox EDR InfiniBand network operating at 100 Gb/s per rail (200 Gb/s per node).[2][20] Each node integrates two IBM POWER9 processors and six NVIDIA Tesla V100 GPUs, enabling high-performance computing workloads through tight integration of CPU and accelerator resources.[2][21] The POWER9 CPUs in each node feature 22 cores clocked at up to 3.07 GHz, providing 44 CPU cores per node with support for simultaneous multithreading (SMT) of up to 4 threads per core.[19] These processors provide VSX vector extensions and are optimized for data-intensive tasks, with each node allocating 512 GB of DDR4 memory accessible coherently across CPUs and GPUs.[2] The NVIDIA V100 GPUs, each equipped with 5,120 CUDA cores, 640 Tensor Cores, and 16 GB of HBM2 memory, connect to the POWER9 CPUs via NVLink 2.0 interconnects offering up to 300 GB/s of combined bidirectional bandwidth per GPU for low-latency data transfer.[2][19] This configuration yields a theoretical peak performance of about 200 petaFLOPS in double-precision (FP64) arithmetic across the system.[22]

| Component | Specification per Node | System Total |
|---|---|---|
| CPUs | 2 × IBM POWER9 (22 cores each, up to 3.07 GHz) | 9,216 CPUs (202,752 cores) |
| GPUs | 6 × NVIDIA Tesla V100 (5,120 CUDA cores each) | 27,648 GPUs |
| Memory | 512 GB DDR4 (coherent) + 96 GB HBM2 (GPU) | ~2.8 PB aggregate |
| Interconnect (intra-node) | NVLink 2.0 (300 GB/s per GPU) | N/A |
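The system totals in the table follow directly from the per-node counts; a quick arithmetic check (using decimal petabytes for the memory sum, which lands near 2.8 PB for DDR4 plus HBM2):

```python
# System totals implied by the per-node configuration in the table.
NODES = 4_608
CPUS_PER_NODE, CORES_PER_CPU = 2, 22
GPUS_PER_NODE = 6
DDR4_GB, HBM2_GB = 512, 96       # per node

cpus = NODES * CPUS_PER_NODE                     # 9,216 CPUs
cpu_cores = cpus * CORES_PER_CPU                 # 202,752 POWER9 cores
gpus = NODES * GPUS_PER_NODE                     # 27,648 GPUs
memory_pb = NODES * (DDR4_GB + HBM2_GB) / 1e6    # ~2.8 PB of DDR4 + HBM2
print(cpus, cpu_cores, gpus, round(memory_pb, 2))
```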