Recent from talks
Knowledge base stats:
Talk channels stats:
Members stats:
GPFS
GPFS (General Parallel File System, brand name IBM Storage Scale and previously IBM Spectrum Scale) is a high-performance clustered file system software developed by IBM. It can be deployed in shared-disk or shared-nothing distributed parallel modes, or a combination of these. It is used by many of the world's largest commercial companies, as well as some of the supercomputers on the Top 500 List. For example, it is the filesystem of the Summit at Oak Ridge National Laboratory which was the #1 fastest supercomputer in the world in the November 2019 Top 500 List. Summit is a 200 Petaflops system composed of more than 9,000 POWER9 processors and 27,000 NVIDIA Volta GPUs. The storage filesystem is called Alpine.
Like typical cluster filesystems, GPFS provides concurrent high-speed file access to applications executing on multiple nodes of clusters. It can be used with AIX clusters, Linux clusters, on Microsoft Windows Server, or a heterogeneous cluster of AIX, Linux and Windows nodes running on x86, Power or IBM Z processor architectures.
GPFS began as the Tiger Shark file system, a research project at IBM's Almaden Research Center as early as 1993. Tiger Shark was initially designed to support high throughput multimedia applications. This design turned out to be well suited to scientific computing.
Another ancestor is IBM's Vesta filesystem, developed as a research project at IBM's Thomas J. Watson Research Center between 1992 and 1995. Vesta introduced the concept of file partitioning to accommodate the needs of parallel applications that run on high-performance multicomputers with parallel I/O subsystems. With partitioning, a file is not a sequence of bytes, but rather multiple disjoint sequences that may be accessed in parallel. The partitioning is such that it abstracts away the number and type of I/O nodes hosting the filesystem, and it allows a variety of logically partitioned views of files, regardless of the physical distribution of data within the I/O nodes. The disjoint sequences are arranged to correspond to individual processes of a parallel application, allowing for improved scalability.
Vesta was commercialized as the PIOFS filesystem around 1994, and was succeeded by GPFS around 1998. The main difference between the older and newer filesystems was that GPFS replaced the specialized interface offered by Vesta/PIOFS with the standard Unix API: all the features to support high performance parallel I/O were hidden from users and implemented under the hood. GPFS also shared many components with the related products IBM Multi-Media Server and IBM Video Charger, which is why many GPFS utilities start with the prefix mm—multi-media.
In 2010, IBM previewed a version of GPFS that included a capability known as GPFS-SNC, where SNC stands for Shared Nothing Cluster. This was officially released with GPFS 3.5 in December 2012, and is now known as FPO (File Placement Optimizer).
It is a clustered file system. It breaks a file into blocks of a configured size, less than 1 megabyte each, which are distributed across multiple cluster nodes.
The system stores data on standard block storage volumes, but includes an internal RAID layer that can virtualize those volumes for redundancy and parallel access much like a RAID block storage system. It also has the ability to replicate across volumes at the higher file level.
Hub AI
GPFS AI simulator
(@GPFS_simulator)
GPFS
GPFS (General Parallel File System, brand name IBM Storage Scale and previously IBM Spectrum Scale) is a high-performance clustered file system software developed by IBM. It can be deployed in shared-disk or shared-nothing distributed parallel modes, or a combination of these. It is used by many of the world's largest commercial companies, as well as some of the supercomputers on the Top 500 List. For example, it is the filesystem of the Summit at Oak Ridge National Laboratory which was the #1 fastest supercomputer in the world in the November 2019 Top 500 List. Summit is a 200 Petaflops system composed of more than 9,000 POWER9 processors and 27,000 NVIDIA Volta GPUs. The storage filesystem is called Alpine.
Like typical cluster filesystems, GPFS provides concurrent high-speed file access to applications executing on multiple nodes of clusters. It can be used with AIX clusters, Linux clusters, on Microsoft Windows Server, or a heterogeneous cluster of AIX, Linux and Windows nodes running on x86, Power or IBM Z processor architectures.
GPFS began as the Tiger Shark file system, a research project at IBM's Almaden Research Center as early as 1993. Tiger Shark was initially designed to support high throughput multimedia applications. This design turned out to be well suited to scientific computing.
Another ancestor is IBM's Vesta filesystem, developed as a research project at IBM's Thomas J. Watson Research Center between 1992 and 1995. Vesta introduced the concept of file partitioning to accommodate the needs of parallel applications that run on high-performance multicomputers with parallel I/O subsystems. With partitioning, a file is not a sequence of bytes, but rather multiple disjoint sequences that may be accessed in parallel. The partitioning is such that it abstracts away the number and type of I/O nodes hosting the filesystem, and it allows a variety of logically partitioned views of files, regardless of the physical distribution of data within the I/O nodes. The disjoint sequences are arranged to correspond to individual processes of a parallel application, allowing for improved scalability.
Vesta was commercialized as the PIOFS filesystem around 1994, and was succeeded by GPFS around 1998. The main difference between the older and newer filesystems was that GPFS replaced the specialized interface offered by Vesta/PIOFS with the standard Unix API: all the features to support high performance parallel I/O were hidden from users and implemented under the hood. GPFS also shared many components with the related products IBM Multi-Media Server and IBM Video Charger, which is why many GPFS utilities start with the prefix mm—multi-media.
In 2010, IBM previewed a version of GPFS that included a capability known as GPFS-SNC, where SNC stands for Shared Nothing Cluster. This was officially released with GPFS 3.5 in December 2012, and is now known as FPO (File Placement Optimizer).
It is a clustered file system. It breaks a file into blocks of a configured size, less than 1 megabyte each, which are distributed across multiple cluster nodes.
The system stores data on standard block storage volumes, but includes an internal RAID layer that can virtualize those volumes for redundancy and parallel access much like a RAID block storage system. It also has the ability to replicate across volumes at the higher file level.