Industry Standard Operating System
The operating system is Linux, the standard in the HPC field. Portability of applications and tools is maintained while functions have been extended. The extensions include support for high-speed application execution through HPC-ACE, synchronization scheduling, and large pages, as well as improved system operability achieved, for example, by isolating failed processors and memory.
Technical Computing Suite, HPC Middleware for Petascale Computing
System Management and Job Operations Management Functions to Enable Efficient Operation in Large Systems
- System Management Functions
The system is managed and operated using an efficient system monitoring and control facility capable of handling a large number of nodes. Management features include hierarchical management of compute nodes, synchronized start and stop across the whole system, simultaneous OS deployment of nodes, and execution of operational commands across specific nodes. The system can also be partitioned, depending on
- Job Operations Management Functions
High system efficiency is maintained through a variety of job operations management functions. These include resource assignment for jobs optimized for the 6D Mesh/Torus Tofu interconnect architecture; detailed assignment of resources according to resource selection priorities; a backfill function for effective use of unused resources; and a deadline scheduling function that does not assign resources during pre-specified periods, such as maintenance periods.
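The backfill idea can be illustrated with a minimal sketch. It assumes a conventional "EASY"-style policy: the highest-priority blocked job receives a reserved start time, and lower-priority jobs may start early only if they do not delay it. The source does not say which backfill algorithm the job scheduler actually uses, and the names `Job` and `backfill`, as well as the node counts and time units, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # number of nodes requested
    runtime: int  # user-supplied run-time limit (arbitrary time units)


def backfill(total_nodes, running, queue, now=0):
    """Return the names of queued jobs that may start at time `now`.

    running: list of (end_time, nodes) for jobs already executing.
    queue:   list of Job objects in priority order.
    """
    running = list(running)
    pending = list(queue)
    free = total_nodes - sum(n for _, n in running)
    started = []

    # Start jobs in priority order for as long as they fit.
    while pending and pending[0].nodes <= free:
        job = pending.pop(0)
        free -= job.nodes
        running.append((now + job.runtime, job.nodes))
        started.append(job.name)
    if not pending:
        return started

    # Reserve the earliest time at which the blocked head job can start.
    head = pending[0]
    if head.nodes > total_nodes:
        raise ValueError("head job requests more nodes than the system has")
    avail = free
    for end_time, nodes in sorted(running):
        avail += nodes
        if avail >= head.nodes:
            reserved_start = end_time
            break
    spare = avail - head.nodes  # nodes still unused once the head job starts

    # Backfill: a later job may start now if it fits in the free nodes and
    # either finishes before the reservation or uses only spare nodes.
    for job in pending[1:]:
        if job.nodes > free:
            continue
        if now + job.runtime <= reserved_start:
            free -= job.nodes
            started.append(job.name)
        elif job.nodes <= spare:
            free -= job.nodes
            spare -= job.nodes
            started.append(job.name)
    return started
```

For example, with 8 nodes, 6 of them busy until time 10, and a 4-node job at the head of the queue, a 2-node job with a run-time limit of 8 can be started immediately without delaying the head job.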
Large-capacity, High-performance, and Highly Reliable Distributed File System, FEFS
The FEFS distributed file system can be shared by over 100,000 nodes and provides high-capacity, high-performance, and highly reliable file systems. FEFS supports file systems of up to 8 exabytes containing approximately 9,000 quadrillion files. Capacity and I/O throughput can also be increased by scaling out I/O servers.
FEFS achieves high performance through caching and file striping functions. Fault tolerance is supported through I/O server fail-over, file system journaling, and other functions. In addition, fine control of the file system is possible using functions such as the QoS control feature, which guarantees I/O bandwidth for each user, and the quota feature, which is available at the directory level.
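As an illustration of how file striping spreads I/O across servers, the sketch below maps byte offsets in a file to servers in round-robin fashion. It is a generic RAID-0-style layout, not FEFS's documented on-disk format; the function names and the `stripe_size` and `stripe_count` parameters are assumptions made for the example.

```python
def locate(offset, stripe_size, stripe_count):
    """Map a file byte offset to (server_index, offset_within_server_object)."""
    stripe_no = offset // stripe_size   # index of the stripe unit in the file
    server = stripe_no % stripe_count   # units are dealt out round-robin
    row = stripe_no // stripe_count     # complete rounds before this unit
    return server, row * stripe_size + offset % stripe_size


def split_request(offset, length, stripe_size, stripe_count):
    """Split one contiguous read or write into per-server pieces.

    Returns a list of (server_index, object_offset, byte_count) tuples.
    """
    pieces = []
    end = offset + length
    while offset < end:
        server, obj_off = locate(offset, stripe_size, stripe_count)
        # A piece never crosses a stripe-unit boundary.
        chunk = min(stripe_size - offset % stripe_size, end - offset)
        pieces.append((server, obj_off, chunk))
        offset += chunk
    return pieces
```

A large sequential request is thus broken into pieces that different I/O servers can handle in parallel, which is why adding servers can raise aggregate throughput.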
FEFS can be used as both a global and a local file system. An efficient file staging function between the two file systems is provided in cooperation with the job operations management function. Files are arranged in the local file system for optimal access from job processes, resulting in high-speed I/O and a reduction in the variations in job execution times caused by inconsistent I/O processing times.
A Variety of Language Processing Systems Maximizing Hardware Performance
- Fortran/C/C++ Compilers
Compilers compliant with the international standards for Fortran 95, Fortran 2003, C99, and C++ are all part of the integrated development environment. These compilers maximize the performance of the SPARC64™ IXfx using the extended register sets, sector cache, and SIMD instructions of HPC-ACE.
Moreover, automatic parallelization and parallelization using OpenMP are supported, and highly efficient multithreaded processing can be achieved using VISIMPACT in cooperation with the hardware functions.
- Message Passing Library (MPI)
Industry-standard MPI 2.1 is supported. The MPI library is highly optimized for the Tofu interconnect, with increased performance and a small memory footprint. In particular, frequently used collective communication functions use special algorithms that take the Tofu interconnect topology into account.
Implementation of a high-speed barrier and reduction function in the Tofu interconnect further reduces parallel application processing times.
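To show why a fast reduction matters for parallel applications, the sketch below simulates a generic binary-tree reduction in plain Python, with no MPI involved. It is not the algorithm implemented in the Tofu interconnect, which the source does not describe; `tree_reduce` and its step count are purely illustrative.

```python
import operator


def tree_reduce(values, op):
    """Combine one value per rank pairwise in a binary tree.

    Returns (result, steps), where steps is the number of communication
    rounds. Within a round, all pairs could exchange data concurrently.
    """
    vals = list(values)
    if not vals:
        raise ValueError("need at least one rank")
    n = len(vals)
    steps = 0
    stride = 1
    while stride < n:
        # Each surviving rank absorbs the partial result of its partner.
        for rank in range(0, n, 2 * stride):
            partner = rank + stride
            if partner < n:
                vals[rank] = op(vals[rank], vals[partner])
        stride *= 2
        steps += 1
    return vals[0], steps


total, rounds = tree_reduce(range(8), operator.add)
```

With 8 ranks the sum is complete after 3 rounds rather than 7 sequential additions; the number of rounds grows logarithmically with the rank count, so the latency of each round dominates at large scale.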
- Mathematical Libraries
PRIMEHPC FX10 supports Fujitsu's highly tuned mathematical libraries, SSL II, C-SSL II, and SSL II/MPI, as well as industry-standard libraries such as BLAS, LAPACK, and ScaLAPACK. Since the major routines are tuned to exploit the performance of the SPARC64™ IXfx, applications using these libraries can attain high levels of performance.
- Data Parallel Processing Compiler
XPFortran facilitates the development of parallel applications based on Fortran. Since the XPFortran extensions are designed as directives, an XPFortran program can also be run as an ordinary Fortran program.
Advanced Application Development Environment
A GUI-based development environment provides a unified view through all phases of application development. The development process is accelerated by highly functional development tools, such as an interactive debugger usable with sequential and parallel applications written in Fortran, C, and C++, and a profiler/tracer that helps with efficient application tuning.
GUI for Application Development
User and Operation Management Support Solutions
Portal-based solutions are provided to support the user and operational management of the system.
HPC Portal is a Web portal that makes the PRIMEHPC FX10 simple for end users to use. Specifically, users can edit files, compile programs, and submit, monitor, or kill jobs from a standard Web browser.
Operation Management Portal
This Web portal allows administrators to monitor and operate the PRIMEHPC FX10. The portal provides a variety of views through which administrators can check system operational status, display log information, review resource utilization, manage jobs, and control the power supply via a Web browser. The result is a reduction in operation management work and lower costs for administrator training.