The K computer

2012-7 (Vol.48, No.3)

This special issue provides a broad overview of the K computer. It outlines the project, describes RIKEN and Fujitsu's approach to developing this supercomputer, and covers hardware, software, and applications, including key papers written by RIKEN personnel.

2012-7 (Vol.48, No.3) Contents

  • 1. Preface (PDF)

    The K computer is scheduled to begin shared-use operations in the fall of 2012. I am convinced that many beneficial results will be forthcoming through the development of many applications and the achievement of high execution performance as originally planned. Looking to the future, Fujitsu plans to achieve a supercomputer with even higher performance and to develop even more beneficial applications. We will accomplish this by refining the technologies introduced in this special issue and leveraging our strengths in developing entire computer systems from CPU to software. ---[Yuji Oinaga, Head of Next Generation Technical Computing Unit]

  • 2. Special Contribution: Japan's K computer Project (PDF)

    ---[Kimihiko Hirao]

  • 3. Overview of the K computer System (PDF)

    RIKEN and Fujitsu have been working together to develop the K computer, with the aim of beginning shared use by the fall of 2012, as a part of the High-Performance Computing Infrastructure (HPCI) initiative led by Japan's Ministry of Education, Culture, Sports, Science and Technology (MEXT). Since the K computer involves over 80 000 compute nodes, building it with lower power consumption and high reliability was important from the availability point of view. This paper describes the K computer system and the measures taken for reducing power consumption and achieving high reliability and high availability. It also presents the results of implementing those measures. ---[Hiroyuki Miyazaki, Yoshihiro Kusano, Naoki Shinjou, Fumiyoshi Shoji, Mitsuo Yokokawa, Tadashi Watanabe]

  • 4. Construction and Facilities Technologies for the K computer (PDF)

    The facilities for housing the K computer and for cooling and supplying power have many features not found in other supercomputer sites. These include an expansive, pillar-free computer room, a power supply system that combines the functions of a cogeneration system (CGS) and a high-speed current-limiting circuit breaker without using an uninterruptible power supply (UPS), distribution boards installed under a raised floor instead of on computer-room walls, extremely quiet, high-efficiency air conditioning equipment, and a cooling-water system for CPUs featuring precise temperature control. These features are part of a policy that was adopted to ensure quick and easy installation and stable and safe operation of the K computer. The application of these unique features did not require the development or adoption of new technologies. It was accomplished by cleverly combining existing, proven, and mature technologies having a stable reputation, since a development project limited in time and budget should not adopt novel and unproven technologies. This paper describes the construction and facilities technologies supporting the operation of the K computer. ---[Yoshihiro Sekiguchi]

  • 5. SPARC64 VIIIfx: CPU for the K computer (PDF)

    SPARC64 VIIIfx, which was developed as a processor for the K computer, uses Fujitsu Semiconductor Ltd.'s 45-nm CMOS process for semiconductors and is composed of eight cores, a 6 MB shared level 2 cache, and memory controllers. Peak performance of 128 GFLOPS at an operating frequency of 2 GHz is achieved with power consumption as low as 58 W. The performance per unit of power is more than six times that of the SPARC processor, our previous model. To achieve this performance per unit of power, we extended the SPARC-V9 architecture to develop high performance computing-arithmetic computational extensions (HPC-ACE), the optimum instruction set for scientific computations. In addition, we successfully reduced the leakage power by water cooling and dynamic power by clock gating to achieve a lower power consumption. Furthermore, high-reliability technology for mainframes and UNIX servers is used to ensure stable operation of a system connecting more than 80 000 processors. This paper outlines the technologies used to achieve the high performance, low power consumption and high reliability of SPARC64 VIIIfx. ---[Toshio Yoshida, Mikio Hondo, Ryuji Kan, Go Sugizaki]

  • 6. Tofu: Interconnect for the K computer (PDF)

    Torus fusion (Tofu) is an interconnect for massively parallel computers, developed to build the K computer, which interconnects more than 80 000 nodes. The Tofu interconnect achieves high scalability beyond 100 000 nodes, high performance, high reliability, and high availability. The network topology is a highly scalable six-dimensional mesh/torus. The link throughput is 5 GB/s in each direction. Each node can communicate in four directions simultaneously. The three-dimensional torus rank-mapping scheme improves the system availability and the Tofu barrier interface (TBI) processes collective communications with low latency. Network interfaces and a router of the Tofu interconnect are integrated into a newly developed chip called InterConnect Controller (ICC). This paper outlines the ICC chip, the six-dimensional mesh/torus network, the high-performance and highly reliable communication functions, and the TBI, and describes their characteristics. ---[Yuuichirou Ajima, Tomohiro Inoue, Shinya Hiramoto, Toshiyuki Shimizu]

  • 7. System Packaging Technologies for the K computer (PDF)

    The K computer ranked first on the TOP500 List of June 2011 and maintained its position atop the TOP500 List of November 2011. In addition, it took sixth place in the June 2011 edition of the Green500, which provides a ranking of supercomputers in terms of computational performance per unit of power. The achievement of such high performance and energy efficiency is due not only to the high-performance, low-power-consumption CPU but also largely to the system packaging technologies: rack technology to allow high-density mounting of CPUs, connection technology to achieve high-speed data transmission between CPUs, cooling technology for improved reliability and power supply technology to reduce power loss. This paper describes the system packaging technologies applied to the K computer. ---[Hideki Maeda, Hideo Kubo, Hiroshi Shimamori, Akira Tamura, Jie Wei]

  • 8. Operating System for the K computer (PDF)

    For the K computer to achieve the world's highest performance, Fujitsu has worked on the following three performance improvements in the development of the operating system (OS). First, to bring out the maximum hardware performance of our original CPU and interconnect, we provided a mechanism for controlling hardware extensions directly from applications. Second, we introduced a synchronization scheduling function that minimizes the synchronization wait time of parallel programs resulting from system interruptions by coordinating job runtime and system runtime between multiple nodes. Third, we added support for multiple page sizes to improve memory access performance and memory utilization efficiency. This paper describes these performance improvement functions as well as the usability and robustness of the OS developed. ---[Jun Moroo, Masahiko Yamada, Takeharu Kato]

  • 9. High-Performance and Highly Reliable File System for the K computer (PDF)

    RIKEN and Fujitsu have been developing the world's fastest supercomputer, the K computer. In addition to over 80 000 compute nodes, the K computer has several tens of petabytes storage capacity and over one terabyte per second of I/O bandwidth. It is the largest and fastest storage system in the world. In order to take advantage of this huge storage system and achieve high scalability and high stability, we developed the Fujitsu Exabyte File System (FEFS), a clustered distributed file system. This paper outlines the K computer's file system and introduces measures taken in FEFS to address key issues in a large-scale system. ---[Kenichiro Sakai, Shinji Sumimoto, Motoyoshi Kurokawa]

  • 10. Operations Management Software for the K computer (PDF)

    Supercomputer systems have been increasing steadily in scale (number of CPU cores and number of nodes) in response to an ever increasing demand for computing power. Operations management software is the key to operating such ultra-large-scale systems in a stable manner and providing users with a high-performance computing environment having a high utilization rate. Fujitsu previously developed software called "Parallelnavi" to provide uniform operations management for its 3000-node supercomputer systems, but to uniformly manage an ultra-large-scale system like the K computer on a scale of more than 80 000 nodes, it expanded its development of operations management technologies. This paper introduces operations management software developed for the K computer, focusing on functions for achieving stable operation of an ultra-large-scale system and a job scheduler for providing a high-performance computing environment. ---[Kouichi Hirai, Yuji Iguchi, Atsuya Uno, Motoyoshi Kurokawa]

  • 11. Compiler Technology That Demonstrates Ability of the K computer (PDF)

    We developed SPARC64 VIIIfx, a new CPU for constructing a huge computing system on a scale of 10 PFLOPS. To make the best use of the features of this CPU, we developed a language package called "Parallelnavi Technical Computing Language." This paper presents compilers for Fortran/C/C++ included in the language package. In these compilers, we enhanced the optimization function for sequential processing (sequential optimization) and the function of the compilers to automatically generate thread parallel processing codes (automatic parallelization) to bring out the best of SPARC64 VIIIfx. Moreover, we have provided a hybrid parallel execution model that combines thread parallel execution and process parallel execution to realize high execution performance in a large-scale system. This model supports the latest industry-standard language specifications, and so it has allowed us to compile a wider range of programs. ---[Koutarou Taki, Manabu Matsuyama, Hitoshi Murai, Kazuo Minami]

  • 12. MPI Library and Low-Level Communication on the K computer (PDF)

    The key to raising application performance in a massively parallel system like the K computer is to increase the speed of communication between compute nodes. In the K computer, this inter-node communication is governed by the Message Passing Interface (MPI) communication library and low-level communication. This paper describes the implementation and performance of the MPI communication library, which exploits the new Tofu-interconnect architecture introduced in the K computer to enhance the performance of petascale applications, and of the low-level communication mechanism, which performs fine-grained control of the Tofu interconnect. ---[Naoyuki Shida, Shinji Sumimoto, Atsuya Uno]

  • 13. Performance Profiling and Debugging on the K computer (PDF)

    We have developed application-development support tools for the K computer. This paper describes profiling functions for raising the performance of applications and debugging functions for testing applications as main functions of these tools. In developing these tools, we first defined the work procedure that a user would follow for improving the performance of an application and testing it. We then investigated the form that these application-development support tools should take for each task in that procedure. We here introduce these tools in conjunction with those tasks. Additionally, while the large-scale configuration of the K computer is one of its major features, existing profilers and debuggers for large-scale applications still have problems that have yet to be solved, and we here describe new measures for addressing those problems. We also touch upon profiling functions specifically developed for the advanced hardware of the K computer such as the high-performance SPARC64 VIIIfx processor and Tofu interconnect. ---[Keiichi Ida, Yasuyuki Ohno, Shunsuke Inoue, Kazuo Minami]

  • 14. Web Portals of the K computer (PDF)

    Modern computer and communication technologies have transformed computers into information appliances, where people can have at their fingertips any information spread throughout the World Wide Web. Current Web technologies provide means for accessing a computer's sophisticated functionalities through easy-to-use and intuitive user interfaces. Such features are also important in the technical computing field, where researchers require not only high-speed computing power, but also capabilities to easily access the complex functionalities of supercomputers. As part of the effort to provide easy access to the K computer, we developed the Web-based User's Portal, which provides means for end users to access the K computer through the Web. We also developed the Systems Administrator's Portal to make it easier for systems administrators to manage the large-scale computer system. In this article, we introduce key design issues in developing the User's Portal and the Systems Administrator's Portal. ---[Hiroaki Yuasa, Naoki Onishi, Kouichirou Suzuki, Atsuya Uno, Motoyoshi Kurokawa]

  • 15. Visualization Technology for the K computer (PDF)

    Visualization technology makes images or videos from the results of numerical computations performed by supercomputers. Visualizing the results of parallel computation on a super-large scale, from a few thousand to a few tens of thousands of parallel processes, requires rendering bulky computation result data at high speed. The conventional method of transferring computation result data to a visualization server raises challenges that are difficult to meet, such as the reassembly and transfer of extensive amounts of computation result files. This paper presents technologies for solving these challenges to visualize super-large-scale computations performed by the K computer and the results of applying those technologies. ---[Atsuji Ogasa, Hiroyuki Maesaka, Kiyotaka Sakamoto, Sadanori Otagiri]

  • 16. Application Software and Usage Environment for the K computer (PDF)

    The K computer is a super massively parallel computer consisting of about one million processing cores, so it is important to develop an environment that makes it easy to use. To facilitate the use of the K computer, two application software development projects, called "grand challenge projects," are being carried out, one for nanoscience and one for life science. A number of program codes that are well-optimized for the K computer will be developed in these projects, and, after project completion, users of the K computer will be able to use these program codes on demand. Five science and technology fields have been specified for promoting use of the K computer and high-performance computing. Research and development related to accomplishing the strategic goals of the two projects are being pursued, and the establishment of a research system for computational science is expected. In this article, application software development and the usage environment for the K computer are presented. ---[Satoshi Itoh]

  • 17. New Technologies of Applications Using Supercomputer (PDF)

    In recent years, with the incredible increase of computing power, we can accurately simulate macro-scale phenomena based on the laws which govern micro-scale phenomena. For example, if we fully utilize the computing capability of the K computer, it is possible to simulate the behavior of a human heart and how it responds to medicine at the cellular level, and to calculate the efficiency of an electric motor by simulating the characteristics of its magnetic material. However, a problem arises because we must properly control more than tens of thousands of parallel processes in order to get high performance out of recent supercomputers. For this reason, new computation techniques based on both the latest scientific findings and an understanding of supercomputer architecture are required. In this report, we introduce four applications as our approach to the problem described above. ---[Masahiro Watanabe, Tamon Suwa, Atsushi Furuya, Kentaro Takai]

  • 18. Fujitsu's Activities in Improving Performance of LS-DYNA Nonlinear Finite Element Analysis Software (PDF)

    The LS-DYNA nonlinear finite element analysis software package developed for structural analysis by the Livermore Software Technology Corporation (LSTC) is widely used by the automobile, aerospace, construction, military, manufacturing, and bioengineering industries. Fujitsu has been a partner with LSTC since 1996, supporting customers in Japan. A common application of LS-DYNA is car crash simulation. One way to improve the accuracy of the simulation results is to increase the number of elements in the analytical model. However, this increases the amount of computation, resulting in longer computation times, which goes against user expectations of quicker job turnaround when using high-performance computing systems. We report Fujitsu's activities in supporting higher speeds in a hybrid version of LS-DYNA applicable to large-scale parallel processing on the K computer and in improving the performance of the LS-DYNA package for car crash simulation. ---[Kenshiro Kondo]
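
The hardware figures quoted in the abstracts above (items 5, 6, and 11) can be cross-checked with a little arithmetic. The Python sketch below is not taken from any of the papers; it only combines the numbers stated in the abstracts. Function and constant names are illustrative. The system-level estimate assumes one CPU per compute node and uses 80 000 nodes as a lower bound (the abstracts say "over 80 000"), and the per-node bandwidth figure assumes that all four simultaneous directions run at the full 5 GB/s link rate.

```python
# Figures quoted in the abstracts (items 5 and 6).
PEAK_GFLOPS = 128.0   # SPARC64 VIIIfx peak performance per CPU
CLOCK_GHZ = 2.0       # operating frequency
CORES = 8             # cores per CPU
POWER_W = 58.0        # power consumption per CPU
LINK_GB_S = 5.0       # Tofu link throughput in each direction
DIRECTIONS = 4        # directions a node can communicate in at once
MIN_NODES = 80_000    # lower bound; assumes one CPU per node


def flops_per_cycle_per_core(peak_gflops, clock_ghz, cores):
    """Floating-point operations each core must complete per clock cycle."""
    return peak_gflops / (clock_ghz * cores)


def gflops_per_watt(peak_gflops, power_w):
    """Peak performance per unit of power for a single CPU."""
    return peak_gflops / power_w


def node_bandwidth_gb_s(link_gb_s, directions):
    """Aggregate one-way bandwidth if every direction runs at link rate."""
    return link_gb_s * directions


def system_peak_pflops(peak_gflops, nodes):
    """Peak performance of `nodes` CPUs, converted from GFLOPS to PFLOPS."""
    return peak_gflops * nodes / 1e6


if __name__ == "__main__":
    print(flops_per_cycle_per_core(PEAK_GFLOPS, CLOCK_GHZ, CORES))  # 8.0
    print(round(gflops_per_watt(PEAK_GFLOPS, POWER_W), 2))          # 2.21
    print(node_bandwidth_gb_s(LINK_GB_S, DIRECTIONS))               # 20.0
    print(round(system_peak_pflops(PEAK_GFLOPS, MIN_NODES), 2))     # 10.24
```

Under these assumptions, 80 000 CPUs at 128 GFLOPS each give roughly 10.24 PFLOPS, which is consistent with the "scale of 10 PFLOPS" mentioned in the compiler abstract.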