|
|
Plenary Lectures
Algorithmic and Software Challenges when Moving Towards Exascale.
|
|
Jack Dongarra : University of Tennessee, Oak Ridge National Laboratory, University of Manchester
In this talk we examine how high performance computing has changed over the last ten years and look toward the future in terms of trends. These changes have had, and will continue to have, a major impact on our software. Some of the software and algorithm challenges have already been encountered, such as management of communication and memory hierarchies through a combination of compile-time and run-time techniques, but the increased scale of computation, depth of memory hierarchies, range of latencies, and increased run-time environment variability will make these problems much harder.
We will look at five areas of research that will have an important impact on the development of software and algorithms.
We will focus on the following themes:
• Redesign of software to fit multicore and hybrid architectures
• Automatically tuned application software
• Exploiting mixed precision for performance
• The importance of fault tolerance
• Communication avoiding algorithms
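To make the mixed-precision theme above concrete, the following is a minimal NumPy/SciPy sketch of mixed-precision iterative refinement: the expensive factorization is done once in single precision, and double-precision accuracy is then recovered with cheap residual corrections. The test matrix, tolerance, and iteration limit are illustrative assumptions, not material from the talk.

```python
import numpy as np
import scipy.linalg as la

# Minimal sketch of mixed-precision iterative refinement (illustrative sizes
# and tolerances): do the O(n^3) factorization in float32, then recover
# float64 accuracy with cheap O(n^2) residual corrections.
rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
b = rng.standard_normal(n)

lu, piv = la.lu_factor(A.astype(np.float32))      # expensive step, single precision
x = la.lu_solve((lu, piv), b.astype(np.float32)).astype(np.float64)

for _ in range(10):                               # refinement loop in double precision
    r = b - A @ x                                 # residual computed in float64
    if np.linalg.norm(r) <= 1e-12 * np.linalg.norm(b):
        break
    x += la.lu_solve((lu, piv), r.astype(np.float32)).astype(np.float64)

print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))   # close to double-precision accuracy
```

Because the O(n^3) work runs entirely at the lower, faster precision, this kind of scheme can substantially raise effective performance while still delivering a double-precision result, provided the problem is not too ill-conditioned.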
|
|
Stabilized and Regularized Semi-Lagrangian Meshfree Method for Fragment-Contact-Impact Problems
|
|
Jiun Shyan Chen : Civil & Environmental Engineering Department, University of California, Los Angeles, USA
The large material distortions and separations involved in fragment-contact-impact problems pose considerable difficulties for numerical simulation with Lagrangian mesh-based methods due to the high degree of roughness in the solution. In this work, we introduce a Semi-Lagrangian Reproducing Kernel Particle Method (RKPM) to address these difficulties. The evolving strong and weak discontinuities in the fragmentation processes call for stabilized nodal integration for Semi-Lagrangian RKPM. We show how to construct stabilized nodal integration that achieves consistency, stability, and convergence within a Galerkin meshfree framework. Furthermore, spurious low-energy modes exist in some stabilization methods and can be excited under high strain rates and contact-impact conditions. In this work, we also introduce a stabilization of these spurious low-energy modes for general large deformation problems. The approximations used in meshfree methods, such as moving least squares and reproducing kernel approximations, possess intrinsic nonlocal properties. These nonlocal properties of the reproducing kernel approximation are exploited to incorporate an intrinsic length scale that regularizes problems with material instabilities such as strain localization. We also introduce approximation and domain integration approaches that yield a gradient-type regularization for localization problems. For modeling fragment-impact problems, we further propose a kernel contact algorithm for multi-body contact in which the contact surfaces are not known a priori. Several fragment-contact-impact and penetration problems are presented to demonstrate the effectiveness of the proposed methods.
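To make the reproducing kernel approximation mentioned above concrete, here is a minimal 1D sketch of the RK shape-function construction with a linear basis and a cubic B-spline kernel; it does not attempt the semi-Lagrangian, stabilized, or contact algorithms of the talk, and the node set and support size are arbitrary illustrative choices.

```python
import numpy as np

def cubic_spline(z):
    """Cubic B-spline kernel on |z| <= 1, where z = distance / support size."""
    z = np.abs(z)
    w = np.zeros_like(z)
    near = z <= 0.5
    far = (z > 0.5) & (z <= 1.0)
    w[near] = 2.0 / 3.0 - 4.0 * z[near] ** 2 + 4.0 * z[near] ** 3
    w[far] = 4.0 / 3.0 - 4.0 * z[far] + 4.0 * z[far] ** 2 - 4.0 / 3.0 * z[far] ** 3
    return w

def rk_shape_functions(x, nodes, a):
    """Reproducing kernel shape functions Psi_I(x) in 1D with a linear basis."""
    d = x - nodes                                  # x - x_I for every node
    phi = cubic_spline(d / a)                      # kernel weights
    H = np.vstack((np.ones_like(d), d))            # linear basis H(x - x_I), shape (2, N)
    M = (H * phi) @ H.T                            # moment matrix  sum_I phi_I H_I H_I^T
    c = np.linalg.solve(M, np.array([1.0, 0.0]))   # M(x)^{-1} H(0)
    return phi * (c @ H)                           # Psi_I(x) = H(0)^T M^{-1} H_I phi_I

nodes = np.linspace(0.0, 1.0, 11)                  # uniform node set (illustrative)
a = 2.5 * (nodes[1] - nodes[0])                    # kernel support: ~2.5 nodal spacings
psi = rk_shape_functions(0.37, nodes, a)
print(psi.sum())                                   # ~1.0  : partition of unity
print(psi @ nodes)                                 # ~0.37 : linear fields reproduced exactly
```

The kernel support size a is the intrinsic length scale the abstract refers to: enlarging it widens the nonlocal support of each shape function, while the moment-matrix correction preserves linear completeness.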
|
|
Modal Analysis of Fluid-Structure Interaction Problems. Computational and Reduced Order Models
|
|
Roger Ohayon : Structural Mechanics and Coupled Systems Laboratory, Conservatoire National des Arts et Métiers
In parallel with direct symmetric variational formulations and finite element methods for the modal analysis of interior fluid-structure vibrations, the construction of a family of appropriate reduced order models, especially for the fluid part (as is done in structural dynamics by substructuring), is of prime importance for sensitivity analysis, multidisciplinary optimization, updating with experiments, and hybrid active/passive vibration reduction treatments for the control of those systems (one example is the modelling of visco/piezoelectric layers acting as physical interfaces).
Reduced order fluid-structure models have already been analysed. A clear distinction should be made between gases and liquids, taking into account incompressibility/compressibility as well as light-fluid/heavy-fluid considerations with gravity effects.
The purpose of this presentation is to give a review and synthesis of these aspects and related perspectives.
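As a minimal illustration of the kind of reduction referred to above, the sketch below builds the simplest structural reduced order model, modal truncation of an undamped system, on synthetic matrices; it ignores the fluid-structure coupling, compressibility, and gravity effects discussed in the talk, and the matrix sizes and number of retained modes are arbitrary assumptions.

```python
import numpy as np
from scipy.linalg import eigh

# Minimal modal-truncation sketch on synthetic structural matrices
# (sizes and retained mode count are illustrative only).
n, m = 200, 10
rng = np.random.default_rng(1)
G = rng.standard_normal((n, n))
K = G @ G.T + n * np.eye(n)                 # symmetric positive definite "stiffness"
M = np.diag(rng.uniform(1.0, 2.0, n))       # lumped "mass"

w2, Phi = eigh(K, M)                        # K phi = w^2 M phi, modes M-orthonormal
Phi_m = Phi[:, :m]                          # reduction basis: the m lowest modes

K_r = Phi_m.T @ K @ Phi_m                   # m x m reduced stiffness (diagonal here)
M_r = Phi_m.T @ M @ Phi_m                   # m x m reduced mass (identity here)

w2_r = eigh(K_r, M_r, eigvals_only=True)    # frequencies of the reduced model
print(np.allclose(w2_r, w2[:m]))            # True: retained frequencies are exact
```

The open questions addressed in the talk concern how to choose such reduction bases for the fluid part and for the coupled system, where the physical distinctions listed above matter.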
|
|
Technologies implemented in the K computer and beyond.
|
|
Yuji Oinaga : Fujitsu Limited, Japan
The presentation will highlight technologies introduced in Fujitsu's massively parallel supercomputer series, including the K computer (courtesy of RIKEN).
Fujitsu has been developing SPARC-based massively parallel supercomputers: the FX1 in 2008, the K computer in 2010, and PRIMEHPC FX10 in 2012. The architecture development conforms to the following design approach:
(1) Single CPU per node design for high memory bandwidth
(2) Step by step enhancement for massively parallel calculations
(3) Preserving application compatibility as much as possible
Through this development, Fujitsu invented fundamental technologies for scalability, efficiency, and usability. The technologies include a SIMD extension for HPC applications (HPC-ACE), automatic parallelization support for multi-threaded execution (VISIMPACT), a direct network based on a 6D mesh/torus interconnect (Tofu), and hardware support for collective communication (via Tofu and its dedicated algorithms).
We evaluated the technologies using real applications on real machines. The evaluations show that quite a few real applications achieve good efficiency, exceeding 30% of peak performance. This vector-machine level of efficiency is accomplished through thorough use of software pipelining and sophisticated tuning techniques using the Performance Analysis counters of the CPU. In addition, some real applications show that the hybrid parallel execution model outperforms the flat MPI execution model. The compiler support in the VISIMPACT technology transforms applications to the hybrid execution model with a flexible combination of processes and threads.
Fujitsu's recent activities and technology development toward exascale computing will also be introduced briefly.
|
|
Some considerations on high performance computational mechanics
|
|
Genki Yagawa : Toyo University
It is well known in the computational mechanics community that the solutions of numerical methods such as the finite element method are expected to come closer to the exact solution as one employs a finer mesh. However, if a model is divided into a uniform and very fine mesh, the number of degrees of freedom of the final system of equations becomes huge, which is usually impractical. For this reason, an efficient method is required that refines the mesh only in the important regions. In the field of fluid dynamics, it is difficult to estimate an appropriate mesh for the analysis a priori, especially in the case of a high Reynolds number. In this context, several adaptive methods have been studied as efficient meshing algorithms; they consist of two procedures: a posteriori error estimation, and remeshing based on the result of that error estimation. These steps are repeated many times within an adaptive analysis, in which the cost of meshing becomes very serious.
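As a minimal sketch of the two-procedure loop just described (error estimation followed by remeshing, repeated), the following refines a 1D mesh around a sharp layer; the target function, the crude slope-jump indicator, the threshold, and the sweep cap are all illustrative assumptions, and no actual finite element solve is performed.

```python
import numpy as np

def f(x):
    return np.tanh(50.0 * (x - 0.5))         # solution surrogate with a sharp layer

def indicator(nodes):
    """Crude per-element error indicator: jump of the piecewise-linear slope."""
    h = np.diff(nodes)
    slope = np.diff(f(nodes)) / h
    jump = np.abs(np.diff(slope))             # one jump per interior node
    eta = np.zeros(len(h))
    eta[:-1] += 0.5 * jump * h[:-1]           # share each jump with the two
    eta[1:] += 0.5 * jump * h[1:]             # neighbouring elements
    return eta

nodes = np.linspace(0.0, 1.0, 11)             # coarse initial mesh
for sweep in range(10):                       # adaptive loop: estimate, then refine
    eta = indicator(nodes)
    flagged = eta > 0.05                      # refine elements above the threshold
    if not flagged.any():
        break
    mids = 0.5 * (nodes[:-1] + nodes[1:])     # midpoints of flagged elements
    nodes = np.sort(np.concatenate((nodes, mids[flagged])))

print(len(nodes) - 1, "elements after adaptive refinement")
```

The point of the loop mirrors the abstract: elements are added only where the indicator is large, so the fine resolution concentrates around the layer rather than being applied uniformly.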
On the other hand, the development of supercomputers has made remarkable progress in several countries around the world. For example, many petascale computers are now in operation, contributing effectively to the faster development of science and technology in various areas. It is important to note that exascale computers are expected to appear around 2020 or even before. Another important issue in the computational mechanics community today is therefore how to develop efficient computational mechanics software for the era of peta/exascale computers. It is becoming more and more essential that our computational methods and software be tailored to the hardware architecture of peta/exascale computers. The conventional ways of developing software will have to change if we are to make the fullest use of these supercomputers. The question for those of us in the computational mechanics community is whether we are ready to run our codes efficiently, making the fullest possible use of petascale and future exascale computers. It is most likely that future exascale computers will be built using “low performance”, very low-energy-consuming, low-cost, and highly reliable processors. FLOPS may become practically free on exascale computers, but memory will remain very expensive. It is possible that the total number of processors will even far exceed the number of elements in an FEM model. Since local operations can be made practically free, it makes good sense to build the simplest model that delivers the most accurate possible solution. We may need to advance the theory for creating future computational mechanics.
|
|
|
|