Gal Oren :: Accelerated C++ with OpenMP


Presented at the Core C++ 2023 conference.

In recent years, co-processors and accelerators such as GPUs have become increasingly prevalent in high-performance computing. These devices are designed to handle a massive amount of parallelism and have much higher computing power than traditional CPUs. OpenMP, an application programming interface (API) for shared-memory parallel programming, has adapted to this changing hardware landscape by adding directives that can map code and data onto these devices. These directives, known as target directives, allow for efficient programming of GPUs and other specialized accelerators.

Target directives can significantly improve the performance of high-performance code (such as Fortran, C, and C++) running on heterogeneous systems. They allow work to be distributed across multiple devices, enabling the programmer to take full advantage of the processing power available on each device. This can result in significant speedups, especially for code that requires a high level of parallelism.

In this talk, we will delve into the specifics of using target directives for programming GPUs. We will examine the advantages of target directives over classic shared-memory parallelism and explore how they can be used to optimize code for maximum performance. We will also discuss some of the challenges and trade-offs involved in using target directives, including issues related to memory management and data transfers between devices.

-----

Dr. Gal Oren is a scientist in the Computer Science department at Technion - Israel Institute of Technology and a senior researcher in the Scientific Computing Center at Nuclear Research Center - Negev. Oren completed his distributed computing CS Ph.D. at Ben-Gurion University in 2020, advised by Prof. Leonid Barenboim and Prof. Michael Elkin. His research interests include all aspects of scientific computing, specifically High-Performance, Super, Distributed, Parallel & Cluster Computing, and AI.
