OpenMP

Author: Blaise Barney, Lawrence Livermore National Laboratory, UCRL-MI-133316

Table of Contents

  1. Abstract
  2. Introduction
  3. OpenMP Programming Model
  4. OpenMP API Overview
  5. Compiling OpenMP Programs
  6. OpenMP Directives
    1. Fortran Directive Format
    2. C/C++ Directive Format
    3. Directive Scoping
    4. PARALLEL Construct
    5. Exercise 1
    6. Work-Sharing Constructs
      1. DO / for Directive
      2. SECTIONS Directive
      3. WORKSHARE Directive
      4. SINGLE Directive
    7. Combined Parallel Work-Sharing Constructs
    8. TASK Construct
    9. Exercise 2
    10. Synchronization Constructs
      1. MASTER Directive
      2. CRITICAL Directive
      3. BARRIER Directive
      4. TASKWAIT Directive
      5. ATOMIC Directive
      6. FLUSH Directive
      7. ORDERED Directive
    11. THREADPRIVATE Directive
    12. Data Scope Attribute Clauses
      1. PRIVATE Clause
      2. SHARED Clause
      3. DEFAULT Clause
      4. FIRSTPRIVATE Clause
      5. LASTPRIVATE Clause
      6. COPYIN Clause
      7. COPYPRIVATE Clause
      8. REDUCTION Clause
    13. Clauses / Directives Summary
    14. Directive Binding and Nesting Rules
  7. Run-Time Library Routines
  8. Environment Variables
  9. Thread Stack Size and Thread Binding
  10. Monitoring, Debugging and Performance Analysis Tools for OpenMP
  11. Exercise 3
  12. References and More Information
  13. Appendix A: Run-Time Library Routines


Abstract


OpenMP is an Application Program Interface (API), jointly defined by a group of major computer hardware and software vendors. OpenMP provides a portable, scalable model for developers of shared memory parallel applications. The API supports C/C++ and Fortran on a wide variety of architectures. This tutorial covers most of the major features of OpenMP 3.1, including its various constructs and directives for specifying parallel regions, work sharing, synchronization and data environment. Runtime library functions and environment variables are also covered. This tutorial includes both C and Fortran example codes and a lab exercise.

Level/Prerequisites: This tutorial is ideal for those who are new to parallel programming with OpenMP. A basic understanding of parallel programming in C or Fortran is required. For those who are unfamiliar with parallel programming in general, the material covered in EC3500: Introduction to Parallel Computing would be helpful.



Introduction

What is OpenMP?

OpenMP Is:

OpenMP Is Not:

Goals of OpenMP:

History:

Release History

This tutorial refers to OpenMP version 3.1. Syntax and features of newer releases are not currently covered.

References:



OpenMP Programming Model


Shared Memory Model:

[Figures: Uniform Memory Access; Non-Uniform Memory Access]

Thread Based Parallelism:

Explicit Parallelism:

Fork - Join Model:

Compiler Directive Based:

Nested Parallelism:

Dynamic Threads:

I/O:

Memory Model: FLUSH Often?



OpenMP API Overview


Three Components:

Compiler Directives:

Run-time Library Routines:

Environment Variables:

Example OpenMP Code Structure:



Compiling OpenMP Programs


LC OpenMP Implementations:

Compiling:



OpenMP Directives

Fortran Directives Format

Format: (case insensitive)

Example:

Fixed Form Source:

Free Form Source:

General Rules:

OpenMP Directives

C / C++ Directives Format

Format:

Example:

General Rules:



OpenMP Directives

Directive Scoping

Do we do this now...or do it later? Oh well, let's get it over with early...

Static (Lexical) Extent:

Orphaned Directive:

Dynamic Extent:

Example:

Why Is This Important?



OpenMP Directives

PARALLEL Region Construct

Purpose:

Format:

Notes:

How Many Threads?

Dynamic Threads:

Nested Parallel Regions:

Clauses:

Restrictions:


Example: Parallel Region
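
A minimal sketch of a parallel region in C. The function name hello_parallel is illustrative and not part of OpenMP, and the `#ifdef _OPENMP` guards are an assumption that lets the same file build serially when OpenMP is not enabled. Every thread in the team executes the block, and only the master thread records the team size.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Forks a team of threads, has each one print a greeting,
   and returns the size of the team (1 in a serial build). */
int hello_parallel(void)
{
    int nthreads = 1;   /* declared outside the region, so shared */

    #pragma omp parallel
    {
        int tid = 0;    /* declared inside the region, so private to each thread */
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        printf("Hello World from thread = %d\n", tid);

        /* Only the master thread writes the shared variable */
        #pragma omp master
        {
#ifdef _OPENMP
            nthreads = omp_get_num_threads();
#endif
        }
    }   /* implied barrier: the team joins and only the master continues */

    return nthreads;
}
```

Calling hello_parallel() from main prints one line per thread. The order of the lines is not deterministic.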



OpenMP Exercise 1

Getting Started

Overview:
  • Login to the workshop cluster using your workshop username and OTP token
  • Copy the exercise files to your home directory
  • Familiarize yourself with LC's OpenMP environment
  • Write a simple "Hello World" OpenMP program
  • Successfully compile your program
  • Successfully run your program
  • Modify the number of threads used to run your program


    Approx. 20 minutes



OpenMP Directives

Work-Sharing Constructs

Types of Work-Sharing Constructs:

Restrictions:



OpenMP Directives

Work-Sharing Constructs
DO / for Directive

Purpose:

Format:

Clauses:

Restrictions:


Example: DO / for Directive
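
A minimal sketch of the for directive in C, assuming a simple vector addition as the workload. The names vector_add, N and chunk are illustrative. Iterations are handed out to the team in chunks using a dynamic schedule, and nowait removes the barrier at the end of the loop.

```c
#define N 1000

/* Computes c[i] = a[i] + b[i] with the iterations shared among the
   threads of the team. Returns the last element of c as a simple check. */
float vector_add(void)
{
    float a[N], b[N], c[N];
    const int chunk = 100;   /* iterations handed out per request */
    int i;

    /* Serial initialization */
    for (i = 0; i < N; i++)
        a[i] = b[i] = i * 1.0f;

    #pragma omp parallel shared(a, b, c) private(i)
    {
        /* Threads do not synchronize at the end of the loop (nowait);
           the barrier at the end of the parallel region still applies. */
        #pragma omp for schedule(dynamic, chunk) nowait
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
    }

    return c[N - 1];
}
```

Each iteration writes a distinct element of c, so no synchronization is needed inside the loop.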



OpenMP Directives

Work-Sharing Constructs
SECTIONS Directive

Purpose:

Format:

Clauses:

Questions:

Restrictions:


Example: SECTIONS Directive
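
A minimal sketch of the SECTIONS directive in C, assuming two independent loops as the units of work. The function name sections_demo and its output parameters are illustrative. Each section is executed once by one thread in the team; which thread runs which section is up to the implementation.

```c
#define N 1000

/* Runs two independent loops as separate sections and reports the
   last element of each result array through the output parameters. */
void sections_demo(float *sum_last, float *prod_last)
{
    float a[N], b[N], c[N], d[N];
    int i;

    /* Serial initialization */
    for (i = 0; i < N; i++) {
        a[i] = i * 1.0f;
        b[i] = i * 2.0f;
    }

    #pragma omp parallel shared(a, b, c, d) private(i)
    {
        #pragma omp sections nowait
        {
            #pragma omp section
            for (i = 0; i < N; i++)
                c[i] = a[i] + b[i];

            #pragma omp section
            for (i = 0; i < N; i++)
                d[i] = a[i] * b[i];
        }
    }   /* end of parallel region: both sections are complete here */

    *sum_last = c[N - 1];
    *prod_last = d[N - 1];
}
```

With more than two threads, the extra threads have no section to execute and skip ahead because of nowait.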



OpenMP Directives

Work-Sharing Constructs
WORKSHARE Directive

Purpose:

Format:

Restrictions:


Example: WORKSHARE Directive



OpenMP Directives

Work-Sharing Constructs
SINGLE Directive

Purpose:

Format:

Clauses:

Restrictions:



OpenMP Directives

Combined Parallel Work-Sharing Constructs



OpenMP Directives

TASK Construct

Purpose:

Format:

Clauses and Restrictions:



OpenMP Exercise 2

Work-Sharing Constructs

Overview:
  • Login to the LC workshop cluster, if you are not already logged in
  • Work-Sharing DO/for construct examples: review, compile and run
  • Work-Sharing SECTIONS construct example: review, compile and run


    Approx. 20 minutes



OpenMP Directives

Synchronization Constructs



OpenMP Directives

Synchronization Constructs
MASTER Directive

Purpose:

Format:

Restrictions:



OpenMP Directives

Synchronization Constructs
CRITICAL Directive

Purpose:

Format:

Notes:

Restrictions:


Example: CRITICAL Construct
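
A minimal sketch of the CRITICAL construct in C. The function name critical_count is illustrative. All threads in the team try to increment the shared variable x, and the critical region ensures that only one thread at a time performs the update.

```c
/* Every thread increments a shared counter exactly once.
   The result equals the number of threads in the team. */
int critical_count(void)
{
    int x = 0;

    #pragma omp parallel shared(x)
    {
        /* Without the critical region the read-modify-write of x
           by different threads could interleave and lose updates. */
        #pragma omp critical
        x = x + 1;
    }   /* end of parallel region */

    return x;
}
```

For a single scalar update such as this one, the ATOMIC directive is usually a lighter-weight alternative.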



OpenMP Directives

Synchronization Constructs
BARRIER Directive

Purpose:

Format:

Restrictions:



OpenMP Directives

Synchronization Constructs
TASKWAIT Directive

Purpose:

Format:

Restrictions:



OpenMP Directives

Synchronization Constructs
ATOMIC Directive

Purpose:

Format:

Restrictions:



OpenMP Directives

Synchronization Constructs
FLUSH Directive

Purpose:

Format:

Notes:



OpenMP Directives

Synchronization Constructs
ORDERED Directive

Purpose:

Format:

Restrictions:



OpenMP Directives

THREADPRIVATE Directive

Purpose:

Format:

Notes:

Restrictions:



OpenMP Directives

Data Scope Attribute Clauses


PRIVATE Clause

Purpose:

Format:

Notes:


SHARED Clause

Purpose:

Format:

Notes:


DEFAULT Clause

Purpose:

Format:

Notes:

Restrictions:


FIRSTPRIVATE Clause

Purpose:

Format:

Notes:


LASTPRIVATE Clause

Purpose:

Format:

Notes:


COPYIN Clause

Purpose:

Format:

Notes:


COPYPRIVATE Clause

Purpose:

Format:


REDUCTION Clause

Purpose:

Format:

Example: REDUCTION - Vector Dot Product:
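
A minimal sketch of a vector dot product using the REDUCTION clause in C. The function name dot_product and its parameters are illustrative. Each thread accumulates into a private copy of result, and the private copies are combined with + into the shared variable at the end of the loop.

```c
/* Returns the dot product of the first n elements of a and b. */
double dot_product(const double *a, const double *b, int n)
{
    double result = 0.0;
    int i;

    #pragma omp parallel for default(shared) private(i) \
            schedule(static) reduction(+:result)
    for (i = 0; i < n; i++)
        result += a[i] * b[i];

    /* The combined value is available after the loop */
    return result;
}
```

For example, with a = {1, 2, 3} and b = {4, 5, 6}, dot_product(a, b, 3) returns 32. For general floating-point data the result can differ slightly between runs, because the order in which partial sums are combined is not specified.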

Restrictions:



OpenMP Directives

Clauses / Directives Summary



OpenMP Directives

Directive Binding and Nesting Rules

This section is provided mainly as a quick reference on rules which govern OpenMP directives and binding. Users should consult their implementation documentation and the OpenMP standard for other rules and restrictions.

Directive Binding:

Directive Nesting:



Run-Time Library Routines


Overview:



Environment Variables

OMP_SCHEDULE

Applies only to DO, PARALLEL DO (Fortran) and for, parallel for (C/C++) directives whose schedule clause is set to RUNTIME. The value of this variable determines how the iterations of the loop are scheduled across threads. For example:

setenv OMP_SCHEDULE "guided, 4"
setenv OMP_SCHEDULE "dynamic"
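
As a minimal sketch of where this setting takes effect, the C loop below uses schedule(runtime), so the schedule type and chunk size are taken from OMP_SCHEDULE rather than fixed at compile time. The function name sum_runtime_schedule is illustrative.

```c
/* Sums the integers 0 .. n-1. The way iterations are divided among
   threads is decided at run time from OMP_SCHEDULE. */
long sum_runtime_schedule(int n)
{
    long total = 0;
    int i;

    #pragma omp parallel for schedule(runtime) reduction(+:total)
    for (i = 0; i < n; i++)
        total += i;

    return total;
}
```

The result is the same for every schedule; only the assignment of iterations to threads, and therefore the load balance, changes. Loops that specify any other schedule type ignore the variable.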

OMP_NUM_THREADS

Sets the maximum number of threads to use during execution. For example:

setenv OMP_NUM_THREADS 8

OMP_DYNAMIC

Enables or disables dynamic adjustment of the number of threads available for execution of parallel regions. Valid values are TRUE or FALSE. For example:

setenv OMP_DYNAMIC TRUE

Implementation notes:

  • Your implementation may or may not support this feature.

OMP_PROC_BIND

Enables or disables the binding of threads to processors. Valid values are TRUE or FALSE. For example:

setenv OMP_PROC_BIND TRUE

Implementation notes:

  • Your implementation may or may not support this feature.

OMP_NESTED

Enables or disables nested parallelism. Valid values are TRUE or FALSE. For example:

setenv OMP_NESTED TRUE

Implementation notes:

  • Your implementation may or may not support this feature. If nested parallelism is supported, it is often only nominal, in that a nested parallel region may only have one thread.
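
One way to observe the effect of this setting is to open a parallel region inside another and look at the size of the inner team. The C sketch below assumes a request for two threads at each level; the function name nested_inner_team_size is illustrative, and the `#ifdef _OPENMP` guards let the file build serially as well.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the number of threads in a parallel region that is nested
   inside another parallel region. A value of 1 means the inner
   region was serialized. */
int nested_inner_team_size(void)
{
    int inner = 1;

    #pragma omp parallel num_threads(2)
    {
        /* Only the outer master thread opens the inner region */
        #pragma omp master
        {
            #pragma omp parallel num_threads(2)
            {
                /* Binds to the inner team: its master records the size */
                #pragma omp master
                {
#ifdef _OPENMP
                    inner = omp_get_num_threads();
#endif
                }
            }
        }
    }

    return inner;
}
```

With nested parallelism disabled or unsupported, the function returns 1. With it enabled, the function may return 2, although the implementation is still free to provide fewer threads than requested.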

OMP_STACKSIZE

Controls the size of the stack for created (non-Master) threads. Examples:

setenv OMP_STACKSIZE 2000500B
setenv OMP_STACKSIZE "3000 k "
setenv OMP_STACKSIZE 10M
setenv OMP_STACKSIZE " 10 M "
setenv OMP_STACKSIZE "20 m "
setenv OMP_STACKSIZE " 1G"
setenv OMP_STACKSIZE 20000

Implementation notes:

  • Your implementation may or may not support this feature.

OMP_WAIT_POLICY

Provides a hint to an OpenMP implementation about the desired behavior of waiting threads. A compliant OpenMP implementation may or may not abide by the setting of the environment variable. Valid values are ACTIVE and PASSIVE. ACTIVE specifies that waiting threads should mostly be active, i.e., consume processor cycles, while waiting. PASSIVE specifies that waiting threads should mostly be passive, i.e., not consume processor cycles, while waiting. The details of the ACTIVE and PASSIVE behaviors are implementation defined. Examples:

setenv OMP_WAIT_POLICY ACTIVE
setenv OMP_WAIT_POLICY active
setenv OMP_WAIT_POLICY PASSIVE
setenv OMP_WAIT_POLICY passive

Implementation notes:

  • Your implementation may or may not support this feature.

OMP_MAX_ACTIVE_LEVELS

Controls the maximum number of nested active parallel regions. The value of this environment variable must be a non-negative integer. The behavior of the program is implementation defined if the requested value of OMP_MAX_ACTIVE_LEVELS is greater than the maximum number of nested active parallel levels an implementation can support, or if the value is not a non-negative integer. Example:

setenv OMP_MAX_ACTIVE_LEVELS 2

Implementation notes:

  • Your implementation may or may not support this feature.

OMP_THREAD_LIMIT

Sets the maximum number of OpenMP threads to use for the whole OpenMP program. The value of this environment variable must be a positive integer. The behavior of the program is implementation defined if the requested value of OMP_THREAD_LIMIT is greater than the number of threads an implementation can support, or if the value is not a positive integer. Example:

setenv OMP_THREAD_LIMIT 8

Implementation notes:

  • Your implementation may or may not support this feature.
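
Several of the settings in this section can be checked from inside a program through the run-time library. The C sketch below collects a few of them with OpenMP 3.x routines; the omp_settings struct, the function name query_omp_settings and the serial fallback values are assumptions made for illustration.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Snapshot of settings that the environment variables influence. */
typedef struct {
    int max_threads;        /* OMP_NUM_THREADS */
    int dynamic;            /* OMP_DYNAMIC: 1 if enabled, 0 otherwise */
    int max_active_levels;  /* OMP_MAX_ACTIVE_LEVELS */
    int thread_limit;       /* OMP_THREAD_LIMIT */
} omp_settings;

/* Reads the current values from the run-time library.
   In a serial build the placeholder values below are returned. */
omp_settings query_omp_settings(void)
{
    omp_settings s = {1, 0, 0, 1};
#ifdef _OPENMP
    s.max_threads = omp_get_max_threads();
    s.dynamic = (omp_get_dynamic() != 0);
    s.max_active_levels = omp_get_max_active_levels();
    s.thread_limit = omp_get_thread_limit();
#endif
    return s;
}
```

Printing these fields at program start is a quick way to confirm that the environment was picked up as intended. The stack size and wait policy have no corresponding query routine in OpenMP 3.1.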


Thread Stack Size and Thread Binding


Thread Stack Size:

Thread Binding:



Monitoring, Debugging and Performance Analysis Tools for OpenMP


Monitoring and Debugging Threads:


Performance Analysis Tools:



OpenMP Exercise 3

Assorted

Overview:
  • Login to the workshop cluster, if you are not already logged in
  • Orphaned directive example: review, compile, run
  • Get OpenMP implementation environment information
  • Check out the "bug" programs







This completes the tutorial.




 
References and More Information



Appendix A: Run-Time Library Routines


OMP_SET_NUM_THREADS

Purpose:

Format:

Notes & Restrictions:


OMP_GET_NUM_THREADS

Purpose:

Format:

Notes & Restrictions:


OMP_GET_MAX_THREADS

Purpose:

Notes & Restrictions:


OMP_GET_THREAD_NUM

Purpose:

Format:

Notes & Restrictions:

Examples:


OMP_GET_THREAD_LIMIT

Purpose:

Format:

Notes:


OMP_GET_NUM_PROCS

Purpose:

Format:


OMP_IN_PARALLEL

Purpose:

Format:

Notes & Restrictions:


OMP_SET_DYNAMIC

Purpose:

Format:

Notes & Restrictions:


OMP_GET_DYNAMIC

Purpose:

Format:

Notes & Restrictions:


OMP_SET_NESTED

Purpose:

Format:

Notes & Restrictions:


OMP_GET_NESTED

Purpose:

Format:

Notes & Restrictions:


OMP_SET_SCHEDULE

Purpose:

Format:


OMP_GET_SCHEDULE

Purpose:

Format:


OMP_SET_MAX_ACTIVE_LEVELS

Purpose:

Format:

Notes & Restrictions:


OMP_GET_MAX_ACTIVE_LEVELS

Purpose:

Format:


OMP_GET_LEVEL

Purpose:

Format:

Notes & Restrictions:


OMP_GET_ANCESTOR_THREAD_NUM

Purpose:

Format:

Notes & Restrictions:


OMP_GET_TEAM_SIZE

Purpose:

Format:

Notes & Restrictions:


OMP_GET_ACTIVE_LEVEL

Purpose:

Format:

Notes & Restrictions:


OMP_IN_FINAL

Purpose:

Format:


OMP_INIT_LOCK
OMP_INIT_NEST_LOCK

Purpose:

Format:

Notes & Restrictions:


OMP_DESTROY_LOCK
OMP_DESTROY_NEST_LOCK

Purpose:

Format:

Notes & Restrictions:


OMP_SET_LOCK
OMP_SET_NEST_LOCK

Purpose:

Format:

Notes & Restrictions:


OMP_UNSET_LOCK
OMP_UNSET_NEST_LOCK

Purpose:

Format:

Notes & Restrictions:


OMP_TEST_LOCK
OMP_TEST_NEST_LOCK

Purpose:

Format:

Notes & Restrictions:


OMP_GET_WTIME

Purpose:

Format:


OMP_GET_WTICK

Purpose:

Format: