v5.2.2 release

OpenMP · Apr 16, 2024 · 11f2efc
1 parent 075683d
commit 11f2efc

Showing 183 changed files with 5,545 additions and 3,897 deletions.
diff --git a/Chap_SIMD.tex b/Chap_SIMD.tex
@@ -8,34 +8,34 @@
 Many processors have SIMD (vector) units that can perform simultaneously 
 2, 4, 8 or more executions of the same operation (by a single SIMD unit). 
 
-Loops without loop-carried backward dependency (or with dependency preserved using 
-ordered simd) are candidates for vectorization by the compiler for 
+Loops without loop-carried backward dependences (or with dependences preserved using
+\kcode{ordered simd}) are candidates for vectorization by the compiler for 
 execution with SIMD units. In addition, with state-of-the-art vectorization 
-technology and \code{declare simd} directive extensions for function vectorization
+technology and \kcode{declare simd} directive extensions for function vectorization
 in the OpenMP 4.5 specification, loops with function calls can be vectorized as well. 
 The basic idea is that a scalar function call in a loop can be replaced by a vector version 
 of the function, and the loop can be vectorized simultaneously by combining a loop 
-vectorization (\code{simd} directive on the loop) and a function 
-vectorization (\code{declare simd} directive on the function).
+vectorization (\kcode{simd} directive on the loop) and a function 
+vectorization (\kcode{declare simd} directive on the function).
 
-A \code{simd} construct states that SIMD operations be performed on the
+A \kcode{simd} construct states that SIMD operations be performed on the
 data within the loop.  A number of clauses are available to provide
-data-sharing attributes (\code{private}, \code{linear}, \code{reduction} and 
-\code{lastprivate}).  Other clauses provide vector length preference/restrictions 
-(\code{simdlen} / \code{safelen}), loop fusion (\code{collapse}), and data 
-alignment (\code{aligned}).
+data-sharing attributes (\kcode{private}, \kcode{linear}, \kcode{reduction} and 
+\kcode{lastprivate}).  Other clauses provide vector length preference/restrictions 
+(\kcode{simdlen} / \kcode{safelen}), loop fusion (\kcode{collapse}), and data 
+alignment (\kcode{aligned}).
 
-The \code{declare simd} directive designates
+The \kcode{declare simd} directive designates
 that a vector version of the function should also be constructed for 
-execution within loops that contain the function and have a \code{simd} 
-directive.  Clauses provide argument specifications (\code{linear},
-\code{uniform}, and \code{aligned}), a requested vector length 
-(\code{simdlen}), and designate whether the function is always/never 
-called conditionally in a loop (\code{notinbranch}/\code{inbranch}). 
+execution within loops that contain the function and have a \kcode{simd} 
+directive.  Clauses provide argument specifications (\kcode{linear},
+\kcode{uniform}, and \kcode{aligned}), a requested vector length 
+(\kcode{simdlen}), and designate whether the function is always/never 
+called conditionally in a loop (\kcode{notinbranch}/\kcode{inbranch}). 
 The latter is for optimizing performance.
 
-Also, the \code{simd} construct has been combined with the worksharing loop 
-constructs (\code{for simd} and \code{do simd}) to enable simultaneous thread 
+Also, the \kcode{simd} construct has been combined with the worksharing loop 
+constructs (\kcode{for simd} and \kcode{do simd}) to enable simultaneous thread 
 execution in different SIMD units.  
 %Hence, the \code{simd} construct can be 
 %used alone on a loop to direct vectorization (SIMD execution), or in 
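
Note on the hunk above: the combination of function vectorization (declare simd) and loop vectorization (simd) that the chapter text describes can be sketched in C as follows. This is a minimal illustration, not code from the commit; the function names scaled_square and sum_scaled_squares are hypothetical, and whether a vector version is actually generated is up to the compiler.

```c
#include <stddef.h>

/* Hypothetical helper. The directive requests that a vector version of the
   function also be built; `scale` has the same value in every SIMD lane, and
   the function is never called conditionally in the loop below. */
#pragma omp declare simd uniform(scale) notinbranch
static float scaled_square(float x, float scale)
{
    return scale * x * x;
}

/* Returns the sum of scale * a[i]^2 over the first n elements.
   The simd construct asks for SIMD execution of the loop, and the
   reduction clause gives each lane a private partial sum. */
float sum_scaled_squares(const float *a, size_t n, float scale)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t i = 0; i < n; i++) {
        sum += scaled_square(a[i], scale);
    }
    return sum;
}
```

If the pragmas are ignored (for example, when OpenMP support is not enabled), the code still computes the same result serially.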

diff --git a/Chap_affinity.tex b/Chap_affinity.tex
@@ -1,7 +1,7 @@
 \cchapter{OpenMP Affinity}{affinity}
 \label{chap:openmp_affinity}
 
-OpenMP Affinity consists of a \code{proc\_bind} policy (thread affinity policy) and a specification of
+OpenMP Affinity consists of a \kcode{proc_bind} policy (thread affinity policy) and a specification of
 places (``location units'' or \plc{processors} that may be cores, hardware
 threads, sockets, etc.).  
 OpenMP Affinity enables users to bind computations on specific places.
@@ -11,13 +11,13 @@
 if two or more cores (hardware threads, sockets, etc.) have been assigned to a given place.
 
 Often the binding can be managed without resorting to explicitly setting places.
-Without the specification of places in the \code{OMP\_PLACES} variable, 
+Without the specification of places in the \kcode{OMP_PLACES} variable, 
 the OpenMP runtime will distribute and bind threads using the entire range of processors for 
-the OpenMP program, according to the \code{OMP\_PROC\_BIND} environment variable
-or the \code{proc\_bind} clause.  When places are specified, the OMP runtime
+the OpenMP program, according to the \kcode{OMP_PROC_BIND} environment variable
+or the \kcode{proc_bind} clause.  When places are specified, the OMP runtime
 binds threads to the places according to a default distribution policy, or
-those specified in the \code{OMP\_PROC\_BIND} environment variable or the
-\code{proc\_bind} clause.
+those specified in the \kcode{OMP_PROC_BIND} environment variable or the
+\kcode{proc_bind} clause.
 
 In the OpenMP Specifications document a processor refers to an execution unit that
 is enabled for an OpenMP thread to use.  A processor is a core when there is
@@ -31,7 +31,7 @@
 
 The processors available to a process may be a subset of the system's
 processors.  This restriction may be the result of a 
-wrapper process controlling the execution (such as \code{numactl} on Linux systems), 
+wrapper process controlling the execution (such as \plc{numactl} on Linux systems), 
 compiler options, library-specific environment variables, or default
 kernel settings.  For instance, the execution of multiple MPI processes,
 launched on a single compute node, will each have a subset of processors as
@@ -53,20 +53,20 @@
 
 Threads of a team are positioned onto places in a compact manner, a 
 scattered distribution, or onto the primary thread's place, by setting the 
-\code{OMP\_PROC\_BIND} environment variable or the \code{proc\_bind} clause  to 
-\code{close}, \code{spread}, or \code{primary} (\code{master} has been deprecated), respectively.  When 
-\code{OMP\_PROC\_BIND} is set to FALSE no binding is enforced; and 
+\kcode{OMP_PROC_BIND} environment variable or the \kcode{proc_bind} clause  to 
+\kcode{close}, \kcode{spread}, or \kcode{primary} (\kcode{master} has been deprecated), respectively.  When 
+\kcode{OMP_PROC_BIND} is set to FALSE no binding is enforced; and 
 when the value is TRUE, the binding is implementation defined to 
-a set of places in the \code{OMP\_PLACES} variable or to places 
-defined by the implementation if the \code{OMP\_PLACES} variable 
+a set of places in the \kcode{OMP_PLACES} variable or to places 
+defined by the implementation if the \kcode{OMP_PLACES} variable 
 is not set. 
 
-The \code{OMP\_PLACES} variable can also be set to an abstract name 
-(\code{threads}, \code{cores}, \code{sockets}) to specify that a place is
+The \kcode{OMP_PLACES} variable can also be set to an abstract name 
+(\kcode{threads}, \kcode{cores}, \kcode{sockets}) to specify that a place is
 either a single hardware thread, a core, or a socket, respectively. 
-This description of the \code{OMP\_PLACES} is most useful when the 
+This description of the \kcode{OMP_PLACES} is most useful when the 
 number of threads is equal to the number of hardware threads, cores
-or sockets.  It can also be used with a \code{close} or \code{spread} 
+or sockets.  It can also be used with a \kcode{close} or \kcode{spread} 
 distribution policy when the equality doesn't hold.
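
Note on the hunk above: one way to observe the effect of a binding policy is to record the place each thread runs on. The sketch below is an assumption-laden illustration rather than part of the commit: record_places is a hypothetical helper, it assumes max_threads is at least 1, and it relies on the OpenMP 4.5 routine omp_get_place_num, which returns -1 when the thread is not bound to a place.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: runs a parallel region with a spread policy and
   stores, for each thread number, the place that thread executes on.
   Entries for threads that were not created stay at -1.
   Returns the actual team size. Assumes max_threads >= 1. */
int record_places(int *place_of_thread, int max_threads)
{
    int team_size = 1;

    for (int i = 0; i < max_threads; i++) {
        place_of_thread[i] = -1;
    }

    #pragma omp parallel num_threads(max_threads) proc_bind(spread)
    {
#ifdef _OPENMP
        int tid = omp_get_thread_num();
        place_of_thread[tid] = omp_get_place_num();

        #pragma omp single
        team_size = omp_get_num_threads();
#else
        /* Without OpenMP there is one thread and no notion of places. */
        place_of_thread[0] = -1;
#endif
    }
    return team_size;
}
```

Running a caller of this function with, for instance, OMP_PLACES=cores set in the environment should show threads distributed over distinct places, although the exact numbering is implementation defined.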
 
 

diff --git a/Chap_data_environment.tex b/Chap_data_environment.tex
@@ -1,12 +1,12 @@
 \cchapter{Data Environment}{data_environment}
 \label{chap:data_environment}
 The OpenMP \plc{data environment} contains data attributes of variables and
-objects.  Many constructs (such as \code{parallel}, \code{simd}, \code{task}) 
+objects.  Many constructs (such as \kcode{parallel}, \kcode{simd}, \kcode{task}) 
 accept clauses to control \plc{data-sharing} attributes
 of referenced variables in the construct, where \plc{data-sharing} applies to
 whether the attribute of the variable is \plc{shared}, 
 is \plc{private} storage, or has special operational characteristics 
-(as found in the \code{firstprivate}, \code{lastprivate}, \code{linear}, or \code{reduction} clause).
+(as found in the \kcode{firstprivate}, \kcode{lastprivate}, \kcode{linear}, or \kcode{reduction} clause).
 
 The data environment for a device (distinguished as a \plc{device data environment})
 is controlled on the host by \plc{data-mapping} attributes, which determine the
@@ -21,57 +21,57 @@
 
 Certain variables and objects have predetermined attributes.  
 A commonly found case is the loop iteration variable in associated loops 
-of a \code{for} or \code{do} construct. It has a private data-sharing attribute.
+of a \kcode{for} or \kcode{do} construct. It has a private data-sharing attribute.
 Variables with predetermined data-sharing attributes cannot be listed in a data-sharing clause; but there are some
 exceptions (mainly concerning loop iteration variables).
 
 Variables with explicitly determined data-sharing attributes are those that are
 referenced in a given construct and are listed in a data-sharing attribute
 clause on the construct. Some of the common data-sharing clauses are:
-\code{shared}, \code{private}, \code{firstprivate}, \code{lastprivate}, 
-\code{linear}, and \code{reduction}. % Are these all of them?
+\kcode{shared}, \kcode{private}, \kcode{firstprivate}, \kcode{lastprivate}, 
+\kcode{linear}, and \kcode{reduction}. % Are these all of them?
 
 Variables with implicitly determined data-sharing attributes are those
 that are referenced in a given construct, do not have predetermined
 data-sharing attributes, and are not listed in a data-sharing
 attribute clause of an enclosing construct.
 For a complete list of variables and objects with predetermined and
 implicitly determined attributes, please refer to the
-\plc{Data-sharing Attribute Rules for Variables Referenced in a Construct}
+\docref{Data-sharing Attribute Rules for Variables Referenced in a Construct}
 subsection of the OpenMP Specifications document.  
 
 \bigskip
 DATA-MAPPING ATTRIBUTES
 
-The \code{map} clause on a device construct explicitly specifies how the list items in
+The \kcode{map} clause on a device construct explicitly specifies how the list items in
 the clause are mapped from the encountering task's data environment (on the host)
 to the corresponding item in the device data environment (on the device).
 The common \plc{list items} are arrays, array sections, scalars, pointers, and
 structure elements (members). 
 
 Procedures and global variables have predetermined data mapping if they appear
-within the list or block of a \code{declare}~\code{target} directive. Also, a C/C++ pointer
+within the list or block of a \kcode{declare target} directive. Also, a C/C++ pointer
 is mapped as a zero-length array section, as is a C++ variable that is a reference to a pointer.
 % Waiting for response from Eric on this.
 
-Without explicit mapping, non-scalar and non-pointer variables within the scope of the \code{target}
-construct are implicitly mapped with a \plc{map-type} of \code{tofrom}.
-Without explicit mapping, scalar variables within the scope of the \code{target}
+Without explicit mapping, non-scalar and non-pointer variables within the scope of the \kcode{target}
+construct are implicitly mapped with a \plc{map-type} of \kcode{tofrom}.
+Without explicit mapping, scalar variables within the scope of the \kcode{target}
 construct are not mapped, but have an implicit firstprivate data-sharing
 attribute. (That is, the value of the original variable is given to a private
 variable of the same name on the device.) This behavior can be changed with
-the \code{defaultmap} clause.
+the \kcode{defaultmap} clause.
 
-The \code{map} clause can appear on \code{target}, \code{target data} and 
-\code{target enter/exit data} constructs.  The operations of creation and
+The \kcode{map} clause can appear on \kcode{target}, \kcode{target data} and 
+\kcode{target enter/exit data} constructs.  The operations of creation and
 removal of device storage as well as assignment of the original list item 
 values to the corresponding list items may be complicated when the list 
 item appears on multiple constructs or when the host and device storage 
 is shared. In these cases the item's reference count, the number of times
-it has been referenced (+1 on entry and -1 on exited) in nested (structured)
+it has been referenced (increment by 1 on entry and decrement by 1 on exit) in nested (structured)
 map regions and/or accumulative (unstructured) mappings, determines the operation.
-Details of the \code{map} clause and reference count operation are specified 
-in the \plc{map Clause} subsection of the OpenMP Specifications document.
+Details of the \kcode{map} clause and reference count operation are specified 
+in the \docref{\kcode{map} Clause} subsection of the OpenMP Specifications document.
 
 
 %===== Examples Sections =====
@@ -81,6 +81,7 @@
 \input{data_environment/fort_loopvar}
 \input{data_environment/fort_sp_common}
 \input{data_environment/fort_sa_private}
+\input{data_environment/fort_shared_var}
 \input{data_environment/carrays_fpriv}
 \input{data_environment/lastprivate}
 \input{data_environment/reduction}
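
Note on the hunks above: the explicitly determined data-sharing attributes named in the chapter text (firstprivate, lastprivate, reduction) can be illustrated with a short C sketch. The function sharing_demo and its variable names are hypothetical and do not come from the example files listed in the diff.

```c
/* Hypothetical demonstration of three data-sharing clauses.
   - offset: firstprivate, each thread gets a private copy initialized to 5
   - last:   lastprivate, the value from the sequentially last iteration
             is copied back to the original variable
   - total:  reduction, private partial sums are combined with +
   The loop variable i is private by a predetermined rule. */
int sharing_demo(int n, int *last_out)
{
    int offset = 5;
    int last = -1;
    int total = 0;

    #pragma omp parallel for firstprivate(offset) lastprivate(last) reduction(+:total)
    for (int i = 0; i < n; i++) {
        last = i + offset;
        total += i;
    }

    *last_out = last;
    return total;
}
```

For n = 4 the function returns 6 (0+1+2+3) and stores 8 (3+5) through last_out, whether or not the loop runs in parallel.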

diff --git a/Chap_devices.tex b/Chap_devices.tex
@@ -1,9 +1,9 @@
 \cchapter{Devices}{devices}
 \label{chap:devices}
 
-The \code{target} construct consists of a \code{target} directive 
-and an execution region. The \code{target} region is executed on
-the default device or the device specified in the \code{device} 
+The \kcode{target} construct consists of a \kcode{target} directive 
+and an execution region. The \kcode{target} region is executed on
+the default device or the device specified in the \kcode{device} 
 clause. 
 
 In OpenMP version 4.0, by default, all variables within the lexical
@@ -16,39 +16,39 @@
 The constructs that explicitly
 create storage, transfer data, and free storage on the device
 are categorized as structured and unstructured. The
-\code{target} \code{data} construct is structured. It creates
-a data region around \code{target} constructs, and is
+\kcode{target data} construct is structured. It creates
+a data region around \kcode{target} constructs, and is
 convenient for providing persistent data throughout multiple
-\code{target} regions. The \code{target} \code{enter} \code{data} and 
-\code{target} \code{exit} \code{data} constructs are unstructured, because 
+\kcode{target} regions. The \kcode{target enter data} and 
+\kcode{target exit data} constructs are unstructured, because 
 they can occur anywhere and do not support a ``structure''
-(a region) for enclosing \code{target} constructs, as does the
-\code{target} \code{data} construct. 
+(a region) for enclosing \kcode{target} constructs, as does the
+\kcode{target data} construct. 
 
-The \code{map} clause is used on \code{target} 
+The \kcode{map} clause is used on \kcode{target} 
 constructs and the data-type constructs to map host data. It 
-specifies the device storage and data movement \code{to} and \code{from}
+specifies the device storage and data movement \plc{to} and \plc{from}
 the device, and controls on the storage duration.
 
 There is an important change in the OpenMP 4.5 specification
 that alters the data model for scalar variables and C/C++ pointer variables.
 The default behavior for scalar variables and C/C++ pointer variables
-in a 4.5 compliant code is \code{firstprivate}. Example
+in a 4.5 compliant code is \kcode{firstprivate}. Example
 codes that have been updated to reflect this new behavior are
 annotated with a description that describes changes required
 for correct execution. Often it is a simple matter of mapping
-the variable as \code{tofrom} to obtain the intended 4.0 behavior.
+the variable as \kcode{tofrom} to obtain the intended 4.0 behavior.
 
 In OpenMP version 4.5 the mechanism for target
 execution is specified as occurring through a \plc{target task}. 
-When the \code{target} construct is encountered a new 
-\plc{target task} is generated. The \plc{target task} 
-completes after the \code{target} region has executed and all data 
+When the \kcode{target} construct is encountered a new 
+target task is generated. The target task 
+completes after the \kcode{target} region has executed and all data 
 transfers have finished.
 
 This new specification does not affect the execution of 
 pre-4.5 code; it is a necessary element for asynchronous 
-execution of the \code{target} region when using the new \code{nowait} 
+execution of the \kcode{target} region when using the new \kcode{nowait} 
 clause introduced in OpenMP 4.5.
 
 
@@ -59,6 +59,7 @@
 \input{devices/target_structure_mapping}
 \input{devices/target_fort_allocatable_array_mapping}
 \input{devices/array_sections}
+\input{devices/usm}
 \input{devices/C++_virtual_functions}
 \input{devices/array_shaping}
 \input{devices/target_mapper}
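
Note on the hunks above: the difference between the structured target data construct and the unstructured target enter data / target exit data constructs can be sketched as below. All function names are hypothetical. The comments about transfers describe the expected reference-count behavior on a device with separate memory; when no device is available, the regions run on the host and produce the same values.

```c
/* Structured mapping: the target data region keeps v on the device across
   both target regions. The inner map clauses find v already present
   (reference count greater than zero), so they are not expected to
   trigger additional transfers. */
void scale_then_shift(double *v, int n, double factor)
{
    #pragma omp target data map(tofrom: v[0:n])
    {
        #pragma omp target map(tofrom: v[0:n])
        for (int i = 0; i < n; i++) {
            v[i] *= factor;
        }

        #pragma omp target map(tofrom: v[0:n])
        for (int i = 0; i < n; i++) {
            v[i] += 1.0;
        }
    }
}

/* Unstructured mapping: creation and removal of device storage happen in
   separate functions, with no enclosing region. */
void device_acquire(double *v, int n)
{
    #pragma omp target enter data map(to: v[0:n])
}

void device_increment(double *v, int n)
{
    #pragma omp target map(tofrom: v[0:n])
    for (int i = 0; i < n; i++) {
        v[i] += 1.0;
    }
}

void device_release(double *v, int n)
{
    #pragma omp target exit data map(from: v[0:n])
}
```

A caller would pair device_acquire with device_release around any number of device_increment calls; the host copy is only guaranteed to be up to date after device_release.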

diff --git a/Chap_directives.tex b/Chap_directives.tex
@@ -2,7 +2,7 @@
 \label{chap:directive_syntax}
 \index{directive syntax}
 
-OpenMP \emph{directives} use base-language mechanisms to specify OpenMP program behavior.
+OpenMP \plc{directives} use base-language mechanisms to specify OpenMP program behavior.
 In C code, the directives are formed exclusively with pragmas, whereas in C++
 code, directives are formed from either pragmas or attributes.
 Fortran directives are formed with comments in free form and fixed form sources (codes).
@@ -20,36 +20,36 @@
 
 C/C++ pragmas
 \begin{indentedcodelist}
-\code{\#pragma omp} \plc{directive-specification}
+\kcode{\#pragma omp} \plc{directive-specification}
 \end{indentedcodelist}
 
 C++ attributes
 \begin{indentedcodelist}
-\code{[[omp :: directive(} \plc{directive-specification} \code{)]]}
-\code{[[using omp : directive(} \plc{directive-specification} \code{)]]}
+\kcode{[[omp :: directive( \plc{directive-specification} )]]}
+\kcode{[[using omp : directive( \plc{directive-specification} )]]}
 \end{indentedcodelist}
 
 Fortran comments
 \begin{indentedcodelist}
-\code{!\$omp} \plc{directive-specification}
+\scode{!$omp} \plc{directive-specification}
 \end{indentedcodelist}
 
-where \code{c\$omp} and \code{*\$omp} may be used in Fortran fixed form sources.
+where \scode{c$omp} and \scode{*$omp} may be used in Fortran fixed form sources.
 
 Most OpenMP directives accept clauses that alter the semantics of the directive in some way, 
 and some directives also accept parenthesized arguments that follow the directive name. 
-A clause may just be a keyword (e.g., \scode{untied}) or it may also accept argument lists 
-(e.g., \scode{shared(x,y,z)}) and/or optional modifiers (e.g., \scode{tofrom} in 
-\scode{map(tofrom:}~\scode{x,y,z)}).
+A clause may just be a keyword (e.g., \kcode{untied}) or it may also accept argument lists 
+(e.g., \kcode{shared(\ucode{x,y,z})}) and/or optional modifiers (e.g., \kcode{tofrom} in 
+\kcode{map(tofrom: \ucode{x,y,z})}).
 Clause modifiers may be ``simple'' or ``complex'' -- a complex modifier consists of a
 keyword followed by one or more parameters, bracketed by parentheses, while a simple 
-modifier does not. An example of a complex modifier is the \scode{iterator} modifier, 
-as in \scode{map(iterator(i=0:n),}~\scode{tofrom:}~\scode{p[i])}, or the \scode{step} modifier, as in 
-\scode{linear(x:}~\scode{ref,}~\scode{step(4))}. 
-In the preceding examples, \scode{tofrom} and \scode{ref} are simple modifiers.
+modifier does not. An example of a complex modifier is the \kcode{iterator} modifier, 
+as in \kcode{map(iterator(\ucode{i=0:n}), tofrom: \ucode{p[i]})}, or the \kcode{step} modifier, as in 
+\kcode{linear(\ucode{x}: ref, step(\ucode{4}))}. 
+In the preceding examples, \kcode{tofrom} and \kcode{ref} are simple modifiers.
 
-For Fortran, a declarative directive (such as \code{declare}~\code{reduction})
-must appear after any \code{USE}, \code{IMPORT}, and \code{IMPLICIT} statements
+For Fortran, a declarative directive (such as \kcode{declare reduction})
+must appear after any \bcode{USE}, \bcode{IMPORT}, and \bcode{IMPLICIT} statements
 in the specification part.
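
Note on the hunk above: the clause forms mentioned in the text, a bare keyword such as untied and a clause with an argument list such as shared(x,y,z), look like this in the C pragma syntax. The function sum_three is a hypothetical illustration and is not taken from the commit.

```c
/* Hypothetical example of directive syntax in C:
   - shared(x, y, z, result) and num_threads(2) are clauses with arguments
   - untied is a clause that is just a keyword
   The task is guaranteed to finish by the implicit barrier at the end of
   the single region, so result is complete when the parallel region ends. */
int sum_three(int x, int y, int z)
{
    int result = 0;

    #pragma omp parallel shared(x, y, z, result) num_threads(2)
    {
        #pragma omp single
        {
            #pragma omp task untied shared(result)
            result = x + y + z;
        }
    }
    return result;
}
```

In C++ the same directives could instead be written with the attribute form shown in the hunk, for example [[omp::directive(parallel shared(x, y, z, result))]], provided the compiler supports OpenMP attributes.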