NAS Parallel Benchmarks Version 3.4 (NPB3.4)
------------------------------------------------
NAS Parallel Benchmarks Team
NASA Ames Research Center
Moffett Field, CA 94035-1000
E-mail: npb@nas.nasa.gov
Fax: (650) 604-3957
http://www.nas.nasa.gov/Software/NPB/
================================================
INSTALLATION
For documentation on installing and running the NAS Parallel
Benchmarks, refer to subdirectory README files.
================================================
BACKGROUND
Information on NPB 3.4, including the technical reports, the
original specifications, source code, results and information
on how to submit new results, is available at:
http://www.nas.nasa.gov/Software/NPB/
================================================
Summary of New Features and Improvements
(Details are given in Changes.log.)
in NPB3.4 from NPB3.3.1:
- General changes applied to both MPI and OMP versions
* Added the class E problem size for IS, and the class F problem
size for BT, LU, SP, CG, EP, FT, and MG.
* Use Fortran modules and allocatable arrays (replacing common
blocks) to define and manage global data, and use a Fortran 2003
IEEE arithmetic function to catch NaN conditions during
verification.
* The environment variable NPB_TIMER_FLAG is now used to enable
additional timers.
* Make flags renamed: MPIF77 to MPIFC, and F77 to FC.
- MPI version improvement
* MPI codes now use Fortran 90 dynamic memory allocation, which
simplifies the compilation process. The number of processes is
determined and checked solely at runtime.
* Performance improvement of the LU benchmark.
- OMP version improvement
* Improved loop-level parallelism with the use of the COLLAPSE
clause
* Included the "blocking" version for the BT and SP benchmarks
* Included the "doacross" version for the LU benchmark
- Removed the serial version - use the OpenMP version instead
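As a rough illustration of the COLLAPSE clause mentioned in the OMP
improvements above, the sketch below is written in C rather than the
Fortran used by most of the benchmarks, and it is not taken from the
NPB sources. The function name scale_grid and the array layout are
hypothetical. With collapse(2), OpenMP merges the two outer loops into
a single iteration space of nk * nj iterations, which gives the
runtime more units of work to distribute than the outer loop alone.

```c
#include <stddef.h>

/* Hypothetical example, not NPB code: scale every element of an
 * nk x nj x ni grid stored contiguously in row-major order. */
void scale_grid(double *u, int nk, int nj, int ni, double factor)
{
    /* collapse(2) fuses the k and j loops into one iteration space
     * of nk*nj iterations that is then divided among the threads.
     * If OpenMP is not enabled, the pragma is ignored and the loops
     * run serially with the same result. */
    #pragma omp parallel for collapse(2)
    for (int k = 0; k < nk; k++) {
        for (int j = 0; j < nj; j++) {
            for (int i = 0; i < ni; i++) {
                size_t idx = ((size_t)k * nj + j) * ni + i;
                u[idx] *= factor;
            }
        }
    }
}
```

The pragma takes effect only when the compiler's OpenMP option is
enabled (for example -fopenmp with GCC or Clang). Collapsing is most
useful when the outermost loop alone has too few iterations to keep
all threads busy.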
in NPB3.3.1 from NPB3.3:
- Bug fixes for:
MPI/FT - non-portable way of broadcasting input parameters
{OMP,SER}/DC - access to out-of-bound array elements
{OMP,SER}/UA - use of uninitialized array
- Code cleanup in MPI/LU: avoid using MPI_ANY_SOURCE and delete
unused code
- Additional timers are included in the MPI version
- Executables produced for OMP and SER now use ".x" as an extension
in NPB3.3 from NPB3.2.1:
- Introduction of the Class E problem in seven of the benchmarks
(BT, SP, LU, CG, MG, FT, and EP) to stress larger parallel
computers.
- Class D added to the IS benchmark in all three implementations.
- Enable the Bucket sort option for OMP/IS.
- Introduction of the "twiddle" array in the OpenMP FT benchmark
to improve performance
- Array padding in MPI/SP was adjusted to improve performance
- Merge the vector codes for the BT and LU benchmarks into this
release.
- The hyperplane version of LU (LU-HP) is no longer included
in the distribution. Download NPB3.2.1 if needed.
in NPB3.2.1 from NPB3.2:
- A number of bug fixes for the MPI versions of {FT, LU, MG, BT} and
the OpenMP version of LU
- Improvements to the OpenMP versions of {EP, IS, UA}
(see *OMP/UA/README for a special note on UA)
in NPB3.2 from NPB3.1:
- Serial DC was converted to C from C++ (only classes S, W, A and B
are available)
- OpenMP version of DC was added (only classes S, W, A and B
are available)
- Inclusion of the new DT benchmark (MPI)
in NPB3.1 from NPB3.0 & NPB2.4:
- MPI, OpenMP, and Serial versions are now merged into one package
- Inclusion of the Class D problem in both serial and OpenMP versions
- Inclusion of the new UA benchmark (Serial & OpenMP)
- Inclusion of "LU-HP" in the OpenMP version
- Inclusion of the new DC benchmark (Serial)
- Use of relative errors for verification in both CG and MG
- Change in problem parameters for MG Class W
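The relative-error verification mentioned in the list above can be
sketched as follows. This is a C illustration under stated
assumptions, not the NPB verification code: the function name
verify_relative is hypothetical, and the tolerance is whatever the
caller supplies. A computed value passes when
|computed - reference| / |reference| does not exceed the tolerance,
so the test scales with the magnitude of the reference value.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper, not NPB code: returns true when `computed`
 * agrees with `reference` to within the relative tolerance `epsilon`.
 * Assumes `reference` is nonzero. */
bool verify_relative(double computed, double reference, double epsilon)
{
    double rel_err = fabs((computed - reference) / reference);

    /* If rel_err is NaN this comparison is false, so a NaN result
     * is reported as a verification failure. */
    return rel_err <= epsilon;
}
```

For example, verify_relative(value, reference_value, 1.0e-8) would
accept results that match the reference to roughly eight significant
digits; the tolerance shown here is illustrative only.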
The NPB IO benchmark is part of NPB3.3-MPI. Check the README file
in that subdirectory for additional information.
The Java and HPF implementations are not included in this distribution.
Please use the NPB3.0 distribution for those implementations.