Issue #249. OCLTrans() generates OpenCL kernels (also fixes fparser2 stmt_fns) #387

Merged
merged 30 commits into from
Jun 12, 2019
30 commits
b33b987
#249 fparser2 statement functions fix and rename_and_write outputs op…
sergisiso May 21, 2019
db105b4
#249 Fixed pytest using tmpdir fixture
sergisiso May 21, 2019
ed5d035
Merge remote-tracking branch 'origin/master' into 249_OCLTransKernels
sergisiso May 21, 2019
b35fdfe
#249 Statement function correction unit-testing
sergisiso Jun 3, 2019
0695060
#249 gen_c_code always use 'e' for scientific notation
sergisiso Jun 3, 2019
876970f
#249 Fixes in gocean1p0 unit test
sergisiso Jun 3, 2019
77ae128
Merge remote-tracking branch 'origin/master' into 249_OCLTransKernels
sergisiso Jun 3, 2019
505540c
#249 Fixed unit-test issues
sergisiso Jun 3, 2019
9ee0535
#249 Remove unnecessary renaming
sergisiso Jun 3, 2019
5d41644
#249 Added some comments
sergisiso Jun 4, 2019
2abd6c2
Merge branch '249_OCLTransKernels' of github.com:stfc/PSyclone into 2…
sergisiso Jun 4, 2019
a09dff2
#249 Added missing unit-test and fixed style issue
sergisiso Jun 4, 2019
bf48327
#249 Updated references of examples/gocean/eg3 that don't have GO_ pr…
sergisiso Jun 6, 2019
9017cb4
#249 Updated OpenCL documentation
sergisiso Jun 6, 2019
44c1e69
#249 Removed Issue reference
sergisiso Jun 6, 2019
b5dd0bd
Merge remote-tracking branch 'origin/master' into 249_OCLTransKernels
sergisiso Jun 6, 2019
1cebdce
#249 Fixed some issues with gocean/eg3
sergisiso Jun 7, 2019
63d3cfc
#249 Addressed reviewer's comments
sergisiso Jun 7, 2019
5e991ba
#249 Fixed unittest compileopencl, OCLTrans only accepts single kerne…
sergisiso Jun 10, 2019
c5a68c6
#249 Addressed reviewer's comments
sergisiso Jun 10, 2019
d4651fa
#249 Removed OpenCL single kernel renaming limitation
sergisiso Jun 10, 2019
e327a46
#249 Removed commented code on gocean/eg3
sergisiso Jun 10, 2019
3ddd9f4
#249 Removed in-progress gocean/eg3 Makefile
sergisiso Jun 10, 2019
724ede9
#249 Added Error docstring for Part_Ref handler
sergisiso Jun 11, 2019
903b643
#249 Update documentation and OCLTrans docstring
sergisiso Jun 11, 2019
47f37be
#249 Minor changes in the test suite
sergisiso Jun 11, 2019
dfe5244
#249 Fixed pylint issues
sergisiso Jun 11, 2019
8db3970
#249 Addressed reviewers comments
sergisiso Jun 12, 2019
e20fb6a
#249 Added a xfail for a module variable access, xpasses now make the…
sergisiso Jun 12, 2019
752e2f0
#387 update changelog and UG
arporter Jun 12, 2019
5 changes: 5 additions & 0 deletions changelog
@@ -154,6 +154,11 @@
48) PR #394 for #392. Fixes a bug in the way the test suite checks
for whether the graphviz package is available.

49) PR #387 for #249. Extends OCLTrans() so that all kernels within a
transformed Invoke are converted to OpenCL. Also includes a
work-around for array accesses incorrectly identified as Statement
Functions by fparser2.

release 1.7.0 20th December 2018

1) #172 and PR #173 Add support for logical declaration, the save
12 changes: 2 additions & 10 deletions doc/developer_guide/developers.rst
@@ -1893,7 +1893,8 @@ OpenCL
======

PSyclone is able to generate an OpenCL :cite:`opencl` version of
PSy-layer code for the GOcean 1.0 API. Such code may then be executed
PSy-layer code for the GOcean 1.0 API and its associated kernels.
Such code may then be executed
on devices such as GPUs and FPGAs (Field-Programmable Gate
Arrays). Since OpenCL code is very different to that which PSyclone
normally generates, its creation is handled by ``gen_ocl`` methods
@@ -1995,15 +1996,6 @@ of this setup is done, the kernel itself is launched by calling
Limitations
-----------

Currently PSyclone can only generate the OpenCL version of the PSy
layer. Execution of the resulting code requires that the kernels
themselves be converted from Fortran to OpenCL (a dialect of C) and at
present this must be done manually. Since all data accessed by an
OpenCL kernel must be passed as an argument, this conversion must also
convert any accesses to module data into routine arguments.
Work is in progress to support kernel transformation and this will be
made available in a future PSyclone release.

In OpenCL, all tasks to be performed (whether copying data or kernel
execution) are associated with a command queue. Tasks submitted to
different command queues may then be executed concurrently,
5 changes: 2 additions & 3 deletions doc/user_guide/examples.rst
@@ -71,9 +71,8 @@ installed.
Example 3: OpenCL
^^^^^^^^^^^^^^^^^

Example of the use of PSyclone to generate an OpenCL version of the
PSy layer. The kernels are not yet transformed automatically (Issue
#249).
Example of the use of PSyclone to generate an OpenCL driver version of
the PSy layer and OpenCL kernels.

Example 4: Kernels containing use statements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
41 changes: 23 additions & 18 deletions doc/user_guide/transformations.rst
@@ -57,8 +57,8 @@ provided to show the available transformations

.. _sec_transformations_available:

Available
---------
Available transformations
-------------------------

Most transformations are generic as the schedule structure is
independent of the API, however it often makes sense to specialise
@@ -148,9 +148,6 @@ can be found in the API-specific sections).
:members: apply
:noindex:

.. note:: OpenCL support is still under development. See
:ref:`opencl_dev` for more details.

####

.. autoclass:: psyclone.transformations.OMPLoopTrans
@@ -518,22 +515,30 @@ transformation.
OpenCL
------

In common with OpenMP, the conversion of the generated code to use
OpenCL is performed by a transformation (``OCLTrans`` - see the
:ref:`sec_transformations_available` Section above). Currently this
transformation is only supported for the GOcean1.0 API and is applied
to the whole InvokeSchedule of an Invoke. This means that all kernels in
that Invoke will be executed on the OpenCL device. At present the
``OCLTrans`` transformation only alters the generated PSy-layer code. It
is currently the user's responsibility to convert the actual kernel code
from Fortran into OpenCL. Work is underway to extend PSyclone in
order to perform this translation automatically.

The OpenCL code generated by PSyclone is still Fortran and makes use
OpenCL is added to a code by using the ``OCLTrans`` transformation (see the
:ref:`sec_transformations_available` Section above).
Currently this transformation is only supported for the GOcean1.0 API and
is applied to the whole InvokeSchedule of an Invoke.
This transformation will add an OpenCL driver infrastructure to the PSy layer
and generate an OpenCL kernel for each of the Invoke kernels.
This means that all kernels in that Invoke will be executed on the OpenCL
device.
The PSy-layer OpenCL code generated by PSyclone is still Fortran and makes use
of the FortCL library (https://github.com/stfc/FortCL) to access
OpenCL functionality. It also relies upon the OpenCL support provided
OpenCL functionality. It also relies upon the OpenCL support provided
by the dl_esm_inf library (https://github.com/stfc/dl_esm_inf).

At the moment we don't apply additional transformations to OpenCL kernels.
This means that all references to the same kernel will have identical
generated OpenCL output (with identical names). Nevertheless, we can use
the `--kernel-renaming` psyclone argument to generate either a single output
file (with the `single` option) or multiple index-postfixed (identical)
versions of the kernel (with the `multi` option, which is the default).
Because OpenCL kernels are linked at run-time, it is up to the run-time
environment to specify which of the kernels to use. For instance, one could
merge multiple kernels together in a single binary file and
use the `PSYCLONE_KERNELS_FILE` environment variable provided by the FortCL library.
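
The difference between the `single` and `multi` options described above can be pictured with a small Python sketch. This is an illustration only: the function name `opencl_output_names` and the file-name pattern are hypothetical and are not PSyclone's actual naming scheme; the sketch merely shows one output for `single` versus one index-postfixed output per kernel reference for `multi`.

```python
def opencl_output_names(kernel_name, n_references, mode="multi"):
    """Return illustrative output file names for one kernel.

    The naming pattern is an assumption made for this sketch only.
    """
    if mode == "single":
        # One file is shared by every reference to the kernel.
        return [f"{kernel_name}.cl"]
    if mode == "multi":
        # One (identical) index-postfixed file per reference.
        return [f"{kernel_name}_{idx}.cl" for idx in range(n_references)]
    raise ValueError(f"unknown kernel-renaming mode: {mode!r}")


if __name__ == "__main__":
    print(opencl_output_names("compute_cu_code", 2, mode="single"))
    print(opencl_output_names("compute_cu_code", 2, mode="multi"))
```

In either case the generated kernel bodies are identical, so the choice only affects how many files the run-time environment has to pick from.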

The introduction of OpenCL code generation in PSyclone has been
largely motivated by the need to target Field Programmable Gate Array
(FPGA) accelerator devices. It is not currently designed to target the other
4 changes: 1 addition & 3 deletions examples/gocean/README
@@ -55,9 +55,7 @@ Example 3
---------

Illustrates the use of PSyclone to generate an OpenCL driver layer for
a four-kernel invoke. Currently the kernels themselves must be converted
from Fortran to OpenCL manually but work is in progress to automate this
(Issue #249).
a four-kernel invoke and an OpenCL version of each of the kernels.

Example 4
---------
22 changes: 16 additions & 6 deletions examples/gocean/eg3/README
@@ -31,10 +31,10 @@
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#------------------------------------------------------------------------------
# Author A. R. Porter, STFC Daresbury Lab
# Author A. R. Porter and S. Siso, STFC Daresbury Lab

The directory containing this file contains an example of the use of
PSyclone to generate OpenCL driver code with the GOcean 1.0 API.
PSyclone to generate OpenCL code with the GOcean 1.0 API.

In order to use PSyclone you must first install it, ideally with pip.
See ../../../README.md for more details.
@@ -52,8 +52,18 @@ provided with a transformation script::
psyclone -api "gocean1.0" -s ./ocl_trans.py alg.f90

where ocl_trans.py simply applies the psyclone.transformations.OCLTrans
transformation to the Schedule of the Invoke.
transformation to the Schedule of the Invoke. This will write the OpenCL
driver layer to stdout and, for each of the kernels referenced in alg.f90,
a 'kernel_name'.cl file containing that kernel translated to OpenCL.

Currently the (Fortran) kernels called by the Invoke must be manually
translated into OpenCL. This step will be automated in a future
release of PSyclone.
Each OpenCL kernel needs to be compiled before building the driver layer.
For example, the steps to generate the code using the Intel OpenCL SDK
(https://software.intel.com/en-us/opencl-sdk) are::

psyclone -oalg psyalg.f90 -opsy psylayer.f90 -api "gocean1.0" \
-s ./ocl_trans.py alg.f90

# Pre-build OpenCL kernels
ioc64 -cmd=build -device=cpu -input=kernels.cl -spirv64=kernels.spirv \
-bo="-cl-std=CL1.2"
export PSYCLONE_KERNELS_FILE=kernels.spirv
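
The ocl_trans.py script referred to in this README is not part of the diff shown here. A minimal sketch of what such a transformation script could look like is given below. It assumes PSyclone's convention of a `trans(psy)` entry point and that `OCLTrans.apply` accepts the Schedule of an Invoke; the helper `apply_to_all_invokes` is a hypothetical name introduced for this sketch, and the return value of `apply` may differ between PSyclone versions, so it is ignored here.

```python
def apply_to_all_invokes(psy, transformation):
    """Apply `transformation` to the schedule of every invoke in `psy`.

    Hypothetical helper: it only relies on `psy.invokes.invoke_list`
    and on each invoke exposing a `schedule` attribute.
    """
    for invoke in psy.invokes.invoke_list:
        # Assumption: apply() marks the schedule for OpenCL generation;
        # whatever it returns is not needed in this sketch.
        transformation.apply(invoke.schedule)
    return psy


def trans(psy):
    """Entry point called by the psyclone script given with -s."""
    # Imported here so that the module can be loaded (e.g. for testing
    # the helper) on a machine without PSyclone installed.
    from psyclone.transformations import OCLTrans
    return apply_to_all_invokes(psy, OCLTrans())
```

Keeping the loop in a separate helper makes it possible to exercise the script's logic with stand-in objects before running it through psyclone itself.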
12 changes: 6 additions & 6 deletions examples/gocean/eg3/alg.f90
@@ -53,9 +53,9 @@ program simple

integer :: ncycle

model_grid = grid_type(ARAKAWA_C, &
(/BC_PERIODIC,BC_PERIODIC,BC_NONE/), &
OFFSET_SW)
model_grid = grid_type(GO_ARAKAWA_C, &
(/GO_BC_PERIODIC,GO_BC_PERIODIC,GO_BC_NONE/), &
GO_OFFSET_SW)

! Create fields on this grid
p_fld = r2d_field(model_grid, T_POINTS)
@@ -66,13 +66,13 @@ program simple
z_fld = r2d_field(model_grid, F_POINTS)
h_fld = r2d_field(model_grid, T_POINTS)

do ncycle=1,itmax
write(*,*) "Simulation start"
do ncycle=1, 100
call invoke( compute_cu(CU_fld, p_fld, u_fld), &
compute_cv(CV_fld, p_fld, v_fld), &
compute_z(z_fld, p_fld, u_fld, v_fld), &
compute_h(h_fld, p_fld, u_fld, v_fld) )

end do
write(*,*) "Simulation end"

end program simple
7 changes: 3 additions & 4 deletions examples/gocean/eg3/compute_cu_mod.f90
@@ -44,11 +44,10 @@ module compute_cu_mod

private

public invoke_compute_cu
public compute_cu, compute_cu_code

type, extends(kernel_type) :: compute_cu
type(arg), dimension(3) :: meta_args = &
type(go_arg), dimension(3) :: meta_args = &
(/ go_arg(GO_WRITE, GO_CU, GO_POINTWISE), & ! cu
go_arg(GO_READ, GO_CT, GO_POINTWISE), & ! p
go_arg(GO_READ, GO_CU, GO_POINTWISE) & ! u
@@ -76,8 +75,8 @@ module compute_cu_mod
subroutine compute_cu_code(i, j, cu, p, u)
implicit none
integer, intent(in) :: I, J
real(wp), intent(out), dimension(:,:) :: cu
real(wp), intent(in), dimension(:,:) :: p, u
real(go_wp), intent(out), dimension(:,:) :: cu
real(go_wp), intent(in), dimension(:,:) :: p, u


CU(I,J) = 0.5d0*(P(i,J)+P(I-1,J))*U(I,J)
5 changes: 2 additions & 3 deletions examples/gocean/eg3/compute_h_mod.f90
@@ -41,7 +41,6 @@ module compute_h_mod

private

public invoke_compute_h
public compute_h, compute_h_code

type, extends(kernel_type) :: compute_h
@@ -75,8 +74,8 @@ module compute_h_mod
SUBROUTINE compute_h_code(i, j, h, p, u, v)
IMPLICIT none
integer, intent(in) :: I, J
REAL(wp), INTENT(out), DIMENSION(:,:) :: h
REAL(wp), INTENT(in), DIMENSION(:,:) :: p, u, v
REAL(go_wp), INTENT(out), DIMENSION(:,:) :: h
REAL(go_wp), INTENT(in), DIMENSION(:,:) :: p, u, v

H(I,J) = P(I,J)+.25d0*(U(I+1,J)*U(I+1,J)+U(I,J)*U(I,J) + &
V(I,J+1)*V(I,J+1)+V(I,J)*V(I,J))
1 change: 0 additions & 1 deletion examples/gocean/eg3/compute_z_mod.f90
@@ -44,7 +44,6 @@ module compute_z_mod

private

public invoke_compute_z
public compute_z, compute_z_code

type, extends(kernel_type) :: compute_z
Binary file modified psyclone.pdf
Binary file not shown.
4 changes: 4 additions & 0 deletions setup.cfg
@@ -41,3 +41,7 @@
[pycodestyle]
ignore = E266,E121,E123,E126,E133,E226,E241,E242,E704,W503,W504,W505

# Ensure that any XPASS ("unexpectedly passing") results are reported
# as failures in the test suite.
[tool:pytest]
xfail_strict=true
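
The effect of `xfail_strict=true` on a test marked as an expected failure can be summarised with a small decision function. This is a simplified model written for illustration: `xfail_outcome` is a hypothetical name, and the sketch ignores other xfail options such as `raises` and `run`.

```python
def xfail_outcome(test_passed, strict):
    """Outcome reported for a test carrying the xfail marker.

    Simplified model: only the pass/fail result and the strict
    setting are taken into account.
    """
    if not test_passed:
        # The test failed as expected.
        return "xfailed"
    # The test unexpectedly passed: strict mode turns this into a failure.
    return "failed" if strict else "xpassed"


if __name__ == "__main__":
    for passed in (False, True):
        for strict in (False, True):
            print(passed, strict, xfail_outcome(passed, strict))
```

With the setting added in this hunk, the unexpectedly passing case is reported as a failure of the test suite rather than being silently counted as an XPASS.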