Issue #249. OCLTrans() generates OpenCL kernels (also fixes fparser2 stmt_fns) #387

Merged
merged 30 commits into from
Jun 12, 2019
Changes from 27 commits
b33b987
#249 fparser2 statement functions fix and rename_and_write outputs op…
sergisiso May 21, 2019
db105b4
#249 Fixed pytest using tmpdir fixture
sergisiso May 21, 2019
ed5d035
Merge remote-tracking branch 'origin/master' into 249_OCLTransKernels
sergisiso May 21, 2019
b35fdfe
#249 Statement function correction unit-testing
sergisiso Jun 3, 2019
0695060
#249 gen_c_code always use 'e' for scientific notation
sergisiso Jun 3, 2019
876970f
#249 Fixes in gocean1p0 unit test
sergisiso Jun 3, 2019
77ae128
Merge remote-tracking branch 'origin/master' into 249_OCLTransKernels
sergisiso Jun 3, 2019
505540c
#249 Fixed unit-test issues
sergisiso Jun 3, 2019
9ee0535
#249 Remove unnecessary renaming
sergisiso Jun 3, 2019
5d41644
#249 Added some comments
sergisiso Jun 4, 2019
2abd6c2
Merge branch '249_OCLTransKernels' of github.com:stfc/PSyclone into 2…
sergisiso Jun 4, 2019
a09dff2
#249 Added missing unit-test and fixed style issue
sergisiso Jun 4, 2019
bf48327
#249 Updated references of examples/gocean/eg3 that don't have GO_ pr…
sergisiso Jun 6, 2019
9017cb4
#249 Updated OpenCL documentation
sergisiso Jun 6, 2019
44c1e69
#249 Removed Issue reference
sergisiso Jun 6, 2019
b5dd0bd
Merge remote-tracking branch 'origin/master' into 249_OCLTransKernels
sergisiso Jun 6, 2019
1cebdce
#249 Fixed some issues with gocean/eg3
sergisiso Jun 7, 2019
63d3cfc
#249 Addressed reviewer's comments
sergisiso Jun 7, 2019
5e991ba
#249 Fixed unittest compileopencl, OCLTrans only accepts single kerne…
sergisiso Jun 10, 2019
c5a68c6
#249 Addressed reviewer's comments
sergisiso Jun 10, 2019
d4651fa
#249 Removed OpenCL single kernel renaming limitation
sergisiso Jun 10, 2019
e327a46
#249 Removed commented code on gocean/eg3
sergisiso Jun 10, 2019
3ddd9f4
#249 Removed in-progress gocean/eg3 Makefile
sergisiso Jun 10, 2019
724ede9
#249 Added Error docstring for Part_Ref handler
sergisiso Jun 11, 2019
903b643
#249 Update documentation and OCLTrans docstring
sergisiso Jun 11, 2019
47f37be
#249 Minor changes in the test suite
sergisiso Jun 11, 2019
dfe5244
#249 Fixed pylint issues
sergisiso Jun 11, 2019
8db3970
#249 Addressed reviewers comments
sergisiso Jun 12, 2019
e20fb6a
#249 Added a xfail for a module variable access, xpasses now make the…
sergisiso Jun 12, 2019
752e2f0
#387 update changelog and UG
arporter Jun 12, 2019
12 changes: 2 additions & 10 deletions doc/developer_guide/developers.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1893,7 +1893,8 @@ OpenCL
======

PSyclone is able to generate an OpenCL :cite:`opencl` version of
PSy-layer code for the GOcean 1.0 API. Such code may then be executed
PSy-layer code for the GOcean 1.0 API and its associated kernels.
Such code may then be executed
on devices such as GPUs and FPGAs (Field-Programmable Gate
Arrays). Since OpenCL code is very different to that which PSyclone
normally generates, its creation is handled by ``gen_ocl`` methods
Expand Down Expand Up @@ -1995,15 +1996,6 @@ of this setup is done, the kernel itself is launched by calling
Limitations
-----------

Currently PSyclone can only generate the OpenCL version of the PSy
layer. Execution of the resulting code requires that the kernels
themselves be converted from Fortran to OpenCL (a dialect of C) and at
present this must be done manually. Since all data accessed by an
OpenCL kernel must be passed as an argument, this conversion must also
convert any accesses to module data into routine arguments.
Work is in progress to support kernel transformation and this will be
made available in a future PSyclone release.

In OpenCL, all tasks to be performed (whether copying data or kernel
execution) are associated with a command queue. Tasks submitted to
different command queues may then be executed concurrently,
Expand Down
5 changes: 2 additions & 3 deletions doc/user_guide/examples.rst
Original file line number Diff line number Diff line change
Expand Up @@ -71,9 +71,8 @@ installed.
Example 3: OpenCL
^^^^^^^^^^^^^^^^^

Example of the use of PSyclone to generate an OpenCL version of the
PSy layer. The kernels are not yet transformed automatically (Issue
#249).
Example of the use of PSyclone to generate an OpenCL driver version of
the PSy layer and OpenCL kernels.

Example 4: Kernels containing use statements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Expand Down
41 changes: 23 additions & 18 deletions doc/user_guide/transformations.rst
Original file line number Diff line number Diff line change
Expand Up @@ -57,8 +57,8 @@ provided to show the available transformations

.. _sec_transformations_available:

Available
---------
Available transformations
-------------------------

Most transformations are generic as the schedule structure is
independent of the API, however it often makes sense to specialise
Expand Down Expand Up @@ -148,9 +148,6 @@ can be found in the API-specific sections).
:members: apply
:noindex:

.. note:: OpenCL support is still under development. See
:ref:`opencl_dev` for more details.

####

.. autoclass:: psyclone.transformations.OMPLoopTrans
Expand Down Expand Up @@ -518,22 +515,30 @@ transformation.
OpenCL
------

In common with OpenMP, the conversion of the generated code to use
OpenCL is performed by a transformation (``OCLTrans`` - see the
:ref:`sec_transformations_available` Section above). Currently this
transformation is only supported for the GOcean1.0 API and is applied
to the whole InvokeSchedule of an Invoke. This means that all kernels in
that Invoke will be executed on the OpenCL device. At present the
``OCLTrans`` transformation only alters the generated PSy-layer code. It
is currently the user's responsibility to convert the actual kernel code
from Fortran into OpenCL. Work is underway to extend PSyclone in
order to perform this translation automatically.

The OpenCL code generated by PSyclone is still Fortran and makes use
OpenCL is added to a code by using the ``OCLTrans`` transformation (see the
:ref:`sec_transformations_available` Section above).
Currently this transformation is only supported for the GOcean1.0 API and
is applied to the whole InvokeSchedule of an Invoke.
This transformation will add an OpenCL driver infrastructure to the PSy layer
and generate an OpenCL kernel for each of the Invoke kernels.
This means that all kernels in that Invoke will be executed on the OpenCL
device.
The PSy-layer OpenCL code generated by PSyclone is still Fortran and makes use
of the FortCL library (https://github.com/stfc/FortCL) to access
OpenCL functionality. It also relies upon the OpenCL support provided
OpenCL functionality. It also relies upon the OpenCL support provided
by the dl_esm_inf library (https://github.com/stfc/dl_esm_inf).

At the moment we don't apply additional transformations to OpenCL kernels;
this means that all references to the same kernel will have identical
generated OpenCL output (with identical names). Nevertheless, we can use
the `--kernel-renaming` psyclone argument to generate either a single output
file (with the `single` option) or multiple index-postfixed (identical)
versions of the kernel (with the `multi` option, which is the default).
Because OpenCL kernels are linked at run-time, it is up to the run-time
environment to specify which of the kernels to use. For instance, one could
merge multiple kernels together in a single binary file and
use the `PSYCLONE_KERNELS_FILE` environment variable provided by the FortCL library.

The introduction of OpenCL code generation in PSyclone has been
largely motivated by the need to target Field Programmable Gate Array
(FPGA) accelerator devices. It is not currently designed to target the other
Expand Down
4 changes: 1 addition & 3 deletions examples/gocean/README
Original file line number Diff line number Diff line change
Expand Up @@ -55,9 +55,7 @@ Example 3
---------

Illustrates the use of PSyclone to generate an OpenCL driver layer for
a four-kernel invoke. Currently the kernels themselves must be converted
from Fortran to OpenCL manually but work is in progress to automate this
(Issue #249).
a four-kernel invoke and an OpenCL version of each of the kernels.

Example 4
---------
Expand Down
22 changes: 16 additions & 6 deletions examples/gocean/eg3/README
Original file line number Diff line number Diff line change
Expand Up @@ -31,10 +31,10 @@
# ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
# POSSIBILITY OF SUCH DAMAGE.
#------------------------------------------------------------------------------
# Author A. R. Porter, STFC Daresbury Lab
# Author A. R. Porter and S. Siso, STFC Daresbury Lab

The directory containing this file contains an example of the use of
PSyclone to generate OpenCL driver code with the GOcean 1.0 API.
PSyclone to generate OpenCL code with the GOcean 1.0 API.

In order to use PSyclone you must first install it, ideally with pip.
See ../../../README.md for more details.
Expand All @@ -52,8 +52,18 @@ provided with a transformation script::
psyclone -api "gocean1.0" -s ./ocl_trans.py alg.f90

where ocl_trans.py simply applies the psyclone.transformations.OCLTrans
transformation to the Schedule of the Invoke.
transformation to the Schedule of the Invoke. This will write the OpenCL
driver layer to stdout and, for each kernel referenced in alg.f90, a
'kernel_name'.cl file containing that kernel translated to OpenCL.

Currently the (Fortran) kernels called by the Invoke must be manually
translated into OpenCL. This step will be automated in a future
release of PSyclone.
Each OpenCL kernel needs to be compiled before building the driver layer.
For example, the steps to generate the code using the Intel OpenCL SDK
(https://software.intel.com/en-us/opencl-sdk) are::

psyclone -oalg psyalg.f90 -opsy psylayer.f90 -api "gocean1.0" \
-s ./ocl_trans.py alg.f90

# Pre-build OpenCL kernels
ioc64 -cmd=build -device=cpu -input=kernels.cl -spirv64=kernels.spirv \
-bo="-cl-std=CL1.2"
export PSYCLONE_KERNELS_FILE=kernels.spirv
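
The build command above reads a single `kernels.cl`, while PSyclone writes one `.cl` file per kernel. One way to bridge the two is sketched below. It is an assumption of this note, not part of the example: the helper name `merge_kernels` and the choice to simply concatenate the sources are hypothetical. It also assumes each kernel appears only once in the directory (for instance with the `single` renaming option), since identical copies with identical kernel names would likely clash when concatenated.

```python
import glob
import os


def merge_kernels(source_dir, output_name="kernels.cl"):
    """Concatenate every .cl file in source_dir into one file.

    Hypothetical helper: returns the list of source files merged,
    in sorted order. The output file itself is never merged.
    """
    output_path = os.path.abspath(os.path.join(source_dir, output_name))
    sources = sorted(
        path for path in glob.glob(os.path.join(source_dir, "*.cl"))
        if os.path.abspath(path) != output_path
    )
    with open(output_path, "w") as merged:
        for path in sources:
            with open(path) as kernel:
                # Separate kernels with a blank line for readability
                merged.write(kernel.read().rstrip("\n") + "\n\n")
    return sources
```

Calling `merge_kernels(".")` after running psyclone would leave a `kernels.cl` in the current directory that can be passed to the `-input` option shown above.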
12 changes: 6 additions & 6 deletions examples/gocean/eg3/alg.f90
Original file line number Diff line number Diff line change
Expand Up @@ -53,9 +53,9 @@ program simple

integer :: ncycle

model_grid = grid_type(ARAKAWA_C, &
(/BC_PERIODIC,BC_PERIODIC,BC_NONE/), &
OFFSET_SW)
model_grid = grid_type(GO_ARAKAWA_C, &
(/GO_BC_PERIODIC,GO_BC_PERIODIC,GO_BC_NONE/), &
GO_OFFSET_SW)

! Create fields on this grid
p_fld = r2d_field(model_grid, T_POINTS)
Expand All @@ -66,13 +66,13 @@ program simple
z_fld = r2d_field(model_grid, F_POINTS)
h_fld = r2d_field(model_grid, T_POINTS)

do ncycle=1,itmax
write(*,*) "Simulation start"
do ncycle=1, 100
call invoke( compute_cu(CU_fld, p_fld, u_fld), &
compute_cv(CV_fld, p_fld, v_fld), &
compute_z(z_fld, p_fld, u_fld, v_fld), &
compute_h(h_fld, p_fld, u_fld, v_fld) )

end do
write(*,*) "Simulation end"

end program simple
7 changes: 3 additions & 4 deletions examples/gocean/eg3/compute_cu_mod.f90
Original file line number Diff line number Diff line change
Expand Up @@ -44,11 +44,10 @@ module compute_cu_mod

private

public invoke_compute_cu
public compute_cu, compute_cu_code

type, extends(kernel_type) :: compute_cu
type(arg), dimension(3) :: meta_args = &
type(go_arg), dimension(3) :: meta_args = &
(/ go_arg(GO_WRITE, GO_CU, GO_POINTWISE), & ! cu
go_arg(GO_READ, GO_CT, GO_POINTWISE), & ! p
go_arg(GO_READ, GO_CU, GO_POINTWISE) & ! u
Expand Down Expand Up @@ -76,8 +75,8 @@ module compute_cu_mod
subroutine compute_cu_code(i, j, cu, p, u)
implicit none
integer, intent(in) :: I, J
real(wp), intent(out), dimension(:,:) :: cu
real(wp), intent(in), dimension(:,:) :: p, u
real(go_wp), intent(out), dimension(:,:) :: cu
real(go_wp), intent(in), dimension(:,:) :: p, u


CU(I,J) = 0.5d0*(P(i,J)+P(I-1,J))*U(I,J)
Expand Down
5 changes: 2 additions & 3 deletions examples/gocean/eg3/compute_h_mod.f90
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,6 @@ module compute_h_mod

private

public invoke_compute_h
public compute_h, compute_h_code

type, extends(kernel_type) :: compute_h
Expand Down Expand Up @@ -75,8 +74,8 @@ module compute_h_mod
SUBROUTINE compute_h_code(i, j, h, p, u, v)
IMPLICIT none
integer, intent(in) :: I, J
REAL(wp), INTENT(out), DIMENSION(:,:) :: h
REAL(wp), INTENT(in), DIMENSION(:,:) :: p, u, v
REAL(go_wp), INTENT(out), DIMENSION(:,:) :: h
REAL(go_wp), INTENT(in), DIMENSION(:,:) :: p, u, v

H(I,J) = P(I,J)+.25d0*(U(I+1,J)*U(I+1,J)+U(I,J)*U(I,J) + &
V(I,J+1)*V(I,J+1)+V(I,J)*V(I,J))
Expand Down
1 change: 0 additions & 1 deletion examples/gocean/eg3/compute_z_mod.f90
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,6 @@ module compute_z_mod

private

public invoke_compute_z
public compute_z, compute_z_code

type, extends(kernel_type) :: compute_z
Expand Down
96 changes: 85 additions & 11 deletions src/psyclone/psyGen.py
Original file line number Diff line number Diff line change
Expand Up @@ -3749,7 +3749,7 @@ def rename_and_write(self):
from psyclone.line_length import FortLineLength

# If this kernel has not been transformed we do nothing
if not self.modified:
if not self.modified and not self.root.opencl:
return

# Remove any "_mod" if the file follows the PSyclone naming convention
Expand All @@ -3769,7 +3769,11 @@ def rename_and_write(self):
while not fdesc:
name_idx += 1
new_suffix = "_{0}".format(name_idx)
new_name = old_base_name + new_suffix + "_mod.f90"
if self.root.opencl:
new_name = old_base_name + new_suffix + ".cl"
else:
new_name = old_base_name + new_suffix + "_mod.f90"

try:
# Atomically attempt to open the new kernel file (in case
# this is part of a parallel build)
Expand All @@ -3786,8 +3790,13 @@ def rename_and_write(self):
continue

# Use the suffix we have determined to rename all relevant quantities
# within the AST of the kernel code
self._rename_ast(new_suffix)
# within the AST of the kernel code.
# We can't rename OpenCL kernels as the Invoke set_args functions
# have already been generated. The link to a specific kernel
# implementation is delayed to run-time in OpenCL. (e.g. FortCL has
# the PSYCLONE_KERNELS_FILE environment variable)
if not self.root.opencl:
self._rename_ast(new_suffix)

# Kernel is now self-consistent so unset the modified flag
self.modified = False
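
The naming loop in these hunks can be read in isolation as the sketch below. It is a standalone approximation rather than the PSyclone implementation: the function name `claim_kernel_file`, its arguments, and the starting index are assumptions, and Python 3's `FileExistsError` stands in for whatever errno check the elided lines perform. The point it illustrates is that `os.O_EXCL` makes the "does this name exist?" test and the file creation a single atomic step, so parallel builds cannot claim the same index.

```python
import os


def claim_kernel_file(directory, base_name, opencl):
    """Atomically create the first unused indexed kernel file.

    Hypothetical helper. Returns (file descriptor, file name, suffix).
    """
    name_idx = -1  # assumed start, so the first candidate is "_0"
    while True:
        name_idx += 1
        suffix = "_{0}".format(name_idx)
        # OpenCL kernels get a .cl extension, Fortran ones keep _mod.f90
        extension = ".cl" if opencl else "_mod.f90"
        new_name = base_name + suffix + extension
        try:
            # O_EXCL makes the call fail if the file already exists
            fdesc = os.open(os.path.join(directory, new_name),
                            os.O_CREAT | os.O_WRONLY | os.O_EXCL)
            return fdesc, new_name, suffix
        except FileExistsError:
            # Name already taken (perhaps by another process): try the next
            continue
```

The returned suffix is what the Fortran path would pass on to the AST renaming step; per the comment in the hunk above, the OpenCL path skips that renaming.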
Expand All @@ -3801,10 +3810,13 @@ def rename_and_write(self):
raise NotImplementedError("Cannot module-inline a transformed "
"kernel ({0})".format(self.name))

# Generate the Fortran for this transformed kernel, ensuring that
# we limit the line lengths
fll = FortLineLength()
new_kern_code = fll.process(str(self.ast))
if self.root.opencl:
new_kern_code = self.get_kernel_schedule().gen_ocl()
else:
# Generate the Fortran for this transformed kernel, ensuring that
# we limit the line lengths
fll = FortLineLength()
new_kern_code = fll.process(str(self.ast))

if not fdesc:
# If we've not got a file descriptor at this point then that's
Expand Down Expand Up @@ -5396,6 +5408,48 @@ def iterateitems(nodes):
"declarations for fparser nodes {1}."
"".format(str(arg_list), nodes))

# fparser2 does not always handle Statement Functions correctly, this
# loop checks for Stmt_Functions that should be an array statement
# and recovers them, otherwise it raises an error as currently
# Statement Functions are not supported in PSyIR.
for stmtfn in walk_ast(nodes, [Fortran2003.Stmt_Function_Stmt]):
(fn_name, arg_list, scalar_expr) = stmtfn.items
try:
symbol = parent.symbol_table.lookup(fn_name.string)
if symbol.is_array:
# This is an array assignment wrongly categorized as a
# statement_function by fparser2.
array_name = fn_name
if hasattr(arg_list, 'items'):
array_subscript = arg_list.items
else:
array_subscript = [arg_list]
assignment_rhs = scalar_expr

# Create assignment node
assignment = Assignment(parent=parent)
parent.addchild(assignment)

# Build lhs
lhs = Array(array_name.string, parent=assignment)
self.process_nodes(parent=lhs, nodes=array_subscript,
nodes_parent=arg_list)
assignment.addchild(lhs)

# Build rhs
self.process_nodes(parent=assignment, nodes=[assignment_rhs],
nodes_parent=scalar_expr)
else:
raise InternalError(
"Could not process '{0}'. Symbol '{1}' is in the"
" SymbolTable but it is not an array as expected, so"
" it can not be recovered as an array assignment."
"".format(str(stmtfn), symbol.name))
except KeyError:
raise NotImplementedError(
"Could not process '{0}'. Statement Function declarations "
"are not supported.".format(str(stmtfn)))

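The three outcomes of the statement-function check above (recover as an array assignment, internal error, or not supported) can be summarised in a small standalone sketch. It assumes a plain dictionary in place of the PSyIR SymbolTable; the `Symbol` and `InternalError` classes here are simplified stand-ins and `classify_stmt_function` is a hypothetical name, none of them the PSyclone classes.

```python
class InternalError(Exception):
    """Stand-in for PSyclone's InternalError (assumption of this sketch)."""


class Symbol(object):
    """Minimal stand-in for a symbol-table entry."""
    def __init__(self, name, is_array):
        self.name = name
        self.is_array = is_array


def classify_stmt_function(fn_name, symbol_table):
    """Decide how a node parsed as a statement function should be treated.

    symbol_table is a plain dict here; a missing key plays the role of
    the KeyError raised by the real lookup.
    """
    try:
        symbol = symbol_table[fn_name]
    except KeyError:
        # Unknown name: a genuine statement function, which is unsupported
        raise NotImplementedError(
            "Could not process '{0}'. Statement Function declarations "
            "are not supported.".format(fn_name))
    if not symbol.is_array:
        # Declared, but not an array: cannot be an array assignment
        raise InternalError(
            "Symbol '{0}' is in the SymbolTable but it is not an array "
            "as expected.".format(symbol.name))
    # Declared array: the parser mis-categorised an array assignment
    return "array_assignment"
```

For a kernel such as `compute_cu_code`, where `cu` is declared as an array argument, a line like `CU(I,J) = ...` falls into the last case and would be rebuilt as an assignment.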
# TODO remove nodes_parent argument once fparser2 AST contains
# parent information (fparser/#102).
def process_nodes(self, parent, nodes, nodes_parent):
Expand Down Expand Up @@ -5904,6 +5958,10 @@ def _part_ref_handler(self, node, parent):
:type node: :py:class:`fparser.two.Fortran2003.Part_Ref`
:param parent: Parent node of the PSyIR node we are constructing.
:type parent: :py:class:`psyclone.psyGen.Node`

:raises NotImplementedError: If the fparser node represents
unsupported PSyIR features and should be placed in a CodeBlock.

:returns: PSyIR representation of node
:rtype: :py:class:`psyclone.psyGen.Array`
'''
Expand All @@ -5912,8 +5970,8 @@ def _part_ref_handler(self, node, parent):
reference_name = node.items[0].string.lower()

# Intrinsics are wrongly parsed as arrays by fparser2 (fparser issue
# #189), we can fix the issue here and convert them to appropiate PSyIR
# nodes.
# #189), we can fix the issue here and convert them to appropriate
# PSyIR nodes.
if reference_name == 'sign':
bop = BinaryOperation(BinaryOperation.Operator.SIGN, parent)
self.process_nodes(parent=bop, nodes=[node.items[1].items[0]],
Expand All @@ -5940,6 +5998,9 @@ def _part_ref_handler(self, node, parent):
argument = node.items[1].items[0]
if len(node.items[1].items) > 1:
# If it has more than a single argument create a CodeBlock
# TODO: Note that real(var, kind) expressions are not
# supported because Fortran kinds are still not captured
# (Issue #375)
raise NotImplementedError()
else:
argument = node.items[1]
Expand Down Expand Up @@ -6731,6 +6792,15 @@ def name(self):
'''
return self._name

@name.setter
def name(self, new_name):
'''
Sets a new name for the kernel.

:param str new_name: New name for the kernel.
'''
self._name = new_name

@property
def symbol_table(self):
'''
Expand Down Expand Up @@ -7324,7 +7394,11 @@ def gen_c_code(self, indent=0):
:returns: C language code representing the node.
:rtype: str
'''
return self._value
str_value = self._value
# C Scientific notation is always an 'e' letter
str_value = str_value.replace('d', 'e')
str_value = str_value.replace('D', 'e')
return str_value
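
As a quick illustration of the exponent rewrite in `gen_c_code`, the sketch below applies the same two `replace` calls to a bare string. The function name `fortran_real_to_c` is hypothetical and the literal is passed in directly rather than read from a node's `_value`.

```python
def fortran_real_to_c(literal):
    """Rewrite a Fortran real literal so its exponent letter is 'e'.

    Mirrors the two replace() calls in the hunk above: Fortran's
    double-precision exponent letters 'd' and 'D' both become 'e'.
    """
    return literal.replace('d', 'e').replace('D', 'e')


if __name__ == "__main__":
    # Literals taken from the example kernels in this PR
    for text in ["0.5d0", ".25d0"]:
        print(text, "->", fortran_real_to_c(text))
```

Because this is a plain character replacement, a literal carrying a kind suffix that contains the letter `d` (for example `1.0_dp`) would also be altered, so the sketch only holds for literals without such suffixes.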


class Return(Node):
Expand Down