[utils] Add script to generate elaborated IR and assembly tests #89026

Merged
Changes from 6 commits
63 changes: 63 additions & 0 deletions llvm/docs/TestingGuide.rst
@@ -433,6 +433,69 @@ actually participate in the test besides holding the ``RUN:`` lines.
putting the extra files in an ``Inputs/`` directory. This pattern is
deprecated.

Elaborated assembly tests
-------------------------

Generally, assembly test files benefit from being cleaned to remove unnecessary
details. However, for tests requiring elaborate assembly files where cleanup is
less practical (e.g., a large amount of debug information output from Clang),
you can include generation instructions within ``.ifdef GEN`` and ``.endif``
directives. Then, run ``llvm/utils/update_test_body.py`` on
the test file to generate the needed content.

.. code-block:: none

    # RUN: llvm-mc -filetype=obj -triple=x86_64 %s -o a.o
    # RUN: ... | FileCheck %s

    # CHECK: hello

    .ifdef GEN
    #--- a.cc
    int va;
    #--- gen
    clang --target=x86_64-linux -S -g a.cc -o -
    .endif
    # content generated by the script 'gen'

.. code-block:: bash

    PATH=/path/to/clang_build/bin:$PATH llvm/utils/update_test_body.py path/to/test.s

The script will prepare extra files with ``split-file``, invoke ``gen``, and
then rewrite the part after ``.endif`` with its stdout.

.. note::

   Consider specifying an explicit target triple to avoid differences when
   regeneration is needed on another machine.

   ``gen`` is invoked with ``PWD`` set to ``/proc/self/cwd``. Clang commands
   don't need ``-fdebug-compilation-dir=`` since its default value is ``PWD``.
Contributor

Not sure I follow this.
At least for split DWARF, when generating assembly files for BOLT, it helps to set -fdebug-compilation-dir= to '.' and then run BOLT from the test directory where the executable is created. That way it can find the .dwo files.

Member Author

I've clarified this.

Clang commands don't need -fdebug-compilation-dir= since its default value is PWD.
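A minimal sketch of the environment override discussed in this thread, assuming only the Python standard library. The child command is a hypothetical stand-in for ``gen``; the sketch shows only that the overridden ``PWD`` reaches the child process, not how Clang consumes it.

```python
import os
import subprocess
import sys

# Copy the current environment and override PWD, as the script does for 'gen'.
env = dict(os.environ, PWD="/proc/self/cwd")

# Hypothetical stand-in for 'gen': a child process that reports the PWD it sees.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['PWD'])"],
    env=env,
    capture_output=True,
    check=True,
)
seen_pwd = child.stdout.decode().strip()
print(seen_pwd)
```

Because ``dict(os.environ, ...)`` builds a copy, the override applies to the child only and leaves the parent's environment untouched.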


Check prefixes should be placed before ``.endif`` since the part after
``.endif`` is replaced.
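The placement rule above can be illustrated with a small sketch. ``rewrite_after_endif`` is a hypothetical helper written for this illustration, not part of the script; it models only the "keep everything through ``.endif``, replace the rest" behavior described in the documentation.

```python
def rewrite_after_endif(text, generated):
    """Keep every line through the first '.endif' and append `generated`."""
    kept = []
    for line in text.splitlines():
        kept.append(line)
        if line.startswith(".endif"):
            break
    return "\n".join(kept) + "\n" + generated


test_file = """\
# CHECK: kept
.ifdef GEN
#--- gen
true
.endif
# CHECK: lost
old body
"""

updated = rewrite_after_endif(test_file, "new body\n")
print(updated)
```

The ``# CHECK: lost`` line sits after ``.endif`` and disappears on regeneration, which is why check lines belong before the directive.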

If the test body contains multiple files, you can print ``#---`` separators and
utilize ``split-file`` in ``RUN`` lines.

.. code-block:: none

    # RUN: rm -rf %t && split-file %s %t && cd %t
    ...

    .ifdef GEN
    #--- a.cc
    int va;
    #--- b.cc
    int vb;
    #--- gen
    echo '#--- a.s'
    clang --target=x86_64-linux -S -g a.cc -o -
    echo '#--- b.s'
    clang --target=x86_64-linux -S -g b.cc -o -
    .endif
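To make the role of the ``#---`` separators concrete, here is a simplified model of how such output is partitioned into named parts. ``split_parts`` is a hypothetical helper for illustration; it is not the ``split-file`` tool and ignores the tool's other behaviors.

```python
import re
from typing import Dict, List


def split_parts(text: str) -> Dict[str, str]:
    """Group lines under the most recent '#--- <name>' separator."""
    parts: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"#--- (\S+)\s*$", line)
        if match:
            # A separator starts a new part; following lines belong to it.
            current = match.group(1)
            parts[current] = []
        elif current is not None:
            parts[current].append(line)
    return {name: "\n".join(lines) + "\n" for name, lines in parts.items()}


# Assumed stand-in for what 'gen' might print in the example above.
gen_output = "#--- a.s\n.globl va\n#--- b.s\n.globl vb\n"
parts = split_parts(gen_output)
print(sorted(parts))
```

Each ``echo '#--- name'`` in ``gen`` therefore starts a new part that a later ``split-file`` invocation in the ``RUN`` lines can extract.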

Fragile tests
-------------

7 changes: 6 additions & 1 deletion llvm/test/tools/UpdateTestChecks/lit.local.cfg
@@ -19,7 +19,8 @@ def add_update_script_substition(
     # Specify an explicit default version in UTC tests, so that the --version
     # embedded in UTC_ARGS does not change in all test expectations every time
     # the default is bumped.
-    extra_args += " --version=1"
+    if name != "%update_test_body":
+        extra_args += " --version=1"
     config.substitutions.append(
         (name, "'%s' %s %s" % (python_exe, script_path, extra_args))
     )
@@ -47,3 +48,7 @@ if os.path.isfile(llvm_mca_path):
     config.available_features.add("llvm-mca-binary")
     mca_arg = "--llvm-mca-binary " + shell_quote(llvm_mca_path)
     add_update_script_substition("%update_test_checks", extra_args=mca_arg)
+
+split_file_path = os.path.join(config.llvm_tools_dir, "split-file")
+if os.path.isfile(split_file_path):
+    add_update_script_substition("%update_test_body")
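The effect of the ``--version=1`` condition can be sketched in isolation. ``build_substitution`` is a hypothetical standalone helper that takes the script path as a parameter; it reproduces only the string formatting and the condition visible in this diff.

```python
def build_substitution(name, python_exe, script_path, extra_args=""):
    """Return the (name, command) pair that would be added as a substitution."""
    # Pin the version for every script except update_test_body, whose
    # argument parser (files, --shell) defines no --version option.
    if name != "%update_test_body":
        extra_args += " --version=1"
    return (name, "'%s' %s %s" % (python_exe, script_path, extra_args))


body = build_substitution(
    "%update_test_body", "python3", "utils/update_test_body.py"
)
checks = build_substitution(
    "%update_test_checks", "python3", "utils/update_test_checks.py"
)
print(body[1])
print(checks[1])
```

Without the condition, ``%update_test_body`` would expand to a command line carrying an argument the new script does not define.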
@@ -0,0 +1,13 @@
# RUN: cp %s %t && %update_test_body %t 2>&1 | count 0
# RUN: diff -u %S/Inputs/basic.test.expected %t

.ifdef GEN
#--- a.txt
.long 0
#--- b.txt
.long 1
#--- gen
cat a.txt b.txt
.endif
.long 0
.long 1
11 changes: 11 additions & 0 deletions llvm/test/tools/UpdateTestChecks/update_test_body/basic.test
@@ -0,0 +1,11 @@
# RUN: cp %s %t && %update_test_body %t 2>&1 | count 0
# RUN: diff -u %S/Inputs/basic.test.expected %t

.ifdef GEN
#--- a.txt
.long 0
#--- b.txt
.long 1
#--- gen
cat a.txt b.txt
.endif
@@ -0,0 +1,13 @@
# RUN: cp %s %t && not %update_test_body %t 2>&1 | FileCheck %s
# RUN: diff -u %t %s

# CHECK: stdout is empty; forgot -o - ?

.ifdef GEN
#--- a.txt
.long 0
#--- b.txt
.long 1
#--- gen
true
.endif
@@ -0,0 +1,7 @@
# RUN: cp %s %t && not %update_test_body %t 2>&1 | FileCheck %s

# CHECK: 'gen' does not exist

.ifdef GEN
#--- a.txt
.endif
11 changes: 11 additions & 0 deletions llvm/test/tools/UpdateTestChecks/update_test_body/gen-fail.test
@@ -0,0 +1,11 @@
# RUN: cp %s %t && not %update_test_body %t 2>&1 | FileCheck %s

# CHECK: log
# CHECK-NEXT: 'gen' failed

.ifdef GEN
#--- gen
echo log >&2
false # gen fails due to sh -e
true
.endif
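The ``sh -e`` behavior this test relies on can be checked with a short sketch. It assumes a POSIX ``sh`` on ``PATH`` and passes the script with ``-c`` for brevity, whereas ``update_test_body.py`` runs the ``gen`` file with ``sh -eu``.

```python
import subprocess

script = "echo log >&2\nfalse\ntrue\n"

# With -e, the shell exits at the first failing command ('false').
strict = subprocess.run(["sh", "-e", "-c", script], capture_output=True)

# Without -e, execution continues and the exit status is that of 'true'.
lenient = subprocess.run(["sh", "-c", script], capture_output=True)

print(strict.returncode, lenient.returncode)
```

In both runs ``log`` is written to stderr before ``false`` executes, which matches the ``CHECK: log`` line preceding ``'gen' failed``.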
@@ -0,0 +1,4 @@
import platform

if platform.system() == "Windows":
    config.unsupported = True
@@ -0,0 +1,7 @@
# RUN: cp %s %t && not %update_test_body %t 2>&1 | FileCheck %s

# CHECK: error: -: no part separator was found

.ifdef GEN
true
.endif
26 changes: 14 additions & 12 deletions llvm/test/tools/llvm-dwarfdump/X86/formclass4.s
@@ -1,22 +1,24 @@
-# Source:
-# struct e {
-# enum {} f[16384];
-# short g;
-# };
-# e foo() {
-# auto E = new e;
-# return *E;
-# }
-# Compile with:
-# clang -O2 -gdwarf-4 -S a.cpp -o a4.s
-
 # RUN: llvm-mc %s -filetype obj -triple x86_64-apple-darwin -o %t.o
 # RUN: llvm-dwarfdump -debug-info -name g %t.o | FileCheck %s

 # CHECK: DW_TAG_member
 # CHECK: DW_AT_name ("g")
 # CHECK: DW_AT_data_member_location (0x4000)

+.ifdef GEN
+#--- a.cpp
+struct e {
+enum {} f[16384];
+short g;
+};
+e foo() {
+auto E = new e;
+return *E;
+}
+#--- gen
+clang --target=x86_64-apple-macosx -O2 -gdwarf-4 -S a.cpp -o -
+.endif
+
 .section __TEXT,__text,regular,pure_instructions
 .macosx_version_min 10, 14
 .globl __Z3foov ## -- Begin function _Z3foov
@@ -1,16 +1,16 @@
 # RUN: llvm-mc < %s -filetype obj -triple x86_64 -o - \
 # RUN: | llvm-dwarfdump - | FileCheck %s

-# Generated from:
-#
-# struct t1 { };
-# t1 v1;
-#
-# $ clang++ -S -g -fdebug-types-section -gsplit-dwarf -o test.5.split.s -gdwarf-5 -g
-
 # CHECK: DW_TAG_variable
 # CHECK: DW_AT_type ({{.*}} "t1")

+.ifdef GEN
+#--- test.cpp
+struct t1 { };
+t1 v1;
+#--- gen
+clang++ --target=x86_64-linux -S -g -fdebug-types-section -gsplit-dwarf -gdwarf-5 test.cpp -o -
+.endif
 .text
 .file "test.cpp"
 .section .debug_types.dwo,"e",@progbits
119 changes: 119 additions & 0 deletions llvm/utils/update_test_body.py
@@ -0,0 +1,119 @@
#!/usr/bin/env python3
"""Generate test body using split-file and a custom script.
Currently, only assembly files are supported by placing generation instructions
surrounded by .ifdef GEN/.endif directives.

.ifdef GEN
#--- a.cc
int va;
#--- gen
clang --target=aarch64-linux -S -g a.cc -o -
.endif
# content generated by the script 'gen'

The script will prepare extra files with `split-file`, invoke `gen`, and then
rewrite the part after `.endif` with its stdout.

Example:
PATH=/path/to/clang_build/bin:$PATH llvm/utils/update_test_body.py path/to/test.s
"""
import argparse
import contextlib
import os
import subprocess
import sys
import tempfile


@contextlib.contextmanager
def cd(directory):
    cwd = os.getcwd()
    os.chdir(directory)
    try:
        yield
    finally:
        os.chdir(cwd)


def process(args, path):
    split_file_input = []
    prolog = []
    is_split_file_input = False
    is_prolog = True
    with open(path) as f:
        for line in f.readlines():
            line = line.rstrip()
            if is_prolog:
                prolog.append(line)
            if line.startswith(".endif"):
                is_split_file_input = is_prolog = False
            if is_split_file_input:
                split_file_input.append(line)
            if line.startswith(".ifdef GEN"):
                is_split_file_input = True

    if not split_file_input:
        print("no .ifdef GEN", file=sys.stderr)
        return 1
    if is_split_file_input:
        print("no .endif", file=sys.stderr)
        return 1
    with tempfile.TemporaryDirectory(prefix="update_test_body_") as dir:
        try:
            sub = subprocess.run(
                ["split-file", "-", dir],
                input="\n".join(split_file_input).encode(),
                capture_output=True,
                check=True,
            )
        except subprocess.CalledProcessError as ex:
            sys.stderr.write(ex.stderr.decode())
            return 1
        with cd(dir):
            if args.shell:
                print(f"invoke shell in the temporary directory '{dir}'")
                subprocess.run([os.environ.get("SHELL", "sh")])
                return 0
            if not os.path.exists("gen"):
                print("'gen' does not exist", file=sys.stderr)
                return 1

            sub = subprocess.run(
                ["sh", "-eu", "gen"],
                capture_output=True,
                # Don't encode the directory information to the Clang output.
                # Remove unneeded details (.ident) as well.
                env=dict(
                    os.environ,
                    CCC_OVERRIDE_OPTIONS="#^-fno-ident",
                    PWD="/proc/self/cwd",
                ),
            )
            sys.stderr.write(sub.stderr.decode())
            if sub.returncode != 0:
                print("'gen' failed", file=sys.stderr)
Collaborator

There's a weird mixture of print(..., file=sys.stderr) and sys.stderr.write in this file?

Member Author

sys.stderr.write does not append a newline while print does by default (unless end='')... I feel that sys.stderr.write is quite common as well.
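The newline difference mentioned here can be seen in a short sketch that writes to in-memory buffers instead of ``sys.stderr`` (a substitution made only so the output can be inspected).

```python
import io

buf_print = io.StringIO()
buf_write = io.StringIO()
buf_print_no_end = io.StringIO()

print("'gen' failed", file=buf_print)  # print appends "\n" by default
buf_write.write("'gen' failed")  # write emits the string as-is
print("'gen' failed", file=buf_print_no_end, end="")  # end="" drops the newline

print(repr(buf_print.getvalue()))
print(repr(buf_write.getvalue()))
```

This is consistent with the script using ``sys.stderr.write`` to forward captured stderr that already carries its own line endings, and ``print`` for its own one-line messages.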

Collaborator

I must admit that I rarely, if ever, use print for stderr writing, but I don't know what the norm is in the LLVM scripts.

                return sub.returncode
            if not sub.stdout:
                print("stdout is empty; forgot -o - ?", file=sys.stderr)
                return 1
            content = sub.stdout.decode()

    with open(path, "w") as f:
        # Print lines up to '.endif'.
        print("\n".join(prolog), file=f)
        # Then print the stdout of 'gen'.
        f.write(content)


parser = argparse.ArgumentParser(
    description="Generate test body using split-file and a custom script"
)
parser.add_argument("files", nargs="+")
parser.add_argument(
    "--shell", action="store_true", help="invoke shell instead of 'gen'"
)
args = parser.parse_args()
for path in args.files:
    retcode = process(args, path)
    if retcode != 0:
        sys.exit(retcode)
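The line-scanning loop in ``process`` is easier to follow on a concrete input. The sketch below restates that loop as a pure function; ``split_prolog`` and the sample lines are hypothetical and exist only for illustration.

```python
def split_prolog(lines):
    """Return (prolog, split_file_input) following the loop in process()."""
    prolog, split_file_input = [], []
    in_gen, in_prolog = False, True
    for line in lines:
        line = line.rstrip()
        if in_prolog:
            prolog.append(line)
        if line.startswith(".endif"):
            in_gen = in_prolog = False
        if in_gen:
            split_file_input.append(line)
        if line.startswith(".ifdef GEN"):
            in_gen = True
    return prolog, split_file_input


sample = [
    "# RUN: llvm-mc %s\n",
    ".ifdef GEN\n",
    "#--- gen\n",
    "cat a.txt\n",
    ".endif\n",
    ".long 0\n",
]
prolog, split_file_input = split_prolog(sample)
print(prolog)
print(split_file_input)
```

The prolog includes both directive lines and is written back verbatim, while the input handed to ``split-file`` contains only the lines strictly between them; the old body after ``.endif`` lands in neither list.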