Additional unit testing for run-to-run bitwise reproducibility
Currently, we have tests that verify that the deterministic algorithms
for scan and reduce (available via the det and par__det_nosync
execution policies) produce the same results when run twice in a row
within the same test executable.

This change adds new bitwise reproducibility testing that is performed
across multiple test executable runs.

To do this, the test executable must be run twice. On the first run,
we hash all the inputs and outputs to/from the deterministic function
calls, and store those hashes in an SQLite database. We also insert
information about the ROCm version, rocThrust version, and GPU architecture
into the database, since the results the deterministic algorithms produce
are allowed to vary if any of those factors change.

On the second run, we hash the inputs/outputs again and check whether
a row matching them (plus the ROCm version, rocThrust version, etc.) exists
in the database.
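
The matching rule described above can be sketched as follows. This is only an
illustration, not the code added by this commit: ReproRecord, ReproStore and
their field names are hypothetical, and an in-memory std::set stands in for
the SQLite table so the example has no external dependencies.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// One record per deterministic function call.
// Hypothetical layout: the real database schema is not shown in this commit page.
struct ReproRecord
{
    std::string   rocm_version;
    std::string   rocthrust_version;
    std::string   gpu_arch;
    std::uint64_t input_hash;
    std::uint64_t output_hash;

    // Order by every field, so two records are equivalent only if all fields match.
    bool operator<(const ReproRecord& other) const
    {
        return std::tie(rocm_version, rocthrust_version, gpu_arch, input_hash, output_hash)
               < std::tie(other.rocm_version,
                          other.rocthrust_version,
                          other.gpu_arch,
                          other.input_hash,
                          other.output_hash);
    }
};

// Stand-in for the SQLite table (assumption: the real tests use a database file).
class ReproStore
{
public:
    // First run: remember what this call produced.
    void insert(const ReproRecord& record)
    {
        rows_.insert(record);
    }

    // Second run: the check passes only if an identical row was stored earlier.
    bool contains(const ReproRecord& record) const
    {
        return rows_.count(record) != 0;
    }

private:
    std::set<ReproRecord> rows_;
};
```

Because the version and architecture fields are part of the key, a record
produced under a different ROCm version simply does not match any stored row,
rather than being compared against results from another configuration.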

This change:
- Adds SQLite as a dependency in the CMake files. SQLite code is public domain.
- Adds a library called CRCpp as a dependency in the CMake files. This library
provides a checksum algorithm that we can use for hashing. It uses a BSD 3-clause
license.
- Adds the database-based testing to the existing reproducibility tests in
test/test_reproducibility.cpp
- Adds classes for database creation/manipulation and building hashes in
test/bitwise_repro/
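
To make the hashing step concrete, here is a dependency-free sketch of a
64-bit CRC. It assumes the checksum in question uses the CRC-64/ECMA-182
parameter set (polynomial 0x42F0E1EBA9EA3693, zero initial value, no
reflection, no final XOR). The function name crc64_ecma is hypothetical, and
the tests themselves rely on the CRCpp library rather than a hand-written
loop like this one.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise (table-free) CRC-64 using the ECMA-182 parameters.
// Passing the result of a previous call as `crc` continues the checksum,
// so several buffers can be folded into one hash.
inline std::uint64_t crc64_ecma(const void* data, std::size_t size, std::uint64_t crc = 0)
{
    const std::uint64_t  poly  = 0x42F0E1EBA9EA3693ULL;
    const unsigned char* bytes = static_cast<const unsigned char*>(data);

    for(std::size_t i = 0; i < size; ++i)
    {
        // Feed the next byte into the top of the 64-bit register.
        crc ^= static_cast<std::uint64_t>(bytes[i]) << 56;
        for(int bit = 0; bit < 8; ++bit)
        {
            // Shift left; apply the polynomial whenever a set bit falls off the top.
            crc = (crc & 0x8000000000000000ULL) ? ((crc << 1) ^ poly) : (crc << 1);
        }
    }
    return crc;
}
```

Since this variant has a zero initial value and no final XOR, hashing an input
buffer and then passing that result as the starting value for the output
buffer gives the same checksum as hashing the two buffers concatenated.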

Because the run-to-run testing accesses a disk-based database file, it's pretty
slow, so I've disabled it by default. To turn it on, you must define an
environment variable called ROCTHRUST_BWR_PATH and set it to the path to
the SQLite database file to use for the testing.

To generate the database when it does not yet exist, you must also define an
environment variable called ROCTHRUST_BWR_GENERATE and set it to 1.
This creates the database file given by ROCTHRUST_BWR_PATH and inserts
data into it as the run-to-run reproducibility tests are executed (the tests
will all pass in this case).

If you want to take an existing database and update it for a new
ROCm version/rocThrust version/GPU architecture, you can set
ROCTHRUST_BWR_GENERATE=1 and point ROCTHRUST_BWR_PATH to
the existing database file.
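
The opt-in behaviour described above could be read from the environment along
these lines. This is a sketch only: BwrConfig and read_bwr_config are
hypothetical names, and treating an empty ROCTHRUST_BWR_PATH as "disabled" is
an assumption, not something this commit message states.

```cpp
#include <cstdlib>
#include <string>

// Settings derived from the two environment variables (hypothetical struct).
struct BwrConfig
{
    bool        enabled;  // true when ROCTHRUST_BWR_PATH is set
    bool        generate; // true when ROCTHRUST_BWR_GENERATE is exactly "1"
    std::string db_path;  // path to the SQLite database file
};

inline BwrConfig read_bwr_config()
{
    BwrConfig config{false, false, std::string()};

    const char* path = std::getenv("ROCTHRUST_BWR_PATH");
    if(path == nullptr || *path == '\0')
    {
        // Run-to-run testing is off by default.
        return config;
    }
    config.enabled = true;
    config.db_path = path;

    const char* generate = std::getenv("ROCTHRUST_BWR_GENERATE");
    config.generate      = (generate != nullptr) && (std::string(generate) == "1");
    return config;
}
```

With a helper like this, a test would skip the database check when enabled is
false, insert rows when generate is true, and otherwise look up the hashes it
just computed.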
umfranzw committed Jul 22, 2024
1 parent 9b1c755 commit db784c4
Showing 7 changed files with 1,077 additions and 1 deletion.
3 changes: 2 additions & 1 deletion CMakeLists.txt
@@ -12,7 +12,8 @@ else()
endif()

# Thrust project
-project(rocthrust LANGUAGES CXX)
+# Note: C is required here for dependencies
+project(rocthrust LANGUAGES CXX C)

#Adding CMAKE_PREFIX_PATH
list( APPEND CMAKE_PREFIX_PATH /opt/rocm/llvm /opt/rocm ${ROCM_PATH} )
67 changes: 67 additions & 0 deletions cmake/Dependencies.cmake
@@ -62,4 +62,71 @@ if(BUILD_TEST)
)
find_package(GTest REQUIRED CONFIG PATHS ${GTEST_ROOT})
endif()

# SQLite (for run-to-run bitwise-reproducibility tests)
# Note: SQLite 3.36.0 enabled the backup API by default, which we need
# for cache serialization. We also want to use a static SQLite,
# and distro static libraries aren't typically built
# position-independent.
include( FetchContent )

if(DEFINED ENV{SQLITE_3_43_2_SRC_URL})
set(SQLITE_3_43_2_SRC_URL_INIT $ENV{SQLITE_3_43_2_SRC_URL})
else()
set(SQLITE_3_43_2_SRC_URL_INIT https://www.sqlite.org/2023/sqlite-amalgamation-3430200.zip)
endif()
set(SQLITE_3_43_2_SRC_URL ${SQLITE_3_43_2_SRC_URL_INIT} CACHE STRING "Location of SQLite source code")
set(SQLITE_SRC_3_43_2_SHA3_256 af02b88cc922e7506c6659737560c0756deee24e4e7741d4b315af341edd8b40 CACHE STRING "SHA3-256 hash of SQLite source code")

# embed SQLite
if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.24)
# use extract timestamp for fetched files instead of timestamps in the archive
cmake_policy(SET CMP0135 NEW)
endif()

message("Downloading SQLite.")
FetchContent_Declare(sqlite_local
URL ${SQLITE_3_43_2_SRC_URL}
URL_HASH SHA3_256=${SQLITE_SRC_3_43_2_SHA3_256}
)
FetchContent_MakeAvailable(sqlite_local)

add_library(sqlite3 OBJECT ${sqlite_local_SOURCE_DIR}/sqlite3.c)
target_include_directories(sqlite3 PUBLIC ${sqlite_local_SOURCE_DIR})
set_target_properties( sqlite3 PROPERTIES
C_VISIBILITY_PRESET "hidden"
VISIBILITY_INLINES_HIDDEN ON
POSITION_INDEPENDENT_CODE ON
LINKER_LANGUAGE CXX
)

# We don't need extensions, and omitting them from SQLite removes the
# need for dlopen/dlclose from within rocThrust.
# We also don't need the shared cache, and omitting it yields some performance improvements.
target_compile_options(
sqlite3
PRIVATE -DSQLITE_OMIT_LOAD_EXTENSION
PRIVATE -DSQLITE_OMIT_SHARED_CACHE
)

# CRCpp (cyclic redundancy check library, for run-to-run bitwise-reproducibility tests)
message(STATUS "Downloading and building CRCpp.")
download_project(
PROJ crcpp
GIT_REPOSITORY https://github.com/d-bahr/CRCpp.git
GIT_TAG release-1.2.0.0
GIT_SHALLOW 1
INSTALL_DIR ${CMAKE_CURRENT_BINARY_DIR}/deps/CRCpp
CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=<INSTALL_DIR> -DBUILD_TEST=OFF -DBUILD_DOC=OFF
LOG_DOWNLOAD TRUE
LOG_CONFIGURE TRUE
LOG_BUILD TRUE
LOG_INSTALL TRUE
BUILD_PROJECT TRUE
UPDATE_DISCONNECTED TRUE # Never update automatically from the remote repository
)
set(CRCPP_INCLUDE_PATH ${CMAKE_CURRENT_BINARY_DIR}/deps/CRCpp/include)
# Note: The "esoteric" crc definitions include the 64-bit CRC-ECMA algorithm, which we're using.
set(CRCPP_COMPILE_DEFS "-DCRCPP_USE_CPP11 -DCRCPP_INCLUDE_ESOTERIC_CRC_DEFINITIONS")

endif()
7 changes: 7 additions & 0 deletions test/CMakeLists.txt
@@ -54,11 +54,15 @@ function(add_rocthrust_test TEST)
target_include_directories(${TEST_TARGET} SYSTEM BEFORE
PUBLIC
$<BUILD_INTERFACE:${CMAKE_CURRENT_BINARY_DIR}>
${sqlite_local_SOURCE_DIR}
${CRCPP_INCLUDE_PATH}
)
target_link_libraries(${TEST_TARGET}
PRIVATE
rocthrust
roc::rocprim_hip
PUBLIC
sqlite3
)
if (TARGET GTest::GTest)
target_link_libraries(${TEST_TARGET}
@@ -83,6 +87,9 @@ function(add_rocthrust_test TEST)
PROPERTIES
RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/test/"
)
target_compile_definitions(${TEST_TARGET}
PUBLIC ${CRCPP_COMPILE_DEFS}
)
if(AMDGPU_TEST_TARGETS)
foreach(AMDGPU_TARGET IN LISTS AMDGPU_TEST_TARGETS)
add_test("${AMDGPU_TARGET}-${TEST_TARGET}" ${TEST_TARGET})