Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update build configurations to webgpu EP #22047

Merged
merged 37 commits into from
Sep 25, 2024
Merged
Show file tree
Hide file tree
Changes from 30 commits
Commits
Show all changes
37 commits
Select commit Hold shift + click to select a range
75322e9
Use Dawn libs directly to minimize binary size.
skottmckay Sep 10, 2024
e8ed35f
Fix Windows build
skottmckay Sep 10, 2024
7ec73ba
Merge remote-tracking branch 'origin/fs-eire/webgpu-ep' into skottmck…
fs-eire Sep 11, 2024
f4cbc76
Update patch with iOS build fixes.
skottmckay Sep 11, 2024
7d9ad9a
Merge branch 'skottmckay/MiscWebGPUUpdates' of https://github.com/mic…
skottmckay Sep 11, 2024
e1e75b8
WGSL writer is only needed when Vulkan is being used
skottmckay Sep 11, 2024
9333307
Merge remote-tracking branch 'origin/fs-eire/webgpu-ep' into skottmck…
skottmckay Sep 13, 2024
afd202a
Fix build errors
skottmckay Sep 13, 2024
bd25d1c
Fix transpose.cc build error.
skottmckay Sep 13, 2024
3f9be82
Go back to WGSL writer being required on all builds
skottmckay Sep 13, 2024
788e129
Refine external libraries to add dependencies
skottmckay Sep 15, 2024
edf5056
Try again
skottmckay Sep 16, 2024
0ad21bf
De-alias existing deps
skottmckay Sep 16, 2024
ee1d958
Fix C++20 errors
skottmckay Sep 16, 2024
3cb0c6a
Fix c++20 errors
skottmckay Sep 16, 2024
7b85dda
Update some apple infra
skottmckay Sep 16, 2024
1da2bce
Enable webgpu in some configs to test via CI
skottmckay Sep 16, 2024
6d14e28
Merge branch 'skottmckay/MiscWebGPUUpdates' of https://github.com/mic…
skottmckay Sep 16, 2024
259aa5d
Update one more CI
skottmckay Sep 16, 2024
64ccd2d
Fix some build and test issues
skottmckay Sep 17, 2024
ce23c21
Expand check on whether std::chrono::operator<< can be used to cover …
skottmckay Sep 18, 2024
51db660
Fix condition. Leave in some pragmas for debugging build failures sho…
skottmckay Sep 18, 2024
b712ebc
Add dummy header
skottmckay Sep 18, 2024
45f3bbf
Update apple uitest apps to run webgpu tests
skottmckay Sep 18, 2024
9b888af
Disable WebGPU in mac-catalyst build. APIs used by Dawn are not avail…
skottmckay Sep 18, 2024
210a760
Fix AppendExecutionProvider call
skottmckay Sep 18, 2024
d366d44
Merge
skottmckay Sep 23, 2024
909ac3d
Merge remote-tracking branch 'origin/fs-eire/webgpu-ep' into skottmck…
skottmckay Sep 23, 2024
edb9980
reduce some diffs
skottmckay Sep 23, 2024
9ea28ac
Merge remote-tracking branch 'origin/fs-eire/webgpu-ep' into skottmck…
fs-eire Sep 24, 2024
698e6ae
Enable in Android build for automated testing.
skottmckay Sep 24, 2024
7d4946d
Merge branch 'skottmckay/MiscWebGPUUpdates' of https://github.com/mic…
skottmckay Sep 24, 2024
b9e98a7
Fix typo in #define
skottmckay Sep 24, 2024
4b55f23
Fix some macos warnings.
skottmckay Sep 24, 2024
af7c39e
Fix ATanH on Metal
skottmckay Sep 25, 2024
770023f
Minor cleanups
skottmckay Sep 25, 2024
ba3fa14
Merge
skottmckay Sep 25, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
82 changes: 60 additions & 22 deletions cmake/external/onnxruntime_external_deps.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -575,10 +575,11 @@ if (onnxruntime_USE_MIMALLOC)
onnxruntime_fetchcontent_makeavailable(mimalloc)
endif()

#onnxruntime_EXTERNAL_LIBRARIES could contain onnx, onnx_proto,libprotobuf, cuda/cudnn,
# dnnl/mklml, onnxruntime_codegen_tvm, tvm and pthread
# pthread is always at the last
set(onnxruntime_EXTERNAL_LIBRARIES ${onnxruntime_EXTERNAL_LIBRARIES_XNNPACK} ${WIL_TARGET} nlohmann_json::nlohmann_json onnx onnx_proto ${PROTOBUF_LIB} re2::re2 Boost::mp11 safeint_interface flatbuffers::flatbuffers ${GSL_TARGET} ${ABSEIL_LIBS} date::date ${ONNXRUNTIME_CLOG_TARGET_NAME})
set(onnxruntime_EXTERNAL_LIBRARIES ${onnxruntime_EXTERNAL_LIBRARIES_XNNPACK} ${WIL_TARGET} nlohmann_json::nlohmann_json
onnx onnx_proto ${PROTOBUF_LIB} re2::re2 Boost::mp11 safeint_interface
flatbuffers::flatbuffers ${GSL_TARGET} ${ABSEIL_LIBS} date::date
${ONNXRUNTIME_CLOG_TARGET_NAME})

# The source code of onnx_proto is generated, we must build this lib first before starting to compile the other source code that uses ONNX protobuf types.
# The other libs do not have the problem. All the sources are already there. We can compile them in any order.
set(onnxruntime_EXTERNAL_DEPENDENCIES onnx_proto flatbuffers::flatbuffers)
Expand Down Expand Up @@ -638,33 +639,70 @@ if (onnxruntime_USE_WEBGPU)
dawn
URL ${DEP_URL_dawn}
URL_HASH SHA1=${DEP_SHA1_dawn}
PATCH_COMMAND ${Patch_EXECUTABLE} --binary --ignore-whitespace -p1 < ${PROJECT_SOURCE_DIR}/patches/dawn/dawn.patch
)
set(DAWN_FETCH_DEPENDENCIES ON)
set(DAWN_ENABLE_INSTALL ON)
set(TINT_BUILD_TESTS OFF)
set(DAWN_USE_BUILT_DXC ON)

# use dawn::dawn_native and dawn::dawn_proc instead of the monolithic dawn::webgpu_dawn to minimize binary size
set(DAWN_BUILD_MONOLITHIC_LIBRARY OFF CACHE BOOL "" FORCE)
set(DAWN_BUILD_SAMPLES OFF CACHE BOOL "" FORCE)
set(DAWN_ENABLE_INSTALL OFF CACHE BOOL "" FORCE)
set(DAWN_ENABLE_NULL OFF CACHE BOOL "" FORCE)
set(DAWN_FETCH_DEPENDENCIES ON CACHE BOOL "" FORCE)

# disable things we don't use
set(DAWN_DXC_ENABLE_ASSERTS_IN_NDEBUG OFF)
onnxruntime_fetchcontent_makeavailable(dawn)
endif()
set(DAWN_ENABLE_DESKTOP_GL OFF CACHE BOOL "" FORCE)
set(DAWN_ENABLE_OPENGLES OFF CACHE BOOL "" FORCE)
set(DAWN_SUPPORTS_GLFW_FOR_WINDOWING OFF CACHE BOOL "" FORCE)
set(DAWN_USE_GLFW OFF CACHE BOOL "" FORCE)
set(DAWN_USE_WINDOWS_UI OFF CACHE BOOL "" FORCE)
set(DAWN_USE_X11 OFF CACHE BOOL "" FORCE)

set(TINT_BUILD_TESTS OFF CACHE BOOL "" FORCE)
set(TINT_BUILD_CMD_TOOLS OFF CACHE BOOL "" FORCE)
set(TINT_BUILD_GLSL_WRITER OFF CACHE BOOL "" FORCE)
set(TINT_BUILD_GLSL_VALIDATOR OFF CACHE BOOL "" FORCE)
set(TINT_BUILD_IR_BINARY OFF CACHE BOOL "" FORCE)
set(TINT_BUILD_SPV_READER OFF CACHE BOOL "" FORCE) # don't need. disabling is a large binary size saving
set(TINT_BUILD_WGSL_WRITER ON CACHE BOOL "" FORCE) # needed to create cache key. runtime error if not enabled.

# SPIR-V validation shouldn't be required given we're using Tint to create the SPIR-V.
if (NOT CMAKE_BUILD_TYPE STREQUAL "Debug")
set(DAWN_ENABLE_SPIRV_VALIDATION OFF CACHE BOOL "" FORCE)
endif()

message(STATUS "Finished fetching external dependencies")
if (WIN32)
# building this requires the HLSL writer to be enabled in Tint. TBD if that we need either of these to be ON.
set(DAWN_USE_BUILT_DXC ON CACHE BOOL "" FORCE)
set(TINT_BUILD_HLSL_WRITER ON CACHE BOOL "" FORCE)

set(onnxruntime_LINK_DIRS )
# Vulkan may optionally be included in a Windows build. Exclude until we have an explicit use case that requires it.
set(DAWN_ENABLE_VULKAN OFF CACHE BOOL "" FORCE)
endif()

onnxruntime_fetchcontent_makeavailable(dawn)

list(APPEND onnxruntime_EXTERNAL_LIBRARIES dawn::dawn_native dawn::dawn_proc)
endif()

set(onnxruntime_LINK_DIRS)
if (onnxruntime_USE_CUDA)
find_package(CUDAToolkit REQUIRED)
find_package(CUDAToolkit REQUIRED)

if(onnxruntime_CUDNN_HOME)
file(TO_CMAKE_PATH ${onnxruntime_CUDNN_HOME} onnxruntime_CUDNN_HOME)
set(CUDNN_PATH ${onnxruntime_CUDNN_HOME})
endif()
include(cuDNN)
if(onnxruntime_CUDNN_HOME)
file(TO_CMAKE_PATH ${onnxruntime_CUDNN_HOME} onnxruntime_CUDNN_HOME)
set(CUDNN_PATH ${onnxruntime_CUDNN_HOME})
endif()

include(cuDNN)
endif()

if(onnxruntime_USE_SNPE)
include(external/find_snpe.cmake)
list(APPEND onnxruntime_EXTERNAL_LIBRARIES ${SNPE_NN_LIBS})
include(external/find_snpe.cmake)
list(APPEND onnxruntime_EXTERNAL_LIBRARIES ${SNPE_NN_LIBS})
endif()

FILE(TO_NATIVE_PATH ${CMAKE_BINARY_DIR} ORT_BINARY_DIR)
FILE(TO_NATIVE_PATH ${PROJECT_SOURCE_DIR} ORT_SOURCE_DIR)
FILE(TO_NATIVE_PATH ${CMAKE_BINARY_DIR} ORT_BINARY_DIR)
FILE(TO_NATIVE_PATH ${PROJECT_SOURCE_DIR} ORT_SOURCE_DIR)

message(STATUS "Finished fetching external dependencies")
65 changes: 59 additions & 6 deletions cmake/onnxruntime.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -89,10 +89,22 @@ elseif(onnxruntime_BUILD_APPLE_FRAMEWORK)
# create Info.plist for the framework and podspec for CocoaPods (optional)
set(MACOSX_FRAMEWORK_NAME "onnxruntime")
set(MACOSX_FRAMEWORK_IDENTIFIER "com.microsoft.onnxruntime")
# Need to include CoreML as a weaklink for CocoaPods package if the EP is enabled

# Setup weak frameworks for macOS/iOS. 'weak' as the CoreML or WebGPU EPs are optionally enabled.
if(onnxruntime_USE_COREML)
set(APPLE_WEAK_FRAMEWORK "\\\"CoreML\\\"")
list(APPEND _weak_frameworks "\\\"CoreML\\\"")
endif()

if(onnxruntime_USE_WEBGPU)
list(APPEND _weak_frameworks "\\\"QuartzCore\\\"")
list(APPEND _weak_frameworks "\\\"IOSurface\\\"")
list(APPEND _weak_frameworks "\\\"Metal\\\"")
endif()

if (_weak_frameworks)
string(JOIN ", " APPLE_WEAK_FRAMEWORK ${_weak_frameworks})
endif()

set(INFO_PLIST_PATH "${CMAKE_CURRENT_BINARY_DIR}/Info.plist")
configure_file(${REPO_ROOT}/cmake/Info.plist.in ${INFO_PLIST_PATH})
configure_file(
Expand Down Expand Up @@ -364,16 +376,57 @@ if(onnxruntime_BUILD_APPLE_FRAMEWORK)
endif()
endforeach()

set(_processed_libs) # keep track of processed libraries to skip any duplicate dependencies
function(add_symlink_for_static_lib_and_dependencies lib)
function(process cur_target)
# de-alias if applicable so a consistent target name is used
get_target_property(alias ${cur_target} ALIASED_TARGET)
if(TARGET ${alias})
set(cur_target ${alias})
endif()

if(${cur_target} IN_LIST _processed_libs OR ${cur_target} IN_LIST lib_and_dependencies)
return()
endif()

list(APPEND lib_and_dependencies ${cur_target})

get_target_property(link_libraries ${cur_target} LINK_LIBRARIES)
foreach(dependency ${link_libraries})
if(TARGET ${dependency})
process(${dependency})
endif()
endforeach()

set(lib_and_dependencies ${lib_and_dependencies} PARENT_SCOPE)
endfunction()

set(lib_and_dependencies)
process(${lib})

foreach(_target ${lib_and_dependencies})
get_target_property(type ${_target} TYPE)
if(${type} STREQUAL "STATIC_LIBRARY")
# message(STATUS "Adding symlink for ${_target}")
add_custom_command(TARGET onnxruntime POST_BUILD
COMMAND ${CMAKE_COMMAND} -E create_symlink
$<TARGET_FILE:${_target}> ${STATIC_LIB_DIR}/$<TARGET_LINKER_FILE_NAME:${_target}>)
endif()
endforeach()

list(APPEND _processed_libs ${lib_and_dependencies})
set(_processed_libs ${_processed_libs} PARENT_SCOPE)
endfunction()

# for external libraries we create a symlink to the .a file
foreach(_LIB ${onnxruntime_EXTERNAL_LIBRARIES})
if(NOT TARGET ${_LIB}) # if we didn't build from source. it may not a target
if(NOT TARGET ${_LIB}) # if we didn't build from source it may not be a target
continue()
endif()

GET_TARGET_PROPERTY(_LIB_TYPE ${_LIB} TYPE)
if(_LIB_TYPE STREQUAL "STATIC_LIBRARY")
add_custom_command(TARGET onnxruntime POST_BUILD
COMMAND ${CMAKE_COMMAND} -E create_symlink
$<TARGET_FILE:${_LIB}> ${STATIC_LIB_DIR}/$<TARGET_LINKER_FILE_NAME:${_LIB}>)
add_symlink_for_static_lib_and_dependencies(${_LIB})
endif()
endforeach()

Expand Down
12 changes: 3 additions & 9 deletions cmake/onnxruntime_providers_webgpu.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -24,14 +24,8 @@

source_group(TREE ${REPO_ROOT} FILES ${onnxruntime_providers_webgpu_cc_srcs})
onnxruntime_add_static_library(onnxruntime_providers_webgpu ${onnxruntime_providers_webgpu_cc_srcs})
onnxruntime_add_include_to_target(onnxruntime_providers_webgpu onnxruntime_common onnx onnx_proto flatbuffers::flatbuffers Boost::mp11 safeint_interface)
target_link_libraries(onnxruntime_providers_webgpu dawn::webgpu_dawn)

# Copy webgpu_dawn.dll to the output directory
add_custom_command(
TARGET onnxruntime_providers_webgpu
POST_BUILD
COMMAND ${CMAKE_COMMAND} -E copy_if_different "$<TARGET_FILE:dawn::webgpu_dawn>" "$<TARGET_FILE_DIR:onnxruntime_providers_webgpu>"
VERBATIM )
onnxruntime_add_include_to_target(onnxruntime_providers_webgpu
onnxruntime_common dawn::dawncpp_headers dawn::dawn_headers onnx onnx_proto flatbuffers::flatbuffers Boost::mp11 safeint_interface)
target_link_libraries(onnxruntime_providers_webgpu dawn::dawn_native dawn::dawn_proc)

set_target_properties(onnxruntime_providers_webgpu PROPERTIES FOLDER "ONNXRuntime")
66 changes: 66 additions & 0 deletions cmake/patches/dawn/dawn.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
diff --git a/src/dawn/native/CMakeLists.txt b/src/dawn/native/CMakeLists.txt
index 9c0bd6fa4e..bf8a57aeac 100644
--- a/src/dawn/native/CMakeLists.txt
+++ b/src/dawn/native/CMakeLists.txt
@@ -857,6 +857,11 @@ if (DAWN_ENABLE_SWIFTSHADER)
target_compile_definitions(dawn_native PRIVATE "DAWN_ENABLE_SWIFTSHADER")
endif()

+if (IOS)
+ target_compile_options(dawn_native_objects PRIVATE -fno-objc-arc)
+ target_compile_options(dawn_native PRIVATE -fno-objc-arc)
+endif()
+
if (DAWN_BUILD_MONOLITHIC_LIBRARY)
###############################################################################
# Do the 'complete_lib' build.
diff --git a/src/dawn/native/Surface_metal.mm b/src/dawn/native/Surface_metal.mm
index ce55acbd43..baa4835362 100644
--- a/src/dawn/native/Surface_metal.mm
+++ b/src/dawn/native/Surface_metal.mm
@@ -36,7 +36,13 @@
namespace dawn::native {

bool InheritsFromCAMetalLayer(void* obj) {
- id<NSObject> object = static_cast<id>(obj);
+ id<NSObject> object =
+#if TARGET_OS_IOS
+ (__bridge id)obj;
+#else
+ static_cast<id>(obj);
+#endif
+
return [object isKindOfClass:[CAMetalLayer class]];
}

diff --git a/src/dawn/native/metal/SharedFenceMTL.mm b/src/dawn/native/metal/SharedFenceMTL.mm
index bde8bfea07..f2f6459e91 100644
--- a/src/dawn/native/metal/SharedFenceMTL.mm
+++ b/src/dawn/native/metal/SharedFenceMTL.mm
@@ -40,7 +40,13 @@ ResultOrError<Ref<SharedFence>> SharedFence::Create(
DAWN_INVALID_IF(descriptor->sharedEvent == nullptr, "MTLSharedEvent is missing.");
if (@available(macOS 10.14, iOS 12.0, *)) {
return AcquireRef(new SharedFence(
- device, label, static_cast<id<MTLSharedEvent>>(descriptor->sharedEvent)));
+ device, label,
+#if TARGET_OS_IOS
+ (__bridge id<MTLSharedEvent>)(descriptor->sharedEvent)
+#else
+ static_cast<id<MTLSharedEvent>>(descriptor->sharedEvent)
+#endif
+ ));
} else {
return DAWN_INTERNAL_ERROR("MTLSharedEvent not supported.");
}
diff --git a/src/tint/api/BUILD.cmake b/src/tint/api/BUILD.cmake
index 0037d83276..6372c4ee77 100644
--- a/src/tint/api/BUILD.cmake
+++ b/src/tint/api/BUILD.cmake
@@ -57,6 +57,7 @@ tint_target_add_dependencies(tint_api lib
tint_lang_wgsl_ast_transform
tint_lang_wgsl_common
tint_lang_wgsl_features
+ tint_lang_wgsl_inspector
tint_lang_wgsl_program
tint_lang_wgsl_sem
tint_lang_wgsl_writer_ir_to_program
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.

// Dummy file to provide a signal in the ONNX Runtime C cocoapod as to whether the WebGPU EP was included in the build.
// If it was, this file will be included in the cocoapod, and a test like this can be used:
//
// #if __has_include(<onnxruntime/webgpu_provider_factory.h>)
// #define WEBGPU_EP_AVAILABLE 1
// #else
// #define WEBGPU_EP_AVAILABLE 0
// #endif

// The WebGPU EP can be enabled via the generic SessionOptionsAppendExecutionProvider method, so no direct usage of
// the provider factory is required.
4 changes: 3 additions & 1 deletion java/src/main/java/ai/onnxruntime/OrtProvider.java
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,9 @@ public enum OrtProvider {
/** The XNNPACK execution provider. */
XNNPACK("XnnpackExecutionProvider"),
/** The Azure remote endpoint execution provider. */
AZURE("AzureExecutionProvider");
AZURE("AzureExecutionProvider"),
/** The WebGPU execution provider */
WEBGPU("WebGpuExecutionProvider");

private static final Map<String, OrtProvider> valueMap = new HashMap<>(values().length);

Expand Down
2 changes: 0 additions & 2 deletions onnxruntime/core/platform/apple/logging/apple_log_sink.mm
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,6 @@

#include <sstream>

#include "date/date.h"

namespace onnxruntime {
namespace logging {

Expand Down
5 changes: 5 additions & 0 deletions onnxruntime/core/providers/webgpu/webgpu_context.cc
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,9 @@
#include <memory>
#include <cmath>

#include "dawn/dawn_proc.h"
#include "dawn/native/DawnNative.h"

#include "core/common/common.h"

#include "core/providers/webgpu/compute_context.h"
Expand All @@ -21,6 +24,8 @@ void WebGpuContext::Initialize(const WebGpuExecutionProviderInfo& webgpu_ep_info
std::call_once(init_flag_, [this, &webgpu_ep_info]() {
snnn marked this conversation as resolved.
Show resolved Hide resolved
// Initialization.Step.1 - Create wgpu::Instance
if (instance_ == nullptr) {
dawnProcSetProcs(&dawn::native::GetProcs());

wgpu::InstanceDescriptor instance_desc{};
instance_desc.features.timedWaitAnyEnable = true;
instance_ = wgpu::CreateInstance(&instance_desc);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,7 @@
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 9, Cosh);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 9, Asinh);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 9, Acosh);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 9, Atanh);
// class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 9, Atanh); TEMPORARY - Doesn't handle 1.0f -> inf with Metal

Check warning on line 118 in onnxruntime/core/providers/webgpu/webgpu_execution_provider.cc

View workflow job for this annotation

GitHub Actions / Optional Lint C++

[cpplint] reported by reviewdog 🐶 Lines should be <= 120 characters long [whitespace/line_length] [2] Raw Output: onnxruntime/core/providers/webgpu/webgpu_execution_provider.cc:118: Lines should be <= 120 characters long [whitespace/line_length] [2]
class ONNX_OPERATOR_VERSIONED_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 6, 12, Tanh);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 13, Tanh);
class ONNX_OPERATOR_KERNEL_CLASS_NAME(kWebGpuExecutionProvider, kOnnxDomain, 1, Not);
Expand Down Expand Up @@ -435,7 +435,7 @@
KERNEL_CREATE_INFO(9, Cosh),
KERNEL_CREATE_INFO(9, Asinh),
KERNEL_CREATE_INFO(9, Acosh),
KERNEL_CREATE_INFO(9, Atanh),
// KERNEL_CREATE_INFO(9, Atanh), TEMPORARY - Doesn't handle 1.0f -> inf with Metal
fs-eire marked this conversation as resolved.
Show resolved Hide resolved
KERNEL_CREATE_INFO_VERSIONED(6, 12, Tanh),
KERNEL_CREATE_INFO(13, Tanh),
KERNEL_CREATE_INFO(1, Not),
Expand Down
Loading
Loading