
[PyTorch] Update to 2.4.0, add distributed #1510

Merged · 96 commits · Sep 1, 2024
Commits
9d7b815
2.3.1, add distributed, intrusive and weak adapters, cuda dep
HGuillemet Jun 7, 2024
4510736
Fix gloo include link for Windows
HGuillemet Jun 8, 2024
2d7a3c0
Move `IntrusivePtr` and `WeakPtr` to `helper` package
HGuillemet Jun 8, 2024
a46b49b
Change order of includes to address compilation error on windows
HGuillemet Jun 8, 2024
8b9b5ed
Define _WINSOCKAPI_ to address compilation error on windows
HGuillemet Jun 8, 2024
4760ec4
Patch ProcessGroupGloo.hpp to address compilation issue on windows
HGuillemet Jun 8, 2024
e329cc5
Revert "Change order of includes"
HGuillemet Jun 9, 2024
aef540e
Remove includes not available on Windows
HGuillemet Jun 9, 2024
4b99f14
Add libuv to deploy-windows
HGuillemet Jun 9, 2024
c7c374d
Revert "Add libuv to deploy-windows"
HGuillemet Jun 13, 2024
0e74675
Add compilation of libuv for windows. Remove non exported classes.
HGuillemet Jun 13, 2024
20b3bfc
Fix compilation of libuv
HGuillemet Jun 13, 2024
3bb3c2a
Fix creation of link for gloo includes
HGuillemet Jun 13, 2024
6cc590a
Fix libuv files copying on windows
HGuillemet Jun 14, 2024
9e0bd72
Add cuda-platform dep to javacpp plugin
HGuillemet Jun 14, 2024
bd2c8c9
Remove NCCL
HGuillemet Jun 15, 2024
3ba3ac2
Merge functions packages into main packages
HGuillemet Jun 15, 2024
7abb739
Add uv to preloads
HGuillemet Jun 17, 2024
3165a76
Add uv to preloads
HGuillemet Jun 20, 2024
f059d37
Remove openmp preloads. Remove pytorch FindOpenMP for all platforms.
HGuillemet Jun 22, 2024
d882992
Update module-info.java
HGuillemet Jun 22, 2024
75e6526
2.3.1, add distributed, intrusive and weak adapters, cuda dep
HGuillemet Jun 7, 2024
e366bdc
Fix gloo include link for Windows
HGuillemet Jun 8, 2024
65d05dc
Move `IntrusivePtr` and `WeakPtr` to `helper` package
HGuillemet Jun 8, 2024
8d7c893
Change order of includes to address compilation error on windows
HGuillemet Jun 8, 2024
2ec8d71
Define _WINSOCKAPI_ to address compilation error on windows
HGuillemet Jun 8, 2024
f07868b
Patch ProcessGroupGloo.hpp to address compilation issue on windows
HGuillemet Jun 8, 2024
6932aea
Revert "Change order of includes"
HGuillemet Jun 9, 2024
e301e1d
Remove includes not available on Windows
HGuillemet Jun 9, 2024
cbc274f
Add libuv to deploy-windows
HGuillemet Jun 9, 2024
0d71e18
Revert "Add libuv to deploy-windows"
HGuillemet Jun 13, 2024
ad457cb
Add compilation of libuv for windows. Remove non exported classes.
HGuillemet Jun 13, 2024
3497f44
Fix compilation of libuv
HGuillemet Jun 13, 2024
bfb4ad2
Fix creation of link for gloo includes
HGuillemet Jun 13, 2024
e2fb568
Fix libuv files copying on windows
HGuillemet Jun 14, 2024
16e78e2
Add cuda-platform dep to javacpp plugin
HGuillemet Jun 14, 2024
adffe7c
Remove NCCL
HGuillemet Jun 15, 2024
243e67a
Merge functions packages into main packages
HGuillemet Jun 15, 2024
1eefaf8
Add uv to preloads
HGuillemet Jun 17, 2024
8b14657
Add uv to preloads
HGuillemet Jun 20, 2024
8370f7b
Remove openmp preloads. Remove pytorch FindOpenMP for all platforms.
HGuillemet Jun 22, 2024
cca5b5c
Update module-info.java
HGuillemet Jun 22, 2024
8558364
Fix openmp on mac
HGuillemet Jun 22, 2024
71b330e
Merge remote-tracking branch 'HGuillemet/hg_pytorch' into hg_pytorch
HGuillemet Jun 30, 2024
b03bf72
Merge branch 'master' into hg_pytorch
HGuillemet Jul 5, 2024
2730203
Merge branch 'master' into hg_pytorch
HGuillemet Jul 6, 2024
5e9f862
Upgrade to PyTorch 2.4.0
HGuillemet Aug 1, 2024
6151980
Fix compilation error on MacOS 12
HGuillemet Aug 1, 2024
79e20b9
Fix "non-standard-layout" warnings
HGuillemet Aug 1, 2024
e372f94
Fix link error on windows
HGuillemet Aug 1, 2024
1f57f8c
Make torch presets inherit from cuda presets
HGuillemet Aug 1, 2024
a3d59a8
Fix link error on windows
HGuillemet Aug 1, 2024
8be483d
Revert "Fix compilation error on MacOS 12"
HGuillemet Aug 1, 2024
f9cd783
Run macos-x86_64 workflow on macos-13
HGuillemet Aug 1, 2024
f4884ba
Use chrono from javacpp
HGuillemet Aug 2, 2024
c20d891
link with cuda_linalg on linux
HGuillemet Aug 2, 2024
9ceff5c
Update gen after chrono merge
HGuillemet Aug 4, 2024
f263188
Remove helper package
HGuillemet Aug 5, 2024
973af1c
Update gen
HGuillemet Aug 5, 2024
cef1268
Remove useless imports
HGuillemet Aug 9, 2024
bdf9805
Split JNI libraries
HGuillemet Aug 9, 2024
feed000
Add brew link for libomp on macosx
HGuillemet Aug 10, 2024
81e39b2
Rename libomp as libiomp5 on macosx
HGuillemet Aug 11, 2024
1c621b3
Links most cuda libs to jnitorch_cuda only.
HGuillemet Aug 11, 2024
f8a5510
Replace libomp by libiomp5 in link list
HGuillemet Aug 11, 2024
9f0cac4
Add path for cupti
HGuillemet Aug 12, 2024
4106713
Remove useless import
HGuillemet Aug 12, 2024
d71057a
Add a missing @NoOffset
HGuillemet Aug 12, 2024
2bc75cb
Revert "Links most cuda libs to jnitorch_cuda only" and preload/link …
HGuillemet Aug 13, 2024
da86531
Fix typo
HGuillemet Aug 14, 2024
dbc19eb
Fix linking
HGuillemet Aug 14, 2024
197c450
Fix linking on Windows
HGuillemet Aug 14, 2024
e2ca26b
Update for cuda 12.6
HGuillemet Aug 15, 2024
c88f4f7
Merge remote-tracking branch 'bytedeco/master' into hg_pytorch
HGuillemet Aug 15, 2024
0d506e6
Fix linking of torch_cuda
HGuillemet Aug 16, 2024
b67366f
Let brew overwrite links already installed in runner
HGuillemet Aug 17, 2024
ed1eff8
Revert "brew --overwrite". Uninstall already installed python instead.
HGuillemet Aug 17, 2024
1f7407c
Keep Apple-installed Python instead.
HGuillemet Aug 17, 2024
cc8680f
Patch torch source instead
HGuillemet Aug 18, 2024
ac9e231
Revert spurious modification of poms in cpython
HGuillemet Aug 19, 2024
b5a43a8
Add cupti
HGuillemet Aug 19, 2024
df91ff1
Fix JNI compilation errors
HGuillemet Aug 20, 2024
bcf6464
Add missing link and linkpath
HGuillemet Aug 20, 2024
e3cf216
Change author
HGuillemet Aug 20, 2024
45f4aeb
Add preload for windows
HGuillemet Aug 21, 2024
e897fef
Add cupti from cuda presets
HGuillemet Aug 21, 2024
8f383ef
Add back StrideVaryingShape
HGuillemet Aug 21, 2024
adb324d
Skip max_compile_time_stream_priorities
HGuillemet Aug 21, 2024
c69d712
Merge branch 'cuda_cupti' into hg_pytorch
HGuillemet Aug 21, 2024
da9e8e0
Update gen
HGuillemet Aug 21, 2024
aec9d0a
Fix preload added for wrong platform
HGuillemet Aug 22, 2024
161903a
Merge remote-tracking branch 'origin/master' into hg_pytorch
HGuillemet Aug 22, 2024
277548c
Restore preloading of asmjit,fbgemm on windows
HGuillemet Aug 24, 2024
4a9b029
Update CHANGELOG.md and fix nits
saudet Aug 25, 2024
25b38fd
Add preloading of gomp on linux
HGuillemet Aug 25, 2024
e48c077
Merge remote-tracking branch 'HGuillemet/hg_pytorch' into hg_pytorch
HGuillemet Aug 25, 2024
9 changes: 3 additions & 6 deletions .github/workflows/pytorch.yml
@@ -33,16 +33,13 @@ jobs:
- uses: bytedeco/javacpp-presets/.github/actions/deploy-ubuntu@actions
timeout-minutes: 350
macosx-arm64:
runs-on: macos-12
runs-on: macos-14
steps:
- uses: bytedeco/javacpp-presets/.github/actions/deploy-macosx@actions
- uses: HGuillemet/javacpp-presets/.github/actions/deploy-macosx@hg_pytorch
macosx-x86_64:
runs-on: macos-12
# strategy:
# matrix:
# ext: ["", -gpu]
steps:
- uses: bytedeco/javacpp-presets/.github/actions/deploy-macosx@actions
- uses: HGuillemet/javacpp-presets/.github/actions/deploy-macosx@hg_pytorch
windows-x86_64:
runs-on: windows-2019
strategy:
2 changes: 1 addition & 1 deletion platform/pom.xml
@@ -292,7 +292,7 @@
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform</artifactId>
<version>2.3.0-${project.version}</version>
<version>2.4.0-${project.version}</version>
</dependency>
<dependency>
<groupId>org.bytedeco</groupId>
6 changes: 3 additions & 3 deletions pytorch/README.md
@@ -9,7 +9,7 @@ Introduction
------------
This directory contains the JavaCPP Presets module for:

* PyTorch 2.3.0 https://pytorch.org/
* PyTorch 2.4.0 https://pytorch.org/

Please refer to the parent README.md file for more detailed information about the JavaCPP Presets.

@@ -48,14 +48,14 @@ We can use [Maven 3](http://maven.apache.org/) to download and install automatic
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform</artifactId>
<version>2.3.0-1.5.11-SNAPSHOT</version>
<version>2.4.0-1.5.11-SNAPSHOT</version>
</dependency>

<!-- Additional dependencies required to use CUDA, cuDNN, and NCCL -->
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform-gpu</artifactId>
<version>2.3.0-1.5.11-SNAPSHOT</version>
<version>2.4.0-1.5.11-SNAPSHOT</version>
</dependency>

<!-- Additional dependencies to use bundled CUDA, cuDNN, and NCCL -->
63 changes: 58 additions & 5 deletions pytorch/cppbuild.sh
@@ -22,6 +22,9 @@ export USE_CUDNN=0
export USE_NUMPY=0
export USE_OPENMP=1
export USE_SYSTEM_NCCL=1
export USE_DISTRIBUTED=1
export USE_NCCL=0 # Not supported on Windows

if [[ "$EXTENSION" == *gpu ]]; then
export USE_CUDA=1
export USE_CUDNN=1
@@ -35,7 +38,7 @@ if [[ $PLATFORM == windows* ]]; then
export PYTHON_BIN_PATH=$(which python.exe)
fi

PYTORCH_VERSION=2.3.0
PYTORCH_VERSION=2.4.0

export PYTORCH_BUILD_VERSION="$PYTORCH_VERSION"
export PYTORCH_BUILD_NUMBER=1
@@ -44,6 +47,23 @@ mkdir -p "$PLATFORM$EXTENSION"
cd "$PLATFORM$EXTENSION"
INSTALL_PATH=`pwd`

# Distributed needs libuv on Windows (on other platforms, it's included in tensorpipe)
if [[ $PLATFORM == windows* ]]; then
if [[ ! -d libuv ]]; then
mkdir libuv
cd libuv
download https://dist.libuv.org/dist/v1.39.0/libuv-v1.39.0.tar.gz libuv.tgz
tar xfz libuv.tgz
mkdir build
cd build
cmake ../libuv-v1.39.0 -DBUILD_TESTING=OFF
cmake --build . --config Release
cmake --install . --config Release --prefix ../dist
cd ../..
fi
export libuv_ROOT=${INSTALL_PATH}/libuv/dist
fi
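The libuv build above stages headers and libraries into a `dist` prefix that CMake then picks up via `libuv_ROOT`; a sanity check one might run after that block (a hypothetical helper, not part of the original script — the layout it probes is the one the block above installs):

```shell
# Verify the staged libuv prefix has the layout that CMake's
# libuv_ROOT lookup expects (hypothetical helper).
check_libuv_dist() {
  root="$1"
  for f in include/uv.h lib; do
    if [ ! -e "$root/$f" ]; then
      echo "missing: $root/$f" >&2
      return 1
    fi
  done
  echo "libuv dist OK: $root"
}
```

Calling it as `check_libuv_dist "$INSTALL_PATH/libuv/dist"` right before the `export libuv_ROOT=…` line would fail fast if the CMake install step silently produced nothing.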

if [[ ! -d pytorch ]]; then
git clone https://github.com/pytorch/pytorch
fi
@@ -123,14 +143,16 @@ case $PLATFORM in
macosx-arm64)
export CC="clang"
export CXX="clang++"
export CMAKE_OSX_ARCHITECTURES=arm64 # enable cross-compilation on an x86_64 host machine
# export PATH=$(brew --prefix llvm@18)/bin:$PATH # Use brew LLVM instead of Xcode LLVM 14
export USE_MKLDNN=OFF
export USE_QNNPACK=OFF # not compatible with arm64 as of PyTorch 2.1.2
export CMAKE_OSX_DEPLOYMENT_TARGET=11.00 # minimum needed for arm64 support
;;
macosx-x86_64)
export CC="clang"
export CXX="clang++"
export USE_MKLDNN=OFF
# export PATH=$(brew --prefix llvm@18)/bin:$PATH # Use brew LLVM instead of Xcode LLVM 14
;;
windows-x86_64)
if which ccache.exe; then
@@ -181,22 +203,53 @@ TORCH_API std::ostream& operator<<(std::ostream& stream, const nn::Module& modul
' torch/csrc/api/include/torch/nn/module.h
sedinplace 's/char(\(.*\))/\1/g' torch/csrc/jit/serialization/pickler.h

# some windows header defines a macro named "interface"
sedinplace 's/const std::string& interface)/const std::string\& interface_name)/g' torch/csrc/distributed/c10d/ProcessGroupGloo.hpp

# fix missing #include (Pytorch 2.4.0)
sedinplace 's/#include <stdexcept>/#include <stdexcept>\
#include <vector>\
#include <unordered_map>/' torch/csrc/distributed/c10d/control_plane/Handlers.cpp

# Remove pytorch's adaptations of FindOpenMP.cmake.
# On Windows without iomp and with new versions of VS 2019, compiling with -openmp:experimental and libomp
# causes the final binary to be linked to both libomp and vcomp and produce incorrect results.
# Wait for an eventual upstream fix, or for CMake 3.30, which allows choosing between -openmp and -openmp:experimental,
# and see if choosing experimental works. See issue #1503.
# On Linux, pytorch's FindOpenMP.cmake picks llvm libomp over libgomp. See issue #1504.
# On macOS, the standard CMake version works too.
rm cmake/Modules/FindOpenMP.cmake
sedinplace 's/include(${CMAKE_CURRENT_LIST_DIR}\/Modules\/FindOpenMP.cmake)/find_package(OpenMP)/g' cmake/Dependencies.cmake
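`sedinplace` is a helper defined elsewhere in the javacpp-presets build scripts; a portable approximation (an assumption about its behavior, not the original definition) would be:

```shell
# In-place sed that works on both GNU sed (Linux) and BSD sed (macOS):
# BSD sed requires an explicit, possibly empty, backup suffix after -i.
sedinplace() {
  if [ "$(uname)" = "Darwin" ]; then
    sed -i '' "$@"
  else
    sed -i "$@"
  fi
}
```

With that definition, the `sedinplace 's/include(...FindOpenMP.cmake)/find_package(OpenMP)/g' cmake/Dependencies.cmake` call above rewrites the file in place on either platform.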

#USE_FBGEMM=0 USE_KINETO=0 USE_GLOO=0 USE_MKLDNN=0 \
"$PYTHON_BIN_PATH" setup.py build

rm -Rf ../lib
if [[ ! -e torch/include/gloo ]]; then
ln -sf ../../third_party/gloo/gloo torch/include
fi
ln -sf pytorch/torch/include ../include
ln -sf pytorch/torch/lib ../lib
ln -sf pytorch/torch/bin ../bin

# fix library with correct rpath on Mac
case $PLATFORM in
macosx-*)
cp /usr/local/lib/libomp.dylib ../lib/libiomp5.dylib
# Disguise libomp as libiomp5 (they share the same codebase and have the same symbols).
# This helps if the user wants to link with MKL.
# On Linux, a user linking with MKL would need to set
# MKL_THREADING_LAYER=GNU
cp "$(brew ls libomp|grep libomp.dylib)" ../lib/libiomp5.dylib
chmod +w ../lib/libiomp5.dylib
install_name_tool -id @rpath/libiomp5.dylib ../lib/libiomp5.dylib
install_name_tool -change @rpath/libomp.dylib @rpath/libiomp5.dylib ../lib/libtorch_cpu.dylib
codesign --force -s - ../lib/libiomp5.dylib
old=$(otool -L ../lib/libtorch_cpu.dylib|grep libomp.dylib|awk '{print $1}')
echo install_name_tool -change $old @rpath/libiomp5.dylib ../lib/libtorch_cpu.dylib
install_name_tool -change $old @rpath/libiomp5.dylib ../lib/libtorch_cpu.dylib
codesign --force -s - ../lib/libtorch_cpu.dylib
;;
windows-*)
cp ../libuv/dist/lib/Release/* ../lib
;;
esac

cd ../..
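As the comment in the macOS branch above notes, a user combining these binaries with MKL on Linux would select the GNU threading layer at launch time. A sketch of such an invocation (the launcher function and application command are hypothetical):

```shell
# Run a command with MKL told to use the GNU OpenMP runtime, so it
# cooperates with the libgomp that the Linux binaries preload.
launch_with_gnu_mkl() {
  MKL_THREADING_LAYER=GNU "$@"
}

# In practice the command would be something like `java -jar app.jar`;
# a stand-in command is used here for demonstration.
launch_with_gnu_mkl sh -c 'echo "MKL_THREADING_LAYER=$MKL_THREADING_LAYER"'
```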
39 changes: 31 additions & 8 deletions pytorch/include_list.pl
@@ -18,7 +18,7 @@ ($)
for (my $d = @inc_per_depth - 1; $d >= $min_depth; $d--) {
if ($inc_per_depth[$d]) {
foreach my $i (@{$inc_per_depth[$d]}) {
print "#include \"$i\"\n";
print "#include \"$i\"\n" unless $incs{$i};
$incs{$i} = 1;
}
undef $inc_per_depth[$d];
@@ -27,12 +27,20 @@ ($)
}

sub go {
my $path = join ' ', @_;
my ($roots, $opts) = @_;
my $path = join ' ', @$roots, @$opts;

my $exe = "g++ -I. -I torch/csrc/api/include/ -DUSE_UCC -DUSE_C10D_GLOO -DUSE_C10D_MPI -DUSE_DISTRIBUTED -H $path -E 2>&1 > /dev/null";
#my $exe = "g++ -I. -I torch/csrc/api/include/ -DUSE_UCC -DUSE_C10D_GLOO -DUSE_C10D_MPI -DUSE_DISTRIBUTED -D_WIN32 -H $path -E 2>&1 > /dev/null";
my @inc = `$exe`;
if ($? != 0) {
print STDERR "Failed:\n$exe\nError: $?: $!\n";
exit $?;
}

my @inc = `g++ -I. -I torch/csrc/api/include/ -H $path -E 2>&1 > /dev/null`;
foreach my $i (@inc) {
chomp $i;
my ($depth, $f) = $i =~ /^(\.+)\s(.*\.h)$/;
my ($depth, $f) = $i =~ /^(\.+)\s(.*\.h(?:pp)?)$/;
next unless $depth;
$depth = length($depth);
$f =~ s#^\./##;
@@ -48,18 +56,33 @@ sub go {
push @$incs, $f;
}
flush(0);
foreach my $i (@$roots) {
print "#include \"$i\"\n" unless $incs{$i};
$incs{$i} = 1;
}
}

chdir "cppbuild/linux-x86_64-gpu/pytorch/torch/include";

go('torch/csrc/api/include/torch/torch.h', 'torch/script.h', 'torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h');
print <<EOF;
// Included by
// torch/csrc/api/include/torch/torch.h
// torch/script.h
// torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h
// torch/csrc/distributed/c10d/ProcessGroupGloo.hpp
// torch/csrc/distributed/c10d/PrefixStore.hpp
// torch/csrc/distributed/c10d/logger.hpp
EOF

go(['torch/csrc/api/include/torch/torch.h', 'torch/script.h', 'torch/csrc/inductor/aoti_runner/model_container_runner_cpu.h', 'torch/csrc/distributed/c10d/ProcessGroupGloo.hpp', 'torch/csrc/distributed/c10d/PrefixStore.hpp', 'torch/csrc/distributed/c10d/logger.hpp'], []);

print <<EOF;

// Included by
// ATen/cudnn/Descriptors.h
// ATen/cudnn/Types.h
// c10/cuda/CUDAGuard.h
// ATen/cudnn/Descriptors.h
// ATen/cuda/CUDAEvent.h
// torch/csrc/inductor/aoti_runner/model_container_runner_cuda.h
EOF

go('ATen/cudnn/Descriptors.h', 'ATen/cudnn/Types.h', 'c10/cuda/CUDAGuard.h', '-I/opt/cuda/targets/x86_64-linux/include/', 'torch/csrc/inductor/aoti_runner/model_container_runner_cuda.h');
go(['ATen/cudnn/Types.h', 'ATen/cudnn/Descriptors.h', 'ATen/cuda/CUDAEvent.h', 'torch/csrc/inductor/aoti_runner/model_container_runner_cuda.h'], ['-I/opt/cuda/targets/x86_64-linux/include/', '-DUSE_CUDA']);
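The script's input is `g++ -H` output, where each included header is printed with leading dots encoding its include depth; a minimal shell illustration of that format (header names here are hypothetical samples, and real input would come from the compiler):

```shell
# Each `g++ -H` line has the form "<dots> <header>": the number of
# leading dots is the include depth. Parse depth and filename.
printf '. torch/torch.h\n.. ATen/ATen.h\n... c10/core/TensorImpl.hpp\n' |
while read -r dots file; do
  echo "depth=${#dots} file=$file"
done
```

The `(?:pp)?` added to the regex above makes this parsing also accept `.hpp` headers such as `ProcessGroupGloo.hpp`, which the distributed headers use.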
2 changes: 1 addition & 1 deletion pytorch/platform/gpu/pom.xml
@@ -12,7 +12,7 @@

<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform-gpu</artifactId>
<version>2.3.0-${project.parent.version}</version>
<version>2.4.0-${project.parent.version}</version>
<name>JavaCPP Presets Platform GPU for PyTorch</name>

<properties>
11 changes: 9 additions & 2 deletions pytorch/platform/pom.xml
@@ -12,7 +12,7 @@

<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform</artifactId>
<version>2.3.0-${project.parent.version}</version>
<version>2.4.0-${project.parent.version}</version>
<name>JavaCPP Presets Platform for PyTorch</name>

<properties>
@@ -47,6 +47,12 @@
<version>${project.version}</version>
<classifier>${javacpp.platform.macosx-x86_64}</classifier>
</dependency>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>${javacpp.moduleId}</artifactId>
<version>${project.version}</version>
<classifier>${javacpp.platform.macosx-arm64}</classifier>
</dependency>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>${javacpp.moduleId}</artifactId>
@@ -65,7 +71,7 @@
<configuration>
<archive>
<manifestEntries>
<Class-Path>${javacpp.moduleId}.jar ${javacpp.moduleId}-linux-x86_64.jar ${javacpp.moduleId}-macosx-x86_64.jar ${javacpp.moduleId}-windows-x86_64.jar</Class-Path>
<Class-Path>${javacpp.moduleId}.jar ${javacpp.moduleId}-linux-x86_64.jar ${javacpp.moduleId}-macosx-x86_64.jar ${javacpp.moduleId}-macosx-arm64.jar ${javacpp.moduleId}-windows-x86_64.jar</Class-Path>
</manifestEntries>
</archive>
</configuration>
@@ -112,6 +118,7 @@
module org.bytedeco.${javacpp.moduleId}.platform {
requires static org.bytedeco.${javacpp.moduleId}.linux.x86_64;
requires static org.bytedeco.${javacpp.moduleId}.macosx.x86_64;
requires static org.bytedeco.${javacpp.moduleId}.macosx.arm64;
requires static org.bytedeco.${javacpp.moduleId}.windows.x86_64;
}
</moduleInfoSource>
14 changes: 13 additions & 1 deletion pytorch/pom.xml
@@ -11,7 +11,7 @@

<groupId>org.bytedeco</groupId>
<artifactId>pytorch</artifactId>
<version>2.3.0-${project.parent.version}</version>
<version>2.4.0-${project.parent.version}</version>
<name>JavaCPP Presets for PyTorch</name>

<dependencies>
@@ -24,6 +24,12 @@
<artifactId>openblas</artifactId>
<version>0.3.28-${project.parent.version}</version>
</dependency>
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>cuda</artifactId>
<version>12.6-9.3-${project.parent.version}</version>
<optional>true</optional>
</dependency>
</dependencies>

<build>
@@ -43,6 +49,11 @@
<artifactId>openblas-platform</artifactId>
<version>0.3.28-${project.parent.version}</version>
</dependency>
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>cuda-platform</artifactId>
<version>12.6-9.3-${project.parent.version}</version>
</dependency>
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>numpy-platform</artifactId>
@@ -60,6 +71,7 @@
<classPath>${basedir}/../openblas/target/classes/</classPath>
<classPath>${basedir}/../cpython/target/classes/</classPath>
<classPath>${basedir}/../numpy/target/classes/</classPath>
<classPath>${basedir}/../cuda/target/classes/</classPath>
<classPath>${project.build.outputDirectory}</classPath>
</classPaths>
<includePaths>
4 changes: 2 additions & 2 deletions pytorch/samples/pom.xml
@@ -12,14 +12,14 @@
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform</artifactId>
<version>2.3.0-1.5.11-SNAPSHOT</version>
<version>2.4.0-1.5.11-SNAPSHOT</version>
</dependency>

<!-- Additional dependencies required to use CUDA, cuDNN, and NCCL -->
<dependency>
<groupId>org.bytedeco</groupId>
<artifactId>pytorch-platform-gpu</artifactId>
<version>2.3.0-1.5.11-SNAPSHOT</version>
<version>2.4.0-1.5.11-SNAPSHOT</version>
</dependency>

<!-- Additional dependencies to use bundled CUDA, cuDNN, and NCCL -->
@@ -4,7 +4,6 @@

import org.bytedeco.pytorch.Allocator;
import org.bytedeco.pytorch.Function;
import org.bytedeco.pytorch.functions.*;
import org.bytedeco.pytorch.Module;
import org.bytedeco.javacpp.annotation.Cast;
import java.nio.*;
@@ -14,6 +13,8 @@
import static org.bytedeco.javacpp.presets.javacpp.*;
import static org.bytedeco.openblas.global.openblas_nolapack.*;
import static org.bytedeco.openblas.global.openblas.*;
import org.bytedeco.javacpp.chrono.*;
import static org.bytedeco.javacpp.global.chrono.*;

import static org.bytedeco.pytorch.global.torch.*;

@@ -35,9 +36,9 @@ public class AOTIModelContainerRunner extends Pointer {

public native @ByVal ExtraFilesMap getConstantNamesToOriginalFQNs();
public native @ByVal StringIntMap getConstantNamesToDtypes();
public native void update_inactive_constant_buffer(@Cast("const torch::inductor::TensorConstantMap*") @ByRef HashAliasedIValueMap const_map);
public native void update_inactive_constant_buffer(@Cast("const torch::inductor::TensorConstantMap*") @ByRef SizeTStringMap const_map);
public native void update_constant_buffer(
@Cast("const torch::inductor::TensorConstantMap*") @ByRef HashAliasedIValueMap const_map,
@Cast("const torch::inductor::TensorConstantMap*") @ByRef SizeTStringMap const_map,
@Cast("bool") boolean use_inactive,
@Cast("bool") boolean validate_full_updates);
public native void run_const_fold(
@@ -4,7 +4,6 @@

import org.bytedeco.pytorch.Allocator;
import org.bytedeco.pytorch.Function;
import org.bytedeco.pytorch.functions.*;
import org.bytedeco.pytorch.Module;
import org.bytedeco.javacpp.annotation.Cast;
import java.nio.*;
@@ -14,6 +13,8 @@
import static org.bytedeco.javacpp.presets.javacpp.*;
import static org.bytedeco.openblas.global.openblas_nolapack.*;
import static org.bytedeco.openblas.global.openblas.*;
import org.bytedeco.javacpp.chrono.*;
import static org.bytedeco.javacpp.global.chrono.*;

import static org.bytedeco.pytorch.global.torch.*;
