
Use llama instead of libllama in [DllImport] #465

Merged (2 commits) on Jan 30, 2024
12 changes: 6 additions & 6 deletions .github/workflows/compile.yml
@@ -204,18 +204,18 @@ jobs:
 cp artifacts/llama-bin-linux-avx2-x64.so/libllama.so deps/avx2/libllama.so
 cp artifacts/llama-bin-linux-avx512-x64.so/libllama.so deps/avx512/libllama.so

-cp artifacts/llama-bin-win-noavx-x64.dll/llama.dll deps/libllama.dll
-cp artifacts/llama-bin-win-avx-x64.dll/llama.dll deps/avx/libllama.dll
-cp artifacts/llama-bin-win-avx2-x64.dll/llama.dll deps/avx2/libllama.dll
-cp artifacts/llama-bin-win-avx512-x64.dll/llama.dll deps/avx512/libllama.dll
+cp artifacts/llama-bin-win-noavx-x64.dll/llama.dll deps/llama.dll
+cp artifacts/llama-bin-win-avx-x64.dll/llama.dll deps/avx/llama.dll
+cp artifacts/llama-bin-win-avx2-x64.dll/llama.dll deps/avx2/llama.dll
+cp artifacts/llama-bin-win-avx512-x64.dll/llama.dll deps/avx512/llama.dll

 cp artifacts/llama-bin-osx-arm64.dylib/libllama.dylib deps/osx-arm64/libllama.dylib
 cp artifacts/ggml-metal.metal/ggml-metal.metal deps/osx-arm64/ggml-metal.metal
 cp artifacts/llama-bin-osx-x64.dylib/libllama.dylib deps/osx-x64/libllama.dylib

-cp artifacts/llama-bin-win-cublas-cu11.7.1-x64.dll/llama.dll deps/cu11.7.1/libllama.dll
+cp artifacts/llama-bin-win-cublas-cu11.7.1-x64.dll/llama.dll deps/cu11.7.1/llama.dll
 cp artifacts/llama-bin-linux-cublas-cu11.7.1-x64.so/libllama.so deps/cu11.7.1/libllama.so
-cp artifacts/llama-bin-win-cublas-cu12.1.0-x64.dll/llama.dll deps/cu12.1.0/libllama.dll
+cp artifacts/llama-bin-win-cublas-cu12.1.0-x64.dll/llama.dll deps/cu12.1.0/llama.dll
 cp artifacts/llama-bin-linux-cublas-cu12.1.0-x64.so/libllama.so deps/cu12.1.0/libllama.so

 - name: Upload artifacts
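The workflow copies now leave Windows binaries under their upstream name (`llama.dll`, no `lib` prefix), while Linux and macOS keep the POSIX `lib` prefix. A minimal sketch of that per-platform mapping (the helper function and runtime-identifier strings are illustrative, not part of the workflow):

```shell
# Map a .NET runtime identifier to the native file name that
# [DllImport("llama")] ultimately resolves on that platform.
library_name() {
  case "$1" in
    win-*)   echo "llama.dll" ;;       # Windows: ".dll" appended, no "lib" prefix
    linux-*) echo "libllama.so" ;;     # Linux: "lib" prefix and ".so" suffix
    osx-*)   echo "libllama.dylib" ;;  # macOS: "lib" prefix and ".dylib" suffix
    *)       echo "unknown" >&2; return 1 ;;
  esac
}

library_name win-x64      # -> llama.dll
library_name linux-x64    # -> libllama.so
library_name osx-arm64    # -> libllama.dylib
```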
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -16,7 +16,7 @@ When building from source, please add `-DBUILD_SHARED_LIBS=ON` to the cmake inst
 cmake .. -DLLAMA_CUBLAS=ON -DBUILD_SHARED_LIBS=ON
 ```

-After running `cmake --build . --config Release`, you could find the `llama.dll`, `llama.so` or `llama.dylib` in your build directory. After pasting it to `LLamaSharp/LLama/runtimes` and renaming it to `libllama.dll`, `libllama.so` or `libllama.dylib`, you can use it as the native library in LLamaSharp.
+After running `cmake --build . --config Release`, you could find the `llama.dll`, `llama.so` or `llama.dylib` in your build directory. After pasting it to `LLamaSharp/LLama/runtimes` you can use it as the native library in LLamaSharp.


 ## Add a new feature to LLamaSharp
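The CONTRIBUTING change drops the manual rename step: with the library loaded as `llama`, a Windows build output can be copied under its upstream name. A hedged sketch with throwaway placeholder paths (the `/tmp/demo` layout is illustrative only):

```shell
# Previously the Windows artifact had to be renamed to libllama.dll;
# now its upstream name is kept. Simulated with an empty placeholder file.
mkdir -p /tmp/demo/LLamaSharp/LLama/runtimes
touch /tmp/demo/llama.dll                                   # stand-in for cmake output
cp /tmp/demo/llama.dll /tmp/demo/LLamaSharp/LLama/runtimes/ # copy, no rename
ls /tmp/demo/LLamaSharp/LLama/runtimes
```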
24 changes: 12 additions & 12 deletions LLama/LLamaSharp.Runtime.targets
@@ -4,29 +4,29 @@
 </PropertyGroup>
 <ItemGroup Condition="'$(IncludeBuiltInRuntimes)' == 'true'">

-<None Include="$(MSBuildThisFileDirectory)runtimes/deps/libllama.dll">
+<None Include="$(MSBuildThisFileDirectory)runtimes/deps/llama.dll">
 <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>runtimes/win-x64/native/noavx/libllama.dll</Link>
+<Link>runtimes/win-x64/native/noavx/llama.dll</Link>
 </None>
-<None Include="$(MSBuildThisFileDirectory)runtimes/deps/avx/libllama.dll">
+<None Include="$(MSBuildThisFileDirectory)runtimes/deps/avx/llama.dll">
 <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>runtimes/win-x64/native/avx/libllama.dll</Link>
+<Link>runtimes/win-x64/native/avx/llama.dll</Link>
 </None>
-<None Include="$(MSBuildThisFileDirectory)runtimes/deps/avx2/libllama.dll">
+<None Include="$(MSBuildThisFileDirectory)runtimes/deps/avx2/llama.dll">
 <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>runtimes/win-x64/native/avx2/libllama.dll</Link>
+<Link>runtimes/win-x64/native/avx2/llama.dll</Link>
 </None>
-<None Include="$(MSBuildThisFileDirectory)runtimes/deps/avx512/libllama.dll">
+<None Include="$(MSBuildThisFileDirectory)runtimes/deps/avx512/llama.dll">
 <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>runtimes/win-x64/native/avx512/libllama.dll</Link>
+<Link>runtimes/win-x64/native/avx512/llama.dll</Link>
 </None>
-<None Include="$(MSBuildThisFileDirectory)runtimes/deps/cu11.7.1/libllama.dll">
+<None Include="$(MSBuildThisFileDirectory)runtimes/deps/cu11.7.1/llama.dll">
 <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>runtimes/win-x64/native/cuda11/libllama.dll</Link>
+<Link>runtimes/win-x64/native/cuda11/llama.dll</Link>
 </None>
-<None Include="$(MSBuildThisFileDirectory)runtimes/deps/cu12.1.0/libllama.dll">
+<None Include="$(MSBuildThisFileDirectory)runtimes/deps/cu12.1.0/llama.dll">
 <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
-<Link>runtimes/win-x64/native/cuda12/libllama.dll</Link>
+<Link>runtimes/win-x64/native/cuda12/llama.dll</Link>
 </None>

 <None Include="$(MSBuildThisFileDirectory)runtimes/deps/libllama.so">
27 changes: 15 additions & 12 deletions LLama/Native/NativeApi.Load.cs
@@ -129,31 +129,33 @@ private static string GetCudaVersionFromPath(string cudaPath)
 }

 #if NET6_0_OR_GREATER
-private static string GetAvxLibraryPath(NativeLibraryConfig.AvxLevel avxLevel, string prefix, string suffix)
+private static string GetAvxLibraryPath(NativeLibraryConfig.AvxLevel avxLevel, string prefix, string suffix, string libraryNamePrefix)
 {
 var avxStr = NativeLibraryConfig.AvxLevelToString(avxLevel);
 if (!string.IsNullOrEmpty(avxStr))
 {
 avxStr += "/";
 }
-return $"{prefix}{avxStr}{libraryName}{suffix}";
+return $"{prefix}{avxStr}{libraryNamePrefix}{libraryName}{suffix}";
 }

 private static List<string> GetLibraryTryOrder(NativeLibraryConfig.Description configuration)
 {
 OSPlatform platform;
-string prefix, suffix;
+string prefix, suffix, libraryNamePrefix;
 if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
 {
 platform = OSPlatform.Windows;
 prefix = "runtimes/win-x64/native/";
 suffix = ".dll";
+libraryNamePrefix = "";
 }
 else if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
 {
 platform = OSPlatform.Linux;
 prefix = "runtimes/linux-x64/native/";
 suffix = ".so";
+libraryNamePrefix = "lib";
 }
 else if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
 {
@@ -163,6 +165,7 @@ private static List<string> GetLibraryTryOrder(NativeLibraryConfig.Description c
 prefix = System.Runtime.Intrinsics.Arm.ArmBase.Arm64.IsSupported
 ? "runtimes/osx-arm64/native/"
 : "runtimes/osx-x64/native/";
+libraryNamePrefix = "lib";
 }
 else
 {
@@ -181,8 +184,8 @@
 // if check skipped, we just try to load cuda libraries one by one.
 if (configuration.SkipCheck)
 {
-result.Add($"{prefix}cuda12/{libraryName}{suffix}");
-result.Add($"{prefix}cuda11/{libraryName}{suffix}");
+result.Add($"{prefix}cuda12/{libraryNamePrefix}{libraryName}{suffix}");
+result.Add($"{prefix}cuda11/{libraryNamePrefix}{libraryName}{suffix}");
 }
 else
 {
@@ -209,25 +212,25 @@ private static List<string> GetLibraryTryOrder(NativeLibraryConfig.Description c
 // use cpu (or mac possibly with metal)
 if (!configuration.AllowFallback && platform != OSPlatform.OSX)
 {
-result.Add(GetAvxLibraryPath(configuration.AvxLevel, prefix, suffix));
+result.Add(GetAvxLibraryPath(configuration.AvxLevel, prefix, suffix, libraryNamePrefix));
 }
 else if (platform != OSPlatform.OSX) // in macos there's absolutely no avx
 {
 if (configuration.AvxLevel >= NativeLibraryConfig.AvxLevel.Avx512)
-result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.Avx512, prefix, suffix));
+result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.Avx512, prefix, suffix, libraryNamePrefix));

 if (configuration.AvxLevel >= NativeLibraryConfig.AvxLevel.Avx2)
-result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.Avx2, prefix, suffix));
+result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.Avx2, prefix, suffix, libraryNamePrefix));

 if (configuration.AvxLevel >= NativeLibraryConfig.AvxLevel.Avx)
-result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.Avx, prefix, suffix));
+result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.Avx, prefix, suffix, libraryNamePrefix));

-result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.None, prefix, suffix));
+result.Add(GetAvxLibraryPath(NativeLibraryConfig.AvxLevel.None, prefix, suffix, libraryNamePrefix));
 }

 if (platform == OSPlatform.OSX)
 {
-result.Add($"{prefix}{libraryName}{suffix}");
+result.Add($"{prefix}{libraryNamePrefix}{libraryName}{suffix}");
 }

 return result;
@@ -329,7 +332,7 @@ string TryFindPath(string filename)
 #endif
 }

-internal const string libraryName = "libllama";
+internal const string libraryName = "llama";
 private const string cudaVersionFile = "version.json";
 private const string loggingPrefix = "[LLamaSharp Native]";
 private static bool enableLogging = false;
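The search order that `GetLibraryTryOrder` builds can be reproduced for one concrete case. Below is a sketch of the candidate paths for a Linux x64 process with AVX2 detected and fallback enabled; the variable names mirror the C# (`prefix`, `suffix`, `libraryNamePrefix`) but the script itself is illustrative, not part of the PR:

```shell
# Reconstruct the probe list for Linux x64, AVX level = AVX2, fallback allowed.
prefix="runtimes/linux-x64/native/"
suffix=".so"
lib_prefix="lib"   # "" on Windows (llama.dll), "lib" on Linux/macOS
name="llama"

for avx in avx2 avx ""; do          # AVX512 skipped: detected level is only AVX2
  dir="${avx:+$avx/}"               # empty AVX level means no subfolder
  echo "${prefix}${dir}${lib_prefix}${name}${suffix}"
done
```

This prints `runtimes/linux-x64/native/avx2/libllama.so`, then the `avx/` and no-AVX fallbacks, matching how each candidate is tried in turn until one loads.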
8 changes: 4 additions & 4 deletions LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
@@ -18,10 +18,10 @@
 <files>
 <file src="LLamaSharpBackend.props" target="build/netstandard2.0/LLamaSharp.Backend.Cpu.props" />

-<file src="runtimes/deps/libllama.dll" target="runtimes\win-x64\native\libllama.dll" />
-<file src="runtimes/deps/avx/libllama.dll" target="runtimes\win-x64\native\avx\libllama.dll" />
-<file src="runtimes/deps/avx2/libllama.dll" target="runtimes\win-x64\native\avx2\libllama.dll" />
-<file src="runtimes/deps/avx512/libllama.dll" target="runtimes\win-x64\native\avx512\libllama.dll" />
+<file src="runtimes/deps/llama.dll" target="runtimes\win-x64\native\llama.dll" />
+<file src="runtimes/deps/avx/llama.dll" target="runtimes\win-x64\native\avx\llama.dll" />
+<file src="runtimes/deps/avx2/llama.dll" target="runtimes\win-x64\native\avx2\llama.dll" />
+<file src="runtimes/deps/avx512/llama.dll" target="runtimes\win-x64\native\avx512\llama.dll" />

 <file src="runtimes/deps/libllama.so" target="runtimes\linux-x64\native\libllama.so" />
 <file src="runtimes/deps/avx/libllama.so" target="runtimes\linux-x64\native\avx\libllama.so" />
2 changes: 1 addition & 1 deletion LLama/runtimes/build/LLamaSharp.Backend.Cuda11.nuspec
@@ -18,7 +18,7 @@
 <files>
 <file src="LLamaSharpBackend.props" target="build/netstandard2.0/LLamaSharp.Backend.Cuda11.props" />

-<file src="runtimes/deps/cu11.7.1/libllama.dll" target="runtimes\win-x64\native\cuda11\libllama.dll" />
+<file src="runtimes/deps/cu11.7.1/llama.dll" target="runtimes\win-x64\native\cuda11\llama.dll" />
 <file src="runtimes/deps/cu11.7.1/libllama.so" target="runtimes\linux-x64\native\cuda11\libllama.so" />

 <file src="icon512.png" target="icon512.png" />
2 changes: 1 addition & 1 deletion LLama/runtimes/build/LLamaSharp.Backend.Cuda12.nuspec
@@ -18,7 +18,7 @@
 <files>
 <file src="LLamaSharpBackend.props" target="build/netstandard2.0/LLamaSharp.Backend.Cuda12.props" />

-<file src="runtimes/deps/cu12.1.0/libllama.dll" target="runtimes\win-x64\native\cuda12\libllama.dll" />
+<file src="runtimes/deps/cu12.1.0/llama.dll" target="runtimes\win-x64\native\cuda12\llama.dll" />
 <file src="runtimes/deps/cu12.1.0/libllama.so" target="runtimes\linux-x64\native\cuda12\libllama.so" />

 <file src="icon512.png" target="icon512.png" />
File renamed without changes.
File renamed without changes.
2 changes: 1 addition & 1 deletion docs/ContributingGuide.md
@@ -16,7 +16,7 @@ When building from source, please add `-DBUILD_SHARED_LIBS=ON` to the cmake inst
 cmake .. -DLLAMA_CUBLAS=ON -DBUILD_SHARED_LIBS=ON
 ```

-After running `cmake --build . --config Release`, you could find the `llama.dll`, `llama.so` or `llama.dylib` in your build directory. After pasting it to `LLamaSharp/LLama/runtimes` and renaming it to `libllama.dll`, `libllama.so` or `libllama.dylib`, you can use it as the native library in LLamaSharp.
+After running `cmake --build . --config Release`, you could find the `llama.dll`, `llama.so` or `llama.dylib` in your build directory. After pasting it to `LLamaSharp/LLama/runtimes`, you can use it as the native library in LLamaSharp.


 ## Add a new feature to LLamaSharp
2 changes: 1 addition & 1 deletion docs/index.md
@@ -20,7 +20,7 @@ LLamaSharp is the C#/.NET binding of [llama.cpp](https://github.com/ggerganov/ll
 If you are new to LLM, here're some tips for you to help you to get start with `LLamaSharp`. If you are experienced in this field, we'd still recommend you to take a few minutes to read it because some things perform differently compared to cpp/python.

 1. The main ability of LLamaSharp is to provide an efficient way to run inference of LLM (Large Language Model) locally (and fine-tune model in the future). The model weights, however, need to be downloaded from other resources such as [huggingface](https://huggingface.co).
-2. Since LLamaSharp supports multiple platforms, The nuget package is split into `LLamaSharp` and `LLama.Backend`. After installing `LLamaSharp`, please install one of `LLama.Backend.Cpu`, `LLama.Backend.Cuda11` or `LLama.Backend.Cuda12`. If you use the source code, dynamic libraries can be found in `LLama/Runtimes`. Rename the one you want to use to `libllama.dll`.
+2. Since LLamaSharp supports multiple platforms, The nuget package is split into `LLamaSharp` and `LLama.Backend`. After installing `LLamaSharp`, please install one of `LLama.Backend.Cpu`, `LLama.Backend.Cuda11` or `LLama.Backend.Cuda12`. If you use the source code, dynamic libraries can be found in `LLama/Runtimes`.
 3. `LLaMa` originally refers to the weights released by Meta (Facebook Research). After that, many models are fine-tuned based on it, such as `Vicuna`, `GPT4All`, and `Pyglion`. Though all of these models are supported by LLamaSharp, some steps are necessary with different file formats. There're mainly three kinds of files, which are `.pth`, `.bin (ggml)`, `.bin (quantized)`. If you have the `.bin (quantized)` file, it could be used directly by LLamaSharp. If you have the `.bin (ggml)` file, you could use it directly but get higher inference speed after the quantization. If you have the `.pth` file, you need to follow [the instructions in llama.cpp](https://github.com/ggerganov/llama.cpp#prepare-data--run) to convert it to `.bin (ggml)` file at first.
 4. LLamaSharp supports GPU acceleration, but it requires cuda installation. Please install cuda 11 or cuda 12 on your system before using LLamaSharp to enable GPU. If you have another cuda version, you could compile llama.cpp from source to get the dll. For building from source, please refer to [issue #5](https://github.com/SciSharp/LLamaSharp/issues/5).