Re: NPU Support
#8697
Replies: 1 comment
See links to related issues in #9181.
A bit over a year ago, discussion #336 by @BrianSemiglia brought up the idea of adding NPU support. At the time, most NPUs delivered around 5 TOPS or less, and many CPUs didn't have an integrated NPU, so the potential gain was very limited. However, the next generation of CPUs announced by AMD and Intel (plus Snapdragon) promises around 50 TOPS, which means strong NPUs will ship in far more consumer hardware. This, combined with the rise of LPDDR5X, makes the NPU an attractive target for running LLMs in mobile or embedded deployments.
There are some major challenges; for example, NPUs don't have a unified API (that I know of). That said, I think it's worth investing time to at least consider it. On a personal note, I'd like to see NPU support on Linux: it would be great to use a low-power embedded Ryzen board as a home server and use its NPU to power an Ollama + Open WebUI stack.