In the demos I've seen of Leon AI, it appeared rather slow. I have no idea whether this was a hardware limitation or an inefficiency that could be improved upon. GPT4All appears to be quite performant, even on systems without CUDA-compatible GPUs. I have no idea whether it is any faster than the inference engine you're already using.
Which demos are you referring to? If it's the recent video about the new voice, that's because I don't show the tokens being generated for most of the video. But you can see it here.
Also, it is possible to disable the LLM and use the built-in text classification, which is nearly real-time.
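For reference, the toggle lives in the `.env` file and looks roughly like the sketch below. Treat the variable name as a hypothetical illustration; check `.env.sample` in your Leon checkout for the exact key in your version:

```ini
# Hypothetical key name; verify against .env.sample for your Leon version.
# Disabling the LLM makes Leon fall back to the built-in text classifier.
LEON_LLM=false
```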
I based my recommendation on the performance I saw in this video: https://youtu.be/6CInSt6pTVA?si=oIipaG4Rb07EqSet. I know many local LLM inference and training systems rely heavily on Nvidia CUDA GPUs. I mentioned GPT4All because I knew it leverages AVX CPU instructions and Nomic Vulkan to provide efficient LLM inference on both Nvidia and AMD GPUs. I'm not sure whether Leon currently relies on CUDA for performance, but if so, GPT4All may help you support more hardware.
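To illustrate the point, here is a minimal sketch using GPT4All's Python bindings. The model filename is only an example; `device="gpu"` selects the Nomic Vulkan backend on Nvidia and AMD cards, and `device="cpu"` falls back to AVX CPU inference, so neither path requires CUDA:

```python
from gpt4all import GPT4All

# device="gpu" uses the Nomic Vulkan backend (runs on Nvidia and AMD GPUs);
# device="cpu" falls back to AVX/AVX2 CPU inference. Neither path needs CUDA.
# The model filename below is just an example GGUF model that GPT4All can load.
model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf", device="gpu")

with model.chat_session():
    reply = model.generate("Explain why Vulkan helps on non-CUDA GPUs.", max_tokens=128)
    print(reply)
```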