Caching to speed up prompt generation #1170
Comments
This is a very interesting approach, but are you aware that the whole codebase is being rewritten for 0.7.0? Your functions are based on the master (current) branch, but we are in the process of migrating to the next (0.7.0) branch. |
Nope. Didn't know that. I'm open to suggestions for how to proceed. Doing nothing is fine with me, too. |
In case anyone is also bothered by how slow the vcs/git prompt is, I've managed to speed it up by a large margin. On Linux my whole prompt (which includes The trick was to write a custom C binary that prints everything we need to know about a git repo. It's much faster than calling If anyone wants to give it a try with their own master-based setup (especially if it's sluggish), here's a non-committal way to do it.
# Enable caching of parts of the prompt to make rendering much faster.
POWERLEVEL9K_USE_CACHE=true
# Enable alternative implementation for the vcs prompt. It's much faster but it only supports git.
# Tell it to not scan for dirty files in repos with over 4k files.
POWERLEVEL9K_VCS_STATUS_COMMAND="/tmp/gitstatus --dirty-max-index-size=4096"
wget https://github.com/romkatv/gitstatus/releases/latest/download/gitstatus -P /tmp
chmod +x /tmp/gitstatus
# Adjust this path depending on where you normally source powerlevel9k.zsh-theme from.
POWERLEVEL9K_INSTALLATION_DIR=~/.oh-my-zsh/custom/themes/powerlevel9k
source <(curl -f https://raw.githubusercontent.com/romkatv/powerlevel9k/caching/powerlevel9k.zsh-theme)
Then see if your current shell feels more responsive. No permanent changes are made to your setup, so once you exit zsh you are back to business as usual. I can vouch for the following prompts being fast in the patched version:
|
Hi @romkatv ! @bhilburn Would we add a binary to P9K to replace Edit: By upstreaming to git I meant the concept obviously. Having a single command that outputs machine-readable information as fast as possible. Of course there is |
This is implemented in a safe way. The data we pull from the cache is never stale. Every entry in the cache is immutable and it never becomes invalid. Here's how it works. Every time we need to draw a vcs prompt, we call
(The meaning of each line is documented at powerlevel9k.zsh-theme). From this we can assemble the vcs prompt. It doesn't depend on any other data such as $PWD, time of the day or anything else (if it did depend on any of the above, we would add them to the cache key). Once we assemble the prompt, we can save it for the future. The key in the cache is the ten lines that So we have two things: fast way to get the git state, and then a fast way to format that data as a prompt. The first is fast because C, the second is fast because caching. I use the same technique for printing other prompts, such as Likewise for
Not really. This is a larger commitment than I can afford. FWIW, I did try to make it more-or-less generic in p9k. It's possible to ditch By default, Users could override It's also possible to use |
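The caching scheme described in this comment (entries keyed by the raw status output, never invalidated because the same status always formats to the same prompt) can be sketched roughly as follows. This is an illustration under assumptions, not the code from the caching branch: `vcs_cache`, `format_vcs_prompt` and `cached_vcs_prompt` are hypothetical names, the formatter is a trivial stand-in, and associative arrays require zsh or bash 4+.

```shell
# Sketch only: all names below are hypothetical, not Powerlevel9k functions.
typeset -A vcs_cache     # raw status text -> formatted prompt
format_calls=0           # counts how often the slow path runs

# Stand-in for the expensive formatting step. It takes the raw status text
# in $1 and leaves the formatted prompt in REPLY.
format_vcs_prompt() {
  format_calls=$((format_calls + 1))
  REPLY="%F{green}[$1]%f"
}

# Return (in REPLY) the formatted prompt for the status text in $1,
# formatting only on a cache miss. The result goes through REPLY rather
# than stdout so that the cache update is not lost in a subshell.
cached_vcs_prompt() {
  if [ -n "${vcs_cache[$1]+set}" ]; then
    REPLY=${vcs_cache[$1]}
    return 0
  fi
  format_vcs_prompt "$1"
  vcs_cache[$1]=$REPLY
}
```

Calling `cached_vcs_prompt "master clean"` twice runs the formatter once; the second call is a plain lookup. Because the key is the complete input to the formatter, an entry can never be stale, only unused.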
I'm concerned that your I have thought of doing something similar using golang (which is great for cross compiling) but my git-internals-fu isn't that strong. Caching the work that ZSH is doing to build the actual prompt (symbols, colors, etc.) shouldn't be taking so long that caching it speeds it up by 10x. If it is, then we should fix the ZSH code. |
Just my two cents, I think it's a great idea and it would synergize really well with making the prompt asynchronous (insert fitting issue here). Show the old prompt from the cache and update to the current values if ready. With a synchronous prompt it has its drawback, I agree. |
That's a valid concern. In theory it can be as cross-platform as libgit2, which officially supports all major platforms. Getting there isn't difficult but it's work nonetheless. Ideally, I'd love to see p9k having a configurable provider of vcs status information so that users who care about the responsiveness of their shell (like myself) could plug in their (potentially platform-specific) implementation without having to rewrite prompt formatting code in p9k. Users who don't care could use the most portable default implementation (like
Indeed. I was surprised when my improvised profiler had pointed to the formatting code as the primary CPU hog in p9k. Having replaced a large chunk of p9k with more efficient code, and having added caching, I've confirmed that indeed formatting is a major cause of prompt latency. Consider the following trivial prompt function:
prompt_greet() {
  if [[ -n $POWERLEVEL9K_GREETING ]]; then
    $1_prompt_segment $0 $2 green black $POWERLEVEL9K_GREETING LINUX_ICON
  fi
}
If When I add this prompt to
(I've replaced the Linux icon with @ to make it readable.) We can see the "hi" message surrounded by some fluff. In order to produce this string, p9k executed 120 lines of code. That's a lot of code for such a simple segment. The same 120 lines will execute on every prompt. If we run the same experiment with the patched version of p9k from https://github.com/romkatv/powerlevel9k/tree/caching, we get 58 lines on the first drawing of the prompt--half of the original--and just 11 lines on subsequent prompts. If you look carefully at the last trace, you'll notice that it says "bye" rather than "hi". And yet, it has hit the cache. This is because p9k has cached the formatting fluff but the dynamic content isn't in the cache. Function Here are the 11 lines, annotated:
# Should we join this segment with the previous? This is a function call.
_p9k_should_join_right 4
[[ 2 -ge 4 ]]
local join=false
# The cache key incorporates everything that affects the formatting of the segment.
# Note the lack of "hi" or "bye" in it.
local cache_key='right_prompt_segment prompt_greet green black LINUX_ICON 008 false'
# Check if we have the formatting fluff in the cache. This is a function call.
_p9k_cache_get 'right_prompt_segment prompt_greet green black LINUX_ICON 008 false'
_P9K_RETVAL=\''%F{002}<%K{002} %F{000}'\'' '\''002'\'' '\''%F{000}@ '\'
[[ \''%F{002}<%K{002} %F{000}'\'' '\''002'\'' '\''%F{000}@ '\' != __p9k_empty__ ]]
# Cache hit! We've retrieved the formatting fluff from the cache.
local tuple=( '%F{002}<%K{002} %F{000}' 002 '%F{000}@ ' )
# Splice the dynamic content into the formatting string and print the prompt.
echo -n '%F{002}<%K{002} %F{000}bye %F{000}@ '
# Update the environment so that the next segment knows how to print itself.
CURRENT_BG=002
LAST_SEGMENT_INDEX=4
The difference in performance is immediately apparent when you try it. Even if your prompt latency feels OK now, you'll still notice the improvement. I'm happy to keep using my own patched version with a custom C binary. Anyone else is welcome to try it. If you want to use any ideas or code in p9k, go ahead and copy them. |
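The idea in the annotated trace above, caching only the formatting around a segment and splicing the dynamic content in on every draw, might look roughly like this. It is a simplified sketch, not the implementation in the caching branch: `seg_prefix`, `seg_suffix`, `build_template` and `draw_segment` are hypothetical names, the key uses only three of the fields from the real cache key, and it assumes zsh or bash 4+ for associative arrays.

```shell
# Sketch only: all names below are hypothetical, not Powerlevel9k functions.
typeset -A seg_prefix seg_suffix   # formatting key -> text before/after content
template_builds=0                  # counts how often the slow path runs

# Slow path: compute the escape sequences surrounding the content.
# $1 = background colour code, $2 = foreground colour code, $3 = icon.
build_template() {
  template_builds=$((template_builds + 1))
  tmpl_prefix="%F{$1}<%K{$1} %F{$2}"
  tmpl_suffix=" %F{$2}$3 "
}

# $1 = background, $2 = foreground, $3 = icon, $4 = dynamic content.
# Leaves the finished segment in REPLY.
draw_segment() {
  key="$1 $2 $3"   # the content ($4) is deliberately not part of the key
  if [ -z "${seg_prefix[$key]+set}" ]; then
    build_template "$1" "$2" "$3"
    seg_prefix[$key]=$tmpl_prefix
    seg_suffix[$key]=$tmpl_suffix
  fi
  # Splice the dynamic content between the cached pieces.
  REPLY="${seg_prefix[$key]}$4${seg_suffix[$key]}"
}
```

With this sketch, `draw_segment 002 000 @ hi` followed by `draw_segment 002 000 @ bye` builds the template once, and the second call still produces a segment containing "bye". That mirrors why the trace above hits the cache even though the greeting changed.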
Thanks, Syphdias. I just want to clarify that the cache I implemented never gives you wrong results. It's a pure optimization that never affects what actually gets printed. The only impact it has is that printing happens much faster, and memory usage is a bit higher. [1] Even though the git prompt I now have is an order of magnitude faster than the stock one, it produces the same results. [2] [1] This isn't exactly true. With the official p9k you can set [2] This is also not exactly true. My |
A quick look at the code doesn't show much that should be slow. It is all tests ( Another question: Are you using ZSH compiling? Basically, those lines of code should be very fast and if they aren't we should delve into what's being slow and fix it before we start optimizing via caching. The async stuff is fixing specific known problems and will speed up the VCS segment. |
No, I'm not. Didn't know such a thing existed. Sounds exciting! What's the right way to enable ZSH compilation for p9k? I'm using oh-my-zsh if that's relevant.
FWIW, here's trace comparison for |
The magic words to search for are |
Oh, I'm not arguing that |
Mysterious at the very least. Google search for "zcompinit zcompile" (without quotes) shows a total of 2 results, neither of which are enlightening. I even thought something is wrong with Google and tried Bing. A lone result. Just one. Are there some instructions I can follow to enable zsh compilation for p9k prompt? Is it common knowledge that without such compilation p9k is slow but with compilation it's fast? |
I am still busy with finishing up
That is what I am currently preparing (again.. 🙄 ). The PR to come adds async functionality to segments and has a cache as well (but not as sophisticated as seen here)..
This could be done by using a rust implementation of
I think it is not very common knowledge.. That would make a great add on to our documentation. |
I've optimized my C binary that talks to git and see a nice additional drop in latency. The code is a bit messy (didn't have time to clean it up) but it works. My prompt renders in 200ms in Linux kernel git repo with all vcs bells and whistles turned on, including dirty files scanning. In nerd-fonts (a fairly large repo with 4k files) my prompt renders in 70ms. This is again with dirty files scanning and all. In a small repo prompt renders in 30ms. I still haven't tried ZSH compiling. Will give it a shot when I'll have free time. |
@romkatv Just for you, I whipped something together using I've done C cross compiling before and it is a real pain to do. What do you think about using golang instead? I'm willing to whip up the build infrastructure. Here is the pure golang git library. https://github.com/docwhat/dotfiles/blob/master/local/libexec/tools-update/zsh |
@docwhat Did you mean to link to your code/doc? Where can I find out what you've whipped together? I searched for By the way, is it actually true that it's common knowledge that one must use zcompile to optimize prompt rendering time? @dritter says yes, but his dotfiles don't mention zcompile. I'm also surprised that Google doesn't return any blog posts, StackOverflow questions or installation guides that say how to use zcompile to speed up prompt. None of the popular zsh frameworks have compilation by default or as an option. Given how easy it is to find complaints about slow prompts in zsh (dozens just on https://github.com/bhilburn/powerlevel9k/), this lack of awareness of the apparent solution seems surprising.
We live in different worlds. For me a suggestion to port C code to Go probably sounds as bizarre and baffling as the reverse would sound to you. |
Many of you have dotfiles in public repositories on github. Can anyone point me to a config you use where enabling or disabling zcompile affects prompt latency? Not just loading time but latency of rendering the prompt when you press Enter. After some time researching this subject I'm inclined to believe this isn't possible. I'd be glad to be proven wrong as I'll take anything that will make my shell speedier. |
I added the link above: https://github.com/docwhat/dotfiles/blob/master/local/libexec/tools-update/zsh Sent with GitHawk |
@docwhat You mentioned that First of all, we need to define what we are measuring. What I'm after is the time it takes for prompt to appear after you press I measured the prompt timing with Powerlevel9k with the default setup (no
for f in ~/.oh-my-zsh/custom/themes/powerlevel9k/{powerlevel9k.zsh-theme,functions/*}; do
  zcompile $f && echo ok $f || echo error $f;
done
I've verified that a Am I doing something wrong? Can anyone confirm that it's possible to speed up prompt by using According to the documentation (and common sense), FYI, I've renamed my fork to Powerlevel10k because it seems unlikely that my changes are going to be upstreamed. I made it clear that Powerlevel10k is a fork of Powerlevel9k but I wanted to use a separate name to be able to refer to my code easily. I wrote decent documentation and included benchmarks comparing Powerlevel10k to Powerlevel9k, both
It bears repeating that Powerlevel10k achieves this performance without sacrificing functionality. The prompts it displays are always the same as in powerlevel9k/master. |
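The measurement method used in the comment above is not fully shown, so the following is only one rough way to compare render cost before and after a change such as zcompile. It is a sketch under assumptions: `time_prompt` is a hypothetical helper, the function being timed stands in for whatever builds the prompt in your setup, and `date +%s%N` relies on GNU date (the `%N` format is not available in BSD/macOS date).

```shell
# Sketch only: time_prompt is a hypothetical helper, not part of any theme.
# Usage: time_prompt CMD [N]
# Runs CMD N times (default 100) and prints the mean wall-clock time per
# run in microseconds. Assumes GNU date for nanosecond timestamps.
time_prompt() {
  runs=${2:-100}
  start=$(date +%s%N)
  i=0
  while [ "$i" -lt "$runs" ]; do
    "$1" >/dev/null     # discard the rendered output; only the time matters
    i=$((i + 1))
  done
  end=$(date +%s%N)
  echo $(( (end - start) / runs / 1000 ))
}
```

Running the same helper against the same prompt-building function with and without compiled files gives two numbers that can be compared directly. Averaging over many runs matters because a single render is short enough for timer and process noise to dominate.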
Btw. I wanted to clarify that we are not against integrating caching or |
I'll be happy to collaborate on integrating gitstatus with Powerlevel9k. Take a look at how I did it with robobenklein/p10k in robobenklein/zinc#2. I think this approach could work well for Powerlevel9k, too. I modified the existing vcs segment to do the following:
By default This provides a nice fallback for systems where gitstatus doesn't work or isn't running for some other reason. It also allows us to support all backends that |
By the way, for those of you who don't follow /r/zsh, Powerlevel10k now works on all major platforms (Mac, Linux, FreeBSD and WSL). It's backward-compatible with Powerlevel9k configuration and it's 50 times faster. If you are currently using Powerlevel9k, type this and see how fast your shell will become while still looking the same:
git clone https://github.com/romkatv/powerlevel10k.git /tmp/powerlevel10k
source /tmp/powerlevel10k/powerlevel10k.zsh-theme
(When you are done playing, You can find other options for trying out the theme in the docs. |
Hey all - Sorry I'm so late jumping in, here. I've been following this conversation by e-mail (I get an e-mail for every single comment that gets posted for p9k, and I read every single one of them, hah), and in Gitter. @romkatv - Clearly, your experiment in compiled code & caching has resulted in some pretty serious gains. That's fantastic! We've been chatting about this in Gitter, and we are quite serious about getting your work upstreamed. There are a few key things that we want to be sure of in the process:
If possible, we will also try to upstream any improvements we make to @romkatv - Can you work with the development team in #1185 to get your stuff upstreamed? I think the whole community will benefit from the work you've done. I just tried merging your branch into our Separately, you're obviously welcome to do whatever you like in terms of the code and P10k, but I would encourage you to work closely with us to get your work upstreamed rather than splitting the community. Looking forward to seeing the progress in #1185. I'm not sure if you've looked at |
I'm guessing you haven't read https://github.com/romkatv/powerlevel10k and haven't tried the theme. It's identical to Powerlevel9k in terms of installation and configuration. If you replace the content of your
This would be great.
I'm trying to, but it's pretty discouraging that my comments get ignored both here and there. I see what appears to be unsound technical judgement, attempt to engage in conversation and never get a reply. Misleading statements that waste my time don't get followed up with any sort of an apology or acknowledgement of mistake (zsh compilation reduces prompt latency in Powerlevel9k? really?). It's unclear whether anyone from the dev has even looked at the docs or code, or tried the theme. This isn't the first open source project I'm contributing to, so I'm well aware of the differences in project cultures. So far my experience with powerlevel9k has been underwhelming. Hopefully it'll improve. For an example of a good contributor experience see my comment in libgit2 and the replies below. That's the attitude that makes people want to contribute to your project.
I'm pulling changes from Powerlevel9k. When you attempt to merge, there are no conflicts because I've already resolved them. Conflicts in
Indeed. I deleted everything I don't use to avoid misleading users and to have fewer upstream changes to merge.
I've changed almost every line of code that gets executed when prompt is rendered with my settings (I haven't touched the prompts I don't use though). Have you seen the benchmarks? Powerlevel10k renders prompt in my root directory in 1ms, compared to 101ms for powerlevel9k/master and 26ms for powerlevel9k/next. My root directory isn't a git repo, so the difference isn't due to gitstatus.
What do you mean by splitting the community? Powerlevel10k renders the same prompt with the same configuration options as Powerlevel9k, the only difference being performance. It's essentially the same theme with two implementations. Users can freely switch back and forth between the two implementations and exchange configuration profiles without worrying about which implementation others use.
I've only used it when benchmarking. What are the biggest improvements that you are looking forward to bringing to your users? |
Hi @romkatv -
I wasn't providing those items to accuse you of having done something wrong, or saying that you hadn't done them. I was merely telling you what we care about, and the kinds of things we consider during any major feature upstream. It sounds like these likely aren't a concern for your work, which is great!
I haven't looked for a response to every one of your comments, but at a cursory glance, it seems like you are getting quick responses to everything you post?
I'll address the second one first - my guess is that any misleading statements were made either due to misunderstanding or by mistake. I haven't seen such statements, but if you disagree or feel that you have been insulted in any way, then I encourage you to specifically point me to the relevant posts. If you aren't comfortable doing it publicly, then you're welcome to e-mail me privately. To address your first comment, accusations of unsound technical judgement are neither constructive nor conducive to good collaboration. If you disagree with a technical decision that has been made regarding your work, then please raise it directly with me. You can do that here, in Gitter, or by e-mail. I've not seen any evidence of the things you are accusing the P9k community about. As I said above, if you feel strongly that your involvement with the development community has been mishandled in some way, please point me to specific examples. Otherwise, please step down from your rather dramatic accusations. You've received significant time investment from the P9k devs both in this issue as well as in #1185, and we are eager to work with you.
Great! Thanks for maintaining upstream compatibility. That will make things much easier, obviously.
Makes sense.
Yes, I've seen the benchmarks - hence my comments about how impressive the results are.
A FOSS community doesn't hinge solely around workflow. If
There are significant performance improvements in most parts of the codebase, it's been completely re-organized to make it much simpler to write new segments, much more of the prompt is "configured" at initialization so that there is less runtime work, and lot of the segments have new features based on all of the above. We're pretty excited about it. |
I watched this topic from afar (most of the time) since I didn't dive deeper into the details and didn't (and probably still don't) fully understand how your code works. I'd say I am not a programmer but more of a scripter if that makes any sense. A couple of things I noticed:
You are very quick, whether it is writing code, researching something or responding. I think, this might be the reason why you felt, that conversation was going slowly. This makes your decision to fork very understandable, imho. Furthermore, I think you are referring to three statements from this thread
With my little knowledge, I think you are probably right about 1 (but you already know that) but you expected some kind of acknowledgement, which I can totally understand, I like to be right as well. 😄 I would love to get the performance boost for p9k and still have people who expand the feature of p9k. I don't think you can (or rather want to) maintain powerlevel10k with pulling from upstream every now and then. Especially the merge from current TL;DR: I hope the bad start you had with this community didn't turn you off too much. 😄 |
Thanks for adding your thoughts, @Syphdias. @romkatv - If what @Syphdias wrote accurately reflects your experience, then I do want to be clear that a slow response time is not indicative of disinterest. Realistically, P9k has hundreds of thousands of users, and we have (at best) a handful of core developers. We are inundated with issues and PRs on a daily basis, are trying to keep development moving, and this is just a hobby for us - we all have day jobs. In short, this is the all-too-typical story of a successful F/OSS project. As @Syphdias said, we would very clearly welcome your involvement and participation :) |
@Syphdias You are very perceptive. I appreciate your comment. |
I opened this issue to initiate a discussion. Since it has apparently concluded, I'm closing it. |
So... your changes are going to stay in Powerline10K for now? |
Changes that I've proposed on this issue no longer exist in their original form. Powerlevel10k has many changes compared to powerlevel9k: some of them can be thought of as an evolution of what'd been discussed here, most are unrelated. @dritter has implemented in powerlevel9k some ideas from powerlevel10k (e.g., gitstatusd integration) and cherry-picked some commits (mostly bug fixes; powerlevel10k has bug fixes for all bugs from powerlevel9k issue tracker, with some caveats). I don't intend to send PRs to powerlevel9k, which I've made clear several times in the past to avoid false expectations. I cannot speak for powerlevel9k devs about their plans. |
tl;dr: I patched P9K to reduce my zsh prompt drawing latency by over 10x. Is this something you would be interested in adopting?
I recently started using Windows Subsystem for Linux and discovered that my standard Linux zsh environment was so slow, it was barely usable. It was taking over a second to draw a prompt after each command. Granted, it was never snappy even on Linux but it was at least manageable.
I've spent some time profiling and optimizing the code and the results look promising. On each prompt P9K performs a lot of computation and over 90% of it is the same as in the last prompt. I implemented simple caching to avoid recomputing things and now my prompts get rendered in less than 100ms on WSL -- over 10x reduction in latency. They are now also blazingly fast on Linux.
I'm not an expert in zsh, so I didn't want to open a PR right away. I also didn't write any tests. If there is interest in this type of improvements, the code is here: https://github.com/romkatv/powerlevel9k/tree/caching. The code adds a general caching mechanism for anything that prompt generators might want to cache between calls, and then applies it to left_prompt_segment and right_prompt_segment for nice latency reduction across the board. It also contains optimizations for all prompt generators that I personally use and that were showing up on my profile as being slow. Let me know what you want me to do. Open a PR, sign something so that you can pull the code, go away, something else?
P.S. My .zshrc is here: https://github.com/romkatv/dotfiles-public/blob/master/.zshrc.