-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use shelf allocation for font texture atlases #723
base: master
Are you sure you want to change the base?
Use shelf allocation for font texture atlases #723
Conversation
First of all, very cool! I have been wanting to improve the font engine for some time, so it is great to see efforts in this direction. We do already have our texture packer, which I'm sure you're aware of. I wonder if it makes sense to integrate the feature into that existing code, or if we might as well start over like it seems you have done here? By the way, did you write this from scratch? I'd be really interested in some numbers, that's really decisive for an improvement like this. Do you think you could make some realistic benchmarks with before and after. For example, how long does it take to add one new character to a font texture? With a minimal existing number of characters, and with a large number of existing characters. And for a large and small texture / font size. Removing unused glyphs is also something I've been thinking of. That would be a great addition, and especially important for CJK and many other languages. One thing I've also considered is whether we could share texture atlases between font sizes, or even font faces/families. |
This is indeed my own from-scratch implementation. The existing texture packer is quite primitive in that it relayouts all glyphs on each glyph set change. My implementation will be later refined to blend in with the codebase but, as it is so radically different, it will effectively be a full on rewrite of the original packer.
I plan to introduce proper benchmarks when the full solution somewhat takes shape. But as it is quite similar to what Firefox is using, I would expect it to be similarly fast at least when finding out where to place a glyph: within each atlas page, the shelf allocator iterates through only unallocated areas, which are usually quite large as long as the page is not too fragmented. Also, further potential performance improvements could be achieved by only uploading the updated portion of the texture atlas.
If this is a good idea to you then I would be glad to implement it; Firefox and Skia also employ such a strategy. I anticipate it would affect a larger portion of the codebase, however, so it would be of great help if I could receive some assistance in the design of this feature. |
Great, sounds good, I think it's reasonable to start from scratch here. Let's just make sure to also clean up old code (i.e. remove it or refactor) so that we don't have any duplicate code doing similar things. I of course understand this is just a draft for now, just wanted to say that up-front.
What I'm am mostly interested in here is a baseline of the current implementation. I am sure things can be done a lot more performant, but I'm not even sure what the current condition is. I.e., whether or not this is a significant bottleneck at all. I think that should be established first before we make a large effort in this direction.
Great, I'll be glad to help out later on here. I think we should do it in steps though, so essentially start with what you're doing here, with just one font face at a time. Then once that work is done and merged, we can start expanding towards a more global solution. |
8c06684
to
949fef2
Compare
949fef2
to
e1dc4b2
Compare
I have just added a benchmark suite that involves cycling through a number of Chinese characters, the same suite is also applied to the original implementation in my Original implementation
New implementation
It appears that my implementation is slower, but this is because the processing time is absolutely dominated by the operation of copying the whole texture atlas each time it is regenerated. My implementation uses a fixed 1024 × 1024 atlas size, while the original implementation determines a size that is just enough for the current glyphs and takes less time to copy the texture data. Because my implementation maintains the texture data at all times, performance could be dramatically improved by directly referring to that texture data while uploading, and possibly only uploading dirty regions within the atlas. Update: I removed the unnecessary copy and it is indeed much faster now. It beats the original implementation at higher glyph counts thanks to not having to rearrange all glyphs. New benchmark results
|
This pull request aims to resolve the following comment in
FontFaceLayer.cpp
:My approach is to use the shelf packing algorithm, which is employed in Firefox. (For perspective, Skia in Chromium divides the atlas into plots and uses the skyline algorithm for each plot.)
My further plan is to also conserve memory by removing glyphs and fonts that have not been used for a while.
This is currently my first iteration of the solution which brings in the shelf allocator I have implemented previously and so the code style and convention have not matched RmlUi's yet. I would like to receive some thoughts about the design, as well as behavior and performance testing in the meantime.