-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce RenderClusterIterator #3458
Introduce RenderClusterIterator #3458
Conversation
src/renderer/vt/paint.cpp
Outdated
short totalWidth = 0; | ||
for (const auto& cluster : clusters) | ||
unclusteredString.reserve(40); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of magic number. But without reserving space, this will run much slower. Should we pass size alongside clusterIter
? I'm not sure about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh... perhaps preallocate the number of cells width from coord
to the right edge of the known dirty rect (see impl of GetDirtyRectInChars
)
All of the engines should know a little bit about what area is actually invalid and they're always given a drawing coordinate.
Failing this, instead of allocating and reserving on each call to _Paint*BufferLine
, they might be able to allocate the buffer when the StartPaint()
comes in as a dirty-rect-wide string/vector and just re-use that one for every row and free it when the EndPaint()
comes around. That'd make way fewer allocations per frame than one per call.
src/renderer/dx/CustomTextLayout.cpp
Outdated
@@ -42,11 +42,14 @@ CustomTextLayout::CustomTextLayout(gsl::not_null<IDWriteFactory1*> const factory | |||
_localeName.resize(gsl::narrow_cast<size_t>(format->GetLocaleNameLength()) + 1); // +1 for null | |||
THROW_IF_FAILED(format->GetLocaleName(_localeName.data(), gsl::narrow<UINT32>(_localeName.size()))); | |||
|
|||
for (const auto& cluster : clusters) | |||
RenderClusterIterator it = clusterIter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because we are using RenderClusterIterator const
, the copy is needed. It could be RenderClusterIterator&
and then it can be modifed by the caller. But that's just kind of implicit to me. I prefer the copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, I like what you did. It's constant for the specification of the custom layout, but then the internals of the layout makes the copy to walk through it.
src/renderer/base/renderer.cpp
Outdated
// And hold the point where we should start drawing. | ||
auto screenPoint = target; | ||
|
||
// This outer loop will continue until we reach the end of the text we are trying to draw. | ||
while (it) | ||
{ | ||
TextBufferCellIterator runStartIt(it); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is inner loop is over clusterIter
will be exhausted. So I copy it and keep the start of the run here.
src/renderer/base/renderer.cpp
Outdated
// Advance the cluster and column counts. | ||
const auto columnCount = clusters.back().GetColumns(); | ||
it += columnCount > 0 ? columnCount : 1; // prevent infinite loop for no visible columns | ||
const auto columnCount = (*clusterIter).GetColumns(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calculating the cols
is cumbersome. , I don't know how to do this in a more elegant way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. This might actually justify making the PaintBufferLine
's first arg be a & instead.
Save a copy of it on the way in, let the engine increment it, then add a method that lets you difference the two of them.
Diff the resultant one you get back versus the one you initially created and it should tell you how many columns it walked. Then you don't have to cycle through the loop multiple times.
That's what I did for the cell/text iterators with .GetCellDistance
and whatnot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. I don't think it's enough to justify this. Making it &
sounds very error-prone and implicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make two of them? Take a const input and then have a second & as the output one?
It's not uncommon to use an Inout parameter, but if you think its safer to have an explicit In and Out (given we can't use the return code as it's always HRESULT
for these thing), it seems OK.
src/renderer/gdi/paint.cpp
Outdated
const COORD coord, | ||
const bool trimLeft) noexcept | ||
{ | ||
try | ||
{ | ||
const auto cchLine = clusters.size(); | ||
size_t cchLine = 120; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, without size I can't make use of length to init arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, decent estimate is coord to right edge of dirty rect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An estimate is not enough, though. I need the exact size here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Iterating it two times, could it be that bad? I'm stucked...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly. You could replace things like auto pwsPoly = std::make_unique<wchar_t>(cchLine)
instead with a std::wstring
and .reserve()
it up to the "estimate" of _rcInvalid.Right - coord.X
. Same goes for the rgdxPoly
with a vector of ints or something. (or use vectors for both).
The first phase with the for
loop has to iterate the whole thing anyway to condense the clusters back into a string. Either you'll get lucky in the majority case and the estimate won't resize the vectors, or you'll get unlucky in the uncommon case and you'll resize up once or twice. (And if resizing up happens a lot, we can use an estimate function like the CustomTextLayout::_EstimateGlyphCount
does to reduce the potential resizes.)
By the end of the for
loop (lines 316-325), you should be able to know the exact widths required for all the functions further down PaintBufferLine
and have only iterated once. E.g. you should have calculated cchLine
exactly by iterating once and also filled a vector pwsPoly
and vector rgdxPoly
with the condensed version. That should be enough information to finish the draw operation.
However, now that I look at this further. The loop in this specific GDI renderer only allows a maximum of one wchar_t
per unit because of GDI limitation. (comment on L322 where cluster.GetTextAsSingle()
is called. So the "estimate" used for .reserve()
above should actually be the exact column count as a Cluster
is a console grid unit described by one-or-more-wchar_t
and the vectors wouldn't auto-resize.
The last problem is that the vectors would free when going out of scope. There isn't really a detach
to the vectors. We could just hold a reference to them that is freed inside _FlushBufferLines
after PolyTextOut
is called.
Member variables on the class like:
std::vector<std::vector<wchar_t>> _polyStrings
and std::vector<std::vector<int>> _polyWidths
, push a new vector into those on each PaintBufferLine
, take the .data()
of the line and stuff it into the pPolyTextLine
fields, and then clear them out on _FlushBufferLines
.
That might make _FlushBuferLines
a little cleaner than raw delete[]
s of bare pointers as well.
I don't know what I did wrong but I'm getting the same error as CI build:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly the direction I was hoping you would go with this. Pretty much perfectly what I was envisioning with the comment I left yesterday. Thank you!
src/renderer/vt/paint.cpp
Outdated
short totalWidth = 0; | ||
for (const auto& cluster : clusters) | ||
unclusteredString.reserve(40); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh... perhaps preallocate the number of cells width from coord
to the right edge of the known dirty rect (see impl of GetDirtyRectInChars
)
All of the engines should know a little bit about what area is actually invalid and they're always given a drawing coordinate.
Failing this, instead of allocating and reserving on each call to _Paint*BufferLine
, they might be able to allocate the buffer when the StartPaint()
comes in as a dirty-rect-wide string/vector and just re-use that one for every row and free it when the EndPaint()
comes around. That'd make way fewer allocations per frame than one per call.
src/renderer/inc/Cluster.hpp
Outdated
@@ -23,17 +23,17 @@ namespace Microsoft::Console::Render | |||
public: | |||
Cluster(const std::wstring_view text, const size_t columns); | |||
|
|||
const wchar_t GetTextAsSingle() const noexcept; | |||
wchar_t GetTextAsSingle() const noexcept; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did we lose const
s here?
src/renderer/dx/CustomTextLayout.cpp
Outdated
@@ -42,11 +42,14 @@ CustomTextLayout::CustomTextLayout(gsl::not_null<IDWriteFactory1*> const factory | |||
_localeName.resize(gsl::narrow_cast<size_t>(format->GetLocaleNameLength()) + 1); // +1 for null | |||
THROW_IF_FAILED(format->GetLocaleName(_localeName.data(), gsl::narrow<UINT32>(_localeName.size()))); | |||
|
|||
for (const auto& cluster : clusters) | |||
RenderClusterIterator it = clusterIter; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, I like what you did. It's constant for the specification of the custom layout, but then the internals of the layout makes the copy to walk through it.
src/host/ut_host/VtRendererTests.cpp
Outdated
@@ -1026,7 +1026,7 @@ void VtRendererTest::XtermTestCursor() | |||
clusters.emplace_back(std::wstring_view{ &line[i], 1 }, static_cast<size_t>(rgWidths[i])); | |||
} | |||
|
|||
VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters.data(), clusters.size() }, { 1, 1 }, false)); | |||
// VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters.data(), clusters.size() }, { 1, 1 }, false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBD I presume
src/renderer/dx/DxRenderer.cpp
Outdated
_dwriteFontFace.Get(), | ||
{ &cluster, 1 }, | ||
_glyphCell.cx); | ||
//// Create the text layout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also TBD, I presume
src/renderer/gdi/paint.cpp
Outdated
const COORD coord, | ||
const bool trimLeft) noexcept | ||
{ | ||
try | ||
{ | ||
const auto cchLine = clusters.size(); | ||
size_t cchLine = 120; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, decent estimate is coord to right edge of dirty rect.
src/renderer/base/renderer.cpp
Outdated
// Advance the cluster and column counts. | ||
const auto columnCount = clusters.back().GetColumns(); | ||
it += columnCount > 0 ? columnCount : 1; // prevent infinite loop for no visible columns | ||
const auto columnCount = (*clusterIter).GetColumns(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. This might actually justify making the PaintBufferLine
's first arg be a & instead.
Save a copy of it on the way in, let the engine increment it, then add a method that lets you difference the two of them.
Diff the resultant one you get back versus the one you initially created and it should tell you how many columns it walked. Then you don't have to cycle through the loop multiple times.
That's what I did for the cell/text iterators with .GetCellDistance
and whatnot.
Just overall, did you check a trace on this (can I see)? I presume it's better than the original but maybe not as perfect as the more memory-intense previous iteration. |
d0d432e
to
e9a785f
Compare
src/host/ut_host/VtRendererTests.cpp
Outdated
|
||
VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters1.data(), clusters1.size() }, { 0, 0 }, false)); | ||
VERIFY_SUCCEEDED(engine->PaintBufferLine({ clusters2.data(), clusters2.size() }, { 0, 1 }, false)); | ||
const COORD screenBufferSize{ 119, 9029 }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just way too much ... kinda hurtful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's suppose to be a unit test that focuses on this certain VT thing. Now I dragged all the other units like TextBuffer
. I'm not a fan of this..
static const size_t s_cPolyTextCache = 80; | ||
POLYTEXTW _pPolyText[s_cPolyTextCache]; | ||
static constexpr size_t s_cPolyTextCache = 80; | ||
std::array<POLYTEXTW, s_cPolyTextCache> _polyText; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figured that all those containers have exact same fixed size. Using std::array
is more reasonable
// Exit early if there are no lines to draw. | ||
RETURN_HR_IF(S_OK, 0 == cchLine); | ||
|
||
const size_t preallocateSize = (_rcInvalid.right - _rcInvalid.left) / 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that 8
is the magic number needed here to get the correct size? Any GDI+ reference that explains this?
} | ||
} | ||
|
||
_polyStrings = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This certainly is safer and better than raw delete[]
.
return !(*this == it); | ||
} | ||
|
||
RenderClusterIterator& RenderClusterIterator::operator+=(const ptrdiff_t& movement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently RenderClusterIterator
is not aware of the dbcs
columns. Caller must use clusterIter += cols > 0 ? cols : 1
to skip dbcs trailing char. I've tried to let RenderClusterIterator
handle this itself, but failed to do so because the code gets too complicated and implicit.
Plain I found something interesting. Plain Update: With #3480 merged, |
d9e896e
to
d1f4634
Compare
@@ -18,5 +18,5 @@ | |||
<!-- DONT ADD NEW FILES HERE, ADD THEM TO vt-renderer-common.vcxitems --> | |||
<!-- Careful reordering these. Some default props (contained in these files) are order sensitive. --> | |||
<Import Project="$(SolutionDir)src\common.build.post.props" /> | |||
<!-- <Import Project="$(SolutionDir)src\common.build.tests.props" /> --> | |||
<Import Project="$(SolutionDir)src\common.build.tests.props" /> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why but I need this to make CI and my local VS happy. Wonder why master is still green without it
3dbb0bc
to
dcb9bd6
Compare
Mostly this is ready now. Tests are finally green. |
93df594
to
544c21c
Compare
This needs to hold, until #3546 is solved. I'm seeing even worse result with this PR. |
OK this PR is not the cause of the regression. It's probabaly #3474. But still let's hold this until the regression is solved. |
544c21c
to
fe96309
Compare
Now that #778 is closed, this does not seem necessary. Close this for now, |
Thanks for this anyway, @skyline75489. I do want to get this done at some point in the future and will likely use this as a template whenever that time comes. But you're right, it's not as critical right now anymore. |
Summary of the Pull Request
This is the successor of #3438 . The idea is based on @miniksa 's comment originally posted here #3438 (comment)
References
#3075 #3438
PR Checklist
Detailed Description of the Pull Request / Additional comments
This is still WIP, though. Would love to hear everybody's feedback.
Validation Steps Performed
Just pretty much everything should be validated. Vim, cat, top, cmatrix, and so on.