Refactor VT control sequence identification (#7304)
This PR changes the way VT control sequences are identified and
dispatched, to be more efficient and easier to extend. Instead of
parsing the intermediate characters into a vector, and then having to
identify a sequence using both that vector and the final char, we now
use just a single `uint64_t` value as the identifier.

The identifier is constructed by taking the private parameter prefix,
each of the intermediate characters, and then the final character, and
shifting them into a 64-bit integer one byte at a time, in reverse order
(so the first character ends up in the lowest byte). For example, the
`DECTLTC` control has a private parameter prefix of `?`, one intermediate
of `'`, and a final character of `s`. The ASCII values of those
characters are `0x3F`, `0x27`, and `0x73` respectively, and reversing
them gets you `0x73273F`, so that would then be the identifier for the
control.
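
As a rough illustration of the packing described above, here is a minimal sketch. `PackSequence` is a hypothetical helper written for this example, not the `VTID` class added by this PR, although it follows the same idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (an assumption for illustration): packs the characters
// of a sequence into a 64-bit value, with the first character in the lowest
// byte and the final character in the highest used byte.
template<std::size_t Length>
constexpr std::uint64_t PackSequence(const char (&s)[Length])
{
    static_assert(Length - 1 <= sizeof(std::uint64_t), "at most 8 characters fit");
    std::uint64_t value = 0;
    // Walk the string backwards (skipping the null terminator), shifting the
    // accumulated value up one byte each time.
    for (std::size_t i = Length - 1; i-- > 0;)
    {
        value = (value << 8) | static_cast<unsigned char>(s[i]);
    }
    return value;
}

// DECTLTC: prefix '?' (0x3F), intermediate '\'' (0x27), final 's' (0x73).
static_assert(PackSequence("?'s") == 0x73273F, "DECTLTC should pack to 0x73273F");
```

Because the function is `constexpr`, the result can be checked at compile time and used anywhere a constant expression is required.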

They are stored in reverse order because we sometimes need to look at
the first intermediate to determine the operation, and treat the rest of
the sequence as a kind of sub-identifier (the character set designation
sequences are one example of this). In reverse order, this is easily
achieved by masking off the low byte to get the first intermediate, and
then shifting the value right by 8 bits to get a new identifier for the
rest of the sequence.
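
The mask-and-shift step can be sketched as follows. `FirstChar` and `Rest` are hypothetical names chosen for this example. The sample value is the escape sequence `ESC ( 0`, whose intermediate `(` selects the G0 set and whose final `0` names DEC Special Graphics.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (assumptions for illustration only).
// The low byte of the identifier holds the first character of the sequence.
constexpr char FirstChar(const std::uint64_t id)
{
    return static_cast<char>(id & 0xFF);
}

// Shifting right by one byte drops the first character, leaving an
// identifier for the remainder of the sequence.
constexpr std::uint64_t Rest(const std::uint64_t id)
{
    return id >> 8;
}

// ESC ( 0 : intermediate '(' is 0x28 and final '0' is 0x30, so the packed
// identifier is 0x3028.
constexpr std::uint64_t designateG0 = 0x3028;
static_assert(FirstChar(designateG0) == '(', "first intermediate selects the G-set");
static_assert(Rest(designateG0) == 0x30, "the remainder identifies the charset");
```

The remainder (`0x30`, i.e. the single character `0`) is itself a valid identifier, which is why it can be handed on to a charset lookup unchanged.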

With 64 bits we have enough space for eight characters: a private
prefix, six intermediates, and the final char. That is way more than we
should ever need (the _DEC STD 070_ specification recommends supporting
at least three intermediates, but in practice we're unlikely to see more
than two).
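
To make the capacity limit concrete, here is a small sketch loosely modelled on the accumulate-then-finalize approach of the `VTIDBuilder` class in this commit. `BuildId` is a hypothetical free function written for this example, and it assumes the private prefix is accumulated the same way as an intermediate.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <string>

// Hypothetical function (an assumption for illustration): accumulates the
// leading characters (prefix and intermediates) one byte at a time, then
// places the final character in the next free byte.
inline std::uint64_t BuildId(const std::string& leading, const char finalChar)
{
    std::uint64_t accumulator = 0;
    unsigned shift = 0;
    for (const char ch : leading)
    {
        if (shift + CHAR_BIT >= sizeof(accumulator) * CHAR_BIT)
        {
            // No room for this character and the final: discard what has
            // been collected so the result cannot match a known sequence.
            accumulator = 0;
        }
        else
        {
            accumulator |= static_cast<std::uint64_t>(static_cast<unsigned char>(ch)) << shift;
            shift += CHAR_BIT;
        }
    }
    return accumulator | (static_cast<std::uint64_t>(static_cast<unsigned char>(finalChar)) << shift);
}
```

With seven leading characters the final lands in the top byte and all 64 bits are in use. An eighth leading character trips the overflow branch, leaving only the final character in the top byte.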

With this new way of identifying controls, it should now be possible for
every action code to be unique (for the most part). So I've also used
this PR to clean up the action codes a bit, splitting the codes for the
escape sequences from the control sequences, and sorting them into
alphabetical order (which also does a reasonable job of clustering
associated controls).
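
One way to picture unique action codes is an enum whose values are the packed identifiers, so a single `switch` can dispatch on them. This is only a sketch: `MakeId`, `CsiAction`, and `Describe` are hypothetical names for this example, and the real action code enums in the PR may be organised differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: packs a sequence into a 64-bit id, first character in
// the lowest byte.
template<std::size_t Length>
constexpr std::uint64_t MakeId(const char (&s)[Length])
{
    static_assert(Length - 1 <= sizeof(std::uint64_t), "sequence too long");
    std::uint64_t value = 0;
    for (std::size_t i = Length - 1; i-- > 0;)
    {
        value = (value << 8) | static_cast<unsigned char>(s[i]);
    }
    return value;
}

// Illustrative action codes (names are assumptions), listed alphabetically.
enum class CsiAction : std::uint64_t
{
    CUU = MakeId("A"),       // final 'A'
    DECSCUSR = MakeId(" q"), // intermediate SP, final 'q'
    DECTLTC = MakeId("?'s"), // prefix '?', intermediate '\'', final 's'
};

// Each id is distinct, so one switch is enough to identify the control.
inline const char* Describe(const std::uint64_t id)
{
    switch (static_cast<CsiAction>(id))
    {
    case CsiAction::CUU:
        return "CUU";
    case CsiAction::DECSCUSR:
        return "DECSCUSR";
    case CsiAction::DECTLTC:
        return "DECTLTC";
    default:
        return "unknown";
    }
}
```

Since the prefix and intermediates are part of the value, two controls that share a final character (for example `q` with and without a space intermediate) no longer collide.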

## Validation Steps Performed

I think the existing unit tests should be good enough to confirm that
all sequences are still being dispatched correctly. However, I've also
manually tested a number of sequences to make sure they were still
working as expected, in particular those that used intermediates, since
they were the most affected by the dispatch code refactoring.

Since these changes also affected the input state machine, I've done
some manual testing of the conpty keyboard handling (both with and
without the new Win32 input mode enabled) to make sure the keyboard VT
sequences were processed correctly. I've also manually tested the
various VT mouse modes in Vttest to confirm that they were still working
correctly too.

Closes #7276
j4james authored Aug 18, 2020
1 parent 5d082ff commit 7fcff4d
Showing 17 changed files with 721 additions and 901 deletions.
2 changes: 2 additions & 0 deletions .github/actions/spell-check/expect/expect.txt
@@ -548,6 +548,7 @@ DECSCUSR
DECSED
DECSEL
DECSET
+DECSLPP
DECSLRM
DECSMBV
DECSMKR
@@ -2579,6 +2580,7 @@ vstudio
vswhere
vtapp
VTE
+VTID
vtio
vtmode
vtpipeterm
102 changes: 95 additions & 7 deletions src/terminal/adapter/DispatchTypes.hpp
@@ -3,6 +3,94 @@

#pragma once

namespace Microsoft::Console::VirtualTerminal
{
class VTID
{
public:
template<size_t Length>
constexpr VTID(const char (&s)[Length]) :
_value{ _FromString(s) }
{
}

constexpr VTID(const uint64_t value) :
_value{ value }
{
}

constexpr operator uint64_t() const
{
return _value;
}

constexpr char operator[](const size_t offset) const
{
return SubSequence(offset)._value & 0xFF;
}

constexpr VTID SubSequence(const size_t offset) const
{
return _value >> (CHAR_BIT * offset);
}

private:
template<size_t Length>
static constexpr uint64_t _FromString(const char (&s)[Length])
{
static_assert(Length - 1 <= sizeof(_value));
uint64_t value = 0;
for (auto i = Length - 1; i-- > 0;)
{
value = (value << CHAR_BIT) + gsl::at(s, i);
}
return value;
}

uint64_t _value;
};

class VTIDBuilder
{
public:
void Clear() noexcept
{
_idAccumulator = 0;
_idShift = 0;
}

void AddIntermediate(const wchar_t intermediateChar) noexcept
{
if (_idShift + CHAR_BIT >= sizeof(_idAccumulator) * CHAR_BIT)
{
// If there is not enough space in the accumulator to add
// the intermediate and still have room left for the final,
// then we reset the accumulator to zero. This will result
// in an id with all zero intermediates, which shouldn't
// match anything.
_idAccumulator = 0;
}
else
{
// Otherwise we shift the intermediate so as to add it to the
// accumulator in the next available space, and then increment
// the shift by 8 bits in preparation for the next character.
_idAccumulator += (static_cast<uint64_t>(intermediateChar) << _idShift);
_idShift += CHAR_BIT;
}
}

VTID Finalize(const wchar_t finalChar) noexcept
{
return _idAccumulator + (static_cast<uint64_t>(finalChar) << _idShift);
}

private:
uint64_t _idAccumulator = 0;
size_t _idShift = 0;
};
}

namespace Microsoft::Console::VirtualTerminal::DispatchTypes
{
enum class EraseType : unsigned int
@@ -101,16 +189,16 @@ namespace Microsoft::Console::VirtualTerminal::DispatchTypes
W32IM_Win32InputMode = 9001
};

-namespace CharacterSets
+enum CharacterSets : uint64_t
 {
-constexpr auto DecSpecialGraphics = std::make_pair(L'0', L'\0');
-constexpr auto ASCII = std::make_pair(L'B', L'\0');
-}
+DecSpecialGraphics = VTID("0"),
+ASCII = VTID("B")
+};

-enum CodingSystem : wchar_t
+enum CodingSystem : uint64_t
 {
-ISO2022 = L'@',
-UTF8 = L'G'
+ISO2022 = VTID("@"),
+UTF8 = VTID("G")
 };

enum TabClearType : unsigned short
6 changes: 3 additions & 3 deletions src/terminal/adapter/ITermDispatch.hpp
@@ -98,9 +98,9 @@ class Microsoft::Console::VirtualTerminal::ITermDispatch
virtual bool TertiaryDeviceAttributes() = 0; // DA3
virtual bool Vt52DeviceAttributes() = 0; // VT52 Identify

-virtual bool DesignateCodingSystem(const wchar_t codingSystem) = 0; // DOCS
-virtual bool Designate94Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset) = 0; // SCS
-virtual bool Designate96Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset) = 0; // SCS
+virtual bool DesignateCodingSystem(const VTID codingSystem) = 0; // DOCS
+virtual bool Designate94Charset(const size_t gsetNumber, const VTID charset) = 0; // SCS
+virtual bool Designate96Charset(const size_t gsetNumber, const VTID charset) = 0; // SCS
virtual bool LockingShift(const size_t gsetNumber) = 0; // LS0, LS1, LS2, LS3
virtual bool LockingShiftRight(const size_t gsetNumber) = 0; // LS1R, LS2R, LS3R
virtual bool SingleShift(const size_t gsetNumber) = 0; // SS2, SS3
10 changes: 5 additions & 5 deletions src/terminal/adapter/adaptDispatch.cpp
@@ -1670,7 +1670,7 @@ void AdaptDispatch::_InitTabStopsForWidth(const size_t width)
// - codingSystem - The coding system that will be selected.
// Return value:
// True if handled successfully. False otherwise.
-bool AdaptDispatch::DesignateCodingSystem(const wchar_t codingSystem)
+bool AdaptDispatch::DesignateCodingSystem(const VTID codingSystem)
{
// If we haven't previously saved the initial code page, do so now.
// This will be used to restore the code page in response to a reset.
@@ -1712,10 +1712,10 @@ bool AdaptDispatch::DesignateCodingSystem(const wchar_t codingSystem)
// If the specified charset is unsupported, we do nothing (remain on the current one)
//Arguments:
// - gsetNumber - The G-set into which the charset will be selected.
-// - charset - The characters indicating the charset that will be used.
+// - charset - The identifier indicating the charset that will be used.
// Return value:
// True if handled successfully. False otherwise.
-bool AdaptDispatch::Designate94Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset)
+bool AdaptDispatch::Designate94Charset(const size_t gsetNumber, const VTID charset)
{
return _termOutput.Designate94Charset(gsetNumber, charset);
}
@@ -1727,10 +1727,10 @@ bool AdaptDispatch::Designate94Charset(const size_t gsetNumber, const std::pair<
// If the specified charset is unsupported, we do nothing (remain on the current one)
//Arguments:
// - gsetNumber - The G-set into which the charset will be selected.
-// - charset - The characters indicating the charset that will be used.
+// - charset - The identifier indicating the charset that will be used.
// Return value:
// True if handled successfully. False otherwise.
-bool AdaptDispatch::Designate96Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset)
+bool AdaptDispatch::Designate96Charset(const size_t gsetNumber, const VTID charset)
{
return _termOutput.Designate96Charset(gsetNumber, charset);
}
6 changes: 3 additions & 3 deletions src/terminal/adapter/adaptDispatch.hpp
@@ -89,9 +89,9 @@ namespace Microsoft::Console::VirtualTerminal
bool ForwardTab(const size_t numTabs) override; // CHT, HT
bool BackwardsTab(const size_t numTabs) override; // CBT
bool TabClear(const size_t clearType) override; // TBC
-bool DesignateCodingSystem(const wchar_t codingSystem) override; // DOCS
-bool Designate94Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset) override; // SCS
-bool Designate96Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset) override; // SCS
+bool DesignateCodingSystem(const VTID codingSystem) override; // DOCS
+bool Designate94Charset(const size_t gsetNumber, const VTID charset) override; // SCS
+bool Designate96Charset(const size_t gsetNumber, const VTID charset) override; // SCS
bool LockingShift(const size_t gsetNumber) override; // LS0, LS1, LS2, LS3
bool LockingShiftRight(const size_t gsetNumber) override; // LS1R, LS2R, LS3R
bool SingleShift(const size_t gsetNumber) override; // SS2, SS3
6 changes: 3 additions & 3 deletions src/terminal/adapter/termDispatch.hpp
@@ -92,9 +92,9 @@ class Microsoft::Console::VirtualTerminal::TermDispatch : public Microsoft::Cons
bool TertiaryDeviceAttributes() noexcept override { return false; } // DA3
bool Vt52DeviceAttributes() noexcept override { return false; } // VT52 Identify

-bool DesignateCodingSystem(const wchar_t /*codingSystem*/) noexcept override { return false; } // DOCS
-bool Designate94Charset(const size_t /*gsetNumber*/, const std::pair<wchar_t, wchar_t> /*charset*/) noexcept override { return false; } // SCS
-bool Designate96Charset(const size_t /*gsetNumber*/, const std::pair<wchar_t, wchar_t> /*charset*/) noexcept override { return false; } // SCS
+bool DesignateCodingSystem(const VTID /*codingSystem*/) noexcept override { return false; } // DOCS
+bool Designate94Charset(const size_t /*gsetNumber*/, const VTID /*charset*/) noexcept override { return false; } // SCS
+bool Designate96Charset(const size_t /*gsetNumber*/, const VTID /*charset*/) noexcept override { return false; } // SCS
bool LockingShift(const size_t /*gsetNumber*/) noexcept override { return false; } // LS0, LS1, LS2, LS3
bool LockingShiftRight(const size_t /*gsetNumber*/) noexcept override { return false; } // LS1R, LS2R, LS3R
bool SingleShift(const size_t /*gsetNumber*/) noexcept override { return false; } // SS2, SS3
124 changes: 53 additions & 71 deletions src/terminal/adapter/terminalOutput.cpp
@@ -17,107 +17,89 @@ TerminalOutput::TerminalOutput() noexcept
_gsetTranslationTables.at(3) = Latin1;
}

-bool TerminalOutput::Designate94Charset(size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset)
+bool TerminalOutput::Designate94Charset(size_t gsetNumber, const VTID charset)
 {
-switch (charset.first)
+switch (charset)
 {
-case L'B': // US ASCII
-case L'1': // Alternate Character ROM
+case VTID("B"): // US ASCII
+case VTID("1"): // Alternate Character ROM
 return _SetTranslationTable(gsetNumber, Ascii);
-case L'0': // DEC Special Graphics
-case L'2': // Alternate Character ROM Special Graphics
+case VTID("0"): // DEC Special Graphics
+case VTID("2"): // Alternate Character ROM Special Graphics
 return _SetTranslationTable(gsetNumber, DecSpecialGraphics);
-case L'<': // DEC Supplemental
+case VTID("<"): // DEC Supplemental
 return _SetTranslationTable(gsetNumber, DecSupplemental);
-case L'A': // British NRCS
+case VTID("A"): // British NRCS
 return _SetTranslationTable(gsetNumber, BritishNrcs);
-case L'4': // Dutch NRCS
+case VTID("4"): // Dutch NRCS
 return _SetTranslationTable(gsetNumber, DutchNrcs);
-case L'5': // Finnish NRCS
-case L'C': // (fallback)
+case VTID("5"): // Finnish NRCS
+case VTID("C"): // (fallback)
 return _SetTranslationTable(gsetNumber, FinnishNrcs);
-case L'R': // French NRCS
+case VTID("R"): // French NRCS
 return _SetTranslationTable(gsetNumber, FrenchNrcs);
-case L'f': // French NRCS (ISO update)
+case VTID("f"): // French NRCS (ISO update)
 return _SetTranslationTable(gsetNumber, FrenchNrcsIso);
-case L'9': // French Canadian NRCS
-case L'Q': // (fallback)
+case VTID("9"): // French Canadian NRCS
+case VTID("Q"): // (fallback)
 return _SetTranslationTable(gsetNumber, FrenchCanadianNrcs);
-case L'K': // German NRCS
+case VTID("K"): // German NRCS
 return _SetTranslationTable(gsetNumber, GermanNrcs);
-case L'Y': // Italian NRCS
+case VTID("Y"): // Italian NRCS
 return _SetTranslationTable(gsetNumber, ItalianNrcs);
-case L'6': // Norwegian/Danish NRCS
-case L'E': // (fallback)
+case VTID("6"): // Norwegian/Danish NRCS
+case VTID("E"): // (fallback)
 return _SetTranslationTable(gsetNumber, NorwegianDanishNrcs);
-case L'`': // Norwegian/Danish NRCS (ISO standard)
+case VTID("`"): // Norwegian/Danish NRCS (ISO standard)
 return _SetTranslationTable(gsetNumber, NorwegianDanishNrcsIso);
-case L'Z': // Spanish NRCS
+case VTID("Z"): // Spanish NRCS
 return _SetTranslationTable(gsetNumber, SpanishNrcs);
-case L'7': // Swedish NRCS
-case L'H': // (fallback)
+case VTID("7"): // Swedish NRCS
+case VTID("H"): // (fallback)
 return _SetTranslationTable(gsetNumber, SwedishNrcs);
-case L'=': // Swiss NRCS
+case VTID("="): // Swiss NRCS
 return _SetTranslationTable(gsetNumber, SwissNrcs);
-case L'&':
-switch (charset.second)
-{
-case L'4': // DEC Cyrillic
-return _SetTranslationTable(gsetNumber, DecCyrillic);
-case L'5': // Russian NRCS
-return _SetTranslationTable(gsetNumber, RussianNrcs);
-default:
-return false;
-}
-case L'"':
-switch (charset.second)
-{
-case L'?': // DEC Greek
-return _SetTranslationTable(gsetNumber, DecGreek);
-case L'>': // Greek NRCS
-return _SetTranslationTable(gsetNumber, GreekNrcs);
-case L'4': // DEC Hebrew
-return _SetTranslationTable(gsetNumber, DecHebrew);
-default:
-return false;
-}
-case L'%':
-switch (charset.second)
-{
-case L'=': // Hebrew NRCS
-return _SetTranslationTable(gsetNumber, HebrewNrcs);
-case L'0': // DEC Turkish
-return _SetTranslationTable(gsetNumber, DecTurkish);
-case L'2': // Turkish NRCS
-return _SetTranslationTable(gsetNumber, TurkishNrcs);
-case L'5': // DEC Supplemental
-return _SetTranslationTable(gsetNumber, DecSupplemental);
-case L'6': // Portuguese NRCS
-return _SetTranslationTable(gsetNumber, PortugueseNrcs);
-default:
-return false;
-}
+case VTID("&4"): // DEC Cyrillic
+return _SetTranslationTable(gsetNumber, DecCyrillic);
+case VTID("&5"): // Russian NRCS
+return _SetTranslationTable(gsetNumber, RussianNrcs);
+case VTID("\"?"): // DEC Greek
+return _SetTranslationTable(gsetNumber, DecGreek);
+case VTID("\">"): // Greek NRCS
+return _SetTranslationTable(gsetNumber, GreekNrcs);
+case VTID("\"4"): // DEC Hebrew
+return _SetTranslationTable(gsetNumber, DecHebrew);
+case VTID("%="): // Hebrew NRCS
+return _SetTranslationTable(gsetNumber, HebrewNrcs);
+case VTID("%0"): // DEC Turkish
+return _SetTranslationTable(gsetNumber, DecTurkish);
+case VTID("%2"): // Turkish NRCS
+return _SetTranslationTable(gsetNumber, TurkishNrcs);
+case VTID("%5"): // DEC Supplemental
+return _SetTranslationTable(gsetNumber, DecSupplemental);
+case VTID("%6"): // Portuguese NRCS
+return _SetTranslationTable(gsetNumber, PortugueseNrcs);
default:
return false;
}
}

-bool TerminalOutput::Designate96Charset(size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset)
+bool TerminalOutput::Designate96Charset(size_t gsetNumber, const VTID charset)
 {
-switch (charset.first)
+switch (charset)
 {
-case L'A': // ISO Latin-1 Supplemental
-case L'<': // (UPSS when assigned to Latin-1)
+case VTID("A"): // ISO Latin-1 Supplemental
+case VTID("<"): // (UPSS when assigned to Latin-1)
 return _SetTranslationTable(gsetNumber, Latin1);
-case L'B': // ISO Latin-2 Supplemental
+case VTID("B"): // ISO Latin-2 Supplemental
 return _SetTranslationTable(gsetNumber, Latin2);
-case L'L': // ISO Latin-Cyrillic Supplemental
+case VTID("L"): // ISO Latin-Cyrillic Supplemental
 return _SetTranslationTable(gsetNumber, LatinCyrillic);
-case L'F': // ISO Latin-Greek Supplemental
+case VTID("F"): // ISO Latin-Greek Supplemental
 return _SetTranslationTable(gsetNumber, LatinGreek);
-case L'H': // ISO Latin-Hebrew Supplemental
+case VTID("H"): // ISO Latin-Hebrew Supplemental
 return _SetTranslationTable(gsetNumber, LatinHebrew);
-case L'M': // ISO Latin-5 Supplemental
+case VTID("M"): // ISO Latin-5 Supplemental
 return _SetTranslationTable(gsetNumber, Latin5);
default:
return false;
4 changes: 2 additions & 2 deletions src/terminal/adapter/terminalOutput.hpp
@@ -26,8 +26,8 @@ namespace Microsoft::Console::VirtualTerminal
TerminalOutput() noexcept;

wchar_t TranslateKey(const wchar_t wch) const noexcept;
-bool Designate94Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset);
-bool Designate96Charset(const size_t gsetNumber, const std::pair<wchar_t, wchar_t> charset);
+bool Designate94Charset(const size_t gsetNumber, const VTID charset);
+bool Designate96Charset(const size_t gsetNumber, const VTID charset);
bool LockingShift(const size_t gsetNumber);
bool LockingShiftRight(const size_t gsetNumber);
bool SingleShift(const size_t gsetNumber);