Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TableGen] Refactor SequenceToOffsetTable class #104986

Merged
merged 1 commit into from
Aug 23, 2024

Conversation

jurahul
Copy link
Contributor

@jurahul jurahul commented Aug 20, 2024

  • Replace use of std::isalnum/ispunct with StringExtras version to avoid possibly locale dependent behavior.
  • Remove static from printChar (do its deduplicated when linking).
  • Use range based for loops and structured bindings.
  • No need to use llvm:: for code in llvm namespace.

@jurahul jurahul force-pushed the sequence_to_offset_refactor branch from bff288e to 66c6eed Compare August 20, 2024 16:05
@jurahul jurahul requested review from MaskRay and jayfoad August 20, 2024 16:06
@jurahul jurahul marked this pull request as ready for review August 20, 2024 16:06
@llvmbot
Copy link
Member

llvmbot commented Aug 20, 2024

@llvm/pr-subscribers-llvm-adt

Author: Rahul Joshi (jurahul)

Changes
  • Replace use of std::isalnum/ispunct with StringExtras version to avoid possibly locale dependent behavior.
  • Add isPunct to StringExtras.
  • Remove static from printChar (do its deduplicated when linking).
  • Use range based for loops and structured bindings.
  • No need to use llvm:: for code in llvm namespace.

Full diff: https://github.com/llvm/llvm-project/pull/104986.diff

2 Files Affected:

  • (modified) llvm/include/llvm/ADT/StringExtras.h (+9)
  • (modified) llvm/utils/TableGen/Basic/SequenceToOffsetTable.h (+12-15)
diff --git a/llvm/include/llvm/ADT/StringExtras.h b/llvm/include/llvm/ADT/StringExtras.h
index 20e6ad1f68f996..3c0f30ce13cb16 100644
--- a/llvm/include/llvm/ADT/StringExtras.h
+++ b/llvm/include/llvm/ADT/StringExtras.h
@@ -140,6 +140,15 @@ inline bool isPrint(char C) {
   return (0x20 <= UC) && (UC <= 0x7E);
 }
 
+/// Checks whether character \p C is a punctuation character.
+///
+/// Locale-independent version of the C standard library ispunct.
+inline bool isPunct(char C) {
+  static constexpr StringRef Punctuations =
+      R"(!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~)";
+  return Punctuations.contains(C);
+}
+
 /// Checks whether character \p C is whitespace in the "C" locale.
 ///
 /// Locale-independent version of the C standard library isspace.
diff --git a/llvm/utils/TableGen/Basic/SequenceToOffsetTable.h b/llvm/utils/TableGen/Basic/SequenceToOffsetTable.h
index 09100b39650d81..497e74afc18ec9 100644
--- a/llvm/utils/TableGen/Basic/SequenceToOffsetTable.h
+++ b/llvm/utils/TableGen/Basic/SequenceToOffsetTable.h
@@ -15,20 +15,20 @@
 #ifndef LLVM_UTILS_TABLEGEN_BASIC_SEQUENCETOOFFSETTABLE_H
 #define LLVM_UTILS_TABLEGEN_BASIC_SEQUENCETOOFFSETTABLE_H
 
+#include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>
 #include <cassert>
-#include <cctype>
 #include <functional>
 #include <map>
 
 namespace llvm {
-extern llvm::cl::opt<bool> EmitLongStrLiterals;
+extern cl::opt<bool> EmitLongStrLiterals;
 
-static inline void printChar(raw_ostream &OS, char C) {
+inline void printChar(raw_ostream &OS, char C) {
   unsigned char UC(C);
-  if (isalnum(UC) || ispunct(UC)) {
+  if (isAlnum(UC) || isPunct(UC)) {
     OS << '\'';
     if (C == '\\' || C == '\'')
       OS << '\\';
@@ -126,7 +126,7 @@ class SequenceToOffsetTable {
   /// initializer, where each element is a C string literal terminated by
   /// `\0`. Falls back to emitting a comma-separated integer list if
   /// `EmitLongStrLiterals` is false
-  void emitStringLiteralDef(raw_ostream &OS, const llvm::Twine &Decl) const {
+  void emitStringLiteralDef(raw_ostream &OS, const Twine &Decl) const {
     assert(Entries && "Call layout() before emitStringLiteralDef()");
     if (!EmitLongStrLiterals) {
       OS << Decl << " = {\n";
@@ -140,9 +140,9 @@ class SequenceToOffsetTable {
        << "#pragma GCC diagnostic ignored \"-Woverlength-strings\"\n"
        << "#endif\n"
        << Decl << " = {\n";
-    for (auto I : Seqs) {
-      OS << "  /* " << I.second << " */ \"";
-      OS.write_escaped(I.first);
+    for (const auto &[Seq, Offset] : Seqs) {
+      OS << "  /* " << Offset << " */ \"";
+      OS.write_escaped(Seq);
       OS << "\\0\"\n";
     }
     OS << "};\n"
@@ -156,13 +156,10 @@ class SequenceToOffsetTable {
   void emit(raw_ostream &OS, void (*Print)(raw_ostream &, ElemT),
             const char *Term = "0") const {
     assert((empty() || Entries) && "Call layout() before emit()");
-    for (typename SeqMap::const_iterator I = Seqs.begin(), E = Seqs.end();
-         I != E; ++I) {
-      OS << "  /* " << I->second << " */ ";
-      for (typename SeqT::const_iterator SI = I->first.begin(),
-                                         SE = I->first.end();
-           SI != SE; ++SI) {
-        Print(OS, *SI);
+    for (const auto &[Seq, Offset] : Seqs) {
+      OS << "  /* " << Offset << " */ ";
+      for (const ElemT &Element : Seq) {
+        Print(OS, Element);
         OS << ", ";
       }
       OS << Term << ",\n";

@jurahul jurahul force-pushed the sequence_to_offset_refactor branch from 66c6eed to df5a13e Compare August 20, 2024 16:46
@jurahul jurahul marked this pull request as draft August 21, 2024 01:23
- Replace use of std::isalnum/ispunct with StringExtras version to
  avoid possibly locale dependent behavior.
- Remove `static` from printChar (do its deduplicated when linking).
- Use range based for loops and structured bindings.
- No need to use `llvm::` for code in llvm namespace.
@jurahul jurahul force-pushed the sequence_to_offset_refactor branch from df5a13e to b80c716 Compare August 21, 2024 21:24
@jurahul jurahul marked this pull request as ready for review August 22, 2024 01:20
@jurahul jurahul requested a review from kazutakahirata August 22, 2024 21:45
@jurahul
Copy link
Contributor Author

jurahul commented Aug 22, 2024

This is ready for review now. @kuhar @zwuis or @kazutakahirata

@jurahul jurahul changed the title [NFC][TableGen] Refactor SequenceToOffsetTable class. [TableGen] Refactor SequenceToOffsetTable class. Aug 22, 2024
@jurahul jurahul requested review from kuhar and zwuis August 23, 2024 10:55
Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I think that the majority of patches omit . in the subject.

@jurahul jurahul changed the title [TableGen] Refactor SequenceToOffsetTable class. [TableGen] Refactor SequenceToOffsetTable class Aug 23, 2024
@jurahul
Copy link
Contributor Author

jurahul commented Aug 23, 2024

I think that the majority of patches omit . in the subject.

Right, will remember going forward.

@jurahul jurahul merged commit a968ae6 into llvm:main Aug 23, 2024
10 checks passed
@jurahul jurahul deleted the sequence_to_offset_refactor branch August 23, 2024 18:14
5chmidti pushed a commit that referenced this pull request Aug 24, 2024
- Replace use of std::isalnum/ispunct with StringExtras version to avoid
possibly locale dependent behavior.
- Remove `static` from printChar (do its deduplicated when linking).
- Use range based for loops and structured bindings.
- No need to use `llvm::` for code in llvm namespace.
dmpolukhin pushed a commit to dmpolukhin/llvm-project that referenced this pull request Sep 2, 2024
- Replace use of std::isalnum/ispunct with StringExtras version to avoid
possibly locale dependent behavior.
- Remove `static` from printChar (do its deduplicated when linking).
- Use range based for loops and structured bindings.
- No need to use `llvm::` for code in llvm namespace.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants