Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Teach AUTOs to process escaped backslashes at the end of quoted strings (#1831) #1840

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

acr4
Copy link
Contributor

@acr4 acr4 commented Sep 9, 2023

"...\..." is a string containing a backslash:
"..."..." is a string containing a double-quote: "

When parsing double-quoted strings, we must detect embedded escaped backslashes and double-quotes. Detecting when a string terminates with one or more of these characters is tricky. Strings terminating in "..."" are not uncommon, and have reasonably robust support, but strings terminating in "...\" are quite rare, and were quite broken.

This patch aims to improve the detection of "...\" (or "...\\", or any even number of backslashes in a row immediately preceding a double-quote string terminator.

…gs (veripool#1831)

  "...\\..." is a string containing a backslash: \
  "...\"..." is a string containing a double-quote: "

When parsing double-quoted strings, we must detect embedded escaped backslashes
and double-quotes. Detecting when a string terminates with one or more of these
characters is tricky. Strings terminating in "...\"" are not uncommon, and have
reasonably robust support, but strings terminating in "...\\" are quite rare,
and were quite broken.

This patch aims to improve the detection of "...\\" (or "...\\\\", or any even
number of backslashes in a row immediately preceding a double-quote string
terminator.
@acr4
Copy link
Contributor Author

acr4 commented Sep 9, 2023

@wsnyder this isn't ready yet... It works on some test-cases, but this trivial one fails:

module m;
   always @(/*AS*/) z = a ? b : c;
   string trailing_backslash = "...\\\"\\";
endmodule

@acr4
Copy link
Contributor Author

acr4 commented Sep 9, 2023

verilog-scan-region behaves better (at least in terms of accuracy, but not necessarily performance) by scanning forward one character at a time (and two chars for backslash-escaped things) like this:

                 ((looking-at "\"")
                  (setq pt (point))
                  ;; Move forward to the first non-escaped (backslashed) double-quote ("). That closes this string.
                  ;;   skip: \" (an escaped double-quote)
                  ;;   match: \\" (a backslash as the last character of a string)
                  (forward-char 1)
                  (while (not (looking-at-p "\""))
                    (if (looking-at-p "\\\\")
                        (forward-char 2)
                      (forward-char 1)))
                  (forward-char 1)
                  (put-text-property (1+ pt) (point) 'v-cmts t))

I'm sure this could be done more cleverly with some (search-forward "[\\\\\"]") or similar but the backslashing for elisp regular expressions is breaking my brain.

But now I'm starting to peel the onion a little more and I see that both verilog-read-decls and verilog-read-always-signals-recurse both look for balanced quotes, and emit errors with the string "Unmatched quotes" when they get confused. That's happening for the trivial example above with a string data-type.

I suspect we may need to add a new function that abstracts complex string traversal and returns the end-point for a given double-quotes start-point, then call that wherever string boundaries need to be detected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant