Output pretty error if possible #399

fantix · 2022-11-17T15:14:40Z

Error position and hint are now included by default if present; this can be overridden with EDGEDB_ERROR_HINT (enabled/disabled).

The output is also colored if stderr refers to a terminal; this can be overridden with EDGEDB_COLOR_OUTPUT (auto/enabled/disabled).

Fixes #395
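For readers who want a feel for the "colored if stderr is a terminal, unless overridden" rule, here is a minimal sketch. It is not the code in this PR: `should_color` is a hypothetical helper, and treating unrecognized values like `auto` is an assumption made only for this example.

```python
import os
import sys


def should_color(stream=None, env=None):
    """Hypothetical helper: decide whether to emit ANSI colors."""
    stream = sys.stderr if stream is None else stream
    env = os.environ if env is None else env
    value = env.get("EDGEDB_COLOR_OUTPUT", "auto").lower()
    if value == "enabled":
        return True
    if value == "disabled":
        return False
    # "auto" (and, in this sketch only, any unrecognized value):
    # color only when the stream reports that it is a terminal.
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())
```

Passing the stream and environment as arguments keeps the decision easy to test without touching the real `os.environ`.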

c.query("select $0")

c.query("select ('\033[95m嘿嘿嘿', 1 < '\033[95m哈哈😁lol');")

c.query("""\
select (
    '\033[95m哈哈哈哈哈哈哈😁', '\033[95m嘿嘿嘿' < (
        2, 3, 4,
    ), 345
);
""")

Error position and hint are now included by default if present, colored if the stderr refers to a terminal, overridden by EDGEDB_PRETTY_ERROR.

edgedb/errors/_base.py

1st1

Overall looks amazing. I didn't thoroughly review the printing logic, though; at a cursory glance it looks OK. I'd change the way we define valid values for env vars to enums, like in the server.
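A rough sketch of what enum-backed env var values could look like. The names `ColorOutput` and `read_enum_env` are hypothetical and are not taken from the server or from this PR; the accepted values are the ones listed in the PR description.

```python
import enum
import os


class ColorOutput(enum.Enum):  # hypothetical name
    AUTO = "auto"
    ENABLED = "enabled"
    DISABLED = "disabled"


def read_enum_env(name, enum_cls, default, env=None):
    """Hypothetical helper: parse an env var into a member of enum_cls."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return default
    try:
        # Enum lookup by value; normalize case and surrounding whitespace.
        return enum_cls(raw.strip().lower())
    except ValueError:
        allowed = ", ".join(m.value for m in enum_cls)
        raise ValueError(
            f"{name} must be one of: {allowed}; got {raw!r}"
        ) from None
```

The benefit is that the set of valid values lives in one place and an invalid value produces a message listing the alternatives.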

Also: * Extracted color impl * Recover from error formatting failure

msullivan · 2022-11-18T20:32:39Z

edgedb/errors/_base.py

+    return rv.getvalue()
+
+
+def _unicode_width(text):


Note that this is an approximation that will overestimate the length of strings using combining characters that don't normalize away.
This means we can get stuff like

cannot redefine property 'x' of object type 'std::FreeObject' as scalar type 'std::str'
  ┌─ query:2:26
  │
2 │ select { x := 1 } { x := 'f̷͈͎͒̕ǫ̴̏͌ö̶̱̘' };
  │                          ^^^^^^^^^^^^^^^^ error

https://zalgo.org/ has a great generator for pathological unicode strings.

Probably this is almost always fine.
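To make "combining characters that don't normalize away" concrete, a small standalone illustration (not code from the PR; `nfc_len` is a hypothetical helper): NFC normalization folds some base-plus-mark pairs into a single code point, but overlay marks of the kind zalgo generators use have no precomposed form, so a per-code-point count stays too high.

```python
import unicodedata


def nfc_len(text):
    """Number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


accented = "e\u0301"       # "e" + COMBINING ACUTE ACCENT, composes to U+00E9
zalgo = "o\u0334\u0336"    # "o" + two overlay marks, no precomposed form

print(nfc_len(accented))   # 1
print(nfc_len(zalgo))      # 3, although it renders in one cell
```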

The CLI's error formatter does seem to handle this one right. I'm not sure exactly what algorithm it uses.
I think to some extent anything we do is an approximation because it depends on assumptions about how the terminal will render it.

Unicode does have a notion of "grapheme clusters", and there are python libraries for handling them (https://github.com/alvinlindstam/grapheme). I'm not sure that this is the best approach either, though, since the width of those clusters can vary and I don't know if there is good way to guess that well. (Consider 🇨🇦, which depending on your terminal will show as either a flag or the letters CA, but either way will probably take two characters, but is only one grapheme cluster.)
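For reference, the flag example can be inspected with just the standard library. This sketch only shows that the flag is a pair of regional indicator code points; it does not attempt grapheme segmentation.

```python
import unicodedata

flag = "\U0001F1E8\U0001F1E6"  # the CA flag: two regional indicator symbols

for ch in flag:
    # Print each code point and its Unicode name.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

print(len(flag))  # 2 code points, but a single grapheme cluster
```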

Maybe the best approach on our side would be basically just to drop combining characters.
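A minimal sketch of "just drop combining characters", assuming that nonspacing (Mn) and enclosing (Me) marks are the ones to discard. `drop_combining` is a hypothetical name, and this is one possible reading of the idea rather than what was eventually merged.

```python
import unicodedata


def drop_combining(text):
    """Remove nonspacing and enclosing marks; keep everything else."""
    return "".join(
        c for c in text
        if unicodedata.category(c) not in ("Mn", "Me")
    )


# "foo" with a few overlay/below marks attached to each letter
print(drop_combining("f\u0337\u0348o\u0334o\u0336"))  # foo
```

Precomposed letters are untouched, since they are single code points with a letter category.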

There is a potentially infinitely deep rabbit hole here.

The CLI uses the de facto standard crate in Rust: https://crates.io/crates/unicode-width
It uses Unicode Standard Annex #11 for calculating width.

Note: unicode width != number of graphemes, as the width of a single grapheme can be two character cells.
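Python's standard library exposes the East Asian Width property from Annex #11, which is enough to see the difference between one grapheme and one cell. A short illustration (the helper name `eaw` is made up for this example):

```python
import unicodedata


def eaw(ch):
    """East Asian Width class of a single character."""
    return unicodedata.east_asian_width(ch)


for ch in ("a", "\u563f", "\U0001F601"):
    # "W" (wide) characters conventionally occupy two terminal cells.
    print(f"U+{ord(ch):04X}", eaw(ch))
```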

And you're right that in the general case it's impossible to calculate the width, because the width of some characters depends not just on the terminal but also on the specific font (i.e. whether it has a glyph for a specific sequence of emojis). unicode-width's front page gives an example.

Dropping Zero-width Joiners

I've tested the three terminals I have around (xfce4, alacritty, wezterm), and in all of them the rendered width matches the unicode width, even though some of them do not render some of the characters (e.g. flags are only supported in wezterm). Disclaimer: alacritty and wezterm are implemented in Rust, which explains it a bit, although xfce4 is not and has a much longer history.

Copying from any of the three terminals I've tried discards the zero-width joiner by itself.

Rendering in Firefox on my machine gives the wrong width regardless of whether there is a zero-width joiner (i.e. it somehow draws emojis at 1.5 character widths 🤷).

Rendering in Chrome on my machine gives good results when the zero-width joiner is dropped (as when copying normally from the terminal), but doesn't if I somehow manage to keep that character (e.g. by piping the output to xsel -b manually).

Note: zalgo-generated stuff looks like plain foo in all three terminals (although those unicode tricks are present in the output)

The summary of this: dropping zero-width joiners is safe to do, although it looks unnecessary in my setup. If that turns out not to hold for some popular terminals or on other systems, it could be done.
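If it ever becomes necessary, dropping the joiner is a one-liner. A sketch, with `drop_zwj` as a hypothetical name:

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def drop_zwj(text):
    """Remove zero-width joiners, leaving the joined characters in place."""
    return text.replace(ZWJ, "")


# man + ZWJ + woman + ZWJ + girl: one "family" emoji where supported
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
print(len(family), len(drop_zwj(family)))  # 5 3
```

After dropping the joiners the sequence falls back to three separate emojis, which is what terminals without ZWJ sequence support draw anyway.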

Fixup

For my examples this function works:

import unicodedata

def unicode_width(s):
    # 0 for combining/format chars, 2 for East Asian Wide, else 1
    return sum(
        0 if unicodedata.category(c) in ('Mn', 'Cf')
        else 2 if unicodedata.east_asian_width(c) == "W"
        else 1
        for c in s
    )

But I'm not sure if this is good enough. The libraries that do this (unicode-width in Rust and urwid in Python) have their own width tables.
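To show where such a width function would matter in the error output, here is a rough sketch of building the `^^^` marker row under a source line. `caret_line` is a hypothetical helper and not the formatter from this PR; `unicode_width` repeats the approximation from the comment above so the block runs on its own.

```python
import unicodedata


def unicode_width(s):
    # Same approximation as in the comment above.
    return sum(
        0 if unicodedata.category(c) in ("Mn", "Cf")
        else 2 if unicodedata.east_asian_width(c) == "W"
        else 1
        for c in s
    )


def caret_line(line, start, end):
    """Hypothetical helper: marker row for line[start:end] (code point indices)."""
    pad = unicode_width(line[:start])                 # cells before the span
    span = max(1, unicode_width(line[start:end]))     # at least one caret
    return " " * pad + "^" * span


line = "select '\u563f\u563f' < 1"
print(line)
print(caret_line(line, 7, 11))
```

With a plain `len()` the marker would be four carets wide here; counting wide characters as two cells gives six, which matches what the terminals tested above draw.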

Added in #404


Changes
=======

* Output pretty error if possible (#399) (by @fantix in a2bec18 for #399)
* Codegen: allow providing a path after --file (#400) (by @fantix in 6bce57e for #400)

Fixes
=====

* Handle ErrorResponse in ping (#393) (by @fantix in 8b28947 for #393)
* Disallow None in elements of array argument (by @fantix in 26fb6d8)

Docs
====

* Remove references to unix-domain sockets (by @quinchs in 4b8bec6)

fantix added 4 commits November 17, 2022 10:02

Output pretty error if possible

b05b319

Error position and hint are now included by default if present, colored if the stderr refers to a terminal, overridden by EDGEDB_PRETTY_ERROR.

OS-specific line separator

17c6997

Safe lazy color initialization

0595db9

Able to turn off error hint by EDGEDB_ERROR_HINT=off

89399c1

1st1 reviewed Nov 18, 2022


1st1 approved these changes Nov 18, 2022


CRF: use consistent env var values

0a5acae

Also: * Extracted color impl * Recover from error formatting failure

fantix merged commit 22e9daf into master Nov 18, 2022

fantix deleted the pretty-error branch November 18, 2022 17:38

msullivan reviewed Nov 18, 2022


This was referenced Nov 23, 2022

edgedb-python 1.2.0 #403

Merged

Improve unicode width in error output #404

Merged

nsidnev mentioned this pull request Sep 26, 2023

render error hints edgedb/edgedb-elixir#167

Merged

aljazerzen mentioned this pull request Feb 23, 2024

v1.9.0 #482

Closed

This was referenced Aug 2, 2024

1.6.1 #513

Closed

edgedb-python 1.6.1 #516

Closed


tailhook Nov 21, 2022

fantix Nov 23, 2022
