-
Notifications
You must be signed in to change notification settings - Fork 608
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(ux): improve error message on unequal schemas during set ops #9115
Conversation
82d43e8
to
50f3693
Compare
ibis/expr/operations/relations.py
Outdated
if "Conflicting values" in str(e): | ||
raise RelationError(err_msg) from e | ||
else: | ||
raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should really avoid this sort of exception hacking unless it's required.
In no way are the messages coming out of exceptions stable, including messages coming from library internals.
If this ValueError
is something we control, then the appropriate thing to do here is to raise a more specific exception in the schema subtract operation and then catch that specific type here.
Raising ValueError
is about as useful as other mostly-useless idioms like raising Exception
or RuntimeError
, which means catching ValueError
is very heavy handed (and requires checking for specificity using string hacking).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, was just trying to keep this PR focused on the one thing. I'll switch the valueerror to a relation error :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, check out the new way I do it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, actually now I'm thinking about this more. If we make a new error, then if a user wants to do schema1 | schema2
, then they will need to import a special ibis exception so they can catch it, and remember its name etc. I actually am tempted to keep the error as some builtin exception and then expect us/users to match via the string. Perhaps we make it so the message string always has a "conflicting values:"
prefix (or some other key) and that is what we match on?
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very against reusing standard library exceptions unless it unambiguously makes sense.
ValueError
IMO almost never makes sense to reuse. Generally speaking exceptions that map to a huge variety of errors are terrible candidates for reuse, because any downstream code that uses a ValueError
exception handler has now opted into a potentially massive set of error conditions. It's really not that different from catching Exception
.
That video isn't about matching builtins AFAICT, it's about not naming exceptions redundantly.
I am definitely on board with avoiding the SomethingWentWrongException
in favor of SomethingWentWrong
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I linked to the very start of where he starts talking about exceptions, later he actually gets to his thesis of "stop writing custom exceptions". He talks about that more scattered through the whole talk, you sorta have to listen to the whole thing to get everything (not expecting you to)
Yeah that is the downside to using generic exceptions. I think there are pros and cons of both, which I prefer probably depends on the circumstance. One factor is how likely is the user going to realize something is fallible and try to catch it. If they always wrap the schema1 | schema2
call in a try/catch, then the bubbling up error isn't really a problem. But if someone forgets to do that, then then genericness is a problem. I might expect with no evidence, that in this case, the vast majority of people would not wrap these set ops in a try/catch (they are probably gonna be data scientists) and thus that makes me more likely to agree with you.
I guess the thing I want to avoid is ibis exporting 20 different errors that someone has to keep track of. I would love it if we could use some other existing ibis Exception class here, or do something to simplify the situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other typical approach is to have a generic class of error and then some kind of enum-like object embedded in it that allows someone to differentiate.
This is effectively OOP-ization of the errno
approach from C programming. OSError
in the stdlib is a good example of this approach.
As a user, I don't really care how many exception types there are. I care about getting specific, actionable feedback about what went wrong. If there are 20 ways in which things can go wrong, I want to know about that. Sure, those could all be just IbisError
. From an interactive-mode-mostly-end-user perspective, I don't the exception count matters.
As a library author I care about the number of exception types only if they aren't stable. Even in an unstable world where exceptions are changing often, I would much rather have to deal with 20 exception types than have 20 different if "string" in str(e)
-style checks littered all over my code, liable to break in hard-to-debug ways just because someone decided to fix a typo or capitalize a letter in their error message.
I'm not saying we need a new exception type for every possible error. We should determine whether it's appropriate on a case by case basis and we shouldn't set some arbitrary limit like "no more than 20 custom exception types".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOL "Define your own exceptions when you need to"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOL "Define your own exceptions when you need to"
lol yeah I wish he had seen how this was funny :), can't believe the audience doesn't laugh there more
OK I think I'm happy, I just wanted to make sure you had heard this train of thought, but it appears you have thought about it plenty and know what you're looking for. Sounds fine to me then! Is there something we should to to remind ourselves of all of the custom exceptions we define that we will have to eventually expose once we make that stable? maybe just a tracking issue that just lists them all? I'm worried that eg this one hidden away in this corner will slip through the cracks and won't be considered once we merge/publish the errors
3de6eae
to
e13cf23
Compare
I'll rebase your PR to pick up the fix for the failing docs build. |
I got tired of
ibis.union()
failing but only saying "the schemas are different"The relevant tests are in ibis/tests/expr/test_set_operations.py, looks like the bases are already covered pretty well, don't think I need to add any more