connection_loss thrown on protocol errors #686

alexolog · 2023-05-08T00:01:02Z

As a follow-up to #287:

If other people get the same message, hopefully the error message will take them to a discussion such as this one.

I am getting the same message, and it took me to that discussion

When I generate the problem, I get SQL error code 08P01: protocol_violation.

And herein lies the problem: pqxx::broken_connection does not expose the error code. If it did, I could check it and decide what to do based on that.

Passing the wrong number of parameters to a statement would normally indicate a bug in the application, so hopefully this complication won't inconvenience too many people.

A reasonable assumption that a pqxx::broken_connection just indicates a connection loss, could (and has!) lead one to write code that catches the exception and attempts to reconnect and resubmit, resulting a loop that cannot be resolved.

At least you get a somewhat helpful error message.

Parsing an error message is a terrible advice. That's what error codes are for!

If you have any good ideas about detecting this exact problem without risking confusion with different errors, feel free to file a new one.

The simple solution is to attach an error code to the exception.
In fact, any exception generated from a condition that has an error code (as per https://www.postgresql.org/docs/current/errcodes-appendix.html), should expose it.

Looking at result.cxx I see that the problem is with all of class 08, as well as with codes 53300 and 42501.

There are several options to deal with it:

Move connection_error under sql_error.
Introduce an intermediary failure_with_code or sql_failure as a parent of both connection_error and sql_error, and move sqlstate() to it.
Just add sqlstate() to connection_error.

I personally prefer #1.

Thank you!

The text was updated successfully, but these errors were encountered:

alexolog · 2023-05-09T21:57:11Z

I looked at other exceptions that don't expose an error code:

pqxx::too_many_connections - a subclass of pqxx::broken_connection for error code 53300.
pqxx::insufficient_privilege - doesn't expose code 42501, but does expose other codes.
pqxx::blob_already_exists - couldn't find any code that throws it.
pqxx::in_doubt_error - not triggered by postgres, so no error code.
pqxx::variable_set_to_null - not triggered by postgres, so no error code.

jtv · 2023-05-10T09:53:48Z

Okay, I'll look into exposing sqlstate on more exception classes. Thanks for digging into this. Haven't had time to respond, but I'll incorporate it one way or another.

See #686. This is not a full implementatino of what is suggested there. But it will enable a program to avoid the problem of catching a `broken_connection` and retrying ad infniitum.

jtv · 2023-05-13T21:16:50Z

Doing #687 as a first step in addressing the concrete hazard, by enabling a broken application to detect that a particular broken_connection is due to a bug in its own code and so not worth retrying. We'll want to add sqlstate to more exception classes, but that will take a bit more work.

(For the record, nobody recommended parsing error messages. I totally agree that parsing error messages would be a terrible suggestion, but that is just not a thing that happened.)

alexolog · 2023-05-15T22:44:24Z

Does it break the connection in the sense that a reconnect is needed?

jtv · 2023-05-16T00:30:29Z

Yes, a broken_connection signals that the connection is broken and you'll need a new one.

That's also one of the reasons by the way why I don't think it'd work to make broken_connection a subclass of sql_error. An sql_error normally means that your transaction is doomed, but you can still use your connection. A broken_connection means your entire connection is shot. (There are other reasons as well.)

I had a look into what it would take to add sqlstate to broken_connection, but there are complications. In most code paths where broken_connection can occur, there is no actual sqlstate variable that we can query — because that information only comes with a query result. In most cases we'd have to make our own decisions about which sqlstate string to use in which scenario. That would leave us with a baroque boundary of choices and responsibilities. Your application might make the same call twice and hit two connection errors with the same sqlstate code — but one might be coming from the database, and the other might originate inside libpqxx with an sqlstate code that libpqxx picked. In one case a libpq upgrade might change the code, in the other a libpqxx upgrade might change the code.

In most cases, libpqxx would choose an sqlstate code, but it can't necessarily promise that it'll be a good choice. For example, when unable to connect to the server, how would libpqxx know whether to use 08001 (SQL-client unable to establish SQL-connection) or 08004 (SQL-server rejected establishment of SQL-connection)? What if it used 08000 and maybe later versions made the code more specific in some scenarios? It seems quite arbitrary and subject to change, which still leaves you with a messy decision matrix for drawing useful conclusions from the sqlstate.

So for now I'm leaning in the direction of keeping the existing approach: if you've got a useful distinction between errors, I try to give them separate classes, and inheritance keeps the change compatible with older code. When that doesn't work, we discuss and look for a better solution. And from a practical perspective, I think "my app needs to know this distinction" is a better driver for that discussion than "which is the more appropriate sqlstate code."

alexolog · 2023-05-17T15:13:07Z

It is a little surprising that a failure to prepare kills a connection.
Does one need to recreate a connection every on every broken_connection exception? Are there "best practice" recommendations (or better yet, sample code)?

Thanks!

jtv · 2023-05-18T09:37:27Z

Yes, it's absolutely surprising! But not much I can do about it.

Yes, a broken_connection signals that the connection is broken and you'll need to either give up and report an error, or make a new connection.

Best practices? The usual stuff I suppose... Don't retry the same thing forever. Report exceptions somewhere unless you know very precisely what caused them and why it's irrelevant, so a human can see problem patterns happening. Don't over-complicate error handling, especially since error handling code is often hard to test thoroughly. Accept that anything can fail, including your own code, so don't try to fix every possible problem. And of course if you want to reconnect, you'll have to have your connection string again.

See #686. This is not a full implementatino of what is suggested there. But it will enable a program to avoid the problem of catching a `broken_connection` and retrying ad infniitum.

This came up in #686.

jtv added a commit that referenced this issue May 13, 2023

New exception subclass: protocol_violation.

470d679

See #686. This is not a full implementatino of what is suggested there. But it will enable a program to avoid the problem of catching a `broken_connection` and retrying ad infniitum.

jtv mentioned this issue May 13, 2023

New exception subclass: protocol_violation. #687

Merged

jtv added a commit that referenced this issue May 20, 2023

Retire unused blob_already_exists exception.

2e54ad4

This came up in #686.

jtv mentioned this issue May 20, 2023

Retire unused blob_already_exists exception. #693

Merged

jtv added a commit that referenced this issue May 21, 2023

Retire unused blob_already_exists exception. (#693)

dce21e8

This came up in #686.

jtv closed this as completed Jun 11, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

connection_loss thrown on protocol errors #686

connection_loss thrown on protocol errors #686

alexolog commented May 8, 2023 •

edited

Loading

alexolog commented May 9, 2023

jtv commented May 10, 2023

jtv commented May 13, 2023

alexolog commented May 15, 2023

jtv commented May 16, 2023

alexolog commented May 17, 2023

jtv commented May 18, 2023

connection_loss thrown on protocol errors #686

connection_loss thrown on protocol errors #686

Comments

alexolog commented May 8, 2023 • edited Loading

alexolog commented May 9, 2023

jtv commented May 10, 2023

jtv commented May 13, 2023

alexolog commented May 15, 2023

jtv commented May 16, 2023

alexolog commented May 17, 2023

jtv commented May 18, 2023

alexolog commented May 8, 2023 •

edited

Loading