Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Now that we have a mechanism for blocking subclassing, where should we use it? #1044

Closed
njsmith opened this issue May 7, 2019 · 9 comments
Closed

Comments

@njsmith
Copy link
Member

njsmith commented May 7, 2019

Followup to #1021

Possibly we should use it on... all classes? (Except ABCs.) Possibly not.


Edit on 2020-05-07 to add more rationale: The question here is about whether we should forbid subclassing on Trio classes.

Subclassing is a controversial topic in general – some people argue that it's an anti-pattern in general; some think there are situations where it's appropriate. But I think even subclassing advocates mostly agree that subclassing only works well when the base class is intentionally designed to be subclassed. In particular, without this, you can easily create fragile coupling between the internals of the base class and the subclass. For example, let's say we have a base class that uses the common trick where one method is implemented using another:

class BaseClass:
    def my_op_basic(self, *basic_args):
        ...

    def my_op_extended(self, *extended_args):
        basic_args = process_extended_args(*extended_args)
        return self.my_op_basic(*basic_args)

If you make a subclass and want to customize how all the different my_op variations work, the obvious probably just override my_op_basic, because that automatically covers both cases:

class SubClass(BaseClass):
    def my_op_basic(self, *basic_args):
        # ... custom logic here ...

    # no override for my_op_extended; base class version is fine

We test it, and it works great: our custom logic runs on SubClass.my_op_basic and SubClass.my_op_extended, because BaseClass.my_op_extended internally invokes SubClass.my_op_basic. But then, someone refactors BaseClass, and moves the main logic into my_op_extended:

class BaseClass:  # version 2
    def my_op_basic(self, *basic_args):
        extended_args = process_basic_args(*basic_args)
        return self.my_op_extended(*extended_args)

    def my_op_extended(self, *extended_args):
        ...

Now, BaseClass's public API hasn't changed at all – but this internal refactoring is still a breaking change for SubClass! Now calls to SubClass.my_op_basic will invoke the custom logic, but calls to SubClass.my_op_extended won't!

The point is: subclasses don't just rely on base class public APIs; they also rely on details of how the base class methods are implemented. And then something that looks like an innocent refactoring suddenly becomes a breaking change. That can be fine in some cases: if both classes are in the same project, then the tests will probably catch it and you can refactor SubClass as well. Or, if the base class was explicitly designed for subclassing so these internal implementation details are documented as part of the public API, then it's not an issue.

But for most of the classes that Trio exports as part of its public API, the potential subclassers are Trio's users, and we can't see their code. So if we do an innocent-looking refactoring and it breaks our users code, we have no way to notice that until we ship and their system falls over. And, most of the classes we export aren't carefully designed with subclassing in mind – that takes a lot of work, and isn't useful in most cases. (The main exception are the classes in trio.abc, which are explicitly designed with subclassing in mind, and their inter-method dependencies are explicitly documented.)

We make promises to our users about API stability. For most of our classes, if users are subclassing them, those promises are currently a lie – we actually have no idea which of our changes might break user code. Lying is bad. It's better to explicitly disable subclassing and give users a clear error up front, instead of letting them ship code that relies on subclassing and then starts failing in mysterious ways.

@njsmith
Copy link
Member Author

njsmith commented Jun 9, 2019

From a quick skim, here are some classes that should probably be marked NoPublicConstructor:

For Final... I'm actually serious that it should potentially be used for every single public class in Trio, outside of trio.abc. There are a very small number of places where we use inheritance ourselves (StrictFIFOLock), but in general Trio doesn't use subclassing and its classes haven't been designed with any consideration for subclassing.

@njsmith
Copy link
Member Author

njsmith commented Aug 30, 2019

This is some kind of argument, but I'm not sure if it's for or against :-)

https://trio.discourse.group/t/socketstream-putback/208/8?u=njs

@belm0
Copy link
Member

belm0 commented Aug 30, 2019

MultiError? #1199

@njsmith
Copy link
Member Author

njsmith commented Aug 31, 2019

@belm0 yeah I thought about bringing this up in #1199, but decided there wasn't any point. MultiError is going to get killed once exceptiongroup is ready, so there's no point in messing around with it too much, and IIUC anyio's subclassing is just a workaround for exceptiongroup not being ready yet anyway; no point in making @agronholm's life difficult by outlawing his temporary workaround. And I think exceptiongroup will have to allow subclassing because Yury wants to make an empty subclass class AsyncioExceptionGroup(ExceptionGroup, Exception) as a partial workaround for try/except not being exception-group-aware. For the eventual PEP version we might be able to switch it back to no-subclasses though, depending on what happens with try/except.

@njsmith
Copy link
Member Author

njsmith commented Oct 17, 2019

I think this issue has a pretty compelling argument that we should be marking most of our classes as final: #1247

Specifically, notice the subclass of trio.Lock. It looks nice and simple, but! It depends on the fact that when trio.Lock.acquire acquires the lock, it always calls trio.Lock.acquire_nowait internally. But this is an internal implementation detail, not part of the public API. For example, in the future we might refactor trio.Lock so that acquire and acquire_nowait both call a new _acquire_nowait_internal method. But if folks are subclassing Lock like this, then that internal refactoring would become a public breaking change. And here we have proof that at least one person is subclassing Lock like this.

So, we're forced to pick one:

  1. Commit to never changing the internal details of Lock in the future
  2. Decide that it's fine if folks want to do this, and accept that in the future we'll randomly break users in without warning, and possibly in subtle and confusing ways (e.g. here you could end up with acquire and acquire_nowait having incompatible semantics, which is super confusing behavior for a lock)
  3. Decide now that this isn't supported, and generate a nice up-front error message saying so

None of these is ideal. But given that we have to pick one, I think (3) is the least-worst.

@smurfix
Copy link
Contributor

smurfix commented Oct 17, 2019

Another option would be (4) refactor the code of affected classes so that no public method calls another public method.

Yes I know you're generally not in favor of the "add behavior by subclassing" pattern, and there are reasons for that, but our users might still have their own reasons for doing things "their way". Trio shouldn't take options away from its users without a compelling reason.

@agronholm
Copy link
Contributor

AnyIO currently subclasses trio.Condition, though I could refactor it to use compositing instead as I do with the Event class.

@njsmith
Copy link
Member Author

njsmith commented Oct 17, 2019

Another option would be (4) refactor the code of affected classes so that no public method calls another public method. [...] Trio shouldn't take options away from its users without a compelling reason.

"We don't want to have to carefully audit and refactor all our code" is a pretty compelling reason IMO :-). We'd need to like... implement some kind of static analysis to enforce that public methods never call public methods? That sounds like a lot of work for unclear advantages.

(We can also make things final now, and change our minds later on a case-by-case basis – if we decide that it's worth doing the work to properly support subclassing for some class, then transitioning from final classes → allowing subclassing is a backwards compatible change. So we can afford to hold off on investing in static analysis etc. until we find a strong motivating example.)

@njsmith njsmith changed the title Decide where to use trio._util.Final Now that we have a mechanism for blocking subclassing, where should we use it? May 8, 2020
@oremanj
Copy link
Member

oremanj commented May 12, 2020

#1501 followed through with making most public classes final.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants