Backwards compatibility #426

Closed

Reinmar opened this issue Apr 25, 2017 · 12 comments


Reinmar commented Apr 25, 2017

It's time we talk a bit about backwards compatibility.

We're getting close to 1.0.0. To make it clear, we won't stabilise the API at this point yet. We need more time to do the postponed refactorings (including some quite serious ones). It's impossible to come up with a perfect API from day one and we're fully aware of that, so we reserve some time after the initial releases to stabilise the API.

This post is, therefore, about the situation after the initial stabilisation phase.

Backwards compatibility – burden & benefit

I think that our situation is very similar to Drupal's. We have a similar philosophy of being a framework, but with sane defaults. We're also focused on being open and on creating a platform for contrib modules. Additionally, we also offer concrete, ready-to-use builds, which, I think, doesn't apply to Drupal. But the basics are very similar.

Therefore, I'd like to recommend reading @wimleer's Backwards compatibility – burden & benefit. It touches a range of aspects which also apply (or will apply) to our case:

  1. APIs are never perfect. They may always need to be changed, even when working on a bug fix.
  2. For the same reason, freezing an API is an unrealistic assumption.
  3. What applies to a single component's API also applies to the entire constellation of components. Architecture may also need to change with time.
  4. Ensuring BC is costly (tests, increased complexity and code size) and can't be done forever.
  5. Differentiation between public and internal APIs may help to reduce the burden to a smaller set of APIs.
  6. It's easier to maintain functionality (in terms of BC but also bug rate) than APIs. APIs often have a much wider surface than the functionality they implement.
  7. And a lot more... I really recommend thoroughly reading Wim's presentation.

CKEditor 5's BC

As far as I can see, we have around 210 classes and 50+ other modules. All of this is currently public API. It means that a non-BC change in any of these modules will result in a major release of at least that one package.

I can already tell you that either we'll have major releases in nearly every sprint or we'll be extremely limited when making changes (fixing bugs, improving functionality, adding features to existing modules). The latter leads to a state which we know very well from CKEditor 4 – one where there's no hope for big chunks of the code and where any change costs 3-10x as much as it could, because you need to consider the X, Y, Z limitations of the existing codebase.

However, CKEditor 4 was in a much better situation than CKEditor 5 – CKEditor 4 has far less public API. Features are usually implemented as monoliths and a big part of the core is tightly coupled, making the public stuff there effectively our internal code (since it's not reusable). This means that far more things can be changed without touching the public API.

All this means that we need a better plan for CKEditor 5 – rules which will allow us to make necessary changes in the least disturbing ways for external developers.

I don't want to come here with a definitive plan, so here are some ideas from which we can choose.

  1. We can use the idea from Drupal to mark APIs with @internal (would apply to entire modules). We have @private already but I'd differentiate between these two situations. A module/class is @private so it should never be used (or rather – you can use it but don't expect anything – this API was not meant to be seen from a wider perspective). If something is internal it means that breaking changes in this piece of API will not cause major releases (they will be marked as "Internal" so they will not land in the changelog) but it's a real API, which made some sense for us so it may make some sense for you. I guess that nearly all modules inside feature packages, except plugins themselves, could be marked as internal. They have the highest chance to change and plugins have no API themselves (and can't be internal anyway). This way we can implement functionality and reduce the surface of the API while not blocking developers from actually reusing pieces of the code if they really want.
  2. We can (and actually did already in the README) split the packages into "framework" and "features". Based on that we can explain that we'll aim at having breaking changes in the framework less often than in the features. That will be helpful for the developers because their code will depend much more on the framework than on end-features. It's more acceptable if we introduce breaking changes in the image feature's API than somewhere in the engine.
  3. We could define an additional flag like @apiStability <level> to mark which APIs we consider stabilised and which are still subject to change in the near future. It may help especially in the early phase, when some things will stabilise sooner than others. However, maintaining such flags for the entire codebase would be quite a lot of work, so I'm not sure they would really be helpful.
  4. Finally, we should not forget about the changelog. The way we describe breaking changes can make upgrading to a new version either a nightmare for other developers or a still painful (let's be honest – no one likes breaking changes) but at least clear path.
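To illustrate idea (1), here is a rough sketch of how a module-level @internal annotation could look. The module and function below are made up for illustration, not actual CKEditor 5 code:

```javascript
/**
 * A hypothetical feature-internal module. The module-level @internal tag
 * would mean: this is a real, designed API, but breaking changes here are
 * marked as "Internal" and do not trigger a major release.
 *
 * @internal
 * @module image/imageconverters
 */

/**
 * Converts a view image element to its model representation.
 * Reusable if you really need it, but expect it to change.
 */
function viewImageToModel( viewElement ) {
	return {
		name: 'image',
		src: viewElement.getAttribute( 'src' )
	};
}
```

Documentation tooling could then hide @internal modules from the main API reference while still generating docs for them on demand, so developers who knowingly reach for these modules are not left without documentation.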

scofalik commented Apr 25, 2017

First of all, I have a feeling that this post is a little bit premature and, simply, time will tell. We can discuss, throw ideas around, etc., but I have a feeling that we won't change anything in our work process for now.

I agree that we won't be able to stabilise the API before the first release, or even several iterations later. I also think that trying to keep everything backward compatible forever is a bad idea. Versioning tools exist, so people not interested in following every change can update their code every now and then. Still, we should not introduce breaking changes to the API every three weeks.

Anyway, I see two kinds of breaking changes:

  • API changes, which more-or-less can be summed up as: class/method/function changed how it works/what kind of parameters it accepts,
  • architectural changes (for example: priorities for events).

When you talk about breaking changes to the API, you really should think about where the "sticking" points between third-party features and our code are. I don't think there are that many, honestly. I'd be much more worried about architectural changes that change how the whole editor works.

Changes in the "API group" are less harmful than those in the second one. When an API changes, we can deprecate some methods/parameters or – if a method changed a lot – keep an old version of the method too (to be removed in the same way as other deprecated things). People can get around those changes pretty easily, if we enable them to do so.

I don't think we need @internal – it would be too similar to @private and @protected. We already agreed that @protected and @private stuff is something that the user shouldn't ever touch. We took some care to decide which methods should be public and which not, and I think that ATM we are fine in this regard. We strive to have a public API sufficient to do almost anything that we want the user to do (and we really give a lot of flexibility to users). There is no place in the editor where you need to use a @protected method. If there is, we create public API for that use case.

Still, as with @internal, people can use @protected and @private methods – they just shouldn't be surprised if they change. I think that an additional tag will make things even more confusing and will make us take more time when writing code. @apiStability is even worse.

If anything, I am for deprecating parts of the code and keeping the old version around for some time – but not forever, we don't want junk. And, of course, for trying to group issues/releases in a way that produces a minimal amount of breaking changes.


pjasiun commented Apr 25, 2017

We can use the idea from Drupal to mark APIs with @internal

We may consider using @protected. We use it to mark a method which should be used only in the same module and closely connected classes, which is pretty much this case.

We could define additional flag like @apiStability

It might be hard to estimate when we will need to change a given piece of the API. On the other hand, we could introduce a tool for the docs which tells when the last change in an API happened. We could add information about when the last backward compatible and backward incompatible changes were made, and even automatically mark parts of the code which did not change for the last year as stable. At the same time, we could mark parts of the API which we know we want to change as @unstable (we know what we want to change, but we are never sure what we don't need to change).

Also, for sure, at some point we will mark some parts of the API as @deprecated. Instead of keeping them forever, like we did in CKE4, we could try to do what Google Apps Script does. I like the idea of a "sunset schedule" very much: https://developers.google.com/apps-script/sunset


Reinmar commented May 5, 2017

Regarding @private and @protected – they are not the same thing. These are a class's internals – implementation-dependent, potentially harmful when used directly. And most importantly, they are defined on the class/function/property/method level (most of the time). @internal would be different. It'd be used on a module level to indicate the focus of this API. It's still a real API, designed like public API from the package perspective, but there's no focus on it being used by others.

For me, there's a clear distinction between API visibility levels and... well, I can't find a good name for that :) Maybe this is why I didn't sell the idea.

Anyway, I agree we can get back to this later when we understand the topic better.


pjasiun commented May 9, 2017

I just realized something: can't we mark constructors of internal classes as protected, so they will not be shown in the public docs?


pjasiun commented May 9, 2017

I mean, I'm fine with @internal for modules, but when it comes to classes, in many cases the constructor should also not be used publicly – e.g. when the object should be created by a factory.
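A minimal sketch of the pattern being discussed (the class names here are illustrative, not a reference to actual CKEditor 5 API): the constructor is tagged @protected so documentation tooling can hide it, and consumers are pointed at a factory method instead.

```javascript
class Batch {
	/**
	 * @protected – do not call directly; use {@link Document#batch} instead.
	 */
	constructor( document ) {
		this.document = document;
		this.deltas = [];
	}
}

class Document {
	/**
	 * The public, documented way to obtain a Batch instance.
	 *
	 * @returns {Batch}
	 */
	batch() {
		return new Batch( this );
	}
}
```

Note that `new Batch( doc )` still works at runtime – JSDoc visibility is a documentation convention, not enforcement – which matches the "you can use it, just don't be surprised if it changes" policy discussed earlier in this thread.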


Reinmar commented May 9, 2017

Yes, this is possible. We may be even using it in some places.


Reinmar commented May 9, 2017

We talked with @fredck about this topic a bit and we focused on versioning first: how to version the software to best indicate what changes were made without blocking us from doing any changes, and how to indicate how safe the new version is for its users.

Versioning

Semantic versioning proposes certain rules for how to bump versions, but those rules are not always adequate (see this and this) and are useless for applications.

Semantic versioning is all about API. First, the API needs to be clearly defined. Then, it must be clear what its expected behaviour is – otherwise, how could you decide whether a behaviour change is a bug fix or not? There are obvious cases, but more often than not, in multi-layer libraries a change in one place may be a bug fix there, but a breaking change somewhere else.

In other words, semantic versioning fails when there's not enough certainty or when we're not talking about the API. I really recommend reading Jeremy Ashkenas's post about "semantic pedantic" if you haven't read it yet.

We are not going to be pedantic. We are going to mix our gut feeling into our choices. And, since we have at least 3 levels of "software", we're going to treat each level in a slightly different manner.

Source packages

By source packages I mean the editing framework, feature packages, and editors. These are the libraries which other developers will use to create new features and build new editing solutions. This is where the API matters the most.

Hence, for source packages we'll follow semantic versioning, but not pedantically. First of all, as I mentioned in the first post, not all APIs inside every package are to be considered "public". As we agreed, it's not time yet to strictly define what's internal and here we'll use common sense.

This means that e.g. changes in the engine will be treated more strictly when describing whether they are breaking changes than changes in feature packages. Feature packages are more about behaviour to the end user than about API, so only significant changes in their APIs (e.g. configuration option changes, plugins being removed or renamed, etc.) will be marked as breaking changes.

This loosened semantic versioning (or, romantic-semantic versioning :D) will also allow us to refactor the code more freely and frequently and unblock adoption of these changes (which would be blocked by developers relying on ^x.y.z ranges).
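For illustration, this is roughly the promise a ^x.y.z range makes – a hand-rolled sketch, not the real semver implementation: consumers automatically pick up any version that does not change the left-most non-zero component, which is exactly why a strictly-semver "breaking" label blocks adoption until developers update their ranges by hand.

```javascript
// Rough sketch of npm caret-range semantics: ^x.y.z accepts versions that
// keep the left-most non-zero component intact and do not go backwards.
function satisfiesCaret( range, version ) {
	const base = range.slice( 1 ).split( '.' ).map( Number ); // strip '^'
	const ver = version.split( '.' ).map( Number );

	// Find the left-most non-zero component of the base version.
	const lockIndex = base.findIndex( n => n !== 0 );
	const lock = lockIndex === -1 ? base.length - 1 : lockIndex;

	// Components up to and including the locked one must match exactly...
	for ( let i = 0; i <= lock; i++ ) {
		if ( ver[ i ] !== base[ i ] ) {
			return false;
		}
	}

	// ...and the remaining components must not go backwards.
	for ( let i = lock + 1; i < base.length; i++ ) {
		if ( ver[ i ] > base[ i ] ) {
			return true;
		}

		if ( ver[ i ] < base[ i ] ) {
			return false;
		}
	}

	return true;
}
```

So under ^1.2.3 a "romantic-semantic" 1.3.0 refactor flows to users automatically, while a 2.0.0 release does not – which is the lever being described above.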

Finally, this is where changelogs are critically important. Whenever we describe some change – even if we don't mark it as breaking but we feel it's important – we should clearly explain how the behaviour changed and what to do about it. The better we do the changelogs, the more forgiving the developers will be (hopefully :D).

Builds

EDIT: After publishing the first builds, we decided to change their versioning to match the CKEditor 5 project (see the next section).

What is the public interface of builds? The editor class's API, configuration options and the behaviour of the included features. This is a mix of being an application and being a library, hence mixed rules will apply. Also, most changes in builds will come from their dependencies.

Therefore, if there were no changes in the build itself, a change in any of the build's dependency versions will bump the patch number of the build itself (note that we apply the same rule to source packages). This is how e.g. Lerna works and it's the only reasonable automatic algorithm that came to my mind too. So, a bug fix in a dependency may be a bug fix in the dependent package (hence, a patch bump). A new feature in a dependency is not a new feature in the dependent package if the dependent package didn't change itself (hence, still a patch bump). The same goes for breaking changes.

That was the first, automatic step. Then, we need to analyse the changes in dependencies and decide whether they were significant enough to constitute a minor or major change in the build. This can only be done manually, and this is also why changelogs are so important.
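The two steps above can be sketched as follows (assumed logic modelled on the description, not an actual CKEditor 5 release tool):

```javascript
// Computes the version bump for a build that had no changes of its own.
function buildBump( dependencyChanged, manualDecision ) {
	// Step 1 (automatic): any dependency release – fix, feature or breaking
	// change alike – results only in a patch bump of the build.
	let bump = dependencyChanged ? 'patch' : null;

	// Step 2 (manual): after reviewing the dependencies' changelogs, the
	// release may be promoted to 'minor' or 'major' if the changes were
	// significant enough from the build user's perspective.
	if ( bump && manualDecision ) {
		bump = manualDecision;
	}

	return bump;
}
```

The key design choice is that the automatic part never produces anything bigger than a patch – escalation is always a human judgement based on the changelog.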

CKEditor 5 project

Here the situation is different – CKEditor 5 as a whole is neither a library nor even an app (it's multiple apps from which you choose what you want) but a whole ecosystem. We also need to remember that the project version will be used mostly to market the project.

So, we'll start with CKEditor 5 v1.0.0. If an iteration brings bug fixes, we'll have v1.0.1; if there are new features, it will be v1.1.0; and if important breaking changes are made, it will be v2.0.0.

The version of the project will be reflected by the https://github.com/ckeditor/ckeditor5 package (it's not clear yet whether it will be published to npm, but it has package.json with a version number and a changelog). A release of "CKEditor 5 project" is just a release of every official package. Check out the changelog to see how it works.


pjasiun commented May 22, 2017

New feature in a dependency is not a new feature in the dependent package if the dependent package didn't change itself (hence, still a patch bump).

Is it true only for builds, or for source packages too?

I agree that a build is more about the editor's API and how it works, and that even some internal backward incompatible change in the engine may not be important for someone who uses the editor as a build.

However, in the F2F talk you suggested that we should use this policy also for the framework, and I don't agree. Imagine that I am working on my own build using the CKEditor framework. I imported the engine, image and link plugins and I am creating my own editor (this is what the framework is for, isn't it?). I set my dependencies to update only patches automatically because I don't want any surprises. Now, the engine gets some breaking changes, and image and link only update the engine version, with no other changes. If they change only the patch version, my project will stop working: image and link will use a different version of the engine than my project uses, and the editor cannot be built in such a case. It means that a patch update caused breaking changes. Since this is what the framework is for – creating your own editors based on the engine and plugins – we should prevent such situations.

This is why I think that if a commit updates the major version of the engine in the link plugin, it should be a major change of the link plugin, and if it updates the minor version of the engine, it should be a minor change of the link plugin. At the same time, if the change does not cause a major/minor change in the editor's API, it can be a patch in the build, so devs who are only interested in the editor as a whole get updates automatically.
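To make the scenario concrete, here is a hypothetical illustration (the package names are real, but all version numbers are invented for the example):

```json
{
  "name": "my-project",
  "dependencies": {
    "@ckeditor/ckeditor5-engine": "~1.0.0",
    "@ckeditor/ckeditor5-image": "~1.0.0",
    "@ckeditor/ckeditor5-link": "~1.0.0"
  }
}
```

If @ckeditor/ckeditor5-image v1.0.1 were released as a patch but internally required engine ^2.0.0, npm would install the engine twice – v1.x for the project and v2.x for the image plugin – and the editor could not be built from two engine copies, even though the developer only accepted patch updates.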


Reinmar commented May 22, 2017

My project will not work anymore because image and link use the different version of the engine than my project use, and editor can not be built in such case. It means that patch update causes breaking changes.

You described here one of the major issues of Node (or npm or Webpack – whichever you don't like more at the moment) and semver's broken promise. This is nothing new. How many times did you have to remove entire node_modules? And how many times did things suddenly stop working?

But this is how things are and we have two options:

  • we can try to discover some new brilliant way of versioning a set of packages,
  • or we can try to deal with situations which you described.

The former option may initially sound good, but guess why things are the way they are? Because some other brilliant developer, and thousands after them, didn't find a way around it.

For instance, you point out that a minor change in the engine may cause some other project to stop working because there will be a conflict of dependency versions and the engine will get installed twice. Guess what – the same problem may appear when we release a patch version, if someone hardcoded their dependency on the engine. Or if they called npm i twice in a row. Or in various other ways. And this is not CKEditor's issue.

Even changing the release scheme to bump the major version of a dependent plugin when its dependency has a major release (which you proposed in the F2F talk) won't prevent that, because this is just how the platform works.

Instead, we choose to follow the common practices in the most reasonable way for us. This means that the platform, practices and problems will be familiar to any developer with experience in Node.js. This will also make the best (i.e. most meaningful) use of changelogs and version bumps.

Also, we can help developers deal with the common issues. For instance, I'd like to add a check for duplicated packages to our Webpack plugin. This should significantly help in tracking and solving the issue that you mentioned.
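Such a check might work along these lines – a sketch of assumed logic, not the actual CKEditor 5 Webpack plugin. Given the resolved paths of all modules that end up in a bundle, it reports any package that got installed in more than one location:

```javascript
// Reports packages that resolve from more than one node_modules location,
// i.e. would be bundled (and instantiated) twice.
function findDuplicatedPackages( modulePaths ) {
	const locations = new Map(); // package name → Set of distinct install paths

	for ( const modulePath of modulePaths ) {
		// Take everything up to the LAST node_modules segment, plus the
		// package name (handling scoped @scope/name packages).
		const match = modulePath.match( /^(.*node_modules[\/\\])((?:@[^\/\\]+[\/\\])?[^\/\\]+)/ );

		if ( !match ) {
			continue;
		}

		const [ , root, name ] = match;

		if ( !locations.has( name ) ) {
			locations.set( name, new Set() );
		}

		locations.get( name ).add( root + name );
	}

	// A package is duplicated if it resolves from more than one location.
	return [ ...locations ]
		.filter( ( [ , paths ] ) => paths.size > 1 )
		.map( ( [ name ] ) => name );
}
```

A Webpack plugin could collect `modulePaths` from the compilation's module list and emit a warning per duplicated package, pointing the developer straight at the version conflict described above.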


pjasiun commented May 22, 2017

This is nothing new. How many times did you have to remove entire node_modules?

The problem is that in this scenario removing the entire node_modules and reinstalling all dependencies will not help. Your engine will still be in a different version than the engine in the link plugin. Even worse: everything will be fine as long as you do not reinstall the project.

For instance, you exemplify that a minor change in the engine may cause some other project to stop working because there will be a conflict of dependency versions and the engine will get installed twice.

The problem is that a developer who wanted to get only patch updates should not get a major engine update, even if it's not a direct dependency.

Guess what – the same problem may appear when you released a patch version if someone hardcoded his dependency on the engine.

Most probably, if someone hardcoded the engine version, they also hardcoded the plugin versions and everything will be fine. If they accept minor or major changes in all of them, it should be fine too. But I agree that they will get a problem if they have different update rules for different plugins – though that seems to be asking for trouble.


Reinmar commented May 22, 2017

I understand your concerns and I know this will be a problem. But you are ignoring the argument that it's not us who introduced this issue – it's the platform – and I don't see how we could fight the platform without introducing different kinds of issues.


Reinmar commented Sep 14, 2017

In #548 we decided to change the versioning scheme of the builds so I edited my post.
