Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v3] Roadmap #265

Closed
11 of 12 tasks
theofidry opened this issue Sep 14, 2015 · 33 comments
Closed
11 of 12 tasks

[v3] Roadmap #265

theofidry opened this issue Sep 14, 2015 · 33 comments
Labels
Milestone

Comments

@theofidry
Copy link
Member

theofidry commented Sep 14, 2015

EDIT: update this part of the message to avoid to have to crawl the whole thread to get a review of v3.

Main changes with v3:

  • Drop the persistence layer
  • Make the library more resilient
  • Make the library more extendable
  • Make the library more performant
  • Add a few new features & bugfixes

Global overview of the library:

screen shot 2016-06-04 at 00 08 25

Main interfaces (#381):

LoaderInterface::load(string $file, array $parameters = [], array $objects = []): ObjectSet;

FixtureBuilderInterface::build(string $file, array $parameters = [], array $objects = []): FixtureSet;
GeneratorInterface::generate(FixtureSet $fixtureSet): ObjectSet;

FixtureBuilder secondary interfaces (#382):

Generator secondary interfaces (#383):

–––

(original message)

What's your plans @tshelburne? I would be more than happy to work on it :)

@tshelburne
Copy link
Collaborator

Haven't had plans right now, since 2.x is still relatively fresh. I think it would take, at minimum, a discussion of what we want to accomplish and how things are failing right now.

From my perspective, the biggest struggle we have right now is that the "lifecycle" of moving from files -> fixtures -> instances is not particularly flexible, and all the state getting passed around and updated all over the place makes it hard to reason about making changes. I think a more well-defined interface for that flow would let us do things like the relationship hydration step more easily.

I'm going to be mostly just reviewing PRs for a while - I have some client work that's pressing right now. The best start right now would probably be to start drafting a proposed changeset and some ideas behind desired features. It would be nice to just move forward with some of the issues in a 3.x version, rather than trying to forcibly patch them into 2.x. I don't think we'll need to have too many BC problems for a 3.x release anyway, but that's yet to be seen.

@theofidry
Copy link
Member Author

From my perspective, the biggest struggle we have right now is that the "lifecycle" of moving from files -> fixtures -> instances is not particularly flexible, and all the state getting passed around and updated all over the place makes it hard to reason about making changes. I think a more well-defined interface for that flow would let us do things like the relationship hydration step more easily.

Agreed. My feeling is that your PR improved many things, but there is some parts that are very hard to follow or not flexible at all, hard to debug and to maintain. And this is partly due to unclear responsibilities at some times (like the persistence layer: this should not be part of this library).

I would love to help we it but truth is, even if I have some ideas I still have only a very rough understanding of the "lifecycle". Would love to have something more clear on this part and then be able to argue about some part of the design. IMO it would be better to do that first before deciding of what should remain in the 2.x and what should be put in the 3.x.

@tshelburne
Copy link
Collaborator

Yea, my big PR was a quick and dirty approach to get things to a more configurable place so that I could use the library in a project I was working on. It was sufficient for that project, but ultimately not especially flexible / easy to use.

Sounds good - I'm going to try and set aside some time in the next couple weeks to do a rough sketch of what I have in mind.

@theofidry
Copy link
Member Author

👍

@theofidry theofidry changed the title v3.x v3 Oct 20, 2015
@theofidry
Copy link
Member Author

Any update @tshelburne?

@tshelburne
Copy link
Collaborator

Work has been intense for a bit - hopefully things will settle enough for me to block out some time to handle it.

If you have the bandwidth to get even just a spec template in place, that could be helpful. Otherwise, I'll get to it ASAP and keep you updated on progress. Sound good?

@theofidry
Copy link
Member Author

Not much work is pretty intensive right now for me too^^ But you can mail me at any time for reviewing a document or something.

@theofidry
Copy link
Member Author

@tshelburne any update on that? Otherwise having a v3 with some cleanup and without the persistence layer would be great already.

@theofidry
Copy link
Member Author

ping @tshelburne. Note that if you want some help, I'll probably manage to find some times in a couple of weeks (so not right now). But still available to at least discuss on the general idea and also have some feedback from using it with HautelookAliceBundle.

cc @soullivaneuh

@tshelburne
Copy link
Collaborator

@theofidry I think cleanup and removing persistence would be good enough - it's not looking like my schedule is going to clear substantially, but instead the opposite. I think it would be wiser to just look at making some incremental progress at this point, rather than making another huge step. That way we can at least show some consistent improvement.

I do still think decoupling persistence from object generation is important - half of the problems I see coming through the issues list is about trying to get more automatic solutions around Doctrine, and that doesn't feel like a healthy step forward to me. I wouldn't be opposed to removing the notion of persistence entirely, and just returning "lists" (could be any data structure, IMO) of generated instances. The important part is to try to separate the Doctrine-specific influences so that Alice can be a more general solution, and the bundle can take care of writing the Doctrine "bindings".

@theofidry
Copy link
Member Author

Ok great. How would you like to proceed? Creating a 2.x branch and pushing v3 changes on master and create a v3 milestone?

@tshelburne
Copy link
Collaborator

I'd like to have things spec'd pretty clearly first in order to build the milestone out of tickets - if you have the time right now, an initial breakdown of updates and deprecations you have in mind would be useful, and then I can review + add to them more quickly. If you list those here, we can consider this the source of truth for now, and then break them out into tasks once things feel solid.

Thoughts?

@theofidry
Copy link
Member Author

Fine by me, gonna do that in the evening after cooling down a bit :)

@theofidry
Copy link
Member Author

Ok so here's a breakdown of the current library: https://gist.github.com/theofidry/2b1de50bb17fc06cb502

I am really not familiar with this kind of usage of PHP (crating PHP objects this way), so aside from the way files are loaded, I am not familiar with the internals of the library. As a result there is a few classes I have no idea of what they are doing.

So some feedback of what I missed when I used it:

  • Not enough interfaces
  • Interfaces not always clear (you have no idea of what the class is doing)
  • Usage in static context: not really configurable
  • Lack custom exceptions

So the first thing I would like to do before cleaning up the implementation details is clarifying the interfaces, then take care of the implementations.

@theofidry
Copy link
Member Author

@tshelburne little ping 💃

@theofidry
Copy link
Member Author

I've got some refactoring ideas but need a bit more details on the responsibility of each class @tshelburne :(

Maybe @Seldaek remember some parts too?

@tshelburne
Copy link
Collaborator

@theofidry I have this pulled up in another tab and will try to comment tonight - thanks for this start. I agree with all of your judgments.

@theofidry
Copy link
Member Author

Maybe you got a 'lil time for this one this week? :p

@tshelburne
Copy link
Collaborator

@theofidry Sorry, I never updated this thread, but I have posted comments on the gist.

@theofidry
Copy link
Member Author

Oh damn, never checked

@theofidry
Copy link
Member Author

How do you think we should proceed? Is there anything that should be changed in the current system at the fondamental level or we should simply rework a bit the current one?

If we don't need a big change in the system itself, the I'm thinking of re-working bits by bits (ex first Processor).

@tshelburne
Copy link
Collaborator

I would certainly encourage bit by bit - given that I don't have an active project depending on Alice right now, finding large chunks of time to parse through significant PRs is more challenging. To the extent possible, I think a piecemeal approach would be great.

@theofidry theofidry changed the title v3 v3 Roadmap May 4, 2016
@theofidry theofidry changed the title v3 Roadmap [v3] Roadmap May 4, 2016
@theofidry theofidry added this to the 3.x milestone May 4, 2016
@theofidry
Copy link
Member Author

theofidry commented May 6, 2016

Thoughts of yesterday, I'm putting this as a reminder. Here's how alice works without the persistence layer:

screen shot 2016-05-06 at 11 17 12

It's only the high level stuff, so the names may not of the classes may not match. For each components, here are some things they must do which are a bit tricky to implement

  • Parser: parse a PHP/YAML file to give a PHP array read to be transformed into Fixtures
    • Although loads only one file, this file can have include statements.
    • As loading one file may result in having to load multiple files, the state must be kept between the loading of each file (to not lose references of other files)
  • Builder: transform a PHP array into a collection of Fixture.
    • The tricky part is dealing with dynamic references (ex with lists or ranges) and flags
    • Must be extendable to add custom flags
  • Generator (the most complex bit by far):
    • Right now is composed of instantiators and populators
    • Have to deal with resolving data values which can be:
      • functions for faker providers
      • references to another fixture (which can be ranged as well)
      • parameters, coming from the file or injected (to be able to use framework parameters in bridges for example)
      • dynamic values (can refer to the value of another property)
      • specifying accessors, calls to be made on the instantiated object or instantiation method (factory, constructor, reflection)

Among the list of what each components must deal with, there is currently some features are currently missing in v2 (ex doing multiple calls on an instance cf. #293), but most of the stuff is already implemented.

What to change

Main changes:

  • Removing the persistence layer: it's definitely something interesting, but should be done in another library if needs to be done
  • Make the library more flexible: mostly by introducing interfaces where needed
  • Reduce technical debt: change a few patterns, making classes final when possible, adding unit tests etc.

Generator

Maybe it would be possible to use Symfony Serializer as its primary usage is to transform a data structure into another instead of having this custom combo Instantiator + Populator. But in that case it requires to resolve all the fixtures value before passing the to the Serializer:

screen shot 2016-05-06 at 12 02 05

Cache

Although speed is not a primary concern, I still think we should take it into consideration. I think we could have the following cache layers:

For both the Parser and Builder, calling them multiple times with the same input will always give the same result so it is something that could be cached. One caveat, the user could load 5 different files which all include the same file. Loading a file (at that stage, i.e. to create an array of Fixture) is stateless, meaning that the caching can be done on a file basis. If file1.yml includes file2.yml, It doesn't matter which file is loaded first as no value is being resolved yet. So loading the included files can be cached.

At the Generator level, it's much harder to cache anything. If we take the schema with the Resolver and the Serializer, Resolving values is always dynamic, resolving two times the same Fixture object gives two different result. However in the Serializer it is a different story, and hopefully the Serializer has natively some cache, we would only need to make sure things works well on that side.

To sum up, here's how it would look like in v3:

screen shot 2016-05-06 at 14 02 10

Upgrade path

For the end user, there is unlikely to be big changes. There may be a few changes like #337, but in that case can add deprecation notices in v2 to help the migration.

For developers, that's another story. I sadly think there is not way to have an easy upgrade there. Even with HautelookAliceBundle which has tried to not hack too much alice to work well in Symfony, I see a lot of BC beak changes coming. So IMO what we should do is having a good changelog explaining the changes and be ready to help the biggest library in upgrading.

While doing #326, I think there is a lot of things that can be ported to v2 without much extra work. As long as it doesn't require too much energy, I'm willing to do it. That being said, I prefer to keep the focus on v3, changes ported to v2 would be mostly bugfixes and deprecation notices.

Work on v3

@tshelburne I would like to hear how you would want to proceed. In one hand, I would like to avoid waiting 2-3w for each review, on the other hand I don't want to merge things without anyone's opinion at all. Do you have any idea of comprise we could find?

WDYT @tshelburne @soullivaneuh @vincentchalamon?

@tshelburne
Copy link
Collaborator

@theofidry

On a high level, I think it's great you are taking initiative and running with the improvements - as has been the case for a while now, I am too slammed with work to dedicate the requisite time to review. I'd like to be pinged for decisions being made, but I think you're doing a great job with upkeep / refactoring / etc. The responsiveness has been badly needed.

Regarding your suggestions above:

  1. What's the benefit of final classes? I've always hated it when library authors provide final classes because it restricts what I can do with their code.
  2. I don't have specific concerns about Serializer - do you think it will cover all bases as well as what we have now? If so, it seems like a good step. I built the existing v2 structures as a response to the code in v1 for specific use-cases, so they are likely more specialized than necessary.
  3. Personally, if performance isn't a problem now, I vote we leave caching out. If we need to layer it in later on, we should build the system in such a way that it's not a chore (I think it would already be easy enough), but caching is always opening up the chance for hard-to-find bugs.
  4. I agree with your steps towards v3 - we've been responding for a long time to devs that we are going to remove support for persistence, and I think we need to just do it. Pushing for v3 now seems like the right choice, so long as people aren't left hanging on v2.

@theofidry
Copy link
Member Author

\o/

What's the benefit of final classes?

Using final class is to prevent inheritance in favour of composition. This is safer as we consumer as inheriting a class is accepting an implicit contract (the child class accept all the responsibilities thrown from the parent) whereas composition is an explicit contract (you are using the methods signature only and not relying on the internal behaviour). In short it helps a lot from a maintainability point of view because as long as you don't change the signature of your public methods, you are not introducing BC breaks.

I've always hated it when library authors provide final classes because it restricts what I can do with their code.

Just putting everything as final would indeed lead to that which is not very nice. Instead we have to also introduce the proper interface. Also it's like everything: I prefer to put things as private first, then if needed and sensible, we can always switch to public or non-final.

I don't have specific concerns about Serializer - do you think it will cover all bases as well as what we have now?

At that point I can't guarantee anything to be honest, but I do hope we could use it to remove some chunks of our code.

Personally, if performance isn't a problem now, I vote we leave caching out.

Actually it kind of is. I've had some feedback from AliceBundle where some people were complaining on performances – nothing dramatic, but if it can be improved it would be nice. As I said, there is two part where I can see a cache layer of some kind:

  • The result* of loading a file; *:before interpreting any value, i.e. just the raw array of Fixture build
  • At the Serializer level: it has some cache already to avoid to re-discover the structure of the object to be instantiated; it's just a matter of making sure it works

but caching is always opening up the chance for hard-to-find bugs.

Agreed, it won't be the priority and should be well tested.

@tshelburne
Copy link
Collaborator

Re: final classes, while I understand your point, I think that's the whole point of private and protected pieces - they represent implementation details that you can't rely on from version to version. However, if you restrict yourself to a set version, I feel like you should be able to customize things in that version as much as desired, even to the point of wrapping the primary interface of the library and extending it. From my perspective, if we change implementation details and it introduces a backwards incompatible change with a developer's child class but not with our public interface, it's not our responsibility to handle that - it's the developer's who extended the library.

All the rest sounds good - if performance is already an issue, let's do it.

@theofidry
Copy link
Member Author

That's a fair point, and the library must still be extensible. That said, small classes don't have much value to be not final: they implement an interface and if you want to extend there behaviour it's better to decorate it.

For more complex objects or some value objects like Fixture it might be better not to make them final.

@tshelburne
Copy link
Collaborator

Agreed 100%.

@theofidry
Copy link
Member Author

For the top level interfaces, here are my suggestions:

  • Parser: ParserInterface::parse(string $file): array; transforms a file in a PHP array, done in [v3] Rework parsers #361
  • Builder: BuilderInterface::build(array $fixtures): UnresolvedFixtureSet where UnresolvedFixtureSet contains an UnresolvedParameterBag and UnresolvedFixturesBag. Those bags are basically just containers to have a safer typehint than array and eventually provide an nice API as well, not sure yet how useful that would be yet.
  • FixtureBuilder: FixtureBuilderInterface::build(string $file): UnresolvedFixtureSet: basically just combines the parser and builder cf. the diagram above. This combinations is useful for caching as up to that point, nothing is dynamic yet, the result is predictable and indempotent.

After that, what's left is:

  • Resolving UnresolvedParameterBag to get a ParameterBag
  • Resolving UnresolvedFixturesBag to get a FixturesBag – requires ParameterBag
  • Once FixturesBag, we "just" have to instantiate and populate the objects (current Instantiator/Populator system or the Serializer)

Based on that, we could suggest the following interfaces:

  • ParameterResolver::resolve(UnresovledParamterBag $parameters): ParameterBag
  • FixtureResolver::resolve(UnresolvedFixtureBag $fixtures, ParameterBag $parameters): FixtureBag
  • SerializerInterface::serialize(FixtureBag $fixtures, ParameterBag $parameters): object[]

Which leads to a loader with the interface LoaderInterface::load(string $file): object[]

The tricky part now is injecting external parameters and objects. We have different ways to go:

Solution °1: Have a stateless approach

If you want to inject additional parameters/objects for the loading, you add them in the loader signature:

LoaderInterface::load(string $file, array $parameters = [], array $objects = []): object[]

This looks rather clean, but raises a few questions:

  • does object[] contains the injected objects as well? (IMO would make sense)
  • If there is parameters loaded in the file as well, those parameters are lost and the user has no means to retrieve them. This can't be solved, unless we return a ObjectSet object instead, which would contains parameters (resolved + injected?) and objects
  • when injected parameters conflict with the loaded one, then what? (IMO the injected ones should override the loaded ones)
  • when injected objects (which must have a reference) conflict with loaded fixtures, then what? (IMO the injected ones should be overridden)

Solution °2: Have a stateful approach

the loader keep the origin signature:

LoaderInterface::load(string $file): object[]

And to add parameters and objects, we have accessors (getters/adders/setters). load() would return all the loaded objects and include the ones presents in the Loader.

There is other variants, for now I prefer °1 but I would like to hear your thoughts on this @tshelburne.

@theofidry
Copy link
Member Author

@tshelburne quick update on the new version!

I finished the biggest part of the work which is the implementation for the core classes, i.e. creating the new interfaces, value objects and the interfaces implementations require to make the library work. It is currently functional to the point of instantiating a set of fixtures (no values resolution or random values generated yet although the system behind it is ready). If you were to take a look right now, you should be able to grasp how the library is working and see most of the classes.

The plan now is:

  • Porting the integration tests of 2.x to v3 to ensure all the current features will be properly ported and that there will be no regressions on that side to be able to release a first beta 3.0.0-beta.0
  • Review all the issues to make sure the features expected in v3 are implemented
  • Update the doc where necessary:
    • doc regarding the persistence layer
    • doc regarding the new features
  • Add a bridge for the different frameworks. By bridge I mean registering the services to the framework DIC, I plan to add it for Symfony only (I don't think I'll have time for doing it for another framework). The library is also usable without any DIC by relying on NativeLoader
  • Review performances: benchmark against v2, do a POC for caching to see if it's worth having a cache layer or not)
  • Add a technical documentation on how the library works (mainly taking the diagram in the recap of this issue and adding some comments left and right, nothing too deep)

In parallel:

  • Continue to prepare a new release for v2 with all the deprecation notices to ease the upgrade to v3
  • Work on HautelookAliceBundle for a Symfony integration and see if everything is ok on alice side (to be done before the stable release to be able to do a change requiring a BC break and to have more feeback)
  • Release a beta on HautelookAliceBundle (will probably be a new major release)

So there is still a lot of work but the hardest and biggest part is behind. I hope to release the 3.0-beta.0 soon (in 1-2 weeks) and then wait a couple of month to finish the work and have some feedback before having the stable release (probably between october-november)

@theofidry
Copy link
Member Author

Now that everything is clear, I think we can close this issue. I still have a couple of things to fix before the first beta, but the next issues should have a dedicated issue rather than continuing on this long story.

@theofidry
Copy link
Member Author

Not sure where's the best place to put that putting it here.

I've run a quick benchmark on alice:

The end result is roughly the same (-0.5% faster in 3.x) which is a very good news as, unlike in v2, there is a big "build the fixtures set" step in 3.x. You can see that some call or optimized, e.g. only 604 calls made on FakerGenerator::format() (used to generate any data via faker) against 680 in 3.x).

Also in 3.x the "build the fixture set" (via the FixtureBuilder) accounts for 24% of the time spend, so as it's cacheable it's a potential 20% (guess it will take some time to put and retrieve from the cache as it's a big chunk) speed win. I can't say I tried to squeeze any performance yet so this comes as a pleasant surprise :)

It's also worth mentioning that we've moved from 534 tests and 1179 assertions in 2.x to 1762 tests and 2991 assertions, with some I hope will be easier to debug.

@tshelburne
Copy link
Collaborator

@theofidry Excellent news!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants