Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove SpawnModules #12783

Closed
rjernst opened this issue Aug 10, 2015 · 4 comments · Fixed by #13034
Closed

Remove SpawnModules #12783

rjernst opened this issue Aug 10, 2015 · 4 comments · Fixed by #13034

Comments

@rjernst
Copy link
Member

rjernst commented Aug 10, 2015

This is a spinoff from #12744. SpawnModules is a way to have modules produce sub modules. However, one way they have been used in the past is broken since we now have plugin classloader isolation. Some extension points used to allow plugging in an entire module "implementation". This is far too crazy for us to maintain in ES. It also leads to having lots of tiny modules that bind just a handful of classes.

We should remove this interface, and flatten the modules we have. All extension points should be class implementation based, by registering custom implementations with the module responsible for binding a given class. This has a bunch of benefits, like:

  • Making it easier to find where to add new classes to bind
  • Making it easier to document and write plugins since there will be less modules to possibly register custom extensions with
  • Reducing spaghetti dependencies because it is clear when we have tons of bound classes (a lot of which can probably be removed when we can clearly see what they are instead of being spread across tons of modules)
@mikemccand
Copy link
Contributor

+1

@s1monw
Copy link
Contributor

s1monw commented Aug 10, 2015

++

@uboness
Copy link
Contributor

uboness commented Aug 10, 2015

personally I find having a hierarchy of modules helpful to keep things clean, more modular and more testable:

  • a module is a unit responsible for X
  • the module itself is a sub-system within a system. Breaking the system to modules is good, and I believe the same holds for modules themselves. Specially when it comes to complex modules.
  • one can argue that this can also be fixed by collapsing a hierarchy to a single module - I believe this will lead to a spaghetti code on the module level. Instead, split responsibilities to higher/lower level functions and work your way down/up.
  • applied properly, sub-modules actually reduce spaghetti dependencies, simply because higher modules are not (and should not) be aware of lower ones. When flattening everything, you risk modules in the system knowing/depending on other modules that they should not even know about. A hierarchy mandates strict narrow dependency rules:
A
|-- B
|    |-- C
|    |-- D
|-- E
|    |-- F
|    |-- G

A should only know about B & E
B and E only know about each other
C and D only know about each other
F and G only know about each other

each module knows the minimum it needs to know in order to function. With a flatten model the dependency rules are quite weak:

A
B
C
D
E
F
G

everyone knows about everyone... no clear dependency rules... everything is allowed.

One thing I like about the hierarchy model is that if you follow the rules it forces you to think. That is, if you see that for some reason dependencies leak out of their boundaries - you probably did something wrong.

We've had plenty of discussion on this topic already in #12744, so not going to repeat everything again. If most agree to remove this.... it's fine... as long as the end result doesn't turn out to be worse than where we started.

@s1monw
Copy link
Contributor

s1monw commented Aug 11, 2015

@uboness I totally understand where you are coming from. The last couple of months elasticsearch has undergone some massive changes since we are basically not able to cope with the amount of classes and abstractions, number of packages etc. It's largely grown over the years and taking several steps back here is a reasonable thing to do. So what does this mean, we mainly folded classes into larger and more general classes, removed packages containing a single class or like a handful, removed wiring services via guice and use good old ctors.
These refactorings are similar to this, in theory the abstractions where ok and the right thing todo if you look at smaller and meant to be extendable projects. In elasticsearch the complexity of the system alone is hard to cope with, we should try to make our interfaces simple. Lemme explain using an example:

We have a class SearchModule it spawns a bunch of modules:

    @Override
    public Iterable<? extends Module> spawnModules() {
        return ImmutableList.of(
                new SearchServiceModule(settings),
                new TransportSearchModule(),
                new HighlightModule(),
                new SuggestModule(),
                new FunctionScoreModule(),
                new AggregationModule(),
                new FetchSubPhaseModule());
    }

almost all of them follow a simple pattern:

public class FooModule extends AbstractModule {

  public void registerFoo(Foo foo) { ... }

  @Override
  protected void configure() { ... }
}

I think stuff would be way simpler if we would merge them all into SearchModule like this:

public class  SearchModule extends AbstractModule  {

  public void registerHighlighter(Class<? extends Highlighter> clazz) { ... } 
  public void registerSuggester(Class<? extends Suggester> suggester) { ... }
  public void registerFetchSubPhase(Class<? extends FetchSubPhase> subPhase)  { ... }
  public void registerParser(Class<? extends ScoreFunctionParser> parser) { .. }
  public void registerStream(SignificanceHeuristicStreams.Stream stream) { ... } 
  public void registerStream(MovAvgModelStreams.Stream stream) { ... }

}

This is simple to use, clear and reduces bloat. Abstractions are added via the individual classes you can register that is a clear interface. The fact that SearchModule spawns TransportSearchModule to spawn TransportAggregationModule to spawn [TransportSignificantTermsHeuristicModule, TransportMovAvgModelModule] is just confusing and not needed. There is no gain here only complexity. I am 100% convinced we should discourage and prevent this overdesign pattern.

@s1monw s1monw added the blocker label Aug 11, 2015
@clintongormley clintongormley added >breaking :Core/Infra/Plugins Plugin API and infrastructure labels Aug 11, 2015
s1monw added a commit to s1monw/elasticsearch that referenced this issue Aug 12, 2015
@clintongormley clintongormley mentioned this issue Aug 13, 2015
14 tasks
rjernst added a commit to rjernst/elasticsearch that referenced this issue Aug 16, 2015
The ClusterModule contained a couple submodules. This moves the
functionality from those modules into ClusterModule. Two of those
had to do with DynamicSettings. This change also cleans up
how DynamicSettings are built, and enforces they are added, with
validators, in ClusterModule.

See elastic#12783.
rjernst added a commit to rjernst/elasticsearch that referenced this issue Aug 17, 2015
The IndicesModule was made up of two submodules, one which
handled registering queries, and the other for registering
hunspell dictionaries. This change moves those into
IndicesModule. It also adds a new extension point type,
InstanceMap. This is simply a Map<K,V>, where K and V are
actual objects, not classes like most other extension points.
I also added a test method to help testing instance map extensions.
This was particularly painful because of how guice binds the key
and value as separate bindings, and then reconstitutes them
into a Map at injection time. In order to gain access to the
object which links the key and value, I had to tweak our
guice copy to not use an anonymous inner class for the Provider.

Note that I also renamed the existing extension point types, since
they were very redundant. For example, ExtensionPoint.MapExtensionPoint
is now ExtensionPoint.ClassMap.

See elastic#12783.
rjernst added a commit to rjernst/elasticsearch that referenced this issue Aug 17, 2015
Custom repository types are registered through the RepositoriesModule.
Later, when a specific repository type is used, the RespositoryModule
is installed, which in turn would spawn the module that was
registered for that repository type. However, a module is not needed
here. Each repository type has two associated classes, a Repository and
an IndexShardRepository.

This change makes the registration method for custom repository
types take both of these classes, instead of a module.

See elastic#12783.
rjernst added a commit that referenced this issue Aug 19, 2015
The IndicesModule was made up of two submodules, one which
handled registering queries, and the other for registering
hunspell dictionaries. This change moves those into
IndicesModule. It also adds a new extension point type,
InstanceMap. This is simply a Map<K,V>, where K and V are
actual objects, not classes like most other extension points.
I also added a test method to help testing instance map extensions.
This was particularly painful because of how guice binds the key
and value as separate bindings, and then reconstitutes them
into a Map at injection time. In order to gain access to the
object which links the key and value, I had to tweak our
guice copy to not use an anonymous inner class for the Provider.

Note that I also renamed the existing extension point types, since
they were very redundant. For example, ExtensionPoint.MapExtensionPoint
is now ExtensionPoint.ClassMap.

See #12783.

Backport of #12921
rjernst added a commit that referenced this issue Aug 19, 2015
Custom repository types are registered through the RepositoriesModule.
Later, when a specific repository type is used, the RespositoryModule
is installed, which in turn would spawn the module that was
registered for that repository type. However, a module is not needed
here. Each repository type has two associated classes, a Repository and
an IndexShardRepository.

This change makes the registration method for custom repository
types take both of these classes, instead of a module.

See #12783.

Backport of #12948.
rjernst added a commit to rjernst/elasticsearch that referenced this issue Aug 21, 2015
The last use case of spawn modules was for plugins. This change handles
plugin modules directly, and removes SpawnModules.

closes elastic#12783
rjernst added a commit to rjernst/elasticsearch that referenced this issue Aug 23, 2015
This change makes modules added by plugins come before others, as it was
before elastic#12783. The order of configuration, and thereby binding, happens
in the order modules are received, and without this change, some plugins
can get *insane* guice errors (500mb stack trace).
adrpar pushed a commit to crate/crate that referenced this issue Jan 7, 2016
…t get the compiler to stop complaining)

More information and clues here:
elastic/elasticsearch#12783
adrpar pushed a commit to crate/crate that referenced this issue Jan 7, 2016
…t get the compiler to stop complaining)

More information and clues here:
elastic/elasticsearch#12783
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants