Remove SpawnModules #12783

rjernst · 2015-08-10T21:07:18Z

This is a spinoff from #12744. SpawnModules is a way to have modules produce sub modules. However, one way they have been used in the past is broken since we now have plugin classloader isolation. Some extension points used to allow plugging in an entire module "implementation". This is far too crazy for us to maintain in ES. It also leads to having lots of tiny modules that bind just a handful of classes.

We should remove this interface, and flatten the modules we have. All extension points should be class implementation based, by registering custom implementations with the module responsible for binding a given class. This has a bunch of benefits, like:

Making it easier to find where to add new classes to bind
Making it easier to document and write plugins since there will be less modules to possibly register custom extensions with
Reducing spaghetti dependencies because it is clear when we have tons of bound classes (a lot of which can probably be removed when we can clearly see what they are instead of being spread across tons of modules)

mikemccand · 2015-08-10T21:15:48Z

+1

s1monw · 2015-08-10T21:18:46Z

++

uboness · 2015-08-10T22:59:30Z

personally I find having a hierarchy of modules helpful to keep things clean, more modular and more testable:

a module is a unit responsible for X
the module itself is a sub-system within a system. Breaking the system to modules is good, and I believe the same holds for modules themselves. Specially when it comes to complex modules.
one can argue that this can also be fixed by collapsing a hierarchy to a single module - I believe this will lead to a spaghetti code on the module level. Instead, split responsibilities to higher/lower level functions and work your way down/up.
applied properly, sub-modules actually reduce spaghetti dependencies, simply because higher modules are not (and should not) be aware of lower ones. When flattening everything, you risk modules in the system knowing/depending on other modules that they should not even know about. A hierarchy mandates strict narrow dependency rules:

A
|-- B
|    |-- C
|    |-- D
|-- E
|    |-- F
|    |-- G

A should only know about B & E
B and E only know about each other
C and D only know about each other
F and G only know about each other

each module knows the minimum it needs to know in order to function. With a flatten model the dependency rules are quite weak:

A
B
C
D
E
F
G

everyone knows about everyone... no clear dependency rules... everything is allowed.

One thing I like about the hierarchy model is that if you follow the rules it forces you to think. That is, if you see that for some reason dependencies leak out of their boundaries - you probably did something wrong.

We've had plenty of discussion on this topic already in #12744, so not going to repeat everything again. If most agree to remove this.... it's fine... as long as the end result doesn't turn out to be worse than where we started.

s1monw · 2015-08-11T07:46:08Z

@uboness I totally understand where you are coming from. The last couple of months elasticsearch has undergone some massive changes since we are basically not able to cope with the amount of classes and abstractions, number of packages etc. It's largely grown over the years and taking several steps back here is a reasonable thing to do. So what does this mean, we mainly folded classes into larger and more general classes, removed packages containing a single class or like a handful, removed wiring services via guice and use good old ctors.
These refactorings are similar to this, in theory the abstractions where ok and the right thing todo if you look at smaller and meant to be extendable projects. In elasticsearch the complexity of the system alone is hard to cope with, we should try to make our interfaces simple. Lemme explain using an example:

We have a class SearchModule it spawns a bunch of modules:

    @Override
    public Iterable<? extends Module> spawnModules() {
        return ImmutableList.of(
                new SearchServiceModule(settings),
                new TransportSearchModule(),
                new HighlightModule(),
                new SuggestModule(),
                new FunctionScoreModule(),
                new AggregationModule(),
                new FetchSubPhaseModule());
    }

almost all of them follow a simple pattern:

public class FooModule extends AbstractModule {

  public void registerFoo(Foo foo) { ... }

  @Override
  protected void configure() { ... }
}

I think stuff would be way simpler if we would merge them all into SearchModule like this:

public class  SearchModule extends AbstractModule  {

  public void registerHighlighter(Class<? extends Highlighter> clazz) { ... } 
  public void registerSuggester(Class<? extends Suggester> suggester) { ... }
  public void registerFetchSubPhase(Class<? extends FetchSubPhase> subPhase)  { ... }
  public void registerParser(Class<? extends ScoreFunctionParser> parser) { .. }
  public void registerStream(SignificanceHeuristicStreams.Stream stream) { ... } 
  public void registerStream(MovAvgModelStreams.Stream stream) { ... }

}

This is simple to use, clear and reduces bloat. Abstractions are added via the individual classes you can register that is a clear interface. The fact that SearchModule spawns TransportSearchModule to spawn TransportAggregationModule to spawn [TransportSignificantTermsHeuristicModule, TransportMovAvgModelModule] is just confusing and not needed. There is no gain here only complexity. I am 100% convinced we should discourage and prevent this overdesign pattern.

Relates to elastic#12783

The ClusterModule contained a couple submodules. This moves the functionality from those modules into ClusterModule. Two of those had to do with DynamicSettings. This change also cleans up how DynamicSettings are built, and enforces they are added, with validators, in ClusterModule. See elastic#12783.

The IndicesModule was made up of two submodules, one which handled registering queries, and the other for registering hunspell dictionaries. This change moves those into IndicesModule. It also adds a new extension point type, InstanceMap. This is simply a Map<K,V>, where K and V are actual objects, not classes like most other extension points. I also added a test method to help testing instance map extensions. This was particularly painful because of how guice binds the key and value as separate bindings, and then reconstitutes them into a Map at injection time. In order to gain access to the object which links the key and value, I had to tweak our guice copy to not use an anonymous inner class for the Provider. Note that I also renamed the existing extension point types, since they were very redundant. For example, ExtensionPoint.MapExtensionPoint is now ExtensionPoint.ClassMap. See elastic#12783.

Custom repository types are registered through the RepositoriesModule. Later, when a specific repository type is used, the RespositoryModule is installed, which in turn would spawn the module that was registered for that repository type. However, a module is not needed here. Each repository type has two associated classes, a Repository and an IndexShardRepository. This change makes the registration method for custom repository types take both of these classes, instead of a module. See elastic#12783.

The IndicesModule was made up of two submodules, one which handled registering queries, and the other for registering hunspell dictionaries. This change moves those into IndicesModule. It also adds a new extension point type, InstanceMap. This is simply a Map<K,V>, where K and V are actual objects, not classes like most other extension points. I also added a test method to help testing instance map extensions. This was particularly painful because of how guice binds the key and value as separate bindings, and then reconstitutes them into a Map at injection time. In order to gain access to the object which links the key and value, I had to tweak our guice copy to not use an anonymous inner class for the Provider. Note that I also renamed the existing extension point types, since they were very redundant. For example, ExtensionPoint.MapExtensionPoint is now ExtensionPoint.ClassMap. See #12783. Backport of #12921

Custom repository types are registered through the RepositoriesModule. Later, when a specific repository type is used, the RespositoryModule is installed, which in turn would spawn the module that was registered for that repository type. However, a module is not needed here. Each repository type has two associated classes, a Repository and an IndexShardRepository. This change makes the registration method for custom repository types take both of these classes, instead of a module. See #12783. Backport of #12948.

The last use case of spawn modules was for plugins. This change handles plugin modules directly, and removes SpawnModules. closes elastic#12783

This change makes modules added by plugins come before others, as it was before elastic#12783. The order of configuration, and thereby binding, happens in the order modules are received, and without this change, some plugins can get *insane* guice errors (500mb stack trace).

…t get the compiler to stop complaining) More information and clues here: elastic/elasticsearch#12783

rjernst added the :Internal label Aug 10, 2015

rjernst mentioned this issue Aug 10, 2015

Remove Settings.getAsClass #12744

Merged

s1monw added the blocker label Aug 11, 2015

clintongormley added >breaking :Core/Infra/Plugins Plugin API and infrastructure labels Aug 11, 2015

s1monw mentioned this issue Aug 11, 2015

Flatten SearchService and clean up build-in registration #12807

Merged

s1monw added a commit to s1monw/elasticsearch that referenced this issue Aug 12, 2015

Flatten SearchService and clean up build-in registration

6316718

Relates to elastic#12783

clintongormley mentioned this issue Aug 13, 2015

Roadmap for 2.0 #9970

Closed

14 tasks

clintongormley added the v2.0.0-beta2 label Aug 13, 2015

rjernst mentioned this issue Aug 16, 2015

Flatten ClusterModule and add more tests #12916

Merged

clintongormley added v2.0.0 and removed v2.0.0-beta2 labels Aug 16, 2015

rjernst mentioned this issue Aug 17, 2015

Flatten IndicesModule and add tests #12921

Merged

rjernst mentioned this issue Aug 17, 2015

Simplify custom repository type setup #12948

Merged

rjernst added a commit to rjernst/elasticsearch that referenced this issue Aug 21, 2015

Internal: Remove SpawnModules

2a726fe

The last use case of spawn modules was for plugins. This change handles plugin modules directly, and removes SpawnModules. closes elastic#12783

rjernst mentioned this issue Aug 21, 2015

Remove SpawnModules #13034

Merged

rjernst closed this as completed in #13034 Aug 22, 2015

rjernst mentioned this issue Aug 23, 2015

Add plugin modules before (almost all) others #13061

Merged

clintongormley added v2.0.0-beta2 and removed v2.0.0 labels Sep 14, 2015

adrpar pushed a commit to crate/crate that referenced this issue Jan 7, 2016

Initial migration of Plugins (most probably wrong and incomplete, jus…

3816113

…t get the compiler to stop complaining) More information and clues here: elastic/elasticsearch#12783

adrpar pushed a commit to crate/crate that referenced this issue Jan 7, 2016

Initial migration of Plugins (most probably wrong and incomplete, jus…

cbab2e1

…t get the compiler to stop complaining) More information and clues here: elastic/elasticsearch#12783

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove SpawnModules #12783

Remove SpawnModules #12783

rjernst commented Aug 10, 2015

mikemccand commented Aug 10, 2015

s1monw commented Aug 10, 2015

uboness commented Aug 10, 2015

s1monw commented Aug 11, 2015

Remove SpawnModules #12783

Remove SpawnModules #12783

Comments

rjernst commented Aug 10, 2015

mikemccand commented Aug 10, 2015

s1monw commented Aug 10, 2015

uboness commented Aug 10, 2015

s1monw commented Aug 11, 2015