Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only grant connect,accept permissions to transport-netty4 #22116

Closed
4 tasks done
s1monw opened this issue Dec 12, 2016 · 35 comments
Closed
4 tasks done

Only grant connect,accept permissions to transport-netty4 #22116

s1monw opened this issue Dec 12, 2016 · 35 comments
Assignees

Comments

@s1monw
Copy link
Contributor

s1monw commented Dec 12, 2016

Today we still grant a quite scary permission to core:

  // Allow connecting to the internet anywhere
  permission java.net.SocketPermission "*", "accept,connect,resolve";

But since we now have netty4 moved to a module we can potentially move this to into the modules security policy. Yet, there are a bunch of things that needs fixing until we can do that:

  • since we use MockTcpTransport from our test framework this needs to have the same permissions granted. Yet, if we just go ahead and grant accept,connect to the test-framework we might run into trouble since our tests will just inherit that permission ie. if unit and pseudo integ-tests are run since we don't grant this to a codebase. We might want to add some kind of MockSocket project just like SecureMock that we can grant this permission to and where we can depend on for testing.
  • netty-4 still has issues with missing doPrivileged blocks that needs fixing
  • move URLRepository somewhere else since it uses connect and core shouldn't establish any kind of connection. (this can be a second step, we can first start removing accept from the list.
  • some other plugins like ec2 / gce etc. might need extra permission to connect to their endpoints which needs manual testing

here is an example of a missing doPrivileged block ie here:

> Throwable #1: java.security.AccessControlException: access denied ("java.net.SocketPermission" "[fe80:0:0:0:0:0:0:1%1]:52661" "connect,resolve")
   > 	at __randomizedtesting.SeedInfo.seed([8FDA867CA1C20E0D:47AB82401EC3F2C5]:0)
   > 	at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
   > 	at java.security.AccessController.checkPermission(AccessController.java:884)
   > 	at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
   > 	at java.lang.SecurityManager.checkConnect(SecurityManager.java:1051)
   > 	at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:625)
   > 	at io.netty.channel.socket.nio.NioSocketChannel.doConnect(NioSocketChannel.java:331)
   > 	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.connect(AbstractNioChannel.java:254)
   > 	at io.netty.channel.DefaultChannelPipeline$HeadContext.connect(DefaultChannelPipeline.java:1266)
   > 	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:556)
   > 	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:541)
   > 	at io.netty.channel.ChannelOutboundHandlerAdapter.connect(ChannelOutboundHandlerAdapter.java:47)
   > 	at io.netty.channel.AbstractChannelHandlerContext.invokeConnect(AbstractChannelHandlerContext.java:556)
   > 	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:541)
   > 	at io.netty.channel.AbstractChannelHandlerContext.connect(AbstractChannelHandlerContext.java:523)
   > 	at io.netty.channel.DefaultChannelPipeline.connect(DefaultChannelPipeline.java:985)
   > 	at io.netty.channel.AbstractChannel.connect(AbstractChannel.java:255)
   > 	at io.netty.bootstrap.Bootstrap$3.run(Bootstrap.java:252)
   > 	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
   > 	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:418)
   > 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:454)
   > 	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
   > 	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
   > 	at java.lang.Thread.run(Thread.java:745)
@jasontedor
Copy link
Member

jasontedor commented Dec 12, 2016

+1

In general Netty needs some love with respect to handling the security manager correctly so we'll have some work to do here.

@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Dec 18, 2016

I've been able to make the modifications to netty necessary to move the "accept" permission to the transport plugin and get the transport and core tests passing (with creation of a mocksocket jar).

I've had some issues with the azure plugin. That plugin uses com.sun.net.httpserver.HttpsServer for running some tests. it appears to me that HttpsServer is not calling doPriviledge when accepting connections.

if(!ServerImpl.this.terminating) {

    SocketChannel var5 = ServerImpl.this.schan.accept();

    if(ServerConfig.noDelay()) {

        var5.socket().setTcpNoDelay(true);

    }


    if(var5 != null) {

        var5.configureBlocking(false);

        SelectionKey var6 = var5.register(ServerImpl.this.selector, 1);

        var7 = new HttpConnection();

        var7.selectionKey = var6;

        var7.setChannel(var5);

        var6.attach(var7);

        ServerImpl.this.requestStarted(var7);

        ServerImpl.this.allConnections.add(var7);

    }

}

@s1monw
Copy link
Contributor Author

s1monw commented Dec 19, 2016

@tbrooks8 amazing progress, would you mind linking the netty PRs / issues with this issue so we can track them. Also for all core changes can you make sure you open smallish PRs if possible?

I've had some issues with the azure plugin. That plugin uses com.sun.net.httpserver.HttpsServer for running some tests. it appears to me that HttpsServer is not calling doPriviledge when accepting connections.

that is super annoying. I am not 100% sure what we can do about this, I think first of all we gotta fix it in the JDK such that down the road we can remove whatever hack we add here.

@Tim-Brooks
Copy link
Contributor

I push the work that I have completed on the mocksocket here:

https://github.com/elastic/mocksocket

I have opened a Netty PR (netty/netty#6146)

I will start making PRs to ES as I split up the work.

@s1monw
Copy link
Contributor Author

s1monw commented Dec 20, 2016

cool stuff - can you make the https://github.com/elastic/mocksocket repo public I don't think think there is anything private about this! :)

@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Dec 20, 2016

I have opened a PR to integrate mocksocket into ElasticSearch. And replace raw usages of Sockets and ServerSockets in the ES tests.

#22287

@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Dec 21, 2016

Based on the discussion in the status meeting today, I investigated applying permissions based on files opposed to a separate jar.

So I migrated accept out of the core security policy. And placed the following the in test-framework.policy:

grant codeBase "file:/path/to/elastic/elasticsearch/test/framework/build-idea/classes/test/" {
  permission java.net.SocketPermission "*", "accept,resolve,connect";
 };

grant codeBase "file:/path/to/elastic/elasticsearch/test/framework/build-idea/classes/main/" {
   permission java.net.SocketPermission "*", "accept,resolve,connect";
 };

With that change, any tests involving the MockTCPTransport passed. (Although the netty transport tests failed as they were not given permission.)

This approach seems to provide a way forward. Although my approach is certainly naive and hardcoded.

I'm not immediately clear on what approach to take in order to make this solution robust. I see that we currently parse the classpath and build up some system properties. Currently this works pretty well for jars. However, in the case of our code it is constantly rewriting the codebase.main and codebase.test system properties.

So when you run the "core" tests the following codebases are parsed:

file:/path/to/elastic/elasticsearch/core/build-idea/classes/test/
file:/path/to/elastic/elasticsearch/core/build-idea/classes/main/
file:/path/to/elastic/elasticsearch/test/framework/build-idea/classes/main/
file:/path/to/elastic/elasticsearch/client/rest/build-idea/classes/main/

And for discovery-azure-classic plugin tests:

file:/path/to/elastic/elasticsearch/plugins/discovery-azure-classic/build-idea/classes/test/
file:/path/to/elastic/elasticsearch/plugins/discovery-azure-classic/build-idea/classes/main/
file:/path/to/elastic/elasticsearch/plugins/discovery-azure-classic/build-idea/generated-resources/
file:/path/to/elastic/elasticsearch/core/build-idea/classes/main/
file:/path/to/elastic/elasticsearch/test/framework/build-idea/classes/main/
file:/path/to/elastic/elasticsearch/client/rest/build-idea/classes/main/

Our method of parsing the "shortName" means that the property will be codebase.test, codebase.main, or codebase.generated-source. And whichever one is parsed last "wins".

I could work on implementing proper handling of this. Such as parsing these into:

  • codebase.core.main
  • codebase.core.test
  • codebase.test.framework.main
  • codebase.test.framework.test
  • codebase.plugins.discovery-azure-classic.main
  • codebase.plugins.discovery-azure-classic.test

Those system properties could be accessed in security policy files to give certain packages permissions.

Maybe this is not the easiest way forward? It's entirely possible that I missing a better way of doing this.

Also - I'm not sure what implications those changes would have for the distribution (when all the packages / modules are different jars).

@s1monw
Copy link
Contributor Author

s1monw commented Dec 22, 2016

++ this sounds like a plan @tbrooks8

Also - I'm not sure what implications those changes would have for the distribution (when all the packages / modules are different jars).

I think that will be fine since we don't use this in distribution. Our tests will catch if something is wrong since they use the real zip / tar.gz

@Tim-Brooks
Copy link
Contributor

I've spiked on this here.

I've been learning more about the SecurityManager and it seems that creating a new Thread inherits the permissions from where the thread was created. So giving "discovery-azure-classic" SocketPermission and wrapping https start in doPrivileged {} with proper permissions for discovery-azure-classic works.

AccessController.doPrivileged((PrivilegedExceptionAction<Void>) () -> {
            httpsServer.start();
            httpServer.start();
            return null;
        });

combined with:

grant codeBase "${codebase.elasticsearch.plugins.discovery-azure-classic.test}" {
  permission java.net.SocketPermission "*", "accept,connect,resolve";
};

grant codeBase "${codebase.elasticsearch.plugins.discovery-ec2.test}" {
  permission java.net.SocketPermission "*", "accept,connect,resolve";
};

grant codeBase "${codebase.elasticsearch.plugins.discovery-gce.test}" {
  permission java.net.SocketPermission "*", "accept,connect,resolve";
};

allows all of the tests for those packages to pass.

There are a few things about this that are kind of ugly. If you run the tests in an IDE (like IntelliJ) the main classes remain in directories. If you run the tests using Gradle the main classes are placed in an JAR. This means that for test "framework" you must use two permissions for MockTCPTransport to work.

grant codeBase "${codebase.elasticsearch.test.framework.main}" {
  permission java.net.SocketPermission "*", "accept,connect,resolve";
};

grant codeBase "${codebase.framework-6.0.0-alpha1-SNAPSHOT.jar}" {
  permission java.net.SocketPermission "*", "accept,connect,resolve";
};

I think that tomorrow I should explore moving the starting of the HTTPS and HTTP servers to the MockSocket JAR. And if that works then we should still be able to isolate all SocketPermission to that JAR for test packages. And avoid scattering SocketPermission to many different packages (framework main, framework test, ec2 test, azure test, etc).

@Tim-Brooks
Copy link
Contributor

Updates:

  1. I updated Replace Socket, ServerSocket, and HttpServer usages in tests with mocksocket versions #22287 with changes to mocksocket that include wrapping the http/https servers.
  2. I pushed mocksocket to MavenCentral.
  3. I worked on integrating the netty snapshot based on my PR on this branch. With the netty snapshot + mocksocket PR, when I move the accept permissions out of core and to just netty/mocksocket the elasticsearch test suite passes.

@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Dec 29, 2016

I've started exploring what is involved in removing connect and resolve from core.

Here are some preliminary findings:

  1. NetworkUtils makes the following call that requires connect:
  • NetworkInterface.getInetAddresses()
  1. MacAddressProvider makes the following call that requires connect:
  • NetworkInterface.getHardwareAddress()
  1. IfConfig uses these two calls when logging.

So maybe these classes (or some of the code) will need to be moved to a dependency? Or something else?

As Simon pointed out various plugins (ec2, etc) will need permissions granted to a number of their dependencies (EC2Client, etc). I have not figured out complete lists of these yet.

@s1monw
Copy link
Contributor Author

s1monw commented Jan 3, 2017

NetworkInterface.getInetAddresses() & NetworkInterface.getHardwareAddress() supports the getNetworkInformation permission I think we should grant this to get rid of one of the problems.

For IfConfig.logIfNecessary(); I wonder if we can move it into bootstrap instead then we can log it before we initialize the security manager? @jasontedor WDYT?

As Simon pointed out various plugins (ec2, etc) will need permissions granted to a number of their dependencies (EC2Client, etc). I have not figured out complete lists of these yet.

yeah I think all the repositories need resolve an connect granted.

@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Jan 5, 2017

@s1monw I do not know if moving IfConfig.logIfNecessary(); to bootstrap was decided on as the correct option, but I opened a PR with that work.

@jasontedor
Copy link
Member

For IfConfig.logIfNecessary(); I wonder if we can move it into bootstrap instead then we can log it before we initialize the security manager? @jasontedor WDYT?

+1

@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Jan 6, 2017

As a quick update - it does appear that if I grant permissions to a plugin, that does successfully grant permissions to all of the dependency jars. So after I add the doPrivilege blocks the plugins (ec2, gce, etc) should work.

However, I identified a few other issues. In two places in core we currently need connect/resolve. In NetworkService (here) and TcpTransport (here).

Thoughts? Maybe this is similar to the URLRepository issue you originally identified?

Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this issue Jan 6, 2017
This is related to elastic#22116. A logIfNecessary() call makes a call to
NetworkInterface.getInterfaceAddresses() requiring SocketPermission
connect privileges. By moving this to bootstrap the logging call can be
made before installing the SecurityManager.
@s1monw
Copy link
Contributor Author

s1monw commented Jan 6, 2017

However, I identified a few other issues. In two places in core we currently need connect/resolve. In NetworkService (here) and TcpTransport (here).

hmm I am starting to think maybe removing connect, resolve is too ambiguous. I don't think it's worth adding a specific jar we depend on to grant this permission so I wonder if we should stick with accept for now? @jasontedor WDYT?

Tim-Brooks added a commit that referenced this issue Jan 6, 2017
This is related to #22116. A logIfNecessary() call makes a call to
NetworkInterface.getInterfaceAddresses() requiring SocketPermission
connect privileges. By moving this to bootstrap the logging call can be
made before installing the SecurityManager.
@jasontedor
Copy link
Member

However, I identified a few other issues. In two places in core we currently need connect/resolve. In NetworkService (here) and TcpTransport (here).

@tbrooks8 I think your first link here is not right? At least, I don't think what you link to needs permissions.

@Tim-Brooks
Copy link
Contributor

I opened #22534 to wrap the connect operations in plugins.

Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this issue Jan 11, 2017
This is related to elastic#22116. netty channels require socket `connect` and
`accept` privileges. Netty does not currently wrap these operations
with `doPrivileged` blocks. These changes extend the netty channels
and wrap calls to the relevant super methods in doPrivileged blocks.
@Tim-Brooks
Copy link
Contributor

@s1monw
Copy link
Contributor Author

s1monw commented Jan 13, 2017

have investigated URLRepository and it does not appear to me that class makes socket connections (some of its plugin subclasses do, but I'm dealing with that in the plugin changes). Let me know if I am incorrect about that.

this might happen here?

@Tim-Brooks
Copy link
Contributor

Yep. I agree that looks like a connect. Thanks. I must of missed that.

Tim-Brooks added a commit that referenced this issue Jan 13, 2017
This is related to #22116. netty channels require socket `connect` and
`accept` privileges. Netty does not currently wrap these operations
with `doPrivileged` blocks. These changes extend the netty channels
and wrap calls to the relevant super methods in doPrivileged blocks.
Tim-Brooks added a commit that referenced this issue Jan 16, 2017
This is related to #22116. A number of modules (reindex, etc) use the
rest client. The rest client opens connections using the apache http
client. To avoid throwing SecurityException when using the
SecurityManager these operations must be privileged. This is tricky
because connections are opened within the httpclient code on its
reactor thread. The way I confronted this was to wrap the creation
of the client (and creation of reactor thread) in a doPrivileged
block. The new thread inherits the existing security context.
Tim-Brooks added a commit that referenced this issue Jan 18, 2017
This is related to #22116. Certain plugins (discovery-azure-classic, 
discovery-ec2, discovery-gce, repository-azure, repository-gcs, and 
repository-s3) open socket connections. As SocketPermissions are 
transitioned out of core, these plugins will require connect 
permission. This pull request wraps operations that require these 
permissions in doPrivileged blocks.
@Tim-Brooks
Copy link
Contributor

I've looked into what is involved in moving URLRepository to a module. It looks pretty straightforward to me to move just URLRepository.

@s1monw suggested that I move both of the core Repository classes (URLRepository and FsRepository). Unfortunately moving FsRepository looks more complicated. MockRepository extends FsRepository. The following core tests use MockRepository:

  • BlobStoreFormatIT
  • DedicatedClusterSnapshotRestoreIT
  • MinThreadsSnapshotRestoreIT
  • RepositoriesIT
  • RepositoryUpgradabilityIT
  • RestoreBackwardsCompatIT
  • SharedClusterSnapshotRestoreIT

And the functionality seems to depend on FsBlobStore and FsBlobContainer (so the super class cannot simply be removed).

I guess there are a couple of options:

  • Those tests be moved to the new repositories module
  • MockRepository be modified to stand alone without the FSRepository components. (Also maybe MockRepository should be moved to test.framework opposed to core-tests?)
  • Those tests be moved to an external qa package that depends on both core and repositories.
  • Some combination of the above?

Also - should this be completed in multiple PRs? Like an initial PR that moves URLRepository to modules that unblocks SocketPermission work. And then another issue/PR to tackle the moving FSRepository?

@s1monw
Copy link
Contributor Author

s1monw commented Jan 19, 2017

ok I think we start with URLRepository and take it from there? I didn't see how much FSRepo was involved here so sorry for pushing that direction. The cleanups would be nice but lets focus on the connect/accept permissions for now.

Tim-Brooks added a commit that referenced this issue Jan 20, 2017
This is related to #22116. Core no longer needs SocketPermission 
accept. This permission is relegated to the transport-netty4 module 
and (for tests) to the mocksocket jar.
Tim-Brooks added a commit that referenced this issue Jan 25, 2017
This is related to #22116. URLRepository requires SocketPermission
connect. This commit introduces a new module called "repository-url"
where URLRepository will reside. With the new module, permissions can
be removed from core.
Tim-Brooks added a commit to Tim-Brooks/elasticsearch that referenced this issue Jan 26, 2017
This is related to elastic#22116. Core no longer needs `SocketPermission`
`connect`.

This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin

And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Jan 26, 2017

I opened the PR #22797 that should removed connect.

Tim-Brooks added a commit that referenced this issue Jan 27, 2017
)

This is related to #22116. The repository-hdfs plugin opens socket
connections. As SocketPermission is transitioned out of core, hdfs
will require connect permission. This pull request wraps operations
that require this permission in doPrivileged blocks.
Tim-Brooks added a commit that referenced this issue Feb 3, 2017
This is related to #22116. Core no longer needs `SocketPermission`
`connect`.

This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin

And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
Tim-Brooks added a commit that referenced this issue Feb 7, 2017
As part of #22116 we are going to forbid usage of api
java.net.URL#openStream(). However in a number of places across the
we use this method to read files from the local filesystem. This commit
introduces a helper method openFileURLStream(URL url) to read files
from URLs. It does specific validation to only ensure that file:/
urls are read.

Additionlly, this commit removes unneeded method
FileSystemUtil.newBufferedReader(URL, Charset). This method used the
openStream () method which will soon be forbidden. Instead we use the
Files.newBufferedReader(Path, Charset).
Tim-Brooks added a commit that referenced this issue Feb 7, 2017
This is related to #22116. This commit adds calls that require
SocketPermission connect to forbidden APIs.

The following calls are now forbidden:

- java.net.URL#openStream()
- java.net.URLConnection#connect()
- java.net.URLConnection#getInputStream()
- java.net.Socket#connect(java.net.SocketAddress)
- java.net.Socket#connect(java.net.SocketAddress, int)
- java.nio.channels.SocketChannel#open(java.net.SocketAddress)
- java.nio.channels.SocketChannel#connect(java.net.SocketAddress)
@Tim-Brooks
Copy link
Contributor

Tim-Brooks commented Feb 9, 2017

Once #23057 is merged in the last checkbox will be complete.

As an additional note #22964 marked connect apis as forbidden as at attempt to warn developers not to use those apis or to provide appropriate permissions.

@s1monw Once the last PR (#23057) is merged in is this issue complete?

@s1monw
Copy link
Contributor Author

s1monw commented Feb 9, 2017

@tbrooks8 I think we are done here... congrats!

@Tim-Brooks
Copy link
Contributor

Closing as everything has been merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants