Skip to content
This repository has been archived by the owner on May 15, 2019. It is now read-only.

Add bindHost argument #92

Merged
merged 8 commits into from
May 5, 2017
Merged

Add bindHost argument #92

merged 8 commits into from
May 5, 2017

Conversation

kpruden
Copy link
Contributor

@kpruden kpruden commented Apr 28, 2017

Fixes #91

Remove the {grpc,rest,supervisor}Host arguments and replace with
host and bindHost. The host argument is the hostname at which the
GCF emulator can be reached by clients, while the bindHost is the
address to which the emulator will bind its listener(s). This allows
the emulator to be running on a different host from the client, and
supports running the emulator behind a NAT (for example when running
as a docker container).

  • - npm test succeeds (see below)
  • - Code coverage does not decrease (if any source code was changed)
  • - If applicable, PR includes appropriate documentation changes

Some questions/comments for reviewers:

  • As mentioned, I removed the separate arguments for the grpc, rest, and supervisor addresses and replaced them with a single host argument. It's not clear to me why supporting separate values for each of these would be useful, but if there's a use case I'm not aware of I can work them back in.
  • Because of the removal of these arguments, this is change is not technically backwards-compatible. If compatibility is important, I can work in deprecated support for {grpc,rest,supervisor}Host, but since this is still pre-release software this didn't seem that critical.
  • One thing I noticed during development of this feature is the implementation of delete assumes the filesystem location of the zip archive is accessible to the emulator process (it attempts to delete it). This obviously fails when the function was deployed by a client running on a different host, but the only negative effect I can see is a log message (and the file not actually getting deleted). Since the zip archive is located in a temporary folder, this doesn't seem to be a big deal to me, but if this is important I can rework the delete command so the unlink happens in the client process.
  • I'm having difficulty getting all of the tests to pass in my environment, even without my changes. All the failures are in the the rest-sdk/deploy suite, which if I understand correctly is testing pointing the gcloud tool at the emulator. The failures themselves are unreliable - different specific tests fail on each run - which leads me to suspect there's a timing issue or perhaps a subtle compatibility issue with my version of gcloud (I'm running 153.0.0). As such, I can't be 100% my changes don't introduce a regression, but I don't think they do. I can include details of the failures in a subsequent comment if desired.

kpruden and others added 4 commits April 27, 2017 16:43
Remove the `{grpc,rest,supervisor}Host` arguments and replace with
`host` and `bindHost`. The `host` argument is the hostname at which the
GCF emulator can be reached by clients, while the `bindHost` is the
address to which the emulator will bind its listener(s). This allows
the emulator to be running on a different host from the client, and
supports running the emulator behind a NAT (for example when running
as a docker container).
The `service` argument is currently documented in the wiki..
Always use localhost when calling back into the Supervisor.
@jmdobry
Copy link
Contributor

jmdobry commented May 1, 2017

Great work! 👍

It's not clear to me why supporting separate values for each of these would be useful

I think I was imagining a use case where one might want to scale the supervisor portion of the Emulator horizontally in a sort of dev environment setup (think multiple devs talking to the same Emulator instance). Probably just premature optimization.

zip archive

The local archive may or may not go away in the future, so I don't think that issue is a big deal.

tests

Would you mind posting some output on what's failing? In any case, I will clone your PR and run the tests myself. It would probably be good in another PR for me to make the rest-sdk tests optional.

@kpruden
Copy link
Contributor Author

kpruden commented May 1, 2017

Upon further testing, it turns out the emulator does require filesystem access to the zip archive at deployment-time. If it does not, this error appears in the log:

2017-05-01T22:02:01.446Z - debug: createWorker projects/kpruden-gcf/locations/us-central1/functions/hello
2017-05-01T22:02:01.549Z - error: ERROR: projects/kpruden-gcf/locations/us-central1/functions/hello
2017-05-01T22:02:01.550Z - error:  Error: spawn /usr/bin/node ENOENT
    at exports._errnoException (util.js:1022:11)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:193:32)
    at onErrorNT (internal/child_process.js:359:16)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)

I'm able to work around this in my environment using filesystem sharing between my laptop and the VM running docker. I suppose one could work around this limitation by using a GCS bucket for staging as well...

I'm not sure if the local-filesystem requirement is a deal-breaker for this feature, but if you elect to reject this PR for this reason, I understand :)

@kpruden
Copy link
Contributor Author

kpruden commented May 1, 2017

Here's a recent test run:

https://gist.github.com/kpruden/13b3751e2c70c25fad43084e8e89a3b0

The delete failure is consistent - it looks like gcloud for some reason is adding a newline to the output the test is checking against.

The other failure is representative of the failures I'm seeing, but the individual failures are not reliable - different tests fail each time...

@jmdobry
Copy link
Contributor

jmdobry commented May 2, 2017

Pull your branch and try it now.

@jmdobry
Copy link
Contributor

jmdobry commented May 4, 2017

Do the tests pass for you now after my changes were merged into your branch?

@kpruden
Copy link
Contributor Author

kpruden commented May 4, 2017

Yes, all the tests pass now, thanks!

There is some noisy output however: https://gist.github.com/kpruden/e73d8bec52d7c698247b9e285caa1a57

Do you have any thoughts about my previous comment about filesystem access ?

@jmdobry
Copy link
Contributor

jmdobry commented May 4, 2017

Ah I missed that comment, I'll take a look.

@jmdobry
Copy link
Contributor

jmdobry commented May 4, 2017

Regarding the filesystem access issue, can you further describe what broke? Were you unable to the deploy the function? Unable to call the function? Unable to delete the function? etc.

@jmdobry
Copy link
Contributor

jmdobry commented May 4, 2017

Okay, I think I see what the issue is.

  1. Your function source code is on your local filesystem.
  2. The Emulator (and its worker processes) are somewhere else (another machine, inside a docker container, etc.)
  3. You deploy a function using the local file system method (you do not provide --stage-bucket)
  4. A worker attempts to the load the function code from the file system
  5. The worker fails to load the function code, because it's on a different file system

There's simply no way around this, the worker has to be able to load the function source code from somewhere. The production Cloud Functions works because when you deploy you provide --stage-bucket and your function's source code is uploaded to Cloud Storage, where the production service can access it. With the Emulator, the user takes a shortcut and skips --stage-bucket, in which case the Emulator assumes that it should have direct file access to the function's source code.

Luckily, I built --stage-bucket support into the Emulator. However, when using --stage-bucket, deployment behavior is now much closer to deploying a function to the production service: the source code is zipped and uploaded to Cloud Storage, the source code is downloaded by the service, dependencies are installed, and then the function is ready.


Before I posted this comment, I tested the Emulator's --stage-bucket functionality and found a few bugs. I'll fix those, so you can merge into your PR, and then your PR should be good to go.

@jmdobry
Copy link
Contributor

jmdobry commented May 4, 2017

This PR is unblocked, you can rebase against master and resolve conflicts. Or if you want me to do, check the "allow edits from maintainers" box

@kpruden
Copy link
Contributor Author

kpruden commented May 5, 2017

Your explanation makes sense. I'm not using --stage-bucket because reasons, but I've been able to work around that in my environment. When used in combination with --stage-bucket this feature does have general-purpose utility, so that's good :)

I'm merged master locally and I'm resolving conflicts. I'll push an update shortly..

@kpruden
Copy link
Contributor Author

kpruden commented May 5, 2017

I think I may have found another flaky test:

1 failing

  1) system/cli rest deploy should deploy a function that needs "npm install":

      AssertionError:   # test/system/cli/index.test.js:287

  assert(output.includes(`Function helloUuidNpm deployed.`))
         |      |        |
         |      false    "Function helloUuidNpm deployed."
         "Copying file:///var/folders/jy/3khqrg6d3q3fnnchzykgn74c0000gn/T/us-central1-helloUuidNpm-6784jSI2uei0CS6k.zip [Content-Type=application/zip]...\nWaiting for operation to finish...done.\nDeploying function............."

      + expected - actual

      -false
      +true

      at Decorator._callFunc (node_modules/empower-core/lib/decorator.js:110:20)
      at Decorator.concreteAssert (node_modules/empower-core/lib/decorator.js:103:17)
      at decoratedAssert (node_modules/empower-core/lib/decorate.js:49:30)
      at powerAssert (node_modules/empower-core/index.js:63:32)
      at Context.err (test/system/cli/index.test.js:287:11)

A second test run passed, so it may have been a hiccup on my machine 🤷‍♂️

@@ -614,7 +614,7 @@ class Functions {
}

getSupervisorHost () {
return `http://${this.config.supervisorHost}:${this.config.supervisorPort}`;
return `http://localhost:${this.config.supervisorPort}`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why localhost and not ${this.config.host}?

Copy link
Contributor Author

@kpruden kpruden May 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used by code that runs inside the emulator service to talk to itself (lines 392 and 538 - createFunction and deleteFunction). Neither host nor bindHost are guaranteed to be routable from the emulator process. For example, bindHost could be 0.0.0.0 and host could be the IP of the other side of the NAT (the docker host for example).

I forgot about this detail in my original PR description, but this effectively eliminates the possibility of the three services (rest, grpc, supervisor) from running on different hosts. Since the consolidation of the {grpc,rest,supervisor}Host arguments already has this effect, I think this is OK..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thanks for clarifying.

@@ -37,24 +37,18 @@ const EPILOGUE = `Available configuration options:
${OPTIONS.timeout.description}

${'EMULATOR'.underline} - Emulates the Cloud Functions API.
${'grpcHost'.bold}
${OPTIONS.grpcHost.description}

${'grpcPort'.bold}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a bindHost entry is missing from this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, you're right. I'll add it..

Copy link
Contributor

@jmdobry jmdobry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@jmdobry jmdobry merged commit 893efc5 into googlearchive:master May 5, 2017
@jmdobry
Copy link
Contributor

jmdobry commented May 5, 2017

I'm now wondering if there should be some kind of migration. The restHost and grpcHost settings will need to be removed from users' config files (which they can do via functions config reset), but when the upgrade @google-cloud/functions-emulator it'd be nice if the new host option could be pre-populated with the currently configured restHost option.

@jmdobry
Copy link
Contributor

jmdobry commented May 5, 2017

I just pulled down the latest master and ran functions config list -j and noticed that my config now has host and bindHost, but if a use has configured restHost and grpcHost to something different, they would experience a break if the host option doesn't assume the existing restHost value.

@kpruden
Copy link
Contributor Author

kpruden commented May 5, 2017

I thought about that, and mentioned it in the original PR description (it's one point in a long description so I understand if you missed it). Given that this is pre-release software I'm not sure what the stance is on backwards compatibility/migration, so I left this out of the original PR.

If you think this is important, I can probably get something submitted tomorrow. I agree that the failure mode in this case is not obvious, so it's probably worth doing...

@jmdobry
Copy link
Contributor

jmdobry commented May 5, 2017

I added some migration in 904fc36

@kpruden
Copy link
Contributor Author

kpruden commented May 5, 2017

Cool. I took a stab at it myself and came up with a similar fix, but there's a problem - since these options have defaults (localhost) the value of restHost will always be used.

I put up a PR that addresses this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants