
countdash example doesn't work #6463

Closed
kneufeld opened this issue Oct 10, 2019 · 6 comments
Comments

@kneufeld
Contributor

Nomad version

Nomad v0.10.0-beta1 (7df4da7)

Operating system and Environment details

Ubuntu 18.04
Docker version 19.03.2, build 6a30dfc

Issue

After failing to get my own service running with sidecar_service, I tried the countdash example and it doesn't work either. I can load the dashboard, but it complains that the counting service is unreachable.

As near as I can tell, the API port isn't being created correctly.

Reproduction steps

This would be easier if there were a GitHub repo containing countdash.nomad and the two Dockerfiles.

Made countdash.nomad from https://www.hashicorp.com/blog/consul-connect-integration-in-hashicorp-nomad

nomad run countdash.nomad

Job file (if appropriate)

job "countdash" {

   datacenters = ["dc1"] # changed to 'west' for my environment

   group "api" {
     network {
       mode = "bridge"
     }

     service {
       name = "count-api"
       port = "9001"

       connect {
         sidecar_service {}
       }
     }

     task "web" {
       driver = "docker"
       config {
         image = "hashicorpnomad/counter-api:v1"
       }
     }
   }

   group "dashboard" {
     network {
       mode = "bridge"
       port "http" {
         static = 9002
         to     = 9002
       }
     }

     service {
       name = "count-dashboard"
       port = "9002"

       connect {
         sidecar_service {
           proxy {
             upstreams {
               destination_name = "count-api"
               local_bind_port = 8080
             }
           }
         }
       }
     }

     task "dashboard" {
       driver = "docker"
       env {
         COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
       }
       config {
         image = "hashicorpnomad/counter-dashboard:v1"
       }
     }
   }
 }

Nomad Client logs (if appropriate)

Here's my only clue: lots of errors/warnings in /alloc/logs/connect-proxy-count-api.stderr.0

[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:238] initializing epoch 0 (hot restart version=11.104)
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:240] statically linked extensions:
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:242]   access_loggers: envoy.file_access_log,envoy.http_grpc_access_log
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:245]   filters.http: envoy.buffer,envoy.cors,envoy.csrf,envoy.ext_authz,envoy.fault,envoy.filters.http.dynamic_forward_proxy,envoy.filters.http.grpc_http1_reverse_bridge,envoy.filters.http.header_to_metadata,envoy.filters.http.jwt_authn,envoy.filters.http.original_src,envoy.filters.http.rbac,envoy.filters.http.tap,envoy.grpc_http1_bridge,envoy.grpc_json_transcoder,envoy.grpc_web,envoy.gzip,envoy.health_check,envoy.http_dynamo_filter,envoy.ip_tagging,envoy.lua,envoy.rate_limit,envoy.router,envoy.squash
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:248]   filters.listener: envoy.listener.original_dst,envoy.listener.original_src,envoy.listener.proxy_protocol,envoy.listener.tls_inspector
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:251]   filters.network: envoy.client_ssl_auth,envoy.echo,envoy.ext_authz,envoy.filters.network.dubbo_proxy,envoy.filters.network.mysql_proxy,envoy.filters.network.rbac,envoy.filters.network.sni_cluster,envoy.filters.network.thrift_proxy,envoy.filters.network.zookeeper_proxy,envoy.http_connection_manager,envoy.mongo_proxy,envoy.ratelimit,envoy.redis_proxy,envoy.tcp_proxy
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:253]   stat_sinks: envoy.dog_statsd,envoy.metrics_service,envoy.stat_sinks.hystrix,envoy.statsd
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:255]   tracers: envoy.dynamic.ot,envoy.lightstep,envoy.tracers.datadog,envoy.tracers.opencensus,envoy.zipkin
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:258]   transport_sockets.downstream: envoy.transport_sockets.alts,envoy.transport_sockets.tap,raw_buffer,tls
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:261]   transport_sockets.upstream: envoy.transport_sockets.alts,envoy.transport_sockets.tap,raw_buffer,tls
[2019-10-10 15:21:39.721][1][info][main] [source/server/server.cc:267] buffer implementation: old (libevent)
[2019-10-10 15:21:39.724][1][warning][misc] [source/common/protobuf/utility.cc:199] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cds.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2019-10-10 15:21:39.728][1][info][main] [source/server/server.cc:322] admin address: 127.0.0.1:19000
[2019-10-10 15:21:39.729][1][info][main] [source/server/server.cc:432] runtime: layers:
  - name: base
    static_layer:
      {}
  - name: admin
    admin_layer:
      {}
[2019-10-10 15:21:39.729][1][warning][runtime] [source/common/runtime/runtime_impl.cc:497] Skipping unsupported runtime layer: name: "base"
static_layer {
}

[2019-10-10 15:21:39.729][1][info][config] [source/server/configuration_impl.cc:61] loading 0 static secret(s)
[2019-10-10 15:21:39.729][1][info][config] [source/server/configuration_impl.cc:67] loading 1 cluster(s)
[2019-10-10 15:21:39.733][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:144] cm init: initializing cds
[2019-10-10 15:21:39.735][1][info][config] [source/server/configuration_impl.cc:71] loading 0 listener(s)
[2019-10-10 15:21:39.735][1][info][config] [source/server/configuration_impl.cc:96] loading tracing configuration
[2019-10-10 15:21:39.735][1][info][config] [source/server/configuration_impl.cc:116] loading stats sink configuration
[2019-10-10 15:21:39.736][1][info][main] [source/server/server.cc:516] starting main dispatch loop
[2019-10-10 15:21:39.736][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-10-10 15:21:39.736][1][info][upstream] [source/common/upstream/cluster_manager_impl.cc:148] cm init: all clusters initialized
[2019-10-10 15:21:39.736][1][info][main] [source/server/server.cc:500] all clusters initialized. initializing init manager
[2019-10-10 15:21:39.977][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-10-10 15:21:39.977][1][info][config] [source/server/listener_manager_impl.cc:761] all dependencies initialized. starting workers
[2019-10-10 15:21:40.869][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-10-10 15:21:42.574][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-10-10 15:21:46.028][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
[2019-10-10 15:21:46.645][1][warning][config] [bazel-out/k8-opt/bin/source/common/config/_virtual_includes/grpc_stream_lib/common/config/grpc_stream.h:87] gRPC config stream closed: 14, upstream connect error or disconnect/reset before headers. reset reason: connection termination
...
@angrycub
Contributor

I was unable to replicate your error myself. I copied the job specification from the guide as you mentioned and ran it with no issue on my nodes. Some follow-up questions:

  • Are you running this in a cluster or on a dev agent?
  • If you aren't using -dev agents, have you enabled gRPC and Connect in the config of your Nomad cluster's Consul agents?
  • Have you enabled Connect on your Consul servers?
  • Could you provide a copy of your Consul configuration (please remove any sensitive information)?

I appreciate the effort to collect this information and hope it will help us get your issue resolved.

@kneufeld
Contributor Author

The problem was that I didn't have gRPC enabled in Consul; once I did that, the example started working.

Where does the documentation say that gRPC is required? I looked around again because I clearly missed it the first time, but I still can't find it.
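
For anyone who hits the same thing, here is a minimal sketch (not my full config) of the Consul agent settings involved, in HCL, assuming the conventional gRPC port 8502. consul agent -dev sets these by default, which is why the example works out of the box there.

# Sketch of the relevant Consul agent settings (HCL), not a complete config.
# Envoy gets its configuration from Consul over gRPC, so the gRPC port
# must be enabled on the agents Nomad talks to...
ports {
  grpc = 8502  # disabled (-1) by default outside -dev mode
}

# ...and Connect must be enabled on the Consul servers.
connect {
  enabled = true
}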

@shantanugadgil
Contributor

@kneufeld the example runs fine in -dev mode, where these options are enabled by default. The difference in behavior between dev and cluster mode was "discovered" by community members 😜

https://discuss.hashicorp.com/t/nomad-consul-connect/2806/3?u=shantanugadgil

@angrycub
Contributor

We added information to the Connect guide in the documentation (which was based on that blog post) to help clarify that for folks: https://www.nomadproject.io/guides/integrations/consul-connect/index.html. Was there another place in the documentation that you happened to look? I ask so we can see if there are other places to put some signposts. I will also see if there is any way to add the information retroactively to the blog post.

@kneufeld
Contributor Author

I just did a search on that page, and the only instance of "grpc" is:

Consul Connect HTTP and gRPC checks are not yet supported

I'm closing this bug, since a doc bug seems like a new issue. Thanks for your quick help.

picatz added a commit to picatz/terraform-google-nomad that referenced this issue Jul 27, 2020
This should have Nomad and Consul deployed and configured with mTLS. ACLs are currently enabled only on Nomad, not on Consul.

This should provide the minimal working example using mTLS to get the count dashboard working, after a ton of tinkering. 😭

The links I used during my investigation/debugging session:
* hashicorp/nomad#6463
* https://learn.hashicorp.com/nomad/consul-integration/nomad-connect-acl#run-a-connect-enabled-job
* hashicorp/nomad#6594
* hashicorp/nomad#4276
* hashicorp/nomad#7715
* https://www.consul.io/docs/agent/options
* hashicorp/nomad#7602 ⭐
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 18, 2022