[Fleet] Synthetics data streams do not show up in data streams list #102651

Closed
ruflin opened this issue Jun 18, 2021 · 14 comments · Fixed by elastic/beats#26774
Labels
bug, Team:Fleet, Team:Uptime - DEPRECATED, v7.14.0

Comments

@ruflin
Member

ruflin commented Jun 18, 2021

I'm using the synthetics integration and it ships data to the synthetics-*-* data streams. But these data streams do not show up in the data streams page in Fleet, only logs and metrics.

My assumption is that it is missing from the filter. traces from APM might also be missing.

We should make sure this somehow stays in sync with the prefixes supported by the spec: https://github.com/elastic/package-spec/blob/master/versions/1/data_stream/manifest.spec.yml#L121

Version: 7.13.0

ruflin added the bug and Team:Fleet labels Jun 18, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@jen-huang
Contributor

@ruflin traces and synthetics are part of our search pattern:

const DATA_STREAM_INDEX_PATTERN = 'logs-*-*,metrics-*-*,traces-*-*,synthetics-*-*';

In your testing were the data streams actually populated in ES and just missing from Fleet?
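
To illustrate which names that pattern is meant to cover, here is a rough TypeScript sketch. Elasticsearch does the real wildcard resolution; the `globToRegExp` and `matchesPattern` helpers below are hypothetical and only approximate it for simple `*` wildcards.

```typescript
const DATA_STREAM_INDEX_PATTERN = 'logs-*-*,metrics-*-*,traces-*-*,synthetics-*-*';

// Hypothetical helper: turn one wildcard pattern into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .split('*')
    // Escape regex metacharacters in the literal parts between wildcards.
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Hypothetical helper: true if the name matches any of the comma-separated patterns.
function matchesPattern(name: string, pattern: string): boolean {
  return pattern.split(',').some((glob) => globToRegExp(glob).test(name));
}

console.log(matchesPattern('synthetics-http-default', DATA_STREAM_INDEX_PATTERN)); // true
console.log(matchesPattern('heartbeat-7.13.2', DATA_STREAM_INDEX_PATTERN)); // false
```

Under this approximation the synthetics data streams are covered by the pattern, which is consistent with them being found by the search.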

@ruflin
Member Author

ruflin commented Jun 23, 2021

The data streams contain data. See screenshots below from stack management vs what is shown in Fleet. Also check the drop down list:

[Image: Screenshot 2021-06-23 at 08 36 01]
[Image: Screenshot 2021-06-23 at 08 35 31]

jen-huang self-assigned this Jun 30, 2021
@jen-huang
Contributor

I was able to replicate this and found that although Fleet is finding the synthetics data streams correctly, in the end they get filtered out because the documents do not contain data_stream.* fields. Fleet will filter out data streams whose documents do not contain data_stream.dataset, data_stream.namespace, and data_stream.type:

.filter(({ dataset, namespace, type }) => dataset && namespace && type)
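
To make the effect of that filter concrete, here is a minimal TypeScript sketch. The `DataStreamInfo` type, the `keepComplete` helper, and the sample entries are hypothetical and are not Fleet's actual code; only the predicate is taken from the snippet above.

```typescript
// Hypothetical shape: the data_stream.* values read back from a stream's documents.
interface DataStreamInfo {
  index: string;
  dataset?: string;
  namespace?: string;
  type?: string;
}

// Same predicate as the snippet above: drop entries missing any of the three fields.
function keepComplete(streams: DataStreamInfo[]): DataStreamInfo[] {
  return streams.filter(({ dataset, namespace, type }) =>
    Boolean(dataset && namespace && type)
  );
}

const found: DataStreamInfo[] = [
  { index: 'logs-system.syslog-default', dataset: 'system.syslog', namespace: 'default', type: 'logs' },
  // Documents without data_stream.* fields yield no values, as in the synthetics case.
  { index: 'synthetics-http-default' },
];

const visible = keepComplete(found).map((s) => s.index);
console.log(visible); // only the logs entry remains
```

The synthetics entry is found but then dropped, which matches the behaviour seen in the Fleet data streams list.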

cc @dominiqueclarke, sending this your way per our discussion.

Here is an example HTTP synthetics document I ingested. Note the absence of any data_stream.* fields:

      {
        "_index" : ".ds-synthetics-http-default-2021.06.30-000001",
        "_id" : "oW2vXXoBTY3cXbdXNQEz",
        "_score" : 1.0,
        "_source" : {
          "tcp" : {
            "rtt" : {
              "connect" : {
                "us" : 15333
              }
            }
          },
          "summary" : {
            "up" : 1,
            "down" : 0
          },
          "agent" : {
            "hostname" : "Jens-MacBook-Pro-New.local",
            "name" : "Jens-MacBook-Pro-New.local",
            "id" : "e495ae31-bcad-42df-8d92-e954df5de8f1",
            "type" : "heartbeat",
            "ephemeral_id" : "a558eecb-9f62-4b92-b149-3b56dfc4f407",
            "version" : "7.13.2"
          },
          "resolve" : {
            "rtt" : {
              "us" : 29079
            },
            "ip" : "142.251.33.68"
          },
          "monitor" : {
            "duration" : {
              "us" : 252998
            },
            "ip" : "142.251.33.68",
            "name" : "google",
            "check_group" : "9e1b8e24-d9bd-11eb-a689-367ddaa4a3f7",
            "id" : "ec06dd5f-8895-46ab-ab37-a6c452f65c7b",
            "timespan" : {
              "lt" : "2021-06-30T16:12:59.211Z",
              "gte" : "2021-06-30T16:09:59.211Z"
            },
            "fleet_managed" : true,
            "type" : "http",
            "status" : "up"
          },
          "url" : {
            "path" : "/",
            "scheme" : "https",
            "port" : 443,
            "domain" : "www.google.com",
            "full" : "https://www.google.com/"
          },
          "observer" : {
            "geo" : {
              "name" : "Fleet managed"
            },
            "hostname" : "Jens-MacBook-Pro-New.local",
            "ip" : [
              "REDACTED"
            ],
            "mac" : [
              "REDACTED"
            ]
          },
          "@timestamp" : "2021-06-30T16:09:58.958Z",
          "ecs" : {
            "version" : "1.9.0"
          },
          "http" : {
            "rtt" : {
              "response_header" : {
                "us" : 62147
              },
              "total" : {
                "us" : 223504
              },
              "write_request" : {
                "us" : 129
              },
              "content" : {
                "us" : 2159
              },
              "validate" : {
                "us" : 64307
              }
            },
            "response" : {
              "headers" : {
                "Accept-Ranges" : "none",
                "X-Frame-Options" : "SAMEORIGIN",
                "Alt-Svc" : "h3=\":443\"; ma=2592000,h3-29=\":443\"; ma=2592000,h3-T051=\":443\"; ma=2592000,h3-Q050=\":443\"; ma=2592000,h3-Q046=\":443\"; ma=2592000,h3-Q043=\":443\"; ma=2592000,quic=\":443\"; ma=2592000; v=\"46,43\"",
                "Cache-Control" : "private, max-age=0",
                "Server" : "gws",
                "X-Xss-Protection" : "0",
                "Vary" : "Accept-Encoding",
                "Set-Cookie" : [
                  "1P_JAR=2021-06-30-16; expires=Fri, 30-Jul-2021 16:09:59 GMT; path=/; domain=.google.com; Secure",
                  "NID=218=GU87VJNS6mIQ4m7PsYGJE8DLRtU0lrfyeKCEroYgUqSDen_UVhEVowfNVYF9ejf-kXWqouXIPluEeuDiJYpOFmUcvOpDF1JOlM2oMyGJ6l4WBxZcXLTbNyDPd7xxb6HgOrmQSlsOvC-o_VAjDBLpGToFx5SEV9rmxqyjDFeztwY; expires=Thu, 30-Dec-2021 16:09:59 GMT; path=/; domain=.google.com; HttpOnly"
                ],
                "Expires" : "-1",
                "P3p" : "CP=\"This is not a P3P policy! See g.co/p3phelp for more info.\"",
                "Date" : "Wed, 30 Jun 2021 16:09:59 GMT",
                "Content-Type" : "text/html; charset=ISO-8859-1"
              },
              "status_code" : 200,
              "mime_type" : "text/html; charset=utf-8",
              "body" : {
                "bytes" : 13411,
                "hash" : "be4c242d798f698d836d869b6582e2fed3d5c838a4c91df79c7884b12317664b"
              }
            }
          },
          "tls" : {
            "cipher" : "TLS-AES-128-GCM-SHA256",
            "certificate_not_valid_before" : "2021-06-07T03:57:26.000Z",
            "established" : true,
            "server" : {
              "x509" : {
                "not_after" : "2021-08-30T03:57:25.000Z",
                "subject" : {
                  "distinguished_name" : "CN=www.google.com,O=Google LLC,L=Mountain View,ST=California,C=US",
                  "common_name" : "www.google.com"
                },
                "not_before" : "2021-06-07T03:57:26.000Z",
                "public_key_algorithm" : "ECDSA",
                "public_key_curve" : "P-256",
                "signature_algorithm" : "SHA256-RSA",
                "serial_number" : "6817691114215874172539110871462832804",
                "issuer" : {
                  "distinguished_name" : "CN=GTS CA 1O1,O=Google Trust Services,C=US",
                  "common_name" : "GTS CA 1O1"
                }
              },
              "hash" : {
                "sha1" : "74229605f7298631aabf8b0fbf521894ffc84462",
                "sha256" : "ef495a0c31d8727996accb27e051d1d36979d8e13765b3b870039f12ddbc92ed"
              }
            },
            "rtt" : {
              "handshake" : {
                "us" : 143472
              }
            },
            "version" : "1.3",
            "certificate_not_valid_after" : "2021-08-30T03:57:25.000Z",
            "version_protocol" : "tls"
          },
          "event" : {
            "agent_id_status" : "auth_metadata_missing",
            "ingested" : "2021-06-30T16:10:00.244135Z",
            "dataset" : "http"
          }
        }
      }
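
For reference, the values this document would be expected to carry can be read off its backing index name. The sketch below assumes the usual `.ds-<type>-<dataset>-<namespace>-<yyyy.MM.dd>-<generation>` backing index layout and that dataset and namespace contain no hyphens; the `fieldsFromBackingIndex` helper is hypothetical and is not part of Fleet or Heartbeat.

```typescript
interface DataStreamFields {
  type: string;
  dataset: string;
  namespace: string;
}

// Hypothetical helper: split a backing index name into the three data_stream.* values.
// Returns undefined when the name does not follow the assumed layout.
function fieldsFromBackingIndex(index: string): DataStreamFields | undefined {
  const match = /^\.ds-([^-]+)-([^-]+)-([^-]+)-\d{4}\.\d{2}\.\d{2}-\d+$/.exec(index);
  if (!match) {
    return undefined;
  }
  const [, type, dataset, namespace] = match;
  return { type, dataset, namespace };
}

console.log(fieldsFromBackingIndex('.ds-synthetics-http-default-2021.06.30-000001'));
// { type: 'synthetics', dataset: 'http', namespace: 'default' }
```

These are the three values (data_stream.type, data_stream.dataset, data_stream.namespace) that the Fleet filter requires and that are absent from the `_source` above.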

jen-huang added the Team:Uptime - DEPRECATED label Jun 30, 2021
@elasticmachine
Contributor

Pinging @elastic/uptime (Team:uptime)

@ruflin
Member Author

ruflin commented Jul 1, 2021

It is important that heartbeat / synthetics sets the data_stream.* fields. @andrewvc can you take this on?

@dominiqueclarke
Contributor

@ruflin. Andrew is away for a few weeks. I am investigating the issue, but we'll likely need a beats contributor to investigate why the fields are not being indexed by heartbeat. Can we clarify the priority for this resolution? Is this needed for 7.14? That'll help us coordinate and figure out if we'll need to look for outside help while Andrew is away.

@ruflin
Member Author

ruflin commented Jul 7, 2021

How is synthetics currently filtering down the data? Since this issue has not bubbled up until now, I assume it is not prefiltering on the data_stream.* fields? With these fields missing, you can't reliably filter on them. There is an effort to set the values of these fields already in the package itself, and with it in the template, but I'm not sure if this was also applied to the synthetics package (@mtojek ?). This might be a good short-term fix.

There are at least 2 ways to add these fields: Elastic Agent magically adds a processor, or Heartbeat adds the fields. As Heartbeat knows which fields and values to add, this should be the right place. Who is working on Heartbeat while Andrew is out?

@paulb-elastic
Contributor

@vigneshshanmugam is this anything you'd be able to look at on the Heartbeat side (/cc @shahzad31)?

Failing that, would either @blakerouse or @urso have any capacity to help here?

@vigneshshanmugam
Member

I took a quick look at the code, and it seems Heartbeat already has support for data streams; it feels like we are missing the configuration on the integrations side. I am not a Heartbeat expert and might be wrong.

Here is the PR for the data streams support: elastic/beats#24223. I also did a quick local test and can confirm it works as expected.

> curl -XGET -H "Content-type: application/json" http://localhost:9200/_cat/indices
green  open .geoip_databases                              BRCc_zP3QsidfL6HcBDqEQ 1 0 38 0  35.8mb  35.8mb
yellow open .ds-synthetics-http-default-2021.07.07-000001 6bH20e0qQy-TRuYE2c-gPw 1 1  3 0  59.1kb  59.1kb

@dominiqueclarke
Contributor

@vigneshshanmugam and I synced offline and confirmed that, while we do have support for data streams, the data stream documents are missing the necessary fields. These fields, as far as I am aware, will need to be indexed on the heartbeat side as the integration package is passing information for all the necessary fields within the integration policy.

@vigneshshanmugam
Member

vigneshshanmugam commented Jul 7, 2021

We were able to verify that the documents were missing the data_stream.* fields. I put together a quick fix in elastic/beats#26774 and can confirm the documents now have the necessary fields. I don't know much about the integrations side of things, so if anyone can do a quick test to confirm the fix, that would be great.

@ruflin
Member Author

ruflin commented Jul 8, 2021

If the fields are now shipped by Heartbeat directly, I don't think any changes to the integrations are needed. Can you also test that a change to the namespace works as expected?

@vigneshshanmugam
Member

I tested the fix using an example integration policy that @dominiqueclarke sent over to me, and it works as expected.

Integration Policy
- id: 211f80e4-1903-4cf5-aa3b-df683ab2722e
  name: Sample Monitor 3/3
  revision: 1
  type: synthetics/http
  use_output: default
  meta:
    package:
      name: synthetics
      version: 0.1.46
  data_stream:
    namespace: default
  streams:
    - id: synthetics/http-http-211f80e4-1903-4cf5-aa3b-df683ab2722e
      name: Sample Monitor 3/3
      type: http
      data_stream:
        dataset: http
        type: synthetics
      urls: 'http://elastic.co'
      service.name: null
      schedule: '@every 5s'
      timeout: 1600
      max_redirects: 0
      proxy_url: ''
      tags:
        - tag1
        - tag2

The fix has also landed in master and been backported.
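
The namespace check requested earlier in the thread can be sketched as follows. This is a TypeScript illustration under stated assumptions, not Heartbeat's implementation: the `InputPolicy` and `StreamPolicy` types are a trimmed-down, hypothetical view of the policy above, `expectedDataStreams` is a made-up helper, and `staging` is an arbitrary example namespace.

```typescript
// Hypothetical, trimmed-down view of the integration policy above.
interface StreamPolicy {
  data_stream: { dataset: string; type: string };
}
interface InputPolicy {
  data_stream: { namespace: string };
  streams: StreamPolicy[];
}

// For each stream, the data_stream.* values its documents would be expected to carry,
// plus the resulting data stream name "<type>-<dataset>-<namespace>".
function expectedDataStreams(policy: InputPolicy) {
  return policy.streams.map(({ data_stream: { type, dataset } }) => ({
    fields: { type, dataset, namespace: policy.data_stream.namespace },
    name: `${type}-${dataset}-${policy.data_stream.namespace}`,
  }));
}

const policy: InputPolicy = {
  data_stream: { namespace: 'default' },
  streams: [{ data_stream: { dataset: 'http', type: 'synthetics' } }],
};
console.log(expectedDataStreams(policy)[0].name); // synthetics-http-default

// Changing the namespace should change both the field value and the target name.
const staging: InputPolicy = { ...policy, data_stream: { namespace: 'staging' } };
console.log(expectedDataStreams(staging)[0].name); // synthetics-http-staging
```

Comparing these expected values against the data_stream.* fields in freshly ingested documents is one way to confirm that a namespace change propagates.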
