EMT-179: implement metadata query versioning based on ingest manager installed ES assets #77252

@nnamdifrankie nnamdifrankie commented Sep 10, 2020

Summary

https://github.com/elastic/security-team/issues/179
#76545

  • Introduce a query strategy pattern; the strategy can either be specified or selected based on ingest package information.
  • Expose PackageService in ingest manager to return the list of installed ES assets in a specific package.
  • Add tests for both versions of the query.
  • Minimally refactor the code to allow for these changes.
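
A minimal sketch of how a strategy could be selected from the installed ES assets. The three constants are copied from the diff in this PR; the type shapes, the selectQueryStrategy helper, and the rule "use v2 when a transform asset with the metadata transform prefix is installed" are assumptions for illustration, not the merged implementation.

```typescript
// Sketch only: type shapes and selection rule are assumptions.
interface EsAssetReference {
  id: string;
  type: string;
}

interface MetadataQueryStrategy {
  index: string;
  queryStrategyVersion: 'v1' | 'v2';
}

// Constants as they appear in the diff under review.
const metadataIndexPattern = 'metrics-endpoint.metadata-*';
const metadataCurrentIndexPattern = 'metrics-endpoint.metadata_current-*';
const metadataTransformPrefix = 'metrics-endpoint.metadata-current-default';

const v1Strategy: MetadataQueryStrategy = {
  index: metadataIndexPattern,
  queryStrategyVersion: 'v1',
};

const v2Strategy: MetadataQueryStrategy = {
  index: metadataCurrentIndexPattern,
  queryStrategyVersion: 'v2',
};

// Hypothetical helper: pick v2 only when the installed package ships a
// transform whose id starts with the metadata transform prefix.
function selectQueryStrategy(installedEs: EsAssetReference[]): MetadataQueryStrategy {
  const hasMetadataTransform = installedEs.some(
    (asset) => asset.type === 'transform' && asset.id.indexOf(metadataTransformPrefix) === 0
  );
  return hasMetadataTransform ? v2Strategy : v1Strategy;
}
```

With an empty asset list (for example, an older package without the transform) this sketch falls back to v1, which matches the intent that the UI keeps working before the package is upgraded.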
fejoh-mbp:queries fejoh$ curl -X POST -H 'kbn-xsrf: xxx' --user elastic:changeme -H "Content-Type: application/json" http://localhost:5601/api/endpoint/v1/metadata | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2070  100  2070    0     0  11694      0 --:--:-- --:--:-- --:--:-- 11694
{
  "request_page_size": 10,
  "request_page_index": 0,
  "hosts": [
    {
      "metadata": {
        "agent": {
          "id": "ac427df3-d59d-4a37-8b13-5e97b587c37e",
          "type": "endpoint",
          "version": "6.8.1"
        },
        "@timestamp": 1599944080721,
        "Endpoint": {
          "status": "enrolled",
          "policy": {
            "applied": {
              "name": "Default",
              "id": "00000000-0000-0000-0000-000000000000",
              "status": "success"
            }
          }
        },
        "elastic": {
          "agent": {
            "id": "ee74dd4a-a110-4b2e-bdba-733a142e157b"
          }
        },
        "host": {
          "hostname": "Host-3f69srrhhq",
          "os": {
            "Ext": {
              "variant": "Windows Server"
            },
            "name": "windows 10.0",
            "family": "Windows",
            "version": "10.0",
            "platform": "Windows",
            "full": "Windows Server 2016"
          },
          "ip": [
            "10.196.41.234",
            "10.187.235.9",
            "10.241.178.162"
          ],
          "name": "Host-3f69srrhhq",
          "id": "1491b636-75df-4a02-a639-7dc295336314",
          "mac": [
            "71-9e-b3-2b-6-93",
            "8b-8d-78-f2-d2-b0"
          ],
          "architecture": "agddpyot36"
        },
        "event": {
          "ingested": "2020-09-12T20:54:42.221185Z",
          "created": 1599944080721,
          "kind": "metric",
          "module": "endpoint",
          "action": "endpoint_metadata",
          "id": "684c81ca-4a85-480a-a63b-518797c22dc5",
          "category": [
            "host"
          ],
          "type": [
            "info"
          ],
          "dataset": "endpoint.metadata"
        }
      },
      "host_status": "error"
    }
  ],
  "total": 1,
  "query_strategy_version": "v1"
}
fejoh-mbp:queries fejoh$ curl -X POST -H 'kbn-xsrf: xxx' --user elastic:changeme -H "Content-Type: application/json" http://localhost:5601/api/endpoint/metadata | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2070  100  2070    0     0  20495      0 --:--:-- --:--:-- --:--:-- 20495
{
  "request_page_size": 10,
  "request_page_index": 0,
  "hosts": [
    {
      "metadata": {
        "agent": {
          "id": "7f027429-2368-48c0-bbe9-656b248c26a2",
          "type": "endpoint",
          "version": "6.4.1"
        },
        "@timestamp": 1599944083571,
        "Endpoint": {
          "status": "enrolled",
          "policy": {
            "applied": {
              "name": "With Eventing",
              "id": "00000000-0000-0000-0000-000000000000",
              "status": "failure"
            }
          }
        },
        "elastic": {
          "agent": {
            "id": "b628ea2a-9900-4920-8195-46266d144dca"
          }
        },
        "host": {
          "hostname": "Host-27l8alqr9n",
          "os": {
            "Ext": {
              "variant": "Windows Server"
            },
            "name": "windows 10.0",
            "family": "Windows",
            "version": "10.0",
            "platform": "Windows",
            "full": "Windows Server 2016"
          },
          "ip": [
            "10.85.28.122",
            "10.129.85.251",
            "10.225.157.171"
          ],
          "name": "Host-27l8alqr9n",
          "id": "6a81e318-398b-4204-b3d7-06d5af6a8659",
          "mac": [
            "87-ab-c2-58-cb-b2"
          ],
          "architecture": "cz5kyjdjyn"
        },
        "event": {
          "ingested": "2020-09-12T20:54:43.695403Z",
          "created": 1599944083571,
          "kind": "metric",
          "module": "endpoint",
          "action": "endpoint_metadata",
          "id": "24e42a07-cf54-41c2-851e-448db511312b",
          "category": [
            "host"
          ],
          "type": [
            "info"
          ],
          "dataset": "endpoint.metadata"
        }
      },
      "host_status": "error"
    }
  ],
  "total": 1,
  "query_strategy_version": "v2"
}

@nnamdifrankie nnamdifrankie added release_note:skip Skip the PR/issue when compiling release notes v7.9.0 v7.10.0 v8.0.0 and removed v7.9.0 labels Sep 10, 2020
# Conflicts:
#	x-pack/plugins/security_solution/server/endpoint/endpoint_app_context_services.ts
#	x-pack/plugins/security_solution/server/endpoint/routes/metadata/index.ts
#	x-pack/plugins/security_solution/server/endpoint/routes/metadata/query_builders.test.ts
#	x-pack/plugins/security_solution/server/endpoint/routes/metadata/query_builders.ts
#	x-pack/test/security_solution_endpoint_api_int/apis/metadata.ts
@nnamdifrankie nnamdifrankie marked this pull request as ready for review September 13, 2020 22:59
@nnamdifrankie nnamdifrankie requested a review from a team as a code owner September 13, 2020 22:59
@nnamdifrankie nnamdifrankie requested a review from a team September 13, 2020 22:59
@nnamdifrankie nnamdifrankie requested a review from a team as a code owner September 13, 2020 22:59
@nnamdifrankie
Contributor Author

@elasticmachine merge upstream

@nnamdifrankie nnamdifrankie changed the title EMT-179: initial refactor for versioning EMT-179: implement metadata query versioning based on ingest manager installed ES assets Sep 13, 2020
@botelastic botelastic bot added the Team:Fleet Team label for Observability Data Collection Fleet team label Sep 14, 2020
@elasticmachine
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

@@ -8,6 +8,7 @@ export const eventsIndexPattern = 'logs-endpoint.events.*';
export const alertsIndexPattern = 'logs-endpoint.alerts-*';
export const metadataIndexPattern = 'metrics-endpoint.metadata-*';
export const metadataCurrentIndexPattern = 'metrics-endpoint.metadata_current-*';
export const metadataTransformPrefix = 'metrics-endpoint.metadata-current-default';
Contributor

I'm stumbling over the three - characters in here, as the dataset / namespace should not contain -.

Contributor Author

@nnamdifrankie nnamdifrankie Sep 14, 2020

@ruflin

This is actually the dataset namespace metrics-endpoint.metadata. The current-default part is the name of the transform file, current-default.json. As you requested, I will be moving the transform to metrics-endpoint.metadata_current and using the filename default.json, which means the asset names will be prefixed with metrics-endpoint.metadata_current-default. I am working on a PR for that, but this PR does not depend on it, as I will have to test everything end to end before that PR is opened.

*/

const IGNORED_ELASTIC_AGENT_IDS = [
'00000000-0000-0000-0000-000000000000',
Contributor

This looks a bit hacky to be honest. I wonder if we could find a better way to do this.

Contributor

This is done because the Endpoint will send up documents before getting the first Integration Policy from the Agent. As a result, we'll have IDs that we can't compare against any Agents to determine if an Agent or Endpoint is unenrolled.

An alternative solution, I think, would require some form of messaging from the Endpoint itself, as we will not be able to figure out which Agent the Endpoint comes from. @ferullo had said in the past that we should continue to send this initial document, which I agree with since it can signify an important step in the lifecycle and help with debugging later. It's just very difficult to tell when an Endpoint is unenrolled or stopped, etc.

Another long term solution would be to form our views to look for Agents first. Then enrich the list of Agents with Endpoint data. If we cannot find Endpoint data for that Agent, we show some message (Endpoint pending, etc...). I think this could be a vision for more Integration specific views in Ingest. Per Integration, we could see a list of Agents with some Integration specific (i.e. Endpoint) data enrichments.
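
To make the "Agents first" idea concrete, here is a rough sketch under stated assumptions: the Agent, EndpointMetadataDoc, and AgentRow shapes and the enrichAgentsWithEndpoint helper are all hypothetical, and only the elastic.agent.id and host.hostname fields are taken from the sample response in the PR description.

```typescript
// Sketch only: all names and shapes here are hypothetical.
interface Agent {
  id: string;
}

// Only the fields needed for the join are modeled.
interface EndpointMetadataDoc {
  elastic: { agent: { id: string } };
  host: { hostname: string };
}

interface AgentRow {
  agentId: string;
  endpointStatus: 'reporting' | 'pending';
  hostname?: string;
}

// Start from the list of Agents, then enrich each one with Endpoint data.
// Agents without a matching Endpoint document are marked as pending.
function enrichAgentsWithEndpoint(agents: Agent[], docs: EndpointMetadataDoc[]): AgentRow[] {
  const byAgentId: { [agentId: string]: EndpointMetadataDoc | undefined } = {};
  for (const doc of docs) {
    byAgentId[doc.elastic.agent.id] = doc;
  }
  return agents.map(
    (agent): AgentRow => {
      const doc = byAgentId[agent.id];
      return doc
        ? { agentId: agent.id, endpointStatus: 'reporting', hostname: doc.host.hostname }
        : { agentId: agent.id, endpointStatus: 'pending' };
    }
  );
}
```

Because the join starts from Agents, documents carrying a placeholder agent ID never match a row, so a list like IGNORED_ELASTIC_AGENT_IDS would not be needed in this view.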

Contributor

Even if endpoint did not receive a policy yet, I assume it already knows about the Agent id under which it is running? Or is this only sent down with the policy?

Contributor

Even if endpoint did not receive a policy yet, I assume it already knows about the Agent id under which it is running? Or is this only sent down with the policy?

This is only sent down with the Policy. This "initial" policy is hardcoded and ships with the Endpoint. We don't get updated information until the Endpoint gets the first Policy from Agent. These initial documents are sent after the ES connection is established. Is this correct @ferullo ? Just want to make sure I'm not misrepresenting anything

Contributor

@blakerouse @michalpristas It would be nice if we could ship down the agent id already as part of the initial "negotiation" between Agent and Endpoint.

@ruflin The initial configuration that we pass to Endpoint already contains the fleet.agent.id. At no point do we send Endpoint a configuration without fleet.agent.id.

Contributor

@blakerouse If I understood the above correctly, Endpoint already starts to send events before it receives the first policy. If that is the case, is there a way we could have the agent.id as part of the "initiation" phase?

@ruflin I question how that is possible, being that Endpoint would not even know the elasticsearch output information until Agent sends them the configuration that already includes fleet.agent.id.

Contributor

I question how that is possible, being that Endpoint would not even know the elasticsearch output information until Agent sends them the configuration that already includes fleet.agent.id.

@ferullo my understanding is that the Endpoint has the initial documents ready to send and this is essentially a flush?

Contributor

Yes. Endpoint caches data internally to send later when it does not have an active connection to ES.

Comment on lines +280 to +288
packageService: {
  getInstalledEsAssetReferences: async (
    savedObjectsClient: SavedObjectsClientContract,
    pkgName: string
  ): Promise<EsAssetReference[]> => {
    const installation = await getInstallation({ savedObjectsClient, pkgName });
    return installation?.installed_es || [];
  },
},
Contributor Author

@jfsiii please let me know if you want this moved to another file.

Contributor

Thanks for asking. I think ideally it'd be defined elsewhere, but I'm fine shipping as-is and leaving that for later. I'll find/create a ticket and reference this there.

): Promise<MetadataQueryStrategy>;
}

export const createMetadataService = (packageService: PackageService): MetadataService => {
Contributor

Does this get called on package upgrade/downgrade? Does the Kibana server need to be restarted?

@kevinlog
Contributor

Pulled it down and tried it with the UI, so far everything is working. I'll also test it with an old package to make sure the UI still works without the upgrade - @nnamdifrankie you said that we can force install an older package, how does that work?

@nnamdifrankie
Contributor Author

nnamdifrankie commented Sep 15, 2020

@kevinlog

you said that we can force install an older package, how does that work?

I have not been able to get a smooth transition to work for upgrade and downgrade. Right now, the steps that have worked for me are the following:

  1. Start ES and Kibana.

  2. Force install the last version of the endpoint package: curl -X POST -H 'kbn-xsrf: xxx' --user elastic:changeme -H "Content-Type: application/json" -d '{"force" : true}' http://localhost:5601/api/ingest_manager/epm/packages/endpoint-0.15.0

  3. Run the generator: npx yarn test:generate --auth elastic:changeme --nd 3 --ne 2

  4. Visit the endpoint page to see that it still gets data from the metadata index, or run the query: curl -X POST -H 'kbn-xsrf: xxx' --user elastic:changeme -H "Content-Type: application/json" http://localhost:5601/api/endpoint/metadata | jq .

  5. Ideally we should do a force install of 0.16.0-dev.0, but there appear to be some incompatible changes between the two versions around data streams. Therefore the only workable approach is to restart ES and Kibana.

  6. Either run setup by visiting the ingest manager page or run the endpoint generator, then visit the endpoint page again. You should also see "query_strategy_version": "v2" in the response, based on the installed endpoint package.
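
When scripting the check in steps 4 and 6 instead of reading the jq output by eye, something like the following sketch could pull the version out of the response body. The readQueryStrategyVersion helper is hypothetical; it relies only on the query_strategy_version field shown in the sample responses in the PR description.

```typescript
// Sketch only: a hypothetical helper for scripted checks of the response.
// Takes the raw JSON text returned by POST /api/endpoint/metadata.
function readQueryStrategyVersion(responseText: string): string {
  const body = JSON.parse(responseText) as { query_strategy_version?: unknown };
  if (typeof body.query_strategy_version !== 'string') {
    throw new Error('response has no query_strategy_version field');
  }
  return body.query_strategy_version;
}
```

With the 0.15.0 package installed the expectation would presumably be v1 at step 4, and v2 at step 6 once the newer package and its transform are in place.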

Contributor

@jfsiii jfsiii left a comment

👍 for Ingest Manager changes (adding/exposing PackageService)

@nnamdifrankie
Contributor Author

@elasticmachine merge upstream

@kevinlog
Contributor

Was able to confirm that the data in the UI still comes back correctly when I install an older transform. Code and functionality LGTM!

image

Contributor

@paul-tavares paul-tavares left a comment

LGTM 👍
Had a few minor suggestions/questions, but all optional.

)
),
total: totalNumberOfHosts,
query_strategy_version: hostListQueryResult.queryStrategyVersion,
Contributor

🕺
Awesome. So the UI will use this new response property to conditionally show the KQL bar if version is v2 👍

Contributor Author

Yes
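
For illustration, the UI-side check described in this thread could be as small as the sketch below. The HostListResponse shape models only the new response property, and shouldShowKqlBar is a hypothetical name; the actual UI code is not part of this PR.

```typescript
// Sketch only: models just the response property discussed in this thread.
interface HostListResponse {
  query_strategy_version: 'v1' | 'v2';
}

// Hypothetical UI helper: show the KQL bar only for the v2 query strategy.
function shouldShowKqlBar(response: HostListResponse): boolean {
  return response.query_strategy_version === 'v2';
}
```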

return endpointAppContext.logFactory.get('metadata');
};
export const BASE_ENDPOINT_ROUTE = '/api/endpoint';
export const METADATA_REQUEST_V1_ROUTE = `${BASE_ENDPOINT_ROUTE}/v1/metadata`;
Contributor

These routes are all for non-UI usage, correct? From the UI side, we will continue to use the METADATA_REQUEST_ROUTE defined below, right?

Contributor Author

Yes, it is transparent to the UI
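
One way to picture "transparent to the UI" is the sketch below, in which the versioned route pins its strategy and the unversioned route defers to detection. The route constants are the ones in this diff; resolveStrategyVersion and its detect callback are hypothetical, and the pinning behavior is inferred from the two curl responses in the PR description rather than from the merged code.

```typescript
// Route constants as they appear in the diff under review.
const BASE_ENDPOINT_ROUTE = '/api/endpoint';
const METADATA_REQUEST_V1_ROUTE = `${BASE_ENDPOINT_ROUTE}/v1/metadata`;
const METADATA_REQUEST_ROUTE = `${BASE_ENDPOINT_ROUTE}/metadata`;

type QueryStrategyVersion = 'v1' | 'v2';

// Hypothetical helper: the v1 route always uses the v1 strategy, while the
// unversioned route asks the caller-supplied detector (for example, one
// based on the installed package assets).
function resolveStrategyVersion(
  routePath: string,
  detect: () => QueryStrategyVersion
): QueryStrategyVersion {
  return routePath === METADATA_REQUEST_V1_ROUTE ? 'v1' : detect();
}
```

The UI would keep calling METADATA_REQUEST_ROUTE and read query_strategy_version from the response, so it never needs to know which route variants exist.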

Comment on lines +17 to +18
export const METADATA_REQUEST_ROUTE = `${BASE_ENDPOINT_ROUTE}/metadata`;
export const GET_METADATA_REQUEST_ROUTE = `${METADATA_REQUEST_ROUTE}/{id}`;
Contributor

Side note: it would be great if these two constants could be elevated to x-pack/plugins/security_solution/common/endpoint/constants.ts so that the UI can also use them. See the entries there now for Trusted apps for a reference.

{
path: `${GET_METADATA_REQUEST_ROUTE}`,
validate: GetMetadataRequestSchema,
options: { authRequired: true, tags: ['access:securitySolution'] },
Contributor

Q. What do the tags do?

Because I may have missed that in the Trusted Apps APIs. 🙂

Contributor Author

it is for space access control

@@ -0,0 +1,103 @@
/*
Contributor

Suggestion: name this file mocks. That seems to be common across Kibana. Also, I often see function names that include the word mock, e.g. createV1SearchResponseMock().

@kevinlog
Contributor

@elasticmachine merge upstream

@nnamdifrankie
Contributor Author

@elasticmachine merge upstream

@nnamdifrankie
Contributor Author

@elasticmachine merge upstream

@kibanamachine
Contributor

💚 Build Succeeded

Build metrics

async chunks size

id value diff baseline
securitySolution 10.1MB +624.0B 10.1MB

distributable file count

id value diff baseline
default 45964 +4 45960

@nnamdifrankie nnamdifrankie merged commit 8bfdefe into elastic:master Sep 17, 2020
@nnamdifrankie nnamdifrankie deleted the EMT-179-version-metadata-api-and-queries branch September 17, 2020 01:27
nnamdifrankie added a commit to nnamdifrankie/kibana that referenced this pull request Sep 18, 2020
…installed ES assets (elastic#77252)

* EMT-179: initial refactor for versioning

* EMT-179: move things before pulling from master

* EMT-179: fix build

* EMT-179: clean up

* EMT-179: add ingest hook, and improve all tests

* EMT-179: fix build

* EMT-179: clean up

* EMT-179: fix build

* EMT-179: fix build

* EMT-179: clean up

* EMT-179: more clean up

* EMT-179: clean up

* EMT-179: fix build

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
nnamdifrankie added a commit that referenced this pull request Sep 18, 2020
…installed ES assets (#77252) (#77891)

EMT-179: implement metadata query versioning based on ingest manager installed ES assets
Labels
release_note:skip Skip the PR/issue when compiling release notes Team:Fleet Team label for Observability Data Collection Fleet team v7.10.0 v8.0.0