Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce migration script for data in Kibana files #9998

Merged
merged 6 commits into from
Feb 5, 2019

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented Jan 10, 2019

With 7.x the data structure of the modules was changed to ECS. This has an affect on all Kibana files as the fields changed. For the migration from 6.x to 7.x it is necessary that 6.x and 7.x dashboards can coexist. To not overwrite dashboards in 7.x all Kibana files must have a changed id and to differentiate them in the UI, a different title.

This PR does the following changes:

  • Reads all changed files from ecs-migration.yml and replaces these fields in all Kibana files
  • Reads all ids used in the Kibana files and postfixes these ids with -7x
  • Reads all titles in the Kibana files and appends 7.x to the titles.

The change is impemented so the script can be run multiple times. This is helpful if we make changes later on again to a few dashboards we can rerun the script without having double changes.

Note: I initially wanted to go with the approach to parse the json and then replace fields. But it turned out the find / replace approach I use here works well, is simpler and also catches fields which I didn't think of at first.

@ruflin ruflin added the in progress Pull request is currently in progress. label Jan 10, 2019
@ruflin ruflin self-assigned this Jan 10, 2019
@ruflin ruflin requested review from a team as code owners January 10, 2019 16:06
@ruflin ruflin requested review from a team as code owners January 11, 2019 08:26
@ruflin ruflin changed the title [WIP] Replace fields in dashboards Introduce migration script for data in Kibana files Jan 11, 2019
@ruflin ruflin added the Team:Integrations Label for the Integrations team label Jan 11, 2019
@ruflin ruflin force-pushed the find-replace-dashboards branch 2 times, most recently from 40fb60d to 1d4e16b Compare January 11, 2019 08:38
@ruflin ruflin added the review label Jan 11, 2019
@ruflin ruflin mentioned this pull request Jan 11, 2019
@ruflin
Copy link
Member Author

ruflin commented Jan 11, 2019

@urso @exekias @jsoriano @andrewkroh @webmat This should be ready for a first round of review. So far I mainly tested with the system dashboards. There all looks find. Do you see some other data that also needs migration?

@simianhacker In the Kibana files I have a few more ids for which I was not sure if they should also be unique or if it's ok if they coexist with the 6.x Kibana files. An example looks as following:

 "series": [
    {
        "axis_position": "right",
        "chart_type": "line",
        "color": "#68BC00",
        "fill": 0.5,
        "formatter": "number",
        "id": "6984af11-4d5d-11e7-aa29-87a97a796de6",
        "label": "In Packetloss",
        "line_width": 1,
        "metrics": [
            {
                "field": "system.network.in.dropped",
                "id": "6984af12-4d5d-11e7-aa29-87a97a796de6",
                "type": "max"
            }
        ],

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code-wise it looks good to me, not so sure about the general idea.

For the migration from 6.x to 7.x it is necessary that 6.x and 7.x dashboards can coexist.

I guess there are not many alternatives, but I am afraid of that. I think it can be very confusing for users upgrading from 6.x to 7.x. There will be people with mixed versions for sure, at least during some time, and this has some potential problems for the user experience:

  • They will need to use two dashboards to have a complete overview
  • There can be mixed data in dashboards as index patterns and some fields are still shared (should we also add filters checking for the presence of ECS fields to avoid that?)
  • What links and fields will be used in the curated views?

It'd be great if Kibana could optionally fallback to aliases when a field is not found in a document.

Also, I'd hope to don't have so many breaking changes on dashboards on 8.x 👼 If we only wish to maintain the dashboards we will be reusing ids, but it doesn't make sense if they include a version. We could use a different qualifier for the ids not so coupled to the version like ecs. For the titles it is probably ok to have the versions as we can change them later.

"type": "tile_map"
}
},
"id": "ML-Nginx-Access-Map",
"id": "ML-Nginx-Access-Map-7x-7x",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated 7x?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I tried to prevent these double replaces. Need to check if it's just a left over and I need to migrate it again or if it is one of the edge cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was an edge case, hopefully fixed now.

rename_entries(titles)


def get_replacable_ids():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaceable?

script/kibana-migration.py Show resolved Hide resolved
script/kibana-migration.py Show resolved Hide resolved
script/kibana-migration.py Show resolved Hide resolved
@jsoriano
Copy link
Member

Do you see some other data that also needs migration?

apache2 renaming #9402 will also need migration if merged.

@ruflin
Copy link
Member Author

ruflin commented Jan 11, 2019

@jsoriano Let me elaborate on the migration path:

  • User has 6.x, upgrades to 7.x, has dashboards already loaded: This user will enable migration.enabled: true in the Beat and the 6.x dashboards will show both data, 6.x and 7.x
  • User has upgraded all Beats to 7.x. migration.enabled: false is set and the 7.x dashboards are loaded. All new data will now be in 7.x dashboards.
  • New users load the 7.x dashboards and have migration.enabled: false from the beginning and will never see the "old" fields or dashboards.

For your two other points:

  • For the links used in the curated view: Good questions, let's answer this when we get there.
  • Filtering for 7.x data only: I kind of like this idea but at the same time I'm worried about modifying all dashboards. We should check in more detail if 6.x data actually breaks the dashboards or it means only partial data is shown.

In general this PR should not be merged before all module changes to ECS are merged.#8655

@@ -17,20 +17,20 @@
}
}
},
"title": "Navigation",
"title": "Navigation-7x 7.x",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be some artifact of prior runs

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When running locally, it causes the same problem, so this seems to be a problem with the script as it is right now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a rerun. Made quite a bit of improvements during development which should have solved this one 👍

"savedSearchId": "bf3d23b0-d37c-11e7-9914-4982455b3063",
"title": "TLS Versions",
"savedSearchId": "bf3d23b0-d37c-11e7-9914-4982455b3063-7x",
"title": "TLS Version 7.xs 7.x",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one does seem to be an artifact of a previous run, however. I don't see this in my resulting file

ruflin added a commit to ruflin/beats that referenced this pull request Jan 15, 2019
The Kafka had the same id as the title which complicates the automatic conversion to 7.x dashboards, see elastic#9998. This is extracted from the migration PR to be able to easily reset the other data again when needed.
ruflin added a commit that referenced this pull request Jan 16, 2019
The Kafka had the same id as the title which complicates the automatic conversion to 7.x dashboards, see #9998. This is extracted from the migration PR to be able to easily reset the other data again when needed.
@ruflin ruflin removed the in progress Pull request is currently in progress. label Feb 4, 2019
@ruflin
Copy link
Member Author

ruflin commented Feb 4, 2019

@webmat @andrewkroh This is ready for review. It would be great if you could also skim through it to see if you see something that does not look ok. It also made changes to the packetbeat dashboards including some fields like method but the renaming looks ok.

@webmat For Winlogbeat I would not block this PR but if it gets in before merging this one, even better.

@ruflin
Copy link
Member Author

ruflin commented Feb 4, 2019

This PR changes all the id's of the dashboards. The Add Data UI in Kibana uses these id's to link to the dashboards. As soon as this PR is merged, we must update all these id's accordingly.

@elastic/beats @tsg @andrewkroh @elastic/infrastructure FYI

Copy link
Member

@andrewkroh andrewkroh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some reason GH isn't rendering all of the diff for me, but looking at it locally I see some changes to Packetbeat-{mongodb,mysql,pgsql,thrift}.json that aren't right w.r.t. the method changes. They still use method and not http.request.method.

The pgsql dashboard change of path to url.path shouldn't happen. In fact after checking the code, it doesn't produce a path field at all.

@ruflin
Copy link
Member Author

ruflin commented Feb 5, 2019

For the method I think the reason is:

- from: method
  to: http.request.method
  alias: false
  comment: Field is used by serveral protocols.
  beat: packetbeat

There "should" already be a check to not rename fields with alias: false but seems that does not work as expected.

For path to url.path I need to investigate further because I can't even find this in the ecs-migration.yml file.

@ruflin
Copy link
Member Author

ruflin commented Feb 5, 2019

@andrewkroh Could you check what your local commit id is? I couldn't find the problems you mentioned above anymore in the code. I know they were there before I pushed last time but they should have been fixed. The current id is 2afaa0b.

@andrewkroh
Copy link
Member

I deleted the branch that I pulled so I'm not sure what I was looking at, sorry. I must have been looking looking at one of the old commits.

But the diff at https://patch-diff.githubusercontent.com/raw/elastic/beats/pull/9998.diff looks good to me. I see no issues with Packetbeat.

With 7.x the data structure of the modules was changed to ECS. This has an affect on all Kibana files as the fields changed. For the migration from 6.x to 7.x it is necessary that 6.x and 7.x dashboards can coexist. To not overwrite dashboards in 7.x all Kibana files must have a changed id and to differentiate them in the UI, a different title.

This PR does the following changes:

* Reads all changed files from ecs-migration.yml and replaces these fields in all Kibana files. It also replaces fields inside queries and filters
* Reads all ids used in the Kibana files and postfixes these ids with `-7x`. It also replaces ids existing in links between dashboards.
* Reads all titles in the Kibana files and appends ` 7.x` to the titles.

The change is impemented so the script can be run multiple times. This is helpful if we make changes later on again to a few dashboards we can rerun the script without having double changes.

improve script

rerun for packetbeat

manual fix for kafka

fix wrong alias

update script to accept rename: false

checkout dashboards

delete dashboards

update migration script to use ECS

checkout again

Command to reset dashboards to master again to reapply. This should be run from inside the script directory.

```
git checkout master ../*/_meta/kibana/7/dashboard/*.json ../*/module/*/_meta/kibana/7/dashboard/*.json ../heartbeat/monitors/active/*/_meta/kibana/7/dashboard/*.json ../x-pack/*/module/*/_meta/kibana/7/dashboard/*.json
```

apply changes to all dashboards

reset dashboards

change all dashboards

reset dashboards

change dashboards

reset fileset.name and metricset.name change

remove heartbeat dashboards
Copy link
Member Author

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went through this PR and only found one place where things look odd. In RabbitMQ it seems titles and ID's overlap. I suggest we merge it as is and I follow up with a PR for the rabbitmq dashboards.

@@ -8,8 +8,8 @@
"filter": []
}
},
"savedSearchId": "Metricbeat-Rabbitmq",
"title": "Memory Usage [Metricbeat RabbitMQ]",
"savedSearchId": "Metricbeat-Rabbitmq-ecs ECS",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks strange. Need to double check later.

@@ -108,11 +108,11 @@
"fontSize": 60,
"handleNoResults": true
},
"title": "Rabbitmq-Number-of-Nodes",
"title": "Rabbitmq-Number-of-Nodes-ecs ECS",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems for the rabbitmq dashboard title and id's get mixed as they are identical.

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ruflin ruflin merged commit 7120870 into elastic:master Feb 5, 2019
@ruflin ruflin deleted the find-replace-dashboards branch February 5, 2019 21:47
@ruflin
Copy link
Member Author

ruflin commented Feb 5, 2019

This PR will need two follow ups:

  • Adjust the rabbitmq dashboards
  • Adjust values where we had a multiple

@ruflin
Copy link
Member Author

ruflin commented Feb 6, 2019

For the RabbitMQ dashboard here is the follow up PR: #10603

For the event.duration unit change I checked all the occurences. It seems @andrewkroh covered everything where it appears in packetbeat already and only found it once used to query in postgres. Most use cases are precentil aggregations so these should not change.

DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
With 7.x the data structure of the modules was changed to ECS. This has an affect on all Kibana files as the fields changed. For the migration from 6.x to 7.x it is necessary that 6.x and 7.x dashboards can coexist. To not overwrite dashboards in 7.x all Kibana files must have a changed id and to differentiate them in the UI, a different title.

This PR does the following changes:

* Reads all changed files from ecs-migration.yml and replaces these fields in all Kibana files. It also replaces fields inside queries and filters
* Reads all ids used in the Kibana files and postfixes these ids with `-7x`. It also replaces ids existing in links between dashboards.
* Reads all titles in the Kibana files and appends ` 7.x` to the titles.

The change is impemented so the script can be run multiple times. This is helpful if we make changes later on again to a few dashboards we can rerun the script without having double changes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants