Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce aliases for 7.x fields in 6.x #9283

Merged
merged 3 commits into from
Dec 12, 2018
Merged

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented Nov 29, 2018

This introduces for aliases for fields which will exist in 7.x and a 1-1 mapping is possible. Having this fields already around makes sure we do not use them otherwise.

@ruflin ruflin added in progress Pull request is currently in progress. ecs Team:Integrations Label for the Integrations team labels Nov 29, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/infrastructure

@ruflin
Copy link
Member Author

ruflin commented Nov 29, 2018

@graphaelli I wonder if it would be helpful to start a similar PR against 6.x in apm-server to see if all the 1-1 alias do not already have an other field. If it does, CI will complain.

@ruflin ruflin mentioned this pull request Nov 29, 2018
@jsoriano
Copy link
Member

Will these fields appear in 6.x documentation without descriptions? do we want that?

There would be an option to hide them in docs if we are only adding them to not use them?

@graphaelli
Copy link
Member

@ruflin already in progress, I'll add you as reviewer

@ruflin
Copy link
Member Author

ruflin commented Nov 29, 2018

I think I need to backport this one #9283 and probably one more for the docs to look properly.

@ruflin ruflin added review and removed in progress Pull request is currently in progress. labels Dec 10, 2018

--

*`beat.hostname`*::

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused by the semantics here. Is beat.hostname the alias, or the field the alias will point to? Reading the above:

   - name: host.hostname
      type: alias
      path: beat.hostname

makes me think that host.hostname is the alias that is going to point to the real field beat.hostname. I know the ECS field name is host.hostname. Is the plan in 6.x it'll be the alias, and in 7.x, it'll be the real field when the mapping support many : 1 (the 1 being the ECS field, the many being all variations that are used across all the beats).

Can you let me know the specific field that will require the many : 1 mapping? I thought I recalled Boaz saying something about filebeat, but looking at the fields (well at least the ones with host in them) I don't see where the issue specifically is. Just looks like all the beats have beat.hostname which would map to host.hostname, which should still be 1:1? Even though beat.hostname is a field in many different indices... ?

Sorry for so many questions, just trying to wrap my head around all this as I think about what should be tested!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fields that are aliased here are for 6.x and are only 1:1 mappings to make them compatible with 7.x data for Logging UI and Infrastructure UI.

For the many:1 mappings which are only introduce in 7.x you can find some examples here:

This form apache and nginx and you will see that these have many fields in common and the old fields are mapped to the same new fields.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stacey-gammon Here's my attempt at clarifying as well.

For 6.x and for 7.x, the aliases are flipping direction indeed.

6.x: new fields introduced as aliases to the old field they will eventually replace. This is for forward compatibility. We will start being able to depend on the new field names in the later releases in the 6.x line for those. This is only applicable for 1:1 mappings, however (explained below).

7.x: For all fields being renamed (1:1 or many to 1), the ECS field is now the canonical field holding the data, and old fields alias to the ECS field.

Many to 1 mappings: Filebeat modules are indeed where this is happening the most, but there may be other places. All modules used to nest their fields under a namespace [module name].[dataset].[field]. In many cases, different modules had perfectly equivalent fields: apache2.access.url and nginx.access.url. These two examples are becoming url.original. In 6.x we can't create a forward compatible alias to all of these values, as the alias can only point to one field. However in 7.x we'll have all of these old fields be aliases to the new canonical field. Nothing prevents having many alias fields pointing to the same field.

Copy link
Member

@graphaelli graphaelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Planning to introduce an ecs-migration.yml log too?

libbeat/_meta/fields.common.yml Show resolved Hide resolved
Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ok overall. Here's a few points, though:

  • ECS doesn't have cloud.project.id
  • Please review my question about beat.name & agent.type. I don't think this is the right mapping.

Not a blocker for this PR, but we must address for 6.6:

# Metricbeat fields
- to: event.module
from: metricset.module
index: metricbeat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realize we had both fileset.module and metricset.module becoming event.module. But I see you've treated them as two 1:1 mappings, in different indices. Good stuff.

libbeat/_meta/fields.common.yml Show resolved Hide resolved
@webmat
Copy link
Contributor

webmat commented Dec 11, 2018

Gil makes good points with some of the missing aliases:

  • beat.timezone ~ event.timezone
  • docker.container.id ~ container.id

Looking at the list of libbeat common fields, I've spotted a few more:

  • beat.version ~ agent.version
  • docker.container.labels ~ container.labels

One we can't tackle because it's an object is fields ~ labels, unfortunately.

@graphaelli
Copy link
Member

@webmat container.labels has the same problem as labels right? in apm-server, we planned to duplicate those in 6.x - that is write them to both places.

agent.version means something different in APM, if you're going to add this I request that it be done per beat.

@webmat
Copy link
Contributor

webmat commented Dec 11, 2018

@graphaelli Ah you're right about container.labels. I had a lapse and was thinking about the semantics of tags (array of keywords).

@ruflin
Copy link
Member Author

ruflin commented Dec 11, 2018

  • cloud.project.id is not in ECS and that is ok
  • I added container.id.
  • I will not add container.labels as they are not used by Infrastructure & Logging UI. I will leave it to apm-server to handle this.
  • beat.version was left out on purpose as we don't use it anywhere. And now that I see it could cause issues, event better.
  • beat.timezone -> event.timezone added
  • I removed beat.name mapping as I agree it's the wrong one here and we don't need to migrate it.

Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, once the build is green 👍

# Beat fields
- from: beat.name
to: agent.type
alias6: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flip to alias6: false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good find. Remove it and added event.timezone and container.id instead as it was missing.

@webmat
Copy link
Contributor

webmat commented Dec 11, 2018

Still looking good after the rebase, but I noticed the alias6 on beat.name. Otherwise all good

- from: beat.name
to: agent.type
alias6: true
alias: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. This whole section needed to go, not just the alias6 :-)

@ruflin
Copy link
Member Author

ruflin commented Dec 12, 2018

jenkins, test this

@ruflin
Copy link
Member Author

ruflin commented Dec 12, 2018

Seems like the LS integration tests were flaky which I haven't seen before. Retrying.

@ruflin ruflin force-pushed the introduce-aliases branch 2 times, most recently from e61220a to dffc8c0 Compare December 12, 2018 13:38
@ruflin
Copy link
Member Author

ruflin commented Dec 12, 2018

jenkins, test this

This introduces for aliases for fields which will exist in 7.x and a 1-1 mapping is possible. Having this fields already around makes sure we do not use them otherwise.

need at least 6.4 for testing trying out if snapshot trick works

reset docker compose

update more fields.go files

rebase to get 6.4 environment

remove not needed fields

 add migraton field

update fields.yml again

update fields

rebase

update migration document

add alias docs
Copy link
Contributor

@webmat webmat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase still looks good

@ruflin
Copy link
Member Author

ruflin commented Dec 12, 2018

The failures are very similar to before so I'm starting to guess it's related. Unfortunately so far I couldn't reproduce it locally.

@ruflin
Copy link
Member Author

ruflin commented Dec 12, 2018

jenkins, test this

@ruflin ruflin merged commit c63c98b into elastic:6.x Dec 12, 2018
@ruflin ruflin deleted the introduce-aliases branch December 12, 2018 19:33
@ruflin
Copy link
Member Author

ruflin commented Dec 12, 2018

Strangely now all tests except a new flaky one went green. Merged for now but will keep an eye on it.

DStape pushed a commit to DStape/beats that referenced this pull request Aug 20, 2019
This introduces for aliases for fields which will exist in 7.x and a 1-1 mapping is possible. Having this fields already around makes sure we do not use them otherwise.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ecs review Team:Integrations Label for the Integrations team v6.6.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants