-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify fields.yml generator #7713
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the quick turn-around.
libbeat/Makefile
Outdated
@@ -6,4 +6,4 @@ include scripts/Makefile | |||
|
|||
# Collects all dependencies and then calls update | |||
.PHONY: collect | |||
collect: libbeat_fields | |||
collect: fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason not to give update
a direct dependency on fields
?
libbeat/scripts/Makefile
Outdated
@@ -309,13 +309,9 @@ coverage-report: | |||
fields: mage | |||
mage -v fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you set this line to @mage fields
. I was debugging when changed it to its current state and didn't undo my changes.
I notice a missing newline from one of the commands. Maybe you can track it down.
|
} | ||
|
||
if len(beatFieldsPaths) == 0 { | ||
if len(beatFieldsPaths) == 0 && *esBeatsPath == *beatPath { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What condition is *esBeatsPath == *beatPath
checking for?
Is this an assertion that the command is not being run from the elastic/beats
dir? If so, comparing the two path strings without normalization isn't advisable; at a minimum I would filepath.Clean
them. Using os.SameFile
would be the most reliable why to check that condition, but why event check it at all.
} | ||
|
||
err = fields.Generate(*esBeatsPath, *beatPath, fieldsFiles) | ||
err := fields.Generate(*esBeatsPath, *beatPath, fieldsFiles) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about internalizing the collection of module files to the fields.Generate()
. Make it accept a list of module directories like Generate(elasticBeatsDir, beatDir string, moduleDirs ...string) ([]byte, error)
.
Note that I added a []byte
return which is the actual content. Then the caller becomes responsible for writing the content which makes it a bit more reusable. I think the YAML content is small enough to be rendered to memory without any issues.
libbeat/generator/fields/fields.go
Outdated
commonFields = append( | ||
[]string{ | ||
filepath.Join(esBeatsPath, "libbeat", "_meta", "fields.common.yml"), | ||
filepath.Join(beatPath, "_meta", "fields.yml"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think _meta/fields.yml is not used anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found a place that has some old references to fields files that needs updated to point to plain old fields.yml.
beats/libbeat/tests/system/beat/beat.py
Lines 462 to 496 in 5443c8a
if fields_doc is None: | |
fields_doc = self.beat_path + "/_meta/fields.generated.yml" | |
def extract_fields(doc_list, name): | |
fields = [] | |
dictfields = [] | |
if doc_list is None: | |
return fields, dictfields | |
for field in doc_list: | |
# Skip fields without name entry | |
if "name" not in field: | |
continue | |
# Chain together names | |
if name != "": | |
newName = name + "." + field["name"] | |
else: | |
newName = field["name"] | |
if field.get("type") == "group": | |
subfields, subdictfields = extract_fields(field["fields"], newName) | |
fields.extend(subfields) | |
dictfields.extend(subdictfields) | |
else: | |
fields.append(newName) | |
if field.get("type") in ["object", "geo_point"]: | |
dictfields.append(newName) | |
return fields, dictfields | |
# Not all beats have a fields.generated.yml. Fall back to fields.yml | |
if not os.path.isfile(fields_doc): | |
fields_doc = self.beat_path + "/_meta/fields.yml" |
These lines should be removed now: beats/dev-tools/jenkins_ci.ps1 Lines 27 to 30 in 86dc973
|
@@ -73,7 +73,7 @@ test-module: python-env update metricbeat.test | |||
. ${PYTHON_ENV}/bin/activate && INTEGRATION_TESTS=1 nosetests tests/system/test_${MODULE}.py | |||
|
|||
.PHONY: assets | |||
assets: libbeat_fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I don't think Metricbeat is free from this dependency yet due to the way that the fields.go files are composed.
One solution would be to include both metricbeat common fields and libbeat fields in metricbeat/include/fields.go
. This could be generated by running global_fields
without passing in the module
dir as an argument.
It's be nice if we could pipe the output of one command into another like:
go run ${ES_BEATS}/libbeat/scripts/cmd/global_fields/main.go -es_beats_path ${ES_BEATS} -beat_path ${BEAT_NAME} | go run ${ES_BEATS}/dev-tools/cmd/asset/asset.go -pkg include -in - -out include/fields.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hope this becomes obsolete again in the near future :-)
@elastic/apm-server FYI, this change is happening in master. |
os.Exit(2) | ||
if len(beatFieldsPaths) == 0 && os.SameFile(esBeatsInfo, beatInfo) { | ||
if !*toStdout { | ||
fmt.Println("No field files to collect") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might consider using the stdlib log
package in the file and the util packages. Just initialize it with log.SetFlags()
to remove the timestamps. All messages will go to stderr all the time.
And all the places where it logs and exits can use a log.Fatal*
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather keep it as it is right now, because this is the "idiomatic" Beats way when printing to console during scripts or commands. I see that you started to use log
in mage files, but apart from that fmt
package is used all over the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewkroh Wasn't aware of this feature. We definitively should start using this in the future.
} | ||
esBeatsInfo, err := esBeats.Stat() | ||
if err != nil { | ||
fmt.Fprintf(os.Stderr, "Error getting file info of elastic/beat: %+v\n", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elastic/beats
(It's missing the s
.)
@@ -29,6 +29,7 @@ import ( | |||
func main() { | |||
esBeatsPath := flag.String("es_beats_path", "..", "Path to elastic/beats") | |||
beatPath := flag.String("beat_path", ".", "Path to your Beat") | |||
toStdout := flag.Bool("stdout", false, "Write output to stdout") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I'd prefer to standardize on using -out
to specify the output file in our CLI tools. And make the default be -
mean output to os.Stdout
like most unix tools. Then you get the same behavior across all tools. The downside being that you have to write -out fields.yml
in the common case, but the common case is almost always done via make
so it's not actually a problem.
libbeat/generator/fields/fields.go
Outdated
var err error | ||
if !isLibbeat(beatPath) { | ||
commonFields = append(commonFields, | ||
filepath.Join(esBeatsPath, "libbeat", "_meta", "fields.common.yml"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW filepath.Join
always calls filepath.Clean
on the result. https://golang.org/pkg/path/filepath/#Join
This means you can write filepath.Join(esBeatsPath, "libbeat/_meta/fields.common.yml")
which is more clear IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Can you rebase this on master to get my changes to fix go test
on Windows.
fmt.Fprintf(os.Stderr, "Cannot generate global fields.yml for %s: %+v\n", name, err) | ||
os.Exit(2) | ||
if len(beatFieldsPaths) == 0 && os.SameFile(esBeatsInfo, beatInfo) { | ||
if *output != "-" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be *input
?
dev-tools/cmd/asset/asset.go
Outdated
|
||
func init() { | ||
pkg = flag.String("pkg", "", "Package name") | ||
input = flag.String("in", "-", "Source of input. \"-\" means reading from stdin") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to switch this file over to flag.StringVar
so we don't have to dereference string pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kvch Want to do this in this PR or follow up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I forgot to push this commit.
* `libbeat/fields.yml` is not a dependency anymore * intermediary files are not written Closes elastic#7671
9cac467
to
6ff6069
Compare
Rebased and refactored. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change LGTM, only a few minor questions.
@@ -71,7 +71,7 @@ func CollectModuleFiles(modulesDir string) ([]*YmlFile, error) { | |||
func CollectFiles(module string, modulesPath string) ([]*YmlFile, error) { | |||
|
|||
var files []*YmlFile | |||
fieldsYmlPath := filepath.Join(modulesPath, module, "_meta", "fields.yml") | |||
fieldsYmlPath := filepath.Join(modulesPath, module, "_meta/fields.yml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason you introduced the /
here? Could break on Windows. Any issues with the previous option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Andrew pointed out that it is working everywhere in #7713 (comment)
@@ -91,7 +91,7 @@ func CollectFiles(module string, modulesPath string) ([]*YmlFile, error) { | |||
if !s.IsDir() { | |||
continue | |||
} | |||
fieldsYmlPath = filepath.Join(modulesPath, module, s.Name(), "_meta", "fields.yml") | |||
fieldsYmlPath = filepath.Join(modulesPath, module, s.Name(), "_meta/fields.yml") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question as above.
if err != nil { | ||
fmt.Fprintf(os.Stderr, "Cannot generate global fields.yml for %s: %+v\n", name, err) | ||
os.Exit(2) | ||
if len(beatFieldsPaths) == 0 && os.SameFile(esBeatsInfo, beatInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to follow what this does. Could you quickly elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a community Beat does not have its ownfields.yml
file, it still requires the fields coming from libbeat to generate e.g assets. In case of Elastic Beats, it's not a problem because all of them has unique fields.yml files somewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the details. Could you perhaps add a comment to the line to make sure if we refactor this in the future we don't get rid of it?
@@ -73,7 +73,7 @@ test-module: python-env update metricbeat.test | |||
. ${PYTHON_ENV}/bin/activate && INTEGRATION_TESTS=1 nosetests tests/system/test_${MODULE}.py | |||
|
|||
.PHONY: assets | |||
assets: libbeat_fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hope this becomes obsolete again in the near future :-)
@@ -1,2083 +0,0 @@ | |||
- key: common |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this was a left over file that should already have been deleted before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. It is a duplication of packetbeat/_meta/fields.common.yml
with fields which coming from protos
. Nothing used this file at all.
@kvch Thanks, could you follow up with the @elastic/apm-server team? |
already aware, thnks ;) |
This PR simplifies the process of generating a new `fields.yml` file for a Beat. * `libbeat/fields.yml` is not a dependency anymore * only `fields.yml` is written to disk Changes in the asset files are due to different number of newlines in generated `fields.yml` files. ### Interface changes of generator tools Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout. This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets. Closes elastic#7671 (cherry picked from commit 5797005)
This PR simplifies the process of generating a new `fields.yml` file for a Beat. * `libbeat/fields.yml` is not a dependency anymore * only `fields.yml` is written to disk Changes in the asset files are due to different number of newlines in generated `fields.yml` files. ### Interface changes of generator tools Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout. This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets. Closes #7671 (cherry picked from commit 5797005)
This PR simplifies the process of generating a new `fields.yml` file for a Beat. * `libbeat/fields.yml` is not a dependency anymore * only `fields.yml` is written to disk Changes in the asset files are due to different number of newlines in generated `fields.yml` files. ### Interface changes of generator tools Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout. This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets. Closes elastic#7671 (cherry picked from commit 5797005)
* Simplify fields.yml generator (#7713) This PR simplifies the process of generating a new `fields.yml` file for a Beat. * `libbeat/fields.yml` is not a dependency anymore * only `fields.yml` is written to disk Changes in the asset files are due to different number of newlines in generated `fields.yml` files. ### Interface changes of generator tools Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout. This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets. Closes #7671 (cherry picked from commit 5797005) * update fields of Packetbeat
…ic#7777) * Simplify fields.yml generator (elastic#7713) This PR simplifies the process of generating a new `fields.yml` file for a Beat. * `libbeat/fields.yml` is not a dependency anymore * only `fields.yml` is written to disk Changes in the asset files are due to different number of newlines in generated `fields.yml` files. ### Interface changes of generator tools Both `dev-tools/cmd/asset/asset.go` and `libbeat/scripts/cmd/global_fields/main.go` is changed. The first one has two new flags: `-in`, and `-out`. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag `-out` which specifies the output file. If it's not set, the output is written to stdout. This makes it possible to generate asset files for Metricbeat without writing `libbeat/fields.yml` to disk. It will be useful in the future when more Beats have modular assets. Closes elastic#7671 (cherry picked from commit ec2e022) * update fields of Packetbeat
This PR simplifies the process of generating a new
fields.yml
file for a Beat.libbeat/fields.yml
is not a dependency anymorefields.yml
is written to diskChanges in the asset files are due to different number of newlines in generated
fields.yml
files.Interface changes of generator tools
Both
dev-tools/cmd/asset/asset.go
andlibbeat/scripts/cmd/global_fields/main.go
is changed. The first one has two new flags:-in
, and-out
. By default it reads from stdin and writes to stdout. Using the flags it's possible to read and write from other files. The second one has one new flag-out
which specifies the output file. If it's not set, the output is written to stdout.This makes it possible to generate asset files for Metricbeat without writing
libbeat/fields.yml
to disk. It will be useful in the future when more Beats have modular assets.Closes #7671