Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding initial spec files #1

Merged
merged 45 commits into from
Aug 5, 2020
Merged

Adding initial spec files #1

merged 45 commits into from
Aug 5, 2020

Conversation

ycombinator
Copy link
Contributor

@ycombinator ycombinator commented Jul 28, 2020

This PR adds specification files for the initial package specification (format_version = 1.0.0, which is what all packages today are using). The specification files describe:

  1. the expected folder structure of packages, including expected files within these folders; and
  2. the expected structure of the expected files' contents.

For 1. above, a schema very similar to JSON Schema is used (but written as YAML for readability), with two notable differences:

  • The type field can be either folder or file, and
  • A new field, contents is introduced to (recursively) describe the contents of folders (i.e. when type == folder).

For 2. above, JSON Schema is used (but, again, written as YAML for readability).

Additionally, this PR adds a README.md file explaining the specification format.

Resolves #2.

Author's TODO

  • Flesh out package folders+expected files specification (1. above)
  • Flesh out specification for expected files' contents (2. above)
    • Main package manifest file
    • Dataset manifest file
    • Various dataset fields files
    • CHANGELOG file
  • Document specification format in README

@ycombinator ycombinator marked this pull request as ready for review July 30, 2020 18:24
@ycombinator ycombinator requested review from ruflin and mtojek July 30, 2020 18:24
type: string
enum:
- custom
default_field:
Copy link
Contributor Author

@ycombinator ycombinator Jul 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathan-buttner I saw this default_field field used here https://github.com/elastic/package-registry/blob/98d7184cfe5b1d271e1bef848a5720073959888e/testdata/package/ecs_style_dataset/0.0.1/dataset/foo/fields/fields.yml#L16. Could you tell me what it is meant for so I could appropriately describe it in this schema here? Thanks!

Copy link
Collaborator

@ruflin ruflin Jul 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator I think we got rid of default_field and it was replaced by show_user and some others. So this is only here because it was not cleaned up. Found it also in other places.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, ignore my comment above, just realised this is fields.yml.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can ignore this one too I think:

It is outputted by the ECS beats tooling:
https://github.com/elastic/ecs/blob/master/scripts/generators/beats_default_fields_whitelist.yml

https://github.com/elastic/ecs/blob/master/generated/beats/fields.ecs.yml#L77

I'm not even really sure what it is used for 🤷

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin @mtojek Do you have any idea what the purpose of this default_field is?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea at all, I think you can safely ignore/remove this field from spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I will remove it from the spec, similar to the level and group fields which were removed in dd76b6d.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 396721e.

Copy link
Contributor

@masci masci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Versioning looks very good to me, thanks for working on it!

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@ruflin
Copy link
Collaborator

ruflin commented Aug 3, 2020

@ycombinator The registry running under epr-epxerimental.elastic.co is for 7.8. It already contains the spec format 1.0.0 (which of course is wrong). Packages under epr.elastic.co which is for >=7.9 also contains 1.0.0 even though the two package formats are not compatible. In this case, it does not matter as the format is not read out and does not break anything. What I'm getting at with my above comment, is that there is a chance the the format we ship in 7.9 is not the final one and we do a breaking change again with the same trick we did for 7.8 with a separate registry. What this means for the spec is that even though you defined 1.0.0 above and it could imply all follow up changes are not breaking, it is likely not the case as what we have today in packages is not necessarly 1.0.0.

All of this should not hold back this PR from merging.

Co-authored-by: Massimiliano Pippi <mpippi@gmail.com>
@ycombinator
Copy link
Contributor Author

ycombinator commented Aug 3, 2020

Thanks for the explanation, @ruflin.

I agree that the breaking changes should not hold back this PR. However, it means that we can't use the 1.0.0 spec for validating packages just yet, until all breaking changes have "settled". I have already labeled this spec version as 1.0.0-alpha1 in the versions/1/changelog.yml file and in the version field of the versions/1/spec.yml file. Additionally I will add a warning to the top of the README stating that the spec is under active development and not ready for use just yet. Once the breaking changes have settled we can remove the alpha1 label, remove the warning from the README, and start using the spec for validating packages. Needless to say, it would be good to get to this state sooner rather than later.


I looked through the comments in this PR and there are a few that are awaiting your responses. I've gathered them below for your convenience. If you could take a few minutes to respond to them, it would help move along this PR. Thanks!

description: Base fields definitions
type: file
name: "base-fields.yml"
required: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm can we make this required: false the endpoint package doesn't include this file. We include the fields directly in fields.yml. Should we allow the package developer to call the file whatever they want? Would it be easy to just validate that the directory isn't empty and that the file conforms to the fields.spec.yml?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm happy to change this to whatever matches the actual usage/parsing of these files. I don't know what that is, though! Are the files merged together somehow? If so, is there some ordering/precedence taken into account while merging?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file can have any name and merging order should not matter, at least so far.

Copy link
Contributor Author

@ycombinator ycombinator Aug 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 076597f.

versions/1/dataset/manifest.spec.yml Show resolved Hide resolved
versions/1/dataset/manifest.spec.yml Show resolved Hide resolved
@ruflin
Copy link
Collaborator

ruflin commented Aug 5, 2020

@jonathan-buttner One I couldn't answer in line (not sure why) was around integrations vs solution. We introduced solutions as in the beginning we expected to never show the endpoint package to users but this changed. So having it as integrations now makes sense and we can get rid of a an option.

@ycombinator I have not been very good in following up on all comments, but please do not block on me to get this in. There will be breaking changes in this spec for the upcoming dataset -> data_stream change. The part I'm most looking forward to is start validating packages against it and iterate on it. As soon as we get there, we should challenge again every single entry in the spec.

Ship it.

@ycombinator
Copy link
Contributor Author

Many thanks to @jonathan-buttner, @mtojek, @masci, and @ruflin for reviewing this PR so thoroughly! I believe all your concerns have been addressed at this point.

The initial spec in this PR may not be 100% accurate yet but I think it's in decent enough shape to get us started. I fully expect us to iterate on this in the coming weeks, especially once we start validating packages against it!

To indicate that this initial spec is not quite ready for general consumption, I have done two things:

  • versioned the spec as 1.0.0-alpha1, and
  • added a warning at the top of the README

With that, I'm planning to merge this PR and then make this repository public later today. If anyone has any last-minute concerns please bring them up within the next few hours. Thank you!

Copy link
Collaborator

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@@ -138,5 +138,35 @@ spec:
- title
- description
- input
# https://github.com/elastic/package-registry/blob/v0.8.0/util/dataset.go#L48
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it would be safer to refer through a commit id.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually considering removing these comments. I had put them in there for my own benefit to make sure I had accounted for most of the struct fields in the spec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in ac61112.

@mtojek mtojek self-requested a review August 5, 2020 11:25
Copy link
Contributor

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it! I wonder if there is a way to enforce required fields in the spec. It might be part of the validation library.

Side note: if they are required to be present in all definitions, why don't we keep them aside from fields.yml

@ycombinator ycombinator merged commit 78fa992 into elastic:master Aug 5, 2020
@ycombinator ycombinator deleted the init-spec branch August 5, 2020 23:50
rw-access pushed a commit to rw-access/package-spec that referenced this pull request Mar 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create initial specification files
5 participants