[Filebeat] - validate files before harvesting #40151

Closed
VihasMakwana opened this issue Jul 9, 2024 · 4 comments

@VihasMakwana
Contributor

VihasMakwana commented Jul 9, 2024

Current Issue

  • This issue exists if we enable prospector.scanner.symlinks.
  • If we enable prospector.scanner.symlinks and try to ingest a symlink whose target is a non-regular file (symlink -> non-regular file), the non-regular file will still be ingested.
  • This shouldn't happen; we should validate the file the symlink points to as well.

Describe the enhancement:

  • Currently, we validate a file here and only harvest regular files (i.e. NOT directories, pipes, sockets, devices, etc.).
  • I propose moving this validation step to before harvesting, i.e. here.

Describe a specific use case for the enhancement or feature:

@VihasMakwana changed the title from "[Filebeat] - validate files before trying to ingest them" to "[Filebeat] - validate files before harvesting" on Jul 9, 2024
@botelastic (bot) added the needs_team label on Jul 9, 2024
@VihasMakwana self-assigned this on Jul 9, 2024
@VihasMakwana added the Team:Elastic-Agent-Data-Plane label and removed the needs_team label on Jul 9, 2024
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@pierrehilbert
Collaborator

@rdner / @belimawr could you please have a look here to share your thoughts?

@belimawr
Contributor

belimawr commented Jul 9, 2024

@VihasMakwana, I'm a bit confused here... We check whether a file is a regular file right before opening it, which is indeed after resolving the symlink. But every single file we open for reading goes through the validation at

    err = checkFileBeforeOpening(fi)
    if err != nil {
        return nil, nil, false, err
    }

which checks the file mode:

    func checkFileBeforeOpening(fi os.FileInfo) error {
        if !fi.Mode().IsRegular() {
            return fmt.Errorf("tried to open non regular file: %q %s", fi.Mode(), fi.Name())
        }
        return nil
    }

So even if the fileWatcher finds a non-regular file behind a symlink and returns it, filestream.openFile will still validate it by calling checkFileBeforeOpening, and the file won't be harvested if it is not a regular file.

Regarding reporting the input status as degraded in this case, I believe it is the correct behaviour. The user has configured Filestream to ingest something that is not a regular file, thus the user should be notified of their error and the input should stay degraded until this is fixed.

We just need to make sure the message returned to the user is clear enough that they can understand it and act on it.

@VihasMakwana
Contributor Author

@belimawr thanks for sharing your thoughts. I agree with you.

Closing this.
