
Added the ability to pull multiple paths #3

Open
r32rtb wants to merge 16 commits into master

Conversation

@r32rtb commented Feb 24, 2020

I verified that the same data is pulled between the different versions. I also added a sigsci-API attribute to key on for output filters. I think I updated everything that needed to be updated. I'm new to Ruby, so hopefully it isn't too screwed up... It works for my use case. For some reason both versions are giving me a _jsonparsefailure, but linting the JSON output doesn't indicate any issues, so I'm thinking it might be the installation on my machine.

@foospidy (Owner)

Thanks for this! I ran some tests with my setup and am getting errors. The first run was with my existing index and produced the following:

2020-02-27T16:03:23,695][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"%{[message][id]}", :_index=>"signalsciences", :routing=>nil, :_type=>"doc"}, #<LogStash::Event:0x66280d5f>], :response=>{"index"=>{"_index"=>"signalsciences", "_type"=>"doc", "_id"=>"%{[message][id]}", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"Could not dynamically add mapping for field [host.cpu]. Existing mapping for [host] must be of type object but found [text]."}}}}

I then tried it with a new index and still got a similar error:

2020-02-27T16:08:26,205][WARN ][logstash.outputs.elasticsearch][main] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>"%{[message][id]}", :_index=>"sigsci_pr", :routing=>nil, :_type=>"doc"}, #<LogStash::Event:0x40e183a0>], :response=>{"index"=>{"_index"=>"sigsci_pr", "_type"=>"doc", "_id"=>"%{[message][id]}", "status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper [agent.latency_time_50th] cannot be changed from type [long] to [float]"}}}}
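One way to work around the long-vs-float conflict on the Logstash side is to coerce the field before indexing. A minimal sketch, assuming only the field name from the error above; everything else here is illustrative, not the PR's actual config:

    filter {
      mutate {
        # Coerce the agent latency metric to float so the first document
        # indexed does not lock the Elasticsearch mapping to long
        convert => { "[agent][latency_time_50th]" => "float" }
      }
    }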

Hopefully the errors above help indicate what needs fixing. I'll try to take a closer look at some point too.

My other feedback would be to use the term "endpoints" instead of "paths" as that more accurately describes what this is for.

@r32rtb (Author) commented Feb 28, 2020 via email

@r32rtb (Author) commented Feb 28, 2020 via email

@r32rtb (Author) commented Feb 28, 2020

Adjusted "path" to "endpoint", added a config example to remove dots from the host.* fields for ingest into Elastic, and added the ability to have alternate sites in the endpoints hash... Let me know if you have any additional issues or need additional changes. Version 1.3.0...
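For reference, dotted field names like host.cpu can be turned into nested fields with Logstash's de_dot filter plugin. A minimal sketch of that approach; this is an illustration, not necessarily the exact example config the PR added:

    filter {
      de_dot {
        # Rewrite fields like "host.cpu" into nested [host][cpu] so
        # Elasticsearch can map [host] as an object rather than text
        nested => true
      }
    }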

@foospidy (Owner) commented Mar 2, 2020

Thanks for the updates! I hit a couple of bumps when trying it again. First, in the conf file, the config for "paths" should be renamed to "endpoints". I needed to change that to get past the error. Second, I haven't figured this one out, but the error is:
2020-03-02T14:17:30,435][ERROR][logstash.filters.mutate ] Unknown setting 'covert' for mutate
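The 'covert' in that error is a typo for the mutate filter's convert option. The corrected form looks like this; the field name and target type are placeholders, not taken from the PR's conf file:

    filter {
      mutate {
        # "convert", not "covert"; field and type are placeholders
        convert => { "[some][field]" => "integer" }
      }
    }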

@r32rtb (Author) commented Mar 2, 2020

I also made the message field a JSON blob so it can be sent as JSON instead of a hash and be parsed as JSON in a Logstash filter. It should be in the example conf file now. One thing I'm thinking about is adding a tag for the site... I have 8 sites and it might be nice to have that capability...
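Parsing that JSON blob back into structured fields would look roughly like this with Logstash's json filter; a minimal sketch, assuming the blob lives in the message field as described above:

    filter {
      json {
        # Parse the JSON blob in the message field into event fields
        source => "message"
      }
    }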

@foospidy (Owner)
Sorry for the delay. I've been trying to work through an issue where only one record is appearing in Elasticsearch. I think I've determined why, but haven't figured out the fix yet. In the config file, the output section specifies the field [message][id] to deduplicate data. However, it's being interpreted as a string literal, so only one row is imported into Elasticsearch since it's the same value for every row. Here's a screenshot:
[screenshot]

I'm still working to resolve this, but let me know if you have any ideas.

@foospidy (Owner) commented Mar 10, 2020

I changed the config to use document_id => "%{id}" and that seems to resolve it for the requests feed, but that doesn't work for the agent data. Perhaps both data sets should not be in the same index? Was this setup working for you? Not sure if I'm doing something wrong.
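The change described above looks roughly like this in the elasticsearch output; the hosts value is a placeholder, and this assumes the JSON blob has already been parsed so that id exists as a top-level event field:

    output {
      elasticsearch {
        hosts       => ["elasticsearch:9200"]
        # Deduplicate request records by their id field; %{id} is resolved
        # per event, unlike the literal [message][id] string above
        document_id => "%{id}"
      }
    }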

@r32rtb (Author) commented Mar 11, 2020

I don't have them going into the same index, so I didn't have that problem. I am just using the tags to determine which index to use, since I use a different timestamp field for the requests and there isn't one for the agent status. I also didn't do the document_id deal... There shouldn't be duplicates since the code now sets @timestamp_from = @timestamp_until. The original code looks like it just recalculated the interval every loop, if I'm reading it right.

A sample of my output:

    output {
      if "sigsci_agents" in [tags] {
        elasticsearch {
          hosts              => ["elasticsearch:9200"]
          ilm_rollover_alias => "sigsci-agents"
          ilm_pattern        => "000001"
          ilm_policy         => "Rollover_50GB"
          ilm_enabled        => "true"
        }
      }
      if "sigsci_feed_requests" in [tags] {
        elasticsearch {
          hosts              => ["elasticsearch:9200"]
          ilm_rollover_alias => "sigsci-feed-requests"
          ilm_pattern        => "000001"
          ilm_policy         => "Rollover_50GB"
          ilm_enabled        => "true"
        }
      }
    }
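As a companion to the output above, a date filter along these lines could set @timestamp from the requests feed's own timestamp; the field name "timestamp" and its format are assumptions, since the thread doesn't show the requests feed schema:

    filter {
      if "sigsci_feed_requests" in [tags] {
        date {
          # Assumed field name and format; the requests feed carries its
          # own timestamp while the agent status feed does not
          match  => ["timestamp", "ISO8601"]
          target => "@timestamp"
        }
      }
    }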

@foospidy (Owner)
Sorry for the delayed response. That makes sense now. My config was, for the most part, based on the original. I think you've filled in the gaps I had for this. I'll test again using your config setup. Thanks!
