allow user to manually set desired hostname #260

Closed
wants to merge 1 commit into from

Conversation

AntonioMeireles

Hi, the attached patch attempts to fix an issue (especially inside Docker containers) where logstash-forwarder too often relays data with the short hostname as the source. An alternative approach would be to just call hostname -f, but here I kept exclusively to Go's own machinery in order to keep platform portability.

@driskell
Contributor

driskell commented Sep 1, 2014

I'm not fully sure, but for "hostname" there is no concept of "short" vs "full". Hostname just returns the machine's name. If that name happens to be set to the short name then that is what will be returned, because that's the machine's name.

Maybe you just need to fix the Docker container's hostname in /etc/sysconfig/network or wherever it is set?

I remember there being a debate on whether the machine name should be the short or full name. In general I tend to set it to the full name. Found a snippet here but it seems to be completely down to preference, and different vendors recommend different things.
http://serverfault.com/questions/331936/setting-the-hostname-fqdn-or-short-name

@AntonioMeireles
Author

Well, in the end this is about consistency and simplicity. What I think would be valuable for the Docker case is this: whenever one sets a container's hostname to foo.bar (via docker run's -h), logstash-forwarder should relay its logs with 'foo.bar' as their source, regardless of whether the distro inside the container is a RH or a Debian derivative. (And yes, not having to mess with anything inside the container at runtime would be a huge plus, IMHO.) Nowadays that does not happen. IMHO that can be seen as a bug, as I (lack of imagination, certainly) can't see many use cases where, in the concrete case of relaying logs, one wouldn't want the FQDN passed to the upper level.

just my .0002€ :)

@alphazero
Contributor

Hi @AntonioMeireles, just FYI (and granted, we have the same pattern in the current code base :)), using _ for the error return is strictly not acceptable going forward. If there is an error, we want to know about it.

@AntonioMeireles
Author

@alphazero: point taken :-). patch(es) updated.

@AntonioMeireles
Author

For further context, this would be a workaround for docker's #7851.

@driskell
Contributor

Would it be right to work around things in a product, though? I don't know. It might confuse an admin if other utilities don't work around it, potentially pointing them in the wrong direction. Just a thought.

@driskell
Contributor

Summary of my two cents:
I'm not a fan of modifying/hiding/mitigating symptoms without the user's knowledge.

Maybe an option to enable it would be useful, though? That'd be the best of both :)

@jordansissel
Contributor

Hmm.. Thinking about this, if some users can't trust os.Hostname(), then maybe we can provide a way for those users to specify a value to use explicitly; a flag or a config setting. Thoughts?

I think a config setting would be preferable to "trying harder with code" because ultimately there will still be edge cases and allowing users to specify 'this is what the hostname should be reported as' would let humans correct for computers doing things incorrectly.
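The idea of letting a human-supplied value win over whatever the OS reports can be sketched roughly as below. The function name `pickHostname` and the precedence order (flag first, then config file, then `os.Hostname()`) are assumptions made for illustration; the thread has not settled on where the setting should live.

```go
package main

import (
	"fmt"
	"os"
)

// pickHostname returns the first non-empty explicit value (flag first, then
// config file) and only asks the operating system when neither is set.
func pickHostname(flagValue, configValue string, osLookup func() (string, error)) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if configValue != "" {
		return configValue, nil
	}
	return osLookup()
}

func main() {
	// The explicit values are hard-coded here purely for illustration.
	name, err := pickHostname("", "foo.bar", os.Hostname)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(name) // prints "foo.bar"
}
```

With both explicit values empty, the behaviour is unchanged from today, which keeps the change backwards compatible.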

@driskell
Contributor

+1 on that option!

AntonioMeireles added a commit to AntonioMeireles/logstash-forwarder that referenced this pull request Sep 26, 2014
@AntonioMeireles
Author

yes, having an option to just set the hostname would do too :-) . above, for comments, is a very quick patch that would allow just that.

@AntonioMeireles changed the title from "trying a bit harder to get golang's os.Hostname() to behave like 'hostname -f' all the time." to "allow user to manually set desired hostname" on Sep 26, 2014
@@ -64,7 +66,8 @@ var infolog *log.Logger

func init() {
flag.StringVar(&options.configFile, "config", options.configFile, "path to logstash-forwarder configuration file")

flag.StringVar(&options.hostname, "hostname", options.hostname, "hostname we want to advertise as")
flag.StringVar(&options.hostname, "h", options.hostname, "hostname we want to advertise as")
Contributor


Just -hostname is OK with me; the '-h' flag would override the current '-h' behavior, which is to print the usage/help output.

@jordansissel
Contributor

reviewed, LGTM once the comments I made are resolved. :)

Thanks for doing this!

@jordansissel
Contributor

@driskell
Contributor

driskell commented Oct 3, 2014

Is it better in the config file? The problem here is that people may need to change init scripts if they've already packaged it, etc.

@jordansissel
Contributor

I've been trending towards putting all things in config lately; splitting configuration across two interfaces (some as cli flags, some as config file) is confusing and possibly irritating.

@zerkms

zerkms commented Oct 3, 2014

If it counts - I vote for having it in config file

AntonioMeireles added a commit to AntonioMeireles/logstash-forwarder that referenced this pull request Oct 3, 2014
@AntonioMeireles
Author

@jordansissel regarding step 2 of the CLA, I've already done it, on Sep 1st.

@AntonioMeireles
Author

@jordansissel et al., code tweaked per your comments.

Regarding the comments about the config file, all I can say is that the two are not mutually exclusive.

Unlike my first approach, this one is fully backwards compatible: if one does not set -hostname, the app behaves exactly as it always has, for better or worse (so people who were never hit by this don't need to change anything, anywhere). I can see use cases where it makes sense to have the hostname set in the config file too (imagine a volume within a logstash-forwarder container to which multiple containers write their logs, instead of having logstash-forwarder installed in every container), but IMHO the general case is that the hostname is a global immutable var and in that sense should be set globally, unconditionally, by default.

(anyway, open to code that feature, either in addition or instead of present patch suggestion :-) )

@driskell
Contributor

driskell commented Oct 3, 2014

Edited to clarify

Hi @AntonioMeireles

I know there are a few command-line options already that should really be in the config file. IMO all configurable options should be in that file, and the command line should only contain things required before the config file is available (such as the config file path itself) or things that change the action of the daemon (asking for the version, asking which options are available, running in the background, etc.)

Since a cleanup would be needed anyway to move things like spool size to the config file, there's nothing blocking this from getting merged, and it can be shifted to the config later. But if you could shift it to the config now, that might save some future work. It's all down to opinion, I guess, on what should be where; just adding my thoughts :)

Jason

@AntonioMeireles
Author

@driskell, a matter of trade-offs, I'd guess. Personally I'd vote to get this merged now and afterwards go for a general cleanup: dropping dead code and normalizing internal code conventions a bit (especially error handling, which is done in too many different ways). Only then would I do the more structural and invasive mods (like changing config file conventions/features), eventually signalling them with a sound version bump. But that would be me; it's not my call :-)

@ankushnarula

Unfortunately I've been relegated to using beaver for this functionality. I would oh so love to see it included in logstash-forwarder since beaver has a slew of other issues. Having the ability to specify a hostname via command-line AND via config file would be ideal since we sometimes test/develop in a single vm/instance via command-line but we are also puppetizing the configuration. 👍

@kiranos

kiranos commented Dec 11, 2014

+1 This is really needed. Right now my servers are named with just the first part of the hostname:
lbl01, whose FQDN is lbl01.domain.com. I also need a server named lbl01.domain2.com; if I add it, it will also get lbl01 as its host variable, which makes the two identical. It truly needs this functionality.

And yes, getting this into the config file instead of a command-line flag would really improve the startup script etc.
For now I think I'll have to add a /etc/default/logstash-forwarder.conf, parse it in the init script, read the FQDN variable from it, and pass that as a startup parameter.

@FlorinAndrei

@kiranos 👍 My thoughts exactly.

@kiranos

kiranos commented Dec 12, 2014

I tried this patch with the current master and got _grokparsefailure from exported apache2 logs. I reverted it.

@kiranos

kiranos commented Jan 12, 2015

I've done some more testing now, but can't get it to work.

For starters, here is my hostname (not the FQDN, but the name in /etc/hostname):

go-buildhm:~# hostname 
go-buildhm

Here it is running with the modified patch applied:

logstash-forwarder/logstash-forwarder -config /etc/logstash-forwarder/logstash-forwarder.conf -spool-size 100 -hostname test.yes.se
2015/01/12 15:11:43.797118      --- options -------
2015/01/12 15:11:43.797159      config-arg:          /etc/logstash-forwarder/logstash-forwarder.conf
2015/01/12 15:11:43.797163      hostname:            test.yes.se
2015/01/12 15:11:43.797172      idle-timeout:        5s
2015/01/12 15:11:43.797177      spool-size:          100
2015/01/12 15:11:43.797180      harvester-buff-size: 16384
2015/01/12 15:11:43.797183      --- flags ---------
2015/01/12 15:11:43.797186      tail (on-rotation):  false
2015/01/12 15:11:43.797190      log-to-syslog:          false
2015/01/12 15:11:43.797193      quiet:             false
2015/01/12 15:11:43.797385 {

root     18731  0.0  0.1  50300  4140 pts/0    Sl+  15:11   0:00 logstash-forwarder/logstash-forwarder -config /etc/logstash-forwarder/logstash-forwarder.conf -spool-size 100 -hostname test.yes.se

But I still get host=go-buildhm in my stdout debug:

{
          "message" => "go-build.test.se 192.168.13.115 - - [12/Jan/2015:15:12:42 +0100] \"GET /sdfsdfsdf HTTP/1.1\" 404 233 \"-\" \"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36\"",
         "@version" => "1",
       "@timestamp" => "2015-01-12T14:12:43.988Z",
             "type" => "apache-access",
             "file" => "/var/log/apache2/access.log",
             "host" => "go-buildhm",
           "offset" => "12288",
            "vhost" => "go-build.test.se",
         "clientip" => "192.168.13.115",
            "ident" => "-",
             "auth" => "-",
        "timestamp" => "12/Jan/2015:15:12:42 +0100",
             "verb" => "GET",
          "request" => "/sdfsdfsdf",
      "httpversion" => "1.1",
         "response" => "404",
            "bytes" => "233",
         "referrer" => "\"-\"",
            "agent" => "\"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36\"",
      "received_at" => "2015-01-12 14:12:43 UTC",
    "received_from" => "go-buildhm"
}

@driskell
Contributor

@kiranos what's your logstash config? Maybe it's overriding the host field.

@kiranos

kiranos commented Jan 12, 2015

it is:

filter {
  if [type] == "apache-access" {
    grok {
      match => { "message" => "%{HOSTNAME:vhost} %{COMBINEDAPACHELOG}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
}
filter {
  if [type] == "apache-error" {
    grok {
      patterns_dir => "/etc/logstash/patterns"
      match => { "message" => "%{APACHE_ERROR_LOG}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
  }
}


@driskell
Contributor

Can you check again? The received date of the event you gave us is before the logs from restarting the forwarder. Maybe it's an old event. Can you confirm with new events since the change?

@kiranos

kiranos commented Jan 12, 2015

Yes on my remote host running logstash-forwarder:

ps aux |grep logsta
root     18731  0.0  0.3  63808  7140 pts/0    Sl+  15:11   0:07 logstash-forwarder/logstash-forwarder -config /etc/logstash-forwarder/logstash-forwarder.conf -spool-size 100 -hostname test.yes.se

And the modified file with --help:

go-buildhm:~# logstash-forwarder/logstash-forwarder --help
Usage of logstash-forwarder/logstash-forwarder:
  -config="": path to logstash-forwarder configuration file or directory
  -cpuprofile="": path to cpu profile output - note: exits on profile end.
  -harvest-buffer-size=16384: harvester reader buffer size
  -hb=16384: harvester reader buffer size
  -hostname="": hostname we want to advertise as
  -log-to-syslog=false: log to syslog instead of stdout
  -quiet=false: operate in quiet mode - only emit errors to log
  -spool-size=1024: event count spool threshold - forces network flush
  -sv=1024: event count spool threshold - forces network flush
  -syslog=false: log to syslog instead of stdout
  -t=false: always tail on log rotation -note: may skip entries
  -tail=false: always tail on log rotation -note: may skip entries

I did a new post and it's the same:

{
          "message" => "go-build.test.se 192.168.13.115 - - [12/Jan/2015:22:19:06 +0100] \"GET / HTTP/1.1\" 304 - \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36\"",
         "@version" => "1",
       "@timestamp" => "2015-01-12T21:19:08.763Z",
             "type" => "apache-access",
             "file" => "/var/log/apache2/access.log",
             "host" => "go-buildhm",
           "offset" => "13874",
            "vhost" => "go-build.test.se",
         "clientip" => "192.168.13.115",
            "ident" => "-",
             "auth" => "-",
        "timestamp" => "12/Jan/2015:22:19:06 +0100",
             "verb" => "GET",
          "request" => "/",
      "httpversion" => "1.1",
         "response" => "304",
         "referrer" => "\"-\"",
            "agent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36\"",
      "received_at" => "2015-01-12 21:19:08 UTC",
    "received_from" => "go-buildhm"
}

Using the same logstash conf as above this time as well.

@driskell
Contributor

How did you apply the patch?

Could you build https://github.com/AntonioMeireles/logstash-forwarder directly if you didn't already? Just to rule out something not patching properly.

@kiranos

kiranos commented Jan 12, 2015

I applied it myself, I can try above repo tomorrow, I'll update after I test it, thanks!

@kiranos

kiranos commented Jan 13, 2015

I just compiled the fork and it's exactly the same:

...
         "@version" => "1",
       "@timestamp" => "2015-01-13T07:05:52.379Z",
             "type" => "apache-access",
             "file" => "/var/log/apache2/access.log",
             "host" => "go-buildhm",
...

From my understanding, the logstash server should never see anything with go-buildhm in it since I have switched the hostname, so logstash-forwarder must be sending out the wrong info. @AntonioMeireles I can send you a root password for a test box so you can see it first hand, if you like. If so, just let me know.

@AntonioMeireles
Author

@kiranos i'll find a way to slot quality time to dig into this before the end of this week. no need (yet, hopefully) for access to a test box. thanks for your patience.

@kiranos

kiranos commented Jan 13, 2015

@AntonioMeireles great thanks! I'll test it straight away, I have a test setup ready.

Would you consider adding support for placing it in a config file as well?

jordansissel commented on 3 Oct 2014
I've been trending towards putting all things in config lately; splitting configuration across two interfaces (some as cli flags, some as config file) is confusing and possibly irritating.

@kiranos

kiranos commented Jan 18, 2015

@AntonioMeireles Did you have time to look at this? I've saved my test box, so just let me know if you need it.

@kiranos

kiranos commented Feb 9, 2015

@AntonioMeireles it would be nice to get this merged before the next version (which seems to be around the corner). Do you need anything from me?

@jordansissel
Contributor

So many merge commits. I'll probably merge this by hand to avoid those. Git is so terrible :\

@jordansissel
Contributor

I'll see about merging this soon, but for now setting "host" field per file should be a functional workaround.

@AntonioMeireles
Author

I can do a git rebase against your current master tomorrow if that helps
you 😄 && thanks!


AntonioMeireles added a commit to AntonioMeireles/logstash-forwarder that referenced this pull request Mar 8, 2015
Signed-off-by: António Meireles <antonio.meireles@reformi.st>
@twood9003

Following. +1
I'm newer to GitHub and only use it for logstash. Do we have a solution to this yet?

Wanting to make logstash-forwarder show a specific hostname rather than the FQDN or

Before:
5345433540100.ded.nethosting.com

After:
Twood.host

@pandujar

This feature is very convenient, either command line or config setting, or both.

@hron84

hron84 commented Jul 7, 2015

Is there any way to get this feature merged? There are some real cases where the short name is simply not enough, especially if you collect logs from multiple environments where the same names can appear (e.g. webapp01.smallcompany.com, webapp01.bigcompany.com). But there is also a reason why the hostname itself should not be an FQDN: it's usually a bad idea, since any program that needs the FQDN can obtain it with the gethostbyaddr(3) call (which usually returns the FQDN, same as hostname -f gives), while a lot of (badly written) programs throw an error if the hostname is longer than expected or contains non-alphanumeric characters.

@JeroenvHeugten

+1. We are using multiple clusters with servers called web1,2,3 and without FQDN they all look the same in Logstash. Please merge this soon, otherwise we will have to move to log courier.

@ruflin ruflin added the libbeat label Sep 14, 2015
@jordansissel
Contributor

Thanks for helping make logstash-forwarder better!

Logstash-forwarder is going away and is replaced by filebeat and its friend, libbeat. If this is still an issue, would you mind opening a ticket there?

@hron84

hron84 commented Nov 18, 2015

@jordansissel filebeat allows us to explicitly set the hostname (instead of some magic that tries to figure it out - wrongly?)

driskell added a commit to driskell/log-courier that referenced this pull request Feb 16, 2020