appdstatus.sh: Events-service liveness-detection fragile and broken in 4.4(?)+ (4.5.1 confirmed) #94

Open
scosol opened this issue Oct 3, 2018 · 3 comments

Comments


scosol commented Oct 3, 2018

Issue 1: EC allows the Events Service to be installed in any location.
Issue 2: The "Events Service" process signature changed from "events_service" to "events-service" (underscore to hyphen).

The liveness script looks for it in the "usual" <=V4.3 location, so the check fails.

bash -x output:

`+ events_running
+ grep /data/appdynamics/platform/product/controller/events_service
+ grep java
+ ps -f -u appd
+ return 1
+ case $? in
+ echo 'events service not running'
events service not running`

The Events Service doesn't live there. It lives here:

/data/appdynamics/platform/product/events-service/processor/bin/

Process signature:

appd@ubuntu:/data/appdynamics/platform/product/controller/HA$ ps -ef|grep events_service
appd 9925 8428 0 11:33 pts/0 00:00:00 grep --color=auto events_service
appd@ubuntu:/data/appdynamics/platform/product/controller/HA$ ps -ef|grep events-service
appd 4977 1 1 11:02 tty1 00:00:26 /data/appdynamics/platform/product/jre/1.8.0_162/bin/java -Xmx746m -Xms746m -Djava.net.preferIPv4Stack=true -Dfile.encoding=UTF-8 -Djute.maxbuffer=30000000 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -verbose:gc -XX:GCLogFileSize=64m -XX:NumberOfGCLogFiles=4 -XX:+UseGCLogFileRotation -XX:CompileCommand=exclude,org/apache/lucene/lucene54/Lucene54DocValuesConsumer.addSortedNumericField -XX:CompileCommand=exclude,org/apache/lucene/lucene54/Lucene54DocValuesConsumer.addBinaryField -XX:CompileCommand=exclude,org/elasticsearch/search/aggregations/metrics/percentiles/tdigest/AbstractTDigestPercentilesAggregatorstart.collect -XX:CompileCommand=exclude,org/apache/lucene/index/SortedNumericDocValuesWriter.flush -XX:CompileCommand=exclude,org/apache/lucene/codecs/PushPostingsWriterBase.writeTerm -Xloggc:/data/appdynamics/platform/product/events-service/processor/bin/../logs/%p-gc.log -DAPPLICATION_HOME=/data/appdynamics/platform/product/events-service/processor/bin/.. -classpath /data/appdynamics/platform/product/events-service/processor/bin/../lib/* com.appdynamics.analytics.processor.AnalyticsService -p /data/appdynamics/platform/product/events-service/processor/conf/events-service-api-store.properties -y /data/appdynamics/platform/product/events-service/processor/bin/../conf/events-service-api-store.yml
appd 5011 4977 1 11:02 tty1 00:00:20 /data/appdynamics/platform/product/jre/1.8.0_162/bin/java -Xms2238m -Xmx2238m -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true -Djava.net.preferIPv4Stack=true -Dfile.encoding=UTF-8 -Djute.maxbuffer=30000000 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+DisableExplicitGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -verbose:gc -XX:GCLogFileSize=64m -XX:NumberOfGCLogFiles=4 -XX:+UseGCLogFileRotation -XX:CompileCommand=exclude,org/apache/lucene/lucene54/Lucene54DocValuesConsumer.addSortedNumericField -XX:CompileCommand=exclude,org/apache/lucene/lucene54/Lucene54DocValuesConsumer.addBinaryField -XX:CompileCommand=exclude,org/elasticsearch/search/aggregations/metrics/percentiles/tdigest/AbstractTDigestPercentilesAggregator$1.collect -XX:CompileCommand=exclude,org/apache/lucene/index/SortedNumericDocValuesWriter.flush -XX:CompileCommand=exclude,org/apache/lucene/codecs/PushPostingsWriterBase.writeTerm -Dmapper.allow_dots_in_name=true -Des.insecure.allow.root=true -Des.path.home=/data/appdynamics/platform/product/events-service/processor/elasticsearch -cp /data/appdynamics/platform/product/events-service/processor/elasticsearch/lib/elasticsearch-2.4.1.jar:/data/appdynamics/platform/product/events-service/processor/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start -p /data/appdynamics/platform/product/events-service/processor/bin/../elasticsearch.id
appd 9927 8428 0 11:33 pts/0 00:00:00 grep --color=auto events-service

Kinda lame fix/hack:

HA/lib/status.sh:

`function events_running {
# scosol fix
#	if ps -f -u $RUNUSER | grep "java" | grep "$APPD_ROOT/events_service" >/dev/null ; then
	if ps -f -u $RUNUSER | grep "java" | grep "events-service" >/dev/null ; then`
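
A slightly more tolerant variant of the hack above might accept either spelling, so the same check works on <=4.3 and on 4.5.1 installs. This is only a sketch: `events_service_in_ps` is a hypothetical helper name, not something that exists in status.sh, and it filters `ps -f` output passed on stdin rather than calling `ps` itself.

```shell
# Hypothetical helper (not part of status.sh): reads `ps -f` output on stdin
# and succeeds if any java process line mentions events_service or
# events-service. Lines containing "grep" are dropped first so that an
# interactive `ps | grep` does not match itself.
events_service_in_ps() {
    grep -v 'grep' | grep 'java' | grep -E 'events[-_]service' >/dev/null
}

# Possible use inside events_running (RUNUSER comes from the existing script):
#   if ps -f -u "$RUNUSER" | events_service_in_ps; then ...
```

Matching on the path fragment is still fragile if the install directory is renamed. Matching on the main class shown in the process listing, `com.appdynamics.analytics.processor.AnalyticsService`, may be a sturdier signature, assuming it stays stable across versions.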
@plizonczyk

Bumped into it now. I'll post a patch.

@plizonczyk

Or not: the whole ES handling calls for a revamp. We always refer to controller/events_service.
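
One possible direction for such a revamp, sketched under assumptions: resolve the Events Service directory once and let the rest of the scripts use the result. `find_events_service_home` and the `EVENTS_SERVICE_HOME` override are hypothetical names, not existing settings; the two fallback paths are the old `controller/events_service` location and the `events-service/processor` location reported in this issue.

```shell
# Hypothetical resolver: prints the first Events Service directory that exists.
# $1 is the controller root (what status.sh calls APPD_ROOT).
# EVENTS_SERVICE_HOME is an assumed override for custom install locations.
find_events_service_home() {
    for _candidate in \
        "${EVENTS_SERVICE_HOME:-}" \
        "$1/events_service" \
        "$1/../events-service/processor"
    do
        if [ -n "$_candidate" ] && [ -d "$_candidate" ]; then
            printf '%s\n' "$_candidate"
            return 0
        fi
    done
    return 1
}

# Possible use:
#   es_home=$(find_events_service_home "$APPD_ROOT") || echo 'events service not found'
```

Because EC allows any install location, the two fallbacks cannot cover every case, which is why the sketch checks the override first.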

@ayushghosh
Member

This is because the Events Service used to be an embedded service at that location.
In production, we now ask for a dedicated server.
