Do the following in the ddc.yaml:
- set `number-threads: 1`
- set `dremio-jstack-freq-seconds: 10`
- set `dremio-queries-json-num-days: 7`
- if using `--dremio-pat-prompt` when running ddc or setting `dremio-pat-token` in the ddc.yaml, then set `number-job-profiles: 50`
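
The settings in the list above can sit together in the same file. A minimal sketch of the relevant fragment, assuming the rest of ddc.yaml is left at its existing values (the comments are informal descriptions, not official documentation):

```yaml
# ddc.yaml (fragment): values taken from the list above
number-threads: 1
dremio-jstack-freq-seconds: 10
dremio-queries-json-num-days: 7
# Only relevant when a PAT is supplied, either with --dremio-pat-prompt
# on the command line or with dremio-pat-token in this file:
number-job-profiles: 50
```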
Do the following in the ddc.yaml:
- set `dremio-queries-json-num-days: 7`
- set `collect-gc-logs: false`
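
As a sketch, the two settings above would appear in ddc.yaml like this, assuming every other key in the file is left untouched:

```yaml
# ddc.yaml (fragment): values taken from the list above
dremio-queries-json-num-days: 7   # limit queries.json collection to 7 days
collect-gc-logs: false            # skip GC log collection
```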
Do the following in the ddc.yaml:
- set `number-threads: 4`
- if using `--dremio-pat-prompt` when running ddc or setting `dremio-pat-token` in the ddc.yaml, then set `number-job-profiles: 50`
- set `--transfer-dir` at the CLI, or if doing a local-collect use `--tarball-out-dir` or set `tarball-out-dir` in ddc.yaml. This avoids using the /tmp folder (as of ddc 0.9.0)
- read the `ddc-HOSTNAME.log` logs and see what errors there are (i.e. literally grep for ERROR)
- are `dremio-log-dir` and `dremio-conf-dir` set correctly? (if the node is offline or the version of DDC is older than 0.8, these may need to be set explicitly)
- the job profiles, KV report, WLM report, and system table report all need `dremio-pat-token` to be set in ddc.yaml or `--dremio-pat-prompt` to be passed at the command line
- are you running the latest version of DDC? We had over 15 releases in 2023 containing bug fixes and new functionality. Check https://github.com/dremio/dremio-diagnostic-collector/releases
- if you are running over ssh, did you remember to use `--sudo-user` with the dremio user or a user with admin rights?
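
The directory settings mentioned in the list above are plain keys in ddc.yaml. A minimal sketch follows; every path shown is a hypothetical example and should be replaced with the locations used on the node being collected:

```yaml
# ddc.yaml (fragment): all paths below are hypothetical examples
tarball-out-dir: /data/ddc-out      # example output location outside /tmp
dremio-log-dir: /var/log/dremio     # example; use the node's actual log directory
dremio-conf-dir: /opt/dremio/conf   # example; use the node's actual conf directory
```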
DDC has 4 modes, but you can alter the `ddc.yaml` and override the collection modes to suit your own configuration.
- system disk usage
- server.log and 2 days of archives
- metadata_refresh.log and 2 days of archives
- reflection.log and 2 days of archives
- queries.json and up to 2 days of archives
- all Dremio configurations
- all GC logs if present
- perf metrics (cpu and GC usage by thread)
- system disk usage
- Java Flight Recorder recording of 60 seconds
- top output of 60 seconds - older versions used ttop
- server.log and 7 days of archives
- metadata_refresh.log and 7 days of archives
- reflection.log and 7 days of archives
- queries.json and up to 30 days of archives
- all Dremio configurations
- all GC logs if present
- everything stated in standard
- captures 60 seconds of jstack at 1 second intervals
- everything stated in standard
- a sampling of job profiles (note 25000 jobs can take 15 minutes to collect)
- Dremio key value store report
- Dremio workload manager (WLM) details
- system tables and their details
- access.log and 7 days of archives
- audit.log and 7 days of archives
- Java heap dump