debug: collect-info.sh enhancements #3467

Merged
merged 3 commits into from
Sep 28, 2023

Contributor

@rouming rouming commented Sep 26, 2023

The PR does three things:

  1. Change the date format. Exclude the colon from the date format used in the name of the resulting tarball, because tar fails with the following error when asked to extract it:

    tar: Cannot connect to eve-info-v7-2023-09-22T14\: resolve failed
    

Now the resulting tarball name will be as follows:

   eve-info-v8-Tue-26-Sep-2023-10-03-18.tar.gz

which does not cause any nasty errors.

  2. Add a read-logs mode. We need a very simple command which can sort and read gzipped logs in one looooong sheet, which can be redirected to a JSON file and processed afterwards. Here it is: call collect-info.sh from the extracted tarball on your machine.

    collect-info.sh : reads all logs (device and all application logs) and outputs them as JSON

    collect-info.sh -d : reads device logs and outputs them as JSON

    collect-info.sh -a UUID : reads application logs and outputs them as JSON

  3. Add the output of a few additional commands and files: ls -la /dev/, free, vmstat, iomem, /sys/fs/cgroup/memory
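
The read-logs options listed above could be parsed along these lines. This is a minimal sketch, not the script's actual code: `READ_LOGS_DEV` appears in the diff under review, while `READ_LOGS_APP` and the function name `parse_read_logs_opts` are assumptions made for illustration.

```shell
# Sketch only: READ_LOGS_APP and parse_read_logs_opts are hypothetical names.
parse_read_logs_opts() {
    READ_LOGS_DEV=""
    READ_LOGS_APP=""
    OPTIND=1                      # allow the function to be called repeatedly
    while getopts "da:" opt "$@"; do
        case "$opt" in
            d) READ_LOGS_DEV=1 ;;            # -d: device logs only
            a) READ_LOGS_APP="$OPTARG" ;;    # -a UUID: logs of one application
            *) echo "Usage: collect-info.sh [-d] [-a UUID]" >&2; return 1 ;;
        esac
    done
}

# The UUID below is a made-up placeholder.
parse_read_logs_opts -a 1234-abcd
echo "dev=${READ_LOGS_DEV:-unset} app=${READ_LOGS_APP:-unset}"
```

With neither option set, the script would fall through to reading all logs, which matches the no-argument invocation in the list.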


codecov bot commented Sep 26, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (6e22b47) 20.29% compared to head (952e9ad) 20.29%.
Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #3467   +/-   ##
=======================================
  Coverage   20.29%   20.29%           
=======================================
  Files         198      198           
  Lines       45268    45268           
=======================================
+ Hits         9188     9189    +1     
+ Misses      35396    35395    -1     
  Partials      684      684           

see 2 files with indirect coverage changes


if [ ! -d "/persist" ]; then
FIND=".log"

if [ ! -z "$READ_LOGS_DEV" ]; then
Contributor

@rouming, all these Yetus errors make sense to fix.


exit
fi

Contributor

@rouming, one piece of information that I still miss a lot in collect-info.sh is a listing of the /dev directory, to see which device files are present. I think we could take this opportunity to add it (the output of ls -l /dev). Would you mind making this change in this PR as well?

Contributor Author

Yep, will add.

Contributor Author

SC2156 is explicitly ignored in the code, but yetus still complains. I have no desire to fight the tool, so I will leave it as is.

Contributor

rene commented Sep 26, 2023

arm64 build is failing due to eve-ipxe package. Maybe @jsfakian has some clue about this issue....

Contributor

jsfakian commented Sep 26, 2023

arm64 build is failing due to eve-ipxe package. Maybe @jsfakian has some clue about this issue...

I tried it locally on my MacOS, and it seems to compile.

jsfakian@ioanniss-mbp-2 eve % make eve-ipxe
Creating go builder image for user eve
#1 [internal] load build definition from Dockerfile
#1 sha256:871234f00f714bffe35713c7a34e0e7c53cc09ec2f87e860f48026b96ca13f70
#1 transferring dockerfile: 37B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 sha256:292bc7163d5e0c76b61cc9d5d8674f693afd8ea1550f00d7134c6042e92f481f
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/golang:1.20.1-alpine
#3 sha256:b8824521f6804ab725b9546bc2d1925b2388c62a5447343ce12b785b0df9a5a5
#3 DONE 0.6s

#4 [ 1/13] FROM docker.io/library/golang:1.20.1-alpine@sha256:87d0a3309b34e2ca732efd69fb899d3c420d3382370fd6e7e6d2cb5c930f27f9
#4 sha256:86d30c9219a63ea6d0eba48d9cc87e0d9c55e38489e665ca396dfd17cfba1c9a
#4 DONE 0.0s

#5 [ 2/13] RUN apk add --no-cache openssh-client git gcc linux-headers libc-dev util-linux libpcap-dev bash vim make protobuf protobuf-dev sudo tar curl graphviz ttf-freefont patch dnsmasq
#5 sha256:3cacc40d5ebac653f1ad71c078900c83a6be0782971d961326044acaf16fdd67
#5 CACHED

#8 [ 5/13] RUN sed -ie /:1001:/d /etc/passwd /etc/shadow ; sed -ie /:1001:/d /etc/group || :
#8 sha256:d1a8304df497d0a28438bf2bd3b3a9c539ae1dfbbf1f4b85235ad64f0c2df4ea
#8 CACHED

#14 [11/13] RUN go install github.com/seamia/protodot@87817c3d0a8e7af753af15508b51292e941bc7c6
#14 sha256:a5f93f3c1674016af656a64ede4611bc8516d76c467398e57270ad0bc62d78be
#14 CACHED

#7 [ 4/13] RUN deluser eve ; delgroup eve || :
#7 sha256:58f65c959668304ab14ba185a09f8cac19c803829ef069bf6b9d74e8647f9a5d
#7 CACHED

#6 [ 3/13] RUN apk --no-cache --repository https://dl-cdn.alpinelinux.org/alpine/v3.16/main add -U --upgrade zfs-dev zfs-libs
#6 sha256:fe9ad77350dac14c4f613bfe0bb0ca42d38ab166690018b4640cdba732d49e23
#6 CACHED

#12 [ 9/13] RUN go install github.com/golang/protobuf/protoc-gen-go@v1.5.2
#12 sha256:c327870d9510e660b2ba0004129531c5138ac339a924c6cbdc91349131618ed8
#12 CACHED

#15 [12/13] RUN mv /go/bin/protodot /usr/local/bin
#15 sha256:68d8ff6fe4248268a642fd73df4ee42893a0b54cb29d0222698b1cb69aafcc2a
#15 CACHED

#9 [ 6/13] RUN addgroup -g 1001 eve && adduser -h /home/eve -G eve -D -H -u 1001 eve
#9 sha256:e8648550919e2a92b30d889d5db70dad5e6b0eae39c9cefcac5a04ec39099229
#9 CACHED

#10 [ 7/13] RUN echo "eve ALL=(ALL) NOPASSWD:ALL" > /etc/sudoers.d/eve
#10 sha256:8781aac567f44bd268c530ffc23a7dc510a1e8e4336e35e1f4def4c9b6799cbf
#10 CACHED

#11 [ 8/13] RUN OS="$(uname -o | tr '[:upper:]' '[:lower:]')" && PLATFORM="$(go version | sed 's#^.*'${OS}'/##g')" && curl -o /usr/local/bin/dep -L "https://github.com/golang/dep/releases/download/v0.5.4/dep-${OS}-${PLATFORM}" && chmod +x /usr/local/bin/dep
#11 sha256:356ddfbbe14d96704f0663bc5c9be28d87ca40a19c6c4ddd479b344665568967
#11 CACHED

#13 [10/13] RUN go install gotest.tools/gotestsum@v1.7.0
#13 sha256:67628aabee56edd4c09c2b6cd4895c788893212c358374dc35adfb1ae111fa47
#13 CACHED

#16 [13/13] RUN mv /go/bin/* /usr/bin
#16 sha256:b0076535cce7f07e1e504f82c5614c6e2496811483128680a395af7e289c3e4c
#16 CACHED

#17 exporting to image
#17 sha256:e8c613e07b0b7ff33893b694f7759a10d42e180f2b4dc349fb57dc6b71dcab00
#17 exporting layers done
#17 writing image sha256:a70aa3d06080660e6042f43e0a3c386cb42ad8d22a7e432940b8914a5e63bd5e done
#17 naming to docker.io/library/eve-build-eve done
#17 DONE 0.0s

Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
eve-build-eve docker container is ready to use
Done building /Users/jsfakian/Documents/src/eve/build-tools/bin/linuxkit
Building "lfedge/eve-ipxe:39473106fdccfc475245c30ab33f640b8cbd5b21"
checking for docker.io/lfedge/eve-ipxe:39473106fdccfc475245c30ab33f640b8cbd5b21 in local cache...
docker.io/lfedge/eve-ipxe:39473106fdccfc475245c30ab33f640b8cbd5b21 arm64 not found in local cache, checking registry
docker.io/lfedge/eve-ipxe:39473106fdccfc475245c30ab33f640b8cbd5b21 arm64 found on registry
Build complete, not pushing, all done.

Could you try to restart the GitHub action to see if the problem continues?

@@ -15,19 +15,26 @@ VERSION=7
# still attempt to install those packages.
PKG_DEPS="procps tar dmidecode iptables dhcpcd"

-DATE=$(date -Is)
+DATE=$(date "+%a-%d-%b-%Y-%H-%M-%S")
Contributor

I have no issue with getting rid of the colons, but can we use a format which sorts properly (year, month, day), such as date "+%Y-%m-%d-%H-%M-%S"?

Contributor Author

Of course
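
The two requirements discussed here, no colons and a name that sorts chronologically, can be checked with a short sketch. `tarball_name` is a hypothetical helper, and the October file names are made up for the comparison; only the format strings and the September names come from this PR.

```shell
# Hypothetical helper: build the tarball name from a version and a timestamp.
tarball_name() {
    printf 'eve-info-v%s-%s.tar.gz\n' "$1" "$2"
}

# Format settled on in the review: no colons, most significant field first.
DATE=$(date "+%Y-%m-%d-%H-%M-%S")
tarball_name 8 "$DATE"

# Weekday-first names (first revision of the PR) do not sort chronologically:
# "Mon" sorts before "Tue", so the October tarball is listed first.
weekday_order=$(printf '%s\n' \
    "eve-info-v8-Tue-26-Sep-2023-10-03-18.tar.gz" \
    "eve-info-v8-Mon-02-Oct-2023-09-00-00.tar.gz" | LC_ALL=C sort)

# Year-first names sort in the order the tarballs were created.
iso_order=$(printf '%s\n' \
    "eve-info-v8-2023-10-02-09-00-00.tar.gz" \
    "eve-info-v8-2023-09-27-08-41-29.tar.gz" | LC_ALL=C sort)

printf '%s\n' "$weekday_order" "$iso_order"
```

The colon matters because tar reads an archive name of the form `host:path` as a remote archive and tries to resolve the part before the colon as a host name. For tarballs already collected under the old naming scheme, GNU tar's `--force-local` option treats such a name as a local file.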

@@ -45,6 +58,30 @@ while getopts "vh" o; do
esac
done

# We are not on EVE? Switch to read-logs mode
if [ ! -d "/persist" ]; then
Contributor

Hmm - I have a /persist on my laptop.

Can we avoid making this assumption? E.g., by adding an -e (for extract) or -A (for All) option for the case where you currently look for /persist?

Contributor Author

Let me do the opposite:

# We are not on EVE? Switch to read-logs mode
if [ -d "$SCRIPT_DIR/persist-newlog" ]; then
      ...
fi

The script is part of the tarball, so the debugging procedure is as follows:

tar -xf eve-info-v8-2023-09-27-08-45-19.tar.gz
./eve-info-v8-2023-09-27-08-45-19/collect-info.sh > all-logs.json

I would like to avoid any mode options: the only thing you can do once the tarball is collected is extract all the logs, and collecting logs makes sense only when the script runs on EVE. So there is actually no option and no choice; the action is determined by where the script is placed. Ideally there would be a separate script, but I do not like a zoo of scripts either.
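
The placement-based detection proposed in this comment could look roughly like the sketch below. `decide_mode` and the temporary directory are assumptions made for the example; only the `persist-newlog` check is taken from the comment.

```shell
# Sketch only: decide_mode is a hypothetical helper, not code from the PR.
# Prints "read-logs" when the given directory looks like an extracted
# tarball (it contains persist-newlog), otherwise "collect".
decide_mode() {
    if [ -d "$1/persist-newlog" ]; then
        echo "read-logs"
    else
        echo "collect"
    fi
}

# Simulate an extracted tarball in a temporary directory.
demo=$(mktemp -d)
mkdir "$demo/persist-newlog"
decide_mode "$demo"              # directory with persist-newlog
decide_mode "/nonexistent-dir"   # no persist-newlog next to the script
rm -rf "$demo"
```

In the script itself the argument would be the directory the script lives in (`$SCRIPT_DIR` in the snippet above), so a `/persist` directory on a laptop no longer affects the decision.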

Exclude colon symbol from the date format which is the name
of the result tarball, because tar utility complains with the
following error on attempt to extract the folder:

   tar: Cannot connect to eve-info-v7-2023-09-22T14\: resolve failed

Now the resulting tarball name will be as follows:

   eve-info-v8-2023-09-27-08-41-29.tar.gz

which does not cause any nasty errors.

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
Contributor Author

rouming commented Sep 27, 2023

Difference to the previous version:

  • Make date format as follows "+%Y-%m-%d-%H-%M-%S"
  • Check the existence of the 'persist-newlog' directory in the same script folder and switch to extraction if found
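
Once the script has switched to extraction, the gzipped logs have to be read in order and emitted as one stream. A minimal sketch of that step follows; `read_gz_logs` and the file names are hypothetical, and it assumes that sorting the file paths yields the desired order, which the thread does not confirm.

```shell
# Sketch only: read_gz_logs is a hypothetical helper. It assumes the log
# files are gzip-compressed and that sorting their paths gives the right order.
read_gz_logs() {
    find "$1" -type f -name '*.gz' | LC_ALL=C sort | while IFS= read -r f; do
        gunzip -c "$f"
    done
}

# Build a tiny stand-in for the persist-newlog directory of a tarball.
demo=$(mktemp -d)
mkdir -p "$demo/persist-newlog"
printf '{"msg":"second"}\n' | gzip > "$demo/persist-newlog/dev.log.2.gz"
printf '{"msg":"first"}\n'  | gzip > "$demo/persist-newlog/dev.log.1.gz"
read_gz_logs "$demo/persist-newlog"
rm -rf "$demo"
```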

@@ -200,14 +201,18 @@ lsof > "$DIR/lsof"
lsmod > "$DIR/lsmod"
logread > "$DIR/logread"
dmidecode > "$DIR/dmidecode"
ls -la /dev > "$DIR/ls-la-dev"
Contributor

There could be directories inside /dev as well, so I think it's better to add -R: ls -lRa /dev

@@ -186,12 +186,13 @@ collect_pillar_backtraces()
cp "${0}" "$DIR"

# Have to chroot, lsusb does not work from this container
-echo "- lsusb, dmesg, ps, lspci, lsblk, lshw, lsof, lsmod, logread, dmidecode"
+echo "- lsusb, dmesg, ps, lspci, lsblk, lshw, lsof, lsmod, logread, dmidecode, ls -la /dev, free"
Contributor

I guess the "Subj." text in your commit message was left behind... you can remove it...

@@ -18,16 +18,24 @@ PKG_DEPS="procps tar dmidecode iptables dhcpcd"
DATE=$(date "+%Y-%m-%d-%H-%M-%S")
INFO_DIR="eve-info-v$VERSION-$DATE"
TARBALL_FILE="/persist/$INFO_DIR.tar.gz"
SCRIPT_DIR=$(dirname "$(readlink -f \"$0\")")
Contributor

Please fix the yetus errors.
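
The thread does not record which warnings yetus reported for this line, so the following is only an observation about its quoting. Inside `$(...)` a new quoting context starts, so the inner quotes need no escaping; with `\"$0\"` the escaped quotes become literal characters in the argument passed to `readlink`. The sketch shows the difference with two hypothetical helpers, `script_dir` and `show_arg`, and a made-up path.

```shell
# Hypothetical helper: directory of the given script path, symlinks resolved.
script_dir() {
    dirname "$(readlink -f "$1")"
}

# Hypothetical helper: print the argument exactly as it was received.
show_arg() {
    printf '<%s>\n' "$1"
}

path="/tmp/collect-info.sh"            # made-up path for the demonstration
escaped="$(show_arg \"$path\")"        # escaped quotes end up inside the argument
plain="$(show_arg "$path")"            # nested plain quotes pass the path unchanged
echo "$escaped"                        # prints <"/tmp/collect-info.sh">
echo "$plain"                          # prints </tmp/collect-info.sh>
```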

# conversion. Also:
# "-R (. as $line | try fromjson)" means ignore a line if is not a JSON,
# "(nanos // 0)" means return 0 if nanos is null
jq -R '(. as $line | try fromjson) | .timestamp_str.nanos = ((.timestamp.nanos // 0) + 1e9 | tostring | .[1:]) | .timestamp_str.human = (.timestamp.seconds | strftime("%B %d %Y %I:%M:%S")) | .timestamp_str = "\(.timestamp_str.human).\(.timestamp_str.nanos)"'
Member

Could you please add an example of the transformed JSON entity to the comment? Something like:

{
  "message": "Some log message",
  "timestamp": {
    "seconds": 1672444800,
    "nanos": 123456789
  },
  "timestamp_str": {
    "nanos": ".123456789",
    "human": "January 01 2023 12:00:00"
  }
}

It's a bit confusing with all the formats of the timestamps we have... The example will help to understand and to implement parsing scripts.

Contributor Author

It's a bit different, this is the output:

  "timestamp": {
    "seconds": 1677023830,
    "nanos": 399834057
  },
  "timestamp_str": "February 21 2023 11:57:10.399834057"

Yes, I can describe why I create a new JSON entry and what it looks like.
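
To make the transformation concrete, the filter from the diff can be run on a single sample line. In this sketch the filter text is copied from the diff; the wrapper function `add_timestamp_str`, the `-c` (compact output) flag and the sample input are assumptions added for the example, and `jq` must be installed.

```shell
# The jq filter is copied from the diff; add_timestamp_str, the -c flag and
# the sample input below are assumptions made for this example.
add_timestamp_str() {
    jq -c -R '(. as $line | try fromjson)
        | .timestamp_str.nanos = ((.timestamp.nanos // 0) + 1e9 | tostring | .[1:])
        | .timestamp_str.human = (.timestamp.seconds | strftime("%B %d %Y %I:%M:%S"))
        | .timestamp_str = "\(.timestamp_str.human).\(.timestamp_str.nanos)"'
}

# One valid log entry and one non-JSON line, which "try fromjson" drops.
printf '%s\n' \
    '{"timestamp":{"seconds":1677023830,"nanos":399834057}}' \
    'this line is not JSON and is skipped' \
    | add_timestamp_str
```

Adding `1e9` before `tostring` and then dropping the first character zero-pads the nanoseconds to nine digits. Note that `%I` is the 12-hour clock: 1677023830 is 23:57:10 UTC, which is rendered as `11:57:10` in the output quoted above.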

Contributor Author

rouming commented Sep 28, 2023

Difference to the previous version:

  • yetus fix (hallelujah!)

@@ -122,12 +190,13 @@ collect_pillar_backtraces()
cp "${0}" "$DIR"

# Have to chroot, lsusb does not work from this container
-echo "- lsusb, dmesg, ps, lspci, lsblk, lshw, lsof, lsmod, logread, dmidecode"
+echo "- lsusb, dmesg, ps, lspci, lsblk, lshw, lsof, lsmod, logread, dmidecode, ls -la /dev, free"
Contributor

This string also needs to be updated: ls -la /dev -> ls -lRa /dev

Contributor Author

Fixed. Now there are no excuses, Rene, not to approve :)

Contributor

@rene rene left a comment

LGTM

We need a very simple command which can sort and read gzipped
logs in one looooong sheet, which can be redirected to a JSON
file and processed afterwards. Here it is: call collect-info.sh
from the extracted tarball on your machine.

  collect-info.sh          : reads all (device and all applications
		             logs) and outputs in JSON

  collect-info.sh -d       : reads device logs and outputs in JSON

  collect-info.sh -a UUID  : reads application logs and outputs in JSON

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
…tputs

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
@rouming rouming merged commit 3bb6f36 into lf-edge:master Sep 28, 2023
@rouming rouming deleted the collect-info branch September 28, 2023 16:35
5 participants