compatible: add system logging #136
Open. phvalguima wants to merge 10 commits into main from add-sos-report
Commits (all by phvalguima):
6f8ee32 Add sosreport logging
4c8d1ba Add fixes to sosreport cmd
fa34cd8 removing all_logs option
0a6f10c Add space check and permission management
02d87e7 Fix sos report cmd and add --no-local to collect, as we are already c…
127e315 Add integration tests
5a34e1a Fix lint
ccd83ce Add documentation for the log structure
fbb900c Add a check for empty model before sos collect
d53a085 Add double-quotes
@@ -328,6 +328,58 @@ jobs:
  run: |
    juju switch test
    mkdir ~/logs/
- name: Run SOS reports
  if: ${{ failure() && steps.tests.outcome == 'failure' }}
  run: |
    sudo snap install sosreport --channel=latest/stable --classic
    # Needed as sosreport does not like 100+ char long paths
    mkdir /tmp/sos
    sudo sos report \
      --only-plugins kubernetes,systemd,logs,juju \
      --enable-plugins kubernetes,juju \
      -k kubernetes.describe=true -k kubernetes.podlogs=true -k kubernetes.all=true \
      --batch \
      --clean \
      --tmp-dir=/tmp/sos \
      -z gzip
- name: Run SOS in LXCs if Needed
  if: ${{ inputs.cloud == 'lxd' && (failure() && steps.tests.outcome == 'failure') }}
Review comment on lines +345 to +346:
Could you separate this step into another PR? I'd like to get the other changes merged quickly, and I think there are some complexities/subtleties (especially around secret redaction) with sosreport inside the lxc container that will hold up the other changes.
  run: |
    # wc -l always prints a number, so compare numerically instead of testing for an empty string
    if [ "$(sudo lxc list -f csv | wc -l)" -eq 0 ]; then
      echo "No containers available, nothing to collect logs for..."
      exit 0
    fi

    juju exec --parallel=true --all -- sudo snap install sosreport --channel=latest/stable --classic
    sudo snap install jq
    export NODES
    NODES="$(juju status --format=json | jq -r '.machines[]|."ip-addresses"[0]' | paste -s -d, -)"
    echo "Found nodes: $NODES"

    echo "Total space before running command:"
    sudo df -h

    sudo sos collect \
      -i ~/.local/share/juju/ssh/juju_id_rsa --ssh-user ubuntu --no-local \
      --nodes "$NODES" \
      --only-plugins systemd,logs,juju \
      -k logs.all_logs=true \
      --batch \
      --clean \
      --tmp-dir=/tmp/sos \
      -z gzip -j 1
- name: Prepare upload - local reports
  if: ${{ failure() && steps.tests.outcome == 'failure' }}
  run: |
    I="$(whoami)"
    sudo chown -R "$I" /tmp/sos/
    mv /tmp/sos/*.tar.gz ~/logs/

- name: Print kernel messages
  if: ${{ failure() }}
  run: |
    sudo dmesg

- name: juju status
  if: ${{ success() || (failure() && steps.tests.outcome == 'failure') }}
  run: juju status --color --relations | tee ~/logs/juju-status.txt

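One detail of the "Run SOS in LXCs if Needed" step is worth spelling out: `wc -l` always prints a number, so testing its output for an empty string cannot detect "no containers"; the count has to be compared numerically. The sketch below illustrates the idea with a hypothetical helper, `has_containers`, which is not part of the PR. It reads a container listing on stdin so that it can be tried without LXD installed; in the workflow the input would come from `sudo lxc list -f csv`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): succeeds when stdin holds at
# least one non-empty line, e.g. the output of `sudo lxc list -f csv`.
has_containers() {
  local count
  # grep -c prints the number of matching lines; it exits non-zero when
  # that number is 0, hence the `|| true`.
  count="$(grep -c . || true)"
  [ "$count" -gt 0 ]
}

# Example usage with canned input instead of a real `lxc list` call.
if printf 'juju-0,RUNNING\n' | has_containers; then
  echo "containers found"
else
  echo "No containers available, nothing to collect logs for..."
fi
```

Counting non-empty lines, rather than all lines, also keeps a stray blank line from being mistaken for a container.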
@@ -0,0 +1,58 @@
Whenever a test fails, data-platform-workflows captures that run's logs using [sosreport](https://github.com/sosreport/sos).

The logs can be downloaded from the run's "Summary" page.

sosreport is run on the runner itself and captures logs from the host as well as from the model's containers (LXC / k8s).

# Log structure

``` | ||
/ | ||
| | ||
+---- juju-debug-log.txt: captured at the end of the run | ||
| | ||
+---- juju-status.txt: captured at the end of the run | ||
| | ||
+---- sos-collector-... | ||
| | ||
+---- sosreport-... | ||
``` | ||

## GitHub Runner logs

The tarball `sosreport-` contains all the host logs, including its syslog, journal and kernel logs.

Relevant logs:
* /var/log/kern.log and /var/log/syslog: OS-related logs, including kernel messages
* /sos_commands/kubernetes/: logs related to the k8s infra and its pods
* /sos_commands/logs/: journalctl outputs

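To check quickly whether a downloaded report holds the files listed above, the archive can be listed without extracting it. The following is a minimal sketch, assuming the report is gzip-compressed (the workflow passes `-z gzip` and uploads `*.tar.gz`); `list_relevant_logs` is a hypothetical helper name and the file name in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the entries of a sosreport tarball that fall
# under the locations named in "Relevant logs", without extracting it.
list_relevant_logs() {
  local tarball="$1"
  # The pattern tolerates an optional top-level directory inside the archive.
  tar -tzf "$tarball" \
    | grep -E '(^|/)(var/log/(kern\.log|syslog)|sos_commands/(kubernetes|logs)/)'
}

# Example usage; the path is a placeholder for a downloaded report.
# list_relevant_logs ~/Downloads/sosreport-example.tar.gz
```

Listing first is cheaper than extracting, which helps when a report is large and only one or two files are of interest.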
## LXC logs

The workflow also runs `sos collect` against each of the LXC containers, if any are available in the model.

The goal is to collect the containers' system-level logs, as well as juju's.

These logs are stored in the `sos-collector-...` tarball. Within it, each container has its own `sosreport-...`.

Each of these tarballs contains a subset of the logs mentioned in the previous section, since logs such as kern.log or k8s
do not make sense within LXC containers.

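Because the per-container reports are nested inside the collector tarball, reading them takes two extraction passes. The sketch below shows one way to do that; it assumes GNU tar and that the inner reports are named `sosreport-*.tar.gz` or `sosreport-*.tar.xz`, which should be checked against a real archive. `extract_nested_reports` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: unpack a sos-collector tarball into a destination
# directory, then unpack every per-container sosreport found inside it.
extract_nested_reports() {
  local collector_tarball="$1" dest="$2"
  mkdir -p "$dest"
  # GNU tar detects the compression format on extraction.
  tar -xf "$collector_tarball" -C "$dest"
  # Assumption: inner reports are named sosreport-*.tar.gz or sosreport-*.tar.xz.
  find "$dest" -type f \( -name 'sosreport-*.tar.gz' -o -name 'sosreport-*.tar.xz' \) -print \
    | while IFS= read -r inner; do
        # Extract each inner report next to its tarball.
        tar -xf "$inner" -C "$(dirname "$inner")"
      done
}

# Example usage; both paths are placeholders.
# extract_nested_reports ~/Downloads/sos-collector-example.tar.gz ./reports
```

Matching the archive names explicitly avoids handing checksum files or other non-archive siblings to `tar`.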
# Missing any extra logs?

If any logs are missing, e.g. logs in specific folders of /var/snap, then the steps are:
1) Extend an existing sosreport plugin or add a new one
2) Add it as an extra plugin (if needed) to the `integration_test_charms.yaml`.

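For step 2, the workflow passes its plugins to `sos report` as a comma-separated list via `--only-plugins`. The sketch below shows how such a list could be extended; `add_plugin` is a hypothetical helper and `my_plugin` is a placeholder, not an existing sos plugin. The final command is only printed, not executed.

```shell
#!/usr/bin/env bash
# Plugin list currently passed to `sos report` by the workflow.
BASE_PLUGINS="kubernetes,systemd,logs,juju"

# Hypothetical helper: append a plugin name to a comma-separated list,
# unless it is already present.
add_plugin() {
  local list="$1" plugin="$2"
  case ",$list," in
    *",$plugin,"*) printf '%s\n' "$list" ;;
    *) printf '%s\n' "$list,$plugin" ;;
  esac
}

# "my_plugin" is a placeholder name used only for illustration.
PLUGINS="$(add_plugin "$BASE_PLUGINS" my_plugin)"

# Print the resulting invocation instead of running it.
echo "sudo sos report --only-plugins $PLUGINS --batch --clean --tmp-dir=/tmp/sos -z gzip"
```

In the workflow file itself, the same change amounts to editing the `--only-plugins` value of the relevant step by hand.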
It is important not only that the sosreport PR is merged upstream, but also that the change makes its way into the
official sosreport snap and the [packages in Ubuntu](https://packages.ubuntu.com/search?suite=all&arch=any&searchon=names&keywords=sosreport).

# Notes

These commands are run at the end of the test, and only if it fails; therefore, if a container was
created and destroyed during the test, it will not show up in the sosreports. However, the juju debug logs contain every
log exchanged with the controller, including the history of destroyed units.

If the syslog file has no recent logs, check /sos_commands/logs for the journalctl outputs. Normally, they
correspond to the same logs, but journalctl may be more up to date.

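To compare how recent syslog and the journalctl outputs are, their last lines can be printed side by side from an extracted report. This is a minimal sketch, assuming the files under `sos_commands/logs` have names starting with `journalctl`; `show_last_log_lines` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the last line of syslog and of each journalctl
# output in an extracted sosreport directory, to compare how recent they are.
show_last_log_lines() {
  local report_dir="$1" f
  for f in "$report_dir"/var/log/syslog "$report_dir"/sos_commands/logs/journalctl*; do
    # Skip paths that do not exist (including an unmatched glob).
    [ -f "$f" ] || continue
    # Print the path relative to the report directory, then the last line.
    printf '%s: %s\n' "${f#"$report_dir"/}" "$(tail -n 1 "$f")"
  done
}

# Example usage; the directory is a placeholder for an extracted report.
# show_last_log_lines ./sosreport-example
```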
Review comment:
We should maybe install from apt. I'd like to use the newer version, but the snap publisher isn't verified, and given that we might be passing GitHub secrets to sosreport, it might be a security risk if the snap maintainer is compromised.