The log saved during kurtosis log collection is too large #2190

Open
icesfeathers opened this issue Feb 20, 2024 · 7 comments
Labels: cli (For bugs relating to the CLI), feature request, painful (Painful bug)

Comments

@icesfeathers

Background & motivation

I'm running a private geth network with Kurtosis. In one day, the Kurtosis logs used 150 GB of disk space, while the geth network itself used only 20 GB.
I checked the kurtosis-logs-collection container and found that it was continuously collecting debug logs, which filled up my disk.
[Screenshot: 20240220-133736]

Desired behaviour

As a workaround, I stopped Kurtosis's log collection container and cleared kurtosis-logs-storage to free up space.
For me, the Kurtosis logs are less important than the Ethereum network itself.
I suggest raising the log level used for log collection (so that debug logs are not collected) or allowing log collection to be disabled.

How important is this to you?

Critical; Kurtosis is unusable for me without it.

What area of the product does this pertain to?

CLI: the Command Line Interface

github-actions bot added the cli (For bugs relating to the CLI) and critical (Critical bug or feature) labels on Feb 20, 2024
@icesfeathers
Author

Here are the command and the configuration I used to run Kurtosis:

kurtosis run --cli-log-level "panic" --enclave my-testnet github.com/kurtosis-tech/[email protected] "$(cat /data/fleet/kurtosis_network_params.yaml)"
participants:
    - validator_count: 512
      beacon_extra_params:
          - "--reconstruct-historic-states"
      el_extra_params:
          - "--gcmode=archive"
      el_max_mem: 8192
      bn_max_mem: 4096
      v_max_mem: 4096
    - validator_count: 1
      el_max_mem: 8192
      bn_max_mem: 4096
      v_max_mem: 4096
network_params:
    seconds_per_slot: 6
    eth1_follow_distance: 2
    deneb_fork_epoch: 9999999999999
additional_services:
    - tx_spammer
    - blob_spammer
    - el_forkmon
    - blockscout
    - beacon_metrics_gazer
    - dora
    - prometheus_grafana
mev_type: "full"

@tedim52
Contributor

tedim52 commented Feb 20, 2024

Thanks for filing this, @icesfeathers! As mentioned in Discord, it seems like a sensible solution here would be:

the ability to turn off the collection of logs (in which case you won't be able to view logs after services have died)

OR

a byte-based retention mechanism where historical logs are removed once a size threshold is passed, so that logs don't fill up disk space. (For context, we currently have time-based log retention, where old logs are removed after 4 weeks, but no byte-based log retention.)

@icesfeathers
Author

After I stopped the log container, the memory usage of the dockerd process soared and could not be released. Is this a bug?

@icesfeathers
Author

[image]

@icesfeathers
Author

@tedim52

@icesfeathers
Author

I did not have debug mode turned on, so I cannot provide more detail. However, this Docker instance runs only Kurtosis, so I wanted to pass this feedback along.

@tedim52
Contributor

tedim52 commented Feb 23, 2024

Hey @icesfeathers! I suspect the memory issue is caused by logs still being sent to the logs aggregator without being processed or forwarded. Currently, the product doesn't support a safe way to turn off log collection.

guisellaDes added the painful (Painful bug) label and removed the critical (Critical bug or feature) label on Feb 28, 2024
tedim52 self-assigned this on Feb 28, 2024
github-merge-queue bot pushed a commit that referenced this issue Aug 16, 2024
…ace (#2534)

## Description
Log Retention Improvements:

Users have had issues with logs from long-running enclaves taking up
tons of storage. We have a retention mechanism that automatically
rotates logs after some time, but currently it a) can only rotate
logs weekly and b) cannot be configured.

These improvements will allow retention to be as granular as hourly and
will allow users to configure the retention period (e.g. `1hr`, `2hr`,
`1day`, `1week`). To do this, a few changes need to happen. Most notably,
the way logs are stored and retrieved needs to change to support
rotating log files hourly. Implementing this requires changes across a
few components (`LogsAggregator`, `LogsDatabaseClient`, `LogFileManager`,
cli), so I'll be making them incrementally.

- [x] PR 1: Introduce `LogFileLayout` and `PerWeekLogFileLayout` and
adjust `LogFileManager` and `LogsDatabaseClient` to use this for
retrieving log file paths
- [ ] PR 2: Make retention configurable via a CLI flag
- [ ] PR 3: Implement `PerHourFileLayout` and swap storage format from
`PerWeekFileLayout` to `PerHourFileLayout`
- [ ] PR 4: Make `LogsAggregator` use `LogFileLayout` for determining
storage format and set it to use `PerHourFileLayout`
------------------------------
This first PR sets the stage for this change by introducing a new
interface called `LogFileLayout` and migrating the existing storage
format to use it. Right now, knowledge about the log storage format is
spread across several entities (`LogsAggregator` - to store,
`LogFileManager` - to rotate logs, `StreamLogsStrategy` - to read logs).
This interface consolidates knowledge of how logs are stored into one
module that can be used by any entity doing operations that require
retrieving log files from the filesystem. This will make the transition
to a different storage format safer (only requires implementing +
testing `PerHourFileLayout` module) and makes it easier to swap out the
storage format in the future (e.g. if even more granular retention is
needed) without breaking other entities.

## Is this change user facing?
NO

## References 
#2443
#2190
tedim52 added a commit that referenced this issue Aug 30, 2024
… retention period (#2536)

## Description
Log Retention Improvements:

This PR allows users to configure the log retention period with the
`--log-retention-period` flag on engine start or restart. The default is
1 week.

- [x] PR 1: Introduce `LogFileLayout` and `PerWeekLogFileLayout` and
adjust `LogFileManager` and `LogsDatabaseClient` to use this for
retrieving log file paths
- [x] PR 2: Make retention configurable via a CLI flag
- [ ] PR 3: Implement `PerHourFileLayout` and swap storage format from
`PerWeekFileLayout` to `PerHourFileLayout`
- [ ] PR 4: Make `LogsAggregator` use `LogFileLayout` for determining
storage format and set it to use `PerHourFileLayout`
------------------------------

## Is this change user facing?
NO

## References 
#2443
#2190
#2534

3 participants