Add Support for Time-Based Log Retention in Loki #377

Merged: 9 commits, Apr 22, 2024

@IbraAoad IbraAoad commented Apr 17, 2024

Issue

Fixes #278 and may complement the fixes for #369 and #362.

This PR adds support for time-based log retention in Loki. It also points Loki at "store": "boltdb-shipper" instead of the currently invalid option "store": "boltdb". Because this enables data compaction, it might also help fix #362.
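
The store switch described above can be sketched in Python as follows. This is an illustration only, not the charm's actual code: the function name use_boltdb_shipper and the sample values for object_store, schema and index.period are assumptions, while the from date and the index_ prefix are taken from the log lines in this PR.

```python
import copy


def use_boltdb_shipper(loki_config: dict) -> dict:
    """Return a copy of the config with every "boltdb" index store
    switched to "boltdb-shipper". Hypothetical helper, not charm code."""
    updated = copy.deepcopy(loki_config)
    for period in updated.get("schema_config", {}).get("configs", []):
        if period.get("store") == "boltdb":
            period["store"] = "boltdb-shipper"
    return updated


# Sample config; object_store, schema and index.period are assumed values.
old = {
    "schema_config": {
        "configs": [
            {
                "from": "2020-10-24",
                "store": "boltdb",
                "object_store": "filesystem",
                "schema": "v11",
                "index": {"prefix": "index_", "period": "24h"},
            }
        ]
    }
}
new = use_boltdb_shipper(old)
print(new["schema_config"]["configs"][0]["store"])
```

Working on a deep copy leaves the original config untouched, which makes it easy to compare the two when checking a migration.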

Testing has been conducted to validate the following:

  • Deployments that were previously functioning without a compactor ("store": "boltdb") can safely transition to using boltdb-shipper without losing any old logs.
  • Old logs are retained and properly compacted upon startup with the new configuration.
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.167917681Z caller=shipper.go:165 index-store=boltdb-shipper-2020-10-24 msg="starting index shipper in RW mode"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.167338945Z caller=table_manager.go:136 index-store=boltdb-shipper-2020-10-24 msg="uploading tables"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168149888Z caller=table_manager.go:220 index-store=boltdb-shipper-2020-10-24 msg="found a legacy file index_19830, moving it to folder with same name"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168235527Z caller=table_manager.go:240 index-store=boltdb-shipper-2020-10-24 msg="loading table index_19830"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.16856759Z caller=table.go:318 msg="handing over indexes to shipper index_19830"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168603427Z caller=table.go:334 msg="finished handing over table index_19830"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168626094Z caller=shipper_index_client.go:76 index-store=boltdb-shipper-2020-10-24 msg="starting boltdb shipper in RW mode"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168728076Z caller=table_manager.go:171 index-store=boltdb-shipper-2020-10-24 msg="handing over indexes to shipper"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168812945Z caller=table.go:318 msg="handing over indexes to shipper index_19830"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.168820671Z caller=table.go:334 msg="finished handing over table index_19830"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.180144521Z caller=module_service.go:82 msg=initialising module=compactor
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.180241431Z caller=ring.go:273 msg="ring doesn't exist in KV store yet"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.180314544Z caller=basic_lifecycler.go:297 msg="instance not found in the ring" instance=loki-0 ring=compactor
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.180391138Z caller=compactor.go:395 msg="waiting until compactor is JOINING in the ring"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.180879172Z caller=manager.go:995 user=fake msg="Starting rule manager..."
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.182992457Z caller=ingester.go:447 msg="recovered WAL checkpoint recovery finished" elapsed=2.881953ms errors=false
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.183016236Z caller=ingester.go:453 msg="recovering from WAL"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.184145693Z caller=ingester.go:469 msg="WAL segment recovery finished" elapsed=4.03537ms errors=false
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.184164644Z caller=ingester.go:417 msg="closing recoverer"
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.184173336Z caller=ingester.go:425 msg="WAL recovery finished" time=4.063365ms
2024-04-17T14:53:40.246Z [loki] level=info ts=2024-04-17T14:53:40.18419343Z caller=wal.go:156 msg=started component=wal
2024-04-17T14:53:40.346Z [loki] level=info ts=2024-04-17T14:53:40.314000526Z caller=compactor.go:399 msg="compactor is JOINING in the ring"
2024-04-17T14:53:40.346Z [loki] level=info ts=2024-04-17T14:53:40.314112138Z caller=compactor.go:409 msg="waiting until compactor is ACTIVE in the ring"
2024-04-17T14:53:40.446Z [loki] level=info ts=2024-04-17T14:53:40.425459505Z caller=compactor.go:413 msg="compactor is ACTIVE in the ring"
2024-04-17T14:53:40.446Z [loki] level=info ts=2024-04-17T14:53:40.425499846Z caller=loki.go:505 msg="Loki started"
2024-04-17T14:53:43.346Z [loki] level=info ts=2024-04-17T14:53:43.296600331Z caller=worker.go:209 msg="adding connection" addr=loki-0.loki-endpoints.cos.svc.cluster.local:9095
2024-04-17T14:53:43.346Z [loki] level=info ts=2024-04-17T14:53:43.297812733Z caller=scheduler.go:615 msg="this scheduler is in the ReplicationSet, will now accept requests."
2024-04-17T14:53:45.446Z [loki] level=info ts=2024-04-17T14:53:45.425633613Z caller=compactor.go:474 msg="this instance has been chosen to run the compactor, starting compactor"
2024-04-17T14:53:45.446Z [loki] level=info ts=2024-04-17T14:53:45.425675246Z caller=compactor.go:503 msg="waiting 10m0s for ring to stay stable and previous compactions to finish before starting compactor"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.168043675Z caller=table_manager.go:136 index-store=boltdb-shipper-2020-10-24 msg="uploading tables"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.168074092Z caller=index_set.go:86 msg="uploading table index_19830"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.169169813Z caller=table_manager.go:171 index-store=boltdb-shipper-2020-10-24 msg="handing over indexes to shipper"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.169240404Z caller=table.go:318 msg="handing over indexes to shipper index_19830"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.169247974Z caller=table.go:334 msg="finished handing over table index_19830"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.187572448Z caller=index_set.go:107 msg="finished uploading table index_19830"
2024-04-17T14:54:40.245Z [loki] level=info ts=2024-04-17T14:54:40.187608966Z caller=index_set.go:185 msg="cleaning up unwanted indexes from table index_19830"
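
The table name index_19830 in the logs above is consistent with a periodic index table that is named by prefix plus the number of whole periods since the Unix epoch. A minimal Python sketch of that arithmetic, assuming a 24h index period and the index_ prefix (the helper names are hypothetical):

```python
from datetime import date, datetime, timedelta, timezone


def table_number(day: date, period: timedelta = timedelta(hours=24)) -> int:
    """Number of whole index periods between the Unix epoch and `day` (UTC)."""
    start = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    return int(start.timestamp() // period.total_seconds())


def table_name(day: date, prefix: str = "index_") -> str:
    """Build the table name for `day`, assuming a 24h period."""
    return f"{prefix}{table_number(day)}"


# The logs in this PR were captured on 2024-04-17.
print(table_name(date(2024, 4, 17)))  # index_19830
```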

Solution

This PR exposes retention_period as a config option in the charm, which sets a retention period of that duration on all streams. Setting retention periods for individual streams is not currently supported and may be considered in future work.
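
As a rough sketch of how a single global option could map onto Loki's configuration: compactor.retention_enabled and limits_config.retention_period are Loki settings, but the function name, the choice of hours as the unit, and treating 0 as "retention off" are assumptions made for illustration, not necessarily what the charm does.

```python
def retention_settings(retention_hours: int) -> dict:
    """Translate a global retention value into Loki config fragments.

    Assumptions: the value is in hours, and 0 means retention is disabled.
    """
    if retention_hours < 0:
        raise ValueError("retention period must not be negative")

    enabled = retention_hours > 0
    settings = {"compactor": {"retention_enabled": enabled}}
    if enabled:
        # One period for all streams; per-stream overrides are out of scope.
        settings["limits_config"] = {"retention_period": f"{retention_hours}h"}
    return settings


print(retention_settings(24))
```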

Testing Instructions

  1. juju deploy cos-lite
  2. juju deploy flog-k8s
  3. juju relate loki flog-k8s
  4. juju config loki retention-period=24
  5. Validate /etc/loki/loki-local-config.yaml, then check the Loki logs after 24h to validate log deletion.
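
The config validation in the last step could be scripted along these lines. This is a sketch under stated assumptions: the config file has already been parsed into a dict (for example with a YAML library), the expected period string "24h" assumes the option is interpreted as hours, and the function name retention_problems is hypothetical.

```python
def retention_problems(cfg: dict, expected_period: str) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []

    stores = [c.get("store") for c in cfg.get("schema_config", {}).get("configs", [])]
    if "boltdb-shipper" not in stores:
        problems.append("no schema period uses store boltdb-shipper")

    if not cfg.get("compactor", {}).get("retention_enabled"):
        problems.append("compactor.retention_enabled is not true")

    actual = cfg.get("limits_config", {}).get("retention_period")
    if actual != expected_period:
        problems.append(
            f"limits_config.retention_period is {actual!r}, expected {expected_period!r}"
        )
    return problems


# Example with an already-parsed config (values are illustrative).
parsed = {
    "schema_config": {"configs": [{"store": "boltdb-shipper"}]},
    "compactor": {"retention_enabled": True},
    "limits_config": {"retention_period": "24h"},
}
print(retention_problems(parsed, "24h"))
```

This only checks the rendered config. Confirming that old logs are actually deleted still requires waiting for the retention period to elapse and inspecting the Loki logs.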

Important Considerations

@IbraAoad IbraAoad marked this pull request as ready for review April 19, 2024 07:40
@lucabello lucabello left a comment

LGTM, but I'd love another pair of eyes on this

@mmkay mmkay left a comment

LGTM as well.

src/charm.py (outdated, resolved)
config.yaml (resolved)
src/charm.py (outdated, resolved)
src/charm.py (outdated, resolved)
src/charm.py (resolved)
src/config_builder.py (outdated, resolved)
tests/integration/helpers.py (resolved)
tests/integration/test_loki_configs.py (resolved)
@simskij simskij merged commit 58eb250 into main Apr 22, 2024
13 checks passed
@simskij simskij deleted the log-retention branch April 22, 2024 15:56
Development

Successfully merging this pull request may close these issues.

Expose retention tuning knobs
5 participants