Closed
Description
Describe the bug
Ingesters stopped triggering TSDB compactions. This caused OOM issues and data loss, because no blocks were pushed to remote storage (Google Cloud Storage).
To Reproduce
- Consul restarted after being OOM-killed
- The ingester ring became unhealthy
Expected behavior
- Ingesters should keep triggering TSDB compactions after the Consul restart.
Environment:
- Infrastructure: Kubernetes v1.26.7, Cortex v1.15.3
- Deployment tool: Kustomize
Additional Context
Kernel logs from the server running Consul:
[Mon Mar 24 09:33:13 2025] Code: Bad RIP value.
[Mon Mar 24 09:33:13 2025] RSP: 002b:000000c00009df18 EFLAGS: 00010202
[Mon Mar 24 09:33:13 2025] RAX: 0000000000000000 RBX: 0000000000004e20 RCX: 00000000004698dd
[Mon Mar 24 09:33:13 2025] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000c00009df18
[Mon Mar 24 09:33:13 2025] RBP: 000000c00009df28 R08: 000000007645c2a4 R09: 00007ffea5d690b0
[Mon Mar 24 09:33:13 2025] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000439c60
[Mon Mar 24 09:33:13 2025] R13: 0000000000000000 R14: 00000000036e71dc R15: 0000000000000000
[Mon Mar 24 09:33:13 2025] Task in /kubepods/burstable/pod19144e2d-5344-4ea2-a161-fd1e4e57fab1/1f289fc88a99539f34d90c61b7eade3a341bd8fa0fe870c2f6f0f8001949efc4 killed as a result of limit of /kubepods/burstable/pod19144e2d-5344-4ea2-a161-fd1e4e57fab1
[Mon Mar 24 09:33:13 2025] memory: usage 524288kB, limit 524288kB, failcnt 1913986
[Mon Mar 24 09:33:13 2025] memory+swap: usage 524204kB, limit 9007199254740988kB, failcnt 0
[Mon Mar 24 09:33:13 2025] kmem: usage 21224kB, limit 9007199254740988kB, failcnt 0
[Mon Mar 24 09:33:13 2025] Memory cgroup stats for /kubepods/burstable/pod19144e2d-5344-4ea2-a161-fd1e4e57fab1: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
[Mon Mar 24 09:33:13 2025] Memory cgroup stats for /kubepods/burstable/pod19144e2d-5344-4ea2-a161-fd1e4e57fab1/17b14f8338505345e097052aa04c04b3a0db60980bda3fe253e4cd58dcccff24: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:36KB inactive_file:0KB active_file:0KB unevictable:0KB
[Mon Mar 24 09:33:13 2025] Memory cgroup stats for /kubepods/burstable/pod19144e2d-5344-4ea2-a161-fd1e4e57fab1/3d01ee86d45aa5dc52c06cd2144b02dd652c5828c55b4a62c070c1cc766468ed: cache:227528KB rss:0KB rss_huge:0KB shmem:228068KB mapped_file:50688KB dirty:0KB writeback:0KB swap:0KB inactive_anon:3976KB active_anon:223684KB inactive_file:0KB active_file:0KB unevictable:0KB
[Mon Mar 24 09:33:13 2025] Memory cgroup stats for /kubepods/burstable/pod19144e2d-5344-4ea2-a161-fd1e4e57fab1/1f289fc88a99539f34d90c61b7eade3a341bd8fa0fe870c2f6f0f8001949efc4: cache:2616KB rss:271908KB rss_huge:0KB shmem:2196KB mapped_file:660KB dirty:0KB writeback:0KB swap:0KB inactive_anon:96KB active_anon:273872KB inactive_file:1076KB active_file:152KB unevictable:0KB
[Mon Mar 24 09:33:13 2025] Tasks state (memory values in pages):
[Mon Mar 24 09:33:13 2025] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
[Mon Mar 24 09:33:13 2025] [ 25913] 0 25913 242 1 28672 0 -998 pause
[Mon Mar 24 09:33:13 2025] [ 2779] 0 2779 52 2 20480 0 985 docker-entrypoi
[Mon Mar 24 09:33:13 2025] [ 2797] 100 2797 331313 79132 1064960 0 985 consul
[Mon Mar 24 09:33:13 2025] [ 20375] 0 20375 397 16 45056 0 985 sh
[Mon Mar 24 09:33:13 2025] [ 20609] 0 20609 1181 16 40960 0 985 curl
[Mon Mar 24 09:33:13 2025] [ 20631] 0 20631 394 13 32768 0 985 grep
[Mon Mar 24 09:33:13 2025] [ 20971] 0 20971 394 2 32768 0 985 sh
[Mon Mar 24 09:33:13 2025] Memory cgroup out of memory: Kill process 2797 (consul) score 1590 or sacrifice child
[Mon Mar 24 09:33:13 2025] Killed process 2797 (consul) total-vm:1325252kB, anon-rss:265244kB, file-rss:0kB, shmem-rss:51284kB
[Mon Mar 24 09:33:13 2025] oom_reaper: reaped process 2797 (consul), now anon-rss:0kB, file-rss:0kB, shmem-rss:51284kB
[Mon Mar 24 09:33:17 2025] TCP: request_sock_TCP: Possible SYN flooding on port 8500. Sending cookies. Check SNMP counters.
[Mon Mar 24 14:58:40 2025] IPv6: ADDRCONF(NETDEV_UP): cali350c831b699: link is not ready
[Mon Mar 24 14:58:40 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali350c831b699: link becomes ready
[Mon Mar 24 16:28:41 2025] IPv6: ADDRCONF(NETDEV_UP): cali28c7cc3caa3: link is not ready
[Mon Mar 24 16:28:41 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali28c7cc3caa3: link becomes ready
[Mon Mar 24 16:58:39 2025] IPv6: ADDRCONF(NETDEV_UP): cali9cc360364f7: link is not ready
[Mon Mar 24 16:58:39 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali9cc360364f7: link becomes ready
[Mon Mar 24 18:28:42 2025] IPv6: ADDRCONF(NETDEV_UP): cali42981617b56: link is not ready
[Mon Mar 24 18:28:42 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali42981617b56: link becomes ready
[Mon Mar 24 19:58:42 2025] IPv6: ADDRCONF(NETDEV_UP): califadadf0982a: link is not ready
[Mon Mar 24 19:58:42 2025] IPv6: ADDRCONF(NETDEV_CHANGE): califadadf0982a: link becomes ready
[Mon Mar 24 20:28:41 2025] IPv6: ADDRCONF(NETDEV_UP): cali21ba95b6eca: link is not ready
[Mon Mar 24 20:28:41 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali21ba95b6eca: link becomes ready
[Mon Mar 24 22:28:43 2025] IPv6: ADDRCONF(NETDEV_UP): caliba27763a131: link is not ready
[Mon Mar 24 22:28:43 2025] IPv6: ADDRCONF(NETDEV_CHANGE): caliba27763a131: link becomes ready
[Mon Mar 24 22:58:39 2025] IPv6: ADDRCONF(NETDEV_UP): cali06cb01d420f: link is not ready
[Mon Mar 24 22:58:39 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali06cb01d420f: link becomes ready
[Mon Mar 24 22:58:40 2025] IPv6: ADDRCONF(NETDEV_UP): cali745f5a04bdc: link is not ready
[Mon Mar 24 22:58:40 2025] IPv6: ADDRCONF(NETDEV_CHANGE): cali745f5a04bdc: link becomes ready
[Tue Mar 25 00:58:41 2025] IPv6: ADDRCONF(NETDEV_UP): calif0e472cf564: link is not ready
[Tue Mar 25 00:58:41 2025] IPv6: ADDRCONF(NETDEV_CHANGE): calif0e472cf564: link becomes ready
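The figures in the OOM report above can be cross-checked with a little arithmetic. The sketch below is only a reading aid: it assumes 4 KiB pages (the log says task memory values are in pages but does not state the page size), and the variable names are illustrative. All numbers are copied from the log lines above.

```python
# Cross-check of the OOM-killer figures in the kernel log above.
# Assumption: 4 KiB pages (typical on x86-64; not stated in the log).
PAGE_KB = 4

limit_kb = 524288            # "memory: usage 524288kB, limit 524288kB"
consul_rss_pages = 79132     # rss column for pid 2797 (consul)
anon_rss_kb = 265244         # "Killed process 2797 ... anon-rss:265244kB"
shmem_rss_kb = 51284         # "... shmem-rss:51284kB"

# Per-container cache + rss from the "Memory cgroup stats" lines (kB).
containers_kb = {
    "3d01ee86...": 227528 + 0,       # almost entirely shmem-backed cache
    "1f289fc8...": 2616 + 271908,    # the container that was killed
}

consul_rss_kb = consul_rss_pages * PAGE_KB

print(f"pod limit: {limit_kb // 1024} MiB")
print(f"consul rss: {consul_rss_kb} kB")
print(f"matches anon+shmem: {consul_rss_kb == anon_rss_kb + shmem_rss_kb}")
for name, kb in containers_kb.items():
    print(f"{name}: {kb} kB = {100 * kb / limit_kb:.1f}% of limit")
```

Under these assumptions the consul process accounts for 316528 kB, which equals anon-rss plus shmem-rss exactly, and the pod-level limit works out to 512 MiB. Two containers in the pod together hold roughly 96% of that limit, so the shmem-backed cache in the other container may be worth checking alongside Consul's own usage.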

