Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
crimson/monc: fix subscription stall that blocked peering.
There is a scenario when the `active_con` is properly chosen but isn't marked as `ready_to_send`. If `renew_subs()` is called during the `on_session_opened()`, the flag will be turned on after the subscriptions are renewed which cannot happen as it requires the flag to be already set. In other words: there is a circular data dependency. The net result is stalling the subscription machinery, particularly the `OSDMap` subs. This caused a nasty peering issue at Sepia [1] where PG 2.7 got stuck in the `GetInfo` state. ``` rzarzynski@teuthology:/home/teuthworker/archive/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136908$ less ./remote/smithi039/log/ceph-osd.1.log.gz ... DEBUG 2021-05-26 20:19:48,134 [shard 0] osd - pg_epoch 14 pg[2.7( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) [] r= -1 lpr=0 crt=0'0 mlcod 0'0 unknown enter Initial ... DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=0 crt=0'0 mlcod 0'0 unknown enter Reset ... DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Started ... DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Start ... DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 unknown enter Started/Primary ... DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating enter Started/Primary/Peering ... DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetInfo DEBUG 2021-05-26 20:19:48,138 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering build_prior all_probe DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering build_prior final: probe 0,1 down blocked_by {} DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering up_thru 0 < same_since 14, must notify monitor DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: no prior_set down osds, clearing prior_readable_until_ub DEBUG 2021-05-26 20:19:48,139 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: querying info from osd.0 ... DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering got osd.0 2.7( DNE empty local-lis/les=0/0 n=0 ec=0/0 lis/c=0/0 les/c/f=0/0/0 sis=0) DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Adding osd: 0 peer features: 3f01cfbb7ffdffff DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common peer features: 3f01cfbb7ffdffff DEBUG 2021-05-26 20:19:48,237 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common acting features: 3f01cfbb7ffdffff DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering state<Started/Primary/Peering/GetInfo>: Common upacting features: 3f01cfbb7ffdffff DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering exit Started/Primary/Peering/GetInfo 0.099480 4 2021-05-26T20:19:48.146172+0000 ... DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetLog ... DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/GetMissing ... DEBUG 2021-05-26 20:19:48,238 [shard 0] osd - pg_epoch 14 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+peering enter Started/Primary/Peering/WaitUpThru ... DEBUG 2021-05-26 20:19:49,139 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=0/0 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating enter Started/Primary/Active ... DEBUG 2021-05-26 20:19:49,142 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=0/0 les/c/f=0/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 creating+activating enter Started/Primary/Active/Activating ... DEBUG 2021-05-26 20:19:49,204 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/0 les/c/f=15/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Recovered ... DEBUG 2021-05-26 20:19:49,204 [shard 0] osd - pg_epoch 15 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/0 les/c/f=15/0/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Started/Primary/Active/Clean ... DEBUG 2021-05-26 20:22:31,223 [shard 0] osd - pg_epoch 86 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=14) [1,0] r=0 lpr=14 crt=0'0 mlcod 0'0 active enter Reset ... <a lot of flipping> ... DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown activate_map DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown exit Reset 0.035744 1 2021-05-26T20:24:07.817331+0000 INFO 2021-05-26 20:24:07,851 [shard 0] osd - Exiting state: Reset, entered at 1622060647.8158188, 1622060647.8173316 spent on 1 events DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started INFO 2021-05-26 20:24:07,851 [shard 0] osd - Entering state: Started DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Start INFO 2021-05-26 20:24:07,851 [shard 0] osd - Entering state: Start INFO 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown state<Start>: transitioning to Primary DEBUG 2021-05-26 20:24:07,851 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown exit Start 0.000041 0 0.000000 INFO 2021-05-26 20:24:07,851 [shard 0] osd - Exiting state: Start, entered at 1622060647.8516333, 0.0 spent on 0 events DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started/Primary INFO 2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163 ) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 unknown enter Started/Primary/Peering INFO 2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary/Peering DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering enter Started/Primary/Peering/GetInfo INFO 2021-05-26 20:24:07,852 [shard 0] osd - Entering state: Started/Primary/Peering/GetInfo ... DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior all_probe 0,1,4 DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior maybe_rw interval:139, acting: 0 DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering build_prior final: probe 0,1,4 down blocked_by {} DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering up_thru 125 < same_since 163, must notify monitor DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>: no prior_set down osds, clearing prior_readable_until_ub DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>: querying info from osd.0 DEBUG 2021-05-26 20:24:07,852 [shard 0] osd - pg_epoch 163 pg[2.7( empty local-lis/les=14/15 n=0 ec=14/14 lis/c=14/14 les/c/f=15/15/0 sis=163) [1,0] r=0 lpr=163 pi=[14,163)/1 crt=0'0 mlcod 0'0 peering state<Started/Primary/Peering/GetInfo>: querying info from osd.4 ... DEBUG 2021-05-26 20:24:07,924 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] connect to existing DEBUG 2021-05-26 20:24:07,924 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] --> ceph#62 === pg_query2(2.7 2.7 query(info 0'0 epoch_sent 163) e163/163) v1 (131) ... DEBUG 2021-05-26 20:24:07,942 [shard 0] ms - [osd.1(cluster) v2:172.21.15.39:6803/34727@61064 >> osd.4 v2:172.21.15.62:6802/34686] GOT AckFrame: seq=62 ... <plenty of osd_ping messanging but no reply to the pg_query for 2.7> ... DEBUG 2021-05-26 20:58:19,829 [shard 0] ms - [osd.1(hb_front) v2:172.21.15.39:6807/34727 >> osd.4 v2:172.21.15.62:6807/34686@54816] <== ceph#772 = == osd_ping(ping e17 up_from 10 ping_stamp 2021-05-26T20:58:19.825573+0000/2319.780029297s send_stamp 2319.780029297s) v5 (70) DEBUG 2021-05-26 20:58:19,829 [shard 0] ms - [osd.1(hb_front) v2:172.21.15.39:6807/34727 >> osd.4 v2:172.21.15.62:6807/34686@54816] --> ceph#772 === osd_ping(ping_reply e249 up_from 10 ping_stamp 2021-05-26T20:58:19.825573+0000/2319.780029297s send_stamp 2320.039062500s) v5 (70 ``` The peering request got stuck due to awaiting for `OSDMap`. ``` DEBUG 2021-05-26 20:24:07,930 [shard 0] ms - [osd.4(cluster) v2:172.21.15.62:6802/34686 >> osd.1 v2:172.21.15.39:6803/34727@61064] <== ceph#62 === pg_query2(2.7 2.7 query(info 0'0 epoch_sent 163) e163/163) v1 (131) DEBUG 2021-05-26 20:24:07,930 [shard 0] osd - handle_peering_op on 2.7 from 1 DEBUG 2021-05-26 20:24:07,930 [shard 0] osd - peering_event(id=517, detail=PeeringEvent(from=1 pgid=2.7 sent=163 requested=163 evt=epoch_sent: 163 epoch_requested: 163 MQuery 2.7 from 1 query_epoch 163 query: query(info 0'0 epoch_sent 163))): star ``` ``` INFO 2021-05-26 20:19:49,127 [shard 0] osd - evt epoch is 15, i have 14, will wait INFO 2021-05-26 20:19:49,128 [shard 0] osd - osdmap_subscribe(14) DEBUG 2021-05-26 20:19:49,128 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #9 === mon_s ubscribe({osdmap=14}) v3 (15) ... INFO 2021-05-26 20:19:49,131 [shard 0] osd - handle_osd_map osd_map(14..15 src has 1..15) v4 INFO 2021-05-26 20:19:49,131 [shard 0] osd - handle_osd_map epochs [14..15], i have 15, src has [1..15] ... INFO 2021-05-26 20:19:49,138 [shard 0] osd - handle_osd_map osd_map(14..15 src has 1..15) v4 INFO 2021-05-26 20:19:49,138 [shard 0] osd - handle_osd_map epochs [14..15], i have 15, src has [1..15] ... INFO 2021-05-26 20:19:49,139 [shard 0] osd - evt epoch is 15, i have 14, will wait INFO 2021-05-26 20:19:49,141 [shard 0] osd - osdmap_subscribe(14) WARN 2021-05-26 20:19:49,141 [shard 0] monc - renew_subs - empty ... INFO 2021-05-26 20:19:50,140 [shard 0] osd - handle_osd_map osd_map(15..16 src has 1..16) v4 INFO 2021-05-26 20:19:50,140 [shard 0] osd - handle_osd_map epochs [15..16], i have 15, src has [1..16] DEBUG 2021-05-26 20:19:50,141 [shard 0] bluestore - do_transaction INFO 2021-05-26 20:19:50,145 [shard 0] osd - osd.4: committed_osd_maps(16, 16) ... INFO 2021-05-26 20:20:42,881 [shard 0] osd - handle_osd_map epochs [16..17], i have 16, src has [1..17] DEBUG 2021-05-26 20:20:42,882 [shard 0] bluestore - do_transaction INFO 2021-05-26 20:20:42,886 [shard 0] osd - osd.4: committed_osd_maps(17, 17) ... INFO 2021-05-26 20:20:43,941 [shard 0] osd - evt epoch is 18, i have 17, will wait INFO 2021-05-26 20:20:43,941 [shard 0] osd - osdmap_subscribe(17) ... INFO 2021-05-26 20:20:43,957 [shard 0] osd - evt epoch is 18, i have 17, will wait INFO 2021-05-26 20:20:43,957 [shard 0] osd - osdmap_subscribe(17) ... INFO 2021-05-26 20:20:43,969 [shard 0] osd - evt epoch is 18, i have 17, will wait INFO 2021-05-26 20:20:43,969 [shard 0] osd - osdmap_subscribe(17) ... DEBUG 2021-05-26 20:20:46,930 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@57288 >> mon.2 v2:172.21.15.39:3301/0] <== #4 === osd_m ap(20..21 src has 1..21) v4 (41) INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map osd_map(20..21 src has 1..21) v4 INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map epochs [20..21], i have 17, src has [1..21] INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map message skips epochs 18..19 INFO 2021-05-26 20:20:46,930 [shard 0] osd - osdmap_subscribe(18) ... DEBUG 2021-05-26 20:20:47,936 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@57288 >> mon.2 v2:172.21.15.39:3301/0] <== #5 === osd_m ap(21..22 src has 1..22) v4 (41) INFO 2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map osd_map(21..22 src has 1..22) v4 INFO 2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map epochs [21..22], i have 17, src has [1..22] INFO 2021-05-26 20:20:47,936 [shard 0] osd - handle_osd_map message skips epochs 18..20 INFO 2021-05-26 20:20:47,936 [shard 0] osd - osdmap_subscribe(18) ... <osdmap_subscribe(18) over and over> ``` ``` 2021-05-26T20:19:42.048+0000 7f4712ffd700 1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] <== osd.4 v2:172.21.15.62:6801/34686 4 ==== mon_subscribe({mgrmap=0+,osd_pg_creates=0+,osdmap=0+}) v3 ==== 82+0+0 (secure 0 0 0) 0x7f46fc04e150 con 0x7f470401c480 2021-05-26T20:19:42.048+0000 7f4712ffd700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x7f46fc02f500 for osd.4 2021-05-26T20:19:42.048+0000 7f4712ffd700 20 mon.b@1(peon) e1 entity_name osd.4 global_id 4168 (new_ok) caps allow * 2021-05-26T20:19:42.048+0000 7f4712ffd700 10 mon.b@1(peon) e1 handle_subscribe mon_subscribe({mgrmap=0+,osd_pg_creates=0+,osdmap=0+}) v3 ... 2021-05-26T20:19:49.129+0000 7f4712ffd700 1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] <== osd.4 v2:172.21.15.62:6801/34686 9 ==== mo n_subscribe({osdmap=14}) v3 ==== 36+0+0 (secure 0 0 0) 0x7f46e8556210 con 0x7f470401c480 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon) e1 _ms_dispatch existing session 0x7f46fc02f500 for osd.4 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon) e1 entity_name osd.4 global_id 4168 (new_ok) caps allow * 2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon) e1 handle_subscribe mon_subscribe({osdmap=14}) v3 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 is_capable service=mon command= read addr v2:172.21.15.62:6801/34686 on cap allow * 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow so far , doing grant allow * 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow all 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 is_capable service=osd command= read addr v2:172.21.15.62:6801/34686 on cap allow * 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow so far , doing grant allow * 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 allow all 2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon).osd e15 check_osdmap_sub 0x7f46e84f0150 next 14 (onetime) 2021-05-26T20:19:49.129+0000 7f4712ffd700 5 mon.b@1(peon).osd e15 send_incremental [14..15] to osd.4 2021-05-26T20:19:49.129+0000 7f4712ffd700 10 mon.b@1(peon).osd e15 build_incremental [14..15] with features 3f01cfbb7ffdffff 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon).osd e15 build_incremental inc 15 622 bytes 2021-05-26T20:19:49.129+0000 7f4712ffd700 20 mon.b@1(peon).osd e15 build_incremental inc 14 578 bytes 2021-05-26T20:19:49.129+0000 7f4712ffd700 1 -- [v2:172.21.15.62:3300/0,v1:172.21.15.62:6789/0] --> v2:172.21.15.62:6801/34686 -- osd_map(14.. 15 src has 1..15) v4 -- 0x7f46e856a100 con 0x7f470401c480 ``` ``` seastar::future<> Client::renew_subs() { if (!sub.have_new()) { logger().warn("{} - empty", __func__); return seastar::now(); } logger().trace("{}", __func__); auto m = crimson::make_message<MMonSubscribe>(); m->what = sub.get_subs(); m->hostname = ceph_get_short_hostname(); return send_message(std::move(m)).then([this] { sub.renewed(); }); } ``` ``` INFO 2021-05-26 20:19:42,081 [shard 0] osd - osdmap_subscribe(1) DEBUG 2021-05-26 20:19:42,081 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #6 === mon_s ubscribe({osdmap=1}) v3 (15) ... INFO 2021-05-26 20:19:49,128 [shard 0] osd - osdmap_subscribe(14) DEBUG 2021-05-26 20:19:49,128 [shard 0] ms - [osd.4(client) v2:172.21.15.62:6801/34686@63208 >> mon.1 v2:172.21.15.62:3300/0] --> #9 === mon_subscribe({osdmap=14}) v3 (15) ... INFO 2021-05-26 20:19:49,141 [shard 0] osd - osdmap_subscribe(14) WARN 2021-05-26 20:19:49,141 [shard 0] monc - renew_subs - empty <no MMonSubcribe> ... INFO 2021-05-26 20:20:43,941 [shard 0] osd - evt epoch is 18, i have 17, will wait INFO 2021-05-26 20:20:43,941 [shard 0] osd - osdmap_subscribe(17) <no MMonSubcribe> ... INFO 2021-05-26 20:20:46,930 [shard 0] osd - handle_osd_map message skips epochs 18..19 INFO 2021-05-26 20:20:46,930 [shard 0] osd - osdmap_subscribe(18) <no MMonSubcribe> ``` [1]: http://pulpito.front.sepia.ceph.com/rzarzynski-2021-05-26_12:20:26-rados-master-distro-basic-smithi/6136908 Signed-off-by: Radoslaw Zarzynski <[email protected]>
- Loading branch information