
Added Control of link up events by interruptions #136

Merged: 14 commits, Jun 22, 2023

Conversation

@Ktmi commented Jun 13, 2023

This PR adds an event listener for the new interruptions API, allowing kytos-ng/topology to produce the appropriate link up and down events, as well as the topology notify update event. Indirectly, this should resolve the following issues

Additional Notes

While I have added visibility of status and status_reason to the UI for Links and Switches, this is still missing for Interfaces. Additionally, changes to status/status_reason are not reflected unless the page is reloaded. This applies both to changes caused by enabling/disabling a device and to changes caused by an interruption at the device.
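
As a rough, hedged sketch of the listener described above: the handler names match the ones discussed later in this review, but the payload keys "type" and "links" and the notify_link_status_change helper are assumptions for illustration, not necessarily this PR's exact code.

    from kytos.core import KytosEvent
    from kytos.core.helpers import listen_to


    class Main:  # stand-in for the topology NApp's Main class

        @listen_to("topology.interruption.start")
        def on_interruption_start(self, event: KytosEvent):
            """Receive the start of a service interruption."""
            self.handle_interruption_start(event)

        def handle_interruption_start(self, event: KytosEvent):
            """Deals with the start of service interruption."""
            interruption_type = event.content.get("type")  # assumed key
            for link in event.content.get("links", []):  # assumed key
                # assumed helper that emits the link up/down notifications
                self.notify_link_status_change(link, reason=interruption_type)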

@Ktmi Ktmi requested a review from a team as a code owner June 13, 2023 15:08
@viniarck (Member) left a comment

@Ktmi, overall this looks good; it's nice that you also took the opportunity to augment some of the UI components. I've asked for a few minor changes, though. Also:

1 - Let me confirm my understanding regarding #62: it looks like that one will only be covered once the on_*_maintenance_* methods are removed, which we'll only be able to do once the rest of the topology.interruption.start|end producers (maintenance, in this case) are in place, right? Do you want to create a new task to make this explicit, or will you reuse the same one?

2 - Once the comments are addressed, I'd be OK with merging this one, since the topology.interruption.start|end producer hasn't landed yet, so it should be safe. Before merging, though, let me know which local tests you minimally explored/ran.

Resolved review threads on main.py and ui/k-info-panel/link_info.kytos.
@viniarck (Member) left a comment

Resolved review threads on main.py.
@viniarck viniarck self-requested a review June 20, 2023 16:53
        self.handle_interruption_start(event)

    def handle_interruption_start(self, event: KytosEvent):
        """Deals with the start of service interruption."""
@viniarck (Member) commented:

@Ktmi, considering the point that @italovalcy also brought to attention in yesterday's meeting, maybe both handle_interruption_start and handle_interruption_end would be good places to centralize log.info entries; after all, we're also using log.info for port status changes, so it would be reasonable here too. Another benefit of centralizing this in topology is that it could simplify certain logs for interruption sources. Wdyt?
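
A hedged sketch of what that centralized log.info could look like (log is the kytos.core logger; the payload keys are the same illustrative assumptions as in the earlier sketch, not a committed design):

    from kytos.core import KytosEvent, log

    # method of the topology NApp's Main class
    def handle_interruption_start(self, event: KytosEvent):
        """Deals with the start of service interruption."""
        interruption_type = event.content.get("type")  # assumed key
        links = event.content.get("links", [])  # assumed key
        log.info(
            f"Interruption started, type: {interruption_type}, "
            f"links: {[link.id for link in links]}"
        )
        # ...then produce the corresponding link down / topology update events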

@viniarck (Member) commented:

@Ktmi, let me know about this one; if you agree, also let me know whether it'll be part of this PR or a subsequent one.

@viniarck (Member) left a comment

@Ktmi, I'll leave this PR pre-approved, looks good.

I've also taken the opportunity to explore it locally; I ran these tests:

  • I confirmed that, with liveness, it wouldn't redundantly notify up again when it gets enabled.
  • I also used the same link-flap script that I used on my previous topology PR; no anomalies were observed, and the final states were as expected.

Once the e2e tests are passing, and if your local exploratory tests are passing as well, we can merge this. Btw, I raised a new point about introducing a log.info to get your thoughts on it, but that can also be delivered in a subsequent PR if we decide to add it.

@Ktmi (Author) commented Jun 21, 2023

I haven't run the e2e tests yet: I kept finding so many potential consistency issues with link activation/deactivation that I ended up rewriting a large portion of topology, and the changes are starting to creep into other NApps. A few notes on the matter:

  • of_lldp liveness will be rewritten to manage liveness status without using metadata, and will instead produce interruptions
  • of_core should probably be in full control of link activation, rather than topology, as it has proper visibility into that space
  • Setting status to != UP for switches and interfaces shouldn't deactivate links; instead, links should inherit the status and status reason of their parent objects (see the sketch after this list)
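
A hypothetical sketch of the inheritance idea in the last bullet (not existing kytos code; EntityStatus comes from kytos.core.common, and the attribute names mirror the current Link model):

    from kytos.core.common import EntityStatus


    class Link:  # sketch of the proposed model, not the current one
        @property
        def status(self):
            """Inherit status from the endpoint interfaces (hypothetical)."""
            endpoint_statuses = {self.endpoint_a.status, self.endpoint_b.status}
            if not self._enabled or EntityStatus.DISABLED in endpoint_statuses:
                return EntityStatus.DISABLED
            if not self._active or EntityStatus.DOWN in endpoint_statuses:
                return EntityStatus.DOWN
            return EntityStatus.UP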

@viniarck (Member) commented Jun 21, 2023

> I haven't run the e2e tests yet: I kept finding so many potential consistency issues with link activation/deactivation that I ended up rewriting a large portion of topology, and the changes are starting to creep into other NApps.

@Ktmi, I ran e2e on this PR, which is on top of your branch, and it's passing. It's great that you're analyzing and putting thought into it, but as usual, let's keep moving and address tasks based on their priorities: leave for 2023.2 what we can afford to (liveness, for instance), and keep in 2023.1.1 only the rest of the maintenance tasks that are indeed part of a minimal v1. If you encounter new or unplanned issues, or have ideas where things can be improved, please create a task and/or suggestion so it can be aligned and prioritized; by all means, feel free to use GitHub as you need to map these tasks/efforts.

> A few notes on the matter:

>   • of_lldp liveness will be rewritten to manage liveness status without using metadata, and will instead produce interruptions

Right. But let's leave this one for 2023.2 and also compare it with liveness BFD; maybe liveness BFD will still need to be prioritized first.

>   • of_core should probably be in full control of link activation, rather than topology, as it has proper visibility into that space

Topology's self.links could certainly be moved to the Kytos core, since it's a shared core resource (analogous to the controller.switches and switch.interfaces dicts), and then of_core could indeed also take the opportunity to activate links. But that still needs a discussion to see whether it's really worth it and when we could afford to do it. Other than that, the race conditions that were having an impact have been fixed. Historically, of_core was also designed to be as lightweight as possible, but I agree that activating links there, since it's already performing certain locks, could simplify things and make them easier to maintain.

>   • Setting status to != UP for switches and interfaces shouldn't deactivate links; instead, links should inherit the status and status reason of their parent objects

Indeed, active is independent, and status is a function of self._active and self._enabled, plus the other logical attributes we're registering. Regarding the status of composed objects: yes, ideally, but again, let's plan incrementally here; let's avoid significant rewrites in parts where we can still move incrementally, and then decide in the next planning which parts we want to keep improving, OK?
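
For reference, a minimal sketch of the current derivation described above, where status is a function of _enabled and _active (the exact precedence and the extra registered logical attributes in the Kytos core may differ):

    from kytos.core.common import EntityStatus

    # property of a topology entity (switch, interface, or link)
    @property
    def status(self):
        """Status as a function of _enabled and _active (sketch)."""
        if self._enabled and self._active:
            return EntityStatus.UP
        if not self._enabled:
            return EntityStatus.DISABLED
        return EntityStatus.DOWN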

@Ktmi (Author) commented Jun 21, 2023

Yeah, I guess we can put those additional changes into a later release and just push out what we already have. For now, let's get this merged.

@Ktmi (Author) commented Jun 21, 2023

Getting failing e2e tests with this and kytos-ng/maintenance#78. Here are the test results. It seems that the EVCs aren't being properly notified.

I'm an idiot; the issue was on the maintenance side. The maintenance events were using an incorrect endpoint.

@viniarck (Member) commented:

> Getting failing e2e tests with this and kytos-ng/maintenance#78. Here are the test results. It seems that the EVCs aren't being properly notified.
>
> I'm an idiot; the issue was on the maintenance side. The maintenance events were using an incorrect endpoint.

Good morning. Let's merge this one then, and once the MW PRs are also ready we can decide whether it'll land in 2023.1 or 2023.1.1 (with the rest of the MW-related tasks you mentioned, like preview and so on). Thanks, David.
