Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support an eventual phase for the blueprint actions #1297

Merged
merged 11 commits into from
Apr 11, 2022
Merged

Conversation

viveksinghggits
Copy link
Contributor

@viveksinghggits viveksinghggits commented Mar 21, 2022

Change Overview

This PR is to support eventual phase for the blueprint actions. Even though there are problems in the main phases of an action in the blueprint the deferPhase would always be called. This is how we can specify that in a blueprint

The other details are discussed here #1220

This is an example of a blueprint is going to to look like with artifact being set from deferPhase and being consumed in the other action.

Pull request type

Please check the type of change your PR introduces:

  • 🚧 Work in Progress
  • 🌈 Refactoring (no functional changes, no api changes)
  • 🐹 Trivial/Minor
  • 🐛 Bugfix
  • 🌻 Feature
  • 🗺️ Documentation
  • 🤖 Test

Issues

Test Plan

Tests are added as part of this PR.

  • 💪 Manual
  • ⚡ Unit test
  • 💚 E2E

pkg/controller/controller.go Outdated Show resolved Hide resolved
pkg/apis/cr/v1alpha1/types.go Outdated Show resolved Hide resolved
pkg/controller/controller.go Outdated Show resolved Hide resolved
pkg/controller/controller.go Outdated Show resolved Hide resolved
@viveksinghggits viveksinghggits force-pushed the eventual-phase branch 2 times, most recently from 749dad5 to d285b71 Compare March 23, 2022 17:33
@pavannd1
Copy link
Contributor

Canceling build to let approved PRs go through. Will rerun once TravisCI frees up

@viveksinghggits viveksinghggits changed the title [WIP] Support an eventual phase for the blueprint actions Support an eventual phase for the blueprint actions Mar 28, 2022
docs/architecture.rst Outdated Show resolved Hide resolved
docs/templates.rst Outdated Show resolved Hide resolved
docs/templates.rst Outdated Show resolved Hide resolved
docs/templates.rst Show resolved Hide resolved
docs/tutorial.rst Outdated Show resolved Hide resolved
pkg/controller/controller.go Show resolved Hide resolved
pkg/phase.go Outdated Show resolved Hide resolved
pkg/param/param.go Show resolved Hide resolved
pkg/controller/controller.go Outdated Show resolved Hide resolved
pkg/controller/controller.go Show resolved Hide resolved
@viveksinghggits
Copy link
Contributor Author

~/work/opensource/kanister/pkg (eventual-phase*) » go test                                                                                                                                                              130 ↵ vivek@vivek
{"FallbackVersion":"v0.0.0","File":"pkg/phase_test.go","Function":"anotherTestFunc","Line":200,"PreferredVersion":"v11.11.11","level":"info","msg":"Falling back to default version of the function","time":"2022-03-30T17:04:36.116072664+02:00"}
OK: 3 passed
PASS
ok  	github.com/kanisterio/kanister/pkg	0.015s

@viveksinghggits viveksinghggits requested a review from ihcsim March 30, 2022 15:08
@viveksinghggits
Copy link
Contributor Author

viveksinghggits commented Apr 1, 2022

Hi @ihcsim
can you please have another look. I am adding more tests as part of https://github.com/kanisterio/kanister/pull/1337/files

@ihcsim ihcsim mentioned this pull request Apr 2, 2022
docs/architecture.rst Outdated Show resolved Hide resolved
docs/templates.rst Outdated Show resolved Hide resolved
docs/tutorial.rst Outdated Show resolved Hide resolved
pkg/controller/controller.go Outdated Show resolved Hide resolved
pkg/controller/controller_test.go Outdated Show resolved Hide resolved
pkg/param/param.go Outdated Show resolved Hide resolved
@ihcsim
Copy link
Contributor

ihcsim commented Apr 6, 2022

So far this is working more or less as expected 👍. One thing I notice is that it doesn't look like the validating webhook is checking the deferPhase. Can you confirm?

I tested with the following YAML:

apiVersion: cr.kanister.io/v1alpha1
kind: Blueprint
metadata:
  name: scale
  namespace: kanister
actions:
  scale:
    phases:
    - func: ScaleWorkload
      name: scale-down
      args:
        namespace: default
        name: nginx
        kind: deployment
        replicas: 0
    - func: ScaleWorkload
      name: error
      args:
        namespace: default
        name: non-existent
        kind: deployment
        replicas: 10
    deferPhase:
      func: ScaleWorkload
      name: scale-up
      args:
        namespace: default
        name: nginx
        kind: deployment
        replicas: 3
---
apiVersion: cr.kanister.io/v1alpha1
kind: ActionSet
metadata:
  generateName: scale-
  namespace: kanister
spec:
  actions:
  - name: scale
    blueprint: scale
    object:
      kind: Namespace
      name: default

When I had an invalid key in the args of the deferPhase, the validating webhook didn't seem to catch it.

@ihcsim
Copy link
Contributor

ihcsim commented Apr 6, 2022

I find it hard to follow the multiple places where the Status.State are set and reconcile.ActionSet() are called. Can we should refactor these code in a separate PR, with the following changes:

i. Call reconcile.ActionSet() just once. Right now we try to reconcile the actionset in every iteration of the for loop
i. Within the same for loop, we only store the state of each phase in memory
i. Set as.Status.State and as.Status.ErState.Error just once, based on whether coreErr and deferErr are nil or not. It feels like because we are setting them in different places, we need to have some "final overrides" in renderActionsetArtifacts()

Here's the diff of what I think it can look like. If it makes sense, maybe we can do this in a separate PR.
diff --git a/pkg/controller/controller.go b/pkg/controller/controller.go
index aa16f6a5..51d10f10 100644
--- a/pkg/controller/controller.go
+++ b/pkg/controller/controller.go
@@ -383,7 +383,7 @@ func (c *Controller) handleActionSet(as *crv1alpha1.ActionSet) (err error) {
 
 // nolint:gocognit
 func (c *Controller) runAction(ctx context.Context, as *crv1alpha1.ActionSet, aIDX int) error {
-	action := as.Spec.Actions[aIDX]
+	action := *as.Spec.Actions[aIDX].DeepCopy()
 	c.logAndSuccessEvent(ctx, fmt.Sprintf("Executing action %s", action.Name), "Started Action", as)
 	bpName := as.Spec.Actions[aIDX].Blueprint
 	bp, err := c.crClient.CrV1alpha1().Blueprints(as.GetNamespace()).Get(ctx, bpName, v1.GetOptions{})
@@ -414,65 +414,85 @@ func (c *Controller) runAction(ctx context.Context, as *crv1alpha1.ActionSet, aI
 	t.Go(func() error {
 		var coreErr error
 		defer func() {
+			// always run deferPhase after normal phases exit
 			var deferErr error
 			if deferPhase != nil {
 				deferErr = c.executeDeferPhase(ctx, deferPhase, tp, bp, action.Name, aIDX, as)
 			}
-			c.renderActionsetArtifacts(ctx, as, aIDX, ns, name, action.Name, bp, tp, coreErr, deferErr)
-		}()
 
-		for i, p := range phases {
-			ctx = field.Context(ctx, consts.PhaseNameKey, p.Name())
-			c.logAndSuccessEvent(ctx, fmt.Sprintf("Executing phase %s", p.Name()), "Started Phase", as)
-			err = param.InitPhaseParams(ctx, c.clientset, tp, p.Name(), p.Objects())
-			var output map[string]interface{}
-			var msg string
-			if err == nil {
-				output, err = p.Exec(ctx, *bp, action.Name, *tp)
-			} else {
-				msg = fmt.Sprintf("Failed to init phase params: %#v:", as.Status.Actions[aIDX].Phases[i])
+			if coreErr == nil && deferErr == nil && len(as.Status.Actions[aIDX].Artifacts) > 0 {
+				coreErr = c.renderActionsetArtifacts(ctx, as, aIDX, ns, name, action.Name, bp, tp, coreErr, deferErr)
 			}
-			var rf func(*crv1alpha1.ActionSet) error
-			if err != nil {
-				coreErr = err
-				rf = func(ras *crv1alpha1.ActionSet) error {
+
+			// non-nil coreErr will always be added to the actionset status because
+			// it is likely to be the root cause of the actionset failure.
+			var f func(ras *crv1alpha1.ActionSet) error
+			if coreErr != nil {
+				f = func(ras *crv1alpha1.ActionSet) error {
+					ras.Status.State = crv1alpha1.StateFailed
+					ras.Status.Error = crv1alpha1.Error{
+						Message: coreErr.Error(),
+					}
+					return nil
+				}
+			} else if deferErr != nil {
+				f = func(ras *crv1alpha1.ActionSet) error {
 					ras.Status.State = crv1alpha1.StateFailed
 					ras.Status.Error = crv1alpha1.Error{
-						Message: err.Error(),
+						Message: deferErr.Error(),
 					}
-					ras.Status.Actions[aIDX].Phases[i].State = crv1alpha1.StateFailed
 					return nil
 				}
 			} else {
-				coreErr = nil
-				rf = func(ras *crv1alpha1.ActionSet) error {
-					ras.Status.Actions[aIDX].Phases[i].State = crv1alpha1.StateComplete
-					// this updates the phase output in the actionset status
-					ras.Status.Actions[aIDX].Phases[i].Output = output
+				f = func(ras *crv1alpha1.ActionSet) error {
+					ras.Status.State = crv1alpha1.StateComplete
 					return nil
 				}
 			}
 
-			if rErr := reconcile.ActionSet(ctx, c.crClient.CrV1alpha1(), ns, name, rf); rErr != nil {
+			if err := reconcile.ActionSet(ctx, c.crClient.CrV1alpha1(), ns, name, f); err != nil {
+				reason := fmt.Sprintf("ActionSetFailed Action: %s", action.Name)
+				msg := fmt.Sprintf("Failed to update ActionSet: %s:", name)
+				c.logAndErrorEvent(ctx, msg, reason, err, as, bp)
+				return
+			}
+		}()
+
+		for i, p := range phases {
+			ctx = field.Context(ctx, consts.PhaseNameKey, p.Name())
+			c.logAndSuccessEvent(ctx, fmt.Sprintf("Executing phase %s", p.Name()), "Started Phase", as)
+
+			if err := param.InitPhaseParams(ctx, c.clientset, tp, p.Name(), p.Objects()); err != nil {
+				as.Status.Actions[aIDX].Phases[i].State = crv1alpha1.StateFailed
+				coreErr = err
+
 				reason := fmt.Sprintf("ActionSetFailed Action: %s", as.Spec.Actions[aIDX].Name)
-				msg := fmt.Sprintf("Failed to update phase: %#v:", as.Status.Actions[aIDX].Phases[i])
-				c.logAndErrorEvent(ctx, msg, reason, rErr, as, bp)
-				coreErr = rErr
-				return nil
+				msg := fmt.Sprintf("Failed to init phase params: %#v:", as.Status.Actions[aIDX].Phases[i])
+				c.logAndErrorEvent(ctx, msg, reason, err, as, bp)
+				break
 			}
 
+			var (
+				output map[string]interface{}
+				msg    string
+			)
+			output, err := p.Exec(ctx, *bp, action.Name, *tp)
 			if err != nil {
+				as.Status.Actions[aIDX].Phases[i].State = crv1alpha1.StateFailed
+				coreErr = err
+
 				reason := fmt.Sprintf("ActionSetFailed Action: %s", as.Spec.Actions[aIDX].Name)
-				if msg == "" {
-					msg = fmt.Sprintf("Failed to execute phase: %#v:", as.Status.Actions[aIDX].Phases[i])
-				}
+				msg = fmt.Sprintf("Failed to execute phase: %#v:", as.Status.Actions[aIDX].Phases[i])
 				c.logAndErrorEvent(ctx, msg, reason, err, as, bp)
-				coreErr = err
-				return nil
+				break
 			}
+
+			as.Status.Actions[aIDX].Phases[i].State = crv1alpha1.StateComplete
+			as.Status.Actions[aIDX].Phases[i].Output = output
 			param.UpdatePhaseParams(ctx, tp, p.Name(), output)
 			c.logAndSuccessEvent(ctx, fmt.Sprintf("Completed phase %s", p.Name()), "Ended Phase", as)
 		}
+
 		return nil
 	})
 	return nil
@@ -491,47 +511,24 @@ func (c *Controller) executeDeferPhase(ctx context.Context,
 	aIDX int,
 	as *crv1alpha1.ActionSet,
 ) error {
-	actionsetName, actionsetNS := as.GetName(), as.GetNamespace()
 	ctx = field.Context(ctx, consts.PhaseNameKey, as.Status.Actions[aIDX].DeferPhase.Name)
 	c.logAndSuccessEvent(ctx, fmt.Sprintf("Executing deferPhase %s", as.Status.Actions[aIDX].DeferPhase.Name), "Started deferPhase", as)
 
 	output, err := deferPhase.Exec(context.Background(), *bp, actionName, *tp)
-	var rf func(*crv1alpha1.ActionSet) error
 	if err != nil {
-		rf = func(as *crv1alpha1.ActionSet) error {
-			as.Status.State = crv1alpha1.StateFailed
-			as.Status.Error = crv1alpha1.Error{
-				Message: err.Error(),
-			}
-			as.Status.Actions[aIDX].DeferPhase.State = crv1alpha1.StateFailed
-			return nil
-		}
-	} else {
-		rf = func(as *crv1alpha1.ActionSet) error {
-			as.Status.Actions[aIDX].DeferPhase.State = crv1alpha1.StateComplete
-			as.Status.Actions[aIDX].DeferPhase.Output = output
-			return nil
-		}
-	}
-	var msg string
-	if rErr := reconcile.ActionSet(context.Background(), c.crClient.CrV1alpha1(), actionsetNS, actionsetName, rf); rErr != nil {
-		reason := fmt.Sprintf("ActionSetFailed Action: %s", as.Spec.Actions[aIDX].Name)
-		msg := fmt.Sprintf("Failed to update defer phase: %#v:", as.Status.Actions[aIDX].DeferPhase)
-		c.logAndErrorEvent(ctx, msg, reason, rErr, as, bp)
-		return rErr
-	}
+		as.Status.Actions[aIDX].DeferPhase.State = crv1alpha1.StateFailed
 
-	if err != nil {
 		reason := fmt.Sprintf("ActionSetFailed Action: %s", as.Spec.Actions[aIDX].Name)
-		if msg == "" {
-			msg = fmt.Sprintf("Failed to execute defer phase: %#v:", as.Status.Actions[aIDX].DeferPhase)
-		}
+		msg := fmt.Sprintf("Failed to execute defer phase: %#v:", as.Status.Actions[aIDX].DeferPhase)
 		c.logAndErrorEvent(ctx, msg, reason, err, as, bp)
-		return err
+	} else {
+		as.Status.Actions[aIDX].DeferPhase.State = crv1alpha1.StateComplete
+		as.Status.Actions[aIDX].DeferPhase.Output = output
+
+		c.logAndSuccessEvent(ctx, fmt.Sprintf("Completed deferPhase %s", as.Status.Actions[aIDX].DeferPhase.Name), "Ended deferPhase", as)
+		param.UpdateDeferPhaseParams(context.Background(), tp, output)
 	}
 
-	c.logAndSuccessEvent(ctx, fmt.Sprintf("Completed deferPhase %s", as.Status.Actions[aIDX].DeferPhase.Name), "Ended deferPhase", as)
-	param.UpdateDeferPhaseParams(context.Background(), tp, output)
 	return nil
 }
 
@@ -542,64 +539,16 @@ func (c *Controller) renderActionsetArtifacts(ctx context.Context,
 	bp *crv1alpha1.Blueprint,
 	tp *param.TemplateParams,
 	coreErr, deferErr error,
-) {
-	// Check if output artifacts are present
-	artTpls := as.Status.Actions[aIDX].Artifacts
-	if len(artTpls) == 0 {
-		// No artifacts, set ActionSetStatus to complete
-		if rErr := reconcile.ActionSet(ctx, c.crClient.CrV1alpha1(), actionsetNS, actionsetName, func(ras *crv1alpha1.ActionSet) error {
-			if coreErr == nil && deferErr == nil {
-				ras.Status.State = crv1alpha1.StateComplete
-			} else {
-				ras.Status.State = crv1alpha1.StateFailed
-			}
-
-			return nil
-		}); rErr != nil {
-			reason := fmt.Sprintf("ActionSetFailed Action: %s", actionName)
-			msg := fmt.Sprintf("Failed to update ActionSet: %s", actionsetName)
-			c.logAndErrorEvent(ctx, msg, reason, rErr, as, bp)
-		}
-		return
-	}
+) error {
 	// Render the artifacts
+	artTpls := as.Status.Actions[aIDX].Artifacts
 	arts, err := param.RenderArtifacts(artTpls, *tp)
-	var af func(*crv1alpha1.ActionSet) error
 	if err != nil {
-		af = func(ras *crv1alpha1.ActionSet) error {
-			ras.Status.State = crv1alpha1.StateFailed
-			ras.Status.Error = crv1alpha1.Error{
-				Message: err.Error(),
-			}
-			return nil
-		}
-	} else {
-		af = func(ras *crv1alpha1.ActionSet) error {
-			ras.Status.Actions[aIDX].Artifacts = arts
-			// make sure that the core phases that were run also didnt return any error
-			// and then set actionset's state to be complete
-			if coreErr == nil && deferErr == nil {
-				ras.Status.State = crv1alpha1.StateComplete
-			} else {
-				ras.Status.State = crv1alpha1.StateFailed
-			}
-			return nil
-		}
-	}
-	// Update ActionSet
-	if aErr := reconcile.ActionSet(ctx, c.crClient.CrV1alpha1(), actionsetNS, actionsetName, af); aErr != nil {
-		reason := fmt.Sprintf("ActionSetFailed Action: %s", actionName)
-		msg := fmt.Sprintf("Failed to update Output Artifacts: %#v:", artTpls)
-		c.logAndErrorEvent(ctx, msg, reason, aErr, as, bp)
-		return
+		return err
 	}
 
-	if err != nil {
-		reason := fmt.Sprintf("ActionSetFailed Action: %s", actionName)
-		msg := "Failed to render output artifacts"
-		c.logAndErrorEvent(ctx, msg, reason, err, as, bp)
-		return
-	}
+	as.Status.Actions[aIDX].Artifacts = arts
+	return nil
 }
 
 func (c *Controller) logAndErrorEvent(ctx context.Context, msg, reason string, err error, objects ...runtime.Object) {

@viveksinghggits
Copy link
Contributor Author

Hi @ihcsim
I will raise another PR to make sure that DeferPhase is also being validated by our blueprint validating logic. I have raised a separate issue to track that.

@viveksinghggits
Copy link
Contributor Author

reconcile.ActionSet() just once. R

I totally agree with you, we should refactor that to make make sure we are setting the status only once and then calling reconcile actionset.

one of the core phases failed

The actionset status eventually must be failed if either one of core
phases or deferphases failed but we had a problem because of which
the actionset status was being set to complete even if the one of the
core phases failed. This commit fixes that.
It was happening because we were unknowingly setting error to nil
in deferPhase because deferPhase ran successfully.
@ihcsim
Copy link
Contributor

ihcsim commented Apr 6, 2022

@viveksinghggits @pavannd1 I feel like we can merge this. But we should update the documentation and the upcoming release changelog to mark this feature as experimental until the following issues are resolved:

i. #1364
i. #1337
i. #1360

WDYT?

@viveksinghggits
Copy link
Contributor Author

Hi @ihcsim
I agree that we should merge last two PRs with this main PR. But I think we can merge refactor one later as well.
From two issues that we want to get in with this PR, one PR is already raised, I will raise another one tomorrow.

@viveksinghggits
Copy link
Contributor Author

Meanwhile can you please review the PR that adds the tests.

@ihcsim
Copy link
Contributor

ihcsim commented Apr 6, 2022

I will still like to mark this as experimental in the documentation and the release changelog until at least we add the validating webhook fix. I think we already have all the two issues discussed in this PR.

Yes, I will take a look at the tests PR. Thanks!

Copy link
Contributor

@ihcsim ihcsim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍.

@viveksinghggits
Copy link
Contributor Author

Let me raise the PR for blueprint validation and then we can merge this.

@viveksinghggits
Copy link
Contributor Author

I have raised a PR to resolve #1360.

@mergify mergify bot merged commit c71b5aa into master Apr 11, 2022
@mergify mergify bot deleted the eventual-phase branch April 11, 2022 12:02
@viveksinghggits
Copy link
Contributor Author

@ihcsim
just pointing out Github actions failed but travis passed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants