Parts artifacts in the map-reduce example are not garbage collected #14091

Open
4 tasks done
kaancfidan opened this issue Jan 16, 2025 · 7 comments · May be fixed by #14096
Labels
  • area/artifacts: S3/GCP/OSS/Git/HDFS etc
  • area/gc: Garbage collection, such as TTLs, retentionPolicy, delays, and more
  • type/bug

Comments

Contributor

kaancfidan commented Jan 16, 2025

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened? What did you expect to happen?

I have a default workflow configuration that has:

artifactGC:
  strategy: OnWorkflowDeletion

and I am directly applying this map-reduce example without any modifications (except adding a namespace).

The workflow runs smoothly without any problems:
Image

and creates these artifacts on S3:
Image

After I delete the workflow manually, the workflow controller seems to think it garbage collected everything:

time="2025-01-16T16:34:32.240Z" level=info msg="Workflow to be dehydrated" Workflow Size=12482
time="2025-01-16T16:34:32.274Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3877085 workflow=map-reduce-2z98q
time="2025-01-16T16:34:32.279Z" level=info msg="Queueing Succeeded workflow integrations/map-reduce-2z98q for delete in 1439h59m55s due to TTL"
time="2025-01-16T16:34:37.288Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3877085 namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:34:37.292Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-2z98q
time="2025-01-16T16:35:39.882Z" level=info msg="Processing workflow" Phase=Failed ResourceVersion=3716373 namespace=integrations workflow=cci-pk-9dwzt
time="2025-01-16T16:35:39.882Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=cci-pk-9dwzt
time="2025-01-16T16:36:36.488Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3877657 namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:36.491Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-2z98q
time="2025-01-16T16:36:36.491Z" level=info msg="Creating Artifact GC Task map-reduce-2z98q-artgc-wfdel-2166136261-0" namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:36.526Z" level=info msg="creating pod to delete artifacts: map-reduce-2z98q-artgc-wfdel-2166136261" namespace=integrations strategy=OnWorkflowDeletion workflow=map-reduce-2z98q
time="2025-01-16T16:36:36.585Z" level=info msg="Workflow to be dehydrated" Workflow Size=12518
time="2025-01-16T16:36:36.628Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3877713 workflow=map-reduce-2z98q
time="2025-01-16T16:36:41.641Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3877713 namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:41.643Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.016Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3877713 namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.020Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.020Z" level=info msg="reconciling artifact-gc pod" message= namespace=integrations phase=Succeeded pod=map-reduce-2z98q-artgc-wfdel-2166136261 workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.020Z" level=info msg="processing completed Artifact GC Pod \"map-reduce-2z98q-artgc-wfdel-2166136261\"" namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.071Z" level=info msg="no remaining artifacts to GC, removing artifact GC finalizer (forceFinalizerRemoval=false)" namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.072Z" level=info msg="Workflow to be dehydrated" Workflow Size=12528
time="2025-01-16T16:36:52.108Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3877713 workflow=map-reduce-2z98q
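
To make the artifact GC sequence easier to spot among the other controller messages, the log can be filtered down to the GC-related lines for one workflow. The following is only a rough sketch: `parse_logfmt`, `artifact_gc_events` and `GC_MARKERS` are hypothetical names introduced here, the marker substrings are simply taken from the messages quoted above, and the parser assumes logrus-style `key=value` text as shown in this report.

```python
import shlex

# Substrings that mark artifact GC activity in the controller messages quoted
# above (an assumption based on this excerpt, not an exhaustive list).
GC_MARKERS = ("artifact gc", "artifact-gc", "delete artifacts")


def parse_logfmt(line):
    """Parse one key=value log line into a dict of fields."""
    fields = {}
    # shlex handles the double-quoted values and the \" escapes inside them.
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip stray words such as the "Workflow" in "Workflow Size=..."
            fields[key] = value
    return fields


def artifact_gc_events(lines, workflow):
    """Return (time, msg) pairs for GC-related messages of one workflow."""
    events = []
    for line in lines:
        if not line.strip():
            continue
        fields = parse_logfmt(line)
        msg = fields.get("msg", "")
        is_gc = any(marker in msg.lower() for marker in GC_MARKERS)
        if fields.get("workflow") == workflow and is_gc:
            events.append((fields.get("time", ""), msg))
    return events


# A few lines copied from the controller log above.
SAMPLE = r'''
time="2025-01-16T16:36:36.491Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-2z98q
time="2025-01-16T16:36:36.491Z" level=info msg="Creating Artifact GC Task map-reduce-2z98q-artgc-wfdel-2166136261-0" namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:36.526Z" level=info msg="creating pod to delete artifacts: map-reduce-2z98q-artgc-wfdel-2166136261" namespace=integrations strategy=OnWorkflowDeletion workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.020Z" level=info msg="processing completed Artifact GC Pod \"map-reduce-2z98q-artgc-wfdel-2166136261\"" namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.071Z" level=info msg="no remaining artifacts to GC, removing artifact GC finalizer (forceFinalizerRemoval=false)" namespace=integrations workflow=map-reduce-2z98q
time="2025-01-16T16:36:52.072Z" level=info msg="Workflow to be dehydrated" Workflow Size=12528
'''

for when, msg in artifact_gc_events(SAMPLE.splitlines(), "map-reduce-2z98q"):
    print(when, msg)
```

Run against the full log, this should show the GC task being created, the GC pod completing, and the finalizer being removed, which is the sequence that makes the controller look as if it had collected everything.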

Alas, the parts/ folder remains on S3, dangling indefinitely:

Image

I expected all the artifacts to be collected properly.
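
One way to state the leftover objects precisely is to group a bucket listing by the first folder under the workflow's prefix. This is a minimal sketch under stated assumptions: the helper name `leftover_by_folder` is hypothetical, the keys are assumed to be laid out as `<workflow-name>/<folder>/...`, and the sample keys below are made up for illustration rather than copied from the bucket. The list of keys would come from whatever listing tool is at hand.

```python
from collections import Counter


def leftover_by_folder(keys, workflow):
    """Count remaining object keys per top-level folder under one workflow prefix."""
    prefix = workflow.rstrip("/") + "/"
    counts = Counter()
    for key in keys:
        if not key.startswith(prefix):
            continue  # belongs to another workflow
        rest = key[len(prefix):]
        folder, sep, _ = rest.partition("/")
        # Keep a trailing slash for folders so plain files stay distinguishable.
        counts[folder + "/" if sep else folder] += 1
    return dict(counts)


# Hypothetical listing taken after the workflow was deleted.
remaining_keys = [
    "map-reduce-2z98q/parts/0.json",
    "map-reduce-2z98q/parts/1.json",
    "some-other-workflow/total.json",
]
print(leftover_by_folder(remaining_keys, "map-reduce-2z98q"))
```

An empty result for the deleted workflow would match the expectation stated above; any remaining folder, such as parts/, points at artifacts the GC pod never touched.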

Version(s)

v3.6.2, latest (74ed09303da63b29fc08319236e2ae412269dd4bc0c919ca802bbf75e205e898)

Paste a minimal workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

https://github.com/argoproj/argo-workflows/blob/dd19e49/examples/map-reduce.yaml

Logs from the workflow controller

(Taken from a secondary run that reproduced the same issue as above)

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

time="2025-01-16T16:44:53.592Z" level=info msg="Processing workflow" Phase= ResourceVersion=3880259 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.611Z" level=info msg="adding artifact GC finalizer" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.611Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:44:53.611Z" level=info msg="Updated phase  -> Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.611Z" level=info msg="Creating pvc map-reduce-v844c-workdir" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.639Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=info msg="Retry node map-reduce-v844c initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=info msg="DAG node map-reduce-v844c-3636553275 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.640Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.640Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2225669197, taskName split"
time="2025-01-16T16:44:53.640Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2225669197, taskName split"
time="2025-01-16T16:44:53.640Z" level=info msg="All of node map-reduce-v844c(0).split dependencies [] completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=info msg="Retry node map-reduce-v844c-2225669197 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.640Z" level=info msg="Pod node map-reduce-v844c-1792940316 initialized Pending" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.680Z" level=info msg="Created pod: map-reduce-v844c(0).split(0) (map-reduce-v844c-split-1792940316)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.680Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.680Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.680Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.680Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.680Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.680Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.700Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3880268 workflow=map-reduce-v844c
time="2025-01-16T16:44:53.710Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880268 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.714Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:44:53.715Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.715Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.716Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.716Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.716Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.716Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.716Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.716Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.733Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3880271 workflow=map-reduce-v844c
time="2025-01-16T16:44:53.735Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880271 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.736Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:44:53.737Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.737Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.737Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.737Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.737Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:44:53.737Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:44:53.737Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:44:53.737Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:45:03.679Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880271 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:45:03.681Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:45:03.681Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:45:03.681Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:45:03.682Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:45:03.682Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:45:03.682Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:45:03.682Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:45:03.682Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:45:03.682Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:04.277Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880271 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:04.279Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:46:04.280Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:04.280Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:04.283Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:04.283Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:04.283Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:04.283Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:04.283Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:04.283Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:04.302Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3880732 workflow=map-reduce-v844c
time="2025-01-16T16:46:04.306Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880732 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:04.307Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:46:04.308Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:04.308Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:04.309Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:04.309Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:04.309Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:04.309Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:04.310Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:04.310Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:14.305Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880732 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:14.308Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=1 workflow=map-reduce-v844c
time="2025-01-16T16:46:14.309Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:14.309Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:14.310Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:14.310Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:14.310Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:14.310Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:14.310Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:14.310Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.609Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880732 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.611Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=1 workflow=map-reduce-v844c
time="2025-01-16T16:46:24.612Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.612Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:24.613Z" level=info msg="node map-reduce-v844c-2225669197 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.613Z" level=info msg="node map-reduce-v844c-2225669197 finished: 2025-01-16 16:46:24.61305996 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.613Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-2467414799, taskName map"
time="2025-01-16T16:46:24.613Z" level=info msg="TaskGroup node map-reduce-v844c-2467414799 initialized Running (message: )" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.613Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-4134248812, taskName map(0:0)"
time="2025-01-16T16:46:24.613Z" level=info msg="All of node map-reduce-v844c(0).map(0:0) dependencies [split] completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.613Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.614Z" level=info msg="Retry node map-reduce-v844c-4134248812 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.614Z" level=info msg="Pod node map-reduce-v844c-3621362279 initialized Pending" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.657Z" level=info msg="Created pod: map-reduce-v844c(0).map(0:0)(0) (map-reduce-v844c-map-3621362279)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.657Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-822721520, taskName map(1:1)"
time="2025-01-16T16:46:24.657Z" level=info msg="All of node map-reduce-v844c(0).map(1:1) dependencies [split] completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.657Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.660Z" level=info msg="Retry node map-reduce-v844c-822721520 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.661Z" level=info msg="Pod node map-reduce-v844c-968410771 initialized Pending" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.697Z" level=info msg="Created pod: map-reduce-v844c(0).map(1:1)(0) (map-reduce-v844c-map-968410771)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.697Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-3063595644, taskName map(2:2)"
time="2025-01-16T16:46:24.697Z" level=info msg="All of node map-reduce-v844c(0).map(2:2) dependencies [split] completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.697Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.701Z" level=info msg="Retry node map-reduce-v844c-3063595644 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.703Z" level=info msg="Pod node map-reduce-v844c-1832424119 initialized Pending" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.732Z" level=info msg="Created pod: map-reduce-v844c(0).map(2:2)(0) (map-reduce-v844c-map-1832424119)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.732Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1039925336, taskName map(3:3)"
time="2025-01-16T16:46:24.732Z" level=info msg="All of node map-reduce-v844c(0).map(3:3) dependencies [split] completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.732Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.732Z" level=info msg="Retry node map-reduce-v844c-1039925336 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.733Z" level=info msg="Pod node map-reduce-v844c-719975211 initialized Pending" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.765Z" level=info msg="Created pod: map-reduce-v844c(0).map(3:3)(0) (map-reduce-v844c-map-719975211)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.765Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.765Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.765Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.765Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.793Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3880870 workflow=map-reduce-v844c
time="2025-01-16T16:46:24.809Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880870 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.813Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=1 workflow=map-reduce-v844c
time="2025-01-16T16:46:24.814Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.817Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.817Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.817Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.817Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.840Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3880875 workflow=map-reduce-v844c
time="2025-01-16T16:46:24.845Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880875 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.849Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=1 workflow=map-reduce-v844c
time="2025-01-16T16:46:24.850Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.852Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.852Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:24.852Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:24.852Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:29.810Z" level=info msg="cleaning up pod" action=deletePod key=integrations/map-reduce-v844c-split-1792940316/deletePod
time="2025-01-16T16:46:34.658Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3880875 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.660Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=5 workflow=map-reduce-v844c
time="2025-01-16T16:46:34.661Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:34.663Z" level=info msg="node map-reduce-v844c-3063595644 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.663Z" level=info msg="node map-reduce-v844c-3063595644 finished: 2025-01-16 16:46:34.663215967 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.663Z" level=info msg="node map-reduce-v844c-1039925336 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.663Z" level=info msg="node map-reduce-v844c-1039925336 finished: 2025-01-16 16:46:34.663541307 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.663Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:34.663Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:34.663Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.663Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:34.666Z" level=info msg="cleaning up pod" action=terminateContainers key=integrations/map-reduce-v844c-map-968410771/terminateContainers
time="2025-01-16T16:46:34.666Z" level=info msg="cleaning up pod" action=terminateContainers key=integrations/map-reduce-v844c-map-3621362279/terminateContainers
time="2025-01-16T16:46:34.667Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/integrations/pods/map-reduce-v844c-map-3621362279/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2025-01-16T16:46:34.683Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3881007 workflow=map-reduce-v844c
time="2025-01-16T16:46:34.703Z" level=info msg="signaled container" container=wait error="unable to upgrade connection: container not found (\"wait\")" namespace=integrations pod=map-reduce-v844c-map-3621362279 stderr="<nil>" stdout="<nil>"
time="2025-01-16T16:46:39.687Z" level=info msg="cleaning up pod" action=deletePod key=integrations/map-reduce-v844c-map-1832424119/deletePod
time="2025-01-16T16:46:39.687Z" level=info msg="cleaning up pod" action=deletePod key=integrations/map-reduce-v844c-map-719975211/deletePod
time="2025-01-16T16:46:44.865Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3881007 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.869Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=5 workflow=map-reduce-v844c
time="2025-01-16T16:46:44.870Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:44.871Z" level=info msg="node map-reduce-v844c-4134248812 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.871Z" level=info msg="node map-reduce-v844c-4134248812 finished: 2025-01-16 16:46:44.871225777 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.871Z" level=info msg="node map-reduce-v844c-822721520 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.871Z" level=info msg="node map-reduce-v844c-822721520 finished: 2025-01-16 16:46:44.871479822 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.872Z" level=info msg="node map-reduce-v844c-2467414799 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.872Z" level=info msg="node map-reduce-v844c-2467414799 finished: 2025-01-16 16:46:44.872697315 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.872Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:44.873Z" level=warning msg="was unable to obtain the node for map-reduce-v844c-1300622241, taskName reduce"
time="2025-01-16T16:46:44.873Z" level=info msg="All of node map-reduce-v844c(0).reduce dependencies [map] completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.873Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.873Z" level=info msg="Retry node map-reduce-v844c-1300622241 initialized Running" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.874Z" level=info msg="Pod node map-reduce-v844c-3957023480 initialized Pending" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.906Z" level=info msg="Created pod: map-reduce-v844c(0).reduce(0) (map-reduce-v844c-reduce-3957023480)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.906Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.906Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.936Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3881080 workflow=map-reduce-v844c
time="2025-01-16T16:46:44.947Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3881080 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.951Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=5 workflow=map-reduce-v844c
time="2025-01-16T16:46:44.956Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.956Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.981Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3881086 workflow=map-reduce-v844c
time="2025-01-16T16:46:44.989Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3881086 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.992Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=5 workflow=map-reduce-v844c
time="2025-01-16T16:46:44.993Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:44.993Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:49.946Z" level=info msg="cleaning up pod" action=deletePod key=integrations/map-reduce-v844c-map-3621362279/deletePod
time="2025-01-16T16:46:49.946Z" level=info msg="cleaning up pod" action=deletePod key=integrations/map-reduce-v844c-map-968410771/deletePod
time="2025-01-16T16:46:54.910Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3881086 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:54.913Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=6 workflow=map-reduce-v844c
time="2025-01-16T16:46:54.914Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:54.914Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:46:54.918Z" level=info msg="cleaning up pod" action=terminateContainers key=integrations/map-reduce-v844c-reduce-3957023480/terminateContainers
time="2025-01-16T16:46:54.919Z" level=info msg="https://172.20.0.1:443/api/v1/namespaces/integrations/pods/map-reduce-v844c-reduce-3957023480/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2025-01-16T16:46:54.944Z" level=info msg="Workflow update successful" namespace=integrations phase=Running resourceVersion=3881163 workflow=map-reduce-v844c
time="2025-01-16T16:46:54.954Z" level=info msg="signaled container" container=wait error="unable to upgrade connection: container not found (\"wait\")" namespace=integrations pod=map-reduce-v844c-reduce-3957023480 stderr="<nil>" stdout="<nil>"
time="2025-01-16T16:47:04.667Z" level=info msg="cleaning up pod" action=killContainers key=integrations/map-reduce-v844c-map-968410771/killContainers
time="2025-01-16T16:47:04.704Z" level=info msg="cleaning up pod" action=killContainers key=integrations/map-reduce-v844c-map-3621362279/killContainers
time="2025-01-16T16:47:04.940Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=3881163 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.942Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=6 workflow=map-reduce-v844c
time="2025-01-16T16:47:04.945Z" level=info msg="node map-reduce-v844c-1300622241 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.947Z" level=info msg="node map-reduce-v844c-1300622241 finished: 2025-01-16 16:47:04.94751492 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.947Z" level=info msg="Outbound nodes of map-reduce-v844c-3636553275 set to [map-reduce-v844c-3957023480]" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.947Z" level=info msg="node map-reduce-v844c-3636553275 phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.947Z" level=info msg="node map-reduce-v844c-3636553275 finished: 2025-01-16 16:47:04.947829288 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg="node map-reduce-v844c phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg="node map-reduce-v844c finished: 2025-01-16 16:47:04.948053229 +0000 UTC" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg="TaskSet Reconciliation" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg=reconcileAgentPod namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg="Updated phase Running -> Succeeded" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg="Marking workflow completed" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.948Z" level=info msg="Deleting PVC map-reduce-v844c-workdir" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.958Z" level=info msg="Removing PVC \"kubernetes.io/pvc-protection\" finalizer" claimName=map-reduce-v844c-workdir namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:04.982Z" level=info msg="Deleted 1/1 PVCs" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:05.016Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3881221 workflow=map-reduce-v844c
time="2025-01-16T16:47:05.016Z" level=info msg="Queueing Succeeded workflow integrations/map-reduce-v844c for delete in 1439h59m59s due to TTL"
time="2025-01-16T16:47:05.096Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881221 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:05.102Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:47:05.128Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3881233 workflow=map-reduce-v844c
time="2025-01-16T16:47:05.135Z" level=info msg="Queueing Succeeded workflow integrations/map-reduce-v844c for delete in 1439h59m59s due to TTL"
time="2025-01-16T16:47:10.095Z" level=info msg="cleaning up pod" action=deletePod key=integrations/map-reduce-v844c-reduce-3957023480/deletePod
time="2025-01-16T16:47:10.097Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881233 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:10.100Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:47:20.116Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881233 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:20.119Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:47:24.955Z" level=info msg="cleaning up pod" action=killContainers key=integrations/map-reduce-v844c-reduce-3957023480/killContainers
time="2025-01-16T16:47:55.489Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881487 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:55.493Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:47:55.494Z" level=info msg="Creating Artifact GC Task map-reduce-v844c-artgc-wfdel-2166136261-0" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:55.512Z" level=info msg="creating pod to delete artifacts: map-reduce-v844c-artgc-wfdel-2166136261" namespace=integrations strategy=OnWorkflowDeletion workflow=map-reduce-v844c
time="2025-01-16T16:47:55.598Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3881491 workflow=map-reduce-v844c
time="2025-01-16T16:47:55.608Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881491 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:47:55.610Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:48:00.609Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881491 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:48:00.612Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:48:17.004Z" level=info msg="Processing workflow" Phase=Succeeded ResourceVersion=3881491 namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:48:17.008Z" level=info msg="Task-result reconciliation" namespace=integrations numObjs=0 workflow=map-reduce-v844c
time="2025-01-16T16:48:17.008Z" level=info msg="reconciling artifact-gc pod" message= namespace=integrations phase=Succeeded pod=map-reduce-v844c-artgc-wfdel-2166136261 workflow=map-reduce-v844c
time="2025-01-16T16:48:17.008Z" level=info msg="processing completed Artifact GC Pod \"map-reduce-v844c-artgc-wfdel-2166136261\"" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:48:17.029Z" level=info msg="no remaining artifacts to GC, removing artifact GC finalizer (forceFinalizerRemoval=false)" namespace=integrations workflow=map-reduce-v844c
time="2025-01-16T16:48:17.058Z" level=info msg="Workflow update successful" namespace=integrations phase=Succeeded resourceVersion=3881491 workflow=map-reduce-v844c

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflow.argoproj.io/phase!=Succeeded

This is not possible as pods are gone when the workflow is deleted.
@shuangkun shuangkun added area/artifacts S3/GCP/OSS/Git/HDFS etc area/gc Garbage collection, such as TTLs, retentionPolicy, delays, and more labels Jan 17, 2025
@shuangkun
Member

Looks like these artifacts should have been garbage collected.

@kaancfidan
Contributor Author

kaancfidan commented Jan 17, 2025

I think I understand what the issue is.

At https://github.com/argoproj/argo-workflows/blob/19b2322/workflow/artifacts/s3/s3.go#L188 the S3 driver checks if the path has a trailing slash to decide if it's a folder or a single file.

In this case it is a folder, but the example does not put a trailing slash at the end of the key. I think a better implementation would first check whether the artifact can be listed with ListDirectory and, if not, treat it as a single file.

If the current implementation is kept, I think the documentation should show the proper way to declare folder artifacts, with a trailing slash, to avoid confusion.

Edit: I just saw the comment "// check suffix instead of s3cli.IsDirectory as it requires another request for file delete (most scenarios)" right above the line I pointed at. If this is indeed by design, the example YAMLs should reflect it.
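To illustrate why the missing slash matters, here is a minimal sketch of a suffix-only check. It is not the actual Argo Workflows code: `isDirectoryKey`, `keysToDelete`, and the bucket keys are hypothetical, and the bucket is modelled as a plain slice of keys.

```go
package main

import (
	"fmt"
	"strings"
)

// isDirectoryKey mirrors the suffix-only heuristic described above:
// a key counts as a folder only when it ends with "/".
// Hypothetical helper, not the real driver function.
func isDirectoryKey(key string) bool {
	return strings.HasSuffix(key, "/")
}

// keysToDelete returns the object keys a suffix-based GC would remove,
// given every key currently stored in the (simulated) bucket.
func keysToDelete(artifactKey string, bucket []string) []string {
	var out []string
	for _, k := range bucket {
		if isDirectoryKey(artifactKey) {
			// Folder: everything under the prefix is removed.
			if strings.HasPrefix(k, artifactKey) {
				out = append(out, k)
			}
		} else if k == artifactKey {
			// Single object: only an exact match is removed.
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// Hypothetical keys resembling the layout of the parts folder.
	bucket := []string{"wf/parts/0.json", "wf/parts/1.json", "wf/results/0.json"}
	fmt.Println(keysToDelete("wf/parts", bucket))  // prints []
	fmt.Println(keysToDelete("wf/parts/", bucket)) // prints [wf/parts/0.json wf/parts/1.json]
}
```

Without the trailing slash nothing matches, so the GC step can report success while the objects under the folder stay in the bucket.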

@kaancfidan
Contributor Author

I have just confirmed that when a trailing slash is added (i.e. key: "{{workflow.name}}/parts/"), the workflow treats the artifact as a folder and actually visualizes it differently:

Image

and when the artifacts are garbage collected, the parts folder is also cleaned in this case.

I'll create a PR to fix the example YAML.

@shuangkun
Member

Good discovery

@kaancfidan
Contributor Author

I still think that missing a trailing slash in folder artifacts resulting in indefinitely ever-growing S3 buckets is a foot-gun. I think it might be worth an additional S3 API call to check if it's a folder or not.
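A rough sketch of what that extra call could look like, assuming a hypothetical `objectLister` interface with a `ListPrefix` method (neither is part of the real S3 client or of Argo Workflows), with an in-memory `fakeBucket` standing in for S3:

```go
package main

import (
	"fmt"
	"strings"
)

// objectLister is a hypothetical stand-in for an object-store client.
type objectLister interface {
	ListPrefix(prefix string) []string
}

// fakeBucket is an in-memory bucket used only for this illustration.
type fakeBucket []string

// ListPrefix returns every stored key that starts with prefix.
func (b fakeBucket) ListPrefix(prefix string) []string {
	var out []string
	for _, k := range b {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}

// resolveKeys spends one list call on key + "/". If objects exist under
// that prefix the key is treated as a folder, with or without a trailing
// slash; otherwise it is treated as a single object.
func resolveKeys(c objectLister, key string) []string {
	prefix := strings.TrimSuffix(key, "/") + "/"
	if children := c.ListPrefix(prefix); len(children) > 0 {
		return children
	}
	return []string{key}
}

func main() {
	b := fakeBucket{"wf/parts/0.json", "wf/parts/1.json", "wf/main.log"}
	fmt.Println(resolveKeys(b, "wf/parts"))    // prints [wf/parts/0.json wf/parts/1.json]
	fmt.Println(resolveKeys(b, "wf/parts/"))   // prints [wf/parts/0.json wf/parts/1.json]
	fmt.Println(resolveKeys(b, "wf/main.log")) // prints [wf/main.log]
}
```

Appending "/" before listing keeps a key such as wf/parts from also matching a sibling like wf/parts-old/. The cost is one additional list request per artifact, which is the trade-off the existing code comment mentions.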

@shuangkun
Member

> I still think that missing a trailing slash in folder artifacts resulting in indefinitely ever-growing S3 buckets is a foot-gun. I think it might be worth an additional S3 API call to check if it's a folder or not.

Agreed. Although checking only the suffix saves a request, it is not a reliable way to decide whether a key is a folder.

@kaancfidan
Contributor Author

The weird thing is that the Azure driver does check whether the artifact is a folder.

The GCS driver does not explicitly handle folders at all. Maybe the client itself handles it; I'm not familiar with it.

I think the behavior should be consistent across all implementations. Should we create a separate issue for it? I might take a swing at it and fix their implementations.
