Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flaky test]solution for aliases.js #920

Merged

Conversation

kohinoor98
Copy link
Contributor

@kohinoor98 kohinoor98 commented Nov 7, 2023

Description

Solution:
Intercepting the apiCaller POST call and waiting for the call to complete before moving on to check if the "Rows per page" is present.

The reason for it being flaky is that the apiCaller sometimes takes exteremely long time to return data as the number of elements are high and if the network tab is open in cypress while the test is being conducted the API calls take a very long time.

If the flaky test reoccurs then:

  1. Change the number of POST calls in the before in aliases.js
  2. Increase the timeout in the line below:
cy.wait("@apiCaller", { timeout: 120000 });

Issues Resolved

#910

Check List

  • Commits are signed per the DCO using --signoff
  • All test cases are pasing for aliases.js

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Nov 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (1c8f0c6) 63.37% compared to head (3315434) 63.37%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #920   +/-   ##
=======================================
  Coverage   63.37%   63.37%           
=======================================
  Files         341      341           
  Lines       11554    11554           
  Branches     2243     2243           
=======================================
  Hits         7322     7322           
  Misses       3658     3658           
  Partials      574      574           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kohinoor98 kohinoor98 mentioned this pull request Nov 8, 2023
2 tasks
@SuZhou-Joe
Copy link
Member

SuZhou-Joe commented Nov 8, 2023

Nice try to fix the flaky test.

image

Actually the root cause that makes alias.js flaky is that OSD may take a long time to start(bundling specific js from source) and alias.js is the first test case. I'd recommend:

  • Add some extra wait time before the cy.visit(${BASE_PATH}/app/${IM_PLUGIN_NAME}#/aliases); to wait for OSD starting completely.
  • Please fix the DCO check.

@kohinoor98
Copy link
Contributor Author

Hi @SuZhou-Joe

Thanks lot for the reply. I will make the recommended change. I will also go ahead and run the test cases multiple times to ensure no other code is affected due to this change.

KC

@kohinoor98
Copy link
Contributor Author

Hi @SuZhou-Joe

I have added a wait for the test case.

Another solution would be to ping to a specific endpoint of OSD using long polling and try to see if OSD is ready. If it does not load in due time then the only option is throw an error.
Is this a viable option? If yes, which end point would be most reliable to check if the OSD is ready?

Thanks,
KC

@kohinoor98 kohinoor98 force-pushed the #910_aliases.js_flaky_test branch from 04e29e4 to f562f3e Compare November 12, 2023 23:19
@kohinoor98 kohinoor98 force-pushed the #910_aliases.js_flaky_test branch from 477c0dd to aa3e0cc Compare November 12, 2023 23:22
@SuZhou-Joe
Copy link
Member

Hi @SuZhou-Joe

I have added a wait for the test case.

Another solution would be to ping to a specific endpoint of OSD using long polling and try to see if OSD is ready. If it does not load in due time then the only option is throw an error. Is this a viable option? If yes, which end point would be most reliable to check if the OSD is ready?

Thanks, KC

Hi KC, I am afraid that for OSD there is not an endpoint that can check the status of bundling process, the only thing we can do should be more wait time for that.

@bowenlan-amzn
Copy link
Member

@kohinoor98 I see the test is still failing here https://github.com/opensearch-project/index-management-dashboards-plugin/actions/runs/6843755781/job/18606696212?pr=920
I triggered a re-run

As for the fix, as @SuZhou-Joe mentioned, the problem is dashboard server is not ready yet, would you look at this part to tackle this problem
https://github.com/opensearch-project/index-management-dashboards-plugin/blame/1c8f0c660a6a7ef4960862e67ab8140a8d156691/.github/workflows/cypress-workflow.yml#L73-L75

@kohinoor98
Copy link
Contributor Author

kohinoor98 commented Nov 15, 2023

@kohinoor98 I see the test is still failing here https://github.com/opensearch-project/index-management-dashboards-plugin/actions/runs/6843755781/job/18606696212?pr=920 I triggered a re-run

As for the fix, as @SuZhou-Joe mentioned, the problem is dashboard server is not ready yet, would you look at this part to tackle this problem https://github.com/opensearch-project/index-management-dashboards-plugin/blame/1c8f0c660a6a7ef4960862e67ab8140a8d156691/.github/workflows/cypress-workflow.yml#L73-L75

Hi @bowenlan-amzn

After repeatedly running this test(more than five times), I have increased the wait for the ISM API call and OSD bootstrap. If aliases.js fails to pass in the future based on this, my approach would be to keep doubling the wait time and maintaining an upper boundary of 420 seconds.

Another solution could be upgrading cypress.

Thanks,
KC

@kohinoor98 kohinoor98 force-pushed the #910_aliases.js_flaky_test branch from d5eb979 to aa3e0cc Compare November 15, 2023 10:45
Comment on lines 39 to 41
// Wait for 120 seconds for OSD to start.
// eslint-disable-next-line cypress/no-unnecessary-waiting
cy.wait(120000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't want this to slow down the testing, I am looking for the way to make sure OSD starts, 2 things worth try

https://github.com/opensearch-project/OpenSearch-Dashboards/blob/f6ce6e707035f791ed98a9edeab785d0785a8dbd/.github/workflows/cypress_workflow.yml#L149

# timeout 300 bash -c 'while [[ "$(curl -s localhost:5601/api/status | jq -r '.status.overall.state')" != "green" ]]; do sleep 5; done'

For now you can just remove this wait, and we can try these 2 things in another PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bowenlan-amzn Done! Could you please create the appropriate issue for this and assign it to me? I will get started on it in a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create one #938 but still need your comment to assign, thanks!

@bowenlan-amzn bowenlan-amzn merged commit ab2c497 into opensearch-project:main Nov 17, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants