Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Development: Fix flaky e2e tests #9326

Merged
merged 15 commits into from
Oct 3, 2024
Merged

Conversation

SimonEntholzer
Copy link
Contributor

@SimonEntholzer SimonEntholzer commented Sep 17, 2024

Checklist

General

Motivation and Context

A few e2e server tests are still flaky. Here is an overview: https://confluence.ase.in.tum.de/display/ArTEMiS/E2E+Playwright+Tests

Description

This PR tries to find the core causes of the flakiness, and fix it. For those, where the cause could not yet identified by 100 % I added additional steps, that help finding the issue if the flakiness continues.

Steps for Testing

Prerequisites:
Start client and server locally, and run all e2e tests with
npm run playwright:open

Check the e2e runs on bamboo below.

Testserver States

Note

These badges show the state of the test servers.
Green = Currently available, Red = Currently locked
Click on the badges to get to the test servers.







Review Progress

Performance Review

Code Review

  • Code Review 1
  • Code Review 2

Manual Tests

  • Test 1
  • Test 2

Performance Tests

  • Test 1
  • Test 2

Test Coverage

Screenshots

Summary by CodeRabbit

  • New Features

    • Enhanced test suite for student exams with the addition of a new student.
    • New methods to facilitate navigation to exercise submissions and verify student submissions.
    • Improved tutor assessment flow by ensuring proper login and submission visibility.
    • Introduced a new exam participation step with the almostStartExam method.
  • Bug Fixes

    • Simplified login process in programming exercise management tests.
  • Tests

    • Added a new test suite to intentionally fail for testing purposes.

@SimonEntholzer SimonEntholzer changed the title Chore/bugfix/fix flaky e2e tests Development: Fix flaky e2e tests Sep 29, 2024
@SimonEntholzer SimonEntholzer marked this pull request as ready for review September 29, 2024 09:09
@SimonEntholzer SimonEntholzer requested a review from a team as a code owner September 29, 2024 09:09
Copy link

coderabbitai bot commented Sep 29, 2024

Walkthrough

The changes in this pull request enhance the functionality of the test suite for student exams and related components. New methods have been introduced to facilitate the exam participation process and manage submissions in various exercise contexts. Additionally, existing test cases have been updated to accommodate new student scenarios and streamline login processes.

Changes

Files Change Summary
src/test/playwright/e2e/exam/test-exam/TestExamStudentExams.spec.ts Added studentThree to the test, updated expected student exam row count from 2 to 3, and included checks for studentThree's exam status. Introduced almostStartExam() method to the exam participation process.
src/test/playwright/e2e/exercise/modeling/ModelingExerciseAssessment.spec.ts Enhanced the "Tutor can assess a submission" test by adding login for the tutor and checks for student submissions.
src/test/playwright/e2e/exercise/programming/ProgrammingExerciseManagement.spec.ts Simplified the login function call by removing the URL path argument, streamlining the login process.
src/test/playwright/support/pageobjects/course/CourseManagementPage.ts Added openSubmissionsForExerciseAndCourse(courseID: number, exerciseID: number) and checkIfStudentSubmissionExists(studentName: string) methods to manage exercise submissions and verify student submissions.
src/test/playwright/support/pageobjects/exam/ExamParticipationPage.ts Introduced almostStartExam() method to facilitate a new step in the exam starting process.
src/test/playwright/support/pageobjects/exam/ExamStartEndPage.ts Added onlyClickConfirmationCheckmark() method to handle confirmation in the exam process.
src/test/playwright/e2e/exam/ExamAssessment.spec.ts Added a new test suite "This test should fail" with a test that asserts true is false, ensuring it will fail when executed.

Possibly related PRs

Suggested labels

client, bugfix, priority:high, small

Suggested reviewers

  • krusche
  • BBesrour
  • JohannesStoehr
  • pzdr7
  • az108
  • coolchock
  • undernagruzez

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Outside diff range comments (2)
src/test/playwright/e2e/exercise/modeling/ModelingExerciseAssessment.spec.ts (1)

Line range hint 30-36: Consider revisiting the commented-out test.beforeEach block

There's a commented-out test.beforeEach block that updates the assessment due date. Given that this PR aims to fix flaky tests and add steps to diagnose ongoing issues, it might be worth revisiting this block.

Consider whether this test.beforeEach block could be useful in addressing flakiness or diagnosing issues. If it's no longer needed, it might be better to remove it entirely. If it's potentially useful, consider uncommenting and adapting it to fit the current test structure.

test.beforeEach('Ending assessment period', async ({ browser }) => {
    const context = await browser.newContext();
    const page = await context.newPage();
    const exerciseAPIRequests = new ExerciseAPIRequests(page);
    const response = await exerciseAPIRequests.updateModelingExerciseAssessmentDueDate(modelingExercise, dayjs());
    modelingExercise = await response.json();
});

If you decide to keep it commented out, please add a comment explaining why it's being kept for future reference.

🧰 Tools
🪛 Biome

[error] 48-48: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


[error] 48-48: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


[error] 51-51: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)

src/test/playwright/e2e/exam/test-exam/TestExamStudentExams.spec.ts (1)

Line range hint 1-141: Consider adding additional test cases for improved stability.

The current test suite covers various scenarios for student exams. To further improve stability and align with the PR objective of fixing flaky tests, consider adding the following:

  1. Error handling: Add tests that simulate and handle potential error scenarios, such as network issues or unexpected server responses.

  2. Retry mechanism: Implement a retry mechanism for potentially unstable operations, like starting or submitting an exam.

  3. Timing-sensitive operations: Add tests that specifically target any timing-sensitive operations in the exam process, as these are often sources of flakiness.

  4. Edge cases: Consider adding tests for edge cases, such as:

    • A student losing connection mid-exam
    • Multiple students trying to start an exam simultaneously
    • A student attempting to submit an exam after the time limit
  5. Performance tests: If applicable, add tests to ensure the system performs well under load, which could uncover potential sources of flakiness.

These additions could help identify and prevent potential sources of test flakiness, further supporting the PR's objectives.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Files that changed from the base of the PR and between 92209a3 and ad7a54a.

📒 Files selected for processing (6)
  • src/test/playwright/e2e/exam/test-exam/TestExamStudentExams.spec.ts (2 hunks)
  • src/test/playwright/e2e/exercise/modeling/ModelingExerciseAssessment.spec.ts (1 hunks)
  • src/test/playwright/e2e/exercise/programming/ProgrammingExerciseManagement.spec.ts (1 hunks)
  • src/test/playwright/support/pageobjects/course/CourseManagementPage.ts (1 hunks)
  • src/test/playwright/support/pageobjects/exam/ExamParticipationPage.ts (1 hunks)
  • src/test/playwright/support/pageobjects/exam/ExamStartEndPage.ts (1 hunks)
🧰 Additional context used
🪛 Biome
src/test/playwright/e2e/exercise/modeling/ModelingExerciseAssessment.spec.ts

[error] 48-48: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


[error] 48-48: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)

🔇 Additional comments (12)
src/test/playwright/support/pageobjects/exam/ExamStartEndPage.ts (2)

51-53: LGTM. Can you provide more context on how this helps with flaky tests?

The new onlyClickConfirmationCheckmark method is well-implemented and follows the existing coding patterns. It provides a more granular way to interact with the exam page, which could be beneficial for more precise test control.

Could you elaborate on how this new method specifically addresses the flaky e2e test issues mentioned in the PR objectives? Understanding the context would help ensure this change effectively contributes to resolving the flakiness.


51-53: 🛠️ Refactor suggestion

Consider refactoring and verify usage in e2e tests

The new onlyClickConfirmationCheckmark method provides a more granular approach to interacting with the exam page. Consider the following suggestions:

  1. You might want to refactor the startExam and finishExam methods to use this new method for better consistency and to potentially simplify those methods.

  2. It would be beneficial to verify that this new method is being utilized in the updated e2e tests to address the flakiness issues mentioned in the PR objectives.

To verify the usage of the new method in e2e tests, you can run the following script:

This will help ensure that the new method is being utilized as intended to address the flaky test issues.

✅ Verification successful

[/run_scripts]


Verification Successful: onlyClickConfirmationCheckmark is in Use

The onlyClickConfirmationCheckmark method is successfully utilized in the ExamParticipationPage.ts e2e test, ensuring that the new approach is integrated to address the flakiness issues.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for usage of onlyClickConfirmationCheckmark in e2e tests

# Test: Search for the method usage in e2e test files
rg --type typescript 'onlyClickConfirmationCheckmark' src/test/playwright

Length of output: 108


Script:

#!/bin/bash
# Description: Search for usage of onlyClickConfirmationCheckmark in e2e test TypeScript files

# Test: Search for the method usage in .ts and .tsx files within src/test/playwright
rg 'onlyClickConfirmationCheckmark' src/test/playwright --glob '*.ts' --glob '*.tsx'

Length of output: 330

src/test/playwright/e2e/exercise/modeling/ModelingExerciseAssessment.spec.ts (1)

Line range hint 1-97: Overall assessment of changes

The modifications made to this file align well with the PR objectives of fixing flaky e2e tests. The additional steps in the "Tutor can assess a submission" test case enhance its robustness and should help reduce flakiness.

The changes look good overall, with only minor suggestions for improvement:

  1. Remove the duplicate login action for the tutor.
  2. Revisit the commented-out test.beforeEach block to determine if it's still relevant.

These adjustments should further improve the test reliability and maintainability.

🧰 Tools
🪛 Biome

[error] 48-48: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


[error] 48-48: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


[error] 51-51: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)

src/test/playwright/support/pageobjects/exam/ExamParticipationPage.ts (2)

118-120: Summary: Changes align well with PR objectives

The addition of the almostStartExam method is a focused change that aligns well with the PR objectives of fixing flaky e2e tests. It provides more granular control over the exam starting process, which can be beneficial for creating more robust test scenarios.

The implementation is clean, consistent with the existing code structure, and doesn't introduce any breaking changes. This change has the potential to improve the reliability of the e2e tests by allowing for more precise control and potentially isolating issues in the exam starting process.

To fully validate the impact of this change:

  1. Ensure that the new method is being used effectively in the updated e2e tests.
  2. Verify that the flaky tests identified in the Confluence document (mentioned in the PR description) are indeed stabilized with this change.
  3. Consider running a full suite of e2e tests to confirm that this change doesn't introduce any unintended side effects.

Great job on this focused and potentially impactful change!


118-120: Verify usage in e2e tests

The new almostStartExam method is a good addition for more granular control in the exam starting process. To ensure it's effectively addressing the flaky e2e tests mentioned in the PR objectives, it would be beneficial to verify its usage in the updated test scenarios.

Run the following script to check for usage of the new method in e2e tests:

This will help confirm that the new method is being utilized as intended in the e2e tests, contributing to the resolution of the flaky test issues.

✅ Verification successful

Re-run verification with corrected search parameters

The previous script encountered an error due to an unrecognized file type. Please execute the following script to verify the usage of the almostStartExam method in the e2e tests:


almostStartExam method usage verified in e2e tests

The almostStartExam method is correctly utilized within the TestExamStudentExams.spec.ts e2e test, ensuring it effectively contributes to addressing the flaky test issues outlined in the PR objectives.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for usage of almostStartExam in e2e tests

# Test: Search for almostStartExam usage in e2e test files
rg --type typescript 'almostStartExam' src/test/playwright

Length of output: 93


Script:

#!/bin/bash
# Description: Check for usage of almostStartExam in e2e test files

# Test: Search for almostStartExam usage in TypeScript e2e test files
rg 'almostStartExam' src/test/playwright -g '*.ts' -g '*.tsx'

Length of output: 284

src/test/playwright/e2e/exam/test-exam/TestExamStudentExams.spec.ts (4)

72-74: LGTM: Addition of studentThree and updated row count.

The changes correctly reflect the addition of a third student to the test scenario. This improves the test coverage and aligns with the PR objective of fixing flaky tests by ensuring all students are accounted for.


78-78: LGTM: Verification of studentThree's initial 'Started' state.

This assertion correctly checks that studentThree has not started the exam, which is an important verification of the initial state. This addition contributes to the robustness of the test suite and aligns with the PR objective of fixing flaky tests.


82-82: LGTM: Verification of studentThree's initial 'Submitted' state.

This assertion correctly checks that studentThree has not submitted the exam, which is another important verification of the initial state. This addition further enhances the test suite's ability to catch potential issues and aligns with the PR objective of fixing flaky tests.


121-121: Please clarify the purpose and implementation of almostStartExam().

The addition of examParticipation.almostStartExam() introduces a new step in the exam process. Could you please provide more information on:

  1. The purpose of this new method?
  2. How it differs from the regular startExam() method?
  3. How this change contributes to fixing flaky tests?

Additionally, it would be helpful to see the implementation of almostStartExam() to ensure it's correctly handling any potential race conditions or timing issues that might have caused flakiness in the past.

To help with the review, could you please run the following command to show the implementation of almostStartExam()?

✅ Verification successful

Verification Successful: The implementation of almostStartExam() has been located and appears to appropriately introduce an intermediate step to enhance the exam start process, likely mitigating flaky tests.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for the implementation of almostStartExam()
rg --type typescript -A 10 'almostStartExam\s*\(' src/test/playwright/support/pageobjects/exam/

Length of output: 132


Script:

#!/bin/bash
# Search for the implementation of almostStartExam() in TypeScript files
rg 'almostStartExam\s*\(' --glob '*.ts' -A 10 src/test/playwright/support/pageobjects/exam/

Length of output: 1087

src/test/playwright/e2e/exercise/programming/ProgrammingExerciseManagement.spec.ts (2)

Line range hint 1-150: Overall assessment of the file changes

The modification to the login function call is the only change in this file. It appears to be a targeted adjustment aimed at reducing potential flakiness in the e2e tests, which aligns with the PR's objectives.

The rest of the file remains unchanged and appears to be well-structured. The test cases cover various aspects of programming exercise management, including creation, deletion, and team management. This comprehensive coverage contributes to the overall robustness of the e2e test suite.

🧰 Tools
🪛 Biome

[error] 111-111: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


[error] 112-112: Forbidden non-null assertion.

(lint/style/noNonNullAssertion)


109-109: Verify login behavior after removing the initial navigation path.

The removal of the '/' argument from the login function call may change the initial page the user lands on after login. This simplification could potentially reduce flakiness in the test, aligning with the PR's objective.

Please confirm that:

  1. The login function still navigates to the correct initial page without the explicit path argument.
  2. The subsequent navigation steps in the test are not affected by this change.

To assist in verification, you can run the following script:

Monitor this test case for improved stability. If flakiness persists, consider adding more detailed logging or additional waits after the login step.

src/test/playwright/support/pageobjects/course/CourseManagementPage.ts (1)

137-146: Summary: New methods align with PR objectives

The additions to the CourseManagementPage class, namely openSubmissionsForExerciseAndCourse and checkIfStudentSubmissionExists, are well-aligned with the PR's objective of fixing flaky e2e tests. These methods provide more granular control over navigation and verification in the tests, which can contribute to improved reliability and reduced flakiness.

To further enhance the robustness of these methods and align with best practices for e2e testing:

  1. Implement proper error handling and logging.
  2. Use dynamic URL construction to improve maintainability.
  3. Add timeouts and return values to provide more control and information to the test scripts.

These enhancements will contribute to the overall goal of improving test stability and diagnosability, as mentioned in the PR objectives.

coderabbitai[bot]
coderabbitai bot previously approved these changes Sep 29, 2024
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: ASSERTIVE

📥 Commits

Files that changed from the base of the PR and between ad7a54a and f8644a0.

📒 Files selected for processing (1)
  • src/test/playwright/e2e/exam/ExamAssessment.spec.ts (1 hunks)

src/test/playwright/e2e/exam/ExamAssessment.spec.ts Outdated Show resolved Hide resolved
coderabbitai[bot]
coderabbitai bot previously approved these changes Sep 29, 2024
Copy link
Contributor

@sarpsahinalp sarpsahinalp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code lgtm

Copy link
Contributor

@cremertim cremertim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

Copy link
Contributor

@coolchock coolchock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tested locally, tests passed

Copy link
Contributor

@JohannesStoehr JohannesStoehr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good to me

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

6 participants