changelog and more info pages update (#243)
* changelog and more info pages update

* Update changelog.mdx

* Update changelog.mdx

* Update changelog.mdx

* new pic

* Update more-info/octo-dictionary.mdx

* Update more-info/under-the-hood.mdx
ma-zah authored Aug 14, 2024
1 parent 82b7814 commit 95d6fc3
Showing 10 changed files with 214 additions and 48 deletions.
138 changes: 138 additions & 0 deletions changelog.mdx
@@ -4,6 +4,144 @@ description: "All the nice stuff you asked for"
icon: "sparkles"
---

## 2024-08-09

- `UI tweak` for first time users: Test result details of a user's first test run are visible directly in the project overview.

<Frame caption="First time users project overview with test result detail, 08/2024">
<img
src="images/changelog/first-report.png"
alt="First time users project overview with test result detail"
/>
</Frame>

## 2024-08-08

- `Save & run`: New button to **run and save** an edited test case at once. You can click the `run only` button if you want to validate the test before saving it.

<Frame caption="'save & run' the test to validate and save changes, 08/2024">
<img
src="images/editing/save-run.png"
alt="Save and run the test case"
width="300px"
/>
</Frame>

<div class="mt-8" />

<Frame caption="'run only' to validate the test before saving it, 08/2024">
<img
src="images/editing/save-run-only.png"
alt="Run only to validate the test before saving it"
width="300px"
/>
</Frame>

## 2024-08-06

- `AI suggesting more tests`: Building on the AI discovery at set-up, we expanded the feature to “spawn” more test cases.

<Frame caption="Have the AI agent suggest more test cases, 08/2024">
<img
src="images/setup/setup-11-suggest-more.png"
alt="have AI suggest more test cases"
/>
</Frame>

- `Open status for new test cases`: In order to mark test cases coming out of **suggest more** as new (until you've opened them for the first time), we introduced a new **open status** column.

<Frame caption="Open status on new suggested tests, 08/2024">
<img
src="images/changelog/open-status-proposals.png"
alt="Open status on new suggested tests"
/>
</Frame>

## 2024-08-05

- Speed is important. We sped up loading times on test case overviews.
- `AI agent can scroll now`: Improved scrolling behavior of our AI agents when exploring elements, increasing their success rate.

## 2024-07-29

- `Draftless editing`: No more draft state of a test case. Just open the test case and save your changes.

## 2024-07-23

- `Guidance through card focus`: We highlight the sections of our app to gently guide you to your **first test report**.
- `AI agent progress indicator`: We reintroduced the progress bar and updated it. At any given moment, you'll have an idea of what the agent is doing.

<Frame caption="AI agent progress indicator, 07/2024">
<img
src="/images/changelog/agent-running.png"
alt="AI agent progress indicator"
/>
</Frame>

## 2024-07-18

- `Copy paste test steps`: Use Ctrl+C / Ctrl+V keystrokes (Cmd+C / Cmd+V for Mac users) to copy and paste test steps within the same test or from other test cases.

<Frame caption="copy pasting steps, 07/2024">
<img src="/images/changelog/copy-pasting-steps.gif" alt="test case filters" />
</Frame>

## 2024-07-16

- `Faster AI Agent`: Natural scrolling when extracting web elements and checking for visibility took up 20% of the whole test generation process. We made it faster.
- `Test case filters`: You can now filter your tests by test status.

<Frame caption="test case filters, 07/2024">
<img
src="/images/changelog/filters.png"
alt="test case filters"
width={400}
/>
</Frame>

## 2024-07-11

- `Automatic login and cookie banner tests` at sign-up. When you sign up for Octomind or create a new project (add a new URL):

1. AI agent goes through your site and checks for cookie banners.
2. If a cookie banner is present, it creates a cookie banner test.
3. AI agent checks if you have a login at your site.
4. If you do, it asks you for test credentials.
5. You give it your test credentials.
6. AI agent auto-generates a login test.
7. The cookie and the login tests will be populated into all new auto-generated tests as dependencies.

<Frame caption="AI agent asks for test credentials to auto-generate and run a login test, screenshot 07/2024">
<img
src="/images/setup/setup-5-login-credentials.png"
alt="AI agent asks for login test credentials, screenshot 07/2024"
/>
</Frame>

## 2024-07-04

- `Mobile + tablet use`: Test case editing works better on smaller screens now.

## 2024-07-01

- `Agent benchmark automation`: Every major change in the codebase impacts AI agent performance. We now get an automated ping telling us whether the agent does what it's supposed to do.

## 2024-06-27

- `Variable templates`: We fill more robust variable templates into the test when your test steps are auto-generated; a rough sketch follows after this list. [Learn more.](/variables)
- `Shadow DOM & iFrames`: It looks like we figured out how to handle them. Until you serve us the next 100 edge cases.
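
As a rough sketch of what templating buys you (the placeholder syntax below is hypothetical; see [the variables page](/variables) for the real one), a templated step references a variable that is resolved at run time instead of a hard-coded value:

```typescript
import { test } from "@playwright/test";

// Hypothetical illustration of a variable template; the real placeholder
// syntax is documented on the /variables page and may differ.
const variables = { email: "jane@example.com" }; // resolved at run time

test("sign-up with a templated email", async ({ page }) => {
  await page.goto("https://example.com/signup"); // placeholder URL
  // The stored step holds a template such as "{{email}}"; at execution
  // time it is filled with the resolved value:
  await page.getByLabel("Email").fill(variables.email);
});
```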

## 2024-06-24

- `New project overview`: Your project homepage has a new overview layout. We aimed for a less cluttered design while keeping the flexibility and all features at hand.

<Frame caption="Project overview page after sign-up, 06/2024">
<img
src="/images/setup/setup-4-overview.png"
alt="Opening the Octomind app on project overview for the first time"
/>
</Frame>

## 2024-06-20

- We merged and simplified the test case view. Bye bye, way too many tabs.
Binary file added images/changelog/agent-running.png
Binary file added images/changelog/copy-pasting-steps.gif
Binary file added images/changelog/filters.png
Binary file added images/changelog/first-report.png
Binary file added images/changelog/open-status-proposals.png
Binary file added images/moreinfo/playwright-diagram.png
27 changes: 16 additions & 11 deletions more-info/faq.mdx
@@ -7,45 +7,50 @@ icon: "question"
## 1. What user flows do you cover?

For the time being, we cover basic user flows which happen inside a browser window. We don't test canvas or multi-user applications yet.

We will add building blocks which allow for more demanding scenarios over time, like e-mail or mobile phone based flows, multi-user setups or the inclusion of external apps.

## 2. How are my tests generated?

We are using our AI agent for test case discovery. We create the first test case for a **sign-in user flow** at your set-up automatically and we'll ask you to give us more user flows that need to be end-to-end tested.
We'll discover the interaction chain of the test case in an intermediate representation. We'll generate a corresponding Playwright code on the fly and execute it against your pull request.
We are using our AI agents for test case discovery and test step generation. We'll discover the interaction chain of the test case in an intermediate representation. We'll generate corresponding Playwright code on the fly and execute it on a manual trigger, on a schedule or against your pull request.

We generate tests on [sign-up](/first-steps#4-open-the-octomind-app-for-the-first-time), when you launch [test discovery](/first-steps#5-discover-more-test-cases) and when you ask our AI agents to [suggest more](/new-test-case#have-ai-agent-suggest-and-auto-generate-more-tests) tests.

## 3. What code are you using for your tests?

We are using the [Playwright](https://playwright.dev/) framework to generate tests.
We are using the [Playwright](https://playwright.dev/) framework and generate tests as standard Playwright test code.
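
For illustration, a generated test is ordinary Playwright code along these lines (a hedged sketch; the URL and locators are placeholders, not an actual Octomind-generated test):

```typescript
import { test, expect } from "@playwright/test";

// Sketch of a generated sign-in test; URL and locators are placeholders.
test("sign-in with email & password", async ({ page }) => {
  await page.goto("https://example.com/login");
  await page.getByLabel("Email").fill("user@example.com");
  await page.getByLabel("Password").fill("secret");
  await page.getByRole("button", { name: "Sign in" }).click();
  await expect(page.getByRole("heading", { name: "Welcome" })).toBeVisible();
});
```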

## 4. How are you securing the stability of your tests?

Some of our strategies to fight flakiness are:
End-to-end tests are notoriously flaky. Some of our strategies to fight flakiness are:

- Smart, learning-based retries (see the sketch after this list)
- Active interaction timing (sleeps)
- AI-based analysis of unexpected circumstances
- Rediscovery in case of user flow changes.
- Rediscovery in case of user flow changes
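
As a baseline illustration of the retry idea, plain Playwright supports configurable retries. The sketch below shows only that basic mechanism; the values are assumptions, and our smart, learning-based retries are more involved:

```typescript
// playwright.config.ts: baseline retry mechanism only; values are illustrative.
import { defineConfig } from "@playwright/test";

export default defineConfig({
  retries: 2, // re-run a failed test up to two times before reporting it as failed
  use: {
    trace: "on-first-retry", // record a Playwright trace whenever a test is retried
  },
});
```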

## 5. How can I run your tests locally?

Our open source tool [Debugtopus](https://github.com/OctoMind-dev/debugtopus) can pull the latest test case from our repository and execute it against your local environment.
Our open source tool [Debugtopus](https://github.com/OctoMind-dev/debugtopus) can pull the latest test case from our repository and execute it against your local environment. [Learn how.](/debugtopus)

## 6. How does the auto-maintenance work?

We are following a playbook to find out if a test failure is caused by a behavioral change of your user flows, the test code itself or a bug in your code.
This feature is under active development and not publicly accessible yet. We will follow a playbook to find out if a test failure is caused by a behavioral change of your user flows, the test code itself or a bug in your code.

In case of a behavioral change, we pinpoint the failing interaction. We apply machine learning to identify the new desired interaction that achieves the original goal of the test case.
The interaction chain of this test case will be adjusted permanently to the new behavior as a result.

## 7. How do I write a good prompt?

See [our section about prompting](../new-test-case.mdx#free-prompting-best-practices)
See [our section about prompting a new test case](../new-test-case.mdx#free-prompting-best-practices).

## 8. I do not use GitHub, can I use your tests?

Yes. Apart from GitHub, we offer a native integration for Azure DevOps and API-based integrations for Vercel and Jenkins. For all other build pipelines you can script your own test trigger so that our test suite is triggered whenever you run a pull request.
All you need is an API key to identify your test suite and an externally accessible test target. Unfortunately, we won't be able to comment back into your pipeline.
Instead, you'll be able to receive the test results through our app.
Learn more about our [CI integration options](/integrations-overview).

Unfortunately, we won't be able to comment back into your pipeline. Instead, you'll be able to receive the test results through our app.
You can also run us [programmatically without using a CI](/execution-without-ci), schedule regular test runs or trigger test runs manually.

## 9. How can I get in touch with you?

@@ -54,7 +59,7 @@ or [write us an email](mailto:[email protected])

## 10. From which IP addresses are your tests run?

Pelase refere to the [Data Governance](/data-governance/no-code-access#one-ip-address) section.
Please see our [Data Governance](/data-governance/no-code-access#two-ip-addresses) page.

## 11. What is the User-agent string of your test agent?

55 changes: 33 additions & 22 deletions more-info/octo-dictionary.mdx
@@ -1,9 +1,18 @@
---
title: Octo Dictionary
description: "Essential terminology used within the app"
description: "Essential terminology used within the Octomind app"
icon: "book"
---

### AI discovery

AI discovery happens when our AI agents traverse the publicly accessible code in the DOM and use the vision capability of the underlying multimodal LLM to add visual context.
They are looking for meaningful user flows to be tested.

### AI generation

AI generation describes the process in which our AI agents generate sequential test steps in order to achieve a user flow goal.

### Debugtopus

Debugtopus is our open source tool enabling you to run tests locally for easier debugging.
@@ -12,54 +12,21 @@ Debugtopus is our open source tool enabling you to run tests locally for easier

A test case can be dependent on another test case. This way code can be reused and test case generation gets much easier. Many test cases can form a chain.

### Generate report
### Locator

Generate report lets you run all published test cases. Test reports can be generated in many ways from in-app as well as through CI commands or via command line.

### Manual test case creation

Manual test case creation is the process of creating a draft with no steps in it. Steps can be added manually after creation.
[Locators](https://playwright.dev/docs/api/class-locator) represent a way to find element(s) on the page at any moment in the Playwright test framework.
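
For illustration, declaring and using a locator in Playwright test code looks like this (an illustrative snippet, not code generated by Octomind):

```typescript
import { test, expect } from "@playwright/test";

test("locator example", async ({ page }) => {
  await page.goto("https://example.com"); // placeholder URL
  // A locator describes how to find the element; it is re-resolved on every action.
  const submitButton = page.getByRole("button", { name: "Submit" });
  await submitButton.click();
  await expect(submitButton).toBeDisabled();
});
```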

### Project

A [project](/projects) is a collection of test cases and test reports tied to a certain app (the test target).
A [project](/projects) is a collection of test cases and test reports tied to a certain app (the test target, URL).

### Prompt

The prompt describes the goal the AI agent needs to achieve when generating the steps for a specific user flow.

### Recording

Recording is another way to add a new test case to Octomind. It is best used via Playwright Codegen and allows to paste code directly.
[Recording](/new-test-case#record-a-test-case) is another way to add a new test case to Octomind. It is best used via **Playwright Codegen** and allows you to paste code directly.

### Run locally

You can [run one or all test cases on your local machine](/debugtopus) against any test target.

### Run test case
### Run a test case

You can run an edited test case by clicking the `save & run` or the `run only` button. This functionality is intended to validate the test case itself.

### Snapshots

You can run any draft or published test case by clicking the run button. This functionality is intended to validate the test case itself.
A snapshot is a screenshot of a particular state of the tested app. Snapshots are used in test steps to pick / change a locator and in test results to show you what happened during the test execution.

### Test case

A Test case is the Octomind base entity and it models a user flow.
A **test case** is the Octomind base entity and it models a user flow. It is the end-to-end test itself.

<Frame caption="Anatomy of a test case, 03/2024">
<img src="/images/TestCase.png" alt="anatomy of a test case" />
</Frame>

A test case can have multiple states:

1. **draft:** test case ends up in draft state after being created. While a test case is in draft state, you can add/remove steps to fulfil your testing goal
2. **published:** once you're happy with your test case's steps, you can publish it. Published test cases will be included into test report
3. **archived:** A test case is archived once you click the remove button on a published test case.
This excludes the test case from future test reports and we keep them as archived in case you change your mind.

Draft and published test cases can be run to check out if they are working.

### Test report

The [test report](/test-reports) consists of the test results of all published test cases.
Every test run will produce a [test report](/account-view/test-reports.mdx) where you will see all test results. They will tell you if everything runs as it's supposed to or if something is broken.

### Test step

@@ -80,3 +80,14 @@ to book a hotel on a hotel booking page.
<Frame caption="example of a user flow, 03/2024">
<img src="/images/user-flow.png" alt="example of a user flow" />
</Frame>

### Visual locator picker

The visual locator picker is the key tool for editing test steps. It lets you select a new locator visually from a snapshot of the app.

<Frame caption="Changing locators with the virtual locator picker, 7/2024">
<img
src="/images/editing/change-locator.gif"
alt="changing locators with virtual locator picker"
/>
</Frame>
42 changes: 27 additions & 15 deletions more-info/under-the-hood.mdx
@@ -4,31 +4,41 @@ description: "If you are interested in how we make the tests work, this is your
icon: "engine"
---

## Test case building
## AI agents discovering and generating test steps

A web app is typically composed of user flows. A user flow lets a user accomplish a certain goal. "Sign-in with email & password" is an example.
To perform a sign-in with email & password, a certain sequence of steps - `interactions` - is required.
We are recording and storing the interaction chain of a test case in an intermediate representation.
The corresponding [Playwright](https://playwright.dev/) code which is exercising the UI is generated on the fly just before test execution. The interaction chain can be examined in the test detail view and the Playwright trace viewer.

## Auto-maintenance
Our AI agents mimic human users (e.g., clicking input fields, signing up for a newsletter) to navigate apps, interpret app intent, and identify all relevant user flows. We are recording and storing the interaction chain of a test case in an intermediate representation.
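
The exact schema of that intermediate representation is internal. Purely as a hypothetical sketch, an interaction chain can be pictured as a typed list of steps:

```typescript
// Hypothetical sketch of an interaction-chain IR; the real Octomind schema
// is internal and may look different.
type Interaction =
  | { kind: "goto"; url: string }
  | { kind: "fill"; locator: string; value: string }
  | { kind: "click"; locator: string }
  | { kind: "assertVisible"; locator: string };

const signInFlow: Interaction[] = [
  { kind: "goto", url: "https://example.com/login" },
  { kind: "fill", locator: "getByLabel('Email')", value: "{{email}}" },
  { kind: "fill", locator: "getByLabel('Password')", value: "{{password}}" },
  { kind: "click", locator: "getByRole('button', { name: 'Sign in' })" },
  { kind: "assertVisible", locator: "getByText('Welcome')" },
];
```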

## Playwright generates test code

After we record and store each test case's interaction chain, we generate the corresponding [Playwright code](https://playwright.dev/) deterministically on the fly immediately prior to test execution.

The interaction chain can be examined in the test detail view and the Playwright trace viewer.
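
Going from such a chain to test code is then a deterministic mapping from IR steps to Playwright statements. The following simplified, hypothetical sketch reuses the `Interaction` type from the sketch above; Octomind's real code generator is internal:

```typescript
// Hypothetical generator mapping IR steps to Playwright statements.
function toPlaywright(steps: Interaction[]): string {
  return steps
    .map((step) => {
      switch (step.kind) {
        case "goto":
          return `await page.goto(${JSON.stringify(step.url)});`;
        case "fill":
          return `await page.${step.locator}.fill(${JSON.stringify(step.value)});`;
        case "click":
          return `await page.${step.locator}.click();`;
        case "assertVisible":
          return `await expect(page.${step.locator}).toBeVisible();`;
      }
    })
    .join("\n");
}
```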

<Frame caption="Use of AI agents and Playwright in Octomind end-to-end tests">
<img
src="/images/moreinfo/playwright-diagram.png"
alt="playwright & AI diagram in Octomind tests"
/>
</Frame>

## Work in progress: AI auto-maintenance

We will follow a playbook to find out if a test failure is caused by a behavioral change of your user flows, the test code itself or a bug in your code. In the case of a behavioral change, we pinpoint the failing interactions and deploy the AI agent to detect the new desired interaction that will allow us to achieve the test case's goal.

We follow a playbook to find out if a test failure is caused by a behavioral change of your user flows, the test code itself or a bug in your code. In the case of a behavioral change, we pinpoint the failing interaction.
We then apply machine learning to detect the new desired interaction to achieve the original goal of the test case. The interaction chain of the test case will be adjusted permanently to the new behavior as a result.
This feature is under active development and not publicly accessible yet.

## Issue pinpointing

When a test case rightfully fails, we help you quickly understand what went wrong. We are providing a set of tools to help you understand the issue.
When a test case fails, we help you quickly understand what went wrong. We provide a set of tools to pinpoint the issue.

- Screenshots at the time of test failure
- Execution log of the test cases
- Playwright traces via trace viewer
- `Debugtopus` tooling, which lets you run our tests localy, so you can set breakpoints to step through the code.
- Snapshots at the time of test failure
- Traces via [Playwright trace viewer](https://playwright.dev/docs/trace-viewer)
- Open source [Debugtopus](https://github.com/OctoMind-dev/debugtopus) tooling, which lets you run our tests locally, so you can set breakpoints to step through the code.

### Debugtopus

See details at [Debug your code](/debugtopus).
See more details in [Debug your code](/debugtopus).

<Frame caption="Debugtopus interaction diagram">
<img
@@ -37,11 +37,13 @@ See details at [Debug your code](/debugtopus).
/>
</Frame>

## Test suite runtime
## Parallel test runs for shorter runtime

Browser tests are not super fast since they simulate a real user. To provide test results as fast as possible, we parallelize test execution to the max.
To do so, we are fully cloud-based and scale instances up and down as needed. We are also working on techniques to separate test execution to avoid side effects, for better scaling.

Octomind tests are run in parallel, so your test suite will complete in 20 minutes or less, regardless of size.
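
In plain Playwright terms, this kind of parallelism is controlled by the `workers` and `fullyParallel` settings. A minimal sketch of the idea follows; the values are illustrative, and Octomind's cloud scheduling and instance scaling go beyond a single config file:

```typescript
// playwright.config.ts: minimal parallelism sketch; values are illustrative.
import { defineConfig } from "@playwright/test";

export default defineConfig({
  fullyParallel: true, // allow tests within a single file to run in parallel
  workers: 8,          // number of parallel worker processes on one machine
});
```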

## Flakiness

Test flakiness is the biggest problem of browser tests. Fighting flakiness is an active focus of research on our end. Some of the strategies we follow are:
