Commit

Update index.html
boyuanzheng010 committed Jan 3, 2024
1 parent 9576b12 commit 733f22b
Showing 1 changed file with 11 additions and 9 deletions.
20 changes: 11 additions & 9 deletions index.html
@@ -727,24 +727,26 @@ <h1 class="title is-1 mmmu">
 <h2 class="title is-3">Results</h2>
 <div class="content has-text-justified">
 <p>
-SeeAct can successfully complete 50% of tasks on different websites if provided an oracle grounding method. We further investigate the performance of web agents on tasks across different difficulty levels. We estimate the task difficulty based on the number of actions taken by annotators during action trace annotation, i.e., Easy: 2-4, Medium: 5-7, and Hard: 8-12, with 26, 15, and 9 tasks in each group, respectively.
+SeeAct can successfully complete 50% of tasks on different websites if provided an oracle grounding method. We further investigate the performance of web agents on tasks across different difficulty levels. We estimate the task difficulty based on the number of actions taken by annotators during action trace annotation.
+<!-- , i.e., Easy: 2-4, Medium: 5-7, and Hard: 8-12, with 26, 15, and 9 tasks in each group, respectively.-->
 </p>
 <!-- <img src="static/images/sr_by_action_length_difficulty.png" alt="algebraic reasoning" class="center" style="width: 45%; height: auto;">-->

-<div style="display: flex; justify-content: space-around;">
-<img src="static/images/online_results_table.jpg" alt="table" style="width: 45%; height: 60%;">
-<p></p>
-<img src="static/images/sr_by_action_length_difficulty.png" alt="algebraic reasoning" style="width: 45%; height: auto;">
-</div>
+<!-- <div style="display: flex; justify-content: space-around;">-->
+<!-- <img src="static/images/online_results_table.jpg" alt="table" style="width: 45%; height: 60%;">-->
+<!-- <p></p>-->
+<!-- <img src="static/images/sr_by_action_length_difficulty.png" alt="algebraic reasoning" style="width: 45%; height: auto;">-->
+<!-- </div>-->

 <div style="display: flex; justify-content: space-around;">
 <figure>
 <img src="static/images/sr_by_action_length_difficulty.png" alt="algebraic reasoning" style="width: 45%; height: auto;">
-<figcaption>Fig.1 - Image caption one</figcaption>
+<figcaption>Whole task success rate across task difficulty levels. We categorize tasks based on the number of actions to complete, i.e., Easy: 2-4, Medium: 5-7, and Hard: 8-12, with 26, 15, and 9 tasks in each group, respectively.</figcaption>
 </figure>
 <figure>
-<img src="static/images/your_second_image.png" alt="your second image description" style="width: 45%; height: auto;">
-<figcaption>Fig.2 - Image caption two</figcaption>
+<img src="static/images/online_results_table.jpg" alt="your second image description" style="width: 45%; height: auto;">
+<figcaption>Whole task success rate (%) under both offline and online evaluation.
+Offline0 and Offline1 refer to no tolerance for error at any step and allowing for error at one step, respectively.</figcaption>
 </figure>
 </div>
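
The two captions in this change describe simple rules: tasks are grouped by the number of annotated actions (Easy: 2-4, Medium: 5-7, Hard: 8-12), and Offline0 / Offline1 count a task as successful with zero or at most one erroneous step. The following is a minimal sketch of those rules, not the project's evaluation code. The names `DIFFICULTY_RANGES`, `difficulty_level`, and `offline_success` are hypothetical, and reading "allowing for error at one step" as "at most one incorrect step" is an assumption.

```python
from typing import Optional, Sequence

# Ranges quoted from the figure caption: Easy 2-4, Medium 5-7, Hard 8-12.
DIFFICULTY_RANGES = {
    "Easy": (2, 4),
    "Medium": (5, 7),
    "Hard": (8, 12),
}


def difficulty_level(num_actions: int) -> Optional[str]:
    """Map an annotated action count to a difficulty group.

    Returns None for counts outside 2-12, since the caption does not
    say how such tasks would be handled (assumption).
    """
    for level, (low, high) in DIFFICULTY_RANGES.items():
        if low <= num_actions <= high:
            return level
    return None


def offline_success(step_correct: Sequence[bool], tolerance: int) -> bool:
    """Whole-task success when at most `tolerance` steps are wrong.

    tolerance=0 corresponds to Offline0 and tolerance=1 to Offline1,
    under the assumed reading of the caption.
    """
    errors = sum(1 for ok in step_correct if not ok)
    return errors <= tolerance
```

For example, a task with five annotated actions falls in the Medium group, and a trace with exactly one wrong step fails under Offline0 but passes under Offline1.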
