Update index.html
index.html +40 -0
index.html
CHANGED
|
@@ -234,6 +234,46 @@
|
|
| 234 |
</div>
|
| 235 |
</section>
|
| 236 |
|
| 237 |
<!-- BibTeX -->
|
| 238 |
<section class="section" id="BibTeX">
|
| 239 |
<div class="container is-max-desktop content">
|
| 234 |
</div>
|
| 235 |
</section>
|
| 236 |
|
| 237 |
+
<!-- Results and Analysis -->
|
| 238 |
+
<section class="section" id="results-analysis">
|
| 239 |
+
<div class="container is-max-desktop">
|
| 240 |
+
<div class="columns is-centered">
|
| 241 |
+
<div class="column is-four-fifths">
|
| 242 |
+
<h2 class="title is-3">Results and Analysis</h2>
|
| 243 |
+
|
| 244 |
+
<div class="content has-text-justified">
|
| 245 |
+
<p>
|
| 246 |
+
We evaluate multiple agent configurations on <strong>Automotive-ENV</strong>, reporting success
|
| 247 |
+
rates across General tasks (Explicit Control, Implicit Intent) and Safety-Aware tasks
|
| 248 |
+
(Driving Alignment, Environment Alerts). We also analyze the effect of GPS-aware context
|
| 249 |
+
on inference token usage and task-wise performance across hotspot categories.
|
| 250 |
+
</p>
|
| 251 |
+
</div>
|
| 252 |
+
|
| 253 |
+
<!-- Figure 1: Success rates -->
|
| 254 |
+
<figure class="system-figure has-text-centered" style="margin-top:12px;">
|
| 255 |
+
<img src="./static/images/results.jpg" alt="Success rates of different agent configurations across task groups">
|
| 256 |
+
<figcaption class="subtitle is-6" style="margin-top:8px;">
|
| 257 |
+
Success rates (SR %) of different agent configurations on Automotive-ENV. Results are
|
| 258 |
+
reported across General tasks (Explicit Control, Implicit Intent) and Safety-Aware tasks
|
| 259 |
+
(Driving Alignment, Environment Alerts).
|
| 260 |
+
</figcaption>
|
| 261 |
+
</figure>
|
| 262 |
+
|
| 263 |
+
<!-- Figure 2: Tokens & GPS comparison -->
|
| 264 |
+
<figure class="system-figure has-text-centered" style="margin-top:18px;">
|
| 265 |
+
<img src="./static/images/task_and_check.jpg" alt="Token length distributions and task-wise performance with vs. without GPS">
|
| 266 |
+
<figcaption class="subtitle is-6" style="margin-top:8px;">
|
| 267 |
+
Comparison of inference tokens with and without GPS information. Left: distribution of
|
| 268 |
+
token lengths. Right: task-wise performance across hotspot categories.
|
| 269 |
+
</figcaption>
|
| 270 |
+
</figure>
|
| 271 |
+
|
| 272 |
+
</div>
|
| 273 |
+
</div>
|
| 274 |
+
</div>
|
| 275 |
+
</section>
|
| 276 |
+
|
| 277 |
<!-- BibTeX -->
|
| 278 |
<section class="section" id="BibTeX">
|
| 279 |
<div class="container is-max-desktop content">
|