madmax3366 committed on
Commit 9b4a15e · verified · 1 Parent(s): 89d88d4

Update index.html

Files changed (1)
  1. index.html +178 -129
index.html CHANGED
@@ -5,164 +5,213 @@
5
  <meta name="viewport" content="width=device-width, initial-scale=1" />
6
  <title>AUTOMOTIVE-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems</title>
7
  <meta name="description" content="AUTOMOTIVE-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems" />
 
8
  <link rel="preconnect" href="https://fonts.googleapis.com">
9
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
10
  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet">
 
11
  <style>
12
- :root {
 
 
13
  --bg: #ffffff;
14
- --fg: #0a0a0a;
15
  --muted: #555;
16
- --border: #e6e6e6;
17
  --accent: #111;
18
- --maxw: 960px;
19
- --radius: 14px;
20
- --shadow: 0 1px 2px rgba(0,0,0,0.05), 0 6px 20px rgba(0,0,0,0.06);
21
  }
22
- * { box-sizing: border-box; }
23
- html, body { margin: 0; padding: 0; background: var(--bg); color: var(--fg); font-family: Inter, system-ui, -apple-system, Segoe UI, Roboto, Helvetica, Arial, "Apple Color Emoji", "Segoe UI Emoji"; }
24
- a { color: var(--accent); text-decoration: none; border-bottom: 1px solid rgba(0,0,0,0.1); }
25
- a:hover { border-bottom-color: rgba(0,0,0,0.3); }
26
- .wrap { max-width: var(--maxw); margin: 0 auto; padding: 32px 20px 80px; }
27
- header { text-align: center; padding: 40px 0 24px; }
28
- h1 { font-size: clamp(28px, 4.5vw, 40px); line-height: 1.15; margin: 0 0 16px; letter-spacing: -0.02em; }
29
- .lead { color: var(--muted); margin: 8px auto 18px; font-size: clamp(16px, 2vw, 18px); max-width: 840px; }
30
- .authors, .affils { margin: 10px auto 0; color: var(--muted); font-size: 15px; }
31
- .authors a { border-bottom: 1px dashed rgba(0,0,0,0.2); }
32
- .badgebar { display: inline-flex; gap: 10px; margin-top: 18px; }
33
- .badge { display: inline-block; font-size: 14px; padding: 8px 12px; border: 1px solid var(--border); border-radius: 999px; box-shadow: var(--shadow); background: #fff; }
34
- .section { margin: 30px 0; }
35
- .card { border: 1px solid var(--border); border-radius: var(--radius); box-shadow: var(--shadow); background: #fff; padding: 20px; }
36
- .card h2 { margin: 0 0 12px; font-size: 22px; }
37
- .video { overflow: hidden; }
38
- .video video, .video iframe { width: 100%; height: auto; display: block; border-radius: 12px; }
39
- .grid { display: grid; grid-template-columns: 1fr; gap: 16px; }
40
- @media (min-width: 900px) { .grid.two { grid-template-columns: 1fr 1fr; } }
41
- footer { margin-top: 40px; padding-top: 20px; border-top: 1px solid var(--border); color: var(--muted); text-align: center; font-size: 14px; }
42
- .mono { font-family: ui-monospace, SFMono-Regular, Menlo, Monaco, Consolas, "Liberation Mono", "Courier New", monospace; font-size: 13px; white-space: pre-wrap; word-break: break-word; background: #fafafa; border: 1px solid var(--border); border-radius: 8px; padding: 12px; }
43
  </style>
44
  </head>
 
45
  <body>
46
- <div class="wrap">
47
- <!-- header -->
48
- <header>
49
- <h1>AUTOMOTIVE-ENV: BENCHMARKING MULTIMODAL AGENTS IN VEHICLE INTERFACE SYSTEMS</h1>
50
  <div class="authors">
51
- <strong>Junfeng Yan</strong><sup>*1</sup>, <strong>Biao Wu</strong><sup>*1</sup>, <strong>Meng Fang</strong><sup>2</sup>, <strong>Ling Chen</strong><sup>1</sup>
 
 
 
52
  </div>
53
  <div class="affils">
54
- <sup>1</sup>Australian Artificial Intelligence Institute, Sydney, Australia &nbsp;&nbsp;|&nbsp;&nbsp; <sup>2</sup>University of Liverpool, Liverpool, United Kingdom
 
55
  </div>
 
56
  <p class="lead">
57
- Multimodal agents have shown strong general GUI abilities, but in-vehicle systems impose unique constraints: limited driver attention, strict safety, and location-aware interaction. <em>Automotive-ENV</em> is a high-fidelity benchmark and interaction environment for vehicle GUIs with 185 parameterized tasks and reproducible checks. We further propose <em>ASURADA</em>, a geo-aware agent that leverages GPS context for safer decisions.
58
  </p>
59
- <div class="badgebar">
60
- <a class="badge" href="https://arxiv.org/abs/2509.21143" target="_blank" rel="noopener">Paper</a>
61
- <a class="badge" href="#" target="_blank" rel="noopener">Code: Release soon</a>
62
  </div>
63
- </header>
 
64
 
65
- <section class="card figure" aria-label="system-overview">
 
 
66
  <h2>System Overview</h2>
67
- <img src="demo_arch.jpg" alt="System architecture diagram" loading="lazy">
68
- <p class="caption">Figure 1. Automotive-ENV architecture overview.</p>
69
- </section>
70
-
71
- <!-- demo video -->
72
- <section class="section">
73
- <div class="card video" aria-label="demo video">
74
- <!-- Place demo.mp4 at the repo root (same folder as this index.html) -->
75
- <video src="demo.mp4" autoplay muted loop playsinline controls></video>
76
- </div>
77
- </section>
78
-
79
- <!-- abstract + quick highlights -->
80
- <section class="section grid two">
81
- <div class="card">
82
- <h2>Abstract</h2>
83
- <p>
84
- In-vehicle GUIs present distinct challenges: drivers’ limited attention, strict safety
85
- requirements, and complex location-based interaction patterns. We introduce
86
- <strong>Automotive-ENV</strong>, the first high-fidelity benchmark and interaction
87
- environment tailored for vehicle GUIs. The platform defines <strong>185 parameterized tasks</strong>
88
- spanning explicit control, implicit intent, and safety-aware tasks, and provides structured
89
- multimodal observations with precise programmatic checks for reproducible evaluation.
90
- </p>
91
- <p>
92
- Building on this benchmark, we propose <strong>ASURADA</strong>, a geo-aware multimodal agent that
93
- integrates GPS-informed context to adapt actions by location, environment, and regional norms.
94
- Experiments show geo-awareness significantly improves safety-aware task success. We will release
95
- Automotive-ENV, with tasks and tooling, to advance safe and adaptive in-vehicle agents.
96
- </p>
97
  </div>
98
- <div class="card">
99
- <h2>Highlights</h2>
100
- <ul>
101
- <li>High-fidelity vehicle GUI environment with reproducible checks.</li>
102
- <li>185 parameterized tasks across control, intent, and safety categories.</li>
103
- <li>Structured multimodal observations and programmatic success criteria.</li>
104
- <li>ASURADA: GPS/geo-aware planning boosts safety-aware task success.</li>
105
- </ul>
 
 
106
  </div>
107
- </section>
108
-
109
- <!-- tasks placeholder (you can expand later) -->
110
- <section class="section">
111
- <div class="card">
112
- <h2>Tasks (preview)</h2>
113
- <p>
114
- This section is reserved for a compact task overview similar to os-world:
115
- categories, difficulty tiers, and a few illustrative examples with thumbnails or short clips.
116
- </p>
117
- <div class="grid two">
118
- <div>
119
- <h3>Explicit Control</h3>
120
- <ul>
121
- <li>Climate, media, navigation, connectivity</li>
122
- <li>Deterministic UI manipulations with constraints</li>
123
- </ul>
124
- </div>
125
- <div>
126
- <h3>Implicit Intent</h3>
127
- <ul>
128
- <li>Goal inference from short user context</li>
129
- <li>Minimal UI steps with preference awareness</li>
130
- </ul>
131
- </div>
132
- <div>
133
- <h3>Safety-Aware</h3>
134
- <ul>
135
- <li>Sensor + context classification (danger vs. do-nothing)</li>
136
- <li>Strict action gating and escalation logic</li>
137
- </ul>
138
- </div>
139
- <div>
140
- <h3>Evaluation</h3>
141
- <ul>
142
- <li>Programmatic checks, success/failure traces</li>
143
- <li>Generalization splits and ablations</li>
144
- </ul>
145
- </div>
146
  </div>
147
  </div>
148
- </section>
 
149
 
150
- <!-- bibtex -->
151
- <section class="section">
152
- <div class="card">
153
- <h2>Citation</h2>
154
- <pre class="mono">@article{yan2025automotive_env,
155
  title = {AUTOMOTIVE-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems},
156
  author = {Yan, Junfeng and Wu, Biao and Fang, Meng and Chen, Ling},
157
  journal = {arXiv preprint arXiv:2509.21143},
158
  year = {2025}
159
- }</pre>
160
- </div>
161
- </section>
162
 
163
- <footer>
164
- © 2025 automotive-env hosted on GitHub Pages
165
- </footer>
166
- </div>
167
  </body>
168
  </html>
 
5
  <meta name="viewport" content="width=device-width, initial-scale=1" />
6
  <title>AUTOMOTIVE-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems</title>
7
  <meta name="description" content="AUTOMOTIVE-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems" />
8
+
9
  <link rel="preconnect" href="https://fonts.googleapis.com">
10
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
11
  <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&display=swap" rel="stylesheet">
12
+
13
  <style>
14
+ :root{
15
+ --page-w: 1100px;
16
+ --fg: #0b0b0b;
17
  --bg: #ffffff;
 
18
  --muted: #555;
19
+ --border: #e7e7ea;
20
  --accent: #111;
21
+ --accent-weak: rgba(0,0,0,.08);
22
+ --shadow: 0 1px 2px rgba(0,0,0,.05), 0 6px 22px rgba(0,0,0,.06);
23
+ --radius: 12px;
24
+ }
25
+ *{box-sizing:border-box}
26
+ html,body{margin:0;padding:0;background:var(--bg);color:var(--fg);font-family:Inter,system-ui,-apple-system,Segoe UI,Roboto,Helvetica,Arial}
27
+ a{color:var(--accent);text-decoration:none;border-bottom:1px solid var(--accent-weak)}
28
+ a:hover{border-bottom-color:rgba(0,0,0,.28)}
29
+ .container{max-width:var(--page-w);margin:0 auto;padding:0 20px}
30
+
31
+ /* Minimal top nav (centered like many paper pages) */
32
+ .nav{border-bottom:1px solid var(--border);background:#fff}
33
+ .nav-inner{display:flex;align-items:center;justify-content:center;gap:20px;height:54px}
34
+ .nav a{border-bottom:0;font-weight:600;color:#222}
35
+
36
+ /* Hero / header */
37
+ header.hero{padding:48px 0 28px;border-bottom:1px solid var(--border)}
38
+ h1.title{font-size:clamp(28px,4.2vw,46px);line-height:1.12;margin:0 0 10px;letter-spacing:-0.02em;text-align:center}
39
+ .authors,.affils{color:var(--muted);text-align:center}
40
+ .authors{margin:6px auto 0;font-size:15px}
41
+ .affils{margin:2px auto 0;font-size:14px}
42
+ .lead{max-width:900px;margin:14px auto 0;text-align:center;color:var(--muted);font-size:clamp(16px,2vw,18px)}
43
+
44
+ /* Link badges (Paper / Code) */
45
+ .links{display:flex;gap:12px;justify-content:center;margin-top:18px}
46
+ .badge{display:inline-flex;align-items:center;gap:8px;padding:10px 14px;border:1px solid var(--border);border-radius:999px;background:#fff;box-shadow:var(--shadow);font-size:14px}
47
+ .badge span.icon{font-weight:700;font-size:14px}
48
+
49
+ /* Sections in paper style */
50
+ section{padding:34px 0;border-bottom:1px solid var(--border)}
51
+ section:last-of-type{border-bottom:0}
52
+ h2{font-size:22px;margin:0 0 14px}
53
+ p{margin:10px 0}
54
+
55
+ /* Figure (image above video) with zoom */
56
+ .figure{margin-top:6px}
57
+ .figure img{
58
+ width:100%;height:auto;display:block;
59
+ max-height:72vh;object-fit:contain;
60
+ border:1px solid var(--border);border-radius:var(--radius);
61
+ background:#fff;
62
  }
63
+ .caption{font-size:14px;color:var(--muted);text-align:center;margin-top:8px}
64
+
65
+ /* CSS-only lightbox */
66
+ .lightbox{position:fixed;inset:0;display:none;align-items:center;justify-content:center;background:rgba(0,0,0,.92);padding:24px;z-index:999}
67
+ .lightbox:target{display:flex}
68
+ .lightbox img{max-width:96vw;max-height:96vh}
69
+
70
+ /* Video */
71
+ .video video, .video iframe{width:100%;height:auto;display:block;border-radius:var(--radius);background:#000;border:1px solid var(--border)}
72
+
73
+ /* Grid for “Tasks (preview)” */
74
+ .grid{display:grid;gap:18px}
75
+ @media (min-width: 880px){ .grid.two{grid-template-columns:1fr 1fr} }
76
+
77
+ /* Code block (BibTeX) */
78
+ pre{background:#fafafa;border:1px solid var(--border);border-radius:10px;padding:14px;overflow:auto}
79
+ code{font-family:ui-monospace,SFMono-Regular,Menlo,Consolas,monospace;font-size:13px}
80
+
81
+ /* Footer */
82
+ footer{padding:26px 0;color:var(--muted);text-align:center;font-size:14px}
 
83
  </style>
84
  </head>
85
+
86
  <body>
87
+ <!-- minimal top nav (optional) -->
88
+ <nav class="nav">
89
+ <div class="container nav-inner">
90
+ <a href="#">AUTOMOTIVE-ENV</a>
91
+ <a href="https://arxiv.org/abs/2509.21143" target="_blank" rel="noopener">Paper</a>
92
+ <a href="#" target="_blank" rel="noopener">Code</a>
93
+ </div>
94
+ </nav>
95
+
96
+ <!-- hero -->
97
+ <header class="hero">
98
+ <div class="container">
99
+ <h1 class="title">AUTOMOTIVE-ENV: BENCHMARKING MULTIMODAL AGENTS IN VEHICLE INTERFACE SYSTEMS</h1>
100
+
101
  <div class="authors">
102
+ <strong>Junfeng Yan</strong><sup>*1</sup>,
103
+ <strong>Biao Wu</strong><sup>*1</sup>,
104
+ <strong>Meng Fang</strong><sup>2</sup>,
105
+ <strong>Ling Chen</strong><sup>1</sup>
106
  </div>
107
  <div class="affils">
108
+ <sup>1</sup>Australian Artificial Intelligence Institute, Sydney, Australia &nbsp;&nbsp;|&nbsp;&nbsp;
109
+ <sup>2</sup>University of Liverpool, Liverpool, United Kingdom
110
  </div>
111
+
112
  <p class="lead">
113
+ Multimodal agents show strong generic GUI skills, but in-vehicle systems impose unique constraints: limited driver attention, strict safety, and location-aware interaction. <em>Automotive-ENV</em> is a high-fidelity benchmark for vehicle GUIs with 185 parameterized tasks and reproducible checks. We further propose <em>ASURADA</em>, a geo-aware agent leveraging GPS context for safer decisions.
114
  </p>
115
+
116
+ <div class="links">
117
+ <a class="badge" href="https://arxiv.org/abs/2509.21143" target="_blank" rel="noopener">
118
+ <span class="icon">⧉</span><span>Paper (arXiv)</span>
119
+ </a>
120
+ <a class="badge" href="#" target="_blank" rel="noopener">
121
+ <span class="icon">★</span><span>Code (coming soon)</span>
122
+ </a>
123
  </div>
124
+ </div>
125
+ </header>
126
 
127
+ <!-- system overview image (click to zoom) -->
128
+ <section aria-label="system-overview">
129
+ <div class="container">
130
  <h2>System Overview</h2>
131
+ <div class="figure">
132
+ <!-- Put your image next to index.html as demo_arch.jpg (or change the src) -->
133
+ <a href="#fig-arch"><img src="demo_arch.jpg" alt="Automotive-ENV architecture overview"></a>
134
+ <p class="caption">Figure 1. Automotive-ENV architecture overview. Click to zoom.</p>
135
  </div>
136
+ </div>
137
+ </section>
138
+
139
+ <!-- teaser / demo video -->
140
+ <section aria-label="demo">
141
+ <div class="container">
142
+ <h2>Demo</h2>
143
+ <!-- Place demo.mp4 next to this index.html -->
144
+ <div class="video">
145
+ <video src="demo.mp4" autoplay muted loop playsinline controls></video>
146
  </div>
147
+ </div>
148
+ </section>
149
+
150
+ <!-- abstract -->
151
+ <section aria-label="abstract">
152
+ <div class="container">
153
+ <h2>Abstract</h2>
154
+ <p>
155
+ In-vehicle GUIs present distinct challenges: drivers’ limited attention, strict safety requirements, and
156
+ complex location-based interaction patterns. We introduce <strong>Automotive-ENV</strong>, a high-fidelity benchmark and
157
+ interaction environment tailored for vehicle GUIs. The platform defines <strong>185 parameterized tasks</strong> spanning
158
+ explicit control, implicit intent understanding, and safety-aware tasks, and provides structured multimodal
159
+ observations with precise programmatic checks for reproducible evaluation.
160
+ </p>
161
+ <p>
162
+ Building on this benchmark, we propose <strong>ASURADA</strong>, a geo-aware multimodal agent that integrates GPS-informed
163
+ context to adapt actions by location, environment, and regional norms. Experiments show geo-awareness
164
+ significantly improves safety-aware task success. We will release Automotive-ENV, with tasks and tooling, to
165
+ advance safe and adaptive in-vehicle agents.
166
+ </p>
167
+ </div>
168
+ </section>
169
+
170
+ <!-- tasks preview (reserved area you can expand later) -->
171
+ <section aria-label="tasks">
172
+ <div class="container">
173
+ <h2>Tasks (preview)</h2>
174
+ <div class="grid two">
175
+ <div>
176
+ <h3>Explicit Control</h3>
177
+ <p>Deterministic UI manipulations under constraints (climate, media, navigation, connectivity).</p>
178
+ </div>
179
+ <div>
180
+ <h3>Implicit Intent</h3>
181
+ <p>Goal inference from short user context with preference awareness and minimal steps.</p>
182
+ </div>
183
+ <div>
184
+ <h3>Safety-Aware</h3>
185
+ <p>Sensor + context classification (danger vs. do-nothing) with strict action gating and escalation logic.</p>
186
+ </div>
187
+ <div>
188
+ <h3>Evaluation</h3>
189
+ <p>Programmatic checks, success/failure traces, generalization splits, and ablations.</p>
190
  </div>
191
  </div>
192
+ </div>
193
+ </section>
194
 
195
+ <!-- citation -->
196
+ <section aria-label="citation">
197
+ <div class="container">
198
+ <h2>Citation</h2>
199
+ <pre><code>@article{yan2025automotive_env,
200
  title = {AUTOMOTIVE-ENV: Benchmarking Multimodal Agents in Vehicle Interface Systems},
201
  author = {Yan, Junfeng and Wu, Biao and Fang, Meng and Chen, Ling},
202
  journal = {arXiv preprint arXiv:2509.21143},
203
  year = {2025}
204
+ }</code></pre>
205
+ </div>
206
+ </section>
207
+
208
+ <footer>
209
+ © 2025 automotive-env — hosted on GitHub Pages
210
+ </footer>
211
 
212
+ <!-- Lightbox target (click anywhere to close) -->
213
+ <a id="fig-arch" class="lightbox" href="#">
214
+ <img src="demo_arch.jpg" alt="Automotive-ENV architecture overview (full size)">
215
+ </a>
216
  </body>
217
  </html>
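
Note on the CSS-only lightbox added in this commit: it appears to rely on the `:target` pseudo-class, so the overlay is shown only while the URL fragment matches the overlay's `id`. A minimal standalone sketch of that pattern follows; the `id` value `zoom` and the file name `figure.jpg` are placeholders chosen for illustration and are not taken from the commit.

```html
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>:target lightbox sketch</title>
  <style>
    /* Overlay is hidden by default and covers the viewport when shown */
    .lightbox{position:fixed;inset:0;display:none;align-items:center;justify-content:center;background:rgba(0,0,0,.92)}
    /* Shown only while the URL fragment equals this element's id */
    .lightbox:target{display:flex}
    .lightbox img{max-width:96vw;max-height:96vh}
  </style>
</head>
<body>
  <!-- Clicking the thumbnail sets the fragment to #zoom (placeholder id) -->
  <a href="#zoom"><img src="figure.jpg" alt="Figure thumbnail" width="320"></a>

  <!-- Clicking the overlay resets the fragment, which hides it again -->
  <a id="zoom" class="lightbox" href="#"><img src="figure.jpg" alt="Figure, full size"></a>
</body>
</html>
```

One trade-off of this approach: closing via `href="#"` also scrolls the page back to the top, and each open or close adds an entry to the browser history. That is usually acceptable for a single figure on a short project page.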