<!DOCTYPE HTML>
<html lang="en">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
<title>Akshita Gupta</title>
<meta content="Akshita Gupta" name="author">
<meta content="width=device-width, initial-scale=1" name="viewport">
<link href="stylesheet.css" rel="stylesheet" type="text/css">
<link href="images/seal_icon.png" rel="icon" type="image/png">
<style>
/* Apply the Calibri font to the entire page */
body {
font-family: 'Calibri', sans-serif;
}
#myimg {
width: 100%;
max-width: 100%;
border-radius: 50%;
border: 1px solid #ddd;
padding: 5px;
}
p {
line-height: 1.7;
font-size: 16px;
}
ul li {
font-size: 16px;
line-height: 1.6;
}
</style>
</head>
<body>
<table style="width:100%;max-width:800px;border:0px;border-spacing:0px;border-collapse:separate;margin-right:auto;margin-left:auto;">
<tbody>
<tr style="padding:0px">
<td style="padding:0px">
<table style="width:100%;border:0px;border-spacing:0px;border-collapse:separate;margin-right:auto;margin-left:auto;">
<tbody>
<tr style="padding:0px">
<td style="padding:2.5%;width:63%;vertical-align:middle">
<p align="center" id="namechange">
<name>Akshita Gupta</name>
</p>
<p style="text-align:justify; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">
I am an ELLIS PhD student at <span style="color:#C50000;">TU Darmstadt</span>, co-supervised
by <a href="https://rohrbach.vision/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Prof. Marcus Rohrbach</b></a> and
<a href="https://federicotombari.github.io/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Dr. Federico Tombari</b></a> at
<span style="color:#C50000;">Google Zurich</span>.
I completed my MASc at the <span style="color:#C50000;">University of Guelph</span>, where I was advised by
<a href="https://www.gwtaylor.ca/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Prof. Graham Taylor</b></a>.
During that time, I was also a student researcher at the
<a href="https://vectorinstitute.ai/" style="text-decoration: none; color: #C50000; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">Vector Institute</a>.
</p>
<p style="text-align:justify; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">
I was fortunate to spend time as a research intern at
<span style="color:#C50000;">Apple</span> under
<a href="https://scholar.google.com/citations?user=x7Z3ysQAAAAJ&hl=ru" style="text-decoration: none; color:#0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Dr. Tatiana Likhomanenko</b></a>,
<span style="color:#C50000;">Microsoft</span> under
<a href="https://g1910.github.io/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Gaurav Mittal</b></a> and
<a href="https://www.microsoft.com/en-us/research/people/meic/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Mei Chen</b></a>,
<span style="color:#C50000;">Vector Institute</span> under
<a href="https://sites.google.com/view/dbemerson" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Dr. David Emerson</b></a>,
and as a scientist in residence at
<span style="color:#C50000;">NextAI</span> with <a href="https://www.gwtaylor.ca/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Prof. Graham Taylor</b></a>.
</p>
<p style="text-align:justify; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">
Before academia, I worked as a Data Scientist at
<a href="https://space42.ai/en" style="text-decoration: none; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">
<span style="color:#C50000;">Bayanat</span>
</a>,
where I focused on projects related to detection and segmentation.
Prior to that, I was a Research Engineer at the
<span style="color:#C50000;">Inception Institute of Artificial Intelligence (IIAI)</span>,
working with
<a href="https://sites.google.com/view/sanath-narayan" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Dr. Sanath Narayan</b></a>,
<a href="https://salman-h-khan.github.io/" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Dr. Salman Khan</b></a>, and
<a href="https://sites.google.com/view/fahadkhans/home" style="text-decoration: none; color: #0071C5; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;"><b>Dr. Fahad Shahbaz Khan</b></a>.
</p>
<p style="text-align:center; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">
<a href="mailto:akshita.sem.iitr@gmail.com" style="text-decoration: none; color: #0071C5; font-weight: bold;">Email</a> /
<a href="https://scholar.google.com/citations?user=G01YeI0AAAAJ&hl=en" style="text-decoration: none; color: #0071C5; font-weight: bold;">Google Scholar</a> /
<a href="https://twitter.com/akshitac8" style="text-decoration: none; color: #0071C5; font-weight: bold;">Twitter</a> /
<a href="https://github.com/akshitac8" style="text-decoration: none; color: #0071C5; font-weight: bold;">Github</a> /
<a href="https://akshitac8.github.io/Gupta_Akshita_resume-12.pdf" style="text-decoration: none; color: #0071C5; font-weight: bold;">Resume/CV</a>
</p>
</td>
<td style="padding:2.5%;width:40%;max-width:40%">
<a href="images/profile_aks.png"><img alt="profile photo" class="hoverZoomLink"
id="myimg" src="images/profile_aks.png"></a>
</td>
</tr>
</tbody>
</table>
<div style="display: grid; grid-template-columns: repeat(4, 1fr); gap: 40px; justify-items: center; align-items: center; margin-top: 30px; margin-bottom: 30px;">
<div style="text-align: center;">
<img src="logos/darmstadt.png" style="height: 100px;">
<div style="font-weight: bold;">TU Darmstadt<br>2025-Present</div>
</div>
<div style="text-align: center;">
<img src="logos/apple.jpeg" style="height: 100px;">
<div style="font-weight: bold;">Apple<br>2024-2025</div>
</div>
<div style="text-align: center;">
<img src="logos/uog.png" style="height: 100px;">
<div style="font-weight: bold;">University of Guelph<br>2022-2024</div>
</div>
<div style="text-align: center;">
<img src="logos/vector.png" style="height: 100px;">
<div style="font-weight: bold;">Vector Institute<br>2022-2024</div>
</div>
<div style="text-align: center;">
<img src="logos/microsoft.png" style="height: 100px;">
<div style="font-weight: bold;">Microsoft Research<br>2023-2024</div>
</div>
<div style="text-align: center;">
<img src="logos/nextAI.jpeg" style="height: 100px;">
<div style="font-weight: bold;">NextAI<br>2024</div>
</div>
<div style="text-align: center;">
<img src="logos/bayanat.jpeg" style="height: 100px;">
<div style="font-weight: bold;">Bayanat for Mapping & Surveying<br>2022</div>
</div>
<div style="text-align: center;">
<img src="logos/iiai_logo.png" style="height: 100px;">
<div style="font-weight: bold;">Inception Institute of Artificial Intelligence<br>2018-2022</div>
</div>
</div>
<h2 style="text-align:left; margin-left: 10px;">What's New ✨</h2>
<div class="news-container">
<table style="width:100%; border:0px; border-spacing:4px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family: calibri, sans-serif; font-size:16px; line-height:1.7;">
<tbody>
<tr><td><strong>[Mar 2025]</strong></td><td>🎓 Excited to be an ELLIS PhD student at <strong>TU Darmstadt</strong> under Prof. Marcus Rohrbach and Dr. Federico Tombari (Google Zurich) 🎉</td></tr>
<tr><td><strong>[Nov 2024]</strong></td><td>📝 Our paper <a href="https://arxiv.org/pdf/2411.17690" style="text-decoration:none; color:#0071C5; font-weight:bold;">Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis</a> is now on <strong>arXiv</strong>!</td></tr>
<tr><td><strong>[Oct 2024]</strong></td><td>🎓 Graduated and defended my <a href="https://atrium.lib.uoguelph.ca/items/67a35868-ca5a-494f-9116-62ea1c57b733" style="text-decoration:none; color:#0071C5; font-weight:bold;">Master's thesis</a>.</td></tr>
<tr><td><strong>[Jun 2024]</strong></td><td>🍏 Joined <strong>Apple</strong> as a Research Intern!</td></tr>
<tr><td><strong>[May 2024]</strong></td><td>🧠 Serving as a Scientist-in-Residence at <strong>NextAI</strong>.</td></tr>
<tr><td><strong>[Jan 2024]</strong></td><td>🏆 Our paper <a href="https://arxiv.org/abs/2404.01282" style="text-decoration:none; color:#0071C5; font-weight:bold;">Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization</a> accepted at <strong>WACV 2025 (<span style="color:red;">Oral</span>)</strong>! 🎤</td></tr>
<tr><td><strong>[Dec 2023]</strong></td><td>📚 Our work <a href="https://arxiv.org/pdf/2406.15556" style="text-decoration:none; color:#0071C5; font-weight:bold;">Open-Vocabulary Temporal Action Localization using Multimodal Guidance</a> accepted at <strong>BMVC 2024</strong>!</td></tr>
<tr><td><strong>[Jun 2023]</strong></td><td>🧪 Our paper <a href="https://arxiv.org/pdf/2101.11606.pdf" style="text-decoration:none; color:#0071C5; font-weight:bold;">Generative Multi-Label Zero-Shot Learning</a> accepted at TPAMI 2023.</td></tr>
<tr><td><strong>[Jun 2023]</strong></td><td>🚀 Started interning at Microsoft, ROAR team.</td></tr>
<tr><td><strong>[Jan 2023]</strong></td><td>🤖 Interned at the Vector Institute with the AI Engineering team.</td></tr>
<tr><td><strong>[Sep 2022]</strong></td><td>🔬 Joined Prof. Graham Taylor's lab and the Vector Institute.</td></tr>
<tr><td><strong>[Mar 2022]</strong></td><td>🏅 OW-DETR accepted at CVPR 2022.</td></tr>
<tr><td><strong>[Sep 2021]</strong></td><td>✍️ Reviewer for CVPR 2023, CVPR 2022, ECCV 2022, ICCV 2021, TPAMI.</td></tr>
<tr><td><strong>[Jul 2021]</strong></td><td>🏅 BiAM accepted at ICCV 2021.</td></tr>
<tr><td><strong>[Feb 2021]</strong></td><td>✍️ Serving as a reviewer for ML Reproducibility Challenge 2020.</td></tr>
<tr><td><strong>[Jan 2021]</strong></td><td>📝 Paper out on arXiv: <a href="https://arxiv.org/pdf/2101.11606.pdf" style="text-decoration:none; color:#0071C5; font-weight:bold;">Generative Multi-Label Zero-Shot Learning</a></td></tr>
<tr><td><strong>[Jul 2020]</strong></td><td>🏅 TF-VAEGAN accepted at ECCV 2020.</td></tr>
<tr><td><strong>[Aug 2019]</strong></td><td>🛰️ A Large-scale Instance Segmentation Dataset for Aerial Images (iSAID) available for <a href="https://captain-whu.github.io/iSAID/index.html" style="text-decoration:none; color:#0071C5; font-weight:bold;">download</a>.</td></tr>
<tr><td><strong>[Aug 2018]</strong></td><td>🎤 One paper accepted at Interspeech, CHiME Workshop 2018.</td></tr>
<tr><td><strong>[May 2018]</strong></td><td>🌟 Selected as an Outreachy intern with Mozilla.</td></tr>
</tbody>
</table>
</div>
<!-- Reviewing Section -->
<h2 style="text-align:left; margin-left: 10px;">Conference and Journal Reviewing 📚</h2>
<p style="font-family: calibri, sans-serif; font-size:16px; line-height:1.7; margin-left:10px;">
<strong>CVPR</strong> (2022–2025) |
<strong>ECCV</strong> (2022, 2024) |
<strong>ICCV</strong> (2021) |
<strong>TPAMI</strong> (Journal)
</p>
<!-- Invited Talks Section -->
<h2 style="text-align:left; margin-left: 10px;">Invited Talks 🎤</h2>
<ul style="font-family: calibri, sans-serif; font-size:16px; line-height:1.7; margin-left:10px;">
<li><strong>[Mar 2025]</strong> — Gave a talk at the UCF CRCV lab — thank you, Prof. Shah, for hosting me!</li>
<li><strong>[Dec 2021]</strong> — Computer Vision Talks (<a href="https://youtu.be/0MZxWozdRiM" target="_blank" style="text-decoration: none; color: #0071C5; font-weight: bold;">YouTube Link</a>)</li>
</ul>
<!-- Research Interests Section -->
<h2 style="text-align:left; margin-left: 10px;">Research Interests 🔍 </h2>
<p style="text-align:justify; margin-left: 10px; margin-right: 10px;">
I am broadly interested in building scalable multimodal models that combine vision, language, and speech, with a focus on efficient modeling, temporal understanding, and open-world generalization.
</p>
<h2 style="text-align:left; margin-left: 10px;">Publications 📄 </h2>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody>
<tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/visatronic.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://arxiv.org/pdf/2411.17690" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis</div>
</a>
<div>
<strong>Akshita Gupta</strong>,
<a href="https://scholar.google.com/citations?user=x7Z3ysQAAAAJ&hl=ru" style="text-decoration:none; color:#0071C5;">Tatiana Likhomanenko</a>,
<a href="https://karreny.github.io" style="text-decoration:none; color:#0071C5;">Karren Yang</a>,
<a href="https://richardbaihe.github.io" style="text-decoration:none; color:#0071C5;">Richard Bai</a>,
<a href="https://scholar.google.com/citations?user=1AHzh04AAAAJ&hl=en" style="text-decoration:none; color:#0071C5;">Zakaria Aldeneh</a>,
<a href="https://scholar.google.com/citations?user=kjMNMLkAAAAJ&hl=en" style="text-decoration:none; color:#0071C5;">Navdeep Jaitly</a>
</div>
<strong>arXiv 2025</strong> |
<a href="https://arxiv.org/pdf/2411.17690" style="text-decoration:none; color:#0071C5; font-weight:bold;">Paper</a>
</td>
</tr>
</tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto;margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/losa.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://arxiv.org/pdf/2404.01282" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">LoSA: Long-Short-range Adapter for Scaling End-to-End Temporal Action Localization</div>
</a>
<strong>Akshita Gupta<sup>*</sup></strong>,
<a href="https://g1910.github.io/" style="text-decoration:none; color:#0071C5;">Gaurav Mittal<sup>*</sup></a>,
<a href="#" style="text-decoration:none; color:#0071C5;">Ahmed Magooda</a>,
<a href="#" style="text-decoration:none; color:#0071C5;">Ye Yu</a>,
<a href="https://www.gwtaylor.ca/" style="text-decoration:none; color:#0071C5;">Graham W. Taylor</a>,
<a href="https://www.microsoft.com/en-us/research/people/meic/" style="text-decoration:none; color:#0071C5;">Mei Chen</a><br>
<strong>WACV 2025</strong> <span style="color:red;">(Oral)</span> |
<a href="https://arxiv.org/abs/2404.01282" style="text-decoration:none; color:#0071C5; font-weight:bold;">Arxiv</a>
<!-- <a href="https://github.com/akshitac8/tfvaegan" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto;margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/ovformer.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://arxiv.org/pdf/2406.15556" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">Open-Vocabulary Temporal Action Localization using Multimodal Guidance</div>
</a>
<strong>Akshita Gupta</strong>,
<a href="https://adityac8.github.io/" style="text-decoration:none; color:#0071C5;">Aditya Arora</a>,
<a href="https://sites.google.com/view/sanath-narayan" style="text-decoration:none; color:#0071C5;">Sanath Narayan</a>,
<a href="https://salman-h-khan.github.io/" style="text-decoration:none; color:#0071C5;">Salman Khan</a>,
<a href="https://sites.google.com/view/fahadkhans/home" style="text-decoration:none; color:#0071C5;">Fahad Shahbaz Khan</a>,
<a href="https://www.gwtaylor.ca/" style="text-decoration:none; color:#0071C5;">Graham W. Taylor</a><br>
<strong>BMVC 2024</strong> |
<a href="https://arxiv.org/pdf/2406.15556" style="text-decoration:none; color:#0071C5; font-weight:bold;">Paper</a>
<!-- /-->
<!-- <a href="https://github.com/akshitac8/tfvaegan" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/gam_mlzsl.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://arxiv.org/pdf/2101.11606" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">Generative Multi-Label Zero-Shot Learning</div>
</a>
<strong>Akshita Gupta<sup>*</sup></strong>,
<a href="https://sites.google.com/view/sanath-narayan" style="text-decoration:none; color:#0071C5;">Sanath Narayan<sup>*</sup></a>,
<a href="https://salman-h-khan.github.io/" style="text-decoration:none; color:#0071C5;">Salman Khan</a>,
<a href="https://sites.google.com/view/fahadkhans/home" style="text-decoration:none; color:#0071C5;">Fahad Shahbaz Khan</a>,
<a href="https://scholar.google.com/citations?user=z84rLjoAAAAJ&hl=en" style="text-decoration:none; color:#0071C5;">Ling Shao</a>,
<a href="http://www.cvc.uab.es/LAMP/joost/" style="text-decoration:none; color:#0071C5;">Joost van de Weijer</a><br>
<strong>TPAMI 2023</strong> |
<a href="https://arxiv.org/pdf/2101.11606" style="text-decoration:none; color:#0071C5; font-weight:bold;">Paper</a>
<!-- /-->
<!-- <a href="https://github.com/akshitac8/tfvaegan" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>-->
<!-- <ul>-->
<!-- <li><u>Description:</u> Proposed a generative feature synthesizing framework for multi-label zero-shot learning tasks.</li>-->
<!-- </ul>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/OWDETR_intro.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://akshitac8.github.io/OWDETR" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">OW-DETR: Open-world Detection Transformer</div>
</a>
<strong>Akshita Gupta<sup>*</sup></strong>,
<a href="https://sites.google.com/view/sanath-narayan">Sanath Narayan<sup>*</sup></a>,
<a href="https://josephkj.in">Joseph KJ</a>,
<a href="https://salman-h-khan.github.io/">Salman Khan</a>,
<a href="https://sites.google.com/view/fahadkhans/home">Fahad Shahbaz Khan</a>,
<a href="https://www.crcv.ucf.edu/person/mubarak-shah/">Mubarak Shah</a><br>
<strong>CVPR 2022</strong><br>
<a href="https://arxiv.org/pdf/2112.01513.pdf" style="text-decoration:none; color:#0071C5; font-weight:bold;">Paper</a>
/
<a href="https://github.com/akshitac8/OW-DETR" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>
<!-- <ul>-->
<!-- <li><u>Description:</u> Proposed an open-world detection transformer with multi-scale pseudo-labeling and context-driven training.</li>-->
<!-- </ul>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/image834.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://akshitac8.github.io/BiAM" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">Discriminative Region-based Multi-Label Zero-Shot Learning</div>
</a>
<a href="https://sites.google.com/view/sanath-narayan">Sanath Narayan<sup>*</sup></a>,
<strong>Akshita Gupta<sup>*</sup></strong>,
<a href="https://salman-h-khan.github.io/">Salman Khan</a>,
<a href="https://sites.google.com/view/fahadkhans/home">Fahad Shahbaz Khan</a>,
<a href="https://scholar.google.com/citations?user=z84rLjoAAAAJ&hl=en">Ling Shao</a>,
<a href="https://www.crcv.ucf.edu/person/mubarak-shah/">Mubarak Shah</a><br>
<strong>ICCV 2021</strong><br>
<a href="https://openaccess.thecvf.com/content/ICCV2021/papers/Narayan_Discriminative_Region-Based_Multi-Label_Zero-Shot_Learning_ICCV_2021_paper.pdf" style="text-decoration:none; color:#0071C5; font-weight:bold;">Paper</a>
/
<a href="https://github.com/akshitac8/BiAM" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>
<!-- <ul>-->
<!-- <li><u>Description:</u> Proposed a region-based attention module for multi-label zero-shot learning.</li>-->
<!-- </ul>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/feedback_vis.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://akshitac8.github.io/tfvaegan/" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">Latent Embedding Feedback and Discriminative Features for Zero-Shot Classification</div>
</a>
<a href="https://sites.google.com/view/sanath-narayan">Sanath Narayan<sup>*</sup></a>,
<strong>Akshita Gupta<sup>*</sup></strong>,
<a href="https://salman-h-khan.github.io/">Salman Khan</a>,
<a href="https://sites.google.com/view/fahadkhans/home">Fahad Shahbaz Khan</a>,
<a href="https://www.ceessnoek.info/">Cees G. M. Snoek</a>,
<a href="https://scholar.google.com/citations?user=z84rLjoAAAAJ&hl=en">Ling Shao</a><br>
<strong>ECCV 2020</strong><br>
<a href="https://arxiv.org/abs/2003.07833" style="text-decoration:none; color:#0071C5; font-weight:bold;">Paper</a>
/
<a href="https://github.com/akshitac8/tfvaegan" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>
<!-- <ul>-->
<!-- <li><u>Description:</u> Proposed a feedback-based generative model to synthesize more discriminative features for zero-shot learning.</li>-->
<!-- </ul>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/isaid.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://captain-whu.github.io/iSAID/" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images</div>
</a>
<a href="https://scholar.google.es/citations?user=WNGPkVQAAAAJ&hl=en">Syed Waqas Zamir</a>,
<a href="https://adityac8.github.io/">Aditya Arora</a>,
<strong>Akshita Gupta</strong>,
<a href="https://salman-h-khan.github.io/">Salman Khan</a>,
<a href="https://scholar.google.ae/citations?user=qd8Blw0AAAAJ&hl=en">Guolei Sun</a>,
<a href="https://sites.google.com/view/fahadkhans/home">Fahad Shahbaz Khan</a>,
<a href="https://scholar.google.com/citations?user=vD-ezyQAAAAJ&hl=en">Fan Zhu</a>,
<a href="https://scholar.google.com/citations?user=z84rLjoAAAAJ&hl=en">Ling Shao</a>,
<a href="http://www.captain-whu.com/xia_En.html">Gui-Song Xia</a>,
<a href="https://scholar.google.com/citations?user=UeltiQ4AAAAJ&hl=en">Xiang Bai</a><br>
<strong>CVPR Workshop 2019 </strong><span style="color:red;">(Oral)</span> <br>
<a href="https://github.com/CAPTAIN-WHU/iSAID_Devkit" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>
/
<a href="https://captain-whu.github.io/iSAID/index.html" style="text-decoration:none; color:#0071C5; font-weight:bold;">Dataset</a>
<!-- <ul>-->
<!-- <li><u>Description:</u> Introduced a large-scale benchmark for instance segmentation and object detection in aerial imagery.</li>-->
<!-- </ul>-->
</td>
</tr></tbody>
</table>
<table style="width:100%; border:0px; border-spacing:0px; border-collapse:separate; margin-right:auto; margin-left:auto; font-family:calibri,sans-serif; font-size:16px; line-height:1.7;">
<tbody><tr>
<td style="padding:20px; width:30%; text-align:center; vertical-align:middle;">
<img src="images/interspeech.png" style="max-width:100%; height:auto;">
</td>
<td style="padding:20px; width:70%; vertical-align:middle;">
<a href="https://arxiv.org/abs/1811.00936" style="text-decoration:none; color:black;">
<div style="font-size:16px; font-weight:bold; line-height:1.4;">Acoustic Features Fusion Using Attentive Multi-Channel Deep Architecture</div>
</a>
<a href="http://deeplearn-ai.com/about-3/?i=1">Gaurav Bhatt</a>,
<strong>Akshita Gupta</strong>,
<a href="https://adityac8.github.io/">Aditya Arora</a>,
<a href="http://bala.cs.faculty.iitr.ac.in/">Balasubramanian Raman</a><br>
<strong>Interspeech Workshop 2018</strong> |
<a href="https://github.com/DeepLearn-lab/Acoustic-Feature-Fusion_Chime18" style="text-decoration:none; color:#0071C5; font-weight:bold;">Code</a>
<!-- <ul>-->
<!-- <li><u>Description:</u> Proposed an attention-based deep model for fusing multi-channel acoustic features for audio scene recognition and tagging.</li>-->
<!-- </ul>-->
</td>
</tr></tbody>
</table>
<table align="center" border="0" cellpadding="20" cellspacing="0" width="100%">
<tr>
<td>
<br>
<p align="right">
<font size="2">
<strong>I borrowed this website layout from <a href="https://jonbarron.info/"
target="_blank">here</a>!</strong>
</font>
</p>
</td>
</tr>
</table>
</td>
</tr>
</tbody>
</table>
</body>
</html>