Publications
Below are some key publications that provide the theoretical and experimental foundation for the LightVED project.
Conferences and Workshops
2025
- Luis-Jesus Marhuenda, Miquel Obrador-Reina, Mohamed Aas-Alas, Alberto Albiol, Roberto Paredes
Unveiling Differences: A Vision Encoder-Decoder Model for Difference Medical Visual Question Answering
Medical Imaging with Deep Learning (MIDL), 2025.
🔗Website
2024
- Miguel Zaragozá-Portolés, David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
Extending LIP-RTVE: Towards A Large-Scale Audio-Visual Dataset for Continuous Spanish in the Wild
IberSPEECH, 2024.
🔗DOI
- David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
Towards Parameter-Efficient Non-Autoregressive Spanish Audio-Visual Speech Recognition
IberSPEECH, 2024.
🔗DOI
- David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
The PRHLT Speech Recognition System for the Albayzín 2024 Bilingual Basque-Spanish Speech to Text Challenge
IberSPEECH, 2024.
🔗DOI
Journals
2025
- David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
Tailored design of Audio–Visual Speech Recognition models using Branchformers
Computer Speech & Language, 2025.
🔗DOI