Publications

Below are some key publications that provide the theoretical and experimental foundation for the LightVED project.

Conferences and Workshops

2025

  • Luis-Jesus Marhuenda, Miquel Obrador-Reina, Mohamed Aas-Alas, Alberto Albiol, Roberto Paredes
    Unveiling Differences: A Vision Encoder-Decoder Model for Difference Medical Visual Question Answering
    Medical Imaging with Deep Learning, 2025.
    🔗 Website

2024

  • Miguel Zaragozá-Portolés, David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
    Extending LIP-RTVE: Towards A Large-Scale Audio-Visual Dataset for Continuous Spanish in the Wild
    IberSPEECH, 2024.
    🔗 DOI

  • David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
    Towards Parameter-Efficient Non-Autoregressive Spanish Audio-Visual Speech Recognition
    IberSPEECH, 2024.
    🔗 DOI

  • David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
    The PRHLT Speech Recognition System for the Albayzín 2024 Bilingual Basque-Spanish Speech to Text Challenge
    IberSPEECH, 2024.
    🔗 DOI

Journals

2025

  • David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos
    Tailored design of Audio–Visual Speech Recognition models using Branchformers
    Computer Speech & Language, 2025.
    🔗 DOI