Articles
- González-Docasal, Ander; Alonso, Jon; Olaizola, Jon; Mendicute, Mikel; Franco, María Patricia; Pozo, Arantza Del; Aguinaga, Daniel; Álvarez, Aitor; Lleida, Eduardo. Design and Evaluation of a Voice-Controlled Elevator System to Improve the Safety and Accessibility. IEEE OPEN JOURNAL OF THE INDUSTRIAL ELECTRONICS SOCIETY. 2024. DOI: 10.1109/OJIES.2024.3483552
- Vidal, Jazmin; Ribas, Dayana; Bonomi, Cyntia; Lleida, Eduardo; Ferrer, Luciana; Ortega, Alfonso. Automatic voice disorder detection from a practical perspective. JOURNAL OF VOICE. 2024. DOI: 10.1016/j.jvoice.2024.03.001
- Mingote, Victoria; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Class token and knowledge distillation for multi-head self-attention speaker verification systems. DIGITAL SIGNAL PROCESSING. 2023. DOI: 10.1016/j.dsp.2022.103859
- Ribas, Dayana; Pastor, Miguel A.; Miguel, Antonio; Martinez, David; Ortega, Alfonso; Lleida, Eduardo. Automatic voice disorder detection using self-supervised representations. IEEE ACCESS. 2023. DOI: 10.1109/ACCESS.2023.3243986
- Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted and whispered speech compensation for speaker verification systems. DIGITAL SIGNAL PROCESSING. 2022. DOI: 10.1016/j.dsp.2022.103536
- Mingote, Victoria; Viñals, Ignacio; Gimeno, Pablo; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Multimodal Diarization Systems by Training Enrollment Models as Identity Representations. APPLIED SCIENCES (SWITZERLAND). 2022. DOI: 10.3390/app12031141
- Mingote, V.; Miguel, A.; Ribas, D.; Ortega, A.; Lleida, E. aDCF loss function for deep metric learning in end-to-end text-dependent speaker verification systems. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. 2022. DOI: 10.1109/TASLP.2022.3145307
- Gimeno, P.; Ribas, D.; Ortega, A.; Miguel, A.; Lleida, E. Unsupervised adaptation of deep speech activity detection models to unseen domains. APPLIED SCIENCES (SWITZERLAND). 2022. DOI: 10.3390/app12041832
- Gimeno, P; Mingote, V; Ortega, A; Miguel, A; Lleida, E. Generalizing AUC Optimization to Multiclass Classification for Audio Segmentation With Limited Training Data. IEEE SIGNAL PROCESSING LETTERS. 2021. DOI: 10.1109/LSP.2021.3084501
- Llombart, J.; Ribas, D.; Miguel, A.; Vicente, L.; Ortega, A.; Lleida, E. Progressive loss functions for speech enhancement with deep neural networks. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2021. DOI: 10.1186/s13636-020-00191-3
- Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Training speaker enrollment models by network optimization. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-2325
- Mingote, V.; Miguel, A.; Ortega, A.; Lleida, E. Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification. COMPUTER SPEECH AND LANGUAGE. 2020. DOI: 10.1016/j.csl.2020.101078
- Gimeno, Pablo; Viñals, Ignacio; Ortega, Alfonso; Miguel, Antonio; Lleida, Eduardo. Multiclass audio segmentation based on recurrent neural networks for broadcast domain data. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2020. DOI: 10.1186/s13636-020-00172-6
- Gimeno, P.; Mingote, V.; Ortega, A.; Miguel, A.; Lleida, E. Partial AUC optimisation using recurrent neural networks for music detection with limited training data. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-1108
- Prieto, S.; Ortega, A.; López-Espejo, I.; Lleida, E. Shouted speech compensation for speaker verification robust to vocal effort conditions. INTERSPEECH (USB). 2020. DOI: 10.21437/Interspeech.2020-1402
- Viñals, Ignacio; Ortega, Alfonso; Villalba, Jesús; Miguel, Antonio; Lleida, Eduardo. Unsupervised adaptation of PLDA models for broadcast diarization. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2019. DOI: 10.1186/s13636-019-0167-7
- Mingote, Victoria; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. Supervector extraction for encoding speaker and phrase information with neural networks for text-dependent speaker verification. APPLIED SCIENCES (SWITZERLAND). 2019. DOI: 10.3390/app9163295
- Viñals, Ignacio; Ortega, Alfonso; Miguel, Antonio; Lleida, Eduardo. An analysis of the short utterance problem for speaker characterization. APPLIED SCIENCES (SWITZERLAND). 2019. DOI: 10.3390/app9183697
- Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio; Bazán-Gil, Virginia; Perez, Carmen; Gómez, Manuel; de Prada, Alberto. Albayzin 2018 Evaluation: The IberSpeech-RTVE Challenge on Speech Technologies for Spanish Broadcast Media. APPLIED SCIENCES (SWITZERLAND). 2019. DOI: 10.3390/app9245412
- Mingote, V.; Castan, D.; Mclaren, M.; Nandwana, M.K.; Ortega, A.; Lleida, E.; Miguel, A. Language recognition using triplet neural networks. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2437
- Viñals, I.; Ribas, D.; Mingote, V.; Llombart, J.; Gimeno, P.; Miguel, A.; Ortega, A.; Lleida, E. Phonetically-aware embeddings, wide residual networks with time-delay neural networks and self attention models for the 2018 NIST speaker recognition evaluation. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2417
- Mingote, V.; Miguel, A.; Ribas, D.; Ortega, A.; Lleida, E. Optimization of false acceptance/rejection rates and decision threshold for end-to-end text-dependent speaker verification systems. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2550
- Viñals, I.; Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. Vivolab speaker diarization system for the Dihard 2019 challenge. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-2462
- Llombart, J.; Ribas, D.; Miguel, A.; Vicente, L.; Ortega, A.; Lleida, E. Progressive speech enhancement with residual connections. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-1748
- Llombart, J.; Ribas, D.; Miguel, A.; Vicente, L.; Ortega, A.; Lleida, E. Speech enhancement with wide residual networks in reverberant environments. INTERSPEECH (USB). 2019. DOI: 10.21437/Interspeech.2019-1745
- Cabello, L.; Lleida, E.; Simon, J.; Miguel, A.; Ortega, A. Text-to-Pictogram Summarization for Augmentative and Alternative Communication. PROCESAMIENTO DEL LENGUAJE NATURAL. 2018. DOI: 10.26342/2018-61-1
- Lleida, E.; Rodriguez-Fuentes, L.J. Speaker and language recognition and characterization: Introduction to the CSL special issue. COMPUTER SPEECH AND LANGUAGE. 2018. DOI: 10.1016/j.csl.2017.12.001
- Viñals, I.; Gimeno, P.; Ortega, A.; Miguel, A.; Lleida, E. Estimation of the number of speakers with variational Bayesian PLDA in the dihard diarization challenge. INTERSPEECH (USB). 2018. DOI: 10.21437/Interspeech.2018-1841
- Villalba, J.; Ortega, A.; Miguel, A.; Lleida, E. Analysis of speech quality measures for the task of estimating the reliability of speaker verification decisions. SPEECH COMMUNICATION. 2016. DOI: 10.1016/j.specom.2016.01.005
- Villalba López, Jesús; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo. Bayesian Networks to Model the Variability of Speaker Verification Scores in Adverse Environments. IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. 2016. DOI: 10.1109/TASLP.2016.2607343
- Martínez, D.; Lleida, E.; Green, P.; Christensen, H.; Ortega, A.; Miguel, A. Intelligibility assessment and speech recognizer word accuracy rate prediction for dysarthric speakers in a factor analysis subspace. ACM TRANSACTIONS ON ACCESSIBLE COMPUTING. 2015. DOI: 10.1145/2746405
- García, P.; Lleida, E.; Castan, D.; Marcos, J. M.; Romero, D. Context-aware communicator for all. LECTURE NOTES IN COMPUTER SCIENCE. 2015. DOI: 10.1007/978-3-319-20678-3_41
- Castán, D.; Tavarez, D.; Lopez-Otero, P.; Franco-Pedroso, J.; Delgado, H.; Navas, E.; Docio-Fernández, L.; Ramos, D.; Serrano, J.; Ortega, A.; Lleida, E. Albayzín-2014 evaluation: audio segmentation and classification in broadcast news domains. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2015. DOI: 10.1186/s13636-015-0076-3
- Garcia, José Enrique; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo. Low bit rate compression methods of feature vectors for distributed speech recognition. SPEECH COMMUNICATION. 2014. DOI: 10.1016/j.specom.2013.11.007
- Castán, D.; Ortega, A.; Miguel, A.; Lleida, E. Audio segmentation-by-classification approach based on factor analysis in broadcast news domain. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING. 2014. DOI: 10.1186/s13636-014-0034-5
- Martínez González, David; Burget, Lukas; Stafylakis, Themos; Lei, Yun; Kenny, Patrick; Lleida, Eduardo. Unscented Transform for iVector-Based Noisy Speaker Recognition. PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING. 2014. DOI: 10.1109/ICASSP.2014.6854361
- Justo, R.; Saz, O.; Miguel, A.; Torres, M. I.; Lleida, E. Improving language models in speech-based human-machine interaction. INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS. 2013. DOI: 10.5772/55407
- Martínez González, David; Ribas, Dayana; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio. Suprasegmental information modelling for autism disorder spectrum and specific language impairment classification. PROCEEDINGS OF THE ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, INTERSPEECH. 2013
- Vaquero, C.; Ortega, A.; Miguel, A.; Lleida, E. Quality assessment for speaker diarization and its application in speaker characterization. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING. 2013. DOI: 10.1109/TASL.2012.2236317
Chapters
- Viñals Bailo, Ignacio; Villalba Lopez, Jesús; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo. Bottleneck Based Front-End for Diarization Systems. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES: IBERSPEECH 2016. 2016
- Martínez González, David; Villalba, Jesús; Lleida, Eduardo; Ortega, Alfonso. Unsupervised Accent Modeling for Language Identification. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2014
- Ortega Giménez, Alfonso; Castán, Diego; Miguel, Antonio; Lleida, Eduardo. A preliminary study of Acoustic Events Classification with Factor Analysis in Meeting Rooms. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2014
- Ortega Giménez, Alfonso; Olcoz, Julia; Miguel, Antonio; Lleida, Eduardo. Confidence Measures in Automatic Speech Recognition for Error Detection in Restricted Domains. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. IBERSPEECH 2014. 2014
- Castán Lavilla, Diego; Ortega Giménez, Alfonso; Lleida Solano, Eduardo. Factor Analysis Segmentation and Classification in Broadcast News Domain. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Martínez González, David; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio. Score Level versus Audio Level Fusion for Voice Pathology Detection on the Saarbrücken Voice Database. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Martínez González, David; Lleida, Eduardo; Ortega, Alfonso; Miguel, Antonio; Villalba, Jesús. Voice Pathology Detection on the Saarbrücken Voice Database with Calibration and Fusion of Scores Using MultiFocal Toolkit. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Ribas Gonzalez, Dayana; García Laínez, Enrique; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio; Lleida Solano, Eduardo; Calvo de Lara, José Ramón. Evaluation of a New Beam-Search Formant Tracking Algorithm in Noisy Environments. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Villalba Lopez, Jesús; Lleida Solano, Eduardo; Ortega Giménez, Alfonso; Miguel Artiaga, Antonio. Reliability Estimation of the Speaker Verification Decisions Using Bayesian Networks to Combine Information from Multiple Speech Quality Measures. ADVANCES IN SPEECH AND LANGUAGE TECHNOLOGIES FOR IBERIAN LANGUAGES. 2012
- Martínez González, David; Villalba, Jesús; Miguel, Antonio; Ortega, Alfonso; Lleida, Eduardo. ViVoLab UZ Language Recognition System for Albayzin 2010 LRE. PROCEEDINGS OF VI JORNADAS DE TECNOLOGÍA DEL HABLA AND II IBERIAN SLTECH WORKSHOP. 2010
- Buera Rodriguez, Luis; Miguel Artiaga, Antonio; Saz Torralba, Oscar; Lleida Solano, Eduardo; Ortega Giménez, Alfonso. Cross-Probability Model Based on Gmm for Feature Vector Normalization. IN-VEHICLE CORPUS AND SIGNAL PROCESSING FOR DRIVER BEHAVIOR.
Projects
- TIN2014-54288-C4-2-R: PROCESADO DE AUDIO, HABLA Y LENGUAJE PARA ANÁLISIS DE INFORMACIÓN MULTIMEDIA-UZ. 01/01/15 - 30/09/18
- IRIS / Towards Natural Interaction and Communication (G.A.no. 610986). 01/01/14 - 31/12/17
Contracts
- SISTEMA DE RECONOCIMIENTO DE VOZ DE LOS SUBTÍTULOS EMITIDOS EN LOS PROGRAMAS DEL TIEMPO PARA LA UNIDAD DE TELETEXTO DE LA CORPORACIÓN RTVE EN TORRESPAÑA, MADRID. 27/02/13 - 29/04/14
- SISTEMA DE SUPERVISIÓN DE SUBTÍTULOS EMITIDOS MEDIANTE RECONOCIMIENTO DE VOZ PARA LA UNIDAD DE TELETEXTO DE LA SME TVE EN TORRESPAÑA. 15/08/12 - 16/08/13
- SISTEMA AUTOMÁTICO DE SUBTITULADO DIFERIDO ASISTIDO DE GUIONES POR RECONOCIMIENTO DE VOZ. 01/01/12 - 31/12/12
Thesis supervision
- Subspace Gaussian Mixture Models for Language Identification and Dysarthric Speech Intelligibility Assessment. Universidad de Zaragoza. Sobresaliente cum laude. 22/09/15
- Advances on Speaker Recognition in non Collaborative Environments. Universidad de Zaragoza. Sobresaliente cum laude. 27/11/14
- Discriminative methods for model optimization in speaker verification. Universidad de Zaragoza. Sobresaliente cum laude. 29/05/14
- Aplicación de las tecnologías del habla en la educación de la voz infantil alterada. Universidad de Zaragoza. Sobresaliente cum laude. 17/12/10
- Personalización y adaptación on-line y variaciones de la voz en sistemas de reconocimiento del habla. Universidad de Zaragoza. Sobresaliente cum laude. 09/12/09
- Acoustic Modeling Advances for Speech Recognition. Universidad de Zaragoza. Sobresaliente cum laude. 12/12/08
- Normalización y adaptación a entornos acústicos para la robustez en sistemas de reconocimiento automático del habla. Universidad de Zaragoza. Sobresaliente cum laude. 03/12/07
- Improvements in speech recognition for embedded devices by taking advantage of lip reading techniques. Universidad de Zaragoza. Sobresaliente cum laude. 26/09/06
- Sistema de refuerzo de luz para el interior de un vehículo a motor. Universidad de Zaragoza. Sobresaliente cum laude. 20/12/05
- Segregación de fuentes sonoras para reconocimiento robusto del habla. Universidad de Zaragoza. Sobresaliente cum laude. 23/11/00
Conference participation
- 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. Participant - Oral presentation. Tied Hidden Factors in Neural Networks for End-to-End Speaker Recognition. Stockholm. 29/08/17
- 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. Participant - Oral presentation. Domain Adaptation of PLDA models in Broadcast Diarization by means of Unsupervised Speaker Clustering. Stockholm. 29/08/17
- IEEE Automatic Speech Recognition and Understanding (ASRU 2015). Participant - Oral presentation. Variational Bayesian PLDA for Speaker Diarization in the MGB Challenge. Arizona. 12/12/15
- 16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015. Participant - Oral presentation. Spoofing Detection with DNN and One-class SVM for the ASVspoof 2015 Challenge. Dresden. 09/09/15
- 15th Annual Conference of the International Speech Communication Association, INTERSPEECH 2014. Participant - Oral presentation. Factor Analysis with Sampling Methods for Text Dependent Speaker Recognition. Singapore. 02/09/14
- 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. Participant - Oral presentation. A New Bayesian Network to Assess the Reliability of Speaker Verification Decisions. Lyon. 28/08/13
- 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. Participant - Oral presentation. The I3A Speaker Recognition System for NIST SRE12: Post-evaluation Analysis. Lyon. 28/08/13
- 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. Participant - Oral presentation. Suprasegmental Information Modelling for Autism Disorder Spectrum and Specific Language Impairment Classification. Lyon. 28/08/13
- SLAM 2013 Speech, Language and Audio in Multimedia. Participant - Oral presentation. Broadcast News Segmentation with Factor Analysis System. Marseille. 25/08/13
- 24th EAEEIE Annual Conference (EAEEIE), 2013. Participant - Oral presentation. Collaborative learning in international teams on Technologies to Reduce the Access Barrier in Human Computer Interaction (TrabHCI) Erasmus Intensive Programme. Chania. 25/05/13
- IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013). Participant - Oral presentation. Prosodic features and formant modeling for an ivector-based language recognition system. Vancouver. 12/05/13
- IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013). Participant - Poster. Segmentation-by-classification system based on factor analysis. Vancouver. 12/05/13
UNIZAR teaching over the last six academic years
- Speech processing. Máster Universitario Erasmus Mundus en Ciencia de Datos Lingüísticos / Erasmus Mundus Master in Ling. During academic year 2025-26
- Tecnologías del habla y del lenguaje. Máster Universitario en Ingeniería de Telecomunicación. From academic year 2024-25 to 2025-26
- Ingeniería acústica. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. From academic year 2021-22 to 2025-26
- Procesado de audio e imagen. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. From academic year 2020-21 to 2025-26
- Tecnologías del habla. Máster Universitario en Ingeniería de Telecomunicación. From academic year 2021-22 to 2023-24
- Circuitos y sistemas. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. From academic year 2020-21 to 2021-22
- Comunicaciones audiovisuales. Graduado en Ingeniería de Tecnologías y Servicios de Telecomunicación. During academic year 2020-21
- Tecnologías del habla. Máster Universitario en Ingeniería de Telecomunicación. During academic year 2020-21