The MERIT Dataset: Modelling and efficiently rendering interpretable transcripts

de Rodrigo Tobías, Ignacio; Sánchez Cuadrado, Alberto; Boal Martín-Larrauri, Jaime; López López, Álvaro Jesús

dc.contributor.author	de Rodrigo Tobías, Ignacio	es-ES
dc.contributor.author	Sánchez Cuadrado, Alberto	es-ES
dc.contributor.author	Boal Martín-Larrauri, Jaime	es-ES
dc.contributor.author	López López, Álvaro Jesús	es-ES
dc.date.accessioned	2025-10-16T12:26:26Z
dc.date.available	2025-10-16T12:26:26Z
dc.date.issued	2026-04-01	es_ES
dc.identifier.issn	0031-3203	es_ES
dc.identifier.uri	https://doi.org/10.1016/j.patcog.2025.112502	es_ES
dc.description	Journal articles	es_ES
dc.description.abstract	This paper introduces the MERIT Dataset, a multimodal, fully labeled dataset of school grade reports. Comprising over 400 labels and 33k samples, the MERIT Dataset is a resource for training models in demanding Visually-rich Document Understanding tasks. It contains multimodal features that link patterns in the textual, visual, and layout domains. The MERIT Dataset also includes biases in a controlled way, making it a valuable tool to benchmark biases induced in Language Models. The paper outlines the dataset’s generation pipeline and highlights its main features and patterns in its different domains. We benchmark the dataset for token classification, showing that it poses a significant challenge even for SOTA models.	en-GB
dc.language.iso	en-GB	es_ES
dc.source	Journal: Pattern Recognition, Period: 1, Volume: online, Issue: Part B, First page: 112502-1, Last page: 112502-14	es_ES
dc.subject.other	Instituto de Investigación Tecnológica (IIT)	es_ES
dc.title	The MERIT Dataset: Modelling and efficiently rendering interpretable transcripts	es_ES
dc.type	info:eu-repo/semantics/article	es_ES
dc.description.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.rights.holder		es_ES
dc.rights.accessRights	info:eu-repo/semantics/openAccess	es_ES
dc.keywords	Synthetic Dataset; Multimodal Dataset; Visually-rich Document Understanding; Vision-Language Models	en-GB

Files in this item

Name: IIT-25-307R_preprint.pdf
Size: 4.961Mb
Format: PDF

Name: IIT-25-307R_preview.pdf
Size: 2.765Kb
Format: PDF

This item appears in the following collection(s)

Articles
Published journal articles, book chapters, and conference contributions.