Please use this identifier to cite or link to this item: http://hdl.handle.net/11531/106393
Full metadata record
DC Field | Value | Language
dc.contributor.author | de Rodrigo Tobías, Ignacio | es-ES
dc.contributor.author | Boal Martín-Larrauri, Jaime | es-ES
dc.contributor.author | López López, Álvaro Jesús | es-ES
dc.date.accessioned | 2025-10-16T12:26:26Z | -
dc.date.available | 2025-10-16T12:26:26Z | -
dc.date.issued | 2026-04-01 | es_ES
dc.identifier.issn | 0031-3203 | es_ES
dc.identifier.uri | https://doi.org/10.1016/j.patcog.2025.112502 | es_ES
dc.description | Journal articles | es_ES
dc.description.abstract | This paper introduces the MERIT Dataset, a multimodal, fully labeled dataset of school grade reports. Comprising over 400 labels and 33k samples, the MERIT Dataset is a resource for training models in demanding Visually-rich Document Understanding tasks. It contains multimodal features that link patterns in the textual, visual, and layout domains. The MERIT Dataset also includes biases in a controlled way, making it a valuable tool to benchmark biases induced in Language Models. The paper outlines the dataset’s generation pipeline and highlights its main features and patterns in its different domains. We benchmark the dataset for token classification, showing that it poses a significant challenge even for SOTA models. | es-ES, en-GB
dc.language.iso | en-GB | es_ES
dc.source | Journal: Pattern Recognition, Period: 1, Volume: online, Issue: Part B, First page: 112502-1, Last page: 112502-14 | es_ES
dc.subject.other | Instituto de Investigación Tecnológica (IIT) | es_ES
dc.title | The MERIT Dataset: Modelling and efficiently rendering interpretable transcripts | es_ES
dc.type | info:eu-repo/semantics/article | es_ES
dc.description.version | info:eu-repo/semantics/publishedVersion | es_ES
dc.rights.holder | | es_ES
dc.rights.accessRights | info:eu-repo/semantics/openAccess | es_ES
dc.keywords | Synthetic Dataset; Multimodal Dataset; Visually-rich Document Understanding; Vision-Language Models | es-ES, en-GB
Appears in collections: Articles

Files in this item:
File | Description | Size | Format
IIT-25-307R_preprint | | 5.08 MB | Unknown
IIT-25-307R_preview | | 2.77 kB | Unknown


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.