Por favor, use este identificador para citar o enlazar este ítem: http://hdl.handle.net/11531/15474
Título : Evaluation of semantic similarity metrics applied to the automatic retrieval of medical documents: An UMLS approach
Autor : Alonso Martínez, Israel
Contreras Bárcena, David
Fecha de publicación : 1-feb-2016
Resumen : One promise of current information retrieval systems is the capability to identify risk groups for certain diseases and pathologies based on the automatic analysis of vast amounts of Electronic Medical Records repositories. However, the complexity and the degree of specialization of the language used by the experts in this context, make this task both challenging and complex. In this work, we introduce a novel experimental study to evaluate the performance of the two semantic similarity metrics (Path and Intrinsic IC-Path, both widely accepted in the literature) in a real-life information retrieval situation. In order to achieve this goal and due to the lack of methodologies for this context in the literature, we propose a straightforward information retrieval system for the biomedical field based on the UMLS Metathesaurus and on semantic similarity metrics. In contrast with previous studies which focus on testbeds with limited and controlled sets of concepts, we use a large amount of information (101,712 medical documents extracted from TREC Medical Records Track 2011). Our results show that in real-life cases, both metrics display similar performance, Path (F-Measure = 0.430) e Intrinsic IC-Path (F-Measure = 0.427). Thereby we suggest that the use of Intrinsic IC-Path is not justified in real scenarios.
One promise of current information retrieval systems is the capability to identify risk groups for certain diseases and pathologies based on the automatic analysis of vast amounts of Electronic Medical Records repositories. However, the complexity and the degree of specialization of the language used by the experts in this context, make this task both challenging and complex. In this work, we introduce a novel experimental study to evaluate the performance of the two semantic similarity metrics (Path and Intrinsic IC-Path, both widely accepted in the literature) in a real-life information retrieval situation. In order to achieve this goal and due to the lack of methodologies for this context in the literature, we propose a straightforward information retrieval system for the biomedical field based on the UMLS Metathesaurus and on semantic similarity metrics. In contrast with previous studies which focus on testbeds with limited and controlled sets of concepts, we use a large amount of information (101,712 medical documents extracted from TREC Medical Records Track 2011). Our results show that in real-life cases, both metrics display similar performance, Path (F-Measure = 0.430) e Intrinsic IC-Path (F-Measure = 0.427). Thereby we suggest that the use of Intrinsic IC-Path is not justified in real scenarios.
Descripción : Artículos en revistas
URI : https://doi.org/10.1016/j.eswa.2015.09.028
ISSN : 0957-4174
Aparece en las colecciones: Artículos

Ficheros en este ítem:
Fichero Descripción Tamaño Formato  
IIT-15-147A.pdf1,51 MBAdobe PDFVisualizar/Abrir     Request a copy
IIT-15-147A_preview2,92 kBUnknownVisualizar/Abrir
IIT-15-147A_preview.pdf2,92 kBAdobe PDFVisualizar/Abrir


Los ítems de DSpace están protegidos por copyright, con todos los derechos reservados, a menos que se indique lo contrario.