Characterization of Institutional Texts for an Automated
Golden Standard: Enhancing Machine Translation
Quality Assessment between English and Spanish

Romana García, María Luisa; Hernández Pardo, Blanca

Date

2024-07-06

Status

info:eu-repo/semantics/publishedVersion

Abstract

This paper collects a set of features that contribute to the linguistic characterization of the institutional textual genre. The aim is to describe, as exhaustively as possible, the archetypal text that this type of specialized translation should produce as its target text. The tools used were Orange Data Mining© and Google Colab (Python code). The data were processed with a word cloud and text preprocessing (cleaning, tokenization, normalization, lemmatization, and PoS annotation). From the processed data, the study extracted lexical and grammatical frequencies, word and document embeddings, cosine distances, hierarchical clustering, and a 20-component dimensionality reduction (t-SNE). The result is a set of descriptive parameters for characterizing model texts for economic translation in institutional domains into the Spanish of Spain: lexical and terminological density, phraseological and terminological lexicalizations, grammatical frequencies, and semantic maps. The study thus provides several quantifiable features that characterize the analyzed register. It also opens the way for further research to refine these parameters and to search for complementary ones until a complete picture of the reference model for this genre is obtained.
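Two of the quantities named in the abstract, lexical density and cosine distance, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the study used Orange Data Mining and embeddings, whereas this sketch substitutes plain bag-of-words counts for embeddings and a tiny hand-written function-word list for PoS annotation. The sample sentences, the `FUNCTION_WORDS` list, and all function names are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter

# Hypothetical mini function-word list (an assumption for this sketch);
# a real study would rely on PoS tags or a full stopword list.
FUNCTION_WORDS = {"the", "of", "and", "a", "in", "to", "is", "for", "on", "by", "as"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (crude normalization)."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


def lexical_density(tokens):
    """Share of tokens that are not in the function-word list."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


def cosine_distance(tokens_a, tokens_b):
    """1 - cosine similarity between bag-of-words count vectors."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


# Invented sample documents, not taken from the study's corpus.
docs = {
    "report_a": "The central bank raised the interest rate to curb inflation.",
    "report_b": "The bank kept the interest rate unchanged as inflation eased.",
    "recipe": "Chop the onions and fry them in olive oil.",
}
tokens = {name: tokenize(text) for name, text in docs.items()}

for name, toks in tokens.items():
    print(name, "lexical density:", round(lexical_density(toks), 2))

print("report_a vs report_b:", round(cosine_distance(tokens["report_a"], tokens["report_b"]), 2))
print("report_a vs recipe:", round(cosine_distance(tokens["report_a"], tokens["recipe"]), 2))
```

In this toy setting the two economic sentences end up closer to each other than either is to the recipe. A matrix of such pairwise distances is the kind of input that hierarchical clustering and t-SNE typically take.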

URI

http://hdl.handle.net/11531/99668

Activity Type

Conference presentation

Keywords

Machine Translation, Golden Standard, Translation Quality Assessment, Specialized Translation, AI Processing.

Collections

Articles