Please use this identifier to cite or link to this item:
http://hdl.handle.net/11531/110122
Full metadata record
| DC field | Value | Language |
|---|---|---|
| dc.contributor.author | Rodríguez Abella, Álvaro | es-ES |
| dc.contributor.author | Silvestre, Joao Pedro | es-ES |
| dc.contributor.author | Tabuada, Paulo | es-ES |
| dc.date.accessioned | 2026-05-18T13:34:55Z | - |
| dc.date.available | 2026-05-18T13:34:55Z | - |
| dc.date.issued | 2025-05-01 | es_ES |
| dc.identifier.issn | 2640-3498 | es_ES |
| dc.identifier.uri | http://hdl.handle.net/11531/110122 | - |
| dc.description | Journal articles | es_ES |
| dc.description.abstract | . | es-ES |
| dc.description.abstract | A key component of transformers is the attention mechanism orchestrating how each token influences the propagation of every other token along the layers of a transformer. In this paper we provide a rigorous mathematical analysis of the asymptotic properties of attention in transformers. Although we present several results based on different assumptions, all of them point to the same conclusion: all tokens asymptotically converge to each other, a phenomenon that has been empirically reported in the literature. Our findings are carefully compared with existing theoretical results and illustrated by simulations and experimental studies using the GPT-2 model. | en-GB |
| dc.format.mimetype | application/pdf | es_ES |
| dc.language.iso | en-GB | es_ES |
| dc.rights | Creative Commons Attribution-NonCommercial-NoDerivs Spain | es_ES |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/3.0/es/ | es_ES |
| dc.source | Journal: Proceedings of Machine Learning Research, Period: 1, Volume: , Issue: 267, First page: 174, Last page: 184 | es_ES |
| dc.title | Consensus is all you get: the role of attention in transformers | es_ES |
| dc.type | info:eu-repo/semantics/article | es_ES |
| dc.description.version | info:eu-repo/semantics/publishedVersion | es_ES |
| dc.rights.holder | | es_ES |
| dc.rights.accessRights | info:eu-repo/semantics/openAccess | es_ES |
| dc.keywords | . | es-ES |
| dc.keywords | transformers; attention mechanism; token convergence; asymptotic analysis. | en-GB |
| Appears in collections: | Articles | |
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| abella25a.pdf | 963.38 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.