TY - CONF
T1 - Spectral Probing
AU - Müller-Eberstein, Max
AU - van der Goot, Rob
AU - Plank, Barbara
PY - 2022
Y1 - 2022/12//
N2 - Linguistic information is encoded at varying timescales (subwords, phrases, etc.) and communicative levels, such as syntax and semantics. Contextualized embeddings have analogously been found to capture these phenomena at distinctive layers and frequencies. Leveraging these findings, we develop a fully learnable frequency filter to identify spectral profiles for any given task. It enables vastly more granular analyses than prior handcrafted filters, and improves on efficiency. After demonstrating the informativeness of spectral probing over manual filters in a monolingual setting, we investigate its multilingual characteristics across seven diverse NLP tasks in six languages. Our analyses identify distinctive spectral profiles which quantify cross-task similarity in a linguistically intuitive manner, while remaining consistent across languages—highlighting their potential as robust, lightweight task descriptors.
KW - Linguistic Information
KW - Contextualized Embeddings
KW - Frequency Filter
KW - Spectral Probing
KW - Multilingual NLP
KW - Cross-task Similarity
KW - Syntax and Semantics
KW - Granular Analyses
KW - Monolingual Setting
KW - Robust Task Descriptors
M3 - Article in proceedings
SP - 7730
EP - 7741
BT - Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
PB - Association for Computational Linguistics
CY - Abu Dhabi, United Arab Emirates
ER -