Abstract
Multiple studies have shown that Transformers are remarkably robust to pruning. Contrary to this received wisdom, we demonstrate that pre-trained Transformer encoders are surprisingly fragile to the removal of a very small number of features in the layer outputs (<0.0001% of model weights).
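As an illustrative sketch of the kind of intervention the abstract describes, the snippet below zeroes a single feature dimension in every encoder layer's output of a pre-trained BERT via a forward hook in the Hugging Face `transformers` library, and compares the masked-token prediction before and after. The checkpoint name, probe sentence, and the chosen feature index are assumptions for illustration; this is not the authors' exact experimental procedure.

```python
# Minimal sketch, assuming a Hugging Face BERT checkpoint; the probe sentence
# and the feature index to disable are illustrative choices, not the paper's.
import torch
from transformers import BertForMaskedLM, BertTokenizer

model_name = "bert-base-uncased"  # assumed checkpoint
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForMaskedLM.from_pretrained(model_name).eval()

inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]

def predict_masked_token(m):
    """Return the model's top prediction for the [MASK] position."""
    with torch.no_grad():
        logits = m(**inputs).logits
    return tokenizer.decode(logits[0, mask_pos].argmax(dim=-1))

print("before:", predict_masked_token(model))

dim = 308  # hypothetical feature index to disable

def zero_feature_hook(module, hook_inputs, output):
    # Each BertLayer returns a tuple whose first element is the hidden states;
    # zero out one feature dimension of the layer output.
    hidden = output[0].clone()
    hidden[..., dim] = 0.0
    return (hidden,) + output[1:]

handles = [layer.register_forward_hook(zero_feature_hook)
           for layer in model.bert.encoder.layer]
print("after: ", predict_masked_token(model))
for h in handles:
    h.remove()  # restore the unmodified model
```

Comparing the two predictions (and, more systematically, the MLM loss or downstream task scores) is one way to probe how sensitive the model is to removing a handful of output features.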
| Field | Value |
|---|---|
| Original language | English |
| Title | Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 |
| Number of pages | 14 |
| Place of publication | Online |
| Publisher | Association for Computational Linguistics |
| Publication date | 1 Aug 2021 |
| Pages | 3392-3405 |
| Status | Published - 1 Aug 2021 |
Keywords
- Transformers robustness
- Pre-trained models
- Layer outputs
- Feature pruning
- Model fragility