Compressing Large-Scale Transformer-Based Models: A Case Study on BERT

Published in: Transactions of the Association for Computational Linguistics