Abstract: Transformer-based pre-trained models have made significant advances in recent years, and the Transformer architecture has become one of the most important backbones in natural language processing.