Speech Enhancement with Fullband-Subband Cross-Attention Network

Paper Title

Speech Enhancement with Fullband-Subband Cross-Attention Network

Authors

Chen, Jun; Rao, Wei; Wang, Zilin; Wu, Zhiyong; Wang, Yannan; Yu, Tao; Shang, Shidong; Meng, Helen

Abstract

FullSubNet has shown promising performance on speech enhancement by utilizing both fullband and subband information. However, FullSubNet relates fullband and subband information simply by concatenating the output of the fullband model with the subband units. This only supplements the subband units with a small quantity of global information and does not consider the interaction between fullband and subband. This paper proposes a fullband-subband cross-attention (FSCA) module to interactively fuse the global and local information and applies it to FullSubNet. The new framework is called FS-CANet. Moreover, unlike FullSubNet, the proposed FS-CANet optimizes the fullband extractor with temporal convolutional network (TCN) blocks to further reduce the model size. Experimental results on the DNS Challenge - Interspeech 2021 dataset show that the proposed FS-CANet outperforms other state-of-the-art speech enhancement approaches and demonstrate the effectiveness of fullband-subband cross-attention.
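To make the contrast with plain concatenation concrete, the following is a minimal sketch of generic scaled dot-product cross-attention applied in both directions between two feature sets. It is an illustration only and not the authors' FSCA module: the function names, the toy vectors, and the feature size are all assumptions, and learned projections, multiple heads, and normalization are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dimensional vectors from one stream.
    keys, values: equal-length lists of vectors from the other stream.
    (Hypothetical helper; no learned projections are applied.)
    """
    dim = len(queries[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors, one output per query.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        fused.append(out)
    return fused


# Toy data (made up for illustration): 2 "subband" and 3 "fullband" vectors.
subband = [[1.0, 0.0], [0.0, 1.0]]
fullband = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Subband features query the fullband features ...
sub_from_full = cross_attention(subband, fullband, fullband)
# ... and fullband features query the subband features.
full_from_sub = cross_attention(fullband, subband, subband)
```

Unlike concatenation, each output vector here is a data-dependent mixture of the other stream's features, which is the general sense in which cross-attention lets two streams interact.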
