Memory Controlled Sequential Self Attention for Sound Recognition

Paper Title

Memory Controlled Sequential Self Attention for Sound Recognition

Authors

Arjun Pankajakshan, Helen L. Bear, Vinod Subramanian, Emmanouil Benetos

Abstract

In this paper, we investigate the importance of the extent of memory in sequential self-attention for sound recognition. We propose using a memory-controlled sequential self-attention mechanism on top of a convolutional recurrent neural network (CRNN) model for polyphonic sound event detection (SED). Experiments on the URBAN-SED dataset demonstrate how the extent of memory affects the sound recognition performance of the SED model with self-attention. We extend the proposed idea with a multi-head self-attention mechanism in which each attention head processes the audio embedding with an explicit attention width value. The proposed use of memory-controlled sequential self-attention offers a way to induce relations among frames of sound event tokens. We show that our memory-controlled self-attention model achieves an event-based F-score of 33.92% on the URBAN-SED dataset, outperforming the F-score of 20.10% reported for the model without self-attention.
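The idea of limiting each frame's attention to a fixed width can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the symmetric window, the absence of learned query/key/value projections, and the function names windowed_self_attention and multi_width_attention are all assumptions made for illustration.

```python
import numpy as np


def windowed_self_attention(x, width):
    """Self-attention over frames, restricted to a local window.

    x     : (T, D) array of frame embeddings (e.g. CRNN outputs).
    width : number of frames on each side that a frame may attend to
            (assumed symmetric; the paper's definition may differ).
    Returns the attended embeddings (T, D) and the weights (T, T).
    """
    T, D = x.shape
    # Scaled dot-product similarity between every pair of frames.
    scores = x @ x.T / np.sqrt(D)
    # Keep only pairs of frames that are at most `width` steps apart.
    idx = np.arange(T)
    mask = np.abs(idx[:, None] - idx[None, :]) <= width
    scores = np.where(mask, scores, -np.inf)
    # Row-wise softmax; the diagonal is always unmasked, so each row
    # has at least one finite entry.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ x, weights


def multi_width_attention(x, widths):
    """One head per attention width, outputs concatenated along features."""
    heads = [windowed_self_attention(x, w)[0] for w in widths]
    return np.concatenate(heads, axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(10, 4))  # 10 frames, 4-dim embeddings
    attended, weights = windowed_self_attention(frames, width=2)
    print(attended.shape, weights.shape)
    print(multi_width_attention(frames, widths=[1, 3, 5]).shape)
```

A small width makes each frame depend only on its immediate neighbours, while a width of at least T - 1 recovers unrestricted self-attention over the whole sequence. In the multi-head variant sketched here, each head simply uses a different width.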
