Contextual Attention Mechanism with Convolutional Operations
In Figure 4(a), X is the input feature map, and the queries, keys, and values are defined as Q = X, K = X, and V = XWv. First, a k×k convolution is applied to the keys to extract contextual information, producing K'. K' captures the context among neighboring keys. K' is then concatenated with Q, and the concatenated result is passed through two consecutive 1×1 convolutions to obtain the attention matrix.
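The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the original implementation: the helper names (conv2d_same, contextual_attention), the random weights, the channel counts, and the ReLU between the two 1×1 convolutions are all assumptions made for the example and are not specified in the text.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive zero-padded 2D convolution that preserves spatial size.

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) with odd k.
    (Hypothetical helper written for this sketch.)
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = xp[:, i:i + k, j:j + k]  # (C_in, k, k) neighborhood
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def contextual_attention(x, w_key, w_a1, w_a2):
    """Compute the attention matrix from X as described in the text."""
    q, k = x, x                                   # Q = X, K = X
    k_ctx = conv2d_same(k, w_key)                 # K': k×k conv over the keys
    y = np.concatenate([k_ctx, q], axis=0)        # concatenate K' with Q
    hidden = np.maximum(conv2d_same(y, w_a1), 0)  # first 1×1 conv (ReLU is an assumption)
    return conv2d_same(hidden, w_a2)              # second 1×1 conv -> attention matrix

# Illustrative sizes only: C=4 channels, 5×5 map, k=3, 6 attention channels.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 5, 5))
w_key = rng.standard_normal((4, 4, 3, 3))  # k×k convolution on K
w_a1 = rng.standard_normal((2, 8, 1, 1))   # 1×1 conv: 2C -> hidden
w_a2 = rng.standard_normal((6, 2, 1, 1))   # 1×1 conv: hidden -> attention channels
attn = contextual_attention(x, w_key, w_a1, w_a2)
print(attn.shape)  # (6, 5, 5)
```

Because every convolution here preserves the spatial size, the attention matrix has one entry set per spatial position of X; how many channels it should have, and how it is then combined with V = XWv, depends on details not given in this passage.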