A variation of the attention model that uses a self-attention mechanism to attend to every position in the input sequence. This allows the model to capture long-range dependencies that RNN-based models struggle to represent.
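A minimal NumPy sketch of scaled dot-product self-attention, illustrating how every position attends to every other position in the sequence; the projection matrices `w_q`, `w_k`, `w_v` and the single-head, unbatched shapes are illustrative assumptions, not a specific library's API.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.

    x:             (seq_len, d_model) input embeddings
    w_q, w_k, w_v: (d_model, d_k) projection matrices (hypothetical names)
    """
    q = x @ w_q                                    # one query per position
    k = x @ w_k                                    # one key per position
    v = x @ w_v                                    # one value per position
    scores = q @ k.T / np.sqrt(k.shape[-1])        # every position scores every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ v                             # weighted sum of values

# Example: 5 tokens with 8-dimensional embeddings, projected to 4 dimensions
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)             # shape (5, 4)
```

Because the attention weights connect all pairs of positions directly, the path between any two tokens is a single step, regardless of how far apart they are in the sequence.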