A neural network architecture that processes sequence data using attention mechanisms, which let each position in the sequence weight information from the other positions.
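As a rough illustration of the attention mechanism mentioned above, here is a minimal sketch in plain Python. It assumes the scaled dot-product variant of attention, which the definition itself does not specify. The function names `softmax` and `attention` and the toy input are hypothetical, and the learned query, key, and value projections that real architectures typically apply are omitted for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    Assumption: this variant is chosen for illustration only; the
    definition above does not say which form of attention is meant.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        # Turn similarities into weights that sum to 1.
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy sequence of three 2-dimensional token vectors (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: the same sequence supplies queries, keys, and values.
result = attention(tokens, tokens, tokens)
```

Each row of `result` is a blend of all three input vectors, with the blend weights determined by how similar that position's vector is to every other position's vector. This is the sense in which attention lets a model relate elements of a sequence to one another regardless of their distance.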