A Transformer-Based Approach Combining Deep Learning Network and Spatial-Temporal Information for Raw EEG Classification