[94] Recipe for a General, Powerful, Scalable Graph Transformer

January 3, 2023 · 1 min · long8v

[48] SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection

August 9, 2022 · 1 min · long8v

[21] cosFormer: Rethinking Softmax in Attention

April 20, 2022 · 2 min · long8v

[20] Memorizing Transformer

April 7, 2022 · 3 min · long8v

[14] Longformer: The Long-Document Transformer

February 22, 2022 · 2 min · long8v