Posts by Tao Bu

Long-Context Attention Benchmark
19 October 2025
Authors: Tao Bu, Qiangang Wang, Bowen Zeng, Hanwen Sun, Yunpeng Huang, Zewei Tao
Location: China
Language: English
Tags: MagiAttention, Benchmark, Blackwell, Flex-Flash-Attention, Distributed Attention, Context Parallelism
From Kernel Efficiency to Distributed Scalability

MagiAttention
21 April 2025
Authors: Zewei Tao, Yunpeng Huang, Qiangang Wang, Hanwen Sun, Jin Li, Tao Bu, Bowen Zeng
Location: China
Language: English
Tags: MagiAttention, Attention Slice Representation, Computation Load-Balance, Zero-Redundant Communication, Multi-Stage Overlap, Flex-Flash-Attention, Group Collective, Flash-Attention, Distributed Attention, Context Parallelism
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Mask Training