MagiAttention

  • User Guide
  • Blogs
  • Github
  • Blog

Recent Posts

  • 15 February - How to Ensure Kernels Actually Overlap
  • 14 February - Distributed-Native FFA (Coming Soon)
  • 08 February - Attention Engine for Inference (Coming Soon)
  • 07 February - Support Blackwell with FFA_FA4 Backend
  • 04 February - Support Muon QK-Clip

Tags

  • AF Disaggregation
  • Attention Sink
  • Attention Slice Representation
  • Benchmark
  • Blackwell
  • Collective Communication
  • Computation Load-Balance
  • Computation-Communication Overlap
  • Context Parallelism
  • DSA
  • DeepEP
  • Distributed Attention
  • Dynamic Load Balance
  • Flash-Attention
  • Flex-Flash-Attention
  • Group Collective
  • HSTU Function Representation
  • Hybrid Attention
  • Multi-Stage Overlap
  • Muon
  • NSA
  • QK-Clip
  • Sparse Attention
  • Zero-Redundant Communication

Categories

  • MagiAttention (12)

Archives

  • 2026 (8)
  • 2025 (4)

Authors

  • Bowen Zeng (3)
  • Hanwen Sun (3)
  • Jerry Chen (1)
  • Jin Li (4)
  • Kunlun Li (1)
  • Qiangang Wang (3)
  • Tao Bu (2)
  • Yufeng Yang (1)
  • Yujia Liu (1)
  • Yunpeng Huang (11)
  • Zewei Tao (8)

Languages

  • English (12)

Locations

  • China (12)

Posts tagged DSA

Optimize Sparse Attention in FFA (Coming Soon)

  • 25 January 2026
  • Zewei Tao, Hanwen Sun, Bowen Zeng, Jin Li, Yunpeng Huang
  • China
  • English
  • MagiAttention
  • Sparse Attention, NSA, DSA, Flex-Flash-Attention, Flash-Attention

This blog post will be released soon. Stay tuned!



© Copyright 2025-2026, Sandai.
