Tags — MagiAttention main documentation

Posts tagged AF Disaggregation

08 February 2026 - Attention Engine for Inference (Coming Soon)

Posts tagged Attention Sink

17 November 2025 - Support Learnable Attention Sink

Posts tagged Attention Slice Representation

21 April 2025 - MagiAttention

Posts tagged Benchmark

19 October 2025 - Long-Context Attention Benchmark

Posts tagged Blackwell

07 February 2026 - Support Blackwell with FFA_FA4 Backend

19 October 2025 - Long-Context Attention Benchmark

Posts tagged Collective Communication

24 January 2026 - Support Native Group Collective

Posts tagged Computation Load-Balance

21 April 2025 - MagiAttention

Posts tagged Computation-Communication Overlap

15 February 2026 - How to Ensure Kernels Actually Overlap

Posts tagged Context Parallelism

15 February 2026 - How to Ensure Kernels Actually Overlap

14 February 2026 - Distributed-Native FFA (Coming Soon)

08 February 2026 - Attention Engine for Inference (Coming Soon)

24 January 2026 - Support Native Group Collective

21 January 2026 - Dynamic Attention Solver (Coming Soon)

19 October 2025 - Long-Context Attention Benchmark

21 April 2025 - MagiAttention

Posts tagged DSA

25 January 2026 - Optimize Sparse Attention in FFA (Coming Soon)

Posts tagged DeepEP

24 January 2026 - Support Native Group Collective

Posts tagged Distributed Attention

15 February 2026 - How to Ensure Kernels Actually Overlap

14 February 2026 - Distributed-Native FFA (Coming Soon)

08 February 2026 - Attention Engine for Inference (Coming Soon)

24 January 2026 - Support Native Group Collective

21 January 2026 - Dynamic Attention Solver (Coming Soon)

19 October 2025 - Long-Context Attention Benchmark

21 April 2025 - MagiAttention

Posts tagged Dynamic Load Balance

21 January 2026 - Dynamic Attention Solver (Coming Soon)

Posts tagged Flash-Attention

07 February 2026 - Support Blackwell with FFA_FA4 Backend

04 February 2026 - Support Muon QK-Clip

25 January 2026 - Optimize Sparse Attention in FFA (Coming Soon)

22 December 2025 - Flash Attention 2 Math Derivation

17 November 2025 - Support Learnable Attention Sink

21 April 2025 - MagiAttention

Posts tagged Flex-Flash-Attention

14 February 2026 - Distributed-Native FFA (Coming Soon)

07 February 2026 - Support Blackwell with FFA_FA4 Backend

04 February 2026 - Support Muon QK-Clip

25 January 2026 - Optimize Sparse Attention in FFA (Coming Soon)

22 December 2025 - Flash Attention 2 Math Derivation

17 November 2025 - Support Learnable Attention Sink

19 October 2025 - Long-Context Attention Benchmark

21 April 2025 - MagiAttention

Posts tagged Group Collective

24 January 2026 - Support Native Group Collective

21 April 2025 - MagiAttention

Posts tagged HSTU Function Representation

07 February 2026 - Support Blackwell with FFA_FA4 Backend

Posts tagged Hybrid Attention

21 January 2026 - Dynamic Attention Solver (Coming Soon)

Posts tagged Multi-Stage Overlap

21 April 2025 - MagiAttention

Posts tagged Muon

04 February 2026 - Support Muon QK-Clip

Posts tagged NSA

25 January 2026 - Optimize Sparse Attention in FFA (Coming Soon)

Posts tagged QK-Clip

04 February 2026 - Support Muon QK-Clip

Posts tagged Sparse Attention

25 January 2026 - Optimize Sparse Attention in FFA (Coming Soon)

21 January 2026 - Dynamic Attention Solver (Coming Soon)

Posts tagged Zero-Redundant Communication

21 April 2025 - MagiAttention