
MagiAttention

  • User Guide
  • Blogs
  • Github
  • Blog

Recent Posts

  • 15 February - How to Ensure Kernels Actually Overlapped
  • 14 February - Distributed-Native FFA
  • 08 February - Attention Engine for Inference
  • 07 February - Support Blackwell with FFA_FA4 Backend
  • 04 February - Support Muon QK-Clip

Tags

  • AF Disaggregation
  • Attention Sink
  • Attention Slice Representation
  • Benchmark
  • Blackwell
  • Computation Load-Balance
  • Computation-Communication Overlap
  • Context Parallelism
  • DSA
  • DeepEP
  • Distributed Attention
  • Dynamic Load Balance
  • Flash-Attention
  • Flex-Flash-Attention
  • Group Collective
  • HSTU Function Representation
  • Hybrid Attention
  • Multi-Stage Overlap
  • Muon
  • NSA
  • QK-Clip
  • Sparse Attention
  • Zero-Redundant Communication

Categories

  • MagiAttention (12)

Archives

  • 2026 (8)
  • 2025 (4)

Authors

  • Bowen Zeng (3)
  • Hanwen Sun (3)
  • Jerry Chen (1)
  • Jin Li (4)
  • Kunlun Li (1)
  • Qiangang Wang (4)
  • Tao Bu (2)
  • Yufeng Yang (1)
  • Yujia Liu (1)
  • Yunpeng Huang (11)
  • Zewei Tao (7)

Languages

  • English (12)

Locations

  • China (12)

Posts by Yufeng Yang

Support Blackwell with FFA_FA4 Backend

  • 07 February 2026
  • Jerry Chen, Yujia Liu, Yufeng Yang, Yunpeng Huang, Zewei Tao, Qiangang Wang, Kunlun Li
  • China
  • English
  • MagiAttention
  • Blackwell, Flex-Flash-Attention, Flash-Attention, HSTU Function Representation

This blog post will be published soon. Stay tuned!



© Copyright 2025-2026, Sandai.

Created using Sphinx 9.1.0.

Built with the PyData Sphinx Theme 0.16.1.