Posts tagged AF Disaggregation
08 February 2026 - Attention Engine for Inference
Posts tagged Attention Sink
17 November 2025 - Support Learnable Attention Sink
Posts tagged Attention Slice Representation
21 April 2025 - MagiAttention
Posts tagged Benchmark
19 October 2025 - Long-Context Attention Benchmark
Posts tagged Blackwell
07 February 2026 - Support Blackwell with FFA_FA4 Backend
19 October 2025 - Long-Context Attention Benchmark
Posts tagged Computation Load-Balance
21 April 2025 - MagiAttention
Posts tagged Computation-Communication Overlap
15 February 2026 - How to Ensure Kernels Actually Overlapped
Posts tagged Context Parallelism
15 February 2026 - How to Ensure Kernels Actually Overlapped
14 February 2026 - Distributed-Native FFA
08 February 2026 - Attention Engine for Inference
24 January 2026 - Support Native Group Collective Based on DeepEP
21 January 2026 - Dynamic Attention Solver
19 October 2025 - Long-Context Attention Benchmark
21 April 2025 - MagiAttention
Posts tagged DSA
25 January 2026 - Optimize Sparse Attention in FFA
Posts tagged DeepEP
24 January 2026 - Support Native Group Collective Based on DeepEP
Posts tagged Distributed Attention
15 February 2026 - How to Ensure Kernels Actually Overlapped
14 February 2026 - Distributed-Native FFA
08 February 2026 - Attention Engine for Inference
24 January 2026 - Support Native Group Collective Based on DeepEP
21 January 2026 - Dynamic Attention Solver
19 October 2025 - Long-Context Attention Benchmark
21 April 2025 - MagiAttention
Posts tagged Dynamic Load Balance
21 January 2026 - Dynamic Attention Solver
Posts tagged Flash-Attention
07 February 2026 - Support Blackwell with FFA_FA4 Backend
04 February 2026 - Support Muon QK-Clip
25 January 2026 - Optimize Sparse Attention in FFA
22 December 2025 - Flash Attention 2 Math Derivation
17 November 2025 - Support Learnable Attention Sink
21 April 2025 - MagiAttention
Posts tagged Flex-Flash-Attention
14 February 2026 - Distributed-Native FFA
07 February 2026 - Support Blackwell with FFA_FA4 Backend
04 February 2026 - Support Muon QK-Clip
25 January 2026 - Optimize Sparse Attention in FFA
22 December 2025 - Flash Attention 2 Math Derivation
17 November 2025 - Support Learnable Attention Sink
19 October 2025 - Long-Context Attention Benchmark
21 April 2025 - MagiAttention
Posts tagged Group Collective
24 January 2026 - Support Native Group Collective Based on DeepEP
21 April 2025 - MagiAttention
Posts tagged HSTU Function Representation
07 February 2026 - Support Blackwell with FFA_FA4 Backend
Posts tagged Hybrid Attention
21 January 2026 - Dynamic Attention Solver
Posts tagged Multi-Stage Overlap
21 April 2025 - MagiAttention
Posts tagged Muon
04 February 2026 - Support Muon QK-Clip
Posts tagged NSA
25 January 2026 - Optimize Sparse Attention in FFA
Posts tagged QK-Clip
04 February 2026 - Support Muon QK-Clip
Posts tagged Sparse Attention
25 January 2026 - Optimize Sparse Attention in FFA
21 January 2026 - Dynamic Attention Solver
Posts tagged Zero-Redundant Communication
21 April 2025 - MagiAttention