[Submitted on 29 Oct 2025]
Context-Adaptive Attention: A Balanced Approach for Efficient Language Modeling
Abstract: We present Context-Adaptive Attention (CAA), a hybrid attention mechanism that dynamically balances local and global attention patterns through learned gating. On the FineWeb benchmark with a 134M-parameter Qwen architecture, CAA achieves improved efficiency while maintaining model performance. Our analysis reveals that the optimal attention pattern varies significantly across linguistic contexts, motivating our gated approach. Through careful ablation studies and comparison with recent sparse attention methods \cite{yao2021combiner,chen2024fast,beltagy2020longformer}, we demonstrate CAA's effectiveness while acknowledging its 2.1x memory overhead relative to the baseline.
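The abstract does not specify the implementation details of the gating mechanism. Below is a minimal sketch of one plausible reading: a per-token, per-head sigmoid gate blending a sliding-window (local) branch with a full causal (global) branch. All names and hyperparameters (ContextAdaptiveAttention, window=128) are illustrative assumptions, not taken from the paper, and the sketch materializes full attention for both branches, so it shows only the gating logic, not the efficiency-oriented implementation.

```python
import torch
import torch.nn as nn

class ContextAdaptiveAttention(nn.Module):
    """Hypothetical gated mix of local and global causal attention."""
    def __init__(self, dim, n_heads, window=128):
        super().__init__()
        self.n_heads = n_heads
        self.head_dim = dim // n_heads
        self.window = window
        self.qkv = nn.Linear(dim, 3 * dim)
        self.gate = nn.Linear(dim, n_heads)   # one gate per token and head
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        B, T, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # reshape to (B, heads, T, head_dim)
        shape = (B, T, self.n_heads, self.head_dim)
        q, k, v = (t.view(shape).transpose(1, 2) for t in (q, k, v))

        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        causal = torch.tril(torch.ones(T, T, dtype=torch.bool, device=x.device))

        # global branch: full causal attention
        glob = scores.masked_fill(~causal, float("-inf")).softmax(-1) @ v

        # local branch: causal attention restricted to a sliding window
        idx = torch.arange(T, device=x.device)
        local_mask = causal & ((idx[:, None] - idx[None, :]) < self.window)
        loc = scores.masked_fill(~local_mask, float("-inf")).softmax(-1) @ v

        # learned sigmoid gate blends the two branches per token and head
        g = torch.sigmoid(self.gate(x)).transpose(1, 2).unsqueeze(-1)  # (B, heads, T, 1)
        out = g * loc + (1 - g) * glob
        return self.proj(out.transpose(1, 2).reshape(B, T, D))
```

Under this reading, an efficient implementation would avoid computing the full score matrix for the local branch (e.g., via blocked or sliding-window kernels); the extra gate parameters and the second attention branch are one plausible source of the reported memory overhead.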
Submission history
[v1] Wed, 29 Oct 2025 12:29 UTC