[Submitted on 21 Oct 2025]
Position-Aware Gompertz Gating for Transformer Feedforward Networks
Abstract: We present Position-Aware Gompertz Gating (PAGG), an improved feedforward module for transformers that addresses three limitations of standard gated linear units (GLUs). Our method combines an asymmetric activation with position-aware scaling and achieves a validation loss of 4.889 on FineWeb, improving on SwiGLU (4.927) at similar computational cost.
Submission history
[v1] Tue, 21 Oct 2025 00:54 UTC