[Submitted on 21 Oct 2025]
Adaptive Gated Pathways for Transformer Feedforward Networks
Abstract: We present Adaptive Gated Pathways (AGP), a novel feedforward architecture that dynamically blends SiLU and GELU gating mechanisms. On the FineWeb benchmark, AGP achieves a validation loss of 4.847, compared with 4.927 for a SwiGLU baseline (p < 0.01).
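The abstract does not say how the SiLU and GELU gates are blended, so the following is only an illustrative sketch of one plausible reading, not the authors' method. It assumes a SwiGLU-style layout (gate, up, and down projections) and an input-dependent mixing weight `alpha = sigmoid(w_mix · x)`. The names `blended_glu_ffn`, `W_gate`, `W_up`, `W_down`, and `w_mix` are hypothetical and do not come from the paper.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def silu(x):
    """SiLU (swish): x * sigmoid(x)."""
    return x * sigmoid(x)


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def blended_glu_ffn(x, W_gate, W_up, W_down, w_mix):
    """Hypothetical blended gated feedforward layer for one token vector x.

    ASSUMPTION: the blend is a convex combination of SiLU and GELU applied
    to the gate projection, weighted by alpha = sigmoid(w_mix . x).
    The source abstract does not specify this form.
    """
    alpha = sigmoid(sum(w * xi for w, xi in zip(w_mix, x)))
    gate = matvec(W_gate, x)  # pre-activation gate values
    up = matvec(W_up, x)      # linear "up" projection
    act = [alpha * silu(g) + (1.0 - alpha) * gelu(g) for g in gate]
    hidden = [a * u for a, u in zip(act, up)]  # elementwise gating
    return matvec(W_down, hidden)
```

Under this assumption, `alpha = 1` gives a SwiGLU-style layer and `alpha = 0` gives a GELU-gated (GEGLU-style) layer, so the sketch interpolates between two known gated feedforward variants.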
Submission history
[v1] Tue, 21 Oct 2025 22:09 UTC