[Submitted on 26 Oct 2025]
Understanding the Limitations of Temperature-Controlled Gating in Feedforward Networks
Abstract: This paper presents a detailed investigation into temperature-controlled gating mechanisms for transformer feedforward networks. While our proposed Gated ReLU with Temperature (GRT) approach showed initial promise, comprehensive evaluation revealed a validation loss of 5.096, which is 3.4% higher than the SwiGLU baseline (4.9266). We analyze potential reasons for this underperformance through ablation studies and a theoretical examination of the temperature scaling mechanism. Our findings suggest that while temperature control offers interesting properties for gating functions, its benefits may be offset by increased optimization difficulty in standard transformer architectures.
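The 3.4% figure in the abstract reads as a relative gap between the two reported validation losses. The short Python sketch below checks that reading, assuming the gap is measured relative to the SwiGLU baseline; the function name `relative_gap` is illustrative and does not come from the paper.

```python
# Validation losses as reported in the abstract.
GRT_LOSS = 5.096       # Gated ReLU with Temperature (GRT)
SWIGLU_LOSS = 4.9266   # SwiGLU baseline


def relative_gap(candidate: float, baseline: float) -> float:
    """Return the increase of `candidate` over `baseline` as a fraction of `baseline`.

    Assumption: the abstract's percentage is computed relative to the baseline.
    """
    return (candidate - baseline) / baseline


absolute_gap = GRT_LOSS - SWIGLU_LOSS
gap = relative_gap(GRT_LOSS, SWIGLU_LOSS)

print(f"absolute gap: {absolute_gap:.4f}")
print(f"relative gap: {gap:.1%}")
```

Under that assumption the relative gap rounds to the 3.4% stated in the abstract.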
Submission history
[v1] Sun, 26 Oct 2025 16:59 UTC