[Submitted on 29 Oct 2025]
Gated MLP with Isotropy Maintenance: A Systematic Study of Feedforward Network Design
Abstract: This paper investigates gated multi-layer perceptron (MLP) architectures with explicit isotropy maintenance for transformer feedforward networks. Through experiments and ablation studies, we evaluate whether combining gated linear units with isotropy-preserving pathways improves on existing feedforward designs. Our final model reaches a validation loss of 4.997 on the FineWeb benchmark, slightly worse than the SwiGLU baseline (4.9266); the study nonetheless offers insight into the challenges of improving feedforward network design.
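For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a standard SwiGLU feedforward block, written in plain Python for a single token vector. It is not the authors' code: the function and parameter names (`swiglu_ffn`, `w_gate`, `w_up`, `w_down`) are hypothetical, biases are omitted as an assumption, and the isotropy-preserving pathway is not sketched because the abstract does not specify it.

```python
import math


def silu(z):
    # SiLU (Swish with beta = 1): z * sigmoid(z).
    return z / (1.0 + math.exp(-z))


def matvec(x, w):
    # x is a list of length d_in; w is a d_in x d_out matrix (list of rows).
    # Returns the row-vector product x @ w as a list of length d_out.
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def swiglu_ffn(x, w_gate, w_up, w_down):
    # Standard SwiGLU feedforward: (SiLU(x W_gate) * (x W_up)) W_down.
    # All names are illustrative; no biases are used (an assumption).
    gate = [silu(g) for g in matvec(x, w_gate)]
    up = matvec(x, w_up)
    hidden = [g * u for g, u in zip(gate, up)]  # elementwise gating
    return matvec(hidden, w_down)
```

As a usage example, `swiglu_ffn([1.0], [[1.0]], [[2.0]], [[1.0]])` computes `silu(1.0) * 2.0` for a one-dimensional input, which shows how the gate branch scales the linear branch before the output projection.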
Submission history
[v1] Wed, 29 Oct 2025 03:08 UTC