[Submitted on 28 Oct 2025]
Rotation-Based Feedforward Networks: A Geometric Approach to Transformer Layers
Abstract: We present Rotation-Based Feedforward Networks (RBFN), a novel architecture that replaces traditional feedforward layers with learned 4D rotational transformations. Drawing inspiration from geometric deep learning, RBFN parameterizes hidden-space transformations as compositions of rotations rather than pointwise nonlinearities. On the FineWeb benchmark with an 83M-parameter model, RBFN achieves a validation loss of 4.916, a 0.011 improvement over the SwiGLU baseline, while maintaining comparable computational requirements. Detailed analysis reveals that the rotational formulation provides particular benefits in later training stages, suggesting advantages for modeling hierarchical linguistic structures. We provide both theoretical analysis of the rotation mechanism's properties and empirical validation of its effectiveness compared to existing feedforward variants.
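The abstract describes the feedforward replacement only at a high level. As a loose illustration of how a "composition of 4D rotations" might be parameterized, the sketch below splits the hidden state into 4-dimensional groups and applies a learned rotation to each group via the exponential map on so(4). The class name, the surrounding up/down projections, and all shapes are assumptions made for illustration; this is not the authors' reference implementation, and the paper's full composition of rotations may differ.

```python
# Minimal sketch of a rotation-based feedforward block (hypothetical).
# Assumption: the "4D rotational transformations" act on disjoint 4-dimensional
# groups of the hidden state, each parameterized by the matrix exponential of a
# learned skew-symmetric generator (so R = exp(A) is orthogonal with det = 1).
import torch
import torch.nn as nn


class RotationFeedforward(nn.Module):
    """Feedforward block whose hidden transformation is a per-group 4D rotation."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        assert d_hidden % 4 == 0, "hidden width must split into 4D rotation groups"
        self.n_groups = d_hidden // 4
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)
        # so(4) has six free parameters per rotation; learn one set per group.
        self.theta = nn.Parameter(torch.zeros(self.n_groups, 6))

    def _rotations(self) -> torch.Tensor:
        # Fill the upper triangle of a (n_groups, 4, 4) generator, antisymmetrize,
        # then exponentiate to obtain valid rotation matrices.
        A = torch.zeros(self.n_groups, 4, 4, device=self.theta.device, dtype=self.theta.dtype)
        idx = torch.triu_indices(4, 4, offset=1)
        A[:, idx[0], idx[1]] = self.theta
        A = A - A.transpose(-1, -2)
        return torch.linalg.matrix_exp(A)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model)
        h = self.up(x)                                    # (B, T, d_hidden)
        B, T, _ = h.shape
        h = h.view(B, T, self.n_groups, 4)                # split into 4D groups
        h = torch.einsum("btgi,gij->btgj", h, self._rotations())
        return self.down(h.reshape(B, T, -1))


if __name__ == "__main__":
    layer = RotationFeedforward(d_model=64, d_hidden=256)
    out = layer(torch.randn(2, 10, 64))
    print(out.shape)  # torch.Size([2, 10, 64])
```

Parameterizing each rotation through the exponential of a skew-symmetric generator keeps the transformation exactly orthogonal throughout training, which is one plausible way to realize the "rotations rather than pointwise nonlinearities" described in the abstract.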
Submission history
[v1] Tue, 28 Oct 2025 01:06 UTC