[Submitted on 31 Oct 2025]
Polynomial Activations in Transformer Networks
Abstract: We evaluate quadratic polynomial activations as alternatives to gated mechanisms in transformers. In our experiments, polynomial activations reach a validation loss of 4.891, compared with 4.9266 for the SwiGLU baseline. These results suggest that polynomial activations can match the performance of gated activations with a simpler implementation.
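To make the comparison concrete, the following is a minimal sketch of the two feed-forward variants the abstract contrasts. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the polynomial's exact form, so the elementwise form a*x^2 + b*x + c, the default coefficients, and all function and parameter names (`quadratic_activation`, `poly_ffn`, `swiglu_ffn`, `W_in`, `W_value`, `W_gate`, `W_out`) are hypothetical.

```python
import math


def quadratic_activation(x, a=1.0, b=1.0, c=0.0):
    """Elementwise quadratic a*v^2 + b*v + c (form and coefficients are assumed)."""
    return [a * v * v + b * v + c for v in x]


def silu(v):
    """SiLU (swish) nonlinearity: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(value, gate):
    """SwiGLU-style gating: silu(gate) * value, elementwise."""
    return [silu(g) * v for v, g in zip(value, gate)]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def poly_ffn(x, W_in, W_out):
    """Feed-forward block with a polynomial activation: one input projection."""
    return matvec(W_out, quadratic_activation(matvec(W_in, x)))


def swiglu_ffn(x, W_value, W_gate, W_out):
    """Feed-forward block with SwiGLU: separate value and gate projections."""
    return matvec(W_out, swiglu_gate(matvec(W_value, x), matvec(W_gate, x)))
```

In this sketch the polynomial block needs a single input projection, whereas the gated block needs two (value and gate), which is one sense in which the polynomial variant can be called simpler.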
Submission history
[v1] Fri, 31 Oct 2025 01:07 UTC