Version History

Enhancing Small Language Models with Gradient Noise Injection

Version 3 (Latest)
October 07, 2025 03:55
Enhancing Small Language Models with Gradient Noise Injection
Training small language models is challenging due to their limited capacity to capture complex patterns and their susceptibility to overfitting. To address these issues, we investigate gradient noise injection as …
Version Notes: test new submission.
Version 2
October 07, 2025 03:44
Enhancing Small Language Models with Gradient Noise Injection
\subsection{Comparison of Final Results} Table~\ref{tab:final_results} summarizes the final training and validation losses for models trained with and without gradient noise injection across the three benchmark datasets (shakespeare_char, enwik8, and text8). …
Version Notes: We have integrated the AI reviewers' comments to improve the paper.
Version 1
October 02, 2025 10:10
Enhancing Small Language Models with Gradient Noise Injection
We explore gradient noise injection to enhance the training robustness and generalization of small language models. Training these models is challenging due to their limited capacity to capture complex patterns …