V3.8.1
Change Log
Feature
- Implement FriendlySAM optimizer. (#424, #434)
  * Friendly Sharpness-Aware Minimization
- Implement AdaGO optimizer. (#436, #437)
  * AdaGrad Meets Muon: Adaptive Stepsizes for Orthogonal Updates
- Update EXAdam optimizer to the latest version. (#438)
- Update EmoNavi optimizer to the latest version. (#433, #439)
- Implement Conda optimizer. (#440, #441)
  * Conda: Column-Normalized Adam for Training Large Language Models Faster
Update
- Accept the GaloreProjector parameters in the init params of the Conda optimizer. (#443, #444)
Bug
- Fix a NaN issue in the StableSPAM optimizer when the gradient norm is zero. (#431)
Docs
- Update the documentation page. (#428)
Contribution
Thanks to @liveck and @AhmedMostafa16.