Skip to content

V3.3.4

Change Log

Feature

  • Support OrthoGrad feature for create_optimizer(). (#324)
  • Enhanced flexibility for the optimizer parameter in Lookahead, TRAC, and OrthoGrad optimizers. (#324) * Now supports both torch.optim.Optimizer instances and classes * You can now use Lookahead optimizer in two ways. * Lookahead(AdamW(model.parameters(), lr=1e-3), k=5, alpha=0.5) * Lookahead(AdamW, k=5, alpha=0.5, params=model.parameters())
  • Implement SPAM optimizer. (#324) * Spike-Aware Adam with Momentum Reset for Stable LLM Training
  • Implement TAM, and AdaTAM optimizers. (#325) * Torque-Aware Momentum