V3.1.0
Change Log
Feature
- Implement `AdaLomo` optimizer. (#258)
    * Low-memory Optimization with Adaptive Learning Rate
- Support `Q-GaLore` optimizer. (#258)
    * Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.
    * Usage: `optimizer = load_optimizer('q_galore_adamw8bit')`
- Support more bnb optimizers. (#258)
    * `bnb_paged_adam8bit`, `bnb_paged_adamw8bit`, `bnb_*_*32bit`.
- Improve `power_iteration()` speed by up to 40%. (#259)
- Improve `reg_noise()` (E-MCMC) speed by up to 120%. (#260)
- Support `disable_lr_scheduler` parameter for the `Ranger21` optimizer to disable the built-in learning rate scheduler. (#261)
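
For readers unfamiliar with what a `power_iteration()` routine computes, the following is a minimal pure-Python sketch of the generic power iteration technique for estimating the largest eigenvalue of a symmetric matrix. It is not the library's implementation: the names `matvec` and `power_iteration_sketch`, the `num_iters` and `tol` parameters, and the starting vector are all assumptions made for illustration.

```python
import math


def matvec(a, v):
    # Multiply a square matrix (list of rows) by a vector.
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def power_iteration_sketch(a, num_iters=100, tol=1e-9):
    # Hypothetical helper: estimates the largest eigenvalue of a
    # symmetric positive semi-definite matrix `a`.
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n  # normalized starting vector (assumption)
    eigenvalue = 0.0
    for _ in range(num_iters):
        w = matvec(a, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0  # the start vector lies in the null space
        v = [x / norm for x in w]
        # Rayleigh quotient of the unit vector v.
        new_eigenvalue = sum(x * y for x, y in zip(v, matvec(a, v)))
        if abs(new_eigenvalue - eigenvalue) < tol:
            return new_eigenvalue
        eigenvalue = new_eigenvalue
    return eigenvalue


largest = power_iteration_sketch([[2.0, 0.0], [0.0, 1.0]])
```

Each iteration costs one or two matrix-vector products, so the cost of the matrix product itself and the number of iterations before the tolerance check fires are the natural places to look for speedups in a routine like this.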
Refactor
- Refactor `AdamMini` optimizer. (#258)
- Deprecate optional dependency, `bitsandbytes`. (#258)
- Move `get_rms`, `approximate_sq_grad` functions to `BaseOptimizer` for reusability. (#258)
- Refactor `shampoo_utils.py`. (#259)
- Add `debias`, `debias_adam` methods in `BaseOptimizer`. (#261)
- Refactor optimizers to inherit from `BaseOptimizer` only instead of multiple classes. (#261)
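
As background for the debias helpers mentioned above, here is a minimal sketch of the standard Adam-style bias correction, which divides an exponential moving average by `1 - beta ** step` to compensate for its zero initialization. This illustrates the general technique only; the function names `bias_correction` and `debiased_ema` are hypothetical and are not claimed to match the signatures or behavior of the `BaseOptimizer` methods.

```python
def bias_correction(beta: float, step: int) -> float:
    # Hypothetical helper: the Adam-style correction term 1 - beta^step,
    # with `step` counted from 1.
    return 1.0 - beta ** step


def debiased_ema(values, beta: float = 0.9):
    # Exponential moving average initialized at zero, then corrected
    # at every step so that early estimates are not biased toward zero.
    ema = 0.0
    corrected = []
    for step, value in enumerate(values, start=1):
        ema = beta * ema + (1.0 - beta) * value
        corrected.append(ema / bias_correction(beta, step))
    return corrected


estimates = debiased_ema([5.0, 5.0, 5.0])
```

For a constant input, the corrected average recovers the input value at every step, while the uncorrected average would start at only `(1 - beta)` times that value.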
Bug
- Fix several bugs in `AdamMini` optimizer. (#257)
Contributions
Thanks to @sdbds