
Releases: ClashLuke/TrueGrad

4.0.3

08 Aug 08:47
  • fix(Sign): self-graft correctly; previously, the update was computed as update.sign() * update.norm(), omitting the required division by the norm of update.sign(). Now it is F.normalize(update.sign()) * update.norm(). This changes the required learning rates for self-grafted tg.optim.Sign (see the sketch below).
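
For illustration, a sketch of what the corrected self-graft computes, written with an explicit division instead of F.normalize; the helper name is hypothetical and not the library's code:

```python
import torch

def self_grafted_sign(update: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    # Graft: keep the direction of sign(update) but rescale it to the norm of
    # the original update. The pre-4.0.3 code returned
    # update.sign() * update.norm(), i.e. it was too large by a factor of
    # ||update.sign()||.
    direction = update.sign()
    return direction / direction.norm().clamp_min(eps) * update.norm()
```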

4.0.2

23 Apr 06:18
  • Use WeightDecayChain in OptimizerOptimizer

4.0.1

23 Apr 06:16
  • Add missing params_flat in Graft

4.0.0

23 Apr 06:13
  • Add configurable weight decay via WeightDecayChain
    • L1/L2 Decay
    • Decay to Init/EMA
  • Remove the decay_to_init flag. Use weight_decay_cls=tg.WeightDecayChain(tg.optim.WeightDecayToInit()) instead (see the sketch below).
  • Remove the default_to_adam flag. Use default_to_baseline instead.
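
A minimal sketch of the new configuration, following the call written above; the exact import paths and the weight_decay_cls keyword on TGAdamW are taken from that note and may differ in detail:

```python
import torch
import truegrad as tg
from truegrad.optim import TGAdamW

model = torch.nn.Linear(4, 4)

# Decay toward the initial weights, replacing the removed decay_to_init flag.
optim = TGAdamW(
    model.parameters(),
    lr=1e-3,
    weight_decay_cls=tg.WeightDecayChain(tg.optim.WeightDecayToInit()),
)
```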

2.3.5

14 Jan 19:40
6e40c08

  • Fix the bugs

2.2.0

14 Jan 13:16
d65147a
  • Improve TG-Optimizer extensibility by adding TrueGrad base optimizer class
  • Add (TG-)LaProp

2.1.0

29 Nov 07:34
4279d61
  • feat(nn.functional): allow parameters in more truegrad.nn.functional ops
  • fix(functional): allow odd shapes in truegrad.functional.einsum's backward
  • feat(utils): allow the combination of truegrad.nn with truegrad.utils.patch_model
  • fix(TGAdamW): improve stability

Together, these features allow performant use of off-the-shelf HuggingFace Transformers via truegrad.utils.patch_torch.
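
A minimal sketch of that workflow, assuming patch_torch() takes no required arguments and must run before the model is constructed; the model name is only an example:

```python
from truegrad.utils import patch_torch

patch_torch()  # patch torch and torch.nn.functional before building the model

from transformers import AutoModel
from truegrad.optim import TGAdamW

model = AutoModel.from_pretrained("bert-base-cased")
optim = TGAdamW(model.parameters(), lr=1e-4)
```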

2.0.0

27 Nov 15:14
bb868cc
  • Feature: Patch torch and torch.nn.functional in truegrad.utils.patch_torch
  • Feature: Add chunk, split and transpose to truegrad.functional
  • Fix: publicly expose truegrad.nn.functional
  • Fix: use patched chunk, split and transpose functions in truegrad.nn.functional.multi_head_attention_forward (closes #1)

1.0.0

27 Nov 14:26
6fae8fd
  • Add truegrad.nn.functional
  • Extend truegrad.nn (see the sketch below)
  • Add truegrad.utils.patch_torch
  • Add truegrad.functional.TrueGradTensor to store sum_grad_squared, fixing truegrad.functional.reshape
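
A minimal sketch of using truegrad.nn modules directly with TGAdamW, assuming truegrad.nn.Linear mirrors torch.nn.Linear's interface so that sum_grad_squared is tracked for its parameters; the exact module set is an assumption based on the notes above:

```python
import torch
from truegrad import nn
from truegrad.optim import TGAdamW

model = torch.nn.Sequential(nn.Linear(8, 16), torch.nn.ReLU(), nn.Linear(16, 1))
optim = TGAdamW(model.parameters(), lr=1e-3)

loss = model(torch.randn(4, 8)).square().mean()
loss.backward()
optim.step()
```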

0.1.0

26 Nov 16:02
0a14e38
  • Add BackPack as a possible backend
  • Add a default_to_adam option to TGAdamW (see the sketch below)
  • Rename square_grad to sum_grad_squared
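
A minimal sketch of the new option, assuming default_to_adam makes TGAdamW fall back to plain AdamW updates for parameters without sum_grad_squared statistics; the flag was later replaced by default_to_baseline (see release 4.0.0 above):

```python
import torch
from truegrad.optim import TGAdamW

model = torch.nn.Linear(4, 4)  # plain torch module, no TrueGrad statistics
optim = TGAdamW(model.parameters(), lr=1e-3, default_to_adam=True)

loss = model(torch.randn(2, 4)).square().mean()
loss.backward()
optim.step()
```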