Schedulers

Additional model training schedulers for fastai


Learner.fit_flat_warmup

 Learner.fit_flat_warmup (n_epoch:int, lr:Optional[float]=None,
                          div:Union[int,float]=25.0,
                          div_final:Union[int,float]=100000.0,
                          pct_start:float=0.75, warm_pct:float=0.2,
                          warm_epoch:int=5, warm_mode:str='auto',
                          warm_sched:Callable[...,fastai.callback.schedule._Annealer]=<function SchedCos>,
                          wd:Optional[float]=None,
                          cbs:Union[fastai.callback.core.Callback,Iterable[fastai.callback.core.Callback],fastcore.foundation.L,fastcore.basics.fastuple,NoneType]=None,
                          reset_opt:bool=False)

Fit self.model for n_epoch at flat lr with a warmup and ending with cosine annealing.

| | Type | Default | Details |
| --- | --- | --- | --- |
| n_epoch | int | | Number of epochs |
| lr | float \| None | None | Maximum learning rate |
| div | Number | 25.0 | Initial learning rate: lr/div |
| div_final | Number | 100000.0 | Final learning rate: lr/div_final |
| pct_start | float | 0.75 | Fraction of training at which learning rate cosine annealing starts |
| warm_pct | float | 0.2 | Learning rate warmup length as a percent of training steps |
| warm_epoch | int | 5 | Learning rate warmup length in epochs |
| warm_mode | str | auto | Warm up using ‘epoch’, ‘pct’, or the minimum of the two if ‘auto’ |
| warm_sched | Callable[…, _Annealer] | SchedCos | Learning rate warmup schedule |
| wd | float \| None | None | Weight decay, defaults to Optimizer weight decay |
| cbs | listified[Callback] \| None | None | Temporary Callbacks to apply during fit |
| reset_opt | bool | False | Reset Optimizer before fit |

fit_flat_warmup is identical to fastai’s fit_flat_cos, except with an added learning rate warmup phase.
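The resulting three-phase shape (warmup, flat, cosine annealing) can be sketched in plain Python. This is a minimal illustration rather than the library's implementation: the helper names `cos_anneal` and `flat_warmup_lr` are hypothetical, `pct` is the fraction of training completed, the default cosine warmup is assumed, and `pct_start` is assumed to be measured from the start of training.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from start (pos=0) to end (pos=1)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def flat_warmup_lr(pct, lr=1e-3, div=25.0, div_final=1e5,
                   pct_start=0.75, warm_pct=0.2):
    """Hypothetical helper: learning rate at training fraction pct in [0, 1]."""
    if pct < warm_pct:
        # Phase 1: warm up from lr/div to lr
        return cos_anneal(lr / div, lr, pct / warm_pct)
    if pct < pct_start:
        # Phase 2: hold the learning rate flat
        return lr
    # Phase 3: cosine anneal from lr down to lr/div_final
    return cos_anneal(lr, lr / div_final, (pct - pct_start) / (1 - pct_start))

for pct in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"{pct:.1f} -> {flat_warmup_lr(pct):.2e}")
```

Under these assumptions the learning rate starts at lr/div, reaches lr once warmup ends, stays there until pct_start, and finishes near lr/div_final.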

By default (warm_mode='auto'), fit_flat_warmup warms up the learning rate for the shorter of warm_pct percent of training steps or warm_epoch training epochs. Set warm_mode='pct' to always warm up for warm_pct percent of training steps, or set warm_mode='epoch' to always warm up for warm_epoch epochs.
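As a rough illustration of how the three modes could resolve to a single warmup length, the sketch below converts warm_epoch into a fraction of training and picks according to warm_mode. The function name `resolve_warmup_pct` is hypothetical, and the logic is an assumption based on the description above, not the library's code.

```python
def resolve_warmup_pct(n_epoch, warm_pct=0.2, warm_epoch=5, warm_mode="auto"):
    """Hypothetical helper: fraction of training spent in warmup."""
    # warm_epoch expressed as a fraction of the whole run
    epoch_pct = warm_epoch / n_epoch
    if warm_mode == "auto":
        # Assumed behaviour: use whichever warmup is shorter
        return min(warm_pct, epoch_pct)
    if warm_mode == "pct":
        return warm_pct
    if warm_mode == "epoch":
        return epoch_pct
    raise ValueError(f"unknown warm_mode: {warm_mode!r}")

short_run = resolve_warmup_pct(n_epoch=10)    # min(0.2, 5/10)
long_run = resolve_warmup_pct(n_epoch=100)    # min(0.2, 5/100)
print(short_run, long_run)
```

With the defaults, a 10-epoch run would warm up for 20% of training (2 epochs), while a 100-epoch run would be capped at 5 epochs.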

warm_sched can be one of SchedCos (the default), SchedLin, SchedExp, SchedPoly, or a custom schedule based on a fastai annealer. SchedPoly must be passed as a partial function: partial(SchedPoly, power=0.5).
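The reason for the partial can be shown with stand-in schedules. The sketch below assumes the fit method calls the schedule with only a start and end value, so any extra argument such as power has to be bound beforehand. `SketchCos`, `SketchPoly`, and `build_warmup` are hypothetical names used for illustration; they are not fastai or library functions.

```python
import math
from functools import partial

def SketchCos(start, end):
    """Stand-in cosine schedule: returns a function of pos in [0, 1]."""
    return lambda pos: start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def SketchPoly(start, end, power):
    """Stand-in polynomial schedule: needs an extra power argument."""
    return lambda pos: start + (end - start) * pos ** power

def build_warmup(warm_sched, lr, div=25.0):
    # Assumption: the caller supplies only (start, end),
    # so additional arguments must already be bound.
    return warm_sched(lr / div, lr)

cos_warm = build_warmup(SketchCos, lr=1e-3)
poly_warm = build_warmup(partial(SketchPoly, power=0.5), lr=1e-3)

for pos in (0.0, 0.25, 1.0):
    print(f"{pos:.2f} cos={cos_warm(pos):.2e} poly={poly_warm(pos):.2e}")
```

Passing `SketchPoly` without the partial would raise a TypeError here, because power would be missing when the schedule is built.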



Learner.fit_cos_anneal

 Learner.fit_cos_anneal (n_epoch:int, lr:Optional[float]=None,
                         div:Union[int,float]=25.0,
                         div_final:Union[int,float]=100000.0,
                         warm_pct:float=0.2, warm_epoch:int=5,
                         warm_mode:str='auto',
                         warm_sched:Callable[...,fastai.callback.schedule._Annealer]=<function SchedCos>,
                         wd:Optional[float]=None,
                         cbs:Union[fastai.callback.core.Callback,Iterable[fastai.callback.core.Callback],fastcore.foundation.L,fastcore.basics.fastuple,NoneType]=None,
                         reset_opt:bool=False)

Fit self.model for n_epoch using a cosine annealing schedule with a max lr and optional warmup.

| | Type | Default | Details |
| --- | --- | --- | --- |
| n_epoch | int | | Number of epochs |
| lr | float \| None | None | Maximum learning rate |
| div | Number | 25.0 | Initial learning rate: lr/div |
| div_final | Number | 100000.0 | Final learning rate: lr/div_final |
| warm_pct | float | 0.2 | Learning rate warmup length as a percent of training steps |
| warm_epoch | int | 5 | Learning rate warmup length in epochs |
| warm_mode | str | auto | Warm up using ‘epoch’, ‘pct’, or the minimum of the two if ‘auto’ |
| warm_sched | Callable[…, _Annealer] | SchedCos | Learning rate warmup schedule |
| wd | float \| None | None | Weight decay, defaults to Optimizer weight decay |
| cbs | listified[Callback] \| None | None | Temporary Callbacks to apply during fit |
| reset_opt | bool | False | Reset Optimizer before fit |

To disable learning rate warmup, set warm_pct=0.

By default (warm_mode='auto'), fit_cos_anneal warms up the learning rate for the shorter of warm_pct percent of training steps or warm_epoch training epochs. Set warm_mode='pct' to always warm up for warm_pct percent of training steps, or set warm_mode='epoch' to always warm up for warm_epoch epochs.

warm_sched can be one of SchedCos (the default), SchedLin, SchedExp, SchedPoly, or a custom schedule based on a fastai annealer. SchedPoly must be passed as a partial function: partial(SchedPoly, power=0.5).
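To compare the schedule with and without warmup numerically, the sketch below evaluates an assumed two-phase shape at a few points in training. `cos_anneal_lr` is a hypothetical helper, not the library's implementation; it assumes a cosine warmup from lr/div to lr and, when warm_pct=0, that annealing starts directly from lr.

```python
import math

def cos_anneal(start, end, pos):
    """Cosine interpolation from start (pos=0) to end (pos=1)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2

def cos_anneal_lr(pct, lr=1e-3, div=25.0, div_final=1e5, warm_pct=0.2):
    """Hypothetical helper: learning rate at training fraction pct in [0, 1]."""
    if pct < warm_pct:
        # Optional warmup phase; skipped entirely when warm_pct is 0
        return cos_anneal(lr / div, lr, pct / warm_pct)
    # Cosine annealing over the remainder of training
    return cos_anneal(lr, lr / div_final, (pct - warm_pct) / (1 - warm_pct))

points = [0.0, 0.1, 0.2, 0.6, 1.0]
with_warmup = [cos_anneal_lr(p) for p in points]
without_warmup = [cos_anneal_lr(p, warm_pct=0.0) for p in points]

for p, w, n in zip(points, with_warmup, without_warmup):
    print(f"{p:.1f} warmup={w:.2e} no_warmup={n:.2e}")
```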

With optional learning rate warmup:

And without learning rate warmup:



Learner.fit_flat_varied

 Learner.fit_flat_varied (n_epoch:int, start_lr:Optional[float]=None,
                          div_final:Union[int,float]=100000.0,
                          pct_start:float=0.75, wd:Optional[float]=None,
                          next_lr:Union[float,Iterable[float],fastcore.foundation.L,fastcore.basics.fastuple,slice,NoneType]=None,
                          change_by:Union[int,Iterable[int],fastcore.foundation.L,fastcore.basics.fastuple,float,Iterable[float],NoneType]=None,
                          change_time:Union[int,Iterable[int],fastcore.foundation.L,fastcore.basics.fastuple,float,Iterable[float]]=1,
                          change_sched:Union[Callable[...,fastai.callback.schedule._Annealer],Iterable[Callable[...,fastai.callback.schedule._Annealer]],fastcore.foundation.L,fastcore.basics.fastuple,NoneType]=None,
                          cbs:Union[fastai.callback.core.Callback,Iterable[fastai.callback.core.Callback],fastcore.foundation.L,fastcore.basics.fastuple,NoneType]=None,
                          reset_opt:bool=False)

Fit self.model for n_epoch at flat start_lr, then change to flat next_lr at change_by, optionally with cosine annealing or custom change_sched over change_time. Final cosine annealing at pct_start.

| | Type | Default | Details |
| --- | --- | --- | --- |
| n_epoch | int | | Number of epochs |
| start_lr | float \| None | None | Initial learning rate |
| div_final | Number | 100000.0 | Final learning rate: lr/div_final |
| pct_start | float | 0.75 | Fraction of training at which the final learning rate cosine annealing starts |
| wd | float \| None | None | Weight decay, defaults to Optimizer weight decay |
| next_lr | listified[float] \| slice \| None | None | Learning rates to switch to at change_by. Must be same length as change_by |
| change_by | listified[int] \| listified[float] \| None | None | Epochs or percent of steps to switch to next_lr by. Must be same length as next_lr |
| change_time | listified[int] \| listified[float] | 1 | How long the change to next_lr takes, in epochs or percent of steps. If 0, the change is immediate. Must be same length as next_lr |
| change_sched | listified[Callable[…, _Annealer]] \| None | None | Schedule(s) for change. Defaults to SchedCos. Must be same length as next_lr |
| cbs | listified[Callback] \| None | None | Temporary Callbacks to apply during fit |
| reset_opt | bool | False | Reset Optimizer before fit |

change_sched can be one of SchedLin, SchedCos (the default), SchedExp, SchedPoly, or a custom schedule based on a fastai annealer. SchedPoly must be passed as a partial function: partial(SchedPoly, power=0.5).

Example Fit Flat Varied Schedules

Discriminative Linear Warmup:

learn.fit_flat_varied(4, slice(3e-5, 3e-3), next_lr=3e-3, change_by=1, change_time=1, change_sched=SchedLin)

Figure: discriminative linear warmup

Multiple Cosine Annealing:

learn.fit_flat_varied(15, 8e-3, next_lr=[6e-3, 4e-3], change_by=[4, 8], change_time=2)

Figure: multiple cosine annealing

Immediate Change:

learn.fit_flat_varied(10, 8e-3, next_lr=[6e-3, 4e-3], change_by=[0.25, 0.5], change_time=0)

Figure: immediate change
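To make the interaction of change_by and change_time concrete, the sketch below computes where each learning rate transition would fall as a fraction of training. `change_windows` is a hypothetical helper, and two points are assumptions drawn from the parameter descriptions rather than confirmed library behaviour: integer values are treated as epochs and floats as fractions of training, and each transition is taken to finish at change_by after starting change_time earlier.

```python
def change_windows(n_epoch, change_by, change_time):
    """Hypothetical helper: (start, end) training fractions per transition."""
    def to_pct(value):
        # Assumption: ints are epochs, floats are already fractions of training
        return value / n_epoch if isinstance(value, int) else float(value)

    # Broadcast a single change_time to every entry in change_by
    if not isinstance(change_time, (list, tuple)):
        change_time = [change_time] * len(change_by)

    windows = []
    for by, time in zip(change_by, change_time):
        end = to_pct(by)              # transition is complete by this point
        start = end - to_pct(time)    # and begins change_time earlier
        windows.append((start, end))
    return windows

# Arguments taken from the "Multiple Cosine Annealing" example
cosine = change_windows(15, change_by=[4, 8], change_time=2)
# Arguments taken from the "Immediate Change" example
immediate = change_windows(10, change_by=[0.25, 0.5], change_time=0)
print(cosine)
print(immediate)
```

Under these assumptions, the cosine example changes the learning rate over epochs 2 to 4 and 6 to 8, while the immediate example has zero-length windows at 25% and 50% of training.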