
Schedulers
Learner.fit_flat_warmup
```
Learner.fit_flat_warmup(
    n_epoch:int,
    lr:Optional[float]=None,
    div:Union[int,float]=25.0,
    div_final:Union[int,float]=100000.0,
    pct_start:float=0.75,
    warm_pct:float=0.2,
    warm_epoch:int=5,
    warm_mode:str='auto',
    warm_sched:Callable[...,_Annealer]=SchedCos,
    wd:Optional[float]=None,
    cbs:Listified[Callback]|None=None,
    reset_opt:bool=False
)
```
Fit self.model for n_epoch at a flat lr, beginning with a learning rate warmup and ending with cosine annealing.
| | Type | Default | Details |
|---|---|---|---|
| n_epoch | int | | Number of epochs |
| lr | float \| None | None | Maximum learning rate |
| div | Numeric | 25.0 | Initial learning rate: lr/div |
| div_final | Numeric | 100000.0 | Final learning rate: lr/div_final |
| pct_start | float | 0.75 | Start of final cosine annealing, as a percent of training |
| warm_pct | float | 0.2 | Learning rate warmup as a percent of training steps |
| warm_epoch | int | 5 | Learning rate warmup in epochs |
| warm_mode | str | 'auto' | Warmup using 'epoch', 'pct', or min of epoch/pct if 'auto' |
| warm_sched | Callable[..., _Annealer] | SchedCos | Learning rate warmup schedule |
| wd | float \| None | None | Weight decay, defaults to Optimizer weight decay |
| cbs | Listified[Callback] \| None | None | Temporary Callbacks to apply during fit |
| reset_opt | bool | False | Reset Optimizer before fit |
fit_flat_warmup is identical to fastai’s fit_flat_cos, except with an added learning rate warmup phase.
By default, fit_flat_warmup applies learning rate warmup for whichever is shorter: warm_pct percent of training steps or warm_epoch training epochs. Set warm_mode='pct' to warm up the learning rate for warm_pct percent of training steps, or set warm_mode='epoch' to warm up for warm_epoch epochs.
warm_sched can be one of SchedCos (the default), SchedLin, SchedExp, SchedPoly, or a custom fastai annealer-based schedule. SchedPoly must be passed as a partial function: partial(SchedPoly, power=0.5).
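A minimal usage sketch, assuming fastxtend is installed and imported via fastxtend.vision.all (the dataset and model below are illustrative, not part of the API):

```python
from fastai.vision.all import *
from fastxtend.vision.all import *  # assumed import; patches Learner with fit_flat_warmup

dls = ImageDataLoaders.from_folder(untar_data(URLs.MNIST_TINY))
learn = vision_learner(dls, resnet18, metrics=accuracy)

# Default 'auto' mode: warm up for min(20% of steps, 5 epochs), hold a flat
# lr, then cosine anneal over the final 25% of training (pct_start=0.75)
learn.fit_flat_warmup(20, lr=3e-3)

# Warm up linearly for exactly 2 epochs instead, ignoring warm_pct
learn.fit_flat_warmup(20, lr=3e-3, warm_mode='epoch', warm_epoch=2, warm_sched=SchedLin)
```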
Learner.fit_cos_anneal
```
Learner.fit_cos_anneal(
    n_epoch:int,
    lr:Optional[float]=None,
    div:Union[int,float]=25.0,
    div_final:Union[int,float]=100000.0,
    warm_pct:float=0.2,
    warm_epoch:int=5,
    warm_mode:str='auto',
    warm_sched:Callable[...,_Annealer]=SchedCos,
    wd:Optional[float]=None,
    cbs:Listified[Callback]|None=None,
    reset_opt:bool=False
)
```
Fit self.model for n_epoch using a cosine annealing schedule with a maximum lr and an optional warmup.
| | Type | Default | Details |
|---|---|---|---|
| n_epoch | int | | Number of epochs |
| lr | float \| None | None | Maximum learning rate |
| div | Numeric | 25.0 | Initial learning rate: lr/div |
| div_final | Numeric | 100000.0 | Final learning rate: lr/div_final |
| warm_pct | float | 0.2 | Learning rate warmup as a percent of training steps |
| warm_epoch | int | 5 | Learning rate warmup in epochs |
| warm_mode | str | 'auto' | Warmup using 'epoch', 'pct', or min of epoch/pct if 'auto' |
| warm_sched | Callable[..., _Annealer] | SchedCos | Learning rate warmup schedule |
| wd | float \| None | None | Weight decay, defaults to Optimizer weight decay |
| cbs | Listified[Callback] \| None | None | Temporary Callbacks to apply during fit |
| reset_opt | bool | False | Reset Optimizer before fit |
To disable learning rate warmup, set warm_pct=0.
By default, fit_cos_anneal applies learning rate warmup for whichever is shorter: warm_pct percent of training steps or warm_epoch training epochs. Set warm_mode='pct' to warm up the learning rate for warm_pct percent of training steps, or set warm_mode='epoch' to warm up for warm_epoch epochs.
warm_sched can be one of SchedCos (the default), SchedLin, SchedExp, SchedPoly, or a custom fastai annealer-based schedule. SchedPoly must be passed as a partial function: partial(SchedPoly, power=0.5).
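For instance, a short sketch of both variants, reusing a learn object like the one above:

```python
# Cosine annealing with the default warmup of min(20% of steps, 5 epochs)
learn.fit_cos_anneal(10, lr=1e-3)

# With warm_pct=0, warmup is skipped and training is pure cosine annealing
learn.fit_cos_anneal(10, lr=1e-3, warm_pct=0)
```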
With optional learning rate warmup:

And without learning rate warmup:

Learner.fit_flat_varied
```
Learner.fit_flat_varied(
    n_epoch:int,
    start_lr:Optional[float]=None,
    div_final:Union[int,float]=100000.0,
    pct_start:float=0.75,
    wd:Optional[float]=None,
    next_lr:Listified[float]|slice|None=None,
    change_by:Listified[int]|Listified[float]|None=None,
    change_time:Listified[int]|Listified[float]=1,
    change_sched:Listified[Callable[...,_Annealer]]|None=None,
    cbs:Listified[Callback]|None=None,
    reset_opt:bool=False
)
```
Fit self.model for n_epoch at a flat start_lr, then change to one or more flat next_lr values at change_by, optionally annealing each change with cosine annealing or a custom change_sched over change_time, ending with a final cosine annealing starting at pct_start.
| | Type | Default | Details |
|---|---|---|---|
| n_epoch | int | | Number of epochs |
| start_lr | float \| None | None | Initial learning rate |
| div_final | Numeric | 100000.0 | Final learning rate: lr/div_final |
| pct_start | float | 0.75 | Start of final cosine annealing, as a percent of training |
| wd | float \| None | None | Weight decay, defaults to Optimizer weight decay |
| next_lr | Listified[float] \| slice \| None | None | Learning rates to switch to at change_by. Must be same length as change_by |
| change_by | Listified[int] \| Listified[float] \| None | None | Epochs or percent of steps to switch to next_lr by. Must be same length as next_lr |
| change_time | Listified[int] \| Listified[float] | 1 | If greater than 0 (percent of steps or epochs), how long to change to next_lr. Must be same length as next_lr |
| change_sched | Listified[Callable[..., _Annealer]] \| None | None | Schedule(s) for change. Defaults to SchedCos. Must be same length as next_lr |
| cbs | Listified[Callback] \| None | None | Temporary Callbacks to apply during fit |
| reset_opt | bool | False | Reset Optimizer before fit |
change_sched can be one of SchedCos (the default), SchedLin, SchedExp, SchedPoly, or a custom fastai annealer-based schedule. SchedPoly must be passed as a partial function: partial(SchedPoly, power=0.5).
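For example, passing SchedPoly as a partial (a sketch; the learn object and hyperparameter values are illustrative):

```python
from functools import partial
from fastai.callback.schedule import SchedPoly

# Anneal the change with a square-root polynomial instead of cosine
learn.fit_flat_varied(10, 8e-3, next_lr=4e-3, change_by=0.5, change_time=0.25,
                      change_sched=partial(SchedPoly, power=0.5))
```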
Example Fit Flat Varied Schedules
Discriminative Linear Warmup:
```python
learn.fit_flat_varied(4, slice(3e-5, 3e-3), next_lr=3e-3, change_by=1, change_time=1, change_sched=SchedLin)
```
Multiple Cosine Annealing:
```python
learn.fit_flat_varied(15, 8e-3, next_lr=[6e-3, 4e-3], change_by=[4, 8], change_time=2)
```
Immediate Change:
```python
learn.fit_flat_varied(10, 8e-3, next_lr=[6e-3, 4e-3], change_by=[0.25, 0.5], change_time=0)
```
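After any of these fits, the realized schedule can be checked with fastai's Recorder, which records hyperparameter values each batch:

```python
# Plot the learning rate (and momentum) schedule recorded during the last fit
learn.recorder.plot_sched()
```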