A duplicate of fast.ai's `lr_find`, except it restores the dataloader and random state by default.

class LRFinder[source]

LRFinder(start_lr=1e-07, end_lr=10, num_it=100, stop_div=True, restore_state=True) :: ParamScheduler

Training with exponentially growing learning rate
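The schedule multiplies the learning rate by a constant factor each step, interpolating exponentially from `start_lr` to `end_lr` over `num_it` iterations. A minimal sketch of the math (the function name is hypothetical, not part of the library):

```python
def exponential_lr(pct, start_lr=1e-7, end_lr=10):
    # pct runs from 0 to 1 over the mock training run;
    # at 0 this returns start_lr, at 1 it returns end_lr
    return start_lr * (end_lr / start_lr) ** pct

# With num_it=100, the lr grows by a factor of (10 / 1e-7) ** (1 / 100) per step
lrs = [exponential_lr(i / 100) for i in range(100)]
```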

LRFinder.before_fit[source]

LRFinder.before_fit()

Initialize container for hyper-parameters and save the model & optimizer, optionally saving dataloader & random state
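fastai's `LRFinder` checkpoints the model and optimizer to a temporary file; the optional random-state snapshot could be captured along these lines (a hypothetical sketch, not fastxtend's exact code):

```python
import random
import numpy as np
import torch

# Hypothetical sketch: snapshot every RNG so it can be
# restored in after_fit when restore_state=True
saved_state = {
    'python': random.getstate(),
    'numpy':  np.random.get_state(),
    'torch':  torch.get_rng_state(),
}
if torch.cuda.is_available():
    saved_state['cuda'] = torch.cuda.get_rng_state_all()
```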

LRFinder.before_batch[source]

LRFinder.before_batch()

Set the proper hyper-parameters in the optimizer
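In plain PyTorch terms this amounts to writing the scheduled value into the optimizer before each step. A sketch under that assumption (fastai's `Optimizer` stores hyper-parameters in `hypers` rather than `param_groups`; `exponential_lr` is from the sketch above):

```python
def set_lr(optimizer, train_iter, num_it=100):
    # Apply the exponential schedule at the current iteration,
    # using the plain PyTorch param_groups idiom
    for group in optimizer.param_groups:
        group['lr'] = exponential_lr(train_iter / num_it)
```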

LRFinder.after_batch[source]

LRFinder.after_batch()

Record hyper-parameters of this batch and potentially stop training
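The divergence check is roughly the following; the factor of 4 follows fastai's implementation, and the exception class here is a stand-in for fastai's control-flow exception:

```python
class CancelFitException(Exception):
    "Stand-in for fastai's exception that ends training early"

def after_batch_check(smooth_loss, best_loss, train_iter, num_it=100, stop_div=True):
    # Stop if the smoothed loss diverges (4x the best loss seen, per fastai)
    if stop_div and smooth_loss > 4 * best_loss:
        raise CancelFitException()
    # Also stop once num_it batches have been recorded
    if train_iter >= num_it:
        raise CancelFitException()
```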

LRFinder.before_validate[source]

LRFinder.before_validate()

Skip the validation part of training
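Skipping is done with a control-flow exception, in the spirit of fastai's callback cancellation system (the class here is a stand-in):

```python
class CancelValidException(Exception):
    "Stand-in for fastai's exception that skips the validation loop"

def before_validate():
    # A learning-rate search only needs training batches,
    # so the validation pass is skipped outright
    raise CancelValidException()
```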

LRFinder.after_fit[source]

LRFinder.after_fit()

Save the hyper-parameters in the recorder if there is one and load the original model & optimizer, optionally restoring dataloader & random state
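Restoration mirrors the snapshot taken in `before_fit`; continuing that hypothetical sketch:

```python
import random
import numpy as np
import torch

def restore_random_state(saved_state):
    # Put every RNG back where it was before the mock training ran
    random.setstate(saved_state['python'])
    np.random.set_state(saved_state['numpy'])
    torch.set_rng_state(saved_state['torch'])
    if 'cuda' in saved_state:
        torch.cuda.set_rng_state_all(saved_state['cuda'])
```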

lr_find

Learner.lr_find[source]

Learner.lr_find(start_lr=1e-07, end_lr=10, num_it=100, stop_div=True, show_plot=True, suggest_funcs=valley, restore_state=True)

Launch a mock training to find a good learning rate and return suggestions based on `suggest_funcs` as a named tuple. Use `restore_state` to reset dataloaders and random state after running.
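For example, assuming `learn` is an existing `Learner` (`valley` and `slide` are suggestion functions shipped with fastai):

```python
from fastai.callback.schedule import valley, slide

# Returns a named tuple with one field per suggestion function
suggestions = learn.lr_find(suggest_funcs=(valley, slide))
print(f"valley: {suggestions.valley:.2e}, slide: {suggestions.slide:.2e}")
```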

Without `restore_state`, running `lr_find` advances both the random state and the dataloaders, behaving the same way as fast.ai's `lr_find`. This means the following two code blocks:

```python
with no_random():
    dls = get_dls()
    learn = Learner(dls, xresnet18(n_out=dls.c))

with no_random():
    learn.lr_find(restore_state=False)
    learn.fit_one_cycle(2, 3e-3)
```

and

```python
with no_random():
    dls = get_dls()
    learn = Learner(dls, xresnet18(n_out=dls.c))

with no_random():
    learn.fit_one_cycle(2, 3e-3)
```

will produce different training output.

While the default of `restore_state=True` prevents this from occurring, it has a potential downside: less variance in learning rate results, since every call to `lr_find` runs over the same first `num_it` items using the same random state. Without `no_random` set, most of the remaining variation appears to come from CUDA not running in deterministic mode.
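One consequence worth noting: with `restore_state=True`, back-to-back calls should produce identical suggestions, since each call replays the same batches under the same random state. A quick illustration (assuming `learn` from above):

```python
# Each call replays the same first num_it batches with the same RNG state,
# so the suggested learning rates should match exactly
first = learn.lr_find(show_plot=False)
second = learn.lr_find(show_plot=False)
assert first.valley == second.valley
```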