Progressive Resizing

A callback to add automatic progressive resizing of images during training

ProgressiveResize is inspired by MosaicML’s Progressive Resizing algorithm for Composer, which in turn was inspired by fastai’s manual progressive resizing.

[Figure: progressive resizing illustrated]

Progressive Resizing decreases model training time by first training on smaller images and then gradually increasing to the full image size. This allows training on more samples for the same compute budget, often leading to higher performance than training on full-sized images.


source

IncreaseMode

 IncreaseMode (value, names=None, module=None, qualname=None, type=None,
               start=1)

Increase mode for ProgressiveResize
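
IncreaseMode has two members, Batch ('batch', the default) and Epoch ('epoch'), both of which appear in the examples below:

ProgressiveResize(increase_mode=IncreaseMode.Batch)  # upsize gradually during training
ProgressiveResize(increase_mode=IncreaseMode.Epoch)  # upsize once before an epoch starts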


source

ProgressiveResize

 ProgressiveResize (initial_size:float|tuple[int,int]=0.5,
                    start:Union[int,float]=0.5,
                    finish:Union[int,float]=0.75, increase_by:int=4,
                    increase_mode:IncreaseMode=<IncreaseMode.Batch: 'batch'>,
                    resize_mode:str='bilinear', resize_valid:bool=True,
                    final_size:tuple[int,int]|None=None,
                    add_resize:bool=False, resize_targ:bool=False,
                    empty_cache:bool=False, verbose:bool=True,
                    logger_callback:str='wandb')

Basic class handling tweaks of the training loop by changing a Learner in various events

| | Type | Default | Details |
|:--|:--|:--|:--|
| initial_size | float or tuple[int, int] | 0.5 | Starting size to increase from. Image shape must be square |
| start | Number | 0.5 | Earliest upsizing epoch, as a percent of training time or an epoch (index 0) |
| finish | Number | 0.75 | Last upsizing epoch, as a percent of training time or an epoch (index 0) |
| increase_by | int | 4 | Progressively increase image size by increase_by, or the minimum increase per upsizing epoch |
| increase_mode | IncreaseMode | IncreaseMode.Batch | Increase image size by training percent or before an epoch starts |
| resize_mode | str | bilinear | PyTorch interpolate mode string for upsizing. Resets to the existing fastai DataLoader mode at final_size |
| resize_valid | bool | True | Apply progressive resizing to the valid dataset |
| final_size | tuple[int, int] or None | None | Final image size. Set if using non-fastai DataLoaders; automatically detected from a fastai DataLoader with batch_tfms |
| add_resize | bool | False | Add a separate resize step. Use for non-fastai DataLoaders or a fastai DataLoader without batch_tfms |
| resize_targ | bool | False | Apply the separate resize step to targets |
| empty_cache | bool | False | Call torch.cuda.empty_cache() before a resizing epoch. May prevent CUDA & Magma errors. Don’t use with multiple GPUs |
| verbose | bool | True | Print a summary of the progressive resizing schedule |
| logger_callback | str | wandb | Log image size to logger_callback using Callback.name if available |

Progressive Resizing initially trains on downsampled images, then gradually increases the image size to the full size for the remainder of training. This can significantly reduce training time at the possible expense of lower model performance, but it allows training on more samples within the same compute budget, often leading to increased performance. The model must be capable of handling variable image sizes.
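
A minimal sketch using the default schedule, assuming dls and model already exist: training starts at half the final image size, and upsizing begins at 50% and finishes at 75% of total training time.

# Train at half size, then progressively upsize to the full image size
learn = Learner(dls, model, metrics=Accuracy(), cbs=ProgressiveResize())
learn.fit_one_cycle(20, 3e-3)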

Important

ProgressiveResize should increase GPU throughput, which may cause other parts of the training pipeline to become a bottleneck. An easy way to increase fastai’s DataLoader throughput is to replace pillow with pillow-simd.

When testing Composer’s Progressive Resizing callback, MosaicML found:

In our experiments, Progressive Resizing improves the attainable tradeoffs between training speed and the final quality of the trained model. In some cases, it leads to slightly lower quality than the original model for the same number of training steps. However, Progressive Resizing increases training speed so much (via improved throughput during the early part of training) that it is possible to train for more steps, recover accuracy, and still complete training in less time.

ProgressiveResize modifies the fastai batch augmentation pipeline by changing the batch_tfms size during training. Specifically, it modifies AffineCoordTfm size, which is set by any rotate, warp, or resize batch augmentation, and/or RandomResizedCropGPU size. This modification prevents unnecessarily resizing images a second time on the GPU, speeding up the process. If there are no batch_tfms or if training with a non-fastai DataLoader, set add_resize=True to resize the batch on the GPU using PyTorch’s interpolate.
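
For example, a minimal sketch of the non-fastai case, where there are no batch_tfms to detect the final size from (dls and model are placeholder names):

# Without fastai batch_tfms, final_size cannot be detected automatically,
# so set it manually and let add_resize=True resize each batch on the GPU
# via PyTorch's interpolate
cbs = ProgressiveResize(add_resize=True, final_size=(224, 224))
learn = Learner(dls, model, cbs=cbs)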

Note

If training with ProgressiveResize results in CUDA or Magma errors, try setting increase_mode=IncreaseMode.Epoch and empty_cache=True. This will upsize once per epoch and call torch.cuda.empty_cache() before a resizing epoch. empty_cache=True may interfere with training multiple models on multi-GPU systems.
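
A sketch of that configuration:

# Upsize once per epoch, emptying the CUDA cache before each resizing epoch
cbs = ProgressiveResize(increase_mode=IncreaseMode.Epoch, empty_cache=True)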

Progressive resizing works best when the resize steps are small (8 pixels or fewer) and spread out over multiple epochs. For example, upsizing from 112 to 224 pixels with the default increase_by=4 results in 28 resize steps.

ProgressiveResize is fully compatible with CutMixUpAugment.

Example

In this example, a ResNet50 is trained for 20 and 25 epochs on Imagenette at an image size of 224 pixels on a SageMaker Studio Lab Tesla T4 instance. Due to the short training runs, start and finish are set to 0.2 and 0.8, respectively.

Despite increasing the image size earlier than the default hyperparameters, ProgressiveResize yields significant training time savings compared to training at full size. At a similar compute budget of roughly 14 minutes, progressive resizing results in 87.8% accuracy compared to 86.2% for full-sized training.

| Mode | Epochs | Time (Mins) | Accuracy |
|:--|:--|:--|:--|
| Full Size | 20 | 14.3 | 86.2% |
| Progressive Batch | 20 | 11.5 | 85.8% |
| Progressive Epoch | 20 | 10.5 | 85.6% |
| Progressive Batch | 25 | 13.9 | 87.8% |

Progressive Resizing

ProgressiveResize with the default increase_mode=IncreaseMode.Batch.

imagenette = untar_data(URLs.IMAGENETTE_320)

with less_random():
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                        splitter=GrandparentSplitter(valid_name='val'),
                        get_items=get_image_files, get_y=parent_label,
                        item_tfms=Resize(224),
                        batch_tfms=[*aug_transforms(), Normalize.from_stats(*imagenet_stats)])
    dls =  dblock.dataloaders(imagenette, bs=128, num_workers=num_cpus(), pin_memory=True)

    # ProgressiveResizeTest is for additional testing and shouldn't be used
    cbs = [ProgressiveResize(start=0.2, finish=0.8), ProgressiveResizeTest]
    learn = Learner(dls, resnet50(num_classes=dls.c), metrics=Accuracy(), cbs=cbs).to_channelslast()

    start = time.perf_counter()
    learn.fit_one_cycle(20, 3e-3)
    total = time.perf_counter() - start
    print(f'Total training time: {scale_time(total)}')
Progressively increase the initial image size of [112, 112] by 4 pixels every 0.4444 epochs for 28 resizes. 
Starting at epoch 4 and finishing at epoch 16 for a final training size of [224, 224].
epoch train_loss valid_loss accuracy time
0 2.122014 2.182029 0.227516 00:25
1 1.905108 2.192691 0.363057 00:25
2 1.690208 1.839630 0.487389 00:26
3 1.433910 1.606834 0.480764 00:28
4 1.358350 1.460259 0.555669 00:28
5 1.260721 1.169747 0.634904 00:27
6 1.092935 1.578764 0.542166 00:28
7 1.009341 1.090838 0.666242 00:29
8 0.927261 1.372288 0.608917 00:30
9 0.873940 1.454040 0.623185 00:32
10 0.843020 0.977987 0.692994 00:33
11 0.768998 0.878254 0.725605 00:35
12 0.697043 0.772750 0.757962 00:38
13 0.609062 0.762754 0.765095 00:40
14 0.556327 0.709908 0.787771 00:41
15 0.482874 0.562079 0.822675 00:43
16 0.428716 0.499115 0.837707 00:43
17 0.372526 0.468393 0.857580 00:42
18 0.335535 0.467523 0.856306 00:43
19 0.303651 0.456422 0.858344 00:42
Total training time: 687.7 s

ProgressiveResize with increase_mode=IncreaseMode.Epoch.

imagenette = untar_data(URLs.IMAGENETTE_320)

with less_random():
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                        splitter=GrandparentSplitter(valid_name='val'),
                        get_items=get_image_files, get_y=parent_label,
                        item_tfms=Resize(224),
                        batch_tfms=[*aug_transforms(), Normalize.from_stats(*imagenet_stats)])
    dls =  dblock.dataloaders(imagenette, bs=128, num_workers=num_cpus(), pin_memory=True)

    # ProgressiveResizeTest is for additional testing and shouldn't be used
    cbs = [ProgressiveResize(start=0.2, finish=0.8, increase_mode=IncreaseMode.Epoch), ProgressiveResizeTest]
    learn = Learner(dls, resnet50(num_classes=dls.c), metrics=Accuracy(), cbs=cbs).to_channelslast()

    start = time.perf_counter()
    learn.fit_one_cycle(20, 3e-3)
    total = time.perf_counter() - start
    print(f'Total training time: {scale_time(total)}')
Progressively increase the initial image size of [112, 112] by 14 pixels every 1 epoch for 8 resizes.
Starting at epoch 9 and finishing at epoch 16 for a final training size of [224, 224].
epoch train_loss valid_loss accuracy time
0 2.122014 2.182029 0.227516 00:25
1 1.905108 2.192691 0.363057 00:25
2 1.690208 1.839630 0.487389 00:25
3 1.433910 1.606834 0.480764 00:25
4 1.286285 3.116283 0.494522 00:25
5 1.200042 1.254510 0.626242 00:25
6 1.105527 1.396334 0.572484 00:25
7 1.106385 1.339236 0.564586 00:25
8 0.982895 0.885877 0.723822 00:25
9 0.953809 1.116866 0.637197 00:26
10 0.843665 1.307829 0.638471 00:27
11 0.802480 1.135976 0.643822 00:29
12 0.725916 0.727306 0.765350 00:30
13 0.624359 0.705979 0.771465 00:34
14 0.564622 0.571569 0.814267 00:37
15 0.488334 0.559648 0.821911 00:41
16 0.438291 0.491797 0.842548 00:42
17 0.381706 0.471103 0.853503 00:43
18 0.343013 0.457413 0.855032 00:42
19 0.308196 0.461995 0.855796 00:42
Total training time: 632.1 s

Normal Training

imagenette = untar_data(URLs.IMAGENETTE_320)

with less_random():
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                        splitter=GrandparentSplitter(valid_name='val'),
                        get_items=get_image_files, get_y=parent_label,
                        item_tfms=Resize(224),
                        batch_tfms=[*aug_transforms(),Normalize.from_stats(*imagenet_stats)])
    dls =  dblock.dataloaders(imagenette, bs=128, num_workers=num_cpus(), pin_memory=True)

    learn = Learner(dls, resnet50(num_classes=dls.c), metrics=Accuracy()).to_channelslast()

    start = time.perf_counter()
    learn.fit_one_cycle(20, 3e-3)
    total = time.perf_counter() - start
    print(f'Total training time: {scale_time(total)}')
epoch train_loss valid_loss accuracy time
0 2.016391 2.256088 0.241274 00:43
1 1.769153 3.686334 0.311083 00:43
2 1.529073 1.638564 0.471847 00:42
3 1.348534 1.439297 0.554650 00:43
4 1.181351 1.534368 0.530446 00:42
5 1.127666 1.914330 0.532994 00:43
6 1.025626 3.243782 0.461911 00:42
7 0.951557 1.247780 0.625987 00:42
8 0.887196 0.973012 0.694777 00:42
9 0.855779 1.063800 0.663694 00:42
10 0.786207 0.939424 0.721019 00:42
11 0.734812 0.883735 0.711083 00:42
12 0.650073 0.719170 0.777580 00:43
13 0.573434 0.835425 0.745732 00:42
14 0.527194 0.652105 0.806115 00:43
15 0.460990 0.541590 0.826752 00:42
16 0.402692 0.477356 0.851465 00:43
17 0.361319 0.463299 0.861146 00:43
18 0.321333 0.447778 0.863949 00:43
19 0.293849 0.447774 0.862420 00:42
Total training time: 860.5 s

Progressive Resizing with Normalized Compute Budget

ProgressiveResize with the default increase_mode=IncreaseMode.Batch.

imagenette = untar_data(URLs.IMAGENETTE_320)

with less_random():
    dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                        splitter=GrandparentSplitter(valid_name='val'),
                        get_items=get_image_files, get_y=parent_label,
                        item_tfms=Resize(224),
                        batch_tfms=[*aug_transforms(), Normalize.from_stats(*imagenet_stats)])
    dls =  dblock.dataloaders(imagenette, bs=128, num_workers=num_cpus(), pin_memory=True)

    # ProgressiveResizeTest is for additional testing and shouldn't be used
    cbs = [ProgressiveResize(start=0.2, finish=0.8), ProgressiveResizeTest]
    learn = Learner(dls, resnet50(num_classes=dls.c), metrics=Accuracy(), cbs=cbs).to_channelslast()

    start = time.perf_counter()
    learn.fit_one_cycle(25, 3e-3)
    total = time.perf_counter() - start
    print(f'Total training time: {scale_time(total)}')
Progressively increase the initial image size of [112, 112] by 4 pixels every 0.5556 epochs for 28 resizes. 
Starting at epoch 5 and finishing at epoch 20 for a final training size of [224, 224].
epoch train_loss valid_loss accuracy time
0 2.140554 2.323328 0.215541 00:25
1 1.924575 1.785451 0.406115 00:25
2 1.716462 2.180465 0.432357 00:25
3 1.513920 1.579820 0.514650 00:25
4 1.322239 3.293287 0.423949 00:25
5 1.261505 1.819019 0.530446 00:26
6 1.116962 1.576113 0.546497 00:26
7 1.065735 1.277910 0.610446 00:26
8 1.023897 1.102415 0.662420 00:27
9 0.937714 1.181165 0.627771 00:28
10 0.887837 1.122100 0.672102 00:28
11 0.846309 1.031982 0.667771 00:30
12 0.786773 0.884016 0.722293 00:31
13 0.693110 1.067082 0.687134 00:32
14 0.653226 0.762704 0.766624 00:33
15 0.614739 0.634725 0.803312 00:35
16 0.551392 0.664690 0.788790 00:37
17 0.509965 0.587787 0.812994 00:38
18 0.454482 0.551515 0.826242 00:40
19 0.391981 0.463849 0.853248 00:42
20 0.354720 0.431224 0.864968 00:43
21 0.304196 0.443742 0.863439 00:43
22 0.261721 0.404501 0.880764 00:43
23 0.248861 0.414708 0.877962 00:43
24 0.242153 0.407861 0.878981 00:43
Total training time: 832.8 s

Progressive Resizing Wandb Logging

try:
    import wandb

    @patch
    def _wandb_log_after_resize(self:ProgressiveResize):
        # Log the new training image size against the current wandb step
        size = _to_size(self.current_size, step=1)
        wandb.log({'progressive_resize_size': size[0]}, self.learn.wandb._wandb_step+1)
except ImportError:
    pass

Extend to other Loggers

To extend ProgressiveResize to other loggers, follow the Weights & Biases code above and create a patch that adds a _{Callback.name}_log_after_resize method, where Callback.name is the name of the logger callback.

Then, to use it, pass logger_callback='{Callback.name}' to ProgressiveResize.

ProgressiveResize sets its _log_after_resize method to f'_{self.logger_callback}_log_after_resize', which should match the patched method.

self._log_after_resize = getattr(self, f'_{self.logger_callback}_log_after_resize', noop)
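
For example, a hypothetical patch for fastai’s TensorBoardCallback, whose Callback.name is tensor_board. The writer attribute and train_iter counter come from fastai, but this exact patch is an illustrative sketch, not part of the library:

from fastcore.basics import patch

@patch
def _tensor_board_log_after_resize(self:ProgressiveResize):
    # TensorBoardCallback stores a SummaryWriter in `writer`; log the new
    # image size against the current training iteration
    self.learn.tensor_board.writer.add_scalar('progressive_resize_size',
                                              self.current_size[0], self.train_iter)

Then pass logger_callback='tensor_board' to ProgressiveResize.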