Progressive Resizing decreases model training time by training on smaller images, then gradually increasing to the full image size. This allows training on more samples for the same compute budget, often leading to higher performance than training on full-sized images.
Progressively increase the size of input images during training, starting from initial_size and ending at the valid image size or final_size.
| Parameter | Type | Default | Details |
|---|---|---|---|
| initial_size | float \| tuple[int, int] | 0.5 | Starting size to increase from. Image shape must be square |
| start | Numeric | 0.5 | Earliest upsizing epoch in percent of training time or epoch (index 0) |
| finish | Numeric | 0.75 | Last upsizing epoch in percent of training time or epoch (index 0) |
| increase_by | int | 4 | Progressively increase image size by increase_by, or minimum increase per upsizing epoch |
| increase_mode | IncreaseMode | IncreaseMode.Batch | Increase image size anytime during training or only before an epoch starts |
| resize_mode | str | bilinear | PyTorch interpolate mode string for upsizing. Resets to existing fastai DataLoader mode at final_size |
| resize_valid | bool | True | Apply progressive resizing to the valid dataset |
| final_size | tuple[int, int] \| None | None | Final image size. Set if using non-fastai DataLoaders; automatically detected from a fastai DataLoader with batch_tfms |
| add_resize | bool | False | Add a separate resize step. Use for non-fastai DataLoaders or a fastai DataLoader without batch_tfms |
| resize_targ | bool | False | Applies the separate resize step to targets |
| preallocate_bs | int \| None | None | Preallocation batch size. Set if the valid DataLoader has a larger batch size than the train DataLoader |
| preallocate | bool | True | Preallocate GPU memory with a full-size image. Can mitigate memory allocation slowdowns during training. If False, set final_size |
| empty_cache | bool | False | Call torch.cuda.empty_cache() before a resizing epoch. May prevent CUDA & Magma errors. Don't use with multiple GPUs |
| verbose | bool | True | Print a summary of the progressive resizing schedule |
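For illustration, here is a hedged sketch (assuming the fastxtend vision import) showing that start and finish accept either a percent of training or a 0-indexed epoch:

```python
from fastxtend.vision.all import *  # provides ProgressiveResize and IncreaseMode

# Default schedule: start at half the final image size, begin upsizing
# halfway through training, and reach full size at 75% of training
pr_pct = ProgressiveResize(initial_size=0.5, start=0.5, finish=0.75)

# The same schedule expressed as 0-indexed epochs for a 20-epoch run
pr_epochs = ProgressiveResize(initial_size=0.5, start=10, finish=15)
```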
Progressive Resizing initially trains on downsampled images, then gradually increases the image size to the full size for the remainder of training.
This can significantly reduce training time at the possible expense of lower model performance. However, Progressive Resizing allows training on more samples within the same compute budget, usually leading to increased performance.
The model must be capable of handling variable image sizes.
Tip: Increase DataLoader Throughput
ProgressiveResize should increase GPU throughput, which may cause other parts of the training pipeline to become a bottleneck. You can test for a DataLoader bottleneck using a fastxtend profiler.
In our experiments, Progressive Resizing improves the attainable tradeoffs between training speed and the final quality of the trained model. In some cases, it leads to slightly lower quality than the original model for the same number of training steps. However, Progressive Resizing increases training speed so much (via improved throughput during the early part of training) that it is possible to train for more steps, recover accuracy, and still complete training in less time.
ProgressiveResize modifies the fastai batch augmentation pipeline by changing the batch_tfms size during training. Specifically, it modifies AffineCoordTfm size, which is set by any rotate, warp, or resize batch augmentation, and/or RandomResizedCropGPU size. This modification prevents unnecessarily resizing images a second time on the GPU, speeding up the process. If there are no batch_tfms or if training without a fastai DataLoader or fastxtend Loader, set add_resize=True to resize the batch on the GPU using PyTorch’s interpolate.
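For the non-fastai case, a minimal sketch (the `(224, 224)` final size is an assumed value):

```python
from fastxtend.vision.all import *

# With no batch_tfms to modify, add a GPU resize step via PyTorch's
# interpolate and set final_size manually, since it cannot be
# detected from the DataLoader
resize_cb = ProgressiveResize(add_resize=True, final_size=(224, 224))
```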
Progressive Resizing works best when the resize steps are spread out over a significant portion of training.
Tip: Progressive Resizing & Small Datasets
When training on small datasets with ProgressiveResize, such as Imagenette, scale the batch mode increase amount above the default of 4 by setting increase_by to a larger value.
In the example section, increase_by=16 gives good results for training Imagenette for 20-25 epochs.
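For example:

```python
# Larger per-resize increase for a short run on a small dataset
pr = ProgressiveResize(increase_by=16)
```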
Important: Preallocating GPU Memory Uses a Validation Batch
Before training starts, ProgressiveResize performs a dry run to preallocate GPU memory required for training on full images. This can prevent stuttering during training due to memory allocation.
If the validation batch size is larger than the training batch size, set preallocate_bs to the training batch size so ProgressiveResize will preallocate the correct amount of memory.
Validation images must be the same size as the full training image size.
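A sketch with assumed batch sizes:

```python
# Assumed batch sizes: train bs of 64, valid bs of 128. Preallocate
# using the training batch size so the dry run reserves the amount
# of memory training will actually use
pr = ProgressiveResize(preallocate_bs=64)
```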
If training on older versions of PyTorch with ProgressiveResize results in CUDA or Magma errors, try setting increase_mode=IncreaseMode.Epoch and empty_cache=True.
This will upsize once per epoch and call torch.cuda.empty_cache() before a resizing epoch. empty_cache=True may interfere with training multiple models on multi-GPU systems.
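For example:

```python
# Upsize only at epoch boundaries and empty the CUDA cache before
# each resizing epoch (single-GPU training only)
pr = ProgressiveResize(increase_mode=IncreaseMode.Epoch, empty_cache=True)
```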
In this example1, an xresnext50 is trained for 20 & 25 epochs on Imagenette at an image size of 224 pixels. Due to the short training run and small dataset, ProgressiveResize in batch mode is set to increase_by=16.
ProgressiveResize yields significant training time savings compared to training at full size. At a normalized compute budget of roughly 6.5 minutes, Progressive Resizing results in 92.7% accuracy compared to 92% accuracy with full-sized training.
| Mode | Epochs | Time (Mins) | Accuracy |
|---|---|---|---|
| Full Size | 20 | 6.5 | 92.0% |
| Progressive Batch | 20 | 5.2 | 92.3% |
| Progressive Epoch | 20 | 5.2 | 91.8% |
| Progressive Batch | 25 | 6.5 | 92.7% |
Due to the regularization effect of training on different sized images, Progressive Resizing with increase_by=16 outperforms full-sized training by 0.3% in 20 percent less time on the same number of epochs2.
Progressive Resizing
There are two Progressive Resizing IncreaseMode types:

- increase_mode=IncreaseMode.Batch
- increase_mode=IncreaseMode.Epoch

This example will show both.
Batch Resizing
ProgressiveResize with the default increase_mode=IncreaseMode.Batch.
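The notebook's data setup isn't reproduced here; the sketch below is a minimal stand-in (the DataLoaders construction and learning rate are assumptions) that would produce a schedule like the output that follows:

```python
from fastai.vision.all import *
from fastxtend.vision.all import *

# Assumed Imagenette setup: items resized on the CPU, then randomly
# cropped to 224 on the GPU so ProgressiveResize can modify the
# batch_tfms size during training
dls = ImageDataLoaders.from_folder(
    untar_data(URLs.IMAGENETTE_320), valid='val',
    item_tfms=Resize(256),
    batch_tfms=[RandomResizedCropGPU(224), Normalize.from_stats(*imagenet_stats)],
    bs=64)

learn = Learner(dls, xresnext50(n_out=dls.c), metrics=accuracy,
                cbs=ProgressiveResize(increase_by=16))
learn.fit_one_cycle(20, 3e-3)
```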
Progressively increase the initial image size of [112, 112] by 16 pixels every 0.8333 epochs for 7 resizes.
Starting at epoch 10 and finishing at epoch 15 for a final training size of [224, 224].
Total training time: 311.8 s
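Epoch Resizing
ProgressiveResize with increase_mode=IncreaseMode.Epoch, which upsizes only before an epoch starts. A hedged sketch reusing the assumed setup above:

```python
# Same assumed setup as the batch example, upsizing once per epoch
learn = Learner(dls, xresnext50(n_out=dls.c), metrics=accuracy,
                cbs=ProgressiveResize(increase_by=16,
                                      increase_mode=IncreaseMode.Epoch))
learn.fit_one_cycle(20, 3e-3)
```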
Progressively increase the initial image size of [112, 112] by 28 pixels every 1 epoch for 4 resizes.
Starting at epoch 12 and finishing at epoch 15 for a final training size of [224, 224].
Total training time: 309.3 s
| epoch | train_loss | valid_loss | accuracy | time |
|---|---|---|---|---|
| 0 | 1.670977 | 1.883999 | 0.454268 | 00:13 |
| 1 | 1.403678 | 1.226364 | 0.710573 | 00:13 |
| 2 | 1.251599 | 1.446574 | 0.626497 | 00:13 |
| 3 | 1.136825 | 1.079901 | 0.768662 | 00:13 |
| 4 | 1.062239 | 1.250891 | 0.718981 | 00:13 |
| 5 | 1.006945 | 0.955187 | 0.820127 | 00:13 |
| 6 | 0.957047 | 1.238453 | 0.703439 | 00:13 |
| 7 | 0.910177 | 0.900485 | 0.842548 | 00:13 |
| 8 | 0.889880 | 0.963289 | 0.816560 | 00:13 |
| 9 | 0.860453 | 0.881689 | 0.849936 | 00:14 |
| 10 | 0.839285 | 0.916867 | 0.835159 | 00:13 |
| 11 | 0.817720 | 0.837916 | 0.866242 | 00:13 |
| 12 | 0.792356 | 0.844864 | 0.869045 | 00:14 |
| 13 | 0.780980 | 0.811714 | 0.878471 | 00:15 |
| 14 | 0.780541 | 0.870851 | 0.853758 | 00:17 |
| 15 | 0.766215 | 0.788430 | 0.888153 | 00:19 |
| 16 | 0.709244 | 0.788267 | 0.887134 | 00:19 |
| 17 | 0.649643 | 0.732368 | 0.915159 | 00:19 |
| 18 | 0.611495 | 0.717171 | 0.915414 | 00:19 |
| 19 | 0.590308 | 0.708605 | 0.918471 | 00:19 |
Progressive Batch: 25 Epochs
ProgressiveResize in batch mode trained for 25 epochs to normalize the compute budget against full size training.
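A sketch of the same assumed batch-mode setup extended to 25 epochs:

```python
# Batch-mode progressive resizing trained for 25 epochs to match the
# full size run's compute budget
learn = Learner(dls, xresnext50(n_out=dls.c), metrics=accuracy,
                cbs=ProgressiveResize(increase_by=16))
learn.fit_one_cycle(25, 3e-3)
```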
Progressively increase the initial image size of [112, 112] by 16 pixels every 1.042 epochs for 7 resizes.
Starting at epoch 12.5 and finishing at epoch 18.75 for a final training size of [224, 224].
Total training time: 390.5 s
| epoch | train_loss | valid_loss | accuracy | time |
|---|---|---|---|---|
| 0 | 1.670977 | 1.883999 | 0.454268 | 00:13 |
| 1 | 1.403678 | 1.226364 | 0.710573 | 00:13 |
| 2 | 1.251599 | 1.446574 | 0.626497 | 00:13 |
| 3 | 1.136825 | 1.079901 | 0.768662 | 00:13 |
| 4 | 1.062239 | 1.250891 | 0.718981 | 00:13 |
| 5 | 1.006945 | 0.955187 | 0.820127 | 00:13 |
| 6 | 0.957047 | 1.238453 | 0.703439 | 00:13 |
| 7 | 0.910177 | 0.900485 | 0.842548 | 00:13 |
| 8 | 0.889880 | 0.963289 | 0.816560 | 00:13 |
| 9 | 0.860453 | 0.881689 | 0.849936 | 00:13 |
| 10 | 0.839285 | 0.916867 | 0.835159 | 00:13 |
| 11 | 0.817720 | 0.837916 | 0.866242 | 00:13 |
| 12 | 0.806093 | 0.869887 | 0.850701 | 00:14 |
| 13 | 0.780977 | 0.805412 | 0.877962 | 00:14 |
| 14 | 0.766974 | 0.899283 | 0.839490 | 00:14 |
| 15 | 0.757296 | 0.811422 | 0.878726 | 00:15 |
| 16 | 0.736302 | 0.855174 | 0.853758 | 00:15 |
| 17 | 0.723357 | 0.769306 | 0.901401 | 00:16 |
| 18 | 0.714021 | 0.765733 | 0.895287 | 00:18 |
| 19 | 0.697444 | 0.736115 | 0.911847 | 00:19 |
| 20 | 0.663537 | 0.790711 | 0.881783 | 00:19 |
| 21 | 0.617896 | 0.712593 | 0.919745 | 00:19 |
| 22 | 0.583567 | 0.710089 | 0.918471 | 00:19 |
| 23 | 0.562689 | 0.685103 | 0.927643 | 00:19 |
| 24 | 0.551753 | 0.686037 | 0.926879 | 00:19 |
Footnotes
All models are trained on a GeForce 3080 Ti using PyTorch 1.13.1 and CUDA 11.7. Results may differ with other datasets, hardware, and across runs.
While Progressive Resizing can sometimes outperform a full-sized trained model in the same number of epochs, it is just as likely to perform worse, depending on setup.