FFCV Utilities
Utilities for the fastxtend FFCV integration
fastxtend provides the rgb_dataset_to_ffcv convenience method for easy FFCV image dataset creation. rgb_dataset_to_ffcv uses fastxtend’s Writer for dataset interoperability with FFCV.
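As a rough illustration of how the convenience method might be called, here is a minimal sketch. The import path `fastxtend.ffcv.utils`, the helper name `write_ffcv_dataset`, the file name `train.ffcv`, and the chosen argument values are assumptions for illustration only; the keyword names come from the signature documented on this page. The dataset is assumed to yield image and label pairs.

```python
from pathlib import Path

# Illustrative settings only; the keyword names match the documented signature.
FFCV_WRITE_KWARGS = dict(
    max_resolution=256,        # shrink images whose longest side exceeds 256
    write_mode="proportion",   # JPEG-compress a random subset of the images
    compress_probability=0.5,  # roughly half of the images get compressed
    jpeg_quality=90,
)


def write_ffcv_dataset(dataset, write_path="train.ffcv"):
    """Hypothetical helper: write `dataset` to `write_path` in FFCV format."""
    # Assumed import path; adjust it to match your fastxtend installation.
    from fastxtend.ffcv.utils import rgb_dataset_to_ffcv

    write_path = Path(write_path)
    rgb_dataset_to_ffcv(dataset=dataset, write_path=write_path, **FFCV_WRITE_KWARGS)
    return write_path
```

The import sits inside the helper so the module can be loaded without fastxtend or FFCV installed; in a real script a top-level import would be more usual.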
LabelField
LabelField (value, names=None, module=None, qualname=None, type=None, start=1)
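An enumeration selecting whether labels are stored in an FFCV IntField or FloatField. rgb_dataset_to_ffcv defaults to LabelField.int.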
rgb_dataset_to_ffcv
rgb_dataset_to_ffcv (dataset:Union[torch.utils.data.dataset.Dataset,fastai.data.core.Datasets], write_path:Union[str,pathlib.Path], max_resolution:Optional[int]=None, min_resolution:Optional[int]=None, write_mode:str='raw', smart_threshold:Optional[int]=None, compress_probability:float=0.5, jpeg_quality:float=90, interpolation=3, resample=<Resampling.LANCZOS: 1>, num_workers:int=-1, chunk_size:int=100, pillow_resize:bool=True, label_field:__main__.LabelField=<LabelField.int: 'int'>)
Writes a PyTorch- or fastai-compatible dataset into FFCV format at the file path write_path.
Parameter | Type | Default | Details
---|---|---|---
dataset | Dataset \| Datasets | | A PyTorch Dataset or single fastai Datasets
write_path | str \| Path | | File name to store dataset in FFCV beton format
max_resolution | int \| None | None | If maximum side length is greater than max_resolution, resize so maximum side length equals max_resolution
min_resolution | int \| None | None | If minimum side length is greater than min_resolution, resize so minimum side length equals min_resolution
write_mode | str | raw | RGBImageField write mode: ‘raw’, ‘jpg’, ‘smart’, ‘proportion’
smart_threshold | int \| None | None | If write_mode='smart', JPEG-compress an image when its RAW byte size is larger than smart_threshold
compress_probability | float | 0.5 | Probability with which image is JPEG-compressed
jpeg_quality | float | 90 | Quality to use for JPEG compression if write_mode='proportion'
interpolation | int | 3 | OpenCV interpolation flag for resizing images with OpenCV
resample | Resampling | Resampling.LANCZOS | Pillow resampling filter for resizing images with Pillow
num_workers | int | -1 | Number of workers to use. Defaults to number of CPUs
chunk_size | int | 100 | Size of chunks processed by each worker
pillow_resize | bool | True | Use Pillow to resize images instead of OpenCV
label_field | LabelField | LabelField.int | Use FFCV IntField or FloatField for labels
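The max_resolution and min_resolution rules in the table can be sketched in plain Python. This is not fastxtend's implementation: the function name `resized_shape` is hypothetical, and preserving the aspect ratio, rounding to the nearest pixel, and rejecting both limits at once are assumptions made for this sketch.

```python
def resized_shape(width, height, max_resolution=None, min_resolution=None):
    """Return the (width, height) an image would have after the resize rule."""
    if max_resolution is not None and min_resolution is not None:
        # Assumption of this sketch: only one limit is handled at a time.
        raise ValueError("pass either max_resolution or min_resolution, not both")

    if max_resolution is not None:
        # Compare the longest side against max_resolution.
        side, limit = max(width, height), max_resolution
    elif min_resolution is not None:
        # Compare the shortest side against min_resolution.
        side, limit = min(width, height), min_resolution
    else:
        return width, height  # no limit set, keep the original size

    if side <= limit:
        return width, height  # already within the limit, no resize

    # Assumption: scale both sides by the same factor to keep the aspect ratio.
    scale = limit / side
    return round(width * scale), round(height * scale)


print(resized_shape(1000, 500, max_resolution=250))  # (250, 125)
print(resized_shape(1000, 500, min_resolution=250))  # (500, 250)
```

In both cases images are only ever shrunk: an image whose relevant side is already at or below the limit is left unchanged.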
write_mode should be one of:
- ‘raw’: write uint8 pixel values
- ‘jpg’: compress to JPEG format
- ‘smart’: decide between saving pixel values and JPEG compressing based on image size
- ‘proportion’: JPEG compress a random subset of the data with size specified by the compress_probability argument
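The four write modes amount to a per-image decision about whether to JPEG-compress. The following is a minimal sketch of that decision under stated assumptions, not the FFCV or fastxtend code: `should_jpeg_compress` and `raw_nbytes` are hypothetical names, and treating a missing smart_threshold as "store raw" is an assumption of this sketch.

```python
import random


def should_jpeg_compress(write_mode, raw_nbytes, smart_threshold=None,
                         compress_probability=0.5, rng=random):
    """Decide whether one image is stored as JPEG (True) or raw pixels (False)."""
    if write_mode == "raw":
        return False  # always store uint8 pixel values
    if write_mode == "jpg":
        return True   # always compress to JPEG
    if write_mode == "smart":
        if smart_threshold is None:
            return False  # assumption: with no threshold, keep raw pixels
        # Compress only images whose raw size exceeds the threshold.
        return raw_nbytes > smart_threshold
    if write_mode == "proportion":
        # Compress a random subset; the expected fraction is compress_probability.
        return rng.random() < compress_probability
    raise ValueError(f"unknown write_mode: {write_mode!r}")
```

For example, `should_jpeg_compress("smart", raw_nbytes=2000, smart_threshold=1000)` selects JPEG, while the same call with `raw_nbytes=500` keeps the raw pixels.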