FFCV Utilities
Utilities for the fastxtend FFCV integration
fastxtend provides the rgb_dataset_to_ffcv convenience method for easy FFCV image dataset creation.
rgb_dataset_to_ffcv uses fastxtend’s Writer for dataset interoperability with FFCV.
LabelField
LabelField (value, names=None, module=None, qualname=None, type=None, start=1)
An enumeration.
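As a rough illustration of what this enumeration might look like, here is a minimal sketch using Python's standard `enum` module. The `int` member and its value `'int'` come from the documented default `<LabelField.int: 'int'>`; the `float` member is an assumption inferred from the `label_field` description (IntField or FloatField). `LabelFieldSketch` and `ffcv_field_name` are hypothetical names, not part of fastxtend.

```python
from enum import Enum


class LabelFieldSketch(Enum):
    """Illustrative stand-in for LabelField; not the fastxtend class itself."""
    int = "int"      # value matches the documented default <LabelField.int: 'int'>
    float = "float"  # assumed counterpart for FloatField labels


def ffcv_field_name(label_field):
    """Hypothetical helper: name of the FFCV field type a member would select."""
    return {"int": "IntField", "float": "FloatField"}[label_field.value]


if __name__ == "__main__":
    print(ffcv_field_name(LabelFieldSketch.int))
```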
rgb_dataset_to_ffcv
rgb_dataset_to_ffcv (dataset:Union[torch.utils.data.dataset.Dataset,fastai.data.core.Datasets], write_path:Union[str,pathlib.Path], max_resolution:Optional[int]=None, min_resolution:Optional[int]=None, write_mode:str='raw', smart_threshold:Optional[int]=None, compress_probability:float=0.5, jpeg_quality:float=90, interpolation=3, resample=<Resampling.LANCZOS: 1>, num_workers:int=-1, chunk_size:int=100, pillow_resize:bool=True, label_field:__main__.LabelField=<LabelField.int: 'int'>)
Writes a PyTorch- or fastai-compatible dataset to FFCV format at the file path write_path.
| Parameter | Type | Default | Details |
|---|---|---|---|
| dataset | Dataset \| Datasets | | A PyTorch Dataset or single fastai Datasets |
| write_path | str \| Path | | File name to store dataset in FFCV beton format |
| max_resolution | int \| None | None | If maximum side length is greater than max_resolution, resize so maximum side length equals max_resolution |
| min_resolution | int \| None | None | If minimum side length is greater than min_resolution, resize so minimum side length equals min_resolution |
| write_mode | str | raw | RGBImageField write mode: ‘raw’, ‘jpg’, ‘smart’, ‘proportion’ |
| smart_threshold | int \| None | None | If write_mode='smart', JPEG-compress the image if its raw byte size is larger than smart_threshold |
| compress_probability | float | 0.5 | Probability with which image is JPEG-compressed |
| jpeg_quality | float | 90 | Quality to use for JPEG compression if write_mode='proportion' |
| interpolation | int | 3 | OpenCV interpolation flag for resizing images with OpenCV |
| resample | Resampling | Resampling.LANCZOS | Pillow resampling filter for resizing images with Pillow |
| num_workers | int | -1 | Number of workers to use. Defaults to number of CPUs |
| chunk_size | int | 100 | Size of chunks processed by each worker |
| pillow_resize | bool | True | Use Pillow to resize images instead of OpenCV |
| label_field | LabelField | LabelField.int | Use FFCV IntField or FloatField for labels |
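The max_resolution and min_resolution rules in the table can be made concrete with a small size calculation. The sketch below is not fastxtend's implementation: `scaled_size` is a hypothetical helper, and both the rounding and the behavior when both limits are set (the smaller scale factor wins) are assumptions made for illustration.

```python
def scaled_size(width, height, max_resolution=None, min_resolution=None):
    """Return the (width, height) an image would be downscaled to.

    Hypothetical helper illustrating the table's rules; images that already
    satisfy the limits are returned unchanged.
    """
    scale = 1.0
    # Longest side exceeds max_resolution: shrink so it equals max_resolution.
    if max_resolution is not None and max(width, height) > max_resolution:
        scale = min(scale, max_resolution / max(width, height))
    # Shortest side exceeds min_resolution: shrink so it equals min_resolution.
    if min_resolution is not None and min(width, height) > min_resolution:
        scale = min(scale, min_resolution / min(width, height))
    # Assumption: when both limits apply, the smaller scale factor is used.
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    print(scaled_size(1000, 500, max_resolution=256))
    print(scaled_size(1000, 500, min_resolution=256))
```

Under these assumptions, max_resolution bounds the longest side while min_resolution bounds the shortest side, so min_resolution usually leaves a larger image for the same value.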
write_mode should be one of:
- ‘raw’: write uint8 pixel values
- ‘jpg’: compress to JPEG format
- ‘smart’: decide between saving pixel values and JPEG compressing based on image size
- ‘proportion’: JPEG compress a random subset of the data with size specified by the compress_probability argument
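The four modes above amount to a per-image choice between storing raw pixels and storing JPEG bytes. The following sketch shows one way that choice could be expressed. `choose_encoding` is a hypothetical helper, not a fastxtend or FFCV function, and storing the image raw when smart_threshold is None is an assumption.

```python
import random


def choose_encoding(write_mode, raw_nbytes, smart_threshold=None,
                    compress_probability=0.5, rng=random):
    """Return 'raw' or 'jpg' for a single image (illustrative only)."""
    if write_mode in ("raw", "jpg"):
        # Fixed modes: every image is stored the same way.
        return write_mode
    if write_mode == "smart":
        # Compress only images whose raw byte size exceeds the threshold.
        # Assumption: with no threshold set, the image is stored raw.
        if smart_threshold is not None and raw_nbytes > smart_threshold:
            return "jpg"
        return "raw"
    if write_mode == "proportion":
        # Compress each image independently with the given probability.
        return "jpg" if rng.random() < compress_probability else "raw"
    raise ValueError(f"unknown write_mode: {write_mode!r}")


if __name__ == "__main__":
    # A 256x256 RGB image occupies 256 * 256 * 3 raw bytes.
    nbytes = 256 * 256 * 3
    print(choose_encoding("smart", nbytes, smart_threshold=100_000))
```

Passing a seeded `random.Random` instance as `rng` makes the ‘proportion’ choice reproducible in this sketch.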