FFCV Writer

FFCV’s DatasetWriter Modified to Support fastxtend’s RGBImageField

fastxtend’s RGBImageField differs from FFCV’s RGBImageField only in how images are encoded. Decoding is identical for both.

This module modifies FFCV’s DatasetWriter so that, during dataset creation, fastxtend’s RGBImageField is written as FFCV’s RGBImageField. Both FFCV’s Loader and fastxtend’s Loader can then read the resulting RGBImageField without requiring a custom field.

For dataset interoperability, use fastxtend’s DatasetWriter when creating FFCV datasets using fastxtend’s RGBImageField.


DatasetWriter

 DatasetWriter (fname:str, fields:Mapping[str,ffcv.fields.base.Field],
                page_size:int=8388608, num_workers:int=-1)

Writes the given dataset to FFCV format (.beton). Supports indexable objects (e.g., PyTorch Datasets) and WebDataset shards.

Parameter    Type                                         Default  Details
fname        str
fields       typing.Mapping[str, ffcv.fields.base.Field]           Map from keys to Field objects (order matters!)
page_size    int                                          8388608  Page size used internally
num_workers  int                                          -1       Number of processes to use
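
The two numeric defaults are easier to read with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the fallback to the CPU count for num_workers below 1 is an assumption based on upstream FFCV behaviour, not something this page states.

```python
import os

DEFAULT_PAGE_SIZE = 8388608  # default `page_size` from the signature above


def page_size_mib(page_size: int = DEFAULT_PAGE_SIZE) -> float:
    """Convert a page size in bytes to mebibytes."""
    return page_size / (1024 * 1024)


def resolve_num_workers(num_workers: int = -1) -> int:
    """Assumed behaviour: values below 1 mean one process per CPU core."""
    if num_workers < 1:
        return os.cpu_count() or 1
    return num_workers
```

With these helpers, the default page size works out to 8 MiB (2**23 bytes), and an explicit positive num_workers is passed through unchanged.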

DatasetWriter.from_indexed_dataset

 DatasetWriter.from_indexed_dataset (dataset, indices:List[int]=None,
                                     chunksize=100,
                                     shuffle_indices:bool=False)

Read samples from an indexable dataset. See https://docs.ffcv.io/writing_datasets.html#indexable-dataset for sample usage.

Parameter        Type              Default  Details
dataset
indices          typing.List[int]  None     Use a subset of the dataset specified by indices.
chunksize        int               100      Size of chunks processed by each worker during conversion.
shuffle_indices  bool              False    Shuffle order of the dataset.
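
A minimal usage sketch for an indexable dataset follows. PairDataset and write_beton are hypothetical names introduced for illustration, and the import paths (fastxtend.ffcv.writer, fastxtend.ffcv.fields) are assumptions about the installed package layout. The imports sit inside the function so the sketch loads even where FFCV is not installed.

```python
from typing import List, Optional, Sequence, Tuple


class PairDataset:
    """Hypothetical indexable dataset: needs only __len__ and __getitem__."""

    def __init__(self, images: Sequence, labels: Sequence[int]):
        assert len(images) == len(labels)
        self.images, self.labels = images, labels

    def __len__(self) -> int:
        return len(self.labels)

    def __getitem__(self, idx: int) -> Tuple[object, int]:
        # Values come back in the same order as the writer's `fields` mapping.
        return self.images[idx], self.labels[idx]


def write_beton(dataset, fname: str, indices: Optional[List[int]] = None) -> None:
    """Write `dataset` (or the subset given by `indices`) to a .beton file."""
    # Assumption: these module paths match the installed fastxtend and FFCV.
    from fastxtend.ffcv.writer import DatasetWriter
    from fastxtend.ffcv.fields import RGBImageField
    from ffcv.fields import IntField

    writer = DatasetWriter(
        fname,
        {"image": RGBImageField(), "label": IntField()},  # order matters
        num_workers=-1,
    )
    writer.from_indexed_dataset(
        dataset, indices=indices, chunksize=100, shuffle_indices=False
    )
```

In practice the images would be PIL images (or whatever the chosen image field accepts), and a call such as write_beton(dataset, "train.beton", indices=[0, 2, 4]) would write only the selected samples.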

DatasetWriter.from_webdataset

 DatasetWriter.from_webdataset (shards:List[str], pipeline:Callable)

Read samples from a WebDataset-like format. See https://docs.ffcv.io/writing_datasets.html#webdataset for sample usage.
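
The pipeline argument is a callable that receives a WebDataset object and returns it with decoding steps applied. The sketch below assumes shards whose samples hold an image under a jpg, png, or jpeg key and an integer label under a cls key. Those keys, the write_from_shards helper, and the fastxtend import paths are assumptions for illustration.

```python
from typing import List


def pipeline(dataset):
    """Decode each sample and emit (image, label) tuples in field order."""
    # `decode` and `to_tuple` are WebDataset methods; the key names are
    # assumptions about how the shards were packed.
    return dataset.decode("pil").to_tuple("jpg;png;jpeg", "cls")


def write_from_shards(fname: str, shards: List[str]) -> None:
    """Write the samples stored in WebDataset `shards` to a .beton file."""
    # Imported lazily so the sketch loads without FFCV installed.
    # Assumption: these module paths match the installed fastxtend and FFCV.
    from fastxtend.ffcv.writer import DatasetWriter
    from fastxtend.ffcv.fields import RGBImageField
    from ffcv.fields import IntField

    writer = DatasetWriter(fname, {"image": RGBImageField(), "label": IntField()})
    writer.from_webdataset(shards, pipeline)
```

The tuple order returned by the pipeline should match the order of the fields mapping, for the same reason noted for the constructor.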