FFCV Writer
DatasetWriter Modified to Support fastxtend’s RGBImageField
fastxtend’s RGBImageField only differs in encoding from FFCV’s RGBImageField. Decoding is the same for both.
This module modifies FFCV’s DatasetWriter to write fastxtend’s RGBImageField as FFCV’s RGBImageField during dataset creation, so both FFCV’s Loader and fastxtend’s Loader can read the resulting dataset without requiring a custom field.
For dataset interoperability, use fastxtend’s DatasetWriter when creating FFCV datasets using fastxtend’s RGBImageField.
DatasetWriter
DatasetWriter (fname:str, fields:Mapping[str,ffcv.fields.base.Field], page_size:int=8388608, num_workers:int=-1)
Writes given dataset into FFCV format (.beton). Supports indexable objects (e.g., PyTorch Datasets) and webdataset.
| | Type | Default | Details |
|---|---|---|---|
| fname | str | | |
| fields | typing.Mapping[str, ffcv.fields.base.Field] | | Map from keys to Fields (order matters!) |
| page_size | int | 8388608 | Page size used internally |
| num_workers | int | -1 | Number of processes to use |
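As a minimal sketch of how the constructor might be called, the helper below builds a writer for (image, label) samples. The helper name `make_writer`, the field keys `'image'` and `'label'`, and the import paths `fastxtend.ffcv.fields` and `fastxtend.ffcv.writer` are assumptions made for illustration, not details confirmed by this page. The default `page_size` of 8388608 bytes is 8 MiB. In upstream FFCV, a `num_workers` value below 1 is understood to use all available cores, and this sketch assumes that behaviour carries over.

```python
# Default page size from the signature above: 8 MiB expressed in bytes.
PAGE_SIZE = 8 * 1024 * 1024


def make_writer(fname, num_workers=-1):
    """Build a DatasetWriter for (image, label) samples (hypothetical helper)."""
    # Imports are deferred so this sketch can be loaded without FFCV installed.
    # The fastxtend module paths below are assumptions about the package layout.
    from ffcv.fields import IntField
    from fastxtend.ffcv.fields import RGBImageField
    from fastxtend.ffcv.writer import DatasetWriter

    # Order matters: each sample must yield its values in this field order.
    fields = {
        'image': RGBImageField(),
        'label': IntField(),
    }
    return DatasetWriter(fname, fields, page_size=PAGE_SIZE, num_workers=num_workers)
```

Because `fields` is an ordered mapping, a dataset passed to this writer would need to return the image first and the label second.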
DatasetWriter.from_indexed_dataset
DatasetWriter.from_indexed_dataset (dataset, indices:List[int]=None, chunksize=100, shuffle_indices:bool=False)
Read dataset from an indexable dataset. See https://docs.ffcv.io/writing_datasets.html#indexable-dataset for sample usage.
| | Type | Default | Details |
|---|---|---|---|
| dataset | | | |
| indices | typing.List[int] | None | Use a subset of the dataset specified by indices. |
| chunksize | int | 100 | Size of chunks processed by each worker during conversion. |
| shuffle_indices | bool | False | Shuffle order of the dataset. |
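The sketch below illustrates what "indexable" could mean in practice: an object that supports `len()` and integer indexing, and returns one value per field in field order. The names `PairDataset` and `write_subset` are hypothetical, and the `writer` argument is assumed to be an already constructed DatasetWriter whose fields are an image followed by a label.

```python
class PairDataset:
    """Minimal indexable dataset: supports len() and integer indexing."""

    def __init__(self, images, labels):
        if len(images) != len(labels):
            raise ValueError("images and labels must have the same length")
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Return values in the same order as the writer's fields.
        return self.images[idx], self.labels[idx]


def write_subset(writer, dataset, every=2):
    """Write every `every`-th sample of `dataset` and return the indices used."""
    indices = list(range(0, len(dataset), every))
    writer.from_indexed_dataset(
        dataset,
        indices=indices,        # restrict the output to this subset
        chunksize=100,          # samples handed to each worker at a time
        shuffle_indices=False,  # keep the subset in its original order
    )
    return indices
```

Passing `indices=None` instead would write the whole dataset, matching the default in the signature.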
DatasetWriter.from_webdataset
DatasetWriter.from_webdataset (shards:List[str], pipeline:Callable)
Read from a webdataset-like format. See https://docs.ffcv.io/writing_datasets.html#webdataset for sample usage.
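A rough sketch of the two arguments follows. It assumes, as in the upstream FFCV example, that `pipeline` receives a webdataset object and returns it with decoding and tuple extraction applied. The shard naming scheme, the helper names `shard_paths` and `write_shards`, and the sample keys `'jpg'` and `'cls'` are illustrative assumptions about how the shards were produced.

```python
def shard_paths(folder, count):
    """List shard files using a hypothetical `shard-00000.tar` naming scheme."""
    return [f"{folder}/shard-{i:05d}.tar" for i in range(count)]


def pipeline(dataset):
    """Decode each sample and emit (image, label) tuples in field order."""
    # 'rgb8' asks webdataset to decode images as uint8 RGB arrays.
    # 'jpg' and 'cls' are assumed to be the keys stored in each shard.
    return dataset.decode('rgb8').to_tuple('jpg', 'cls')


def write_shards(writer, folder, count):
    """Feed `count` shards from `folder` to an existing DatasetWriter."""
    shards = shard_paths(folder, count)
    writer.from_webdataset(shards, pipeline)
    return shards
```

The tuple order produced by `pipeline` should match the order of the writer's `fields` mapping.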