FFCV Writer
DatasetWriter modified to support fastxtend’s RGBImageField
fastxtend’s RGBImageField differs from FFCV’s RGBImageField only in encoding; decoding is the same for both. This module modifies FFCV’s DatasetWriter to write fastxtend’s RGBImageField as FFCV’s RGBImageField during dataset creation, so both FFCV’s Loader and fastxtend’s Loader can read the field without requiring a custom field.
For dataset interoperability, use fastxtend’s DatasetWriter when creating FFCV datasets using fastxtend’s RGBImageField.
DatasetWriter
DatasetWriter (fname:str, fields:Mapping[str,ffcv.fields.base.Field], page_size:int=8388608, num_workers:int=-1)
Writes given dataset into FFCV format (.beton). Supports indexable objects (e.g., PyTorch Datasets) and webdataset.
Parameter | Type | Default | Details |
---|---|---|---|
fname | str | | |
fields | typing.Mapping[str, ffcv.fields.base.Field] | | Map from keys to Fields (order matters!) |
page_size | int | 8388608 | Page size used internally |
num_workers | int | -1 | Number of processes to use |
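A minimal sketch of constructing the writer with the parameters listed above. The import paths `fastxtend.ffcv.writer` and `fastxtend.ffcv.fields` are assumptions (this page does not state them), and the field names `image` and `label` are hypothetical. The imports are deferred into the function so the snippet loads even where FFCV is not installed.

```python
# Documented default for page_size: 8 * 1024 * 1024 == 8388608.
DEFAULT_PAGE_SIZE = 8 * 1024 * 1024

# Hypothetical field names. Their order must match the order of the
# values in each sample tuple that the dataset returns.
FIELD_ORDER = ("image", "label")


def make_writer(fname, num_workers=-1):
    """Build a DatasetWriter for (image, label) samples."""
    # Assumed import paths; adjust them if your fastxtend version differs.
    from fastxtend.ffcv.writer import DatasetWriter
    from fastxtend.ffcv.fields import RGBImageField
    from ffcv.fields import IntField

    # Dicts preserve insertion order, so this mapping fixes the field order.
    fields = {
        FIELD_ORDER[0]: RGBImageField(),
        FIELD_ORDER[1]: IntField(),
    }
    return DatasetWriter(
        fname,
        fields,
        page_size=DEFAULT_PAGE_SIZE,
        num_workers=num_workers,
    )
```

Calling `make_writer("train.beton")` (a hypothetical file name) would return a writer ready for `from_indexed_dataset` or `from_webdataset`.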
DatasetWriter.from_indexed_dataset
DatasetWriter.from_indexed_dataset (dataset, indices:List[int]=None, chunksize=100, shuffle_indices:bool=False)
Read dataset from an indexable dataset. See https://docs.ffcv.io/writing_datasets.html#indexable-dataset for sample usage.
Parameter | Type | Default | Details |
---|---|---|---|
dataset | | | |
indices | typing.List[int] | None | Use a subset of the dataset specified by indices. |
chunksize | int | 100 | Size of chunks processed by each worker during conversion. |
shuffle_indices | bool | False | Shuffle order of the dataset. |
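The parameters above can be sketched as follows, assuming `writer` is an already constructed DatasetWriter whose fields are an image followed by an integer label. `PathLabelDataset`, `subset_indices`, and `write_subset` are hypothetical helpers, and Pillow is assumed for image loading.

```python
class PathLabelDataset:
    """Hypothetical indexable dataset over (image_path, label) pairs."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Pillow is imported lazily so the class can be defined without it.
        from PIL import Image

        path, label = self.samples[idx]
        # Return values in the same order as the writer's fields.
        return Image.open(path).convert("RGB"), label


def subset_indices(n, step):
    """Pick every `step`-th index from a dataset of length n."""
    return list(range(0, n, step))


def write_subset(writer, dataset, step=2):
    """Write every `step`-th sample of `dataset` using `writer`."""
    writer.from_indexed_dataset(
        dataset,
        indices=subset_indices(len(dataset), step),
        chunksize=100,          # documented default
        shuffle_indices=False,  # documented default
    )
```

Passing `indices=None` instead would write the whole dataset, per the default in the signature.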
DatasetWriter.from_webdataset
DatasetWriter.from_webdataset (shards:List[str], pipeline:Callable)
Read from webdataset-like format. See https://docs.ffcv.io/writing_datasets.html#webdataset for sample usage.
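A sketch of a `pipeline` callable, following the pattern in the linked FFCV documentation: it receives the webdataset object and returns it with samples decoded and converted to tuples in field order. The shard keys (`jpg`, `png`, `jpeg`, `cls`), the choice of PIL decoding, and an already constructed `writer` are assumptions; `write_shards` is a hypothetical helper.

```python
def pipeline(dataset):
    """Decode images and emit (image, label) tuples in field order."""
    # "pil" decodes images to PIL objects. "jpg;png;jpeg" accepts any of the
    # three extensions for the image; "cls" is the assumed label key.
    return dataset.decode("pil").to_tuple("jpg;png;jpeg", "cls")


def write_shards(writer, shards):
    """Write all samples from the given shard paths using `writer`."""
    writer.from_webdataset(list(shards), pipeline)
```

The tuple order produced by `to_tuple` has to match the order of the writer's fields, for the same reason as with indexable datasets.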