fastxtend
Train fastai models faster with fastxtend’s fused optimizers, Progressive Resizing callback, integrated FFCV DataLoader, and integrated PyTorch Compile support.
Feature overview
Train Models Faster
- Drop-in fused optimizers, which are 21 to 293 percent faster than fastai native optimizers.
- Up to 75% optimizer memory savings with integrated bitsandbytes 8-bit optimizers.
- Increase GPU throughput and decrease training time with the Progressive Resizing callback.
- Use the highly optimized FFCV DataLoader, fully integrated with fastai.
- Integrated support for torch.compile via the Compile callbacks.
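The 75% figure for 8-bit optimizers is consistent with simple arithmetic on optimizer state size. A rough sketch, assuming an Adam-style optimizer that keeps two state tensors per parameter and ignoring the small quantization constants that 8-bit optimizers also store; the function name and the parameter count are illustrative and not part of fastxtend:

```python
def optimizer_state_bytes(n_params: int, states_per_param: int = 2, bits: int = 32) -> int:
    """Approximate bytes of optimizer state (e.g. Adam's two moment estimates)."""
    return n_params * states_per_param * bits // 8

n_params = 25_000_000  # hypothetical model size, chosen only for illustration
fp32_bytes = optimizer_state_bytes(n_params, bits=32)
int8_bytes = optimizer_state_bytes(n_params, bits=8)
savings = 1 - int8_bytes / fp32_bytes

print(f"32-bit state: {fp32_bytes / 2**20:.0f} MiB")
print(f"8-bit state:  {int8_bytes / 2**20:.0f} MiB")
print(f"savings:      {savings:.0%}")
```

The saving applies to optimizer state only; weights, gradients, and activations are unaffected in this estimate.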
General Features
- Fused implementations of modern optimizers, such as Adan, Lion, & StableAdam.
- Hugging Face Transformers compatibility with fastai.
- Flexible metrics which can log on train, valid, or both. Backwards compatible with fastai metrics.
- Easily use multiple losses and log each individual loss on train and valid.
- Multiple profilers for profiling training and identifying bottlenecks.
- A fast Exponential Moving Average callback for smoother training.
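As a reminder of what an exponential moving average of weights does, here is a plain-Python sketch of the update rule. It is not fastxtend's callback; the function name `ema_update`, the decay value, and the toy data are assumptions made for illustration:

```python
def ema_update(ema_weights, weights, decay=0.999):
    """One EMA step: ema <- decay * ema + (1 - decay) * weights."""
    return [decay * e + (1 - decay) * w for e, w in zip(ema_weights, weights)]

# Toy example: a single "weight" that jumps around between training steps
noisy_weights = [[1.0], [0.0], [1.0], [0.0], [1.0]]
ema = noisy_weights[0]
for w in noisy_weights[1:]:
    ema = ema_update(ema, w, decay=0.9)

print(ema)  # the averaged value moves far less than the raw weight does
```

A decay closer to 1 gives a smoother but slower-moving average.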
Vision
- Apply MixUp, CutMix, or Augmentations at once with CutMixUp or CutMixUpAugment.
- Additional image augmentations.
- Support for running fastai batch transforms on CPU.
- More attention and pooling modules.
- A flexible implementation of fastai’s XResNet.
Check out the documentation for additional splitters, callbacks, schedulers, utilities, and more.
Install
fastxtend is available on PyPI:
pip install fastxtend
fastxtend can be installed with task-specific dependencies for vision, ffcv, text, audio, or all:
pip install "fastxtend[all]"
To easily install most prerequisites for all fastxtend features, use Conda or Miniconda:
conda create -n fastxtend python=3.11 "pytorch>=2.1" torchvision torchaudio \
pytorch-cuda=12.1 fastai nbdev pkg-config libjpeg-turbo opencv tqdm psutil \
terminaltables numpy "numba>=0.57" librosa timm kornia rich typer wandb \
"transformers>=4.34" "tokenizers>=0.14" "datasets>=2.14" ipykernel ipywidgets \
"matplotlib<3.8" -c pytorch -c nvidia -c fastai -c huggingface -c conda-forge
conda activate fastxtend
pip install "fastxtend[all]"
Replace pytorch-cuda=12.1 with your preferred supported version of CUDA.
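After installing, it can be useful to confirm which packages actually ended up in the active environment. A minimal sketch using only Python's standard library; the helper name `installed_version` and the choice of packages to check are illustrative, not a fastxtend utility:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# PyPI distribution names (note that PyTorch is published as "torch")
for name in ("fastxtend", "fastai", "torch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```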
To create an editable development install:
git clone https://github.com/warner-benjamin/fastxtend.git
cd fastxtend
pip install -e ".[dev]"
Usage
Like fastai, fastxtend provides safe wildcard imports using Python’s __all__.
from fastai.vision.all import *
from fastxtend.vision.all import *
from fastxtend.ffcv.all import *
In general, import fastxtend after all fastai imports, as fastxtend modifies fastai. Any method modified by fastxtend is backwards compatible with the original fastai code.
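For readers unfamiliar with `__all__`: it lists the names a module exports under a wildcard import, which is what keeps such imports predictable. A small self-contained illustration using a throwaway in-memory module (the module name `demo_mod` and its contents are hypothetical and unrelated to fastxtend's actual modules):

```python
import sys
import types

# Build a throwaway module in memory so the example needs no files on disk
demo_mod = types.ModuleType("demo_mod")
exec(
    "__all__ = ['public_fn']\n"
    "def public_fn(): return 'exported'\n"
    "def helper(): return 'not exported'\n",
    demo_mod.__dict__,
)
sys.modules["demo_mod"] = demo_mod

# Run the wildcard import in a fresh namespace to see exactly what it brings in
namespace = {}
exec("from demo_mod import *", namespace)

print("public_fn" in namespace)  # True: listed in __all__
print("helper" in namespace)     # False: omitted from __all__
```

Only names listed in `__all__` reach the importing namespace, so internal helpers do not leak into user code.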
Examples
Use a fused ForEach optimizer:
Learner(..., opt_func=adam(foreach=True))
Or a bitsandbytes 8-bit optimizer:
Learner(..., opt_func=adam(eightbit=True))
Speed up image training using Progressive Resizing:
Learner(..., cbs=ProgressiveResize())
Log an accuracy metric on the training set as a smoothed metric and on the validation set as usual:
Learner(..., metrics=[Accuracy(log_metric=LogMetric.Train, metric_type=MetricType.Smooth),
                      Accuracy()])
Log multiple losses as individual metrics on train and valid:
mloss = MultiLoss(loss_funcs=[nn.MSELoss, nn.L1Loss],
                  weights=[1, 3.5], loss_names=['mse_loss', 'l1_loss'])
Learner(..., loss_func=mloss, metrics=RMSE(), cbs=MultiLossCallback)
Compile a model with torch.compile:
from fastxtend.callback import compiler
learn = Learner(...).compile()
Profile a fastai training loop:
from fastxtend.callback import simpleprofiler
learn = Learner(...).profile()
learn.fit_one_cycle(2, 3e-3)
Benchmark
To replicate the benchmark on your own machine, see the example scripts.