fastxtend
Train fastai models faster with fastxtend’s fused optimizers, Progressive Resizing callback, integrated FFCV DataLoader, and integrated PyTorch Compile support.
Feature overview
Train Models Faster
- Drop-in fused optimizers, which are 21 to 293 percent faster than fastai native optimizers.
- Up to 75% optimizer memory savings with integrated bitsandbytes 8-bit optimizers.
- Increase GPU throughput and decrease training time with the Progressive Resizing callback.
- Use the highly optimized FFCV DataLoader, fully integrated with fastai.
- Integrated support for torch.compile via the Compile callbacks.
General Features
- Fused implementations of modern optimizers, such as Adan, Lion, & StableAdam.
- Hugging Face Transformers compatibility with fastai.
- Flexible metrics which can log on train, valid, or both. Backwards compatible with fastai metrics.
- Easily use multiple losses and log each individual loss on train and valid.
- Multiple profilers for profiling training and identifying bottlenecks.
- A fast Exponential Moving Average callback for smoother training.
Vision
- Apply MixUp, CutMix, or Augmentations at once with CutMixUp or CutMixUpAugment.
- Additional image augmentations.
- Support for running fastai batch transforms on CPU.
- More attention and pooling modules.
- A flexible implementation of fastai’s XResNet.
Check out the documentation for additional splitters, callbacks, schedulers, utilities, and more.
Install
fastxtend is available on PyPI:
pip install fastxtend
fastxtend can be installed with task-specific dependencies for vision, ffcv, text, audio, or all:
pip install "fastxtend[all]"
To easily install most prerequisites for all fastxtend features, use Conda or Miniconda:
conda create -n fastxtend python=3.11 "pytorch>=2.1" torchvision torchaudio \
pytorch-cuda=12.1 fastai nbdev pkg-config libjpeg-turbo opencv tqdm psutil "numba>=0.57" librosa timm kornia rich typer wandb \
terminaltables numpy "transformers>=4.34" "tokenizers>=0.14" "datasets>=2.14" ipykernel ipywidgets \
"matplotlib<3.8" -c pytorch -c nvidia -c fastai -c huggingface -c conda-forge
conda activate fastxtend
pip install "fastxtend[all]"
Replace pytorch-cuda=12.1 with your preferred supported version of CUDA.
To create an editable development install:
git clone https://github.com/warner-benjamin/fastxtend.git
cd fastxtend
pip install -e ".[dev]"
Usage
Like fastai, fastxtend provides safe wildcard imports using Python’s __all__.
from fastai.vision.all import *
from fastxtend.vision.all import *
from fastxtend.ffcv.all import *
In general, import fastxtend after all fastai imports, as fastxtend modifies fastai. Any method modified by fastxtend is backwards compatible with the original fastai code.
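As a minimal sketch of why wildcard imports guarded by __all__ are considered safe, the following builds a throwaway module in memory and imports from it with `*`. The module name `demo_mod` and its two functions are hypothetical and are not part of fastai or fastxtend; the point is only that names missing from __all__ stay out of the importing namespace.

```python
import sys
import types

# Build a hypothetical in-memory module; `demo_mod` is illustrative only.
source = '''
__all__ = ["public_fn"]

def public_fn():
    return "exported"

def helper_fn():
    return "not exported"
'''
demo_mod = types.ModuleType("demo_mod")
exec(source, demo_mod.__dict__)
sys.modules["demo_mod"] = demo_mod

# A wildcard import copies only the names listed in __all__.
namespace = {}
exec("from demo_mod import *", namespace)

exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Without the __all__ line, `helper_fn` would also be pulled in, since wildcard imports otherwise copy every name that does not start with an underscore.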
Examples
Use a fused ForEach optimizer:
Learner(..., opt_func=adam(foreach=True))
Or a bitsandbytes 8-bit optimizer:
Learner(..., opt_func=adam(eightbit=True))
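The "up to 75%" optimizer memory figure from the feature overview is consistent with simple arithmetic, sketched below. The sketch assumes an Adam-style optimizer that keeps two state values per parameter, stored as 32-bit floats normally and as 8-bit values when quantized, and it ignores the small overhead of quantization constants. The function name and the 100M parameter count are illustrative, not taken from fastxtend or bitsandbytes.

```python
def optimizer_state_bytes(num_params, bits_per_value, states_per_param=2):
    """Bytes of optimizer state (assumes Adam-style: two states per parameter)."""
    return num_params * states_per_param * bits_per_value // 8

num_params = 100_000_000  # hypothetical model size
fp32_bytes = optimizer_state_bytes(num_params, 32)
int8_bytes = optimizer_state_bytes(num_params, 8)
savings = 1 - int8_bytes / fp32_bytes

print(f"32-bit state: {fp32_bytes / 1e6:.0f} MB")
print(f"8-bit state:  {int8_bytes / 1e6:.0f} MB")
print(f"savings:      {savings:.0%}")
```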
Speed up image training using Progressive Resizing:
Learner(..., cbs=ProgressiveResize())
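The general idea behind progressive resizing is to train on smaller images early and grow toward the full size, so early steps are cheaper. The sketch below is a hypothetical linear schedule written for illustration; `resize_schedule`, its parameters, and the rounding to a multiple of 32 are assumptions and do not describe how the ProgressiveResize callback actually chooses sizes.

```python
def resize_schedule(start_size, final_size, num_stages, multiple=32):
    """Image sizes growing linearly from start_size to final_size (illustrative only)."""
    if num_stages < 2:
        return [final_size]
    sizes = []
    for stage in range(num_stages):
        # Integer interpolation between the start and final sizes.
        size = start_size + (final_size - start_size) * stage // (num_stages - 1)
        # Round down to a multiple that is assumed to be hardware friendly.
        sizes.append(size // multiple * multiple)
    sizes[-1] = final_size  # always finish at the full training size
    return sizes

sizes = resize_schedule(start_size=128, final_size=224, num_stages=4)
print(sizes)
```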
Log accuracy on the training set as a smoothed metric and on the validation set as usual:
Learner(..., metrics=[Accuracy(log_metric=LogMetric.Train, metric_type=MetricType.Smooth),
                      Accuracy()])
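One common way to smooth a noisy per-batch value is a debiased exponential moving average, sketched below in plain Python. Whether MetricType.Smooth uses exactly this formula is an assumption; the `SmoothedValue` class, the `beta` value, and the batch accuracies are all hypothetical.

```python
class SmoothedValue:
    """Debiased exponential moving average of a per-batch value (illustrative only)."""

    def __init__(self, beta=0.98):
        self.beta = beta
        self.count = 0
        self.avg = 0.0

    def update(self, value):
        self.count += 1
        self.avg = self.beta * self.avg + (1 - self.beta) * value
        # Divide by (1 - beta**count) to correct the bias toward the initial zero.
        return self.avg / (1 - self.beta ** self.count)

smooth = SmoothedValue(beta=0.5)
batch_accuracy = [0.5, 1.0, 1.0]  # hypothetical per-batch accuracies
smoothed = [smooth.update(a) for a in batch_accuracy]
print(smoothed)
```

The debiasing term matters most in the first few batches, where an uncorrected average would be pulled toward zero.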
Log multiple losses as individual metrics on train and valid:
mloss = MultiLoss(loss_funcs=[nn.MSELoss, nn.L1Loss],
                  weights=[1, 3.5], loss_names=['mse_loss', 'l1_loss'])

Learner(..., loss_func=mloss, metrics=RMSE(), cbs=MultiLossCallback)
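Conceptually, combining several losses means computing each one, scaling it by its weight, and summing the results while keeping the individual values for logging. The sketch below illustrates that idea on plain Python lists. It is not MultiLoss itself: the helper functions are hypothetical, and recording the weighted (rather than unweighted) per-loss values is an assumption.

```python
def mse_loss(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def l1_loss(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def weighted_multi_loss(pred, target, loss_funcs, weights, loss_names):
    """Return the weighted total and a dict of the individual weighted losses."""
    parts = {
        name: weight * fn(pred, target)
        for fn, weight, name in zip(loss_funcs, weights, loss_names)
    }
    return sum(parts.values()), parts

pred, target = [1.0, 2.0, 4.0], [1.0, 3.0, 2.0]  # hypothetical values
total, parts = weighted_multi_loss(
    pred, target, [mse_loss, l1_loss], [1, 3.5], ["mse_loss", "l1_loss"]
)
print(total, parts)
```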
Compile a model with torch.compile:
from fastxtend.callback import compiler
learn = Learner(...).compile()
Profile a fastai training loop:
from fastxtend.callback import simpleprofiler
learn = Learner(...).profile()
learn.fit_one_cycle(2, 3e-3)
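To give a rough sense of what a step-level profiler measures, here is a minimal sketch that accumulates wall-clock time per named phase of a loop. The `StepTimer` class and the phase names are hypothetical, and the loop body is a stand-in for real data loading and training steps; fastxtend's profilers are not implemented this way as far as this document states.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class StepTimer:
    """Accumulates wall-clock time per named phase (illustrative only)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def phase(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

timer = StepTimer()
for _ in range(3):  # stand-in for three training batches
    with timer.phase("data"):
        batch = [i * 0.5 for i in range(1000)]
    with timer.phase("step"):
        _ = sum(x * x for x in batch)

for name in timer.totals:
    print(f"{name}: {timer.counts[name]} calls, {timer.totals[name] * 1e3:.3f} ms total")
```

Comparing the totals per phase is what points to a bottleneck, for example data loading taking longer than the optimizer step.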
Benchmark
To run the benchmark on your own machine, see the example scripts for replication details.