All fastxtend metrics are classes which inherit from fastai's Metric and run on Learner via a modified Recorder callback. There are three main metric types: AvgMetricX, AccumMetricX, and AvgSmoothMetricX. These correspond one-to-one with fastai's AvgMetric, AccumMetric, and AvgSmoothMetric.
Using a Metric
To use the accuracy metric, or any of the fastxtend metrics detailed below, create a Learner like normal (or a task-specific learner such as vision_learner, text_classifier_learner, etc.) and add the metric(s) to the metrics argument:
from fastai.vision.all import *
from fastxtend.vision.all import *
Learner(..., metrics=Accuracy())
Fastxtend metrics can be mixed with fastai metrics:
Learner(..., metrics=[accuracy, Accuracy()])
Fastxtend metrics can be logged during training, validation, or both by setting the log_metric argument to LogMetric.Train, LogMetric.Valid, or LogMetric.Both. The sole exception is AvgSmoothMetricX, which only logs during training. For example, to log accuracy during training, pass LogMetric.Train to log_metric:
Learner(..., metrics=Accuracy(log_metric=LogMetric.Train))
Non-scikit-learn metrics can have the metric type set via the metric_type argument to one of MetricType.Avg, MetricType.Accum, or MetricType.Smooth, corresponding to AvgMetricX, AccumMetricX, and AvgSmoothMetricX, respectively.
To log a smooth metric on the training set and a normal metric on the valid set:

Learner(...,
        metrics=[Accuracy(log_metric=LogMetric.Train, metric_type=MetricType.Smooth),
                 Accuracy()])
Fastxtend metrics also support custom names via the name argument:
Learner(..., metrics=Accuracy(name='metric_name'))
which will result in Accuracy logging under "metric_name" instead of the default "accuracy".
If a fastxtend metric is logged with multiple MetricTypes, the fastxtend Recorder will automatically deduplicate the metric names, unless the metric's name argument is set, in which case fastxtend will not deduplicate any metric names.
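For example, to sidestep automatic deduplication, explicit names can be given to each logged metric (a sketch; the names are illustrative):

Learner(...,
        metrics=[Accuracy(log_metric=LogMetric.Train, metric_type=MetricType.Smooth, name='train_accuracy'),
                 Accuracy(name='valid_accuracy')])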
Creating a Metric
AvgMetricX, AccumMetricX, and AvgSmoothMetricX all require func, a functional implementation of the metric. The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).
Fastxtend metrics can be logged during training, validation, or both by setting the log_metric argument to LogMetric.Train, LogMetric.Valid, or LogMetric.Both. The sole exception is AvgSmoothMetricX, which only computes during training.
AvgMetricX, AccumMetricX, and AvgSmoothMetricX will automatically recognize and pass any of func's unique arguments to func.
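As a sketch of this behavior, the hypothetical metric below takes an extra scale argument; since scale is not an AvgMetricX argument, it is recognized as belonging to the function and forwarded to it:

def scaled_error(inp, targ, scale=1.0):
    # hypothetical functional metric with an extra `scale` argument
    return scale * (inp != targ).float().mean()

def ScaledError(scale=1.0, dim_argmax=-1, log_metric=LogMetric.Valid, **kwargs):
    # `scale` is not an AvgMetricX argument, so it should be passed through to `scaled_error`
    return AvgMetricX(scaled_error, scale=scale, dim_argmax=dim_argmax, log_metric=log_metric, **kwargs)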
Metrics that cannot be averaged over batches, such as RMSE, should not use AvgMetricX via MetricType.Avg, as the mean of multiple batches of RMSE isn't equal to the RMSE of the whole dataset. For these metrics use AccumMetricX via MetricType.Accum.

For example, to create a simple averaged accuracy metric, define the functional implementation and wrap it in AvgMetricX:
def example_accuracy(inp, targ):
    return (inp == targ).float().mean()

def ExampleAccuracy(dim_argmax=-1, log_metric=LogMetric.Valid, **kwargs):
    return AvgMetricX(example_accuracy, dim_argmax=dim_argmax, log_metric=log_metric, **kwargs)
Alternatively, use the func_to_metric convenience method to create the metric:
def ExampleAccuracy(axis=-1, log_metric=LogMetric.Valid, **kwargs):
    return func_to_metric(example_accuracy, MetricType.Avg, True, axis=axis, log_metric=log_metric, **kwargs)
It is also possible to inherit directly from MetricX to create a fastxtend metric:
class ExampleAccuracy(MetricX):
    def __init__(self, dim_argmax=-1, log_metric=LogMetric.Valid, **kwargs):
        super().__init__(dim_argmax=dim_argmax, log_metric=log_metric, **kwargs)

    def reset(self): self.preds,self.targs = [],[]

    def accumulate(self, learn):
        super().accumulate(learn)
        self.preds.append(learn.to_detach(self.pred))
        self.targs.append(learn.to_detach(self.targ))

    @property
    def value(self):
        if len(self.preds) == 0: return
        preds,targs = torch.cat(self.preds),torch.cat(self.targs)
        return (preds == targs).float().mean()
If your MetricX has state depending on tensors, don't forget to store it on the CPU to avoid any potential memory leaks.

Additional Metrics Functionality
MetricX, and classes which inherit from MetricX such as AvgMetricX, AccumMetricX, and AvgSmoothMetricX, have optional helper functionality in MetricX.accumulate to assist in developing metrics.
For single-label classification problems, predictions need to be transformed with a softmax then an argmax before being compared to the targets. Since a softmax doesn't change the order of the numbers, only the argmax needs to be applied. Pass dim_argmax to have this done by MetricX (usually -1 will work pretty well). If the metric implementation requires probabilities and not predictions, use softmax=True.
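For instance, a metric that consumes class probabilities might be sketched like this, assuming softmax=True causes MetricX to pass softmaxed probabilities to the function:

def mean_true_class_prob(inp, targ):
    # hypothetical metric: average probability assigned to the correct class;
    # `inp` is assumed to be softmax probabilities of shape [batch, n_classes]
    return inp.gather(-1, targ.unsqueeze(-1)).mean()

def MeanTrueClassProb(log_metric=LogMetric.Valid, **kwargs):
    return AvgMetricX(mean_true_class_prob, softmax=True, log_metric=log_metric, **kwargs)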
For classification problems with multiple labels, or if targets are one-hot encoded, predictions may need to pass through a sigmoid (if it wasn't included in the model) and then be compared to a given threshold (to decide between 0 and 1). This is done by MetricX by passing sigmoid=True and/or a value for thresh.
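As an illustrative sketch, a multi-label accuracy could let MetricX apply the sigmoid and threshold, assuming the thresholded predictions arrive as 0/1 values:

def multilabel_accuracy(inp, targ):
    # `inp` is assumed to already be thresholded 0/1 predictions
    return (inp == targ.bool()).float().mean()

def MultiLabelAccuracy(thresh=0.5, sigmoid=True, log_metric=LogMetric.Valid, **kwargs):
    return AvgMetricX(multilabel_accuracy, thresh=thresh, sigmoid=sigmoid, log_metric=log_metric, **kwargs)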
AvgMetricX, AccumMetricX, and AvgSmoothMetricX have two additional arguments to assist in creating metrics: to_np and invert_arg.
For example, if using a functional metric from sklearn.metrics, predictions and labels will need to be converted to numpy arrays with to_np=True. Also, scikit-learn metrics adopt the convention y_true, y_preds, which is the opposite of fastai, so pass invert_arg=True to have AvgMetricX, AccumMetricX, and AvgSmoothMetricX do the inversion. Alternatively, use the skm_to_fastxtend convenience method to handle sklearn.metrics automatically.
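For instance, a scikit-learn metric could be wrapped by hand roughly like this (a sketch, not fastxtend's built-in implementation):

import sklearn.metrics as skm

def SkmF1(average='binary', log_metric=LogMetric.Valid, **kwargs):
    # to_np=True converts predictions and targets to numpy arrays, invert_arg=True
    # swaps them into scikit-learn's (y_true, y_pred) order, and `average` is
    # forwarded to skm.f1_score
    return AccumMetricX(skm.f1_score, to_np=True, invert_arg=True, dim_argmax=-1,
                        average=average, log_metric=log_metric, **kwargs)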
By default, MetricX only logs during validation. Metrics can individually be set to run during training, validation, or both by passing LogMetric.Train, LogMetric.Valid, or LogMetric.Both to log_metric, respectively.
Metrics can be simple averages (like accuracy) but sometimes their computation is a little more complex and can't be averaged over batches (like precision or recall), which is why we need a special AccumMetricX class for them. For simple functions that can be computed as averages over batches, use AvgMetricX; otherwise, inherit from MetricX and implement the reset, accumulate, and value methods, as in the ExampleAccuracy class above.

If your MetricX has state depending on tensors, don't forget to store it on the CPU to avoid any potential memory leaks.

For AvgMetricX, func is applied to each batch of predictions/targets and then averaged when the value attribute is asked for. The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).
Metrics that cannot be averaged over batches, such as RMSE, should not use AvgMetricX, as the mean of multiple batches of RMSE isn't equal to the RMSE of the whole dataset. For these metrics use AccumMetricX.

If using a functional metric from sklearn.metrics, predictions and labels will need to be converted to numpy arrays with to_np=True. Also, scikit-learn metrics adopt the convention y_true, y_preds, which is the opposite of fastai, so pass invert_arg=True to have AvgMetricX, AccumMetricX, and AvgSmoothMetricX do the inversion. Alternatively, use the skm_to_fastxtend convenience method to handle sklearn.metrics automatically. By default, fastxtend's scikit-learn metrics use AccumMetricX.
For AccumMetricX, func is only applied to the accumulated predictions/targets when the value attribute is asked for (so at the end of a validation/training phase, in use with Learner and its Recorder). The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).
func_to_metric is the quickest way to use a functional metric as a fastxtend metric. metric_type is one of MetricType.Avg, MetricType.Accum, or MetricType.Smooth, which sets the metric to use AvgMetricX, AccumMetricX, or AvgSmoothMetricX, respectively.
is_class indicates if you are in a classification problem or not. In this case:

- leaving thresh to None indicates it's a single-label classification problem and predictions will pass through an argmax over axis before being compared to the targets
- setting a value for thresh indicates it's a multi-label classification problem and predictions will pass through a sigmoid (can be deactivated with sigmoid=False) and be compared to thresh before being compared to the targets

If is_class=False, it indicates you are in a regression problem, and predictions are compared to the targets without being modified. In all cases, kwargs are extra keyword arguments passed to func.
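For example, reusing example_accuracy from above, a multi-label variant could be sketched by setting thresh, which marks the problem as multi-label so predictions pass through a sigmoid and are compared to thresh before reaching the function:

def ExampleMultiLabelAccuracy(thresh=0.5, log_metric=LogMetric.Valid, **kwargs):
    # is_class=True with a value for thresh marks this as multi-label classification
    return func_to_metric(example_accuracy, MetricType.Avg, True, thresh=thresh,
                          log_metric=log_metric, **kwargs)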
Metrics that cannot be averaged over batches, such as RMSE, should not use AvgMetricX via MetricType.Avg, as the mean of multiple batches of RMSE isn't equal to the RMSE of the whole dataset. For these metrics use AccumMetricX by setting metric_type to MetricType.Accum.

skm_to_fastxtend is the quickest way to use a scikit-learn metric as a fastxtend metric. It is the same as func_to_metric except it defaults to using AccumMetricX.
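As a sketch, assuming skm_to_fastxtend follows the same calling pattern as func_to_metric above (the wrapper name is illustrative and may duplicate a built-in fastxtend metric):

import sklearn.metrics as skm

def SkmBalancedAccuracy(axis=-1, log_metric=LogMetric.Valid, **kwargs):
    # defaults to AccumMetricX, so predictions/targets are accumulated across the
    # epoch before skm.balanced_accuracy_score is applied
    return skm_to_fastxtend(skm.balanced_accuracy_score, axis=axis, log_metric=log_metric, **kwargs)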
The DiceMulti method implements the "Averaged F1: arithmetic mean over harmonic means" described in this publication: https://arxiv.org/pdf/1911.03347.pdf
The BLEU metric was introduced in this article to come up with a way to evaluate the performance of translation models. It's based on the precision of n-grams in your prediction compared to your target. See the fastai NLP course BLEU notebook for a more detailed description of BLEU.
The smoothing used in the precision calculation is the same as in SacreBLEU, which in turn is "method 3" from the Chen & Cherry, 2014 paper.