Metrics Extended

A backwards compatible reimplementation of fastai metrics to increase usability and flexibility.

fastxtend’s Metrics Extended enhances fastai metrics while remaining backward compatible with them, so you can mix and match fastxtend and fastai metrics in the same Learner.

fastxtend metrics add the following features to fastai metrics:

  1. fastxtend metrics can independently log during training, validation, or both
  2. All fastxtend metrics support the activation handling of fastai.metrics.AccumMetric, inherited via MetricX
  3. fastxtend metrics add AvgSmoothMetricX, a metric version of fastai.learner.AvgSmoothLoss

There are three main metric types: AvgMetricX, AccumMetricX, and AvgSmoothMetricX. These correspond one-to-one with fastai.learner.AvgMetric, fastai.metrics.AccumMetric, and fastai.learner.AvgSmoothLoss. fastxtend metrics inherit from fastai.learner.Metric and run on fastai.learner.Learner via a modified fastai.learner.Recorder callback.

To jump to the fastxtend metrics reference, click here.

Important

To maintain backward compatibility with fastai metrics, importing the Metrics Extended module patches Recorder to add the new features. With the exception of error stack traces, this should be unnoticeable to end users.

Note

Documentation for metrics is lightly adapted from the fastai metrics documentation.

Using a Metric

To use the accuracy metric, or any of the fastxtend metrics detailed below, create a Learner as usual (or a task-specific learner such as vision_learner, text_classifier_learner, etc.) and add the metric(s) to the metrics argument:

from fastai.vision.all import *
from fastxtend.vision.all import *

Learner(..., metrics=Accuracy())

Fastxtend metrics can be mixed with fastai metrics:

Learner(..., metrics=[accuracy, Accuracy()])

Fastxtend metrics can be logged during training, validation, or both by setting the log_metric argument to LogMetric.Train, LogMetric.Valid, or LogMetric.Both. The sole exception is AvgSmoothMetricX which only logs during training.

Note

By default, a fastxtend metric will log during validation. fastai metrics can only log during validation.

To log a fastxtend metric during training pass LogMetric.Train to log_metric:

Learner(..., metrics=Accuracy(log_metric=LogMetric.Train))

Non-scikit-learn metrics can have their metric type set via the metric_type argument to one of MetricType.Avg, MetricType.Accum, or MetricType.Smooth, corresponding to AvgMetricX, AccumMetricX, and AvgSmoothMetricX, respectively.

To log a smooth metric on the training set and normal metric on the valid set:

Learner(..., 
        metrics=[Accuracy(log_metric=LogMetric.Train, metric_type=MetricType.Smooth), 
                 Accuracy()])

Fastxtend metrics also support custom names via the name argument:

Learner(..., metrics=Accuracy(name='metric_name'))

which will result in Accuracy logging under “metric_name” instead of the default “accuracy”.

If a fastxtend metric is logged with multiple MetricTypes, the fastxtend Recorder will automatically deduplicate the metric names, unless the metric’s name argument is set, in which case fastxtend will not deduplicate any metric names.
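For example, to log the same metric during both training and validation under explicit, non-deduplicated names (a minimal sketch; 'train_acc' and 'valid_acc' are arbitrary user-chosen names):

Learner(...,
        metrics=[Accuracy(log_metric=LogMetric.Train, name='train_acc'),
                 Accuracy(name='valid_acc')])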

Creating a Metric

AvgMetricX, AccumMetricX, and AvgSmoothMetricX all require func, which is a functional implementation of the metric. The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).

Fastxtend metrics can be logged during training, validation, or both by setting the log_metric argument to LogMetric.Train, LogMetric.Valid, or LogMetric.Both. The sole exception is AvgSmoothMetricX which only computes during training.

AvgMetricX, AccumMetricX, and AvgSmoothMetricX will automatically recognize any of func’s unique arguments and pass them through to func.

Important

Some metrics, like Root Mean Squared Error, will have incorrect results if passed to AvgMetricX via MetricType.Avg, as the mean of multiple batches of RMSE isn’t equal to the RMSE of the whole dataset. For these metrics use AccumMetricX via MetricType.Accum.

An example of creating a fastxtend metric from a functional implementation:

def example_accuracy(inp, targ):
    return (inp == targ).float().mean()

def ExampleAccuracy(dim_argmax=-1, log_metric=LogMetric.Valid, **kwargs):
    return AvgMetricX(example_accuracy, dim_argmax=dim_argmax, log_metric=log_metric, **kwargs)

Alternatively, use the func_to_metric convenience method to create the metric:

def ExampleAccuracy(axis=-1, log_metric=LogMetric.Valid, **kwargs):
    return func_to_metric(example_accuracy, MetricType.Avg, True, axis=axis, log_metric=log_metric, **kwargs)
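Because Root Mean Squared Error must be computed over the whole dataset, a functional RMSE should be wrapped with MetricType.Accum. A minimal sketch, where root_mean_squared_error and ExampleRMSE are illustrative names:

def root_mean_squared_error(inp, targ):
    return ((inp - targ)**2).mean().sqrt()

def ExampleRMSE(log_metric=LogMetric.Valid):
    # MetricType.Accum accumulates predictions and targets, so RMSE is
    # computed once over the whole dataset rather than averaged per batch
    return func_to_metric(root_mean_squared_error, MetricType.Accum, False,
                          log_metric=log_metric)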

It is also possible to inherit directly from MetricX to create a fastxtend metric.

class ExampleAccuracy(MetricX):
    def __init__(self, dim_argmax=-1, log_metric=LogMetric.Valid, **kwargs):
        super().__init__(dim_argmax=dim_argmax, log_metric=log_metric, **kwargs)

    def reset(self): self.preds,self.targs = [],[]

    def accumulate(self, learn):
        super().accumulate(learn)
        self.preds.append(learn.to_detach(self.pred))
        self.targs.append(learn.to_detach(self.targ))

    @property
    def value(self):
        if len(self.preds) == 0: return
        preds,targs = torch.cat(self.preds),torch.cat(self.targs)
        return (preds == targs).float().mean()
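
The custom metric is then passed to Learner like any other metric:

Learner(..., metrics=ExampleAccuracy())
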
Important

If your custom MetricX has state depending on tensors, don’t forget to store it on the CPU to avoid any potential memory leaks.

Additional Metrics Functionality

MetricX, and classes which inherit from MetricX such as AvgMetricX, AccumMetricX, and AvgSmoothMetricX, have optional helper functionality in MetricX.accumulate to assist in developing metrics.

For classification problems with a single label, predictions need to be transformed with a softmax then an argmax before being compared to the targets. Since a softmax doesn’t change the order of the numbers, only the argmax needs to be applied. Pass dim_argmax to have this done by MetricX (usually -1 will work pretty well). If the metric implementation requires probabilities rather than predictions, use softmax=True.

For classification problems with multiple labels, or if targets are one-hot encoded, predictions may need to pass through a sigmoid (if it wasn’t included in the model) and then be compared to a given threshold (to decide between 0 and 1). This is done by MetricX by passing sigmoid=True and/or a value for thresh.
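For instance, a multi-label accuracy could be built from a functional implementation by wrapping it with a threshold via the func_to_metric convenience method from above. A minimal sketch, assuming thresholded predictions arrive at func as booleans; multilabel_accuracy and ExampleAccuracyMulti are hypothetical names:

def multilabel_accuracy(inp, targ):
    # inp is assumed to arrive already sigmoided and thresholded by MetricX
    return (inp == targ.bool()).float().mean()

def ExampleAccuracyMulti(thresh=0.5, log_metric=LogMetric.Valid):
    # is_class=True with a thresh applies a sigmoid then thresholds predictions
    return func_to_metric(multilabel_accuracy, MetricType.Accum, True,
                          thresh=thresh, log_metric=log_metric)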

AvgMetricX, AccumMetricX, and AvgSmoothMetricX have two additional arguments to assist in creating metrics: to_np and invert_arg.

For example, if using a functional metric from sklearn.metrics, predictions and labels will need to be converted to numpy arrays with to_np=True. Also, scikit-learn metrics adopt the convention y_true, y_preds, which is the opposite of fastai’s, so pass invert_arg=True to have AvgMetricX, AccumMetricX, and AvgSmoothMetricX do the inversion. Alternatively, use the skm_to_fastxtend convenience method to handle sklearn.metrics automatically.
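For example, scikit-learn’s balanced_accuracy_score can be wrapped directly. A minimal sketch (ExampleBalancedAccuracy is an arbitrary name; skm_to_fastxtend handles to_np and argument inversion):

from sklearn.metrics import balanced_accuracy_score

ExampleBalancedAccuracy = skm_to_fastxtend(balanced_accuracy_score)
Learner(..., metrics=ExampleBalancedAccuracy)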


source

LogMetric

 LogMetric (value, names=None, module=None, qualname=None, type=None,
            start=1)

All logging types for MetricX


source

MetricType

 MetricType (value, names=None, module=None, qualname=None, type=None,
             start=1)

All types of MetricX


source

ActivationType

 ActivationType (value, names=None, module=None, qualname=None, type=None,
                 start=1)

All activation classes for MetricX


source

MetricX

 MetricX (dim_argmax=None, activation=<ActivationType.No: 1>, thresh=None,
          log_metric=None, name=None)

Blueprint for defining an extended metric with accumulate

Note

By default, a MetricX will only log during validation. Metrics can individually be set to log during training, validation, or both by passing LogMetric.Train, LogMetric.Valid, or LogMetric.Both to log_metric, respectively.

For classification problems with a single label, predictions need to be transformed with a softmax then an argmax before being compared to the targets. Since a softmax doesn’t change the order of the numbers, only the argmax needs to be applied. Pass dim_argmax to have this done by MetricX (usually -1 will work pretty well). If the metric implementation requires probabilities rather than predictions, use softmax=True.

For classification problems with multiple labels, or if targets are one-hot encoded, predictions may need to pass through a sigmoid (if it wasn’t included in the model) and then be compared to a given threshold (to decide between 0 and 1). This is done by MetricX by passing sigmoid=True and/or a value for thresh.

Metrics can be simple averages (like accuracy) but sometimes their computation is a little bit more complex and can’t be averaged over batches (like precision or recall), which is why we need a special AccumMetricX class for them. For simple functions that can be computed as averages over batches, we can use the class AvgMetricX, otherwise you’ll need to implement the following methods.

Note

If your custom MetricX has state depending on tensors, don’t forget to store it on the CPU to avoid any potential memory leaks.


source

MetricX.reset

 MetricX.reset ()

Reset inner state to prepare for new computation


source

MetricX.accumulate

 MetricX.accumulate (learn)

Store targs and preds from learn, using activation function and argmax as appropriate


source

MetricX.value

 MetricX.value ()

The value of the metric


source

MetricX.name

 MetricX.name ()

Name of the Metric, camel-cased and with Metric removed, or the custom name if provided


source

AvgMetricX

 AvgMetricX (func, to_np=False, invert_arg=False, dim_argmax=None,
             activation=<ActivationType.No: 1>, thresh=None,
             log_metric=None, name=None)

Average the values of func taking into account potential different batch sizes

func is applied to each batch of predictions/targets and then averaged when the value attribute is asked for. The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).

Important

Some metrics, like Root Mean Squared Error, will have incorrect results if passed to AvgMetricX, as the mean of multiple batches of RMSE isn’t equal to the RMSE of the whole dataset. For these metrics use AccumMetricX.

If using a functional metric from sklearn.metrics, predictions and labels will need to be converted to numpy arrays with to_np=True. Also, scikit-learn metrics adopt the convention y_true, y_preds, which is the opposite of fastai’s, so pass invert_arg=True to have AvgMetricX, AccumMetricX, and AvgSmoothMetricX do the inversion. Alternatively, use the skm_to_fastxtend convenience method to handle sklearn.metrics automatically.

By default, fastxtend’s scikit-learn metrics use AccumMetricX.


source

AccumMetricX

 AccumMetricX (func, to_np=False, invert_arg=False, flatten=True,
               dim_argmax=None, activation=<ActivationType.No: 1>,
               thresh=None, log_metric=None, name=None)

Stores predictions and targets on CPU in accumulate to perform final calculations with func.

func is only applied to the accumulated predictions/targets when the value attribute is asked for (so at the end of a validation/training phase, in use with Learner and its Recorder). The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).

If using a functional metric from sklearn.metrics, predictions and labels will need to be converted to numpy arrays with to_np=True. Also, scikit-learn metrics adopt the convention y_true, y_preds, which is the opposite of fastai’s, so pass invert_arg=True to have AvgMetricX, AccumMetricX, and AvgSmoothMetricX do the inversion. Alternatively, use the skm_to_fastxtend convenience method to handle sklearn.metrics automatically.

By default, fastxtend’s scikit-learn metrics use AccumMetricX.


source

AvgSmoothMetricX

 AvgSmoothMetricX (func, beta=0.98, to_np=False, invert_arg=False,
                   dim_argmax=None, activation=<ActivationType.No: 1>,
                   thresh=None, name=None)

Smooth average the values of func (exponentially weighted with beta). Only computed on training set.

func is applied to each batch of predictions/targets and the result is exponentially smoothed when the value attribute is asked for. The signature of func should be inp,targ (where inp are the predictions of the model and targ the corresponding labels).

If using a functional metric from sklearn.metrics, predictions and labels will need to be converted to numpy arrays with to_np=True. Also, scikit-learn metrics adopt the convention y_true, y_preds, which is the opposite of fastai’s, so pass invert_arg=True to have AvgMetricX, AccumMetricX, and AvgSmoothMetricX do the inversion. Alternatively, use the skm_to_fastxtend convenience method to handle sklearn.metrics automatically.


source

AvgLossX

 AvgLossX (dim_argmax=None, activation=<ActivationType.No: 1>,
           thresh=None, log_metric=None, name=None)

Average the losses taking into account potential different batch sizes


source

AvgSmoothLossX

 AvgSmoothLossX (beta=0.98)

Smooth average of the losses (exponentially weighted with beta)


source

ValueMetricX

 ValueMetricX (func, name=None, log_metric=None)

Use to include a pre-calculated metric value (for instance, one calculated in a Callback) that is returned by func
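
For example, a value computed in a Callback can be logged by passing a zero-argument func that reads it. A minimal sketch, assuming a hypothetical BatchCounter callback:

class BatchCounter(Callback):
    # hypothetical callback which counts training batches
    def before_fit(self): self.n_batches = 0
    def after_batch(self):
        if self.training: self.n_batches += 1

counter = BatchCounter()
Learner(..., cbs=counter,
        metrics=ValueMetricX(lambda: counter.n_batches, name='n_batches',
                             log_metric=LogMetric.Train))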

Metrics

Custom Metric Creation

fastxtend provides two convenience methods for creating custom metrics from functions: func_to_metric and skm_to_fastxtend.


source

func_to_metric

 func_to_metric (func, metric_type, is_class, thresh=None, axis=-1,
                 activation=None, log_metric=<LogMetric.Valid: 2>,
                 dim_argmax=None, name=None)

Convert func metric to a fastai metric

This is the quickest way to use a functional metric as a fastxtend metric.

metric_type is one of MetricType.Avg, MetricType.Accum, or MetricType.Smooth which set the metric to use AvgMetricX, AccumMetricX, or AvgSmoothMetricX, respectively.

is_class indicates if you are in a classification problem or not. In this case:

  - leaving thresh to None indicates it’s a single-label classification problem and predictions will pass through an argmax over axis before being compared to the targets
  - setting a value for thresh indicates it’s a multi-label classification problem and predictions will pass through a sigmoid (can be deactivated with sigmoid=False) and be compared to thresh before being compared to the targets

If is_class=False, it indicates you are in a regression problem, and predictions are compared to the targets without being modified. In all cases, kwargs are extra keyword arguments passed to func.
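For example, a single-label classification metric with is_class=True and thresh=None, reusing the example_accuracy function defined earlier on this page (a minimal sketch; predictions pass through an argmax over axis before reaching example_accuracy):

Learner(..., metrics=func_to_metric(example_accuracy, MetricType.Avg, True))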

Important

Some metrics, like Root Mean Squared Error, will have incorrect results if passed to AvgMetricX via MetricType.Avg, as the mean of multiple batches of RMSE isn’t equal to the RMSE of the whole dataset. For these metrics use AccumMetricX by setting metric_type to MetricType.Accum.


source

skm_to_fastxtend

 skm_to_fastxtend (func, is_class=True, thresh=None, axis=-1,
                   activation=None, log_metric=<LogMetric.Valid: 2>,
                   dim_argmax=None, name=None)

Convert func from sklearn.metrics to a fastai metric

This is the quickest way to use a scikit-learn metric as a fastxtend metric. It is the same as func_to_metric except it defaults to using AccumMetricX.

Single-label classification

Warning

All functions defined in this section are intended for single-label classification and targets that are not one-hot encoded. For multi-label problems or one-hot encoded targets, use the version suffixed with multi.

Warning

Many metrics in fastxtend are thin wrappers around scikit-learn functionality. However, scikit-learn metrics can handle Python lists of strings, amongst other things, whereas fastxtend metrics work with PyTorch, and thus require tensors. The arguments that are passed to metrics are after all transformations, such as categories being converted to indices, have occurred. This means that when you pass a label to a metric, for instance, you must pass indices, not strings. These can be converted with vocab.map_obj.


source

Accuracy

 Accuracy (axis=-1, metric_type=<MetricType.Avg: 1>,
           log_metric=<LogMetric.Valid: 2>, **kwargs)

Compute accuracy with targ when pred is bs * n_classes


source

ErrorRate

 ErrorRate (axis=-1, metric_type=<MetricType.Avg: 1>,
            log_metric=<LogMetric.Valid: 2>, **kwargs)

Compute 1 - accuracy with targ when pred is bs * n_classes


source

TopKAccuracy

 TopKAccuracy (k=5, axis=-1, metric_type=<MetricType.Avg: 1>,
               log_metric=<LogMetric.Valid: 2>, **kwargs)

Computes the Top-k accuracy (targ is in the top k predictions of inp)


source

APScoreBinary

 APScoreBinary (axis=-1, average='macro', pos_label=1, sample_weight=None,
                log_metric=<LogMetric.Valid: 2>, **kwargs)

Average Precision for single-label binary classification problems

See the scikit-learn documentation for more details.


source

BalancedAccuracy

 BalancedAccuracy (axis=-1, sample_weight=None, adjusted=False,
                   log_metric=<LogMetric.Valid: 2>, **kwargs)

Balanced Accuracy for single-label binary classification problems

See the scikit-learn documentation for more details.


source

BrierScore

 BrierScore (axis=-1, sample_weight=None, pos_label=None,
             log_metric=<LogMetric.Valid: 2>, **kwargs)

Brier score for single-label classification problems

See the scikit-learn documentation for more details.


source

CohenKappa

 CohenKappa (axis=-1, labels=None, weights=None, sample_weight=None,
             log_metric=<LogMetric.Valid: 2>, **kwargs)

Cohen kappa for single-label classification problems

See the scikit-learn documentation for more details.


source

F1Score

 F1Score (axis=-1, labels=None, pos_label=1, average='binary',
          sample_weight=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

F1 score for single-label classification problems

See the scikit-learn documentation for more details.


source

FBeta

 FBeta (beta, axis=-1, labels=None, pos_label=1, average='binary',
        sample_weight=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

FBeta score with beta for single-label classification problems

See the scikit-learn documentation for more details.


source

HammingLoss

 HammingLoss (axis=-1, sample_weight=None, log_metric=<LogMetric.Valid:
              2>, **kwargs)

Hamming loss for single-label classification problems

See the scikit-learn documentation for more details.


source

Jaccard

 Jaccard (axis=-1, labels=None, pos_label=1, average='binary',
          sample_weight=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

Jaccard score for single-label classification problems

See the scikit-learn documentation for more details.


source

Precision

 Precision (axis=-1, labels=None, pos_label=1, average='binary',
            sample_weight=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

Precision for single-label classification problems

See the scikit-learn documentation for more details.


source

Recall

 Recall (axis=-1, labels=None, pos_label=1, average='binary',
         sample_weight=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

Recall for single-label classification problems

See the scikit-learn documentation for more details.


source

RocAuc

 RocAuc (axis=-1, average='macro', sample_weight=None, max_fpr=None,
         multi_class='ovr', log_metric=<LogMetric.Valid: 2>, **kwargs)

Area Under the Receiver Operating Characteristic Curve for single-label multiclass classification problems

See the scikit-learn documentation for more details.


source

RocAucBinary

 RocAucBinary (axis=-1, average='macro', sample_weight=None, max_fpr=None,
               multi_class='raise', log_metric=<LogMetric.Valid: 2>,
               **kwargs)

Area Under the Receiver Operating Characteristic Curve for single-label binary classification problems

See the scikit-learn documentation for more details.


source

MatthewsCorrCoef

 MatthewsCorrCoef (sample_weight=None, log_metric=<LogMetric.Valid: 2>,
                   **kwargs)

Matthews correlation coefficient for single-label classification problems

See the scikit-learn documentation for more details.

Multi-label classification


source

AccuracyMulti

 AccuracyMulti (thresh=0.5, sigmoid=True, metric_type=<MetricType.Avg: 1>,
                log_metric=<LogMetric.Valid: 2>, **kwargs)

Compute accuracy when inp and targ are the same size.


source

APScoreMulti

 APScoreMulti (sigmoid=True, average='macro', pos_label=1,
               sample_weight=None, log_metric=<LogMetric.Valid: 2>,
               **kwargs)

Average Precision for multi-label classification problems

See the scikit-learn documentation for more details.


source

BrierScoreMulti

 BrierScoreMulti (thresh=0.5, sigmoid=True, sample_weight=None,
                  pos_label=None, log_metric=<LogMetric.Valid: 2>,
                  **kwargs)

Brier score for multi-label classification problems

See the scikit-learn documentation for more details.


source

F1ScoreMulti

 F1ScoreMulti (thresh=0.5, sigmoid=True, labels=None, pos_label=1,
               average='macro', sample_weight=None,
               log_metric=<LogMetric.Valid: 2>, **kwargs)

F1 score for multi-label classification problems

See the scikit-learn documentation for more details.


source

FBetaMulti

 FBetaMulti (beta, thresh=0.5, sigmoid=True, labels=None, pos_label=1,
             average='macro', sample_weight=None,
             log_metric=<LogMetric.Valid: 2>, **kwargs)

FBeta score with beta for multi-label classification problems

See the scikit-learn documentation for more details.


source

HammingLossMulti

 HammingLossMulti (thresh=0.5, sigmoid=True, labels=None,
                   sample_weight=None, log_metric=<LogMetric.Valid: 2>,
                   **kwargs)

Hamming loss for multi-label classification problems

See the scikit-learn documentation for more details.


source

JaccardMulti

 JaccardMulti (thresh=0.5, sigmoid=True, labels=None, pos_label=1,
               average='macro', sample_weight=None,
               log_metric=<LogMetric.Valid: 2>, **kwargs)

Jaccard score for multi-label classification problems

See the scikit-learn documentation for more details.


source

MatthewsCorrCoefMulti

 MatthewsCorrCoefMulti (thresh=0.5, sigmoid=True, sample_weight=None,
                        log_metric=<LogMetric.Valid: 2>, **kwargs)

Matthews correlation coefficient for multi-label classification problems

See the scikit-learn documentation for more details.


source

PrecisionMulti

 PrecisionMulti (thresh=0.5, sigmoid=True, labels=None, pos_label=1,
                 average='macro', sample_weight=None,
                 log_metric=<LogMetric.Valid: 2>, **kwargs)

Precision for multi-label classification problems

See the scikit-learn documentation for more details.


source

RecallMulti

 RecallMulti (thresh=0.5, sigmoid=True, labels=None, pos_label=1,
              average='macro', sample_weight=None,
              log_metric=<LogMetric.Valid: 2>, **kwargs)

Recall for multi-label classification problems

See the scikit-learn documentation for more details.


source

RocAucMulti

 RocAucMulti (sigmoid=True, average='macro', sample_weight=None,
              max_fpr=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

Area Under the Receiver Operating Characteristic Curve for multi-label binary classification problems

See the scikit-learn documentation for more details.

Regression


source

MSE

 MSE (metric_type=<MetricType.Avg: 1>, log_metric=<LogMetric.Valid: 2>,
      **kwargs)

Mean squared error between inp and targ.


source

RMSE

 RMSE (log_metric=<LogMetric.Valid: 2>, **kwargs)

Root mean squared error between inp and targ.


source

MAE

 MAE (metric_type=<MetricType.Avg: 1>, log_metric=<LogMetric.Valid: 2>,
      **kwargs)

Mean absolute error between inp and targ.


source

MSLE

 MSLE (metric_type=<MetricType.Avg: 1>, log_metric=<LogMetric.Valid: 2>,
       **kwargs)

Mean squared logarithmic error between inp and targ.


source

ExpRMSE

 ExpRMSE (log_metric=<LogMetric.Valid: 2>, **kwargs)

Root mean square percentage error of the exponential of predictions and targets


source

ExplainedVariance

 ExplainedVariance (sample_weight=None, log_metric=<LogMetric.Valid: 2>,
                    **kwargs)

Explained variance between predictions and targets

See the scikit-learn documentation for more details.


source

R2Score

 R2Score (sample_weight=None, log_metric=<LogMetric.Valid: 2>, **kwargs)

R2 score between predictions and targets

See the scikit-learn documentation for more details.


source

PearsonCorrCoef

 PearsonCorrCoef (dim_argmax=None, log_metric=<LogMetric.Valid: 2>,
                  **kwargs)

Pearson correlation coefficient for regression problems

See the scipy documentation for more details.


source

SpearmanCorrCoef

 SpearmanCorrCoef (dim_argmax=None, axis=0, nan_policy='propagate',
                   log_metric=<LogMetric.Valid: 2>, **kwargs)

Spearman correlation coefficient for regression problems

See the scipy documentation for more details.

Segmentation


source

ForegroundAcc

 ForegroundAcc (bkg_idx=0, axis=1, metric_type=<MetricType.Avg: 1>,
                log_metric=<LogMetric.Valid: 2>, **kwargs)

Computes non-background accuracy for multiclass segmentation


source

Dice

 Dice (axis=1, log_metric=<LogMetric.Valid: 2>, **kwargs)

Dice coefficient metric for binary target in segmentation


source

DiceMulti

 DiceMulti (axis=1, log_metric=<LogMetric.Valid: 2>, **kwargs)

Averaged Dice metric (Macro F1) for multiclass target in segmentation

The DiceMulti method implements the “Averaged F1: arithmetic mean over harmonic means” described in this publication: https://arxiv.org/pdf/1911.03347.pdf


source

JaccardCoeff

 JaccardCoeff (axis=1, log_metric=<LogMetric.Valid: 2>, **kwargs)

Implementation of the Jaccard coefficient that is lighter in RAM

NLP


source

CorpusBLEUMetric

 CorpusBLEUMetric (vocab_sz=5000, axis=-1, log_metric=<LogMetric.Valid:
                   2>, name='CorpusBLEU', **kwargs)

BLEU Metric calculated over the validation corpus

The BLEU metric was introduced in this article as a way to evaluate the performance of translation models. It’s based on the precision of n-grams in your prediction compared to your target. See the fastai NLP course BLEU notebook for a more detailed description of BLEU.

The smoothing used in the precision calculation is the same as in SacreBLEU, which in turn is “method 3” from the Chen & Cherry, 2014 paper.


source

Perplexity

 Perplexity (dim_argmax=None, activation=<ActivationType.No: 1>,
             thresh=None, log_metric=None, name=None)

Perplexity (exponential of cross-entropy loss) for Language Models


source

LossMetric

 LossMetric (func, to_np=False, invert_arg=False, dim_argmax=None,
             activation=<ActivationType.No: 1>, thresh=None,
             log_metric=None, name=None)

Create a metric from loss_func.attr named nm


source

LossMetrics

 LossMetrics (attrs, nms=None)

List of LossMetric for each of attrs and nms
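
For example, if a custom loss function exposes its components as attributes (here the attribute names loss_recon and loss_kl are hypothetical), they can be logged as metrics:

Learner(..., metrics=LossMetrics(['loss_recon', 'loss_kl']))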

Logging

Metrics Extended is compatible with logging to Weights and Biases and TensorBoard using fastai’s WandbCallback and TensorBoardCallback.