Audio Data

Audio DataBlocks and show_batch

Spectrogram

 Spectrogram (n_fft:listified[int]=1024,
              win_length:listified[int]|None=None,
              hop_length:listified[int]|None=None, pad:listified[int]=0,
              window_fn:listified[Callable[...,Tensor]]=torch.hann_window,
              power:listified[float]=2.0,
              normalized:listified[bool]=False,
              wkwargs:listified[dict]|None=None,
              center:listified[bool]=True,
              pad_mode:listified[str]='reflect',
              onesided:listified[bool]=True,
              norm:listified[str]|None=None)

Convert a TensorAudio into one or more TensorSpec
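
As a rough guide to what these defaults produce, the sketch below computes the expected size of a single spectrogram in pure Python. It assumes the arguments are forwarded to torchaudio's Spectrogram, as the parameter names suggest, where win_length falls back to n_fft and hop_length falls back to win_length // 2. The helper spectrogram_shape is hypothetical and not part of fastxtend.

```python
def spectrogram_shape(n_samples, n_fft=1024, win_length=None, hop_length=None,
                      pad=0, center=True, onesided=True):
    """Return (n_freq_bins, n_frames) for a single-channel waveform.

    Assumption: mirrors torchaudio/torch.stft framing rules; this is an
    illustrative helper, not a fastxtend function.
    """
    # Defaults assumed from torchaudio: window spans the FFT, hop is half a window
    win_length = n_fft if win_length is None else win_length
    hop_length = win_length // 2 if hop_length is None else hop_length

    # `pad` adds that many samples to both ends of the waveform
    padded = n_samples + 2 * pad

    # A one-sided FFT keeps only the non-negative frequencies
    n_freq = n_fft // 2 + 1 if onesided else n_fft

    if center:
        # Frames are centered on multiples of hop_length (signal is padded by n_fft // 2)
        n_frames = 1 + padded // hop_length
    else:
        # Frames must fit entirely inside the signal
        n_frames = 1 + (padded - n_fft) // hop_length
    return n_freq, n_frames


# One second of 16 kHz audio with the default arguments
print(spectrogram_shape(16000))
```

Shortening hop_length increases the number of frames (time resolution) without changing the number of frequency bins, which is set by n_fft alone.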


MelSpectrogram

 MelSpectrogram (sample_rate:listified[int]=16000,
                 n_fft:listified[int]=1024,
                 win_length:listified[int]|None=None,
                 hop_length:listified[int]|None=None,
                 f_min:listified[float]=0.0,
                 f_max:listified[float]|None=None, pad:listified[int]=0,
                 n_mels:listified[int]=128,
                 window_fn:listified[Callable[...,Tensor]]=torch.hann_window,
                 power:listified[float]=2.0,
                 normalized:listified[bool]=False,
                 wkwargs:listified[dict]|None=None,
                 center:listified[bool]=True,
                 pad_mode:listified[str]='reflect',
                 onesided:listified[bool]=True,
                 norm:listified[str]|None=None,
                 mel_scale:listified[str]='htk')

Convert a TensorAudio into one or more TensorMelSpec
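
To make the f_min, f_max, n_mels, and mel_scale arguments more concrete, here is a minimal pure-Python sketch of how mel band edges are typically placed under the 'htk' scale. It assumes the usual HTK formula, mel = 2595 * log10(1 + f / 700), and that f_max=None falls back to half the sample rate, as in torchaudio. The function names are hypothetical and not part of fastxtend.

```python
import math


def hz_to_mel_htk(freq_hz):
    """HTK mel scale: convert a frequency in Hz to mels."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz_htk(mel):
    """Inverse of hz_to_mel_htk."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(sample_rate=16000, n_mels=128, f_min=0.0, f_max=None):
    """Return the n_mels + 2 edge frequencies (Hz) of the triangular mel filters.

    Assumption: f_max defaults to the Nyquist frequency when None.
    """
    f_max = float(sample_rate // 2) if f_max is None else f_max
    m_min, m_max = hz_to_mel_htk(f_min), hz_to_mel_htk(f_max)
    # Edges are evenly spaced in mels, so they grow further apart in Hz
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz_htk(m_min + i * step) for i in range(n_mels + 2)]


edges = mel_band_edges()
# Width of the lowest and highest bands in Hz
print(edges[1] - edges[0], edges[-1] - edges[-2])
```

Because the edges are uniform in mels rather than Hz, low-frequency bands are narrow and high-frequency bands are wide, which is the main difference between a MelSpectrogram and a plain Spectrogram.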

TransformBlocks for audio

Audio data blocks for use with the fastai data block API.


AudioBlock

 AudioBlock (cls=<class 'fastxtend.audio.core.TensorAudio'>)

A TransformBlock for audio of type cls


SpecBlock

 SpecBlock (cls=<class 'fastxtend.audio.core.TensorAudio'>,
            n_fft:listified[int]=1024,
            win_length:listified[int]|None=None,
            hop_length:listified[int]|None=None, pad:listified[int]=0,
            window_fn:listified[Callable[...,Tensor]]=torch.hann_window,
            power:listified[float]=2.0,
            normalized:listified[bool]=False,
            wkwargs:listified[dict]|None=None,
            center:listified[bool]=True,
            pad_mode:listified[str]='reflect',
            onesided:listified[bool]=True,
            norm:listified[str]|None=None)

A TransformBlock to read TensorAudio and then use the GPU to turn audio into one or more Spectrograms

Parameter   Type                            Default      Details
cls         _TensorMeta                     TensorAudio
n_fft       listified[int]                  1024         Spectrogram args
win_length  listified[int] | None           None
hop_length  listified[int] | None           None
pad         listified[int]                  0
window_fn   listified[Callable[…, Tensor]]  hann_window
power       listified[float]                2.0
normalized  listified[bool]                 False
wkwargs     listified[dict] | None          None
center      listified[bool]                 True
pad_mode    listified[str]                  reflect
onesided    listified[bool]                 True
norm        listified[str] | None           None
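
The listified types indicate that each argument accepts either a single value or a collection of values, which is how one block can yield more than one spectrogram. The sketch below shows one plausible way such arguments could be broadcast into per-spectrogram settings. It is an illustration under that assumption, not fastxtend's implementation, and expand_listified is a hypothetical helper.

```python
def expand_listified(**kwargs):
    """Broadcast scalar-or-list arguments into one kwargs dict per spectrogram.

    Assumption: single values are shared by every spectrogram, and all
    multi-valued arguments must have the same length. Hypothetical helper.
    """
    # Wrap scalars so every argument is a list
    as_lists = {name: list(value) if isinstance(value, (list, tuple)) else [value]
                for name, value in kwargs.items()}

    # The longest argument decides how many spectrograms are produced
    n = max(len(values) for values in as_lists.values())
    for name, values in as_lists.items():
        if len(values) not in (1, n):
            raise ValueError(f"{name} has {len(values)} values, expected 1 or {n}")

    # Pick the i-th value of multi-valued arguments, reuse single values
    return [{name: values[i] if len(values) == n else values[0]
             for name, values in as_lists.items()}
            for i in range(n)]


# Two spectrograms with different resolutions and a shared power
for config in expand_listified(n_fft=[1024, 2048], hop_length=[256, 512], power=2.0):
    print(config)
```

Under this reading, passing n_fft=[1024, 2048] would produce two spectrograms per audio item, each computed with its own FFT size.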

MelSpecBlock

 MelSpecBlock (cls=<class 'fastxtend.audio.core.TensorAudio'>,
               sr:listified[int]=16000,
               n_fft:listified[int]=1024,
               win_length:listified[int]|None=None,
               hop_length:listified[int]|None=None,
               f_min:listified[float]=0.0,
               f_max:listified[float]|None=None, pad:listified[int]=0,
               n_mels:listified[int]=128,
               window_fn:listified[Callable[...,Tensor]]=torch.hann_window,
               power:listified[float]=2.0,
               normalized:listified[bool]=False,
               wkwargs:listified[dict]|None=None,
               center:listified[bool]=True,
               pad_mode:listified[str]='reflect',
               onesided:listified[bool]=True,
               norm:listified[str]|None=None,
               mel_scale:listified[str]='htk')

A TransformBlock to read TensorAudio and then use the GPU to turn audio into one or more MelSpectrograms

Parameter   Type                            Default      Details
cls         _TensorMeta                     TensorAudio
sr          listified[int]                  16000        MelSpectrogram args
n_fft       listified[int]                  1024
win_length  listified[int] | None           None
hop_length  listified[int] | None           None
f_min       listified[float]                0.0
f_max       listified[float] | None         None
pad         listified[int]                  0
n_mels      listified[int]                  128
window_fn   listified[Callable[…, Tensor]]  hann_window
power       listified[float]                2.0
normalized  listified[bool]                 False
wkwargs     listified[dict] | None          None
center      listified[bool]                 True
pad_mode    listified[str]                  reflect
onesided    listified[bool]                 True
norm        listified[str] | None           None
mel_scale   listified[str]                  htk