Three attention modules in addition to Squeeze-and-Excitation.
ECA(nf, ks:int=None, gamma:int=2, beta:int=1) :: Module
Efficient Channel Attention, from https://arxiv.org/abs/1910.03151.
|  | Type | Default | Details |
|---|---|---|---|
| nf |  |  | number of input features |
| ks | int | None | if set, Conv1d uses this fixed kernel size instead of the adaptive kernel size |
| gamma | int | 2 | used to compute the adaptive kernel size; see the paper for details |
| beta | int | 1 | used to compute the adaptive kernel size; see the paper for details |
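As a hedged sketch of the mechanism (not the library's exact code): ECA squeezes each channel with global average pooling, runs a 1D convolution across the channel axis whose kernel size is either fixed via ks or derived adaptively from nf via gamma and beta (per the paper's formula), and gates the input with a sigmoid. The name ECASketch and the (batch, channels, length) input shape below are assumptions.

```python
import math
import torch
import torch.nn as nn

class ECASketch(nn.Module):
    # Minimal ECA sketch for (batch, channels, length) inputs; illustrative only.
    def __init__(self, nf, ks=None, gamma=2, beta=1):
        super().__init__()
        if ks is None:
            # adaptive kernel size from the paper: nearest odd number to log2(nf)/gamma + beta/gamma
            t = int(abs(math.log2(nf) / gamma + beta / gamma))
            ks = t if t % 2 else t + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=ks, padding=ks // 2, bias=False)

    def forward(self, x):                      # x: (B, C, L)
        y = x.mean(dim=-1, keepdim=True)       # squeeze: global average pool -> (B, C, 1)
        y = self.conv(y.transpose(1, 2))       # 1D conv across the channel axis -> (B, 1, C)
        y = torch.sigmoid(y).transpose(1, 2)   # per-channel weights -> (B, C, 1)
        return x * y                           # excite: reweight channels

x = torch.randn(8, 64, 128)
print(ECASketch(64)(x).shape)                  # torch.Size([8, 64, 128])
```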
ShuffleAttention(nf, groups=64) :: Module
Implementation of Shuffle Attention, from https://arxiv.org/abs/2102.00240.
|  | Type | Default | Details |
|---|---|---|---|
| nf |  |  | number of input features |
| groups | int | 64 | number of subfeature groups, usually 32 or 64 |
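A hedged sketch of the idea, assuming (batch, channels, length) inputs and nf divisible by 2 * groups: each subfeature group is split into a channel-attention branch (squeeze over length, learned sigmoid gate) and a spatial-attention branch (group norm, learned sigmoid gate), and a final channel shuffle mixes information across groups. ShuffleAttentionSketch is a hypothetical name; this follows the paper, not necessarily the library's exact code.

```python
import torch
import torch.nn as nn

class ShuffleAttentionSketch(nn.Module):
    # Hedged 1D sketch; nf must be divisible by 2 * groups. Illustrative only.
    def __init__(self, nf, groups=64):
        super().__init__()
        self.groups = groups
        c = nf // (2 * groups)                        # channels per branch within a group
        self.cw = nn.Parameter(torch.zeros(1, c, 1))  # channel-branch scale
        self.cb = nn.Parameter(torch.ones(1, c, 1))   # channel-branch shift
        self.sw = nn.Parameter(torch.zeros(1, c, 1))  # spatial-branch scale
        self.sb = nn.Parameter(torch.ones(1, c, 1))   # spatial-branch shift
        self.gn = nn.GroupNorm(c, c)

    def forward(self, x):                             # x: (B, C, L)
        b, c, l = x.shape
        x = x.reshape(b * self.groups, c // self.groups, l)
        xc, xs = x.chunk(2, dim=1)                    # each group feeds two branches
        # channel branch: squeeze over length, then a learned sigmoid gate
        xc = xc * torch.sigmoid(self.cw * xc.mean(dim=-1, keepdim=True) + self.cb)
        # spatial branch: group-normalize, then a learned sigmoid gate
        xs = xs * torch.sigmoid(self.sw * self.gn(xs) + self.sb)
        out = torch.cat([xc, xs], dim=1).reshape(b, c, l)
        # channel shuffle so information flows across subfeature groups
        return out.reshape(b, 2, c // 2, l).transpose(1, 2).reshape(b, c, l)

x = torch.randn(8, 128, 100)
print(ShuffleAttentionSketch(128, groups=64)(x).shape)  # torch.Size([8, 128, 100])
```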
TripletAttention(nf, ks:int=7, no_spatial=False) :: Module
Lightly modified implementation of Triplet Attention, from http://arxiv.org/abs/2010.03045.
|  | Type | Default | Details |
|---|---|---|---|
| nf |  |  | number of input features; unused, kept for compatibility |
| ks | int | 7 | kernel size for the AttentionGate convolution |
| no_spatial | bool | False | exclude spatial attention as the third branch |
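A hedged sketch of the underlying idea, with hypothetical names AttentionGateSketch and TripletAttentionSketch and an assumed (batch, channels, length) input: each branch Z-pools (max plus mean), convolves with kernel size ks, and sigmoid-gates a rotated view of the input, with no_spatial dropping the length-wise branch. This illustrates the mechanism from the paper adapted to 1D; it is not the library's exact modification.

```python
import torch
import torch.nn as nn

class AttentionGateSketch(nn.Module):
    # Z-pool (max + mean over dim 1) -> conv -> sigmoid gate.
    def __init__(self, ks=7):
        super().__init__()
        self.conv = nn.Conv1d(2, 1, kernel_size=ks, padding=ks // 2, bias=False)

    def forward(self, x):                                    # x: (B, D1, D2)
        z = torch.cat([x.max(dim=1, keepdim=True).values,
                       x.mean(dim=1, keepdim=True)], dim=1)  # Z-pool -> (B, 2, D2)
        return x * torch.sigmoid(self.conv(z))               # gate the input

class TripletAttentionSketch(nn.Module):
    # Hedged 1D sketch: one gate on the rotated (L, C) view, one on the original
    # (C, L) view unless no_spatial; branch outputs are averaged. nf is unused.
    def __init__(self, nf, ks=7, no_spatial=False):
        super().__init__()
        self.cl = AttentionGateSketch(ks)        # cross channel-length branch
        self.no_spatial = no_spatial
        if not no_spatial:
            self.lc = AttentionGateSketch(ks)    # spatial (length) branch

    def forward(self, x):                        # x: (B, C, L)
        y = self.cl(x.transpose(1, 2)).transpose(1, 2)  # rotate, gate, rotate back
        return y if self.no_spatial else 0.5 * (y + self.lc(x))

x = torch.randn(8, 64, 128)
print(TripletAttentionSketch(64)(x).shape)       # torch.Size([8, 64, 128])
```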