Attention mechanism.
1. Self-attention
2. Multi-head attention
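To make these two concrete, here is a minimal runnable sketch in PyTorch. The function and class names and all dimensions are illustrative, not taken from any particular library.

```python
import torch
import torch.nn.functional as F
from torch import nn

def self_attention(x, w_q, w_k, w_v):
    # x: (batch, seq_len, d_model); every token attends to every other token.
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / (k.shape[-1] ** 0.5)  # scaled dot product
    return F.softmax(scores, dim=-1) @ v

class MultiHeadAttention(nn.Module):
    """Runs several attention 'heads' in parallel subspaces, then mixes them."""
    def __init__(self, d_model, n_heads):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projections
        self.out = nn.Linear(d_model, d_model)      # final mixing projection

    def forward(self, x):
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split the model dimension into (n_heads, d_head) and attend per head.
        split = lambda z: z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        q, k, v = split(q), split(k), split(v)
        att = F.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(b, t, d)
        return self.out(out)

x = torch.randn(2, 16, 64)                 # (batch, seq_len, d_model)
print(MultiHeadAttention(64, 8)(x).shape)  # torch.Size([2, 16, 64])
```

Multi-head attention is just the self-attention computation repeated in n_heads smaller subspaces, so each head can learn a different relation between tokens.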
Refer to the newer attention variants, including:
Multi-head Latent Attention (MLA) by DeepSeek (sketched after this list)
Kolmogorov-Arnold attention (KArAt)
Fourier Kolmogorov-Arnold attention (Fourier-KArAt)
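The core trick in DeepSeek's MLA is to cache one small latent vector per token instead of full per-head keys and values, and reconstruct K and V from that latent at attention time. Below is a minimal sketch of just that idea, under assumed dimensions; the class name is made up for illustration, and the production design (decoupled rotary position embeddings, causal masking, absorbed projections) is omitted.

```python
import torch
import torch.nn.functional as F
from torch import nn

class LatentKVAttention(nn.Module):
    """Sketch of the MLA idea: cache a small latent per token,
    then up-project it to per-head K and V on the fly."""
    def __init__(self, d_model=64, n_heads=8, d_latent=16):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compression: this is what gets cached
        self.k_up = nn.Linear(d_latent, d_model)     # decompression at attention time
        self.v_up = nn.Linear(d_latent, d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x, latent_cache=None):
        b, t, d = x.shape
        latent = self.kv_down(x)                     # (b, t, d_latent)
        if latent_cache is not None:                 # append to previously cached latents
            latent = torch.cat([latent_cache, latent], dim=1)
        split = lambda z: z.view(b, -1, self.n_heads, self.d_head).transpose(1, 2)
        q = split(self.w_q(x))
        k, v = split(self.k_up(latent)), split(self.v_up(latent))
        att = F.softmax(q @ k.transpose(-2, -1) / self.d_head ** 0.5, dim=-1)
        out = (att @ v).transpose(1, 2).reshape(b, t, d)
        return self.out(out), latent                 # return latent as the new cache

mla = LatentKVAttention()
y, cache = mla(torch.randn(1, 10, 64))              # prefill: cache is (1, 10, 16)
y2, cache = mla(torch.randn(1, 1, 64), cache)       # decode one more token
```

The saving comes from caching d_latent numbers per token rather than the 2 * d_model needed for full keys and values.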
All of these change the memory profile of attention; MLA in particular shrinks the KV cache, as the rough arithmetic below shows.
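For a sense of scale, here is back-of-the-envelope KV-cache arithmetic. All dimensions are assumed for illustration, not taken from any specific model.

```python
# KV-cache size per sequence in fp16 (2 bytes), illustrative dimensions.
d_model, n_layers, seq_len, bytes_fp16 = 4096, 32, 8192, 2

# Standard multi-head attention caches full K and V at every layer.
mha_cache = 2 * d_model * n_layers * seq_len * bytes_fp16

# An MLA-style cache stores one small latent per token per layer.
d_latent = 512  # assumed compression dimension
mla_cache = d_latent * n_layers * seq_len * bytes_fp16

print(f"MHA KV cache:  {mha_cache / 2**30:.2f} GiB")  # 4.00 GiB
print(f"Latent cache:  {mla_cache / 2**30:.2f} GiB")  # 0.25 GiB
# ~16x smaller in this setup (2 * 4096 / 512).
```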
https://huggingface.co/blog/Kseniase/attentions