Repost: Generalized Transformers from Applicative Functors

https://cybercat.institute/2025/02/12/transformers-applicative-functors/

The post presents a generalization of Transformer models that can operate on (almost) arbitrary structures, such as functions, graphs, and probability distributions, rather than only matrices and vectors.
It explores machine learning through abstract diagrammatic means.
- Other resources
  - On the anatomy of attention, arXiv:2407.02423 (https://arxiv.org/abs/2407.02423). The ‘tube’ notation in Part 4 is equivalent to the ‘SIMD’ notation in that paper.
  - A pattern language for machine learning tasks, arXiv:2407.02424 (https://arxiv.org/abs/2407.02424)