Good blog posts

#misc

Explanation of how Q, K, and V relate to each other: https://www.zhihu.com/question/653658936/answer/3545520807

$$\mathrm{Attn}(K, Q, V) = \mathrm{softmax}\left(\dfrac{QK^T}{\sqrt{d_k}}\right)V$$
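
A minimal NumPy sketch of the formula above, to make the shapes concrete. Assumptions (not from the linked answer): single head, no masking, no batch dimension, `Q` of shape `(n_q, d_k)`, `K` of shape `(n_k, d_k)`, `V` of shape `(n_k, d_v)`. The names `softmax` and `attention` are just illustrative helpers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def attention(q, k, v):
    # q: (n_q, d_k), k: (n_k, d_k), v: (n_k, d_v)
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)      # (n_q, n_k) similarity of each query to each key
    weights = softmax(scores, axis=-1)   # each row is a distribution over the keys
    return weights @ v, weights          # output: (n_q, d_v), a weighted average of value rows

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4))
k = rng.normal(size=(3, 4))
v = rng.normal(size=(3, 5))

out, weights = attention(q, k, v)
print(out.shape)             # (2, 5)
print(weights.sum(axis=-1))  # every row sums to 1 (up to floating-point error)
```

Each output row is a convex combination of the rows of `V`, with weights determined by how well the query matches each key. Dividing by `sqrt(d_k)` keeps the scores from growing with the key dimension, so the softmax does not saturate.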