
Interpreting notation (e.g., h00) from the attention topic #56

Open
jdgh000 opened this issue Dec 2, 2024 · 1 comment

jdgh000 commented Dec 2, 2024

I am trying to put into words the meaning of the notation introduced in Fig 9.10 versus Fig 9.6, where Fig 9.10 is the first illustration of attention network basics.
Fig 9.6 is pretty simple: h0 and h1 are hidden state 0 and hidden state 1 (vectors), and x0 and x1 are input 0 and input 1.

In Fig 9.10, h00 and h01 are output by s0 (source 0), and h10 and h11 are output by s1. I can explain the first digit: just like in Fig 9.9, the h0 part of h00 and h01 corresponds to hidden state 0 from source 0, and so on. The second digit, however, is what I'm trying to wrap my head around.
Should I interpret the second digit as denoting the index of each vector element? That seems evident from Figure 9.12, which explains how attention scores are calculated: we perform some operation on [h00, h01] and [h20, h21] and get 0.8, and on [h10, h11] and [h20, h21] and get another scalar, 0.2. A scalar result is only possible, however, if the operation is a dot product: [h00, h01] · [h20, h21] = h00 * h20 + h01 * h21 = 0.8

For convenience, I've attached screenshots of the figures (Screenshot 2024-12-01 180352, Screenshot 2024-12-01 174954).

dvgodoy (Owner) commented Dec 16, 2024

Hi @jdgh000,

I'm sorry for the delayed response.

You're right in your interpretation of the notation. The second digit in every group, in both Figures 9.6 and 9.10, represents the index of the i-th element in the vector. So there are three vectors (h0, h1, and h2), each with two elements.
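
Concretely, here's a minimal sketch of that layout (the numeric values are made up purely for illustration; only the indexing convention matters):

```python
import torch

# Three hidden states (h0, h1, h2), two elements each, laid out as rows.
# With this layout, h[i, j] matches the book's h_ij notation: the first
# digit picks the hidden state, the second digit picks the element.
h = torch.tensor([[0.9, 0.3],   # h0 = [h00, h01]
                  [0.2, 0.1],   # h1 = [h10, h11]
                  [0.8, 0.4]])  # h2 = [h20, h21]
print(h[0, 1])  # h01: the second element of hidden state h0
```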

You're also right about the dot product. In the attention formula (the QK^T part), we compute the dot product between the query (h2) and the keys (h0 and h1).
The first dot product, between h0 and h2, is: h00 * h20 + h01 * h21, resulting in the hypothetical attention score of 0.8 (Figure 9.12).
Similarly, we compute the dot product for the second key (h1): h10 * h20 + h11 * h21, with a resulting score of 0.2.
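
Here's a minimal sketch of those two dot products, again with made-up values, since the figures only show the resulting scores (0.8 and 0.2), not the underlying vectors:

```python
import torch

# Hypothetical hidden states (illustrative values only).
h0 = torch.tensor([0.9, 0.3])   # key: encoder state h0
h1 = torch.tensor([0.2, 0.1])   # key: encoder state h1
h2 = torch.tensor([0.8, 0.4])   # query: the decoder's hidden state

score_0 = torch.dot(h0, h2)     # h00*h20 + h01*h21
score_1 = torch.dot(h1, h2)     # h10*h20 + h11*h21

# Equivalently, the QK^T part as one matrix product over both keys:
scores = torch.stack([h0, h1]) @ h2   # tensor([score_0, score_1])
print(score_0, score_1, scores)
```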

Best,
Daniel
