GENERATIVE AI
WEEK-4
Asst. Prof. Dr. Murat ŞİMŞEK
NLP
Transformer
Self-Attention
Seq2Seq
Attention Mechanism
The attention mechanism is a neural network technique that allows a deep learning model to focus on the specific and relevant parts of the input when producing an output.
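As a rough illustration of this idea, the NumPy sketch below (token count and vector sizes are arbitrary assumptions, not taken from the slides) scores every input vector against every other, turns the scores into weights with a softmax, and forms a weighted sum, which is how the model "focuses" on the most relevant positions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Toy input: 4 tokens, each represented by an 8-dimensional vector (assumed sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))

# Similarity of every token with every other token (dot products),
# scaled by sqrt(d) as in scaled dot-product attention.
scores = X @ X.T / np.sqrt(X.shape[-1])       # shape (4, 4)

# Softmax turns the scores into attention weights that sum to 1 for each token.
weights = softmax(scores, axis=-1)            # shape (4, 4)

# Each output vector is a weighted sum of all input vectors:
# the model "attends" more to the positions with larger weights.
output = weights @ X                          # shape (4, 8)
print(weights.round(2))
```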
Transformer Architecture
LLM
Transformer
Encoder
• The main purpose of the encoder block is to encode the given input data into the representation needed to produce the desired output.
• This data includes the words and the context of the given sentence.
• To perform the encoding, the encoder block uses word embedding, positional encoding, and multi-head self-attention.
Word Embedding
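The word-embedding step mentioned above maps each token to a dense vector by looking up a row of an embedding matrix. A minimal sketch, assuming a tiny toy vocabulary and a randomly initialised matrix (in a real model these vectors are learned during training):

```python
import numpy as np

# Hypothetical toy vocabulary; real models use tens of thousands of subword tokens.
vocab = {"<pad>": 0, "the": 1, "cat": 2, "sat": 3}
d_model = 8                                   # embedding dimension (assumed)

rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(len(vocab), d_model))

# Embedding a sentence = looking up one row of the matrix per token id.
token_ids = [vocab[w] for w in ["the", "cat", "sat"]]
word_vectors = embedding_matrix[token_ids]    # shape (3, d_model)
print(word_vectors.shape)
```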
Positional Encoding
Positional Encoding enables the conversion of the word vectors produced by the word embedding layer into numerical vectors that also contain the order information of the words in the text.
Data passing through the positional encoding layer becomes ready to be processed in the encoder section of the Transformer.
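A minimal sketch of the sinusoidal positional encoding from the original Transformer paper, which is one common way to inject this order information; the sequence length and model dimension below are arbitrary assumptions:

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal encodings: PE[pos, 2i] = sin(pos / 10000^(2i/d)),
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d))."""
    positions = np.arange(seq_len)[:, None]                   # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]                  # (1, d_model/2)
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                              # even dimensions
    pe[:, 1::2] = np.cos(angles)                              # odd dimensions
    return pe

# Word vectors from the embedding layer (random stand-ins here).
rng = np.random.default_rng(0)
word_vectors = rng.normal(size=(3, 8))        # 3 tokens, d_model = 8 (assumed)

# Adding the positional encoding injects order information into the vectors
# before they enter the encoder.
encoder_input = word_vectors + positional_encoding(3, 8)
print(encoder_input.shape)                    # (3, 8)
```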
Multi-Head Attention
The first step in calculating self-attention is to create three vectors from each of the encoder's input vectors (in this case, the embedding of each word). So for each word, we create a Query vector, a Key vector, and a Value vector. These vectors are created by multiplying the embedding by three weight matrices that are learned during training.
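A minimal sketch of this step in NumPy, with randomly initialised stand-ins for the learned projection matrices W_Q, W_K, and W_V; the resulting Query, Key, and Value vectors are then combined by scaled dot-product attention:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d_model, d_k = 8, 4                           # assumed sizes
X = rng.normal(size=(3, d_model))             # embeddings of 3 words

# In a trained model these weight matrices are learned; here they are random.
W_Q = rng.normal(size=(d_model, d_k))
W_K = rng.normal(size=(d_model, d_k))
W_V = rng.normal(size=(d_model, d_k))

Q = X @ W_Q                                   # Query vector for each word
K = X @ W_K                                   # Key vector for each word
V = X @ W_V                                   # Value vector for each word

# Scaled dot-product attention: score each query against every key,
# normalise with softmax, then take the weighted sum of the values.
weights = softmax(Q @ K.T / np.sqrt(d_k), axis=-1)
attention_output = weights @ V                # shape (3, d_k)
print(attention_output.shape)
```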
Self-Attention
Multi-Headed Attention
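The multi-headed attention slides are figure-based; as a rough sketch under assumed sizes, the single-head computation above is simply repeated with several independent projection sets (random here, learned in practice), and the head outputs are concatenated and projected back to the model dimension:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)
    return weights @ V

rng = np.random.default_rng(0)
d_model, n_heads = 8, 2                       # assumed sizes
d_k = d_model // n_heads
X = rng.normal(size=(3, d_model))             # 3 token embeddings

heads = []
for _ in range(n_heads):
    # Each head has its own projections (random here, normally learned).
    W_Q, W_K, W_V = (rng.normal(size=(d_model, d_k)) for _ in range(3))
    heads.append(attention(X @ W_Q, X @ W_K, X @ W_V))

# Concatenate the heads and project back to d_model with an output matrix W_O.
W_O = rng.normal(size=(n_heads * d_k, d_model))
multi_head_output = np.concatenate(heads, axis=-1) @ W_O   # shape (3, d_model)
print(multi_head_output.shape)
```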
Transformer Architecture
[Figure-only slides: diagrams walking through the Transformer architecture.]
Attention
[Figure-only slides: diagrams illustrating the attention mechanism.]
Self-Attention
[Figure-only slides: diagrams illustrating self-attention.]