Attention is all you need (Transformer) - Model explanation (including math), Inference and Training

Uploaded By: Myvideo
A complete explanation of all the layers of a Transformer model: Multi-Head Self-Attention, Positional Encoding, and more, including all the matrix multiplications and a full description of the training and inference process. Slides PDF:

Chapters:
00:00 - Intro
01:10 - RNNs and their problems
08:04 - Transformer Model
09:02 - Maths background and notations
12:20 - Encoder (overview)
12:31 - Input Embeddings
15:04 - Positional Encoding
20:08 - Single Head Self-Attention
28:30 - Multi-Head Attention
35:39 - Query, Key, Value
37:55 - Layer Normalization
40:13 - Decoder (overview)
42:24 - Masked Multi-Head Attention
44:59 - Training
52:09 - Inference
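The self-attention the video describes can be sketched in a few lines. This is a minimal illustrative implementation (not taken from the video's slides), assuming NumPy, a single attention head, and inputs arranged as (seq_len, d_k) matrices:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)          # (seq_len, seq_len) similarity matrix
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # masked positions get ~ -inf before softmax
    # Numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy example: 4 positions, d_k = 8
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
v = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(q, k, v)
print(out.shape)                              # (4, 8)
print(np.allclose(w.sum(axis=-1), 1.0))       # True: each row of weights sums to 1
```

Passing a lower-triangular boolean `mask` reproduces the masked (causal) attention used in the decoder, where each position may only attend to itself and earlier positions.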
