What Are Transformer Neural Networks?

Uploaded By: Myvideo

This short tutorial covers the basics of the Transformer, a neural network architecture designed for handling sequential data in machine learning.

Timestamps:
0:00 - Intro
1:18 - Motivation for developing the Transformer
2:44 - Input embeddings (start of encoder walk-through)
3:29 - Attention
6:29 - Multi-head attention
7:55 - Positional encodings
9:59 - Add & norm, feedforward, & stacking encoder layers
11:14 - Masked multi-head attention (start of decoder walk-through)
12:35 - Cross-attention
13:38 - Decoder output & prediction probabilities
14:46 - Complexity analysis
16:00 - Transformers as graph neural networks

Original Transformers paper: Attention is All You Need

Other papers mentioned:
(GPT-3) Language Models are Few-Shot Learners
(DALL-E) Zero-Shot Text-to-Image Generation
BERT: Pre-training of Deep Bidirectional Tran
