Anatoly Potapov: Pre-training Transformers with Catalyst

Data Fest Online 2020, Catalyst Workshop track.

Since NLP's ImageNet moment (the emergence of ELMo and BERT), language models have been widely used as a backbone for a variety of supervised tasks: intent classification, named entity recognition, question answering, and more. At Tinkoff, we have tens of millions of unlabelled customer conversation samples. In such a scenario, it is highly beneficial to pre-train on in-domain data with a custom vocabulary. We moved our pre-training pipeline to the Catalyst framework and reduced our codebase while keeping features like distributed and fp16 training. In my presentation, I will show how to pre-train transformers at scale with the Catalyst framework without writing lots of "infrastructure" code.
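The talk itself is not transcribed here, but a minimal sketch of such a pipeline might look like the code below. Everything in it is illustrative rather than Tinkoff's actual code: the file name `conversations.txt`, the vocabulary size, model size, and hyperparameters are placeholders, and it assumes a recent `transformers`/`tokenizers` stack plus the custom-runner style used in recent Catalyst releases.

```python
"""Illustrative sketch: masked-LM pre-training with a custom vocabulary
and a Catalyst runner. File names, sizes and hyperparameters are placeholders."""
import torch
from torch.utils.data import DataLoader
from tokenizers import BertWordPieceTokenizer
from transformers import (
    BertConfig,
    BertForMaskedLM,
    BertTokenizer,
    DataCollatorForLanguageModeling,
    LineByLineTextDataset,
)
from catalyst import dl

# 1) Custom vocabulary: fit a WordPiece tokenizer on raw in-domain text.
wordpiece = BertWordPieceTokenizer(lowercase=True)
wordpiece.train(files=["conversations.txt"], vocab_size=30_000)
wordpiece.save_model(".")  # writes ./vocab.txt

tokenizer = BertTokenizer("vocab.txt")

# 2) Data: line-by-line dataset plus a collator that masks 15% of tokens.
dataset = LineByLineTextDataset(
    tokenizer=tokenizer, file_path="conversations.txt", block_size=128
)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
loaders = {
    "train": DataLoader(dataset, batch_size=32, shuffle=True, collate_fn=collator)
}

# 3) Model built from scratch around the custom vocabulary.
config = BertConfig(vocab_size=tokenizer.vocab_size)
model = BertForMaskedLM(config)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)


# 4) Catalyst runner: only the per-batch logic is written by hand;
#    epochs, logging, and checkpointing come from the framework.
class MLMRunner(dl.Runner):
    def handle_batch(self, batch):
        outputs = self.model(input_ids=batch["input_ids"], labels=batch["labels"])
        loss = outputs.loss
        self.batch_metrics.update({"loss": loss})
        if self.is_train_loader:
            loss.backward()
            self.optimizer.step()
            self.optimizer.zero_grad()


runner = MLMRunner()
runner.train(
    model=model,
    optimizer=optimizer,
    loaders=loaders,
    logdir="./logs",
    num_epochs=1,
    verbose=True,
)
```

Mixed precision and distributed data-parallel training, which the description highlights, are switched on through the runner's options or engines in Catalyst rather than hand-written loop code; the exact option names vary between Catalyst releases, so check the documentation for the version you use.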
