Can a ConvNet outperform a Vision Transformer? What kind of modifications do we have to apply to a ConvNet to make it as powerful as a Transformer? Spoiler: it's not attention.

► SPONSOR: Weights & Biases
The official ConvNeXt repo has a W&B integration! Also, W&B built the CIFAR10 training colab linked there:

Check out our daily #MachineLearning Quiz Questions:

Explained paper: Liu, Zhuang, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. "A ConvNet for the 2020s." arXiv preprint arXiv: (2022).
Tweet of Lucas Beyer (ViT author):
Depthwise convolutions image and explanation: