📜 SimVLM explained. What the authors tell us, what they don't tell us, and how this all works. Enjoy with coffee!

📺 Vision & Language Transformer explained (ViLBERT):
📺 ViT explained:

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏 donor, Dres. Trost GbR, Yannik Schneider

Paper:
📜 Wang, Zirui, Jiahui Yu, Adams Wei Yu, Zihang Dai, Yulia Tsvetkov, and Yuan Cao. "SimVLM: Simple Visual Language Model Pretraining with Weak Supervision." arXiv preprint arXiv: (2021).
🔗 SimVLM Google AI Blog post:
📜 Jia, Chao, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, and Tom Duerig. "Scaling Up Visual and Vision-Language Representation Learning with Noisy Text Supervision." (2021).