Yandex School of Data Analysis Conference Machine Learning: Prospects and Applications

We present a system that tackles a holy grail of computer vision: matching images with text and describing an image with automatically generated text. Our system combines deep learning tools for images and text, namely Convolutional Neural Networks, word2vec, and Recurrent Neural Networks, with a classical computer vision tool, the Fisher Vector. The Fisher Vector i