[Pycon] [new paper] "Jennifer Seale" - Multi-modal classification in PyTorch

Sat 19 Jan 2019 18:19:14 CET

Title: Multi-modal classification in PyTorch
Duration: 45 (includes Q&A)
Q&A Session: 15
Language: en
Type: Talk

Abstract: Recent work by Kiela et al. (2018) shows that multi-modal classification models combining image and text far outperform both text-only and image-only models. This talk reviews work that extends Kiela et al. (2018) by testing whether classification accuracy can be increased by implementing features that mimic human attributes. Novel multi-modal classification models are built by applying feature adjustments, first in isolation and then in combination, to the bilinear gated model from Kiela et al. (2018). Attention is implemented in a ResNet model, and transfer learning is implemented for language processing. The performance of each model on the MM-IMDB subset datasets is analyzed and compared to the baseline provided by Kiela et al. (2018).

The work is implemented in PyTorch. The goal of the talk is to review details of the implementation and the performance of the model compared with the results reported in Kiela et al. (2018). Attendees should have a basic familiarity with neural networks developed in PyTorch for NLP and computer vision.
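
As a rough illustration of the kind of bilinear gated fusion the abstract refers to, the sketch below combines a text feature vector and an image feature vector in PyTorch. It is a minimal sketch under stated assumptions, not the speaker's implementation and not necessarily the exact formulation in Kiela et al. (2018): the class name GatedBilinearFusion, the feature dimensions, the number of classes, and the choice to let the text features drive the gate are all hypothetical.

```python
import torch
from torch import nn


class GatedBilinearFusion(nn.Module):
    """Hypothetical sketch: a text-driven sigmoid gate over a bilinear
    image-text interaction, followed by a linear classifier."""

    def __init__(self, text_dim, image_dim, hidden_dim, num_classes):
        super().__init__()
        # Bilinear interaction between the two modalities.
        self.bilinear = nn.Bilinear(text_dim, image_dim, hidden_dim)
        # Assumption: the gate is computed from the text features only.
        self.gate = nn.Linear(text_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, text_feats, image_feats):
        fused = self.bilinear(text_feats, image_feats)
        gate = torch.sigmoid(self.gate(text_feats))
        # Element-wise gating of the fused representation.
        return self.classifier(gate * fused)


torch.manual_seed(0)
# Illustrative sizes only; real text and image encoders would produce
# much larger feature vectors.
model = GatedBilinearFusion(text_dim=16, image_dim=32, hidden_dim=8, num_classes=3)
text_feats = torch.randn(4, 16)    # stand-in for pooled text embeddings
image_feats = torch.randn(4, 32)   # stand-in for pooled ResNet features
logits = model(text_feats, image_feats)
print(logits.shape)
```

In a full pipeline the random tensors would be replaced by the outputs of a pretrained text encoder and a ResNet, which is where the attention and transfer-learning adjustments described in the abstract would plug in.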

Reference:
Kiela, Douwe, Edouard Grave, Armand Joulin, and Tomas Mikolov. 2018. Efficient large-scale multi-modal classification. arXiv preprint arXiv:1802.02892.

Tags: mathematical-modelling