I have 3 projects on GitHub
This project reproduces the code from the paper MUTAN: Multimodal Tucker Fusion for Visual Question Answering. I will walk you through running the code in detail, including how to process the dataset.
Xiao Li's blog is at the following address:
Paper walkthrough: Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering