To tackle this problem, we propose a novel multi-level multimodal fusion network and embed it into an attention-based LSTM so that both the visual information and the linguistic ...