资讯
Abstract: Image-text matching aims to bridge the semantic gap between visual and textual modalities, which is a fundamental task for multimodal learning applications. However, most Transformer-based ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果一些您可能无法访问的结果已被隐去。
显示无法访问的结果