News

Official implementation of DiffSal, a diffusion-based, generalized audio-visual saliency prediction framework that uses a simple MSE objective function. The DiffSal model is then fine-tuned on the ...