News

Abstract: We present a new multimodal face image generation method that converts a text prompt and a visual input, such as a semantic mask or scribble map, into a photorealistic face image. To do this ...