News

We present a new multimodal face image generation method that converts a text prompt and a visual input, such as a semantic mask or scribble map, into a photorealistic face image. To do this, we ...