Debiasing Vision-Language Models via Biased Prompts
The authors develop a projection matrix that debiases the text-prompt embeddings of vision-language models such as CLIP, so that, for example, "a photo of a male doctor" and "a photo of a female doctor" map to nearly identical embeddings. Because Stable Diffusion conditions on the same CLIP text encoder, the method carries over to generative models as well.
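A minimal sketch of that projection, assuming the standard construction for removing a subspace: the rows of `A` are CLIP text embeddings of prompts that describe the bias (e.g. "a photo of a male", "a photo of a female"), and `P` projects onto the orthogonal complement of their span. The names and toy data below are illustrative, not taken from the paper's code.

```python
import torch

def debiasing_projection(A: torch.Tensor) -> torch.Tensor:
    """P = I - A^T (A A^T)^+ A: orthogonal projection onto the
    complement of the subspace spanned by the rows of A."""
    d = A.shape[1]
    return torch.eye(d) - A.T @ torch.linalg.pinv(A @ A.T) @ A

# Toy usage with random stand-ins for CLIP text embeddings (d = 512).
d = 512
A = torch.randn(2, d)   # e.g. embeddings of "a photo of a male" / "... female"
P = debiasing_projection(A)
z = torch.randn(d)      # e.g. embedding of "a photo of a doctor"
z_debiased = P @ z
print(A @ z_debiased)   # ~0: components along the biased directions are gone
```

Since `P` acts only on text embeddings, the image encoder (and, for Stable Diffusion, the diffusion model itself) needs no retraining.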
> […] we define a set of biased directions in the embedding using prompts that describe the biases.
>
> However, solely relying on prompts to define biased directions may be unstable and noisy [15]. To address this issue, we propose a calibration loss that minimizes the discrepancy of a pair of prompt embeddings.
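As I read the quoted passage, the calibration loss penalizes any residual gap between the projected embeddings of prompt pairs that differ only in the biased attribute. Here is a minimal sketch of that idea; the batched form, names, and toy data are my own, and as I understand it the paper further combines this term with a regularizer toward the prompt-derived projection and solves the problem in closed form, which is omitted here.

```python
import torch

def calibration_loss(P: torch.Tensor, Z_i: torch.Tensor, Z_j: torch.Tensor) -> torch.Tensor:
    """Average squared discrepancy between projected paired embeddings.

    Rows of Z_i and Z_j hold embeddings of prompt pairs that differ only
    in the spurious attribute, e.g. "a photo of a male doctor" vs.
    "a photo of a female doctor"; a good P maps each pair to (nearly)
    the same point."""
    diff = (Z_i - Z_j) @ P.T          # rows are P (z_i - z_j)
    return diff.pow(2).sum(dim=1).mean()

# Toy usage: 8 hypothetical prompt pairs in a 512-dim embedding space.
d = 512
P = torch.eye(d)                       # stand-in for the prompt-derived projection
Z_i = torch.randn(8, d)
Z_j = Z_i + 0.05 * torch.randn(8, d)   # paired prompts: close but not identical
print(calibration_loss(P, Z_i, Z_j))
```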