CLIP-LoRA

In this work, we show that lightweight tuning of vision–language foundation models, combined with domain-adapted face recognition networks, can effectively bridge the domain gap between photographs and paintings. Our fusion approach achieves state-of-the-art accuracy in sitter identification. Face recognition on artworks remains a particularly difficult task compared to traditional face recognition (FR) due to the scarcity of labelled data, stylistic variation, and the interpretive nature of portraiture. However, the results show that adapting modern architectures to this setting is feasible and promising. This opens up new research avenues, including synthetic data generation to augment the limited training set and heterogeneous domain adaptation techniques to improve generalisation across visual domains. Project page: https://www.idiap.ch/paper/artface/

Overview

  • Training: ArtFace was trained on The Historical Faces dataset (766 paintings of 210 different sitters)
  • Backbone: CLIP-LoRA is adapted from CLIP (ViT-B/16) by OpenAI.
  • Parameters: 1M
  • Task: Historical portrait face identification via model adaptation
  • Framework: PyTorch
  • Output structure: Batch of face embeddings (i.e., features)

Evaluation of Models

ArtFace

Overview of the proposed method: (a) LoRA-based adaptation of the CLIP model, and (b) head adaptation using triplet loss.
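Part (a) injects small trainable low-rank matrices into the frozen CLIP backbone; part (b) tunes the embedding head with a triplet loss. Below is a minimal sketch of both ideas, assuming standard LoRA applied to a linear projection; the `LoRALinear` wrapper, rank, and scaling are illustrative, not the repository's implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update (illustrative)."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # (a) keep the pretrained CLIP weights frozen
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: identity at start
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Original projection plus the scaled low-rank correction B(Ax).
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

# (b) Head adaptation with a triplet loss: pull embeddings of the same sitter
# together and push embeddings of different sitters apart.
triplet = nn.TripletMarginLoss(margin=0.2)
# emb_anchor / emb_positive: two portraits of one sitter; emb_negative: another sitter.
# loss = triplet(emb_anchor, emb_positive, emb_negative)
```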

[Figure: ArtFace ROC]

ROC curves of the base and tuned CLIP and IResNet100 models, the COTS system, and the proposed fusion method. Fusion provides consistent improvements even at low FAR.
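The fusion combines similarity scores from the adapted CLIP and IResNet100 backbones. A minimal sketch, assuming simple score-level fusion of cosine similarities with a fixed weight (the exact fusion rule used in the paper may differ):

```python
import torch
import torch.nn.functional as F

def fused_score(clip_a: torch.Tensor, clip_b: torch.Tensor,
                irn_a: torch.Tensor, irn_b: torch.Tensor,
                w: float = 0.5) -> torch.Tensor:
    """Weighted sum of per-backbone cosine similarities (illustrative)."""
    s_clip = F.cosine_similarity(clip_a, clip_b, dim=-1)  # CLIP-LoRA score
    s_irn = F.cosine_similarity(irn_a, irn_b, dim=-1)     # IResNet100 score
    return w * s_clip + (1.0 - w) * s_irn
```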

| Model | EER (%) | TAR @ 0.1% FAR | TAR @ 1% FAR |
|---|---|---|---|
| COTS FR system | 12.6 | 34.3% | 58.1% |
| CLIP-Base | 17.9 | 8.4% | 33.2% |
| IResNet100-Base | 14.0 | 29.9% | 55.1% |
| CLIP-Base + IResNet100-Base | 13.1 | 29.0% | 54.7% |
| CLIP-Base + IResNet100-Tuned | 12.6 | 35.1% | 57.9% |
| CLIP-LoRA + IResNet100-Base | 11.1 | 34.6% | 62.6% |
| CLIP-LoRA + IResNet100-Tuned | 10.7 | 39.7% | 62.15% |
| CLIP-LoRA + IResNet100-Base + IResNet100-Tuned | 9.9 | 39.7% | 65.9% |

Performance comparison of base models, tuned models, fusion combinations, and a commercial off-the-shelf (COTS) FR system. Fusion improves overall accuracy.
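For reference, the EER and TAR at a fixed FAR in the table can be computed from verification scores as sketched below, using scikit-learn; `scores` and `labels` are hypothetical arrays of pairwise similarities and same-sitter indicators:

```python
import numpy as np
from sklearn.metrics import roc_curve

def eer_and_tar(scores: np.ndarray, labels: np.ndarray, far_targets=(1e-3, 1e-2)):
    """labels: 1 for genuine (same sitter) pairs, 0 for impostor pairs."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    eer = fpr[np.nanargmin(np.abs(fnr - fpr))]  # operating point where FAR ~ FRR
    tars = {far: float(np.interp(far, fpr, tpr)) for far in far_targets}  # TAR at each FAR
    return float(eer), tars
```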

Running Code

Minimal code to instantiate the model and perform inference:

```bash
# Align the face images first.
python align.py -f [path_to_paintings]/* -o data/paintings

# Score, tabulate, and plot results for the full fusion model.
python generate-scores.py fusion
python evaluate.py table -f out/fusion.csv
python plot.py roc --log -f out/fusion.csv
```

To use the model directly:

```python
import torch
from PIL import Image

from lib.models import get_model

# Load the fusion model and its matching preprocessing transform.
model, preprocess = get_model("fusion").torch()
model.eval()

image = Image.open("...")               # path to an aligned portrait
inputs = preprocess(image).unsqueeze(0) # add a batch dimension (assumed single-image transform)
with torch.no_grad():
    embedding = model(inputs).squeeze() # face embedding for the input image
```
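Embeddings from the snippet above can be compared with cosine similarity to obtain a verification score; the image paths here are placeholders:

```python
import torch.nn.functional as F

# Hypothetical pair of aligned portraits; a higher score suggests the same sitter.
emb_a = model(preprocess(Image.open("data/paintings/a.png")).unsqueeze(0)).squeeze()
emb_b = model(preprocess(Image.open("data/paintings/b.png")).unsqueeze(0)).squeeze()
score = F.cosine_similarity(emb_a, emb_b, dim=0).item()
```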

License

CC BY-NC 4.0

Copyright

(c) 2025, Francois Poh, Anjith George, Sébastien Marcel. Idiap Research Institute, Martigny 1920, Switzerland.

https://gitlab.idiap.ch/biometric/code.iccv2025artmetrics.artface/

Please refer to the link above for the license and copyright terms and conditions.

Citation

If you find our work useful, please cite the following publication:

@article{poh2025artface,
  title={ArtFace: Towards Historical Portrait Face Identification via Model Adaptation},
  author={Poh, Francois and George, Anjith and Marcel, S{\'e}bastien},
  journal={arXiv preprint arXiv:2508.20626},
  year={2025}
}