ClipCap: CLIP Prefix for Image Captioning
Image captioning model based on CLIP and GPT-2, trained on the Conceptual Captions dataset.
See the test notebook for usage examples.
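The checkpoint shipped with this model is a plain PyTorch state_dict, so it can be inspected before any model code is wired up. The snippet below is a minimal sketch, not the authors' loading code: the helper name `summarize_state_dict` is hypothetical, and it assumes a PyTorch version recent enough to support the `weights_only` argument of `torch.load`.

```python
from collections import OrderedDict

import torch


def summarize_state_dict(path):
    """Load a state_dict checkpoint on CPU and map each parameter name to its shape.

    Hypothetical helper for illustration; it is not part of the ClipCap code.
    """
    # map_location="cpu" avoids needing a GPU just to look at the weights.
    # weights_only=True restricts unpickling to tensors and plain containers.
    state_dict = torch.load(path, map_location="cpu", weights_only=True)
    return OrderedDict(
        (name, tuple(tensor.shape)) for name, tensor in state_dict.items()
    )


# Example usage (assumes model.pt has been downloaded to the working directory):
#   for name, shape in summarize_state_dict("model.pt").items():
#       print(name, shape)
```

To actually generate captions, the weights still have to be loaded with `load_state_dict` into the model class defined in the original ClipCap code, which is not reproduced here; the test notebook is the place to look for that.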
model.pt: Model checkpoint (state_dict)
If you use this model, please cite:
@article{mokady2021clipcap,
title={ClipCap: CLIP Prefix for Image Captioning},
author={Mokady, Ron and Hertz, Amir and Bermano, Amit H},
journal={arXiv preprint arXiv:2111.09734},
year={2021}
}