[162] CALIP: Zero-Shot Enhancement of CLIP with Parameter-free Attention

July 11, 2024 · 1 min · long8v

[159] Long-CLIP: Unlocking the Long-Text Capability of CLIP

May 10, 2024 · 1 min · long8v

[156] Interpreting CLIP's Image Representation via Text-Based Decomposition

May 6, 2024 · 2 min · long8v

[157] LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

May 6, 2024 · 2 min · long8v

[152] Sigmoid Loss for Language Image Pre-Training

March 12, 2024 · 2 min · long8v

[148] I Can't Believe There's No Images! Learning Visual Tasks Using only Language Supervision

February 11, 2024 · 2 min · long8v

[145] CLIPScore: A Reference-free Evaluation Metric for Image Captioning

February 5, 2024 · 3 min · long8v

[141] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

December 15, 2023 · 3 min · long8v

[133] DataComp: In search of the next generation of multimodal datasets

October 5, 2023 · 2 min · long8v

[132] Hyperbolic Image-Text Representations

September 26, 2023 · 2 min · long8v

[125] RILS: Masked Visual Reconstruction in Language Semantic Space

August 2, 2023 · 2 min · long8v

[124] LiT: Zero-Shot Transfer with Locked-image text Tuning

July 6, 2023 · 4 min · long8v

[121] Open-domain Visual Entity Recognition: Towards Recognizing Millions of Wikipedia Entities

June 23, 2023 · 3 min · long8v

feat: add open-clip

June 21, 2023 · 1 min · long8v

[120] Large-scale Bilingual Language-Image Contrastive Learning

June 19, 2023 · 3 min · long8v

[98] Bridging the Gap between Object and Image-level Representations for Open-Vocabulary Detection

January 17, 2023 · 1 min · long8v

[74] “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations

November 4, 2022 · 2 min · long8v